Re: ANALYZE patch for review - Mailing list pgsql-patches
From | Mark Cave-Ayland |
---|---|
Subject | Re: ANALYZE patch for review |
Date | |
Msg-id | 8F4A22E017460A458DB7BBAB65CA6AE502654D@openmanage |
In response to | ANALYZE patch for review ("Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk>) |
Responses |
Re: ANALYZE patch for review
|
List | pgsql-patches |
Hi Tom,

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 29 January 2004 15:31
> To: Mark Cave-Ayland
> Cc: pgsql-patches@postgresql.org
> Subject: Re: [PATCHES] ANALYZE patch for review
>
> <lots cut about pointers>

OK, I've had another attempt at writing the code as you suggested, but the more I work on it the less I like it :(. What I would like to do is make the VacAttrStats structure contain just the information that is updated in the pg_statistic table; however, this fell apart when I realised that update_attstats() requires the attr and attrtype fields to be present. Doh.

So I'd like to propose a slightly different solution. I think that examine_attribute() should return a pointer to a custom structure containing any information that needs to be passed to the datatype-specific routine (not the entire VacAttrStats structure), or NULL if the column should not be analyzed.

I'm also considering changing the examine_attribute() input parameters to be Relation, Attribute, Type for the current column, along with a pointer to a bool to indicate whether the column should be analyzed. If examine_attribute() sets the bool to false then the column is ignored. If the bool is set to true then a VacAttrStats structure is created in memory, and the Attribute and Type tuple information is copied into it. A new field in VacAttrStats will contain the pointer to the custom structure returned by examine_attribute(), which can then be passed into the compute_*_stats() functions as an extra parameter.

This seems to achieve the aims of abstracting the statistics data from the intermediate information required by the statistics routines, allowing extra/custom data to be passed between the typanalyze function and the statistics algorithm, and allowing the user to have the attr and attrtype structures given to them.
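The proposed calling convention could be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch itself: the Sketch* types are hypothetical stand-ins for the real Relation, Attribute, Type and VacAttrStats definitions, the extra_data field name is invented, and the rule used to skip a column (a zero statistics target) is just a placeholder.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real backend types; illustration only. */
typedef struct { unsigned int relid; } SketchRelation;
typedef struct { int attnum; int attstattarget; } SketchAttribute;
typedef struct { unsigned int typid; } SketchType;

/* Reduced stand-in for VacAttrStats: copies of attr and attrtype, plus the
 * proposed new field holding the datatype-specific custom data. */
typedef struct
{
    SketchAttribute attr;
    SketchType      attrtype;
    void           *extra_data;   /* assumed field name; may be NULL */
} SketchVacAttrStats;

/* Returns custom data for the statistics routine (possibly NULL) and reports
 * through *do_analyze whether the column should be analyzed at all. */
static void *
sketch_examine_attribute(const SketchRelation *rel,
                         const SketchAttribute *attr,
                         const SketchType *type,
                         bool *do_analyze)
{
    (void) rel;
    (void) type;

    /* Placeholder rule: skip columns whose statistics target is zero. */
    if (attr->attstattarget == 0)
    {
        *do_analyze = false;
        return NULL;
    }

    /* "No custom data, but the analyze function should still be called." */
    *do_analyze = true;
    return NULL;
}

/* Caller side: build the stats structure only when the bool comes back true. */
static SketchVacAttrStats *
sketch_prepare_column(const SketchRelation *rel,
                      const SketchAttribute *attr,
                      const SketchType *type)
{
    bool                do_analyze = false;
    void               *extra;
    SketchVacAttrStats *stats;

    extra = sketch_examine_attribute(rel, attr, type, &do_analyze);
    if (!do_analyze)
        return NULL;              /* column is ignored */

    stats = malloc(sizeof(*stats));
    if (stats == NULL)
        return NULL;              /* out of memory; real code would report an error */

    stats->attr = *attr;          /* copy the Attribute information */
    stats->attrtype = *type;      /* copy the Type information */
    stats->extra_data = extra;    /* later handed to compute_*_stats() */
    return stats;
}
```

In this sketch the bool carries the analyze/skip decision, so a NULL return value is free to mean simply that there is no custom data.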
The only thing I don't really like about this is providing a pointer to a bool in examine_attribute(). However, it is needed to distinguish between a NULL meaning 'I have no custom data but the analyze function should still be called' and a NULL meaning 'This column should not be analyzed'. I can't think of a better solution at the moment.

> > I'm beginning to think that perhaps we're looking at this in the wrong
> > way, and that a more elegant version of what you're suggesting could
> > be implemented using a major/minor method of identifying a statistics
> > type.
>
> If you suppose that the "major" field is the upper bits of the
> statistics ID value, then this is just a slightly different way of
> thinking about the range-based allocation method I suggested before.
> However, the range-based method can adapt to allocating different
> amounts of identifier space to different owners, whereas a major/minor
> approach can't easily do that since you've defined it to be 2^N minor
> IDs for each major code.

I was thinking perhaps in terms of an extra staowner int2 field in pg_statistic, where the owner IDs are allocated by the PGDG. Each group/project would then need only one owner ID allocated to it, and would have the existing 2^16 stakind space to organise for itself. The advantage of this is that projects can allocate their own stakind values, implementing new or improved statistics algorithms without having to wait on a new allocation from the PGDG.

Many thanks,

Mark.

---

Mark Cave-Ayland
Webbased Ltd.
Tamar Science Park
Derriford
Plymouth
PL6 8BX
England

Tel: +44 (0)1752 764445
Fax: +44 (0)1752 764446

This email and any attachments are confidential to the intended recipient and may also be privileged. If you are not the intended recipient please delete it from your system and notify the sender. You should not copy it or use it for any purpose nor disclose or distribute its contents to any other person.
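The two identifier schemes discussed above can be compared with a small sketch. Everything here is hypothetical: the function and type names are invented for illustration, and the 16/16 bit split in the packed form is an assumption chosen to match an int2 owner and an int2 kind.

```c
#include <stdbool.h>
#include <stdint.h>

/* Major/minor packed into a single ID: the owner ("major") occupies the
 * upper 16 bits, so each owner gets a fixed range of 2^16 kinds. */
static uint32_t
sketch_pack_stat_id(uint16_t owner, uint16_t kind)
{
    return ((uint32_t) owner << 16) | (uint32_t) kind;
}

static uint16_t
sketch_stat_id_owner(uint32_t id)
{
    return (uint16_t) (id >> 16);
}

static uint16_t
sketch_stat_id_kind(uint32_t id)
{
    return (uint16_t) (id & 0xFFFFu);
}

/* Separate-field form: an extra staowner column next to stakind. */
typedef struct
{
    int16_t staowner;   /* allocated centrally, one per project */
    int16_t stakind;    /* allocated by the owning project */
} SketchStatKey;

static bool
sketch_same_statistic(SketchStatKey a, SketchStatKey b)
{
    return a.staowner == b.staowner && a.stakind == b.stakind;
}
```

Either way each owner ends up with a fixed-size block of kinds, which is the limitation noted in the quoted reply: a range-based scheme could hand out blocks of different sizes instead.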