Thread: Performance issue in foreign-key-aware join estimation
In connection with David Rowley's proposal to change bitmapset.c to use 64-bit words, I dug out an old test case I had for a complex-to-plan query (attached). Andres Freund posted this to the lists perhaps ten years ago, though I can't locate the original posting right now. I was distressed to discover via perf that 69% of the runtime of this test now goes into match_eclasses_to_foreign_key_col(). That seems clearly unacceptable. There aren't an unreasonable number of foreign key constraints in the underlying schema --- mostly one per table, and there's only one table with as many as 3. However, poking around in the planner data structures, it turns out there are: 888 base relations 1005 EquivalenceClasses 167815 fkey_list entries initially 690 fkey_list entries after match_foreign_keys_to_quals trims them So the reason match_eclasses_to_foreign_key_col is so dominant in the runtime is it's invoked 167815 times and has to scan a thousand EquivalenceClasses (unsuccessfully) on most of those calls. How did the fkey_list get that big? I think the issue is that the query touches the same tables many many times (888 baserels, but there are only 20 distinct tables in the schema) and we get an O(N^2) growth in the apparent number of FKs. Clearly, we ought to rethink that data structure. I'm not sure offhand how to make it better, but this is pretty awful. Perhaps there'd also be some use in having better indexing for the EquivalenceClass list, but again I'm not sure what that'd look like. regards, tom lane \set ON_ERROR_STOP on DROP SCHEMA IF EXISTS test_data CASCADE; DROP SCHEMA IF EXISTS test_view CASCADE; CREATE SCHEMA test_data; CREATE SCHEMA test_view; SET SEARCH_PATH = test_data, test_view; CREATE TABLE proband ( proband_id bigserial PRIMARY KEY ); CREATE TABLE sample ( sample_id bigserial PRIMARY KEY ); CREATE TABLE proband__sample ( proband_id bigint NOT NULL REFERENCES proband, sample_id bigint NOT NULL REFERENCES sample, UNIQUE(proband_id, sample_id) ); CREATE TABLE project ( project_id bigserial PRIMARY KEY, name text NOT NULL UNIQUE ); /* * Stuff like: * -Double Entry * -Single Entry * -Machine Read * - ... */ CREATE TABLE data_quality ( data_quality_id bigserial PRIMARY KEY, name text NOT NULL UNIQUE, description text ); CREATE TABLE information_set ( information_set_id bigserial PRIMARY KEY, name text NOT NULL UNIQUE, description text ); CREATE TABLE information ( information_id bigserial PRIMARY KEY, information_set_id bigint NOT NULL REFERENCES information_set, name text NOT NULL UNIQUE, description text ); CREATE INDEX information__information_set_id ON information (information_set_id); CREATE TABLE information_set_instance ( information_set_instance_id bigserial PRIMARY KEY, information_set_id bigint NOT NULL REFERENCES information_set ); CREATE INDEX information_set_instance__information_set_id ON information_set_instance (information_set_id); CREATE TABLE information_set_instance__proband ( information_set_instance_id bigint NOT NULL REFERENCES information_set_instance, proband_id bigint NOT NULL REFERENCES proband, UNIQUE (information_set_instance_id, proband_id) ); CREATE INDEX information_set_instance__proband__information_set_id ON information_set_instance__proband (information_set_instance_id); CREATE INDEX information_set_instance__proband__proband_id ON information_set_instance__proband (proband_id); CREATE TABLE information_set_instance__sample ( information_set_instance_id bigint NOT NULL REFERENCES information_set_instance, sample_id bigint NOT NULL REFERENCES sample, UNIQUE (information_set_instance_id, sample_id) ); CREATE INDEX information_set_instance__sample__information_set_id ON information_set_instance__sample (information_set_instance_id); CREATE INDEX information_set_instance__sample__sample_id ON information_set_instance__sample (sample_id); CREATE TABLE information_instance ( information_instance_id bigserial PRIMARY KEY, information_set_instance_id bigint NOT NULL REFERENCES information_set_instance, information_id bigint NOT NULL REFERENCES information, data_quality_id int REFERENCES data_quality ); CREATE INDEX information_instance__information_set_instance_id ON information_set_instance(information_set_instance_id); CREATE INDEX information_instance__information_id ON information(information_id); CREATE TABLE information_about_tnm ( information_about_tnm bigserial PRIMARY KEY, information_instance_id bigint REFERENCES information_instance, t int, n int, m int ); CREATE INDEX information_about_tnm__information_instance_id ON information_about_tnm(information_instance_id); CREATE TABLE information_about_sequenced_data ( information_about_sequenced_data bigserial PRIMARY KEY, information_instance_id bigint REFERENCES information_instance, filename text, data text ); CREATE INDEX information_about_sequenced_data__information_instance_id ON information_about_sequenced_data(information_instance_id); CREATE TABLE information_about_time ( information_about_sequenced_data bigserial PRIMARY KEY, information_instance_id bigint REFERENCES information_instance, data timestamp ); CREATE INDEX information_about_time__information_instance_id ON information_about_time(information_instance_id); CREATE TABLE information_about_text ( information_about_text_data bigserial PRIMARY KEY, information_instance_id bigint REFERENCES information_instance, data text ); CREATE INDEX information_about_text__information_instance_id ON information_about_text(information_instance_id); /* * Normally refers to a table containing icd10 classiffication codes */ CREATE TABLE information_about_icd10_classification ( information_about_icd10_classification_data bigserial PRIMARY KEY, information_instance_id bigint REFERENCES information_instance, data int ); CREATE INDEX information_about_icd10_classification__information_instance_id ON information_about_icd10_classification(information_instance_id); CREATE TABLE information_about_location ( information_about_allowance bigserial PRIMARY KEY, information_instance_id bigint REFERENCES information_instance, street text, building text -- Additional Attributes ); CREATE INDEX information_about_location__information_instance_id ON information_about_location(information_instance_id); CREATE TABLE information_about_allowance ( information_about_allowance bigserial PRIMARY KEY, information_instance_id bigint REFERENCES information_instance, granted bool ); CREATE INDEX information_about_allowance__information_instance_id ON information_about_allowance(information_instance_id); CREATE TABLE information_about_clinic ( information_about_clinic bigserial PRIMARY KEY, information_instance_id bigint REFERENCES information_instance, clinic_id int ); CREATE INDEX information_about_clinic__information_instance_id ON information_about_allowance(information_instance_id); CREATE TABLE information_about_personel ( information_about_personel bigserial PRIMARY KEY, information_instance_id bigint REFERENCES information_instance, name text ); CREATE INDEX information_about_personel__information_instance_id ON information_about_allowance(information_instance_id); /* * Lots of other * information_about_* tables */ CREATE TABLE information_about_placeholder ( information_about_personel bigserial PRIMARY KEY, information_instance_id bigint REFERENCES information_instance, data text ); CREATE INDEX information_about_placeholder__information_instance_id ON information_about_allowance(information_instance_id); /* * Views */ DROP SCHEMA IF EXISTS test_view CASCADE; CREATE SCHEMA test_view; SET SEARCH_PATH = test_view, test_data; CREATE OR REPLACE VIEW information_set_completition_status AS SELECT information_set.information_set_id, information_set_instance.information_set_instance_id, COUNT(information.information_id) AS possible_information_nr, COUNT(information_instance.information_id) AS available_information_nr FROM information_set_instance JOIN information_set USING(information_set_id) JOIN information USING(information_set_id) LEFT JOIN information_instance USING(information_id) GROUP BY information_set.information_set_id, information_set_instance.information_set_instance_id; CREATE OR REPLACE VIEW information_generic_allowance AS SELECT information_set_instance__proband.proband_id ,information_set.information_set_id ,information_set_instance.information_set_instance_id ,generic_allowance.granted AS generic_allowance_granted ,anonymized_use_noncommercial_research.granted AS anonymized_use_noncommercial_research_granted ,anonymized_use_commercial_research.granted AS anonymized_use_commercial_research_granted ,contact_in_noncommercial_research.granted AS contact_in_noncommercial_research_granted ,contact_in_commercial_research.granted AS contact_in_commercial_research_granted ,information_6.granted AS information_6_granted ,information_7.granted AS information_7_granted ,information_8.granted AS information_8_granted ,information_9.granted AS information_9_granted ,information_10.granted AS information_10_granted ,information_11.granted AS information_11_granted FROM information_set_instance__proband JOIN information_set_instance USING (information_set_instance_id) JOIN information_set USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'generic_allowance' ) AS generic_allowance USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'anonymized_use_noncommercial_research' ) AS anonymized_use_noncommercial_research USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'anonymized_use_commercial_research' ) AS anonymized_use_commercial_research USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'contact_in_noncommercial_research' ) AS contact_in_noncommercial_research USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'contact_in_commercial_research' ) AS contact_in_commercial_research USING (information_set_id) /* * All further joins have pointless names to ease the writing */ LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_5' ) AS information_6 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_6' ) AS information_7 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_7' ) AS information_8 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_8' ) AS information_9 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_9' ) AS information_10 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_10' ) AS information_11 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_11' ) AS information_12 USING (information_set_id) WHERE TRUE AND information_set.name = 'generic_allowance_v3'; CREATE OR REPLACE VIEW information_genetic_allowance AS SELECT information_set_instance__proband.proband_id ,information_set.information_set_id ,information_set_instance.information_set_instance_id ,notification_incurable_illness_found.granted AS notification_incurable_illness_found_granted ,notification_curable_illness_found.granted AS notification_curable_illness_found_granted ,notification_inheritable_illness_found.granted AS notification_inheritable_illness_found_granted ,use_of_genetic_data_if_ill_internal.granted AS use_of_genetic_data_if_ill_internal_granted ,use_of_genetic_data_if_ill_external.granted AS use_of_genetic_data_if_ill_external_granted ,information_6.granted AS information_6_granted ,information_7.granted AS information_7_granted ,information_8.granted AS information_8_granted ,information_9.granted AS information_9_granted ,information_10.granted AS information_10_granted ,information_11.granted AS information_11_granted ,information_12.granted AS information_12_granted ,information_13.granted AS information_13_granted ,information_14.granted AS information_14_granted ,information_15.granted AS information_15_granted ,information_16.granted AS information_16_granted ,information_17.granted AS information_17_granted ,information_18.granted AS information_18_granted ,information_19.granted AS information_19_granted ,information_20.granted AS information_20_granted ,information_21.granted AS information_21_granted FROM information_set_instance__proband JOIN information_set_instance USING (information_set_instance_id) JOIN information_set USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'notification_incurable_illness_found' ) AS notification_incurable_illness_found USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'notification_curable_illness_found' ) AS notification_curable_illness_found USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'notification_inheritable_illness_found' ) AS notification_inheritable_illness_found USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'use_of_genetic_data_if_ill_internal' ) AS use_of_genetic_data_if_ill_internal USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'use_of_genetic_data_if_ill_external' ) AS use_of_genetic_data_if_ill_external USING (information_set_id) /* * All further joins have pointless names to ease the writing */ LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_6' ) AS information_6 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_7' ) AS information_7 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_8' ) AS information_8 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_9' ) AS information_9 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_10' ) AS information_10 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_11' ) AS information_11 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_12' ) AS information_12 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_13' ) AS information_13 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_14' ) AS information_14 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_15' ) AS information_15 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_16' ) AS information_16 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_17' ) AS information_17 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_18' ) AS information_18 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_19' ) AS information_19 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_20' ) AS information_20 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_allowance.granted FROM information JOIN information_instance USING (information_id) JOIN information_about_allowance USING (information_instance_id) WHERE true AND information.name = 'information_21' ) AS information_21 USING (information_set_id) WHERE TRUE AND information_set.name = 'genetic_allowance_v3'; CREATE OR REPLACE VIEW information_patient_diagnosis AS SELECT information_set_instance__proband.proband_id ,information_set.information_set_id ,information_set_instance.information_set_instance_id ,diagnosis_icd10.data AS diagnosis_icd10 FROM information_set_instance__proband JOIN information_set_instance USING (information_set_instance_id) JOIN information_set USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_icd10_classification.data FROM information JOIN information_instance USING (information_id) JOIN information_about_icd10_classification USING (information_instance_id) WHERE true AND information.name = 'diagnosis' ) AS diagnosis_icd10 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_time.data FROM information JOIN information_instance USING (information_id) JOIN information_about_time USING (information_instance_id) WHERE true AND information.name = 'diagnosis_date' ) AS diagnosis_date USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_clinic.clinic_id FROM information JOIN information_instance USING (information_id) JOIN information_about_clinic USING (information_instance_id) WHERE true AND information.name = 'diagnosing_clinic' ) AS diagnosis_clinic USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_personel.name FROM information JOIN information_instance USING (information_id) JOIN information_about_personel USING (information_instance_id) WHERE true AND information.name = 'diagnosing_personel' ) AS diagnosis_personel USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_1' ) AS information_1 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_2' ) AS information_2 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_3' ) AS information_3 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_4' ) AS information_4 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_5' ) AS information_5 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_6' ) AS information_6 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_7' ) AS information_7 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_8' ) AS information_8 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_9' ) AS information_9 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_10' ) AS information_10 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_11' ) AS information_11 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_12' ) AS information_12 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_13' ) AS information_13 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_14' ) AS information_14 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_15' ) AS information_15 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_16' ) AS information_16 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_17' ) AS information_17 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_18' ) AS information_18 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_19' ) AS information_19 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_20' ) AS information_20 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_21' ) AS information_21 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_22' ) AS information_22 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_23' ) AS information_23 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_24' ) AS information_24 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_25' ) AS information_25 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_26' ) AS information_26 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_27' ) AS information_27 USING (information_set_id) WHERE TRUE AND information_set.name = 'patient_diagnosis_plain'; CREATE OR REPLACE VIEW information_patient_placeholder AS SELECT information_set_instance__proband.proband_id ,information_set.information_set_id ,information_set_instance.information_set_instance_id ,information_1.data AS information_1_data ,information_2.data AS information_2_data ,information_3.data AS information_3_data ,information_4.data AS information_4_data ,information_5.data AS information_5_data ,information_6.data AS information_6_data ,information_7.data AS information_7_data ,information_8.data AS information_8_data ,information_9.data AS information_9_data ,information_10.data AS information_10_data ,information_11.data AS information_11_data ,information_12.data AS information_12_data ,information_13.data AS information_13_data ,information_14.data AS information_14_data ,information_15.data AS information_15_data ,information_16.data AS information_16_data ,information_17.data AS information_17_data ,information_18.data AS information_18_data ,information_19.data AS information_19_data ,information_20.data AS information_20_data ,information_21.data AS information_21_data ,information_22.data AS information_22_data ,information_23.data AS information_23_data ,information_24.data AS information_24_data ,information_25.data AS information_25_data ,information_26.data AS information_26_data ,information_27.data AS information_27_data FROM information_set_instance__proband JOIN information_set_instance USING (information_set_instance_id) JOIN information_set USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_1' ) AS information_1 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_2' ) AS information_2 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_3' ) AS information_3 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_4' ) AS information_4 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_5' ) AS information_5 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_6' ) AS information_6 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_7' ) AS information_7 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_8' ) AS information_8 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_9' ) AS information_9 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_10' ) AS information_10 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_11' ) AS information_11 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_12' ) AS information_12 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_13' ) AS information_13 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_14' ) AS information_14 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_15' ) AS information_15 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_16' ) AS information_16 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_17' ) AS information_17 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_18' ) AS information_18 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_19' ) AS information_19 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_20' ) AS information_20 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_21' ) AS information_21 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_22' ) AS information_22 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_23' ) AS information_23 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_24' ) AS information_24 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_25' ) AS information_25 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_26' ) AS information_26 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_27' ) AS information_27 USING (information_set_id) WHERE TRUE AND information_set.name = 'patient_placeholder'; CREATE OR REPLACE VIEW information_sample_placeholder AS SELECT information_set_instance__sample.sample_id ,information_set.information_set_id ,information_set_instance.information_set_instance_id ,information_1.data AS information_1_data ,information_2.data AS information_2_data ,information_3.data AS information_3_data ,information_4.data AS information_4_data ,information_5.data AS information_5_data ,information_6.data AS information_6_data ,information_7.data AS information_7_data ,information_8.data AS information_8_data ,information_9.data AS information_9_data ,information_10.data AS information_10_data ,information_11.data AS information_11_data ,information_12.data AS information_12_data ,information_13.data AS information_13_data ,information_14.data AS information_14_data ,information_15.data AS information_15_data ,information_16.data AS information_16_data ,information_17.data AS information_17_data ,information_18.data AS information_18_data ,information_19.data AS information_19_data ,information_20.data AS information_20_data ,information_21.data AS information_21_data ,information_22.data AS information_22_data ,information_23.data AS information_23_data ,information_24.data AS information_24_data ,information_25.data AS information_25_data ,information_26.data AS information_26_data ,information_27.data AS information_27_data FROM information_set_instance__sample JOIN information_set_instance USING (information_set_instance_id) JOIN information_set USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_1' ) AS information_1 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_2' ) AS information_2 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_3' ) AS information_3 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_4' ) AS information_4 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_5' ) AS information_5 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_6' ) AS information_6 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_7' ) AS information_7 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_8' ) AS information_8 USING (information_set_id) JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_9' ) AS information_9 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_10' ) AS information_10 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_11' ) AS information_11 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_12' ) AS information_12 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_13' ) AS information_13 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_14' ) AS information_14 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_15' ) AS information_15 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_16' ) AS information_16 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_17' ) AS information_17 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_18' ) AS information_18 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_19' ) AS information_19 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_20' ) AS information_20 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_21' ) AS information_21 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_22' ) AS information_22 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_23' ) AS information_23 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_24' ) AS information_24 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_25' ) AS information_25 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_26' ) AS information_26 USING (information_set_id) LEFT JOIN ( SELECT information.information_set_id, information_about_placeholder.data FROM information JOIN information_instance USING (information_id) JOIN information_about_placeholder USING (information_instance_id) WHERE true AND information.name = 'information_27' ) AS information_27 USING (information_set_id) WHERE TRUE AND information_set.name = 'sample_placeholder'; SET SEARCH_PATH = test_data, test_view; /* * moderately complex query */ explain SELECT * FROM proband JOIN proband__sample USING(proband_id) JOIN sample USING (sample_id) JOIN information_generic_allowance USING (proband_id) JOIN information_genetic_allowance USING (proband_id) JOIN information_patient_diagnosis diag_1 USING (proband_id) JOIN information_patient_diagnosis diag_2 USING (proband_id) JOIN information_patient_placeholder patient_histology USING (proband_id) JOIN information_sample_placeholder sample_rna_experiment_1_status USING (sample_id) JOIN information_sample_placeholder sample_rna_experiment_1_data USING (sample_id) JOIN information_sample_placeholder sample_rna_experiment_2_status USING (sample_id) JOIN information_sample_placeholder sample_rna_experiment_2_data USING (sample_id) JOIN information_sample_placeholder sample_rna_experiment_3_status USING (sample_id) JOIN information_sample_placeholder sample_rna_experiment_3_data USING (sample_id) WHERE TRUE AND information_generic_allowance.generic_allowance_granted = true AND information_genetic_allowance.information_6_granted = true AND diag_1.diagnosis_icd10 = 1343 /*some icd code*/ AND diag_2.diagnosis_icd10 = 1344 /*another icd code*/ AND ( SELECT available_information_nr / available_information_nr FROM information_set_completition_status WHERE information_set_completition_status.information_set_instance_id = patient_histology.information_set_instance_id ) > 0.8 AND ( SELECT available_information_nr / available_information_nr FROM information_set_completition_status WHERE information_set_completition_status.information_set_instance_id = diag_1.information_set_instance_id ) > 0.5 AND ( SELECT available_information_nr / available_information_nr FROM information_set_completition_status WHERE information_set_completition_status.information_set_instance_id = diag_2.information_set_instance_id ) > 0.5;
On Fri, 21 Dec 2018 at 06:44, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I was distressed to discover via perf that 69% of the runtime of this > test now goes into match_eclasses_to_foreign_key_col(). That seems > clearly unacceptable. Agreed. That's pretty terrible. I looked at this a bit and see that match_eclasses_to_foreign_key_col() is missing any smarts to skip equivalence classes that don't have ec_relids bits for both rels. With that added the run-time is reduced pretty dramatically. I've only tested with a debug build as of now, but I get: Unpatched: $ pgbench -n -T 60 -f query.sql postgres latency average = 18411.604 ms Patched: latency average = 8748.177 ms Going by my profiler this drops match_eclasses_to_foreign_key_col() down to just 10% of total planner time for this query. The new bms_is_member() call is pretty hot inside that function though. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Sat, 22 Dec 2018 at 09:28, David Rowley <david.rowley@2ndquadrant.com> wrote: > Going by my profiler this drops match_eclasses_to_foreign_key_col() > down to just 10% of total planner time for this query. The new > bms_is_member() call is pretty hot inside that function though. I should have said 28% instead of 10%. 10% is the time spent exclusively just in that function (not in a function called by that function). The bms_is_member() call I mention above is about 20% of the total time, which likely includes some other call sites too. Back in [1], I mentioned that I'd like to start moving away from our linked list based implementation of List and start using an array based version instead. If we had this we could easily further improve this code fkey matching code to not even look near equivalence classes that don't contain members for the relations in question with a design something like: 1. Make PlannerInfo->eq_classes an array list instead of an array, this will significantly improve the performance of list_nth(). 2. Have a Bitmapset per relation that indexes which items in eq_classes have ec_members for the given relation. 3. In match_eclasses_to_foreign_key_col() perform a bms_overlap() on the eq_classes index bitmapsets for the relations at either side of the foreign key then perform a bms_next_member() type loop over the result of that in order to skip over the eq_classes items that can't match. Since match_foreign_keys_to_quals() calls match_eclasses_to_foreign_key_col() once for each item in PlannerInfo->fkey_list (167815 items in this case) and again for each foreign key column in those keys, then this should significantly reduce the effort required since we make a pass over *every* equivalence class each time match_eclasses_to_foreign_key_col() gets called. This array list implementation is something I did want to get one for PG12. The height of the bar to do this is pretty high given what was mentioned in [2]. [1] https://www.postgresql.org/message-id/CAKJS1f_2SnXhPVa6eWjzy2O9A%3Docwgd0Cj-LQeWpGtrWqbUSDA%40mail.gmail.com [2] https://www.postgresql.org/message-id/21592.1509632225%40sss.pgh.pa.us -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Sat, 22 Dec 2018 at 10:53, David Rowley <david.rowley@2ndquadrant.com> wrote: > Back in [1], I mentioned that I'd like to start moving away from our > linked list based implementation of List and start using an array > based version instead. If we had this we could easily further improve > this code fkey matching code to not even look near equivalence classes > that don't contain members for the relations in question with a design > something like: > > 1. Make PlannerInfo->eq_classes an array list instead of an array, > this will significantly improve the performance of list_nth(). > 2. Have a Bitmapset per relation that indexes which items in > eq_classes have ec_members for the given relation. > 3. In match_eclasses_to_foreign_key_col() perform a bms_overlap() on > the eq_classes index bitmapsets for the relations at either side of > the foreign key then perform a bms_next_member() type loop over the > result of that in order to skip over the eq_classes items that can't > match. Using the above idea, but rather than going to the trouble of storing PlannerInfo->eq_classes as an array type list, if we build the array on the fly inside match_foreign_keys_to_quals(), then build a Bitmapset type index to mark which of the eclasses contains members for each relation, then I can get the run-time for the function down to just 0.89%. Looking at other functions appearing high on the profile I also see; have_relevant_eclass_joinclause() (14%), generate_join_implied_equalities_for_ecs() (23%). I think if we seriously want to improve planning performance when there are many stored equivalence classes, then we need to have indexing along the lines of what I've outlined above. I've attached the patch I used to test this idea. It might be possible to develop this into something committable, perhaps if we invent a new function in equivclass.c that builds the index into a single struct and we pass a pointer to that down to the functions that require the index. Such a function could also optionally skip indexing eclasses such as ones with ec_has_volatile or ec_has_const when the use case for the index requires ignoring such eclasses. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Mon, 24 Dec 2018 at 09:38, David Rowley <david.rowley@2ndquadrant.com> wrote: > Using the above idea, but rather than going to the trouble of storing > PlannerInfo->eq_classes as an array type list, if we build the array > on the fly inside match_foreign_keys_to_quals(), then build a > Bitmapset type index to mark which of the eclasses contains members > for each relation, then I can get the run-time for the function down > to just 0.89%. Looking at other functions appearing high on the > profile I also see; have_relevant_eclass_joinclause() (14%), > generate_join_implied_equalities_for_ecs() (23%). I've now expanded the proof of concept patch to use this indexing technique for have_relevant_eclass_joinclause() and generate_join_implied_equalities_for_ecs(). With Tom's test from up-thread, I get: Master: latency average = 14125.374 ms Patched: latency average = 2417.164 ms There are some other cases, such as generate_implied_equalities_for_column(), that are possibly also indexable, but in that case, we cannot use ec_relids to help build the index since it does not keep track of other member relation equivalence class members. That function is appearing at about 3.3% of total plan time with the patched version of the code, so there's still some small gains to be had there. The performance of has_relevant_eclass_joinclause() could likely also be improved from this indexing. According to my profiling, it's currently still about 2.6% of total planning time with the patched version. I've attached the updated (rough) proof of concept patch. I ended up stuffing the equivalence class index structure into PlannerInfo so that it would be available in all the places it was required, but build just once higher up the call stack. I don't believe this is the correct solution for a finished patch, but I didn't really have any better ideas and this seemed good enough to demonstrate what the performance could look like. Other ideas I have to further improve the performance of this query would be to move the fkey_list out of PlannerInfo and instead include a per-relation list inside RelOptInfo. This would allow get_foreign_key_join_selectivity() to just look at foreign keys that are relevant to the given relations rather than having to skip all foreign keys that are not. This function is still accounting for about 5.5% of the total planning time for this query. I imagine it wouldn't hurt match_foreign_keys_to_quals() too much to have it loop over each RelOptInfo and look at the foreign keys defined on each of those. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 12/24/18 1:07 PM, David Rowley wrote: > On Mon, 24 Dec 2018 at 09:38, David Rowley <david.rowley@2ndquadrant.com> wrote: >> Using the above idea, but rather than going to the trouble of storing >> PlannerInfo->eq_classes as an array type list, if we build the array >> on the fly inside match_foreign_keys_to_quals(), then build a >> Bitmapset type index to mark which of the eclasses contains members >> for each relation, then I can get the run-time for the function down >> to just 0.89%. Looking at other functions appearing high on the >> profile I also see; have_relevant_eclass_joinclause() (14%), >> generate_join_implied_equalities_for_ecs() (23%). > > I've now expanded the proof of concept patch to use this indexing > technique for have_relevant_eclass_joinclause() and > generate_join_implied_equalities_for_ecs(). With Tom's test from > up-thread, I get: > > Master: > latency average = 14125.374 ms > > Patched: > latency average = 2417.164 ms > Yes, I can confirm these measurements. On my machine, timing on master is about 10530ms, with v1 of the patch it drops to 2600ms and v2 pushes it down to 1610ms. I however observe failures on 4 regression test suites - inherit, equivclass, partition_join and partition_prune (diff attached). That's a bit surprising, because AFAICS the patch merely optimizes the execution and should not change the planning otherwise (all the failures seem to be a query plan changing in some way). I'm not sure if the plans are correct or better than the old ones. The other thing is that we probably should not use a single test case to measure the optimization - I doubt it can improve less extreme queries, but maybe we should verify it does not regress them? With the patch attached, bms_overlap gets quite high in the profiles. I think one reason for that is that all bitmap operations (not just bms_overlap) always start the loop at 0. For the upper boundary is usually determined as Min() of the lengths, but we don't do that for lower boundary because we don't track that. The bitmaps for eclasses are likely sparse, so this is quite painful. Attaches is a simple patch that adds tracking of "minword" and uses it in various bms_ methods to skip initial part of the loop. On my machine this reduces the timings by roughly 5% (from 1610 to 1530 ms). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Tue, 25 Dec 2018 at 13:46, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > I however observe failures on 4 regression test suites - inherit, > equivclass, partition_join and partition_prune (diff attached). That's a > bit surprising, because AFAICS the patch merely optimizes the execution > and should not change the planning otherwise (all the failures seem to > be a query plan changing in some way). I'm not sure if the plans are > correct or better than the old ones. Seems I didn't run the tests after doing a last-minute move of the create_eclass_index() call in make_one_rel(). I'd previously had it just below set_base_rel_pathlists(), which meant that the index didn't exist in some cases so it would fall back on the original code. It appears the use case call from set_base_rel_pathlists() require is_child eclass members to be indexed too, but these were not in v1 or v2 since ec_relids does not record child rel members. It seems simple enough to fix this just by adding ec_allrelids and setting it for all members rather than just non-children members. This also allows me to add the two additional cases to allow generate_implied_equalities_for_column() and has_relevant_eclass_joinclause() to also make use of the eclass index. This further reduces the total planning time for the query on my machine to 2304 ms. > The other thing is that we probably should not use a single test case to > measure the optimization - I doubt it can improve less extreme queries, > but maybe we should verify it does not regress them? Agreed. That still needs to be verified. Although I'm yet unsure what sort of form we could use this idea in. I wonder if it's fine to create this eclass index on the fly, or if it's something we should keep maintained all the time. The problem I saw with keeping it all the time is down to eq_classes being a List and list_nth() is slow, which means the bms_next_member() loops I've added would perform poorly when compiled with a linked list lookup. > With the patch attached, bms_overlap gets quite high in the profiles. I > think one reason for that is that all bitmap operations (not just > bms_overlap) always start the loop at 0. For the upper boundary is > usually determined as Min() of the lengths, but we don't do that for > lower boundary because we don't track that. The bitmaps for eclasses are > likely sparse, so this is quite painful. Attaches is a simple patch that > adds tracking of "minword" and uses it in various bms_ methods to skip > initial part of the loop. On my machine this reduces the timings by > roughly 5% (from 1610 to 1530 ms). Seems like an interesting idea, although for the most part, Bitmapsets are small and this adds some overhead to all use cases. I doubt it is worth the trouble for bms_is_member(), since that does not need to perform a loop over each bitmapword. It looks like most of the bms_overlap() usages are caused by functions like have_join_order_restriction(), join_is_legal(), check_outerjoin_delay() and add_paths_to_joinrel(), all of which are performing a loop over the join_info_list. Perhaps that could be indexed a similar way to the eq_classes List. I'd imagine there's about another 20-30% of performance to squeeze out of this by doing that, plus a bit more by making the fkey_list per RelOptInfo. Attached is v3 of the hacked up proof of concept performance demo patch. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 12/25/18 3:48 AM, David Rowley wrote: > On Tue, 25 Dec 2018 at 13:46, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: >> I however observe failures on 4 regression test suites - inherit, >> equivclass, partition_join and partition_prune (diff attached). That's a >> bit surprising, because AFAICS the patch merely optimizes the execution >> and should not change the planning otherwise (all the failures seem to >> be a query plan changing in some way). I'm not sure if the plans are >> correct or better than the old ones. > > Seems I didn't run the tests after doing a last-minute move of the > create_eclass_index() call in make_one_rel(). I'd previously had it > just below set_base_rel_pathlists(), which meant that the index didn't > exist in some cases so it would fall back on the original code. It > appears the use case call from set_base_rel_pathlists() require > is_child eclass members to be indexed too, but these were not in v1 or > v2 since ec_relids does not record child rel members. > > It seems simple enough to fix this just by adding ec_allrelids and > setting it for all members rather than just non-children members. > This also allows me to add the two additional cases to allow > generate_implied_equalities_for_column() and > has_relevant_eclass_joinclause() to also make use of the eclass index. > This further reduces the total planning time for the query on my > machine to 2304 ms. > OK, that makes sense. >> The other thing is that we probably should not use a single test case to >> measure the optimization - I doubt it can improve less extreme queries, >> but maybe we should verify it does not regress them? > > Agreed. That still needs to be verified. Although I'm yet unsure what > sort of form we could use this idea in. I wonder if it's fine to > create this eclass index on the fly, or if it's something we should > keep maintained all the time. The problem I saw with keeping it all > the time is down to eq_classes being a List and list_nth() is slow, > which means the bms_next_member() loops I've added would perform > poorly when compiled with a linked list lookup. > Yeah, good questions. I think the simplest thing we could do is building them on the first access - that would at least ensure we don't build the index without accessing it at least once. Of course, it might still be faster to do the check directly, for small numbers of eclasses / fkeys and rels. Perhaps there's some sort of heuristics deciding when to build the indexes, but that seems like an overkill at this point. What I propose is constructing a "minimal" simple query invoking this code (something like two rels with a join on a foreign key) and measure impact on that. That seems like a fairly realistic use case. ALso, I wonder what is te goal here - how much do we need to redude the duration to be happy? Initially I'd say "on par with the state before the FK patch" but I guess we're already optimizing beyond that point. What was the timing before adding the FK stuff? >> With the patch attached, bms_overlap gets quite high in the profiles. I >> think one reason for that is that all bitmap operations (not just >> bms_overlap) always start the loop at 0. For the upper boundary is >> usually determined as Min() of the lengths, but we don't do that for >> lower boundary because we don't track that. The bitmaps for eclasses are >> likely sparse, so this is quite painful. Attaches is a simple patch that >> adds tracking of "minword" and uses it in various bms_ methods to skip >> initial part of the loop. On my machine this reduces the timings by >> roughly 5% (from 1610 to 1530 ms). > > Seems like an interesting idea, although for the most part, Bitmapsets > are small and this adds some overhead to all use cases. I doubt it is > worth the trouble for bms_is_member(), since that does not need to > perform a loop over each bitmapword. Perhaps. The bitmapsets are generally small in number of members, but the values may be actually quite high in some cases (I don't have any exact statistics, though). You're right it adds a bit of overhead, but I'd expect that to be outweighted by the reduction of cache misses. But 5% is fairly close to noise. > It looks like most of the bms_overlap() usages are caused by > functions like have_join_order_restriction(), join_is_legal(), > check_outerjoin_delay() and add_paths_to_joinrel(), all of which are > performing a loop over the join_info_list. Perhaps that could be > indexed a similar way to the eq_classes List. I'd imagine there's > about another 20-30% of performance to squeeze out of this by doing > that, plus a bit more by making the fkey_list per RelOptInfo. > Looks promising. > Attached is v3 of the hacked up proof of concept performance demo patch. > regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, 26 Dec 2018 at 09:50, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > Yeah, good questions. I think the simplest thing we could do is building > them on the first access - that would at least ensure we don't build the > index without accessing it at least once. I think we first need to focus on what is back-patchable here. The problem I see with the equivalence class index idea is that it would require passing the index down into match_eclasses_to_foreign_key_col() which is not a static function, so we can't really go changing its signature on a backbranch. Another idea would be to create a new version of match_eclasses_to_foreign_key_col() which uses the index, which would mean we'd not break any extensions that might happen to use match_eclasses_to_foreign_key_col(). Ideally, the quick fix in the v1 patch would be good enough for the backbranches, but a quick bit of benchmarking shows that there's still a big regression to what the performance is like without the foreign keys. (Average of EXPLAIN over 60 seconds) foreign key qual matching code commented out: 2486.204 ms Master: 13909.551 ms v1 patch: 7310.719 ms -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Sat, 22 Dec 2018 at 09:28, David Rowley <david.rowley@2ndquadrant.com> wrote: > > On Fri, 21 Dec 2018 at 06:44, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I was distressed to discover via perf that 69% of the runtime of this > > test now goes into match_eclasses_to_foreign_key_col(). That seems > > clearly unacceptable. > > Agreed. That's pretty terrible. > > I looked at this a bit and see that > match_eclasses_to_foreign_key_col() is missing any smarts to skip > equivalence classes that don't have ec_relids bits for both rels. With > that added the run-time is reduced pretty dramatically. > > I've only tested with a debug build as of now, but I get: > > Unpatched: > > $ pgbench -n -T 60 -f query.sql postgres > latency average = 18411.604 ms > > Patched: > latency average = 8748.177 ms So that this does not get lost, I've added an entry for the original patch for the March commitfest. While the patch does not bring the performance back to what it was before this code was added, it makes a massive dent in the additional overhead. https://commitfest.postgresql.org/22/1984/ -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Tue, 5 Feb 2019 at 22:43, David Rowley <david.rowley@2ndquadrant.com> wrote: > So that this does not get lost, I've added an entry for the original > patch for the March commitfest. Attaching the original patch again so the commitfest bot gets off my back about the other one not applying. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
David Rowley <david.rowley@2ndquadrant.com> writes: > Attaching the original patch again so the commitfest bot gets off my > back about the other one not applying. Pushed that one. I'm interested by the "POC" patch, but I agree that it'd take some research to show that it isn't a net negative for simple queries. It sounds like you're not really interested in pursuing that right now? Anyway, I rebased the POC patch up to HEAD, just in case anyone still wants to play with it. regards, tom lane diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 65302fe..2ffa00c 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -2351,6 +2351,7 @@ _outEquivalenceClass(StringInfo str, const EquivalenceClass *node) WRITE_NODE_FIELD(ec_sources); WRITE_NODE_FIELD(ec_derives); WRITE_BITMAPSET_FIELD(ec_relids); + WRITE_BITMAPSET_FIELD(ec_allrelids); WRITE_BOOL_FIELD(ec_has_const); WRITE_BOOL_FIELD(ec_has_volatile); WRITE_BOOL_FIELD(ec_below_outer_join); diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 0debac7..c5888b8 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -216,6 +216,8 @@ make_one_rel(PlannerInfo *root, List *joinlist) } root->total_table_pages = total_pages; + root->eq_index = create_eclass_index(root, EQUIVALANCE_IDX_MULTI_MEMBER); + /* * Generate access paths for each base rel. */ @@ -226,6 +228,9 @@ make_one_rel(PlannerInfo *root, List *joinlist) */ rel = make_rel_from_joinlist(root, joinlist); + free_eclass_index(root->eq_index); + root->eq_index = NULL; + /* * The result should join all and only the query's base rels. */ diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 61b5b11..570b42f 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -358,6 +358,7 @@ process_equivalence(PlannerInfo *root, ec1->ec_sources = list_concat(ec1->ec_sources, ec2->ec_sources); ec1->ec_derives = list_concat(ec1->ec_derives, ec2->ec_derives); ec1->ec_relids = bms_join(ec1->ec_relids, ec2->ec_relids); + ec1->ec_allrelids = bms_join(ec1->ec_allrelids, ec2->ec_allrelids); ec1->ec_has_const |= ec2->ec_has_const; /* can't need to set has_volatile */ ec1->ec_below_outer_join |= ec2->ec_below_outer_join; @@ -372,6 +373,7 @@ process_equivalence(PlannerInfo *root, ec2->ec_sources = NIL; ec2->ec_derives = NIL; ec2->ec_relids = NULL; + ec2->ec_allrelids = NULL; ec1->ec_sources = lappend(ec1->ec_sources, restrictinfo); ec1->ec_below_outer_join |= below_outer_join; ec1->ec_min_security = Min(ec1->ec_min_security, @@ -432,6 +434,7 @@ process_equivalence(PlannerInfo *root, ec->ec_sources = list_make1(restrictinfo); ec->ec_derives = NIL; ec->ec_relids = NULL; + ec->ec_allrelids = NULL; ec->ec_has_const = false; ec->ec_has_volatile = false; ec->ec_below_outer_join = below_outer_join; @@ -572,6 +575,7 @@ add_eq_member(EquivalenceClass *ec, Expr *expr, Relids relids, { ec->ec_relids = bms_add_members(ec->ec_relids, relids); } + ec->ec_allrelids = bms_add_members(ec->ec_allrelids, relids); ec->ec_members = lappend(ec->ec_members, em); return em; @@ -708,6 +712,7 @@ get_eclass_for_sort_expr(PlannerInfo *root, newec->ec_sources = NIL; newec->ec_derives = NIL; newec->ec_relids = NULL; + newec->ec_allrelids = NULL; newec->ec_has_const = false; newec->ec_has_volatile = contain_volatile_functions((Node *) expr); newec->ec_below_outer_join = false; @@ -1114,6 +1119,60 @@ generate_join_implied_equalities_for_ecs(PlannerInfo *root, nominal_join_relids = join_relids; } + if (root->eq_index != NULL) + { + Bitmapset *inner_ecs = NULL; + Bitmapset *join_ecs = NULL; + Bitmapset *matching_ecs; + int i; + + Assert((root->eq_index->ei_flags & EQUIVALANCE_IDX_MULTI_MEMBER)); + + i = -1; + while ((i = bms_next_member(inner_relids, i)) > 0) + inner_ecs = bms_add_members(inner_ecs, root->eq_index->ei_index[i]); + + i = -1; + while ((i = bms_next_member(join_relids, i)) > 0) + join_ecs = bms_add_members(join_ecs, root->eq_index->ei_index[i]); + + /* Determine all eclasses in common between inner rels and join rels */ + matching_ecs = bms_int_members(inner_ecs, join_ecs); + + i = -1; + while ((i = bms_next_member(matching_ecs, i)) >= 0) + { + EquivalenceClass *ec = root->eq_index->ei_classes[i]; + List *sublist = NIL; + + /* ECs containing consts do not need any further enforcement */ + if (ec->ec_has_const) + continue; + + /* Ensure the class contains members for any of nominal_join_relids */ + Assert(bms_overlap(ec->ec_relids, nominal_join_relids)); + + if (!ec->ec_broken) + sublist = generate_join_implied_equalities_normal(root, + ec, + join_relids, + outer_relids, + inner_relids); + + /* Recover if we failed to generate required derived clauses */ + if (ec->ec_broken) + sublist = generate_join_implied_equalities_broken(root, + ec, + nominal_join_relids, + outer_relids, + nominal_inner_relids, + inner_rel); + + result = list_concat(result, sublist); + } + return result; + } + foreach(lc, eclasses) { EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc); @@ -2026,6 +2085,8 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2) * we ignore that fine point here.) This is much like exprs_known_equal, * except that we insist on the comparison operator matching the eclass, so * that the result is definite not approximate. + * + * An eq_index without volatile classes must already exist on 'root'. */ EquivalenceClass * match_eclasses_to_foreign_key_col(PlannerInfo *root, @@ -2038,31 +2099,35 @@ match_eclasses_to_foreign_key_col(PlannerInfo *root, AttrNumber var2attno = fkinfo->confkey[colno]; Oid eqop = fkinfo->conpfeqop[colno]; List *opfamilies = NIL; /* compute only if needed */ - ListCell *lc1; + EquivalenceClassIndex *eq_index; + Bitmapset *matching_ecs; + int i; - foreach(lc1, root->eq_classes) + Assert(root->eq_index != NULL); + Assert((root->eq_index->ei_flags & EQUIVALANCE_IDX_NON_VOLATILE) != 0); + + eq_index = root->eq_index; + + /* Determine which eclasses contain members for both of the relations */ + matching_ecs = bms_intersect(eq_index->ei_index[var1varno], + eq_index->ei_index[var2varno]); + + i = -1; + while ((i = bms_next_member(matching_ecs, i)) >= 0) { - EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc1); + EquivalenceClass *ec = eq_index->ei_classes[i]; bool item1member = false; bool item2member = false; - ListCell *lc2; + ListCell *lc; - /* Never match to a volatile EC */ - if (ec->ec_has_volatile) - continue; - /* Note: it seems okay to match to "broken" eclasses here */ + /* the eclass index shouldn't have indexed these */ + Assert(!ec->ec_has_volatile); - /* - * If eclass visibly doesn't have members for both rels, there's no - * need to grovel through the members. - */ - if (!bms_is_member(var1varno, ec->ec_relids) || - !bms_is_member(var2varno, ec->ec_relids)) - continue; + /* Note: it seems okay to match to "broken" eclasses here */ - foreach(lc2, ec->ec_members) + foreach(lc, ec->ec_members) { - EquivalenceMember *em = (EquivalenceMember *) lfirst(lc2); + EquivalenceMember *em = (EquivalenceMember *) lfirst(lc); Var *var; if (em->em_is_child) @@ -2238,6 +2303,106 @@ generate_implied_equalities_for_column(PlannerInfo *root, else parent_relids = NULL; /* not used, but keep compiler quiet */ + if (root->eq_index != NULL) + { + Bitmapset *matched_ecs = NULL; + int i; + + Assert((root->eq_index->ei_flags & EQUIVALANCE_IDX_MULTI_MEMBER)); + + i = -1; + while ((i = bms_next_member(rel->relids, i)) > 0) + matched_ecs = bms_add_members(matched_ecs, root->eq_index->ei_index[i]); + + i = -1; + while ((i = bms_next_member(matched_ecs, i)) >= 0) + { + EquivalenceClass *cur_ec = root->eq_index->ei_classes[i]; + EquivalenceMember *cur_em; + ListCell *lc2; + + /* Won't generate joinclauses if const */ + if (cur_ec->ec_has_const) + continue; + + /* + * Scan members, looking for a match to the target column. Note that + * child EC members are considered, but only when they belong to the + * target relation. (Unlike regular members, the same expression + * could be a child member of more than one EC. Therefore, it's + * potentially order-dependent which EC a child relation's target + * column gets matched to. This is annoying but it only happens in + * corner cases, so for now we live with just reporting the first + * match. See also get_eclass_for_sort_expr.) + */ + cur_em = NULL; + foreach(lc2, cur_ec->ec_members) + { + cur_em = (EquivalenceMember *) lfirst(lc2); + if (bms_equal(cur_em->em_relids, rel->relids) && + callback(root, rel, cur_ec, cur_em, callback_arg)) + break; + cur_em = NULL; + } + + if (!cur_em) + continue; + + /* + * Found our match. Scan the other EC members and attempt to generate + * joinclauses. + */ + foreach(lc2, cur_ec->ec_members) + { + EquivalenceMember *other_em = (EquivalenceMember *) lfirst(lc2); + Oid eq_op; + RestrictInfo *rinfo; + + if (other_em->em_is_child) + continue; /* ignore children here */ + + /* Make sure it'll be a join to a different rel */ + if (other_em == cur_em || + bms_overlap(other_em->em_relids, rel->relids)) + continue; + + /* Forget it if caller doesn't want joins to this rel */ + if (bms_overlap(other_em->em_relids, prohibited_rels)) + continue; + + /* + * Also, if this is a child rel, avoid generating a useless join + * to its parent rel(s). + */ + if (is_child_rel && + bms_overlap(parent_relids, other_em->em_relids)) + continue; + + eq_op = select_equality_operator(cur_ec, + cur_em->em_datatype, + other_em->em_datatype); + if (!OidIsValid(eq_op)) + continue; + + /* set parent_ec to mark as redundant with other joinclauses */ + rinfo = create_join_clause(root, cur_ec, eq_op, + cur_em, other_em, + cur_ec); + + result = lappend(result, rinfo); + } + + /* + * If somehow we failed to create any join clauses, we might as well + * keep scanning the ECs for another match. But if we did make any, + * we're done, because we don't want to return non-redundant clauses. + */ + if (result) + break; + } + return result; + } + foreach(lc1, root->eq_classes) { EquivalenceClass *cur_ec = (EquivalenceClass *) lfirst(lc1); @@ -2354,6 +2519,23 @@ have_relevant_eclass_joinclause(PlannerInfo *root, { ListCell *lc1; + if (root->eq_index != NULL) + { + Bitmapset *rel1_ecs = NULL; + Bitmapset *rel2_ecs = NULL; + + int i = -1; + while ((i = bms_next_member(rel1->relids, i)) > 0) + rel1_ecs = bms_add_members(rel1_ecs, root->eq_index->ei_index[i]); + + i = -1; + while ((i = bms_next_member(rel2->relids, i)) > 0) + rel2_ecs = bms_add_members(rel2_ecs, root->eq_index->ei_index[i]); + + /* return true if there are any eclasses in common between the two relations */ + return bms_overlap(rel1_ecs, rel2_ecs); + } + foreach(lc1, root->eq_classes) { EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc1); @@ -2407,6 +2589,28 @@ has_relevant_eclass_joinclause(PlannerInfo *root, RelOptInfo *rel1) { ListCell *lc1; + if (root->eq_index != NULL) + { + Bitmapset *matched_ecs = NULL; + int i; + + Assert((root->eq_index->ei_flags & EQUIVALANCE_IDX_MULTI_MEMBER)); + + i = -1; + while ((i = bms_next_member(rel1->relids, i)) > 0) + matched_ecs = bms_add_members(matched_ecs, root->eq_index->ei_index[i]); + + i = -1; + + while ((i = bms_next_member(matched_ecs, i)) >= 0) + { + EquivalenceClass *ec = root->eq_index->ei_classes[i]; + if (!bms_is_subset(ec->ec_relids, rel1->relids)) + return true; + } + return false; + } + foreach(lc1, root->eq_classes) { EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc1); @@ -2556,3 +2760,76 @@ is_redundant_with_indexclauses(RestrictInfo *rinfo, List *indexclauses) return false; } + +EquivalenceClassIndex * +create_eclass_index(PlannerInfo *root, int flags) +{ + EquivalenceClassIndex *ec_index = makeNode(EquivalenceClassIndex); + EquivalenceClass **ei_classes; + Bitmapset **ei_index; + ListCell *lc; + int i; + + ec_index->ei_classes = ei_classes = palloc(sizeof(EquivalenceClass *) * + list_length(root->eq_classes)); + ec_index->ei_index = ei_index = palloc0(sizeof(Bitmapset *) * + root->simple_rel_array_size); + ec_index->ei_flags = flags; + + /* + * Index the eclass list so that we can quickly identify eclasses + * belonging to a given relation. First we build an array to store + * each eclass, this allows us to quickly lookup the eclass by array + * index. We then build an index using a Bitmapset for each relation to + * mark which eclass array element contains a member for that relation. + */ + i = 0; + foreach(lc, root->eq_classes) + ei_classes[i++] = lfirst(lc); + + /* + * We could build the indexes in the loop above, but looping backwards + * allows the Bitmapset to be allocated to its final size on the first + * allocation. Doing this forward could cause small incremental + * allocations which can be slower. + */ + for (i--; i >= 0; i--) + { + int relid = -1; + + while ((relid = bms_next_member(ei_classes[i]->ec_allrelids, relid)) > 0) + { + EquivalenceClass *ec = ei_classes[i]; + + /* Don't index classes with a const if flags mention not to. */ + if (ec->ec_has_const && (flags & EQUIVALANCE_IDX_NON_CONST) != 0) + continue; + + /* Skip volatile classes when not required in the index */ + if (ec->ec_has_volatile && + (flags & EQUIVALANCE_IDX_NON_VOLATILE) != 0) + continue; + + if (list_length(ec->ec_members) <= 1 && + (flags & EQUIVALANCE_IDX_MULTI_MEMBER) != 0) + continue; + + Assert(relid < root->simple_rel_array_size); + + /* mark this eclass as having a member for relid */ + ei_index[relid] = bms_add_member(ei_index[relid], i); + } + } + + return ec_index; +} + +void +free_eclass_index(EquivalenceClassIndex *eci) +{ + Assert(IsA(eci, EquivalenceClassIndex)); + + pfree(eci->ei_classes); + pfree(eci->ei_index); +} + diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c index 2afc3f1..a3bb646 100644 --- a/src/backend/optimizer/plan/initsplan.c +++ b/src/backend/optimizer/plan/initsplan.c @@ -2433,6 +2433,9 @@ match_foreign_keys_to_quals(PlannerInfo *root) List *newlist = NIL; ListCell *lc; + + root->eq_index = create_eclass_index(root, EQUIVALANCE_IDX_NON_VOLATILE); + foreach(lc, root->fkey_list) { ForeignKeyOptInfo *fkinfo = (ForeignKeyOptInfo *) lfirst(lc); @@ -2578,6 +2581,10 @@ match_foreign_keys_to_quals(PlannerInfo *root) } /* Replace fkey_list, thereby discarding any useless entries */ root->fkey_list = newlist; + + /* clean up the eclass index */ + free_eclass_index(root->eq_index); + root->eq_index = NULL; } diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index f938925..736a1db 100644 --- a/src/include/nodes/nodes.h +++ b/src/include/nodes/nodes.h @@ -259,6 +259,7 @@ typedef enum NodeTag /* these aren't subclasses of Path: */ T_EquivalenceClass, T_EquivalenceMember, + T_EquivalenceClassIndex, T_PathKey, T_PathTarget, T_RestrictInfo, diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h index a008ae0..f50f98c 100644 --- a/src/include/nodes/pathnodes.h +++ b/src/include/nodes/pathnodes.h @@ -263,6 +263,8 @@ struct PlannerInfo List *eq_classes; /* list of active EquivalenceClasses */ + struct EquivalenceClassIndex *eq_index; /* Temporary eclass index */ + List *canon_pathkeys; /* list of "canonical" PathKeys */ List *left_join_clauses; /* list of RestrictInfos for mergejoinable @@ -923,6 +925,7 @@ typedef struct EquivalenceClass List *ec_derives; /* list of derived RestrictInfos */ Relids ec_relids; /* all relids appearing in ec_members, except * for child members (see below) */ + Relids ec_allrelids; /* all relids, including child members */ bool ec_has_const; /* any pseudoconstants in ec_members? */ bool ec_has_volatile; /* the (sole) member is a volatile expr */ bool ec_below_outer_join; /* equivalence applies below an OJ */ @@ -933,6 +936,22 @@ typedef struct EquivalenceClass struct EquivalenceClass *ec_merged; /* set if merged into another EC */ } EquivalenceClass; +typedef struct EquivalenceClassIndex +{ + NodeTag type; + + EquivalenceClass **ei_classes; /* an array of EquivalenceClasses */ + Bitmapset **ei_index; /* an array, indexed by relid contining + * a bitmapset of each ei_classes item + * which has a member belonging to this + * relation */ + int ei_flags; +} EquivalenceClassIndex; + +#define EQUIVALANCE_IDX_NON_CONST 1 +#define EQUIVALANCE_IDX_NON_VOLATILE 2 +#define EQUIVALANCE_IDX_MULTI_MEMBER 4 + /* * If an EC contains a const and isn't below-outer-join, any PathKey depending * on it must be redundant, since there's only one possible value of the key. diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h index 040335a..7c81239 100644 --- a/src/include/optimizer/paths.h +++ b/src/include/optimizer/paths.h @@ -169,6 +169,9 @@ extern bool eclass_useful_for_merging(PlannerInfo *root, extern bool is_redundant_derived_clause(RestrictInfo *rinfo, List *clauselist); extern bool is_redundant_with_indexclauses(RestrictInfo *rinfo, List *indexclauses); +extern EquivalenceClassIndex *create_eclass_index(PlannerInfo *root, + int flags); +extern void free_eclass_index(EquivalenceClassIndex *eci); /* * pathkeys.c
On Thu, 21 Feb 2019 at 15:00, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Pushed that one. I'm interested by the "POC" patch, but I agree > that it'd take some research to show that it isn't a net negative > for simple queries. It sounds like you're not really interested > in pursuing that right now? Thanks for pushing it. I'm still interested in the POC patch. I just knew it wasn't something for the back branches and thought something that was would be more important... Things having gone quiet here wasn't a good source of inspiration to do any further work on it. It's good to hear you're interested. > Anyway, I rebased the POC patch up to HEAD, just in case anyone > still wants to play with it. Cool. Thanks. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
David Rowley <david.rowley@2ndquadrant.com> writes: > On Thu, 21 Feb 2019 at 15:00, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Anyway, I rebased the POC patch up to HEAD, just in case anyone >> still wants to play with it. > Cool. Thanks. I haven't done any of the performance testing that this patch clearly needs, but just in a quick look through the code: * I seriously dislike the usage of root->eq_classes, primarily the underdocumented way that it means one thing in some phases of the planner and something else in other phases. You seem to be doing that in hopes of saving some memory, but is the index data structure really large enough to matter? This scheme is certainly unlikely to continue to work if we add any additional uses of EquivalenceClassIndexes. So I think it might be better to pass them around as explicit arguments and/or attach them someplace else than PlannerInfo. * I'm also not very excited about having both a fast and slow path in places like has_relevant_eclass_joinclause() depending on whether the index exists or not. IMO, if we're going to do this at all, we should just go over to requiring the index to exist when needed, and get rid of the slow paths. (A possible variant to that is "build the index structure on-demand", though you'd have to be careful about GEQO memory management.) Otherwise we'll forever be fighting hidden planner-performance bugs of the form "if you call function xyz from here, it'll be way slower than you expected". * There's not much point in making EquivalenceClassIndex a Node type if you're not going to wire it up to any of the Node infrastructure (particularly outfuncs.c, which might be useful for debug purposes). But I'm not really sure that it ought to be a distinct data structure at all --- maybe this requirement should be more tightly integrated with the ECs themselves? One idea of what that might look like is to let each base RelOptInfo have a list of associated EquivalenceClasses, so that you'd only have to search through directly-relevant ECs when trying to prove something. But I'm just handwaving here. * Spell check please, particularly EQUIVALANCE. * Documentation of the data structure is pretty weak, eg what is "this relation" in the comment about ei_index? And how do you know how long the arrays are, or what they're indexed by? And there's no explicit statement that ei_flags is a bitmask of those other symbols, much less any real statement of what each flag means. Setting the CF entry to WOA for now. I wonder though if we should just push it out to v13 immediately --- are you intending to do more with it in the near future? regards, tom lane
On Sat, 9 Mar 2019 at 08:05, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Setting the CF entry to WOA for now. I wonder though if we should > just push it out to v13 immediately --- are you intending to do more > with it in the near future? Thanks a lot for going over this and providing feedback. I put the patch out there mostly to see if such a thing is something we might want, and it's good to see no objections to the idea. I didn't want to waste too much time if there was going to be some serious objections to the idea. This is something I'd like to look into for v13. I think it'll be much easier to do if we can get your List reimplementation patch in first. That's going to allow list_nth on PlannerInfo->eq_classes to work quickly and will remove the need for having to build an array to store the classes, and as you mention, RelOptInfo could store a Bitmapset to store which ECs have members for this rel. I've modified the CF entry to target v13. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Sat, 9 Mar 2019 at 13:18, David Rowley <david.rowley@2ndquadrant.com> wrote: > This is something I'd like to look into for v13. I think it'll be > much easier to do if we can get your List reimplementation patch in > first. That's going to allow list_nth on PlannerInfo->eq_classes to > work quickly and will remove the need for having to build an array to > store the classes, and as you mention, RelOptInfo could store a > Bitmapset to store which ECs have members for this rel. I've > modified the CF entry to target v13. I started looking at this again and I came up with the attached eclass_indexes_v2.patch. This modifies the original patch significantly so that we no longer build this big eclass index when we think we might start to need it. Instead, I've added a Bitmapset field to RelOptInfo named "eclass_indexes". During add_eq_member() we just note down the PlannerInfo->eq_classes index of the eclass we're adding to in each RelOptInfo->eclass_indexes that's involved in the new eclass member being added. This means we can use the Bitmapset lookup anytime we like. That allowed me to get rid of those special cases I had added to make use of the index when it exists. I've now modified all the existing code to make use of the RelOptInfo->eclass_indexes field. As mentioned above, my thoughts on this patch were that this new method would only be any good once we got Tom's list implementation patch as that makes list_nth() O(1) instead of O(N). On benchmarking this I was quite surprised that it still improves performance quite a bit even without the list reimplementation patch. Using Tom's test case [1], I get the following: * master (82a5649fb) latency average = 6992.320 ms latency average = 6990.297 ms latency average = 7030.619 ms * master + eclass_indexes_v2 latency average = 2537.536 ms latency average = 2530.824 ms latency average = 2543.770 ms If I add in Tom's list reimplementation too, I get: * master + Tom's list reimplementation v3 + eclass_indexes latency average = 1225.910 ms latency average = 1209.565 ms latency average = 1207.326 ms I really didn't expect just this patch alone to speed this up as it calls list_nth() in a loop. I can only imagine that with this workload that these list_nth() loops are not performing many loops, otherwise, the performance would die off quickly. I thought about how we could defend against workloads where there's a large number of eclasses and most match a given relation. I ended up with a small function named list_skip_forward() that simply keeps looping for N times eating N ListCells into the given ListCell. This does improve performance a little bit more, but probably, more importantly, should be a good defence against the worst case. As is, the function would become completely redundant with Tom's list reimplementation patch, so for now, I just snuck it in as a static function in equivclass.c. I've attached this list_skip_forward.patch too. This patch is a bit rough around the edges as I only just started playing with it in the past 30 mins or so. Perhaps it shows that we might be able to fix this in PG12 and not have to wait for the list reimplementation at all. The performance with both attached patches is: * master + eclass_indexes + list_skip_forward_v0 latency average = 2375.383 ms latency average = 2351.450 ms latency average = 2339.259 ms Still nowhere near as good as with the list reimplementation patch, but way better than master. [1] https://www.postgresql.org/message-id/6970.1545327857@sss.pgh.pa.us -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Sun, 10 Mar 2019 at 21:34, David Rowley <david.rowley@2ndquadrant.com> wrote: > I started looking at this again and I came up with the attached > eclass_indexes_v2.patch. The CF bot didn't like v2. It warned about an uninitialized variable. My compiler didn't. Here's v3, hopefully with that fixed. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
David Rowley <david.rowley@2ndquadrant.com> writes: > [ eclass_indexes_v3.patch ] I looked at this for a little bit. I'm on board with the basic idea of assigning integer indexes to the ECs and keeping bitmapsets of relevant EC indexes in RelOptInfos. However ... man, is that delete_eclass() thing ugly. And slow, and fragile-feeling. I think that maybe we could improve that situation by not trying to maintain the RelOptInfo.eclass_indexes sets at all during the initial construction of the ECs, but only setting them up after we've completed doing that. We already have an assumption that we know when EC merging is done and it's safe to make pathkeys that reference the "canonical" (merged) ECs, so it seems like that would be a point where we could build the indexing bitmapsets safely. This hinges on not needing the bitmapsets till after that point, but it'd be easy to arrange some Asserts verifying that we don't try to use them before that. If that doesn't work (because we need the index info sooner), maybe we could consider never removing ECs from eq_classes, so that their indices never change, then teaching most/all of the loops over eq_classes to ignore entries with non-null ec_merged. But we probably want the indexing bitmapsets to reflect canonical EC numbers, so we'd still have to do some updating I fear -- or else be prepared to chase up the ec_merged links when using the index bitmaps. Stepping back a little bit, I wonder whether now is the time to rethink the overall EC data structure, as foreseen in this old comment: * Note: constructing merged EquivalenceClasses is a standard UNION-FIND * problem, for which there exist better data structures than simple lists. * If this code ever proves to be a bottleneck then it could be sped up --- * but for now, simple is beautiful. I'm not very sure what a better data structure would really look like --- after perusing Sedgewick's section on UNION-FIND, it seems like the ec_merged links are more or less what he's recommending anyway, and the expensive part of this is probably all the equal() tests to find whether two proposed EC member expressions are already in the set of known expressions. So I'm just handwaving here. But my point is that we needn't feel wedded to the idea of keeping the ECs in a list, if we can think of some better data structure for them. regards, tom lane
Thanks for looking at this. On Mon, 18 Mar 2019 at 10:08, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I looked at this for a little bit. I'm on board with the basic idea > of assigning integer indexes to the ECs and keeping bitmapsets of > relevant EC indexes in RelOptInfos. However ... man, is that > delete_eclass() thing ugly. And slow, and fragile-feeling. Yeah. When I discovered we remove eclasses from the list I was a little annoyed as the code was pretty simple before having to account for that. I'll grant you ugly and slow, but I don't quite see the fragile part. > I think that maybe we could improve that situation by not trying to > maintain the RelOptInfo.eclass_indexes sets at all during the initial > construction of the ECs, but only setting them up after we've completed > doing that. We already have an assumption that we know when EC merging > is done and it's safe to make pathkeys that reference the "canonical" > (merged) ECs, so it seems like that would be a point where we could > build the indexing bitmapsets safely. This hinges on not needing the > bitmapsets till after that point, but it'd be easy to arrange some > Asserts verifying that we don't try to use them before that. I don't think that's really that easily workable. Consider, for example, when add_child_rel_equivalences() is called, this is well after the location I think you have in mind. Probably this function should be modified to use the indexes, but it must also update the indexes too. > If that doesn't work (because we need the index info sooner), maybe we > could consider never removing ECs from eq_classes, so that their indices > never change, then teaching most/all of the loops over eq_classes to > ignore entries with non-null ec_merged. But we probably want the indexing > bitmapsets to reflect canonical EC numbers, so we'd still have to do some > updating I fear -- or else be prepared to chase up the ec_merged links > when using the index bitmaps. That's probably a better solution. Perhaps we can continue to nullify the ec_members, then just skip eclasses with a NIL ec_members. I avoided that in the patch because I was worried about what extension might be doing, but if you think it's okay, then I can change the patch. > Stepping back a little bit, I wonder whether now is the time to rethink > the overall EC data structure, as foreseen in this old comment: > > * Note: constructing merged EquivalenceClasses is a standard UNION-FIND > * problem, for which there exist better data structures than simple lists. > * If this code ever proves to be a bottleneck then it could be sped up --- > * but for now, simple is beautiful. I've thought for a while now that it wouldn't be too hard to have equal(), or some version of it return -1, 0, 1 to allow us to binary search or build a binary search tree of nodes. I imagine it wouldn't be too hard to create a compact binary search tree inside an array with indexes to the left and right sub-tree instead of pointers. That would likely be a bit more cache friendly and allow simple non-recursive traversals of the entire tree. However, that does not sound like something this patch should be doing. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Mon, 18 Mar 2019 at 14:06, David Rowley <david.rowley@2ndquadrant.com> wrote: > > On Mon, 18 Mar 2019 at 10:08, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > If that doesn't work (because we need the index info sooner), maybe we > > could consider never removing ECs from eq_classes, so that their indices > > never change, then teaching most/all of the loops over eq_classes to > > ignore entries with non-null ec_merged. But we probably want the indexing > > bitmapsets to reflect canonical EC numbers, so we'd still have to do some > > updating I fear -- or else be prepared to chase up the ec_merged links > > when using the index bitmaps. > > That's probably a better solution. Perhaps we can continue to nullify > the ec_members, then just skip eclasses with a NIL ec_members. I > avoided that in the patch because I was worried about what extension > might be doing, but if you think it's okay, then I can change the > patch. I've modified the patch to do it this way. The only loop over eq_classes I saw outside of equivclass.c was in postgres_fdw.c. This just calls eclass_useful_for_merging() on each EC in the list. Instead of having that loop skip deleted ECs, I changed eclass_useful_for_merging() so that it just returns false for that case. The only other thing I change was to create a new function named get_common_eclass_indexes() which removes some duplicate code where we were getting ECs common to two relations. I also made it so this function does not allocate unnecessary Bitmapsets when the inputs are simple relations. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
David Rowley <david.rowley@2ndquadrant.com> writes: > [ eclass_indexes_v4.patch ] I still don't like this approach too much. I think we can fairly easily construct the eclass_indexes at a higher level instead of trying to manage them in add_eq_member, and after some hacking I have the attached. Some notes: * To be sure of knowing whether the indexes have been made yet, I added a new flag variable PlannerInfo.ec_merging_done. This seems like a cleaner fix for the dependency between canonical-pathkey construction and EC construction, too. I did need to add code to set the flag in a couple of short-circuit code paths where we never call generate_base_implied_equalities(), though. * I kind of want to rename generate_base_implied_equalities() to something else, considering that it does a good bit more than just that now. But I haven't thought of a good name. I considered finalize_equivalence_classes(), but that seems like an overstatement when it's still possible for new sort-key eclasses to get made later. * I did not include your changes to use the indexes in generate_join_implied_equalities[_for_ecs], because they were wrong. generate_join_implied_equalities_for_ecs is supposed to consider only ECs listed in the input list. (It's troublesome that apparently we don't have any regression tests exposing that need; we should try to make a test case that does show it.) There's probably some way to still make use of the indexes there. If nothing else, we could refactor so that generate_join_implied_equalities isn't just a wrapper around generate_join_implied_equalities_for_ecs but has its own looping, and we only use the indexes in generate_join_implied_equalities not the other entry point. I haven't tried though. * You mentioned postgres_fdw.c's get_useful_ecs_for_relation as potentially affected by this change. It looks to me like we could nuke that function altogether in favor of having its caller scan the foreign table's eclass_indexes. I haven't tried that either. I'm unsure how hard we should push to get something like this into v12. I'm concerned that its dependency on list_nth might result in performance regressions in some cases; it's a lot easier to believe that this will be mostly-a-win with the better List infrastructure we're hoping to get into v13. If you want to keep playing with it, OK, but I'm kind of tempted to shelve it for now. regards, tom lane diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 69179a0..4e8400d 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -2190,6 +2190,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node) WRITE_NODE_FIELD(cte_plan_ids); WRITE_NODE_FIELD(multiexpr_params); WRITE_NODE_FIELD(eq_classes); + WRITE_BOOL_FIELD(ec_merging_done); WRITE_NODE_FIELD(canon_pathkeys); WRITE_NODE_FIELD(left_join_clauses); WRITE_NODE_FIELD(right_join_clauses); @@ -2256,6 +2257,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node) WRITE_UINT_FIELD(pages); WRITE_FLOAT_FIELD(tuples, "%.0f"); WRITE_FLOAT_FIELD(allvisfrac, "%.6f"); + WRITE_BITMAPSET_FIELD(eclass_indexes); WRITE_NODE_FIELD(subroot); WRITE_NODE_FIELD(subplan_params); WRITE_INT_FIELD(rel_parallel_workers); diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 61b5b11..804b211 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -64,6 +64,10 @@ static bool reconsider_outer_join_clause(PlannerInfo *root, bool outer_on_left); static bool reconsider_full_join_clause(PlannerInfo *root, RestrictInfo *rinfo); +static Bitmapset *get_eclass_indexes_for_relids(PlannerInfo *root, + Relids relids); +static Bitmapset *get_common_eclass_indexes(PlannerInfo *root, Relids relids1, + Relids relids2); /* @@ -341,10 +345,11 @@ process_equivalence(PlannerInfo *root, /* * Case 2: need to merge ec1 and ec2. This should never happen after - * we've built any canonical pathkeys; if it did, those pathkeys might - * be rendered non-canonical by the merge. + * the ECs have reached canonical state; otherwise, pathkeys could be + * rendered non-canonical by the merge, and relation eclass indexes + * would get broken by removal of an eq_classes list entry. */ - if (root->canon_pathkeys != NIL) + if (root->ec_merging_done) elog(ERROR, "too late to merge equivalence classes"); /* @@ -743,6 +748,26 @@ get_eclass_for_sort_expr(PlannerInfo *root, root->eq_classes = lappend(root->eq_classes, newec); + /* + * If EC merging is already complete, we have to mop up by adding the new + * EC to the eclass_indexes of the relation(s) mentioned in it. + */ + if (root->ec_merging_done) + { + int ec_index = list_length(root->eq_classes) - 1; + int i = -1; + + while ((i = bms_next_member(newec->ec_relids, i)) > 0) + { + RelOptInfo *rel = root->simple_rel_array[i]; + + Assert(rel->reloptkind == RELOPT_BASEREL); + + rel->eclass_indexes = bms_add_member(rel->eclass_indexes, + ec_index); + } + } + MemoryContextSwitchTo(oldcontext); return newec; @@ -800,42 +825,71 @@ get_eclass_for_sort_expr(PlannerInfo *root, void generate_base_implied_equalities(PlannerInfo *root) { + int ec_index; ListCell *lc; - Index rti; + /* + * At this point, we're done absorbing knowledge of equivalences in the + * query, so no further EC merging should happen, and ECs remaining in the + * eq_classes list can be considered canonical. (But note that it's still + * possible for new single-member ECs to be added through + * get_eclass_for_sort_expr().) + */ + root->ec_merging_done = true; + + ec_index = 0; foreach(lc, root->eq_classes) { EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc); + bool can_generate_joinclause = false; + int i; Assert(ec->ec_merged == NULL); /* else shouldn't be in list */ Assert(!ec->ec_broken); /* not yet anyway... */ - /* Single-member ECs won't generate any deductions */ - if (list_length(ec->ec_members) <= 1) - continue; + /* + * Generate implied equalities that are restriction clauses. + * Single-member ECs won't generate any deductions, either here or at + * the join level. + */ + if (list_length(ec->ec_members) > 1) + { + if (ec->ec_has_const) + generate_base_implied_equalities_const(root, ec); + else + generate_base_implied_equalities_no_const(root, ec); - if (ec->ec_has_const) - generate_base_implied_equalities_const(root, ec); - else - generate_base_implied_equalities_no_const(root, ec); + /* Recover if we failed to generate required derived clauses */ + if (ec->ec_broken) + generate_base_implied_equalities_broken(root, ec); - /* Recover if we failed to generate required derived clauses */ - if (ec->ec_broken) - generate_base_implied_equalities_broken(root, ec); - } + /* Detect whether this EC might generate join clauses */ + can_generate_joinclause = + (bms_membership(ec->ec_relids) == BMS_MULTIPLE); + } - /* - * This is also a handy place to mark base rels (which should all exist by - * now) with flags showing whether they have pending eclass joins. - */ - for (rti = 1; rti < root->simple_rel_array_size; rti++) - { - RelOptInfo *brel = root->simple_rel_array[rti]; + /* + * Mark the base rels cited in each eclass (which should all exist by + * now) with the eq_classes indexes of all eclasses mentioning them. + * This will let us avoid searching in subsequent lookups. While + * we're at it, we can mark base rels that have pending eclass joins; + * this is a cheap version of has_relevant_eclass_joinclause(). + */ + i = -1; + while ((i = bms_next_member(ec->ec_relids, i)) > 0) + { + RelOptInfo *rel = root->simple_rel_array[i]; - if (brel == NULL) - continue; + Assert(rel->reloptkind == RELOPT_BASEREL); + + rel->eclass_indexes = bms_add_member(rel->eclass_indexes, + ec_index); + + if (can_generate_joinclause) + rel->has_eclass_joins = true; + } - brel->has_eclass_joins = has_relevant_eclass_joinclause(root, brel); + ec_index++; } } @@ -2037,12 +2091,23 @@ match_eclasses_to_foreign_key_col(PlannerInfo *root, Index var2varno = fkinfo->ref_relid; AttrNumber var2attno = fkinfo->confkey[colno]; Oid eqop = fkinfo->conpfeqop[colno]; + RelOptInfo *rel1 = root->simple_rel_array[var1varno]; + RelOptInfo *rel2 = root->simple_rel_array[var2varno]; List *opfamilies = NIL; /* compute only if needed */ - ListCell *lc1; - - foreach(lc1, root->eq_classes) + Bitmapset *matching_ecs; + int i; + + /* Consider only eclasses mentioning both relations */ + Assert(root->ec_merging_done); + Assert(IS_SIMPLE_REL(rel1)); + Assert(IS_SIMPLE_REL(rel2)); + matching_ecs = bms_intersect(rel1->eclass_indexes, + rel2->eclass_indexes); + + i = -1; + while ((i = bms_next_member(matching_ecs, i)) >= 0) { - EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc1); + EquivalenceClass *ec = (EquivalenceClass *) list_nth(root->eq_classes, i); bool item1member = false; bool item2member = false; ListCell *lc2; @@ -2052,14 +2117,6 @@ match_eclasses_to_foreign_key_col(PlannerInfo *root, continue; /* Note: it seems okay to match to "broken" eclasses here */ - /* - * If eclass visibly doesn't have members for both rels, there's no - * need to grovel through the members. - */ - if (!bms_is_member(var1varno, ec->ec_relids) || - !bms_is_member(var2varno, ec->ec_relids)) - continue; - foreach(lc2, ec->ec_members) { EquivalenceMember *em = (EquivalenceMember *) lfirst(lc2); @@ -2119,11 +2176,19 @@ add_child_rel_equivalences(PlannerInfo *root, RelOptInfo *parent_rel, RelOptInfo *child_rel) { - ListCell *lc1; + int i; - foreach(lc1, root->eq_classes) + /* + * EC merging should be complete already, so we can use the parent rel's + * eclass_indexes to avoid searching all of root->eq_classes. + */ + Assert(root->ec_merging_done); + Assert(IS_SIMPLE_REL(parent_rel)); + + i = -1; + while ((i = bms_next_member(parent_rel->eclass_indexes, i)) >= 0) { - EquivalenceClass *cur_ec = (EquivalenceClass *) lfirst(lc1); + EquivalenceClass *cur_ec = (EquivalenceClass *) list_nth(root->eq_classes, i); ListCell *lc2; /* @@ -2134,14 +2199,6 @@ add_child_rel_equivalences(PlannerInfo *root, if (cur_ec->ec_has_volatile) continue; - /* - * No point in searching if parent rel not mentioned in eclass; but we - * can't tell that for sure if parent rel is itself a child. - */ - if (parent_rel->reloptkind == RELOPT_BASEREL && - !bms_is_subset(parent_rel->relids, cur_ec->ec_relids)) - continue; - foreach(lc2, cur_ec->ec_members) { EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2); @@ -2191,6 +2248,14 @@ add_child_rel_equivalences(PlannerInfo *root, } } } + + /* + * The child is now mentioned in all the same eclasses as its parent --- + * except for corner cases such as a volatile EC. But it's okay if + * eclass_indexes lists too many rels, so just borrow the parent's index + * set rather than making a new one. + */ + child_rel->eclass_indexes = parent_rel->eclass_indexes; } @@ -2227,7 +2292,10 @@ generate_implied_equalities_for_column(PlannerInfo *root, List *result = NIL; bool is_child_rel = (rel->reloptkind == RELOPT_OTHER_MEMBER_REL); Relids parent_relids; - ListCell *lc1; + int i; + + /* Should be OK to rely on eclass_indexes */ + Assert(root->ec_merging_done); /* Indexes are available only on base or "other" member relations. */ Assert(IS_SIMPLE_REL(rel)); @@ -2238,9 +2306,10 @@ generate_implied_equalities_for_column(PlannerInfo *root, else parent_relids = NULL; /* not used, but keep compiler quiet */ - foreach(lc1, root->eq_classes) + i = -1; + while ((i = bms_next_member(rel->eclass_indexes, i)) >= 0) { - EquivalenceClass *cur_ec = (EquivalenceClass *) lfirst(lc1); + EquivalenceClass *cur_ec = (EquivalenceClass *) list_nth(root->eq_classes, i); EquivalenceMember *cur_em; ListCell *lc2; @@ -2252,14 +2321,6 @@ generate_implied_equalities_for_column(PlannerInfo *root, continue; /* - * No point in searching if rel not mentioned in eclass (but we can't - * tell that for a child rel). - */ - if (!is_child_rel && - !bms_is_subset(rel->relids, cur_ec->ec_relids)) - continue; - - /* * Scan members, looking for a match to the target column. Note that * child EC members are considered, but only when they belong to the * target relation. (Unlike regular members, the same expression @@ -2352,11 +2413,16 @@ bool have_relevant_eclass_joinclause(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2) { - ListCell *lc1; + Bitmapset *matching_ecs; + int i; - foreach(lc1, root->eq_classes) + /* Examine only eclasses mentioning both rel1 and rel2 */ + matching_ecs = get_common_eclass_indexes(root, rel1->relids, rel2->relids); + + i = -1; + while ((i = bms_next_member(matching_ecs, i)) >= 0) { - EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc1); + EquivalenceClass *ec = (EquivalenceClass *) list_nth(root->eq_classes, i); /* * Won't generate joinclauses if single-member (this test covers the @@ -2384,6 +2450,10 @@ have_relevant_eclass_joinclause(PlannerInfo *root, * b.y and a.x = 42", it is worth considering a join between a and b, * since the join result is likely to be small even though it'll end * up being an unqualified nestloop. + * + * The bms_overlap tests here are normally redundant, given the work + * done by get_common_eclass_indexes, but we keep them in case either + * rel's eclass_indexes contains extra entries. */ if (bms_overlap(rel1->relids, ec->ec_relids) && bms_overlap(rel2->relids, ec->ec_relids)) @@ -2405,11 +2475,16 @@ have_relevant_eclass_joinclause(PlannerInfo *root, bool has_relevant_eclass_joinclause(PlannerInfo *root, RelOptInfo *rel1) { - ListCell *lc1; + Bitmapset *matched_ecs; + int i; - foreach(lc1, root->eq_classes) + /* Examine only eclasses mentioning rel1 */ + matched_ecs = get_eclass_indexes_for_relids(root, rel1->relids); + + i = -1; + while ((i = bms_next_member(matched_ecs, i)) >= 0) { - EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc1); + EquivalenceClass *ec = (EquivalenceClass *) list_nth(root->eq_classes, i); /* * Won't generate joinclauses if single-member (this test covers the @@ -2556,3 +2631,54 @@ is_redundant_with_indexclauses(RestrictInfo *rinfo, List *indexclauses) return false; } + +/* + * get_eclass_indexes_for_relids + * Build and return a Bitmapset containing the indexes into root's + * eq_classes list for all eclasses that mention any of these relids + */ +static Bitmapset * +get_eclass_indexes_for_relids(PlannerInfo *root, Relids relids) +{ + Bitmapset *ec_indexes = NULL; + int i = -1; + + /* Should be OK to rely on eclass_indexes */ + Assert(root->ec_merging_done); + + while ((i = bms_next_member(relids, i)) > 0) + { + RelOptInfo *rel = root->simple_rel_array[i]; + + ec_indexes = bms_add_members(ec_indexes, rel->eclass_indexes); + } + return ec_indexes; +} + +/* + * get_common_eclass_indexes + * Build and return a Bitmapset containing the indexes into root's + * eq_classes list for all eclasses that mention rels in both + * relids1 and relids2. + */ +static Bitmapset * +get_common_eclass_indexes(PlannerInfo *root, Relids relids1, Relids relids2) +{ + Bitmapset *rel1ecs; + Bitmapset *rel2ecs; + int relid; + + rel1ecs = get_eclass_indexes_for_relids(root, relids1); + + /* + * We can get away with just using the relation's eclass_indexes directly + * when relids2 is a singleton set. + */ + if (bms_get_singleton_member(relids2, &relid)) + rel2ecs = root->simple_rel_array[relid]->eclass_indexes; + else + rel2ecs = get_eclass_indexes_for_relids(root, relids2); + + /* Calculate and return the common EC indexes, recycling the left input. */ + return bms_int_members(rel1ecs, rel2ecs); +} diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c index 56d839b..c34380f 100644 --- a/src/backend/optimizer/path/pathkeys.c +++ b/src/backend/optimizer/path/pathkeys.c @@ -42,9 +42,7 @@ static bool right_merge_direction(PlannerInfo *root, PathKey *pathkey); * entry if there's not one already. * * Note that this function must not be used until after we have completed - * merging EquivalenceClasses. (We don't try to enforce that here; instead, - * equivclass.c will complain if a merge occurs after root->canon_pathkeys - * has become nonempty.) + * merging EquivalenceClasses. */ PathKey * make_canonical_pathkey(PlannerInfo *root, @@ -55,6 +53,10 @@ make_canonical_pathkey(PlannerInfo *root, ListCell *lc; MemoryContext oldcontext; + /* Can't make canonical pathkeys if the set of ECs might still change */ + if (!root->ec_merging_done) + elog(ERROR, "too soon to build canonical pathkeys"); + /* The passed eclass might be non-canonical, so chase up to the top */ while (eclass->ec_merged) eclass = eclass->ec_merged; diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c index 3cedd01..9fc1b24 100644 --- a/src/backend/optimizer/plan/planmain.c +++ b/src/backend/optimizer/plan/planmain.c @@ -142,6 +142,12 @@ query_planner(PlannerInfo *root, List *tlist, set_cheapest(final_rel); /* + * We don't need to run generate_base_implied_equalities, but + * we do need to pretend that EC merging is complete. + */ + root->ec_merging_done = true; + + /* * We still are required to call qp_callback, in case it's * something like "SELECT 2+2 ORDER BY 1". */ diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index e408e77..0a0fc97 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -620,6 +620,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse, root->cte_plan_ids = NIL; root->multiexpr_params = NIL; root->eq_classes = NIL; + root->ec_merging_done = false; root->append_rel_list = NIL; root->rowMarks = NIL; memset(root->upper_rels, 0, sizeof(root->upper_rels)); diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index aebe162..5d4aa60 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -886,6 +886,7 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte, subroot->cte_plan_ids = NIL; subroot->multiexpr_params = NIL; subroot->eq_classes = NIL; + subroot->ec_merging_done = false; subroot->append_rel_list = NIL; subroot->rowMarks = NIL; memset(subroot->upper_rels, 0, sizeof(subroot->upper_rels)); diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index eb815c2..3a5482b 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -121,6 +121,15 @@ plan_set_operations(PlannerInfo *root) Assert(parse->distinctClause == NIL); /* + * In the outer query level, we won't have any true equivalences to deal + * with; but we do want to be able to make pathkeys, which will require + * single-member EquivalenceClasses. Indicate that EC merging is complete + * so that pathkeys.c won't complain. + */ + Assert(root->eq_classes == NIL); + root->ec_merging_done = true; + + /* * We'll need to build RelOptInfos for each of the leaf subqueries, which * are RTE_SUBQUERY rangetable entries in this Query. Prepare the index * arrays for that. diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c index 4130514..4a3a4e3 100644 --- a/src/backend/optimizer/util/relnode.c +++ b/src/backend/optimizer/util/relnode.c @@ -178,6 +178,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent) rel->pages = 0; rel->tuples = 0; rel->allvisfrac = 0; + rel->eclass_indexes = NULL; rel->subroot = NULL; rel->subplan_params = NIL; rel->rel_parallel_workers = -1; /* set up in get_relation_info */ @@ -600,6 +601,7 @@ build_join_rel(PlannerInfo *root, joinrel->pages = 0; joinrel->tuples = 0; joinrel->allvisfrac = 0; + joinrel->eclass_indexes = NULL; joinrel->subroot = NULL; joinrel->subplan_params = NIL; joinrel->rel_parallel_workers = -1; @@ -779,6 +781,7 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel, joinrel->pages = 0; joinrel->tuples = 0; joinrel->allvisfrac = 0; + joinrel->eclass_indexes = NULL; joinrel->subroot = NULL; joinrel->subplan_params = NIL; joinrel->serverid = InvalidOid; diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h index 253e0b7..bdf43ae 100644 --- a/src/include/nodes/pathnodes.h +++ b/src/include/nodes/pathnodes.h @@ -265,6 +265,8 @@ struct PlannerInfo List *eq_classes; /* list of active EquivalenceClasses */ + bool ec_merging_done; /* set true once ECs are canonical */ + List *canon_pathkeys; /* list of "canonical" PathKeys */ List *left_join_clauses; /* list of RestrictInfos for mergejoinable @@ -495,6 +497,8 @@ typedef struct PartitionSchemeData *PartitionScheme; * pages - number of disk pages in relation (zero if not a table) * tuples - number of tuples in relation (not considering restrictions) * allvisfrac - fraction of disk pages that are marked all-visible + * eclass_indexes - EquivalenceClasses that mention this rel (filled + * only after EC merging is complete) * subroot - PlannerInfo for subquery (NULL if it's not a subquery) * subplan_params - list of PlannerParamItems to be passed to subquery * @@ -668,6 +672,8 @@ typedef struct RelOptInfo BlockNumber pages; /* size estimates derived from pg_class */ double tuples; double allvisfrac; + Bitmapset *eclass_indexes; /* Indexes in PlannerInfo's eq_classes list of + * ECs that mention this rel */ PlannerInfo *subroot; /* if subquery */ List *subplan_params; /* if subquery */ int rel_parallel_workers; /* wanted number of parallel workers */
Thanks for having a hack at this. On Fri, 22 Mar 2019 at 10:10, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I'm unsure how hard we should push to get something like this into v12. > I'm concerned that its dependency on list_nth might result in performance > regressions in some cases; it's a lot easier to believe that this will > be mostly-a-win with the better List infrastructure we're hoping to get > into v13. If you want to keep playing with it, OK, but I'm kind of > tempted to shelve it for now. Yeah, mentioned a similar concern above. The best I could come up with to combat it was the list_skip_forward function. It wasn't particularly pretty and was only intended as a stop-gap until List become array-based. I think it should stop any regression. I'm okay with waiting until we get array based Lists, if you think that's best, but it's a bit sad to leave this regression in yet another major release. However, there's always a danger we find some show-stopper with your list reimplementation patch, in which case I wouldn't really like to be left with list_skip_forward() in core. If there's any consensus we want this for v12, then I'll happily look over your patch, otherwise, I'll look sometime before July. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
David Rowley <david.rowley@2ndquadrant.com> writes: > On Fri, 22 Mar 2019 at 10:10, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm unsure how hard we should push to get something like this into v12. >> I'm concerned that its dependency on list_nth might result in performance >> regressions in some cases; ... > However, there's always a danger we find some show-stopper with your > list reimplementation patch, in which case I wouldn't really like to > be left with list_skip_forward() in core. We could always do something like what we've already done with simple_rel_array and simple_rte_array, ie, replace the eq_classes List with a manually-managed pointer array. Given the small number of places that touch that list, it wouldn't be too awful --- but still, I'd only consider that if the List-reimplementation patch goes down in flames. regards, tom lane
On Fri, 22 Mar 2019 at 10:10, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <david.rowley@2ndquadrant.com> writes: > > [ eclass_indexes_v4.patch ] > > I still don't like this approach too much. I think we can fairly easily > construct the eclass_indexes at a higher level instead of trying to > manage them in add_eq_member, and after some hacking I have the attached. I've rebased this on top of the pgindent changes (attached) and looked over it. The only problem I see is that you're not performing a bms_copy() of the parent's eclass_indexes in add_child_rel_equivalences(). You've written a comment that claims it's fine to just point to the parent's in: /* * The child is now mentioned in all the same eclasses as its parent --- * except for corner cases such as a volatile EC. But it's okay if * eclass_indexes lists too many rels, so just borrow the parent's index * set rather than making a new one. */ child_rel->eclass_indexes = parent_rel->eclass_indexes; but that's not true since in get_eclass_for_sort_expr() we perform: rel->eclass_indexes = bms_add_member(rel->eclass_indexes, ec_index); which will work fine about in about 63 out of 64 cases, but once bms_add_member has to reallocate the set then it'll leave the child rel's eclass_indexes pointing to freed memory. I'm not saying I have any reproducible test case that can crash the backend from this, but it does seem like a bug waiting to happen. Apart from that, as far as I can tell, the patch seems fine. I didn't add the bms_copy(). I'll wait for your comments before doing so. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Sat, 25 May 2019 at 16:37, David Rowley <david.rowley@2ndquadrant.com> wrote: > The only problem I see is that you're not performing a bms_copy() of > the parent's eclass_indexes in add_child_rel_equivalences(). You've > written a comment that claims it's fine to just point to the parent's > in: > > /* > * The child is now mentioned in all the same eclasses as its parent --- > * except for corner cases such as a volatile EC. But it's okay if > * eclass_indexes lists too many rels, so just borrow the parent's index > * set rather than making a new one. > */ > child_rel->eclass_indexes = parent_rel->eclass_indexes; > > but that's not true since in get_eclass_for_sort_expr() we perform: > > rel->eclass_indexes = bms_add_member(rel->eclass_indexes, > ec_index); > > which will work fine about in about 63 out of 64 cases, but once > bms_add_member has to reallocate the set then it'll leave the child > rel's eclass_indexes pointing to freed memory. I'm not saying I have > any reproducible test case that can crash the backend from this, but > it does seem like a bug waiting to happen. > > Apart from that, as far as I can tell, the patch seems fine. > > I didn't add the bms_copy(). I'll wait for your comments before doing so. I've rebased this on top of the current master. d25ea0127 conflicted with the old version. I also tried to get the planner to crash by trying to find a case where a new eclass is added after setting the child relations eclass_indexes. I thought this could be done via a call to make_pathkey_from_sortinfo(), but I was unable to find any case that passes create_it as true. I added some code to emit a warning when this happens after a call to add_child_rel_equivalences() and found that the warning wasn't raised after running make check. However, I'm still pretty scared by not making a copy of the Bitmapset. It seems like if anything ever changed in this area then we could end up with some very rare crashes if the parent's eclass_indexes grew another bitmapword since the child took it's copy of them, so I added the bms_copy() in the attached version. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Sun, 16 Jun 2019 at 19:42, David Rowley <david.rowley@2ndquadrant.com> wrote: > > I've rebased this on top of the current master. d25ea0127 conflicted > with the old version. ... and again, per recent conflicting change in equivclass.c I've also taken a fresh set of performance benchmarks since 1cff1b95a has recently changed the list.c implementation to use arrays instead of singly-linked-lists. I've attached 2 patched: 0001: Is the rebased version of eclass_indexes_v7.patch. 0002: Is new and goes even further to improve performance. Using schema.sql and query.sql from https://postgr.es/m/6970.1545327857%40sss.pgh.pa.us I get: master @ 21039555 postgres=# \i query.sql Time: 5078.105 ms (00:05.078) Time: 5279.733 ms (00:05.280) Time: 5375.766 ms (00:05.376) Time: 5382.716 ms (00:05.383) master + 0001: postgres=# \i query.sql Time: 2116.394 ms (00:02.116) Time: 2076.883 ms (00:02.077) Time: 2142.237 ms (00:02.142) Time: 2199.468 ms (00:02.199) (2.47x faster than master) Per what Tom mentioned in https://postgr.es/m/16252.1553202606@sss.pgh.pa.us about generate_join_implied_equalities[_for_ecs]. Since generate_join_implied_equalities() is still quite an overhead in profiles, it seems to make sense to special purpose this function rather than have it call generate_join_implied_equalities_for_ecs() and pass the root->eq_classes list. Passing the list means we can't use the new ec_indexes Bitmapset, so can get no benefit of the improved EC lookup method. Since generate_join_implied_equalities_for_ecs() is fairly short, I don't think it's all that bad to keep another slightly altered copy of it. Especially given the following performance results from doing so: master + 0001 + 0002: postgres=# \i query.sql Time: 1308.742 ms (00:01.309) Time: 1294.766 ms (00:01.295) Time: 1293.113 ms (00:01.293) Time: 1300.643 ms (00:01.301) (4.06x faster than master) Unless there's some objection, I'll be looking into pushing both 0001 and 0002 in a single commit in the next few days. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Thu, 18 Jul 2019 at 19:24, David Rowley <david.rowley@2ndquadrant.com> wrote: > Unless there's some objection, I'll be looking into pushing both 0001 > and 0002 in a single commit in the next few days. I've pushed this after doing a bit of final tweaking. After a re-read, I didn't really like all the code that rechecked that ec->ec_relids matched the relation we're searching for. The only code that seems to be able to put additional members into eclass_indexes that there's no mention of in the actual class was in add_child_rel_equivalences(). The comments for the definition of eclass_indexes said nothing about there being a possibility of the field containing an index for an EC that knows nothing about the given relation. Fixing that either meant fixing the comment to say that "they *may* contain", or fixing up the code so that it's strict about what ECs can be mentioned in eclass_indexes. I went with the fixing the code option since it also allows us to get rid of some redundant checks, to which I turned into Asserts() to catch any possible future bugs that might be introduced by any code that might one day remove rels from an EC, e.g something like https://commitfest.postgresql.org/23/1712/ I also did some performance tests on the most simple query I could think of that uses eclasses. select2.sql: SELECT * FROM t1 INNER JOIN t2 ON t1.a=t2.a Master: $ pgbench -n -f select2.sql -T 60 postgres tps = 12143.597276 (excluding connections establishing) tps = 12100.773839 (excluding connections establishing) tps = 12086.209389 (excluding connections establishing) tps = 12098.194927 (excluding connections establishing) tps = 12105.140058 (excluding connections establishing) Patched: $ pgbench -n -f select2.sql -T 60 postgres tps = 12224.597530 (excluding connections establishing) tps = 12097.286522 (excluding connections establishing) tps = 12035.306729 (excluding connections establishing) tps = 11965.848289 (excluding connections establishing) tps = 12059.846474 (excluding connections establishing) There's a bit of noise there, but on average we're just 0.25% slower on the worse case and the benchmarks shown above put us ~406% better on with the fairly complex query that Tom posted in the initial email on this thread. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
David Rowley writes: > On Thu, 18 Jul 2019 at 19:24, David Rowley <david.rowley@2ndquadrant.com> wrote: >> Unless there's some objection, I'll be looking into pushing both 0001 >> and 0002 in a single commit in the next few days. > > I've pushed this after doing a bit of final tweaking. sqlsmith triggers an assertion in this commit (3373c7155350): TRAP: FailedAssertion("!(rel->reloptkind == RELOPT_BASEREL)", File: "equivclass.c", Line: 764) Here's a somewhat reduced testcase to be run on the regression db: --8<---------------cut here---------------start------------->8--- select max(date_mii(now()::date, 42)) over (partition by subq_1.c9 order by c3), min(c3) over (partition by subq_1.c8 ) from (select 1 as c3 from public.partr_def2 as ref_0 left join public.num_exp_power_10_ln as sample_0 on (ref_0.a = sample_0.id ) ) as subq_0 right join (select 1 as c8, 1 as c9) as subq_1 on (true); --8<---------------cut here---------------end--------------->8--- Backtrace below. regards, Andreas (gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007face8834535 in __GI_abort () at abort.c:79 #2 0x0000562b8d2731a3 in ExceptionalCondition ( conditionName=conditionName@entry=0x562b8d418320 "!(rel->reloptkind == RELOPT_BASEREL)", errorType=errorType@entry=0x562b8d2c601d "FailedAssertion", fileName=fileName@entry=0x562b8d418190 "equivclass.c", lineNumber=lineNumber@entry=764) at assert.c:54 #3 0x0000562b8d067ddc in get_eclass_for_sort_expr (root=root@entry=0x562b8e9077c8, expr=expr@entry=0x7facdf610030, nullable_relids=0x7facdf6216f8, nullable_relids@entry=0x7facdf615560, opfamilies=0x7facdf621348, opcintype=opcintype@entry=2277, collation=collation@entry=0, sortref=1, rel=0x0, create_it=true) at equivclass.c:764 #4 0x0000562b8d07326e in make_pathkey_from_sortinfo (root=0x562b8e9077c8, expr=0x7facdf610030, nullable_relids=0x7facdf615560, opfamily=397, opcintype=2277, collation=0, reverse_sort=false, nulls_first=false, sortref=1, rel=0x0, create_it=true) at pathkeys.c:215 #5 0x0000562b8d07401f in make_pathkey_from_sortop (create_it=true, sortref=1, nulls_first=false, ordering_op=1072, nullable_relids=0x7facdf615560, expr=0x7facdf610030, root=0x562b8e9077c8) at pathkeys.c:258 #6 make_pathkeys_for_sortclauses (root=root@entry=0x562b8e9077c8, sortclauses=sortclauses@entry=0x7facdf620f98, tlist=tlist@entry=0x562b8e929768) at pathkeys.c:1086 #7 0x0000562b8d082533 in make_pathkeys_for_window (root=root@entry=0x562b8e9077c8, tlist=0x562b8e929768, wc=<optimized out>, wc=<optimized out>) at planner.c:5642 #8 0x0000562b8d087c81 in create_one_window_path (wflists=<optimized out>, activeWindows=<optimized out>, output_target=<optimized out>, input_target=<optimized out>, path=0x7facdf620ea8, window_rel=<optimized out>, root=<optimized out>) at planner.c:4663 #9 create_window_paths (activeWindows=<optimized out>, wflists=0x7facdf613b80, output_target_parallel_safe=<optimized out>, output_target=0x7facdf6205b8, input_target=0x7facdf6206f8, input_rel=<optimized out>, root=0x562b8e9077c8) at planner.c:4588 #10 grouping_planner (root=<optimized out>, inheritance_update=false, tuple_fraction=<optimized out>) at planner.c:2211 #11 0x0000562b8d089dfa in subquery_planner (glob=glob@entry=0x562b8e91bb20, parse=parse@entry=0x562b8e906740, parent_root=parent_root@entry=0x0, hasRecursion=hasRecursion@entry=false, tuple_fraction=tuple_fraction@entry=0) at planner.c:1013 #12 0x0000562b8d08afa7 in standard_planner (parse=0x562b8e906740, cursorOptions=256, boundParams=<optimized out>) at planner.c:406
On Mon, 22 Jul 2019 at 00:44, Andreas Seltenreich <seltenreich@gmx.de> wrote: > sqlsmith triggers an assertion in this commit (3373c7155350): > > TRAP: FailedAssertion("!(rel->reloptkind == RELOPT_BASEREL)", File: "equivclass.c", Line: 764) Thanks for the report. It looks like this is caused by the join removal code removing the LEFT JOIN and leaving a dead rel in the eclasses ec_relids. The fix is likely either to adjust the Assert to allow that or to add an if test so that we only bother calling bms_add_member for base rels. I'm not quite sure yet. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Mon, 22 Jul 2019 at 01:50, David Rowley <david.rowley@2ndquadrant.com> wrote: > > On Mon, 22 Jul 2019 at 00:44, Andreas Seltenreich <seltenreich@gmx.de> wrote: > > sqlsmith triggers an assertion in this commit (3373c7155350): > > > > TRAP: FailedAssertion("!(rel->reloptkind == RELOPT_BASEREL)", File: "equivclass.c", Line: 764) > > Thanks for the report. > > It looks like this is caused by the join removal code removing the > LEFT JOIN and leaving a dead rel in the eclasses ec_relids. The fix > is likely either to adjust the Assert to allow that or to add an if > test so that we only bother calling bms_add_member for base rels. I'm > not quite sure yet. I ended up adjusting the Assert to allow dead rels too. I thought about adding an if test so we only do the bms_add_member for base rels, but I didn't really like the special case where eclass_indexes wouldn't be correctly set for dead rels. I had thoughts that dead rels were not common enough to go to additional trouble over. That's debatable of course. I also thought about removing the Assert completely, but it does help verify we don't get anything unexpected in ec_relids. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services