Thread: [HACKERS] Cutting initdb's runtime (Perl question embedded)
Andres mentioned, and I've confirmed locally, that a large chunk of initdb's runtime goes into regprocin's brute-force lookups of function OIDs from function names. The recent discussion about cutting TAP test time prompted me to look into that question again. We had had some grand plans for getting genbki.pl to perform the name-to-OID conversion as part of a big rewrite, but since that project is showing few signs of life, I'm thinking that a more localized performance fix would be a good thing to look into. There seem to be a couple of plausible routes to a fix:

1. The best thing would still be to make genbki.pl do the conversion, and write numeric OIDs into postgres.bki. The core stumbling block here seems to be that for most catalogs, Catalog.pm and genbki.pl never really break down a DATA line into fields --- and we certainly have got to do that, if we're going to replace the values of regproc fields. The places that do need to do that approximate it like this:

    # To construct fmgroids.h and fmgrtab.c, we need to inspect some
    # of the individual data fields.  Just splitting on whitespace
    # won't work, because some quoted fields might contain internal
    # whitespace.  We handle this by folding them all to a simple
    # "xxx".  Fortunately, this script doesn't need to look at any
    # fields that might need quoting, so this simple hack is
    # sufficient.
    $row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
    @{$row}{@attnames} = split /\s+/, $row->{bki_values};

We would need a bullet-proof, non-hack, preferably not too slow way to split DATA lines into fields properly. I'm one of the world's worst Perl programmers, but surely there's a way?

2. Alternatively, we could teach bootstrap mode to build a hashtable mapping proname to OID while it reads pg_proc data from postgres.bki, and then make regprocin's bootstrap path consult the hashtable instead of looking directly at the pg_proc file. That I'm quite sure is do-able, but it seems like it's leaving money on the table compared to doing the transformation earlier.

Thoughts?

regards, tom lane
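(A minimal sketch of one answer to the embedded Perl question, using the same regex idea the patches later in this thread settle on. This is not the committed code; the attribute names and sample DATA payload are made up for illustration:)

    use strict;
    use warnings;

    # Each field is either a double-quoted string (which may contain
    # whitespace) or a run of non-whitespace characters.  With no capture
    # groups in the pattern, the /g match in list context returns the
    # whole matches, i.e. the fields in order.
    sub split_data_line
    {
        my $line = shift;
        return $line =~ /"[^"]*"|\S+/g;
    }

    my @attnames = qw(proname prosrc);    # illustrative only
    my %row;
    @row{@attnames} = split_data_line(q{boolin "some quoted text"});
    print "$_ = $row{$_}\n" for @attnames;

Unlike the "xxx" hack, this keeps quoted fields intact (quotes included), so it could be applied to catalogs whose quoted fields actually matter.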
On 04/12/2017 04:12 PM, Tom Lane wrote:
> 1. The best thing would still be to make genbki.pl do the conversion, and write numeric OIDs into postgres.bki. The core stumbling block here seems to be that for most catalogs, Catalog.pm and genbki.pl never really break down a DATA line into fields --- and we certainly have got to do that, if we're going to replace the values of regproc fields. The places that do need to do that approximate it like this:
>
>     # To construct fmgroids.h and fmgrtab.c, we need to inspect some
>     # of the individual data fields.  Just splitting on whitespace
>     # won't work, because some quoted fields might contain internal
>     # whitespace.  We handle this by folding them all to a simple
>     # "xxx".  Fortunately, this script doesn't need to look at any
>     # fields that might need quoting, so this simple hack is
>     # sufficient.
>     $row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
>     @{$row}{@attnames} = split /\s+/, $row->{bki_values};
>
> We would need a bullet-proof, non-hack, preferably not too slow way to split DATA lines into fields properly. I'm one of the world's worst Perl programmers, but surely there's a way?
>
> 2. Alternatively, we could teach bootstrap mode to build a hashtable mapping proname to OID while it reads pg_proc data from postgres.bki, and then make regprocin's bootstrap path consult the hashtable instead of looking directly at the pg_proc file. That I'm quite sure is do-able, but it seems like it's leaving money on the table compared to doing the transformation earlier.
>
> Thoughts?

Looked at this, and option 1 seems simple enough if I am not missing something. I might hack something up later tonight. Either way I think this improvement can be done separately from the proposed replacement of the catalog header files. Trying to fix everything at once often leads to nothing being fixed at all.

Andreas
On 2017-04-12 10:12:47 -0400, Tom Lane wrote:
> Andres mentioned, and I've confirmed locally, that a large chunk of initdb's runtime goes into regprocin's brute-force lookups of function OIDs from function names. The recent discussion about cutting TAP test time prompted me to look into that question again. We had had some grand plans for getting genbki.pl to perform the name-to-OID conversion as part of a big rewrite, but since that project is showing few signs of life, I'm thinking that a more localized performance fix would be a good thing to look into. There seem to be a couple of plausible routes to a fix:
>
> 1. The best thing would still be to make genbki.pl do the conversion, and write numeric OIDs into postgres.bki. The core stumbling block here seems to be that for most catalogs, Catalog.pm and genbki.pl never really break down a DATA line into fields --- and we certainly have got to do that, if we're going to replace the values of regproc fields. The places that do need to do that approximate it like this:
>
>     # To construct fmgroids.h and fmgrtab.c, we need to inspect some
>     # of the individual data fields.  Just splitting on whitespace
>     # won't work, because some quoted fields might contain internal
>     # whitespace.  We handle this by folding them all to a simple
>     # "xxx".  Fortunately, this script doesn't need to look at any
>     # fields that might need quoting, so this simple hack is
>     # sufficient.
>     $row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
>     @{$row}{@attnames} = split /\s+/, $row->{bki_values};
>
> We would need a bullet-proof, non-hack, preferably not too slow way to split DATA lines into fields properly. I'm one of the world's worst Perl programmers, but surely there's a way?

I've done something like 1) before:
http://archives.postgresql.org/message-id/20150221230839.GE2037%40awork2.anarazel.de

I don't think the speed matters all that much, because we'll only do it when generating the .bki file - a couple ms more or less won't matter much.

IIRC I spent some more time to also load the data files from a different format:
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/sane-catalog
although that's presumably heavily outdated now.

- Andres
On 04/12/2017 05:00 PM, Andreas Karlsson wrote:
> Looked at this, and option 1 seems simple enough if I am not missing something. I might hack something up later tonight. Either way I think this improvement can be done separately from the proposed replacement of the catalog header files. Trying to fix everything at once often leads to nothing being fixed at all.

Here is my proof of concept patch. It does basically the same thing as Andres's patch except that it handles quoted values a bit better and does not try to support anything other than the regproc type.

The patch speeds up initdb without fsync from 0.80 seconds to 0.55 seconds, which is a nice speedup, while adding a negligible amount of extra work on compilation.

Two things which could be improved in this patch if people deem it important:

- Refactor the code to be more generic, like Andres's patch, so it is easier to add lookups for other data types.

- Write something closer to a real .bki parser to actually understand quoted values and _null_ when parsing the data. Right now this patch only splits each row into its fields (while being aware of quotes) but does not attempt to remove quotes. The PoC patch treats "foo" and foo as different.

Andreas
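(A hedged illustration of the unquoting step the PoC leaves out, not part of the posted patch. It assumes bki-style quoting with no quote marks inside quoted strings, the same assumption bootscanner.l makes:)

    # Strip one layer of double quotes from an already-split field,
    # so that "foo" and foo compare equal.  Fields without surrounding
    # quotes pass through unchanged.
    sub unquote_field
    {
        my $field = shift;
        $field =~ s/^"(.*)"$/$1/s;
        return $field;
    }

    print unquote_field(q{"foo"}), "\n";    # prints: foo
    print unquote_field(q{foo}), "\n";      # prints: foo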
Andreas Karlsson <andreas@proxel.se> writes:
> Here is my proof of concept patch. It does basically the same thing as Andres's patch except that it handles quoted values a bit better and does not try to support anything other than the regproc type.

> The patch speeds up initdb without fsync from 0.80 seconds to 0.55 seconds, which is a nice speedup, while adding a negligible amount of extra work on compilation.

I've pushed this with some mostly-cosmetic adjustments:

* created a single subroutine that understands how to split DATA lines, rather than having several copies of the regex

* rearranged the code so that the data structure returned by Catalog::Catalogs() isn't scribbled on (which was already happening before your patch, but it seemed pretty ugly to me)

* stripped out the bootstrap-time name lookup code from all of reg* not just regproc.

There's certainly lots more that could be done in the genbki code, but I think all we can justify at this stage of the development cycle is to get the low-hanging fruit for testing speedups.

regards, tom lane
On 2017-04-13 12:13:30 -0400, Tom Lane wrote:
> Andreas Karlsson <andreas@proxel.se> writes:
>> Here is my proof of concept patch. It does basically the same thing as Andres's patch except that it handles quoted values a bit better and does not try to support anything other than the regproc type.
>
>> The patch speeds up initdb without fsync from 0.80 seconds to 0.55 seconds, which is a nice speedup, while adding a negligible amount of extra work on compilation.
>
> I've pushed this with some mostly-cosmetic adjustments:

Cool. I wonder if we also should remove AtEOXact_CatCache()'s cross-checks - the resowner replacement has been in place for a while, and seems robust enough. They're now the biggest user of time.

- Andres
Andres Freund <andres@anarazel.de> writes:
> Cool. I wonder if we also should remove AtEOXact_CatCache()'s cross-checks - the resowner replacement has been in place for a while, and seems robust enough. They're now the biggest user of time.

Hm, biggest user of time in what workload? I've not noticed that function particularly.

I agree that it doesn't seem like we need to spend a lot of time cross-checking there, though. Maybe keep the code but #ifdef it under some nondefault debugging symbol.

regards, tom lane
On 2017-04-13 12:56:14 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> Cool. I wonder if we also should remove AtEOXact_CatCache()'s cross-checks - the resowner replacement has been in place for a while, and seems robust enough. They're now the biggest user of time.
>
> Hm, biggest user of time in what workload? I've not noticed that function particularly.

Just initdb. I presume it's because the catcaches will frequently be relatively big there.

> I agree that it doesn't seem like we need to spend a lot of time cross-checking there, though. Maybe keep the code but #ifdef it under some nondefault debugging symbol.

Hm, if we want to keep it, maybe tie it to CLOBBER_CACHE_ALWAYS or such, so it gets compiled at least sometimes? Not a great fit, but ...

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2017-04-13 12:56:14 -0400, Tom Lane wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> Cool. I wonder if we also should remove AtEOXact_CatCache()'s cross-checks - the resowner replacement has been in place for a while, and seems robust enough. They're now the biggest user of time.
>
>> Hm, biggest user of time in what workload? I've not noticed that function particularly.
>
> Just initdb. I presume it's because the catcaches will frequently be relatively big there.

Hm. That ties into something I was looking at yesterday. The only reason that function is called so much is that bootstrap mode runs a separate transaction for *each line of the bki file* (cf do_start, do_end in bootparse.y). Which seems pretty silly. I experimented with collapsing all the transactions for consecutive DATA lines into one transaction, but couldn't immediately make it work due to memory management issues. I didn't try collapsing the entire run into a single transaction, but maybe that would actually be easier, though no doubt more wasteful of memory.

>> I agree that it doesn't seem like we need to spend a lot of time cross-checking there, though. Maybe keep the code but #ifdef it under some nondefault debugging symbol.
>
> Hm, if we want to keep it, maybe tie it to CLOBBER_CACHE_ALWAYS or such, so it gets compiled at least sometimes? Not a great fit, but ...

Don't like that, because CCA is by definition not the normal cache behavior. It would make a bit of sense to tie it to CACHEDEBUG, but as you say, it'd never get tested normally if we do that.

On the whole, though, we may be looking at diminishing returns here. I just did some "perf" measurement of the overall "initdb" cycle, and what I'm seeing suggests that bootstrap mode as such is now a pretty small fraction of the overall cycle:

+   51.07%     0.01%  28  postgres  postgres  [.] PostgresMain
...
+   13.52%     0.00%   0  postgres  postgres  [.] AuxiliaryProcessMain

That says that the post-bootstrap steps are now the bulk of the time, which agrees with naked-eye observation.

regards, tom lane
On 2017-04-13 14:05:43 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2017-04-13 12:56:14 -0400, Tom Lane wrote:
>>> Andres Freund <andres@anarazel.de> writes:
>>>> Cool. I wonder if we also should remove AtEOXact_CatCache()'s cross-checks - the resowner replacement has been in place for a while, and seems robust enough. They're now the biggest user of time.
>
>>> Hm, biggest user of time in what workload? I've not noticed that function particularly.
>
>> Just initdb. I presume it's because the catcaches will frequently be relatively big there.
>
> Hm. That ties into something I was looking at yesterday. The only reason that function is called so much is that bootstrap mode runs a separate transaction for *each line of the bki file* (cf do_start, do_end in bootparse.y). Which seems pretty silly.

Indeed.

> On the whole, though, we may be looking at diminishing returns here. I just did some "perf" measurement of the overall "initdb" cycle, and what I'm seeing suggests that bootstrap mode as such is now a pretty small fraction of the overall cycle:
>
> +   51.07%     0.01%  28  postgres  postgres  [.] PostgresMain
> ...
> +   13.52%     0.00%   0  postgres  postgres  [.] AuxiliaryProcessMain
>
> That says that the post-bootstrap steps are now the bulk of the time, which agrees with naked-eye observation.

The AtEOXact_CatCache() checks weren't only hit in bootstrap mode, but yea, it's by far a smaller win in comparison to the regprocin thing.

Greetings,

Andres Freund
On 04/13/2017 06:13 PM, Tom Lane wrote:
> I've pushed this with some mostly-cosmetic adjustments:

Thanks! I like your adjustments.

> There's certainly lots more that could be done in the genbki code, but I think all we can justify at this stage of the development cycle is to get the low-hanging fruit for testing speedups.

Yeah, I also noticed that the genbki code seems to have gotten little love and that much more can be done here.

Andreas
Andres Freund <andres@anarazel.de> writes:
> On 2017-04-13 14:05:43 -0400, Tom Lane wrote:
>> Hm. That ties into something I was looking at yesterday. The only reason that function is called so much is that bootstrap mode runs a separate transaction for *each line of the bki file* (cf do_start, do_end in bootparse.y). Which seems pretty silly.
>
> Indeed.

I failed to resist the temptation to poke at this, and found that indeed nothing seems to break if we just use one transaction for the whole processing of postgres.bki. So I've pushed a patch that does that. We're definitely down to the point where worrying about the speed of bootstrap mode, per se, is useless; the other steps in initdb visibly take a lot more time.

regards, tom lane
On 04/14/2017 11:54 PM, Tom Lane wrote:
> I failed to resist the temptation to poke at this, and found that indeed nothing seems to break if we just use one transaction for the whole processing of postgres.bki. So I've pushed a patch that does that. We're definitely down to the point where worrying about the speed of bootstrap mode, per se, is useless; the other steps in initdb visibly take a lot more time.

Looked some at this, and what takes time now for me seems to mainly be these four things (out of a total runtime of 560 ms):

1. setup_conversion: 140 ms
2. select_default_timezone: 90 ms
3. bootstrap_template1: 80 ms
4. setup_schema: 65 ms

These four take up about two thirds of the total runtime, so it seems likely that we may still have relatively low hanging fruit (but not worth committing for PostgreSQL 10).

I have not done profiling of these functions yet, so am not sure how they best would be fixed, but maybe setup_conversion could be converted into bki entries to speed it up.

Andreas
On 15/04/17 13:44, Andreas Karlsson wrote:
> On 04/14/2017 11:54 PM, Tom Lane wrote:
>> I failed to resist the temptation to poke at this, and found that indeed nothing seems to break if we just use one transaction for the whole processing of postgres.bki. So I've pushed a patch that does that. We're definitely down to the point where worrying about the speed of bootstrap mode, per se, is useless; the other steps in initdb visibly take a lot more time.
>
> Looked some at this, and what takes time now for me seems to mainly be these four things (out of a total runtime of 560 ms):
>
> 1. setup_conversion: 140 ms
> 2. select_default_timezone: 90 ms
> 3. bootstrap_template1: 80 ms
> 4. setup_schema: 65 ms
>
> These four take up about two thirds of the total runtime, so it seems likely that we may still have relatively low hanging fruit (but not worth committing for PostgreSQL 10).
>
> I have not done profiling of these functions yet, so am not sure how they best would be fixed, but maybe setup_conversion could be converted into bki entries to speed it up.
>
> Andreas

How much could be done concurrently?

Cheers,
Gavin
Andreas Karlsson <andreas@proxel.se> writes:
> Looked some at this, and what takes time now for me seems to mainly be these four things (out of a total runtime of 560 ms):
> 1. setup_conversion: 140 ms
> 2. select_default_timezone: 90 ms
> 3. bootstrap_template1: 80 ms
> 4. setup_schema: 65 ms

FWIW, you can bypass select_default_timezone by setting environment variable TZ to a valid timezone name before calling initdb. On my workstation, that currently cuts the time for "initdb --no-sync" from about 1.25 sec to just about exactly 1.0 sec.

I doubt that we want all the buildfarm members to switch over to doing that, since then we'd lose coverage of select_default_timezone. But it's interesting to speculate about having the buildfarm script extract the selected timezone name out of postgresql.conf after the first initdb it does, and then set TZ for remaining tests. We surely don't need to test select_default_timezone more than once per buildfarm run.

regards, tom lane
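(A small hypothetical harness for the bypass described above — the zone name and data directory are placeholders, not something from the thread:)

    use strict;
    use warnings;
    use Time::HiRes qw(time);

    # Presetting TZ lets initdb skip select_default_timezone entirely.
    $ENV{TZ} = 'America/New_York';

    my $start = time();
    system('initdb', '--no-sync', '-D', '/tmp/tzdemo') == 0
        or die "initdb failed: $?";
    printf "initdb took %.2f seconds\n", time() - $start;

Running it once with the assignment commented out and once with it in place would show roughly the 0.25 sec difference Tom reports.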
On 04/15/2017 12:07 AM, Tom Lane wrote:
> FWIW, you can bypass select_default_timezone by setting environment variable TZ to a valid timezone name before calling initdb. On my workstation, that currently cuts the time for "initdb --no-sync" from about 1.25 sec to just about exactly 1.0 sec.
>
> I doubt that we want all the buildfarm members to switch over to doing that, since then we'd lose coverage of select_default_timezone. But it's interesting to speculate about having the buildfarm script extract the selected timezone name out of postgresql.conf after the first initdb it does, and then set TZ for remaining tests. We surely don't need to test select_default_timezone more than once per buildfarm run.

Yeah. The obvious place to do this would be the "make check" stage, which runs very early. Unfortunately, pg_regress carefully removes its data directory on success - kind of a pity that's not switchable.

Probably for now the simplest thing for buildfarm animals whose owners are trying to squeeze every last second would be to put something like

    TZ => 'US/Eastern',

in the build_env section of the config file. But as you say we don't want everyone doing that.

Alternatively, we could have an initdb TAP test that explicitly removed the environment setting so we'd get coverage of select_default_timezone, and have the buildfarm set TZ to something if it's not already set.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan wrote:
> Alternatively, we could have an initdb TAP test that explicitly removed the environment setting so we'd get coverage of select_default_timezone, and have the buildfarm set TZ to something if it's not already set.

What about having an initdb option that runs select_default_timezone only and reports the result, so that it can be used in the buildfarm script to set TZ in all the regular initdb calls?

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 04/15/2017 11:44 AM, Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>> Alternatively, we could have an initdb TAP test that explicitly removed the environment setting so we'd get coverage of select_default_timezone, and have the buildfarm set TZ to something if it's not already set.
>
> What about having an initdb option that runs select_default_timezone only and reports the result, so that it can be used in the buildfarm script to set TZ in all the regular initdb calls?

Seems like more work. What I had in mind was the attached plus roughly this in the buildfarm client:

    $ENV{TZ} ||= 'US/Eastern';

or whatever zone we choose to use.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> What I had in mind was the attached plus roughly this in the buildfarm client:
>     $ENV{TZ} ||= 'US/Eastern';
> or whatever zone we choose to use.

How about letting the first "make check" run with whatever is in the environment, and then forcing TZ to become set (much as above) for all the remaining tests? I'm afraid what you've got here might encourage a certain sameness of the test environments.

regards, tom lane
On 04/15/2017 12:13 PM, Tom Lane wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>> What I had in mind was the attached plus roughly this in the buildfarm client:
>>     $ENV{TZ} ||= 'US/Eastern';
>> or whatever zone we choose to use.
>
> How about letting the first "make check" run with whatever is in the environment, and then forcing TZ to become set (much as above) for all the remaining tests? I'm afraid what you've got here might encourage a certain sameness of the test environments.

Sure. Just means putting this code a bit later in the file. "make check" is only one initdb, so it won't cost much. I'm still inclined to force a TAP test for initdb with no TZ set, though.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> Sure. Just means putting this code a bit later in the file. "make check" is only one initdb, so it won't cost much. I'm still inclined to force a TAP test for initdb with no TZ set, though.

I'd like that to be an additional tweak for some existing test case rather than expending a whole initdb run just for that --- but otherwise, no objection.

regards, tom lane
On 04/15/2017 02:13 PM, Tom Lane wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>> Sure. Just means putting this code a bit later in the file. "make check" is only one initdb, so it won't cost much. I'm still inclined to force a TAP test for initdb with no TZ set, though.
>
> I'd like that to be an additional tweak for some existing test case rather than expending a whole initdb run just for that --- but otherwise, no objection.

That's what my patch did.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 4/15/17 12:33, Andrew Dunstan wrote:
> Sure. Just means putting this code a bit later in the file. "make check" is only one initdb, so it won't cost much. I'm still inclined to force a TAP test for initdb with no TZ set, though.

How much is this going to buy overall? Is it worth the complications?

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 04/16/2017 07:43 PM, Peter Eisentraut wrote:
> On 4/15/17 12:33, Andrew Dunstan wrote:
>> Sure. Just means putting this code a bit later in the file. "make check" is only one initdb, so it won't cost much. I'm still inclined to force a TAP test for initdb with no TZ set, though.
>
> How much is this going to buy overall? Is it worth the complications?

I don't know, but it's a very small change. I am going to spend some time instrumenting where the time goes in various tests.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Subject: Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
Tom Lane <tgl@sss.pgh.pa.us> writes:
> There's certainly lots more that could be done in the genbki code, but I think all we can justify at this stage of the development cycle is to get the low-hanging fruit for testing speedups.

I threw Devel::NYTProf at it and picked some more low-hanging fruit. Attached are separate patches for each change, and here are the runtimes of genbki.pl and Gen_fmgrtab.pl, respectively, after each patch (averages of 5 runs, in milliseconds):

master (b6dd1271): 355, 182

1: Avoid unnecessary regex captures:      349, 183
2: Avoid repeated calls to SplitDataLine: 316, 158
3: Inline SplitDataLine:                  291, 141
4: Inline check_natts:                    287, 141

Together they shave 68ms or 19.2% off the runtime of genbki.pl and 41ms or 22.5% off the runtime of Gen_fmgrtab.pl.

Finally, one non-performance patch, which just removes the use of Exporter in Catalog.pm, since none of the users actually import anything from it.

- ilmari

--
"I use RMS as a guide in the same way that a boat captain would use
 a lighthouse.  It's good to know where it is, but you generally
 don't want to find yourself in the same spot." - Tollef Fog Heen

From 5d7f4b1e78243daa6939501b88b2644bfcf9bd77 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Mon, 17 Apr 2017 13:26:35 +0100
Subject: [PATCH 1/8] Avoid unnecessary regex captures in Catalog.pm

---
 src/backend/catalog/Catalog.pm | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index fa8de04..e7b647a 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -54,7 +54,7 @@ sub Catalogs
 	{

 		# Strip C-style comments.
-		s;/\*(.|\n)*\*/;;g;
+		s;/\*(?:.|\n)*\*/;;g;
 		if (m;/\*;)
 		{
@@ -80,12 +80,12 @@ sub Catalogs
 			{
 				$catalog{natts} = $1;
 			}
-			elsif (/^DATA\(insert(\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
+			elsif (/^DATA\(insert(?:\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
 			{
-				check_natts($filename, $catalog{natts}, $3,
+				check_natts($filename, $catalog{natts}, $2,
 					$input_file, $input_line_number);

-				push @{ $catalog{data} }, { oid => $2, bki_values => $3 };
+				push @{ $catalog{data} }, { oid => $1, bki_values => $2 };
 			}
 			elsif (/^DESCR\(\"(.*)\"\)$/)
 			{
-- 
2.7.4

From 92402e0306eda209b1a2811acc41325d75add0cb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Mon, 17 Apr 2017 14:04:20 +0100
Subject: [PATCH 2/8] Avoid repeated calls to Catalog::SplitDataLine

Both check_natts and the callers of Catalog::Catalogs were calling
Catalog::SplitDataLine, the former discarding the result.  Instead,
have Catalog::Catalogs store the split fields directly and pass the
count to check_natts.

---
 src/backend/catalog/Catalog.pm   | 14 +++++++-------
 src/backend/catalog/genbki.pl    |  3 +--
 src/backend/utils/Gen_fmgrtab.pl |  3 +--
 3 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index e7b647a..0c508b0 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -82,10 +82,12 @@ sub Catalogs
 			}
 			elsif (/^DATA\(insert(?:\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
 			{
-				check_natts($filename, $catalog{natts}, $2,
+				my @bki_values = SplitDataLine($2);
+
+				check_natts($filename, $catalog{natts}, scalar(@bki_values),
 					$input_file, $input_line_number);

-				push @{ $catalog{data} }, { oid => $1, bki_values => $2 };
+				push @{ $catalog{data} }, { oid => $1, bki_values => \@bki_values };
 			}
 			elsif (/^DESCR\(\"(.*)\"\)$/)
 			{
@@ -254,17 +256,15 @@ sub RenameTempFile
 # verify the number of fields in the passed-in DATA line
 sub check_natts
 {
-	my ($catname, $natts, $bki_val, $file, $line) = @_;
+	my ($catname, $natts, $bki_vals, $file, $line) = @_;

 	die "Could not find definition for Natts_${catname} before start of DATA() in $file\n"
 	  unless defined $natts;

-	my $nfields = scalar(SplitDataLine($bki_val));
-
 	die sprintf
 	  "Wrong number of attributes in DATA() entry at %s:%d (expected %d but got %d)\n",
-	  $file, $line, $natts, $nfields
-	  unless $natts == $nfields;
+	  $file, $line, $natts, $bki_vals
+	  unless $natts == $bki_vals;
 }

 1;
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 198e072..8875f6c 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -161,9 +161,8 @@ foreach my $catname (@{ $catalogs->{names} })

 	foreach my $row (@{ $catalog->{data} })
 	{
-		# Split line into tokens without interpreting their meaning.
 		my %bki_values;
-		@bki_values{@attnames} = Catalog::SplitDataLine($row->{bki_values});
+		@bki_values{@attnames} = @{$row->{bki_values}};

 		# Perform required substitutions on fields
 		foreach my $att (keys %bki_values)
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index 76bdf5c..d2c4617 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -58,9 +58,8 @@ foreach my $column (@{ $catalogs->{pg_proc}->{columns} })
 my $data = $catalogs->{pg_proc}->{data};
 foreach my $row (@$data)
 {
-	# Split line into tokens without interpreting their meaning.
 	my %bki_values;
-	@bki_values{@attnames} = Catalog::SplitDataLine($row->{bki_values});
+	@bki_values{@attnames} = @{$row->{bki_values}};

 	# Select out just the rows for internal-language procedures.
 	# Note assumption here that INTERNALlanguageId is 12.
-- 
2.7.4

From c8e55bd9ff36029c3f4fe7053f54f9f862c79d7e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Mon, 17 Apr 2017 14:47:06 +0100
Subject: [PATCH 3/8] Inline SplitDataLine into its only caller

---
 src/backend/catalog/Catalog.pm | 38 ++++++++++++++------------------------
 1 file changed, 14 insertions(+), 24 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index 0c508b0..9bd6263 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -82,12 +82,24 @@ sub Catalogs
 			}
 			elsif (/^DATA\(insert(?:\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
 			{
-				my @bki_values = SplitDataLine($2);
+				# The field-splitting regex will clobber $1, save it
+				my $oid = $1;
+
+				# This handling of quoted strings might look too
+				# simplistic, but it matches what bootscanner.l does:
+				# that has no provision for quote marks inside quoted
+				# strings, either.  If we don't have a quoted string,
+				# just snarf everything till next whitespace.  That will
+				# accept some things that bootscanner.l will see as
+				# erroneous tokens; but it seems wiser to do that and
+				# let bootscanner.l complain than to silently drop
+				# non-whitespace characters.
+				my @bki_values = $2 =~ /"[^"]*"|\S+/g;

 				check_natts($filename, $catalog{natts}, scalar(@bki_values),
 					$input_file, $input_line_number);

-				push @{ $catalog{data} }, { oid => $1, bki_values => \@bki_values };
+				push @{ $catalog{data} }, { oid => $oid, bki_values => \@bki_values };
 			}
 			elsif (/^DESCR\(\"(.*)\"\)$/)
 			{
@@ -218,28 +230,6 @@ sub Catalogs
 	return \%catalogs;
 }

-# Split a DATA line into fields.
-# Call this on the bki_values element of a DATA item returned by Catalogs();
-# it returns a list of field values.  We don't strip quoting from the fields.
-# Note: it should be safe to assign the result to a list of length equal to
-# the nominal number of catalog fields, because check_natts already checked
-# the number of fields.
-sub SplitDataLine
-{
-	my $bki_values = shift;
-
-	# This handling of quoted strings might look too simplistic, but it
-	# matches what bootscanner.l does: that has no provision for quote marks
-	# inside quoted strings, either.  If we don't have a quoted string, just
-	# snarf everything till next whitespace.  That will accept some things
-	# that bootscanner.l will see as erroneous tokens; but it seems wiser
-	# to do that and let bootscanner.l complain than to silently drop
-	# non-whitespace characters.
-	my @result = $bki_values =~ /"[^"]*"|\S+/g;
-
-	return @result;
-}
-
 # Rename temporary files to final names.
 # Call this function with the final file name and the .tmp extension
 # Note: recommended extension is ".tmp$$", so that parallel make steps
-- 
2.7.4

From fc6c69a692002277378308c2f6ca019185ef9fbb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Mon, 17 Apr 2017 14:38:22 +0100
Subject: [PATCH 4/8] Inline check_natts into its only caller

---
 src/backend/catalog/Catalog.pm | 24 ++++++++----------------
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index 9bd6263..0aa3e0c 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -96,8 +96,14 @@ sub Catalogs
 				# non-whitespace characters.
 				my @bki_values = $2 =~ /"[^"]*"|\S+/g;

-				check_natts($filename, $catalog{natts}, scalar(@bki_values),
-					$input_file, $input_line_number);
+				# Check that the DATA line matches the declared number of attributes
+				die "Could not find definition for Natts_${catname} before start of DATA() in $filename\n"
+				  unless defined $catalog{natts};
+
+				die sprintf
+				  "Wrong number of attributes in DATA() entry at %s:%d (expected %d but got %d)\n",
+				  $filename, $input_line_number, $catalog{natts}, scalar(@bki_values)
+				  unless $catalog{natts} == @bki_values;

 				push @{ $catalog{data} }, { oid => $oid, bki_values => \@bki_values };
@@ -243,18 +249,4 @@ sub RenameTempFile
 	rename($temp_name, $final_name) || die "rename: $temp_name: $!";
 }

-# verify the number of fields in the passed-in DATA line
-sub check_natts
-{
-	my ($catname, $natts, $bki_vals, $file, $line) = @_;
-
-	die "Could not find definition for Natts_${catname} before start of DATA() in $file\n"
-	  unless defined $natts;
-
-	die sprintf
-	  "Wrong number of attributes in DATA() entry at %s:%d (expected %d but got %d)\n",
-	  $file, $line, $natts, $bki_vals
-	  unless $natts == $bki_vals;
-}
-
 1;
-- 
2.7.4

From 592dbc64d21066f5da43d535af2cb3780b95d006 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Mon, 17 Apr 2017 15:57:18 +0100
Subject: [PATCH 6/8] Remove pointless Exporter usage in Catalog.pm

None of the users actually import any of the subroutines, they just
call them by their fully-qualified names.

---
 src/backend/catalog/Catalog.pm | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index b326427..d4b4170 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -16,11 +16,6 @@ package Catalog;
 use strict;
 use warnings;

-require Exporter;
-our @ISA = qw(Exporter);
-our @EXPORT = ();
-our @EXPORT_OK = qw(Catalogs SplitDataLine RenameTempFile);
-
 # Call this function with an array of names of header files to parse.
 # Returns a nested data structure describing the data in the headers.
 sub Catalogs
-- 
2.7.4
On 2017-04-17 17:49:54 +0100, Dagfinn Ilmari Mannsåker wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
>> There's certainly lots more that could be done in the genbki code, but I think all we can justify at this stage of the development cycle is to get the low-hanging fruit for testing speedups.
>
> I threw Devel::NYTProf at it and picked some more low-hanging fruit. Attached are separate patches for each change, and here are the runtimes of genbki.pl and Gen_fmgrtab.pl, respectively, after each patch (averages of 5 runs, in milliseconds):
>
> master (b6dd1271): 355, 182
>
> 1: Avoid unnecessary regex captures:      349, 183
> 2: Avoid repeated calls to SplitDataLine: 316, 158
> 3: Inline SplitDataLine:                  291, 141
> 4: Inline check_natts:                    287, 141
>
> Together they shave 68ms or 19.2% off the runtime of genbki.pl and 41ms or 22.5% off the runtime of Gen_fmgrtab.pl.

I'm a bit doubtful about improving the performance of genbki at the cost of any sort of complication - it's only executed during the actual build, not during initdb... I don't see much point in doing things like 3) and 4), it's just not worth it imo.

- Andres
Andres Freund <andres@anarazel.de> writes:
> On 2017-04-17 17:49:54 +0100, Dagfinn Ilmari Mannsåker wrote:
>> I threw Devel::NYTProf at it and picked some more low-hanging fruit.

> I'm a bit doubtful about improving the performance of genbki at the cost of any sort of complication - it's only executed during the actual build, not during initdb... I don't see much point in doing things like 3) and 4), it's just not worth it imo.

Yeah, it's only run once per build, or probably even less often than that considering we only re-run it when the input files change. I could get interested in another 20% reduction in initdb's time, but not genbki's.

regards, tom lane
On 2017-04-17 17:49:54 +0100, Dagfinn Ilmari Mannsåker wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
>> There's certainly lots more that could be done in the genbki code, but I think all we can justify at this stage of the development cycle is to get the low-hanging fruit for testing speedups.
>
> I threw Devel::NYTProf at it and picked some more low-hanging fruit. Attached are separate patches for each change, and here are the runtimes of genbki.pl and Gen_fmgrtab.pl, respectively, after each patch (averages of 5 runs, in milliseconds):

Btw, I think Tom's "more that could be done" was referring more to doing more upfront work, checking, easier input format, whatnot in the genbki, not so much performance work... Tom, correct me if I'm wrong.

- Andres
Andres Freund <andres@anarazel.de> writes:
> Btw, I think Tom's "more that could be done" was referring more to doing more upfront work, checking, easier input format, whatnot in the genbki, not so much performance work... Tom, correct me if I'm wrong.

Yeah, as per what we were just saying, performance of that code isn't really an issue right now. Supporting a nicer input format is probably the biggest step forward that could be taken, but that will require some major work :-(.

We've also talked about things like supporting regprocedure rather than only regproc (ie disambiguating overloaded functions), likewise regoperator, and so on, with an eye to reducing the number of places where people have to write numeric OIDs in the input. There's some threads on this in the archives, but I'm too lazy to look.

regards, tom lane
From: ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Subject: Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Andres Freund <andres@anarazel.de> writes:
>> On 2017-04-17 17:49:54 +0100, Dagfinn Ilmari Mannsåker wrote:
>>> I threw Devel::NYTProf at it and picked some more low-hanging fruit.
>
>> I'm a bit doubtful about improving the performance of genbki at the cost of any sort of complication - it's only executed during the actual build, not during initdb... I don't see much point in doing things like 3) and 4), it's just not worth it imo.
>
> Yeah, it's only run once per build, or probably even less often than that considering we only re-run it when the input files change. I could get interested in another 20% reduction in initdb's time, but not genbki's.

Fair enough. I still think 1), 2) and 5) are worthwhile code cleanups regardless of the performance impact. In fact, if we don't care that much about the performance, we can move the duplicated code in Gen_fmgrtab.pl and genbki.pl that turns bki_values into a hash into Catalog.pm. That regresses genbki.pl time by ~30ms on my machine.

Revised patch series attached.

--
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least. - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article. - Calle Dybedahl

From f7b75bcbe993d4ddbc92da85c7148bf7fee143ee Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Mon, 17 Apr 2017 13:26:35 +0100
Subject: [PATCH 1/4] Avoid unnecessary regex captures in Catalog.pm

---
 src/backend/catalog/Catalog.pm | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index fa8de04..e7b647a 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -54,7 +54,7 @@ sub Catalogs
 	{

 		# Strip C-style comments.
-		s;/\*(.|\n)*\*/;;g;
+		s;/\*(?:.|\n)*\*/;;g;
 		if (m;/\*;)
 		{
@@ -80,12 +80,12 @@ sub Catalogs
 			{
 				$catalog{natts} = $1;
 			}
-			elsif (/^DATA\(insert(\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
+			elsif (/^DATA\(insert(?:\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
 			{
-				check_natts($filename, $catalog{natts}, $3,
+				check_natts($filename, $catalog{natts}, $2,
 					$input_file, $input_line_number);

-				push @{ $catalog{data} }, { oid => $2, bki_values => $3 };
+				push @{ $catalog{data} }, { oid => $1, bki_values => $2 };
 			}
 			elsif (/^DESCR\(\"(.*)\"\)$/)
 			{
-- 
2.7.4

From decd1b8c171cc508da5f79f01d2f0779a569a963 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Mon, 17 Apr 2017 14:04:20 +0100
Subject: [PATCH 2/4] Avoid repeated calls to Catalog::SplitDataLine

Both check_natts and the callers of Catalog::Catalogs were calling
Catalog::SplitDataLine, the former discarding the result.  Instead,
have Catalog::Catalogs store the split fields directly and pass the
count to check_natts.

---
 src/backend/catalog/Catalog.pm   | 14 +++++++-------
 src/backend/catalog/genbki.pl    |  3 +--
 src/backend/utils/Gen_fmgrtab.pl |  3 +--
 3 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index e7b647a..81513c7 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -82,10 +82,12 @@ sub Catalogs
 			}
 			elsif (/^DATA\(insert(?:\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
 			{
-				check_natts($filename, $catalog{natts}, $2,
+				my @bki_values = SplitDataLine($2);
+
+				check_natts($filename, $catalog{natts}, scalar(@bki_values),
 					$input_file, $input_line_number);

-				push @{ $catalog{data} }, { oid => $1, bki_values => $2 };
+				push @{ $catalog{data} }, { oid => $1, bki_values => \@bki_values };
 			}
 			elsif (/^DESCR\(\"(.*)\"\)$/)
 			{
@@ -254,17 +256,15 @@ sub RenameTempFile
 # verify the number of fields in the passed-in DATA line
 sub check_natts
 {
-	my ($catname, $natts, $bki_val, $file, $line) = @_;
+	my ($catname, $natts, $bki_vals, $file, $line) = @_;

 	die "Could not find definition for Natts_${catname} before start of DATA() in $file\n"
 	  unless defined $natts;

-	my $nfields = scalar(SplitDataLine($bki_val));
-
 	die sprintf
 	  "Wrong number of attributes in DATA() entry at %s:%d (expected %d but got %d)\n",
-	  $file, $line, $natts, $nfields
-	  unless $natts == $nfields;
+	  $file, $line, $natts, $bki_vals
+	  unless $natts == $bki_vals;
 }

 1;
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 198e072..8875f6c 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -161,9 +161,8 @@ foreach my $catname (@{ $catalogs->{names} })

 	foreach my $row (@{ $catalog->{data} })
 	{
-		# Split line into tokens without interpreting their meaning.
 		my %bki_values;
-		@bki_values{@attnames} = Catalog::SplitDataLine($row->{bki_values});
+		@bki_values{@attnames} = @{$row->{bki_values}};

 		# Perform required substitutions on fields
 		foreach my $att (keys %bki_values)
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index 76bdf5c..d2c4617 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -58,9 +58,8 @@ foreach my $column (@{ $catalogs->{pg_proc}->{columns} })
 my $data = $catalogs->{pg_proc}->{data};
 foreach my $row (@$data)
 {
-	# Split line into tokens without interpreting their meaning.
 	my %bki_values;
-	@bki_values{@attnames} = Catalog::SplitDataLine($row->{bki_values});
+	@bki_values{@attnames} = @{$row->{bki_values}};

 	# Select out just the rows for internal-language procedures.
 	# Note assumption here that INTERNALlanguageId is 12.
-- 
2.7.4

From a5b8e34bfab6a687d6b13293c27f1f1246bffacb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Mon, 17 Apr 2017 15:19:33 +0100
Subject: [PATCH 3/4] Hashify bki values in Catalog::Catalogs

Both callers were doing the same conversion of the bki_values array
into a hash, so do it just once in Catalog.pm

---
 src/backend/catalog/Catalog.pm   |  5 ++++-
 src/backend/catalog/genbki.pl    | 23 +++++++++++------------
 src/backend/utils/Gen_fmgrtab.pl | 18 ++++++------------
 3 files changed, 21 insertions(+), 25 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index 81513c7..9c06ec1 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -87,7 +87,10 @@ sub Catalogs
 				check_natts($filename, $catalog{natts}, scalar(@bki_values),
 					$input_file, $input_line_number);

-				push @{ $catalog{data} }, { oid => $1, bki_values => \@bki_values };
+				my %bki_values;
+				@bki_values{map $_->{name}, @{$catalog{columns}}} = @bki_values;
+
+				push @{ $catalog{data} }, { oid => $oid, bki_values => \%bki_values };
 			}
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 8875f6c..fba157e 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -161,24 +161,23 @@ foreach my $catname (@{ $catalogs->{names} })

 	foreach my $row (@{ $catalog->{data} })
 	{
-		my %bki_values;
-		@bki_values{@attnames} = @{$row->{bki_values}};
+		my $bki_values = $row->{bki_values};

 		# Perform required substitutions on fields
-		foreach my $att (keys %bki_values)
+		foreach my $att (keys %$bki_values)
 		{

 			# Substitute constant values we acquired above.
 			# (It's intentional that this can apply to parts of a field).
-			$bki_values{$att} =~ s/\bPGUID\b/$BOOTSTRAP_SUPERUSERID/g;
-			$bki_values{$att} =~ s/\bPGNSP\b/$PG_CATALOG_NAMESPACE/g;
+			$bki_values->{$att} =~ s/\bPGUID\b/$BOOTSTRAP_SUPERUSERID/g;
+			$bki_values->{$att} =~ s/\bPGNSP\b/$PG_CATALOG_NAMESPACE/g;

 			# Replace regproc columns' values with OIDs.
 			# If we don't have a unique value to substitute,
 			# just do nothing (regprocin will complain).
 			if ($bki_attr{$att}->{type} eq 'regproc')
 			{
-				my $procoid = $regprocoids{$bki_values{$att}};
-				$bki_values{$att} = $procoid
+				my $procoid = $regprocoids{$bki_values->{$att}};
+				$bki_values->{$att} = $procoid
 				  if defined($procoid) && $procoid ne 'MULTIPLE';
 			}
 		}
@@ -187,20 +186,20 @@ foreach my $catname (@{ $catalogs->{names} })
 		# This relies on the order we process the files in!
 		if ($catname eq 'pg_proc')
 		{
-			if (defined($regprocoids{$bki_values{proname}}))
+			if (defined($regprocoids{$bki_values->{proname}}))
 			{
-				$regprocoids{$bki_values{proname}} = 'MULTIPLE';
+				$regprocoids{$bki_values->{proname}} = 'MULTIPLE';
 			}
 			else
 			{
-				$regprocoids{$bki_values{proname}} = $row->{oid};
+				$regprocoids{$bki_values->{proname}} = $row->{oid};
 			}
 		}

 		# Save pg_type info for pg_attribute processing below
 		if ($catname eq 'pg_type')
 		{
-			my %type = %bki_values;
+			my %type = %$bki_values;
 			$type{oid} = $row->{oid};
 			push @types, \%type;
 		}
@@ -208,7 +207,7 @@ foreach my $catname (@{ $catalogs->{names} })
 		# Write to postgres.bki
 		my $oid = $row->{oid} ? "OID = $row->{oid} " : '';
 		printf $bki "insert %s( %s )\n", $oid,
-			join(' ', @bki_values{@attnames});
+			join(' ', @{$bki_values}{@attnames});

 		# Write comments to postgres.description and postgres.shdescription
 		if (defined $row->{descr})
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index d2c4617..03a2ed5 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -49,28 +49,22 @@ my $catalogs = Catalog::Catalogs($infile);

 # Collect the raw data from pg_proc.h.
 my @fmgr = ();
-my @attnames;
-foreach my $column (@{ $catalogs->{pg_proc}->{columns} })
-{
-	push @attnames, $column->{name};
-}

 my $data = $catalogs->{pg_proc}->{data};
 foreach my $row (@$data)
 {
-	my %bki_values;
-	@bki_values{@attnames} = @{$row->{bki_values}};
+	my $bki_values = $row->{bki_values};

 	# Select out just the rows for internal-language procedures.
 	# Note assumption here that INTERNALlanguageId is 12.
-	next if $bki_values{prolang} ne '12';
+	next if $bki_values->{prolang} ne '12';

 	push @fmgr,
 		{ oid => $row->{oid},
-		  strict => $bki_values{proisstrict},
-		  retset => $bki_values{proretset},
-		  nargs => $bki_values{pronargs},
-		  prosrc => $bki_values{prosrc}, };
+		  strict => $bki_values->{proisstrict},
+		  retset => $bki_values->{proretset},
+		  nargs => $bki_values->{pronargs},
+		  prosrc => $bki_values->{prosrc}, };
 }

 # Emit headers for both files
-- 
2.7.4

From 036b1f6a01ac518501056b10b6dd011fda6815ed Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Mon, 17 Apr 2017 15:57:18 +0100
Subject: [PATCH 4/4] Remove pointless Exporter usage in Catalog.pm

None of the users actually import any of the subroutines, they just
call them by their fully-qualified names.

---
 src/backend/catalog/Catalog.pm | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index 9c06ec1..662df3b 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -16,11 +16,6 @@ package Catalog;
 use strict;
 use warnings;

-require Exporter;
-our @ISA = qw(Exporter);
-our @EXPORT = ();
-our @EXPORT_OK = qw(Catalogs SplitDataLine RenameTempFile);
-
 # Call this function with an array of names of header files to parse.
 # Returns a nested data structure describing the data in the headers.
 sub Catalogs
-- 
2.7.4
Resurrecting this old thread ...

I decided it'd be interesting to re-examine where initdb's runtime is going, seeing that we just got done with a lot of bootstrap data restructuring. I stuck some timing code into initdb, and got results like this:

creating directory /home/postgres/testversion/data ... ok
elapsed = 0.256 msec
creating subdirectories ... ok
elapsed = 2.385 msec
selecting default max_connections ... 100
elapsed = 13.528 msec
selecting default shared_buffers ... 128MB
elapsed = 13.699 msec
selecting dynamic shared memory implementation ... posix
elapsed = 0.129 msec
elapsed = 281.335 msec in select_default_timezone
creating configuration files ... ok
elapsed = 1.319 msec
running bootstrap script ... ok
elapsed = 162.143 msec
performing post-bootstrap initialization ... ok
elapsed = 832.569 msec
Sync to disk skipped.

real    0m1.316s
user    0m0.941s
sys     0m0.395s

(I'm using "initdb -N" because the cost of the sync step is so platform-dependent, and it's not interesting anyway for buildfarm or make check-world testing. Also, I rearranged the code slightly so that select_default_timezone could be timed separately from the rest of the "creating configuration files" step.)

In trying to break down the "post-bootstrap initialization" step a bit further, I soon realized that trying to time the sub-steps from initdb is useless. initdb is just shoving bytes down the pipe as fast as the kernel will let it; it has no idea how long it's taking the backend to do any one query or queries. So I ended up adding "-c log_statement=all -c log_min_duration_statement=0" to the backend_options, and digging query durations out of the log output. I got these totals for the major steps in the post-boot run:

pg_authid setup:        0.909 ms
pg_depend setup:       64.980 ms
system views:         106.221 ms
pg_description:        39.665 ms
pg_collation:          65.162 ms
conversions:           72.024 ms
text search:           29.454 ms
init-acl hacking:      14.339 ms
information schema:   188.497 ms
plpgsql:                2.531 ms
analyze/vacuum/additional db creation: 171.762 ms

So the conversions don't look nearly as interesting as Andreas suggested upthread. Pushing them into .bki format would at best save ~ 70 ms out of 1300. Which is not nothing, but it's not going to change the world either.

Really the only thing here that jumps out as being unduly expensive for what it's doing is select_default_timezone. That is, and always has been, a brute-force algorithm; I wonder if there's a way to do better? We can probably guess that every non-Windows platform is using the IANA timezone data these days. If there were some way to extract the name of the active timezone setting directly, we wouldn't have to try to reverse-engineer it. But I don't know of any portable way :-(

regards, tom lane
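(A hypothetical helper in the spirit of the log digging described above — not the script actually used, and the log file name is a placeholder. With log_statement=all and log_min_duration_statement=0, each statement gets a "duration: N ms" line that can simply be summed per step:)

    use strict;
    use warnings;

    my $total = 0;
    open my $fh, '<', 'postbootstrap.log' or die "open: $!";
    while (my $line = <$fh>)
    {
        # Accumulate the reported statement durations.
        $total += $1 if $line =~ /duration: (\d+(?:\.\d+)?) ms/;
    }
    close $fh;
    printf "total query time: %.3f ms\n", $total;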
On Tue, May 8, 2018 at 4:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Really the only thing here that jumps out as being unduly expensive for what it's doing is select_default_timezone. That is, and always has been, a brute-force algorithm; I wonder if there's a way to do better? We can probably guess that every non-Windows platform is using the IANA timezone data these days. If there were some way to extract the name of the active timezone setting directly, we wouldn't have to try to reverse-engineer it. But I don't know of any portable way :-(

Who says we need a portable way? If we had something that worked on Linux and macOS, it would cover most developer environments. I wonder if readlink("/etc/localtime", buf, sz) might be a viable approach. You could, at least, try any time zone you can derive that way first, before you try any others.

Also, how about having a --timezone option for initdb? Then you could determine the time zone just once and pass it down to subsequent initdb invocations. Or just hardcode the regression tests to use some fixed time zone, like Antarctica/Troll. :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
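(A sketch of the readlink idea, written in Perl for brevity even though initdb itself is C; the zoneinfo-path heuristic is an assumption about IANA-style layouts, not code from the thread. If /etc/localtime is a symlink into the zoneinfo tree, the zone name is the path suffix after "/zoneinfo/":)

    use strict;
    use warnings;

    # readlink returns undef if the file is not a symlink (e.g. a
    # hardlink or plain copy), in which case we'd fall back to the
    # brute-force search.
    my $target = readlink '/etc/localtime';
    if (defined $target && $target =~ m{/zoneinfo/(.+)$})
    {
        print "candidate zone: $1\n";    # e.g. America/New_York
    }
    else
    {
        print "no symlink hint; fall back to brute-force search\n";
    }

A candidate found this way would still need to be verified against the database's own zone data before being trusted.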
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, May 8, 2018 at 4:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Really the only thing here that jumps out as being unduly expensive for what it's doing is select_default_timezone. That is, and always has been, a brute-force algorithm; I wonder if there's a way to do better?

> Who says we need a portable way? If we had something that worked on Linux and macOS, it would cover most developer environments. I wonder if readlink("/etc/localtime", buf, sz) might be a viable approach.

I wondered about that, but I'm afraid it's often a hardlink not a symlink. Still, we could try it.

> Also, how about having a --timezone option for initdb?

We already have that, it's called the TZ environment variable. There was an effort awhile back to try to speed up the buildfarm by having that get set automatically, but it failed for reasons I don't recall ATM.

regards, tom lane
Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, May 8, 2018 at 4:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Really the only thing here that jumps out as being unduly expensive for what it's doing is select_default_timezone. That is, and always has been, a brute-force algorithm; I wonder if there's a way to do better?
>
>> Who says we need a portable way? If we had something that worked on Linux and macOS, it would cover most developer environments. I wonder if readlink("/etc/localtime", buf, sz) might be a viable approach.
>
> I wondered about that, but I'm afraid it's often a hardlink not a symlink. Still, we could try it.

In Debian systems, it's a symlink. Apparently in RHEL6 and older it's a copy or hardlink, and the file /etc/sysconfig/clock contains a ZONE variable that points to the right zone. Maybe if we add enough platform-dependent hacks, we would use the slow fallback only for rare cases. (Maybe have initdb emit a warning when the fallback is used, so that we know what else to look for.)

This comment is insightful:
https://github.com/HowardHinnant/date/issues/269#issuecomment-353792132

It's talking about this code:
https://github.com/HowardHinnant/date/blob/master/src/tz.cpp#L3652

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, May 9, 2018 at 11:39 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> In Debian systems, it's a symlink.  Apparently in RHEL6 and older it's a
> copy or hardlink, and the file /etc/sysconfig/clock contains a ZONE
> variable that points to the right zone.  Maybe if we add enough
> platform-dependent hacks, we would use the slow fallback only for rare
> cases.  (Maybe have initdb emit a warning when the fallback is used, so
> that we know what else to look for.)

I just checked a couple of RHEL7 systems and it seems to be a symlink
there.  It's also a symlink on my laptop (macOS 10.13.3).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> Who says we need a portable way?  If we had something that worked on
>>> Linux and macOS, it would cover most developer environments.  I wonder
>>> if readlink("/etc/localtime", buf, sz) might be a viable approach.

>> I wondered about that, but I'm afraid it's often a hardlink not a
>> symlink.  Still, we could try it.

> In Debian systems, it's a symlink.  Apparently in RHEL6 and older it's a
> copy or hardlink, and the file /etc/sysconfig/clock contains a ZONE
> variable that points to the right zone.

Yeah, on my RHEL6 box,

$ ls -l /etc/localtime
-rw-r--r--. 1 root root 3519 May  4  2010 /etc/localtime

The mod date's a little astonishing, considering this system was built
in 2013.  It is *not* a hardlink, or even an exact match, to anything
under /usr/share/zoneinfo, though perhaps it was originally.  Also:

$ cat /etc/sysconfig/clock
# The time zone of the system is defined by the contents of /etc/localtime.
# This file is only for evaluation by system-config-date, do not rely on its
# contents elsewhere.
ZONE="America/New York"

I'm inclined to respect the comment, especially since I see they are not
spelling the zone name canonically anyway (note space not underscore);
so looking at this wouldn't work without some ill-defined heuristics.
However, more modern Red Hat systems seem to have /etc/localtime as a
symlink, so checking it would work there, and also macOS seems to do it
that way for as far back as I can check.

> This comment is insightful:
> https://github.com/HowardHinnant/date/issues/269#issuecomment-353792132
> It's talking about this code:
> https://github.com/HowardHinnant/date/blob/master/src/tz.cpp#L3652

Interesting.  Not sure if we want to try all those files.  But I'll
take a look at whipping up something that checks /etc/localtime.

			regards, tom lane
I wrote:
> ... I'll
> take a look at whipping up something that checks /etc/localtime.

Here's a draft patch.  It seems to do what I expect on a couple of
different macOS releases as well as recent Fedora.

Googling finds some suggestions that there might be other locations for
the symlink, for instance it's alleged that glibc can be configured to
look at /usr/local/etc/localtime instead of /etc/localtime.  But AFAICS
none of the alternative locations are used on enough systems that it'd
be worth the cycles to check them.  Likewise, there seem to be some
systems with conventions about storing the zone name in a text file
someplace, but not really enough of them to make it worth our time to
support that.

The initdb speedup on the other boxes I checked seems to be in the
vicinity of 50 ms out of circa-1-sec total, which is less than I'd have
expected from my measurements on my RHEL6 box :-(.  Still, I think it's
worth doing, not least because (if it works) this fixes the problem that
you may get an unexpected alias for your system timezone, as in this
complaint awhile back:
https://www.postgresql.org/message-id/flat/4D7016AA.2090609%40agliodbs.com

I also changed initdb so that it reports what zone name it picked.  I
have a vague recollection that it might've done that long ago, and we
suppressed it in a burst of enthusiasm about making initdb less chatty.
That seems like something to undo, since we're tweaking what are still
basically heuristic rules for choosing the zone.

Next question is what to do with this.  Do we want to sit on it till
v12, or sneak it in now?

			regards, tom lane

diff --git a/src/bin/initdb/findtimezone.c b/src/bin/initdb/findtimezone.c
index 4c3a91a..6901188 100644
--- a/src/bin/initdb/findtimezone.c
+++ b/src/bin/initdb/findtimezone.c
@@ -15,6 +15,7 @@
 #include <fcntl.h>
 #include <sys/stat.h>
 #include <time.h>
+#include <unistd.h>
 
 #include "pgtz.h"
 
@@ -126,12 +127,19 @@ pg_load_tz(const char *name)
  * On most systems, we rely on trying to match the observable behavior of
  * the C library's localtime() function.  The database zone that matches
  * furthest into the past is the one to use.  Often there will be several
- * zones with identical rankings (since the Olson database assigns multiple
+ * zones with identical rankings (since the IANA database assigns multiple
  * names to many zones).  We break ties arbitrarily by preferring shorter,
  * then alphabetically earlier zone names.
  *
+ * Many modern systems use the IANA database, so if we can determine the
+ * system's idea of which zone it is using and its behavior matches our zone
+ * of the same name, we can skip the rather-expensive search through all the
+ * zones in our database.  This short-circuit path also ensures that we spell
+ * the zone name the same way the system setting does, even in the presence
+ * of multiple aliases for the same zone.
+ *
  * Win32's native knowledge about timezones appears to be too incomplete
- * and too different from the Olson database for the above matching strategy
+ * and too different from the IANA database for the above matching strategy
  * to be of any use.  But there is just a limited number of timezones
  * available, so we can rely on a handmade mapping table instead.
  */
@@ -150,6 +158,8 @@ struct tztry
 	time_t		test_times[MAX_TEST_TIMES];
 };
 
+static bool check_system_link_file(const char *linkname, struct tztry *tt,
+					   char *bestzonename);
 static void scan_available_timezones(char *tzdir, char *tzdirsub,
 						 struct tztry *tt,
 						 int *bestscore, char *bestzonename);
@@ -299,12 +309,19 @@ score_timezone(const char *tzname, struct tztry *tt)
 	return i;
 }
 
+/*
+ * Test whether given zone name is a perfect match to localtime() behavior
+ */
+static bool
+perfect_timezone_match(const char *tzname, struct tztry *tt)
+{
+	return (score_timezone(tzname, tt) == tt->n_test_times);
+}
+
 /*
  * Try to identify a timezone name (in our terminology) that best matches the
- * observed behavior of the system timezone library.  We cannot assume that
- * the system TZ environment setting (if indeed there is one) matches our
- * terminology, so we ignore it and just look at what localtime() returns.
+ * observed behavior of the system localtime() function.
  */
 static const char *
 identify_system_timezone(void)
@@ -339,7 +356,7 @@ identify_system_timezone(void)
 	 * way of doing things, but experience has shown that system-supplied
 	 * timezone definitions are likely to have DST behavior that is right for
 	 * the recent past and not so accurate further back.  Scoring in this way
-	 * allows us to recognize zones that have some commonality with the Olson
+	 * allows us to recognize zones that have some commonality with the IANA
 	 * database, without insisting on exact match.  (Note: we probe Thursdays,
 	 * not Sundays, to avoid triggering DST-transition bugs in localtime
 	 * itself.)
@@ -374,7 +391,18 @@ identify_system_timezone(void)
 		tt.test_times[tt.n_test_times++] = t;
 	}
 
-	/* Search for the best-matching timezone file */
+	/*
+	 * Try to avoid the brute-force search by seeing if we can recognize the
+	 * system's timezone setting directly.
+	 *
+	 * Currently we just check /etc/localtime; there are other conventions for
+	 * this, but that seems to be the only one used on enough platforms to be
+	 * worth troubling over.
+	 */
+	if (check_system_link_file("/etc/localtime", &tt, resultbuf))
+		return resultbuf;
+
+	/* No luck, so search for the best-matching timezone file */
 	strlcpy(tmptzdir, pg_TZDIR(), sizeof(tmptzdir));
 	bestscore = -1;
 	resultbuf[0] = '\0';
@@ -383,7 +411,7 @@
 							 &bestscore, resultbuf);
 	if (bestscore > 0)
 	{
-		/* Ignore Olson's rather silly "Factory" zone; use GMT instead */
+		/* Ignore IANA's rather silly "Factory" zone; use GMT instead */
 		if (strcmp(resultbuf, "Factory") == 0)
 			return NULL;
 		return resultbuf;
@@ -472,7 +500,7 @@ identify_system_timezone(void)
 	/*
 	 * Did not find the timezone.  Fallback to use a GMT zone.  Note that the
-	 * Olson timezone database names the GMT-offset zones in POSIX style: plus
+	 * IANA timezone database names the GMT-offset zones in POSIX style: plus
 	 * is west of Greenwich.  It's unfortunate that this is opposite of SQL
 	 * conventions.  Should we therefore change the names?  Probably not...
 	 */
@@ -487,6 +515,94 @@ identify_system_timezone(void)
 }
 
 /*
+ * Examine a system-provided symlink file to see if it tells us the timezone.
+ *
+ * Unfortunately, there is little standardization of how the system default
+ * timezone is determined in the absence of a TZ environment setting.
+ * But a common strategy is to create a symlink at a well-known place.
+ * If "linkname" identifies a readable symlink, and the tail of its contents
+ * matches a zone name we know, and the actual behavior of localtime() agrees
+ * with what we think that zone means, then we may use that zone name.
+ *
+ * We insist on a perfect behavioral match, which might not happen if the
+ * system has a different IANA database version than we do; but in that case
+ * it seems best to fall back to the brute-force search.
+ *
+ * linkname is the symlink file location to probe.
+ *
+ * tt tells about the system timezone behavior we need to match.
+ *
+ * If we successfully identify a zone name, store it in *bestzonename and
+ * return true; else return false.  bestzonename must be a buffer of length
+ * TZ_STRLEN_MAX + 1.
+ */
+static bool
+check_system_link_file(const char *linkname, struct tztry *tt,
+					   char *bestzonename)
+{
+#ifdef HAVE_READLINK
+	char		link_target[MAXPGPATH];
+	int			len;
+	const char *cur_name;
+
+	/*
+	 * Try to read the symlink.  If not there, not a symlink, etc etc, just
+	 * quietly fail; the precise reason needn't concern us.
+	 */
+	len = readlink(linkname, link_target, sizeof(link_target));
+	if (len < 0 || len >= sizeof(link_target))
+		return false;
+	link_target[len] = '\0';
+
+#ifdef DEBUG_IDENTIFY_TIMEZONE
+	fprintf(stderr, "symbolic link \"%s\" contains \"%s\"\n",
+			linkname, link_target);
+#endif
+
+	/*
+	 * The symlink is probably of the form "/path/to/zones/zone/name", or
+	 * possibly it is a relative path.  Nobody puts their zone DB directly in
+	 * the root directory, so we can definitely skip the first component; but
+	 * after that it's trial-and-error to identify which path component begins
+	 * the zone name.
+	 */
+	cur_name = link_target;
+	while (*cur_name)
+	{
+		/* Advance to next segment of path */
+		cur_name = strchr(cur_name + 1, '/');
+		if (cur_name == NULL)
+			break;
+		/* If there are consecutive slashes, skip all, as the kernel would */
+		do
+		{
+			cur_name++;
+		} while (*cur_name == '/');
+
+		/*
+		 * Test remainder of path to see if it is a matching zone name.
+		 * Relative paths might contain ".."; we needn't bother testing if the
+		 * first component is that.  Also defend against overlength names.
+		 */
+		if (*cur_name && *cur_name != '.' &&
+			strlen(cur_name) <= TZ_STRLEN_MAX &&
+			perfect_timezone_match(cur_name, tt))
+		{
+			/* Success! */
+			strcpy(bestzonename, cur_name);
+			return true;
+		}
+	}
+
+	/* Couldn't extract a matching zone name */
+	return false;
+#else
+	/* No symlinks?  Forget it */
+	return false;
+#endif
+}
+
+/*
  * Recursively scan the timezone database looking for the best match to
  * the system timezone behavior.
  *
@@ -586,7 +702,7 @@ static const struct
  * HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time
  * Zones on Windows 10 and Windows 7.
  *
- * The zones have been matched to Olson timezones by looking at the cities
+ * The zones have been matched to IANA timezones by looking at the cities
  * listed in the win32 display name (in the comment here) in most cases.
  */
 {
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index ae22e7d..c8ba4b8 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -173,6 +173,7 @@ static char *pgdata_native;
 /* defaults */
 static int	n_connections = 10;
 static int	n_buffers = 50;
+static const char *default_timezone = NULL;
 static char *dynamic_shared_memory_type = NULL;
 
 /*
@@ -1047,6 +1048,11 @@ test_config_settings(void)
 	else
 		printf("%dkB\n", n_buffers * (BLCKSZ / 1024));
 
+	printf(_("selecting default timezone ... "));
+	fflush(stdout);
+	default_timezone = select_default_timezone(share_path);
+	printf("%s\n", default_timezone ? default_timezone : "GMT");
+
 	printf(_("selecting dynamic shared memory implementation ... "));
 	fflush(stdout);
 	dynamic_shared_memory_type = choose_dsm_implementation();
@@ -1079,7 +1085,6 @@ setup_config(void)
 	char	  **conflines;
 	char		repltok[MAXPGPATH];
 	char		path[MAXPGPATH];
-	const char *default_timezone;
 	char	   *autoconflines[3];
 
 	fputs(_("creating configuration files ... "), stdout);
@@ -1161,7 +1166,6 @@ setup_config(void)
 					  "#default_text_search_config = 'pg_catalog.simple'",
 					  repltok);
 
-	default_timezone = select_default_timezone(share_path);
 	if (default_timezone)
 	{
 		snprintf(repltok, sizeof(repltok), "timezone = '%s'",
Hi,

On 2018-05-10 12:18:19 -0400, Tom Lane wrote:
> Next question is what to do with this.  Do we want to sit on it till
> v12, or sneak it in now?

Is there a decent argument for sneaking it in?  I don't really have an
opinion.  I don't think it'd really be arguable that this'll make
testing meaningfully faster.  OTOH, it's fresh in your mind (which can
be said about a lot of patches obviously).

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2018-05-10 12:18:19 -0400, Tom Lane wrote:
>> Next question is what to do with this.  Do we want to sit on it till
>> v12, or sneak it in now?

> Is there a decent argument for sneaking it in?  I don't really have an
> opinion.  I don't think it'd really be arguable that this'll make
> testing meaningfully faster.  OTOH, it's fresh in your mind (which can
> be said about a lot of patches obviously).

Yeah, I had hoped that this might make a noticeable difference on slower
buildfarm animals, but some testing shows that it's more likely to be
barely above the noise floor.

OTOH, in view of Josh's old gripe, maybe it could be argued to be a bug
fix, at least on platforms where it does anything.

			regards, tom lane
On Thu, May 10, 2018 at 3:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, I had hoped that this might make a noticeable difference on slower
> buildfarm animals, but some testing shows that it's more likely to be
> barely above the noise floor.

Any idea why?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, May 10, 2018 at 3:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, I had hoped that this might make a noticeable difference on slower
>> buildfarm animals, but some testing shows that it's more likely to be
>> barely above the noise floor.

> Any idea why?

Well, maybe the noise floor is too high.  I experimented with "make
check" in src/bin/scripts (TAP tests enabled), which does 13 initdb's
as of HEAD.  On dromedary's host (an older macOS release), 3 runs each:

HEAD:    1m47.017s  1m49.548s  1m46.805s
+patch:  1m47.188s  1m45.611s  1m51.520s

If you blame that last run on some background task chewing cycles, which
is certainly possible considering I'd neglected to turn off dromedary's
cron job, then there's some gain but not much.

There *is* a clearly visible win if you measure "initdb -N" in
isolation,

HEAD:    0m5.244s  0m5.025s  0m4.930s
+patch:  0m4.785s  0m4.706s  0m4.824s

but it's just not much compared to the other overhead of a TAP test.
These numbers would support the idea that we're saving about 250 ms per
initdb, which times 13 is circa 3 sec, which is comparable to the
inter-run variation I'm seeing in the scripts test.  I have no doubt
that it'd add up to something over a lot of buildfarm runs, but it's
certainly no sort of game-changer.

Running on my RHEL6 dev machine, where I customarily use
PROVE_FLAGS='-j4' for check-world runs, src/bin/scripts takes maybe
14.9s on average.  The patch itself does nothing for that machine of
course, but if I simulate its effect by setting the TZ environment
variable, the runtime drops to 14.2s or so.  So again, there'd be some
win for developer testing, but not enough to get anybody excited.

			regards, tom lane
> On May 10, 2018, at 12:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andres Freund <andres@anarazel.de> writes:
>> On 2018-05-10 12:18:19 -0400, Tom Lane wrote:
>>> Next question is what to do with this.  Do we want to sit on it till
>>> v12, or sneak it in now?
>
>> Is there a decent argument for sneaking it in?  I don't really have an
>> opinion.  I don't think it'd really be arguable that this'll make
>> testing meaningfully faster.  OTOH, it's fresh in your mind (which can
>> be said about a lot of patches obviously).
>
> Yeah, I had hoped that this might make a noticeable difference on slower
> buildfarm animals, but some testing shows that it's more likely to be
> barely above the noise floor.
>
> OTOH, in view of Josh's old gripe, maybe it could be argued to be a bug
> fix, at least on platforms where it does anything.

Read back to get some history/context on this, and from my vantage point
it sounds like this is fixing a bug (i.e. incorrect behavior).  It also
sounds like, based on the changes, the earliest we’d be able to commit
this is 11 and not any further back, because people could be expecting
the incorrect behavior to happen, and thus we may break existing
systems.

Jonathan
"Jonathan S. Katz" <jonathan.katz@excoventures.com> writes: >> On May 10, 2018, at 12:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> OTOH, in view of Josh's old gripe, maybe it could be argued to be a bug >> fix, at least on platforms where it does anything. > Read back to get some history/context on this, and from my vantage > point it sounds like this is fixing a bug (i.e. incorrect behavior). It also sounds > like based on the changes the earliest we’d be able to commit is is 11 and > not any further because people could be expecting the incorrect behavior to > happen, and thus we may break existing systems. Yeah, given the small number of complaints, I doubt back-patching would be a good idea. Seems like we should just leave this for v12. regards, tom lane
I wrote:
>> ... I'll
>> take a look at whipping up something that checks /etc/localtime.

> Here's a draft patch.  It seems to do what I expect on a couple of
> different macOS releases as well as recent Fedora.

The cfbot points out that this has suffered bit-rot, so here's a rebased
version --- no substantive changes.

			regards, tom lane

diff --git a/src/bin/initdb/findtimezone.c b/src/bin/initdb/findtimezone.c
index 4c3a91a..6901188 100644
*** a/src/bin/initdb/findtimezone.c
--- b/src/bin/initdb/findtimezone.c
***************
*** 15,20 ****
--- 15,21 ----
  #include <fcntl.h>
  #include <sys/stat.h>
  #include <time.h>
+ #include <unistd.h>
  
  #include "pgtz.h"
  
*************** pg_load_tz(const char *name)
*** 126,137 ****
   * On most systems, we rely on trying to match the observable behavior of
   * the C library's localtime() function.  The database zone that matches
   * furthest into the past is the one to use.  Often there will be several
!  * zones with identical rankings (since the Olson database assigns multiple
   * names to many zones).  We break ties arbitrarily by preferring shorter,
   * then alphabetically earlier zone names.
   *
   * Win32's native knowledge about timezones appears to be too incomplete
!  * and too different from the Olson database for the above matching strategy
   * to be of any use.  But there is just a limited number of timezones
   * available, so we can rely on a handmade mapping table instead.
   */
--- 127,145 ----
   * On most systems, we rely on trying to match the observable behavior of
   * the C library's localtime() function.  The database zone that matches
   * furthest into the past is the one to use.  Often there will be several
!  * zones with identical rankings (since the IANA database assigns multiple
   * names to many zones).  We break ties arbitrarily by preferring shorter,
   * then alphabetically earlier zone names.
   *
+  * Many modern systems use the IANA database, so if we can determine the
+  * system's idea of which zone it is using and its behavior matches our zone
+  * of the same name, we can skip the rather-expensive search through all the
+  * zones in our database.  This short-circuit path also ensures that we spell
+  * the zone name the same way the system setting does, even in the presence
+  * of multiple aliases for the same zone.
+  *
   * Win32's native knowledge about timezones appears to be too incomplete
!  * and too different from the IANA database for the above matching strategy
   * to be of any use.  But there is just a limited number of timezones
   * available, so we can rely on a handmade mapping table instead.
   */
*************** struct tztry
*** 150,155 ****
--- 158,165 ----
  	time_t		test_times[MAX_TEST_TIMES];
  };
  
+ static bool check_system_link_file(const char *linkname, struct tztry *tt,
+ 					   char *bestzonename);
  static void scan_available_timezones(char *tzdir, char *tzdirsub,
  						 struct tztry *tt,
  						 int *bestscore, char *bestzonename);
*************** score_timezone(const char *tzname, struc
*** 299,310 ****
  	return i;
  }
  
  /*
   * Try to identify a timezone name (in our terminology) that best matches the
!  * observed behavior of the system timezone library.  We cannot assume that
!  * the system TZ environment setting (if indeed there is one) matches our
!  * terminology, so we ignore it and just look at what localtime() returns.
   */
  static const char *
  identify_system_timezone(void)
--- 309,327 ----
  	return i;
  }
  
+ /*
+  * Test whether given zone name is a perfect match to localtime() behavior
+  */
+ static bool
+ perfect_timezone_match(const char *tzname, struct tztry *tt)
+ {
+ 	return (score_timezone(tzname, tt) == tt->n_test_times);
+ }
+ 
  /*
   * Try to identify a timezone name (in our terminology) that best matches the
!  * observed behavior of the system localtime() function.
   */
  static const char *
  identify_system_timezone(void)
*************** identify_system_timezone(void)
*** 339,345 ****
  	 * way of doing things, but experience has shown that system-supplied
  	 * timezone definitions are likely to have DST behavior that is right for
  	 * the recent past and not so accurate further back.  Scoring in this way
! 	 * allows us to recognize zones that have some commonality with the Olson
  	 * database, without insisting on exact match.  (Note: we probe Thursdays,
  	 * not Sundays, to avoid triggering DST-transition bugs in localtime
  	 * itself.)
--- 356,362 ----
  	 * way of doing things, but experience has shown that system-supplied
  	 * timezone definitions are likely to have DST behavior that is right for
  	 * the recent past and not so accurate further back.  Scoring in this way
! 	 * allows us to recognize zones that have some commonality with the IANA
  	 * database, without insisting on exact match.  (Note: we probe Thursdays,
  	 * not Sundays, to avoid triggering DST-transition bugs in localtime
  	 * itself.)
*************** identify_system_timezone(void)
*** 374,380 ****
  		tt.test_times[tt.n_test_times++] = t;
  	}
  
! 	/* Search for the best-matching timezone file */
  	strlcpy(tmptzdir, pg_TZDIR(), sizeof(tmptzdir));
  	bestscore = -1;
  	resultbuf[0] = '\0';
--- 391,408 ----
  		tt.test_times[tt.n_test_times++] = t;
  	}
  
! 	/*
! 	 * Try to avoid the brute-force search by seeing if we can recognize the
! 	 * system's timezone setting directly.
! 	 *
! 	 * Currently we just check /etc/localtime; there are other conventions for
! 	 * this, but that seems to be the only one used on enough platforms to be
! 	 * worth troubling over.
! 	 */
! 	if (check_system_link_file("/etc/localtime", &tt, resultbuf))
! 		return resultbuf;
! 
! 	/* No luck, so search for the best-matching timezone file */
  	strlcpy(tmptzdir, pg_TZDIR(), sizeof(tmptzdir));
  	bestscore = -1;
  	resultbuf[0] = '\0';
*************** identify_system_timezone(void)
*** 383,389 ****
  							 &bestscore, resultbuf);
  	if (bestscore > 0)
  	{
! 		/* Ignore Olson's rather silly "Factory" zone; use GMT instead */
  		if (strcmp(resultbuf, "Factory") == 0)
  			return NULL;
  		return resultbuf;
--- 411,417 ----
  							 &bestscore, resultbuf);
  	if (bestscore > 0)
  	{
! 		/* Ignore IANA's rather silly "Factory" zone; use GMT instead */
  		if (strcmp(resultbuf, "Factory") == 0)
  			return NULL;
  		return resultbuf;
*************** identify_system_timezone(void)
*** 472,478 ****
  	/*
  	 * Did not find the timezone.  Fallback to use a GMT zone.  Note that the
! 	 * Olson timezone database names the GMT-offset zones in POSIX style: plus
  	 * is west of Greenwich.  It's unfortunate that this is opposite of SQL
  	 * conventions.  Should we therefore change the names?  Probably not...
  	 */
--- 500,506 ----
  	/*
  	 * Did not find the timezone.  Fallback to use a GMT zone.  Note that the
! 	 * IANA timezone database names the GMT-offset zones in POSIX style: plus
  	 * is west of Greenwich.  It's unfortunate that this is opposite of SQL
  	 * conventions.  Should we therefore change the names?  Probably not...
  	 */
*************** identify_system_timezone(void)
*** 487,492 ****
--- 515,608 ----
  }
  
  /*
+  * Examine a system-provided symlink file to see if it tells us the timezone.
+  *
+  * Unfortunately, there is little standardization of how the system default
+  * timezone is determined in the absence of a TZ environment setting.
+  * But a common strategy is to create a symlink at a well-known place.
+  * If "linkname" identifies a readable symlink, and the tail of its contents
+  * matches a zone name we know, and the actual behavior of localtime() agrees
+  * with what we think that zone means, then we may use that zone name.
+  *
+  * We insist on a perfect behavioral match, which might not happen if the
+  * system has a different IANA database version than we do; but in that case
+  * it seems best to fall back to the brute-force search.
+  *
+  * linkname is the symlink file location to probe.
+  *
+  * tt tells about the system timezone behavior we need to match.
+  *
+  * If we successfully identify a zone name, store it in *bestzonename and
+  * return true; else return false.  bestzonename must be a buffer of length
+  * TZ_STRLEN_MAX + 1.
+  */
+ static bool
+ check_system_link_file(const char *linkname, struct tztry *tt,
+ 					   char *bestzonename)
+ {
+ #ifdef HAVE_READLINK
+ 	char		link_target[MAXPGPATH];
+ 	int			len;
+ 	const char *cur_name;
+ 
+ 	/*
+ 	 * Try to read the symlink.  If not there, not a symlink, etc etc, just
+ 	 * quietly fail; the precise reason needn't concern us.
+ 	 */
+ 	len = readlink(linkname, link_target, sizeof(link_target));
+ 	if (len < 0 || len >= sizeof(link_target))
+ 		return false;
+ 	link_target[len] = '\0';
+ 
+ #ifdef DEBUG_IDENTIFY_TIMEZONE
+ 	fprintf(stderr, "symbolic link \"%s\" contains \"%s\"\n",
+ 			linkname, link_target);
+ #endif
+ 
+ 	/*
+ 	 * The symlink is probably of the form "/path/to/zones/zone/name", or
+ 	 * possibly it is a relative path.  Nobody puts their zone DB directly in
+ 	 * the root directory, so we can definitely skip the first component; but
+ 	 * after that it's trial-and-error to identify which path component begins
+ 	 * the zone name.
+ 	 */
+ 	cur_name = link_target;
+ 	while (*cur_name)
+ 	{
+ 		/* Advance to next segment of path */
+ 		cur_name = strchr(cur_name + 1, '/');
+ 		if (cur_name == NULL)
+ 			break;
+ 		/* If there are consecutive slashes, skip all, as the kernel would */
+ 		do
+ 		{
+ 			cur_name++;
+ 		} while (*cur_name == '/');
+ 
+ 		/*
+ 		 * Test remainder of path to see if it is a matching zone name.
+ 		 * Relative paths might contain ".."; we needn't bother testing if the
+ 		 * first component is that.  Also defend against overlength names.
+ 		 */
+ 		if (*cur_name && *cur_name != '.' &&
+ 			strlen(cur_name) <= TZ_STRLEN_MAX &&
+ 			perfect_timezone_match(cur_name, tt))
+ 		{
+ 			/* Success! */
+ 			strcpy(bestzonename, cur_name);
+ 			return true;
+ 		}
+ 	}
+ 
+ 	/* Couldn't extract a matching zone name */
+ 	return false;
+ #else
+ 	/* No symlinks?  Forget it */
+ 	return false;
+ #endif
+ }
+ 
+ /*
   * Recursively scan the timezone database looking for the best match to
   * the system timezone behavior.
   *
*************** static const struct
*** 586,592 ****
   * HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time
   * Zones on Windows 10 and Windows 7.
   *
!  * The zones have been matched to Olson timezones by looking at the cities
   * listed in the win32 display name (in the comment here) in most cases.
   */
  {
--- 702,708 ----
   * HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time
   * Zones on Windows 10 and Windows 7.
   *
!  * The zones have been matched to IANA timezones by looking at the cities
   * listed in the win32 display name (in the comment here) in most cases.
   */
  {
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 32746c7..cb8c745 100644
*** a/src/bin/initdb/initdb.c
--- b/src/bin/initdb/initdb.c
*************** static char *pgdata_native;
*** 174,179 ****
--- 174,180 ----
  static int	n_connections = 10;
  static int	n_buffers = 50;
  static const char *dynamic_shared_memory_type = NULL;
+ static const char *default_timezone = NULL;
  
  /*
*************** test_config_settings(void)
*** 1058,1063 ****
--- 1059,1069 ----
  		printf("%dMB\n", (n_buffers * (BLCKSZ / 1024)) / 1024);
  	else
  		printf("%dkB\n", n_buffers * (BLCKSZ / 1024));
+ 
+ 	printf(_("selecting default timezone ... "));
+ 	fflush(stdout);
+ 	default_timezone = select_default_timezone(share_path);
+ 	printf("%s\n", default_timezone ? default_timezone : "GMT");
  }
  
  /*
*************** setup_config(void)
*** 1086,1092 ****
  	char	  **conflines;
  	char		repltok[MAXPGPATH];
  	char		path[MAXPGPATH];
- 	const char *default_timezone;
  	char	   *autoconflines[3];
  
  	fputs(_("creating configuration files ... "), stdout);
--- 1092,1097 ----
*************** setup_config(void)
*** 1168,1174 ****
  					  "#default_text_search_config = 'pg_catalog.simple'",
  					  repltok);
  
- 	default_timezone = select_default_timezone(share_path);
  	if (default_timezone)
  	{
  		snprintf(repltok, sizeof(repltok), "timezone = '%s'",
--- 1173,1178 ----
On Fri, Aug 10, 2018 at 02:31:25PM -0400, Tom Lane wrote:
> The cfbot points out that this has suffered bit-rot, so here's a rebased
> version --- no substantive changes.

+	/*
+	 * Try to read the symlink.  If not there, not a symlink, etc etc, just
+	 * quietly fail; the precise reason needn't concern us.
+	 */
+	len = readlink(linkname, link_target, sizeof(link_target));

One thing that I can see changing with this patch is how timezone is set
in postgresql.conf.  For example, on HEAD I get 'Japan' while this patch
gives back 'Asia/Tokyo'.  Could it be an issue for countries with
multiple timezones?  I am not sure how Russian systems would react on
that for example.

The time cut for initdb is interesting for buildfarm members anyway, and
the patch looks in nice shape to me.
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
> One thing that I can see changing with this patch is how timezone is set
> in postgresql.conf.  For example, on HEAD I get 'Japan' while this patch
> gives back 'Asia/Tokyo'.  Could it be an issue for countries with
> multiple timezones?  I am not sure how Russian systems would react on
> that for example.

Interesting --- what platform were you testing on?

I believe that this patch will never make for any functional change,
it will only give you some other alias for the zone it would have
selected anyway.  This could only fail to be true if there are distinct
timezones that score_timezone() is failing to tell apart, which would
be a bug in score_timezone, not this patch.  (Presumably, we could fix
any such bug by increasing the number of dates that score_timezone
tests.)

Since the tzdb database is rather full of aliases, for instance the one
you mentioned

$ grep ^Li src/timezone/data/tzdata.zi
...
Li Asia/Tokyo Japan
...

there is certainly plenty of opportunity for this to change the apparent
value of TimeZone.  But I think it's for the better: instead of choosing
an alias that happens to be first according to some unspecified search
order, it will choose the alias that somebody actually configured the
operating system with.

			regards, tom lane
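As a rough standalone analogue of the scoring idea (the real
score_timezone() probes many more instants, compares more fields, and
consults our own timezone library rather than going through
setenv()/tzset(); the candidate zone here is arbitrary):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NPROBES 4

int
main(void)
{
	const char *candidate = "Asia/Tokyo";
	time_t		now = time(NULL);
	time_t		probes[NPROBES];
	struct tm	sys_tm[NPROBES];
	int			i,
				score = 0;

	/* sample localtime() under the inherited environment at a few instants */
	for (i = 0; i < NPROBES; i++)
	{
		probes[i] = now - (time_t) i * 180 * 24 * 60 * 60;
		sys_tm[i] = *localtime(&probes[i]);
	}

	/* now see whether the candidate zone reproduces the same results */
	setenv("TZ", candidate, 1);
	tzset();
	for (i = 0; i < NPROBES; i++)
	{
		struct tm  *cand_tm = localtime(&probes[i]);

		if (cand_tm->tm_mday == sys_tm[i].tm_mday &&
			cand_tm->tm_hour == sys_tm[i].tm_hour &&
			cand_tm->tm_min == sys_tm[i].tm_min &&
			cand_tm->tm_isdst == sys_tm[i].tm_isdst)
			score++;
	}

	printf("%s matched %d of %d probe times\n", candidate, score, NPROBES);
	return 0;
}

A perfect score is what perfect_timezone_match() in the patch insists on
before it trusts the name pulled out of /etc/localtime.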
On Wed, Sep 12, 2018 at 10:00:21AM -0400, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
>> One thing that I can see changing with this patch is how timezone is
>> set in postgresql.conf.  For example, on HEAD I get 'Japan' while this
>> patch gives back 'Asia/Tokyo'.  Could it be an issue for countries
>> with multiple timezones?  I am not sure how Russian systems would
>> react on that for example.
>
> Interesting --- what platform were you testing on?

Debian SID, with Asia/Tokyo used for /etc/localtime.

> I believe that this patch will never make for any functional change,
> it will only give you some other alias for the zone it would have
> selected anyway.  This could only fail to be true if there are
> distinct timezones that score_timezone() is failing to tell apart,
> which would be a bug in score_timezone, not this patch.  (Presumably,
> we could fix any such bug by increasing the number of dates that
> score_timezone tests.)

Looking at the list of aliases, I am not seeing listed countries running
across multiple timezones, so that may be fine..  Anyway, your change,
and particularly the cut of time in running initdb, which matters for
TAP tests, are definitely good things in my opinion.  So I am switching
that as ready for committer.
--
Michael
On 2018-Sep-13, Michael Paquier wrote:
> Looking at the list of aliases, I am not seeing listed countries running
> across multiple timezones, so that may be fine..

Well, mine is there, and it's correct:

Li America/Santiago Chile/Continental
Li Pacific/Easter Chile/EasterIsland

Laugh all you want about Chile of all countries having multiple
timezones ...

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Michael Paquier <michael@paquier.xyz> writes:
> On Wed, Sep 12, 2018 at 10:00:21AM -0400, Tom Lane wrote:
>> I believe that this patch will never make for any functional change,
>> it will only give you some other alias for the zone it would have
>> selected anyway.

> Looking at the list of aliases, I am not seeing listed countries running
> across multiple timezones, so that may be fine.

Not sure what you're worried about.  "Linked" time zones are the *same
data*.  In an installed tzdb tree, the Japan file is either a hardlink
or symlink to the Asia/Tokyo one, so they can't differ.  What you seem
to be speculating about is actual errors in the tzdb data, ie not
describing the facts on the ground in particular places.  That's
possible I suppose but it's hardly our problem if it happens; it'd be
theirs to fix.

			regards, tom lane
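One quick way to convince yourself of that on a given box: since stat()
follows symlinks, aliased zone files report the same device/inode pair
whichever link flavor is in use (paths assume an installed tzdb under
/usr/share/zoneinfo):

#include <stdio.h>
#include <sys/stat.h>

int
main(void)
{
	struct stat a,
				b;

	if (stat("/usr/share/zoneinfo/Japan", &a) != 0 ||
		stat("/usr/share/zoneinfo/Asia/Tokyo", &b) != 0)
	{
		perror("stat");
		return 1;
	}

	if (a.st_dev == b.st_dev && a.st_ino == b.st_ino)
		printf("Japan and Asia/Tokyo are the same file\n");
	else
		printf("distinct files (perhaps separately installed copies)\n");
	return 0;
}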
On Thu, Sep 13, 2018 at 11:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael@paquier.xyz> writes:
> > On Wed, Sep 12, 2018 at 10:00:21AM -0400, Tom Lane wrote:
> >> I believe that this patch will never make for any functional change,
> >> it will only give you some other alias for the zone it would have
> >> selected anyway.
>
> > Looking at the list of aliases, I am not seeing listed countries
> > running across multiple timezones, so that may be fine.
>
> Not sure what you're worried about.  "Linked" time zones are the *same
> data*.  In an installed tzdb tree, the Japan file is either a hardlink
> or symlink to the Asia/Tokyo one, so they can't differ.  What you seem
> to be speculating about is actual errors in the tzdb data, ie not
> describing the facts on the ground in particular places.  That's
> possible I suppose but it's hardly our problem if it happens; it'd be
> theirs to fix.

I tested this on a system where /etc/localtime is not a symlink
(FreeBSD) and it worked fine, falling back to the old behaviour and
finding my timezone.

(Apparently the argument against a symlink /etc/localtime ->
/usr/share/tzinfo/... is that new processes would effectively switch
timezone after /usr is mounted, so your boot logs would be mixed up.  Or
something.  I bet that actually happens on Linux too, but maybe no one
does /usr as a mount point anymore...?)

I noticed that the patch does a bunch of s/Olson/IANA/.  That leaves
only one place in the tree that still refers to the "Olson" database:
dt_common.c.  Might want to change that too?

--
Thomas Munro
http://www.enterprisedb.com
On Wed, Sep 12, 2018 at 08:04:53PM -0300, Alvaro Herrera wrote:
> Well, mine is there, and it's correct:
>
> Li America/Santiago Chile/Continental
> Li Pacific/Easter Chile/EasterIsland
>
> Laugh all you want about Chile of all countries having multiple
> timezones ...

Impressed is a better word.  Really impressed seeing the shape of the
country ;)
--
Michael
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> I noticed that the patch does a bunch of s/Olson/IANA/.  That leaves
> only one place in the tree that still refers to the "Olson" database:
> dt_common.c.  Might want to change that too?

Ah, I didn't realize we were that close to wiping out the old
terminology.  Yeah, I'll get that one too.  Thanks for noticing.

			regards, tom lane