Re: reducing the footprint of ScanKeyword (was Re: Large writable variables) - Mailing list pgsql-hackers

From John Naylor
Subject Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Date
Msg-id CAJVSVGXsogxLCphCW1=PQc0m5PFjnpx9Wc98UfmSXZM0inpv2g@mail.gmail.com
Whole thread Raw
In response to Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
List pgsql-hackers
On 12/20/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'd be inclined to put the script in src/tools, I think.  IMO src/common
> is for code that actually gets built into our executables.

Done.

>> which takes
>> pl_unreserved_kwlist.h as input and outputs
>> pl_unreserved_kwlist_offset.h and pl_unreserved_kwlist_string.h.
>
> I wonder whether we'd not be better off producing just one output
> file, in which we have the offsets emitted as PG_KEYWORD macros
> and then the giant string emitted as a macro definition, ie
> something like
>
> #define PG_KEYWORD_STRING \
>     "absolute\0" \
>     "alias\0" \
>     ...
>
> That simplifies the Makefile-hacking, at least, and it possibly gives
> callers more flexibility about what they actually want to do with the
> string.

Okay, I tried that. Since the script is turning one header into
another, I borrowed the "*_d.h" nomenclature from the catalogs. Using
a single file required some #ifdef hacks in the output file. Maybe
there's a cleaner way to do this, but I don't know what it is.

Using a single file also gave me another idea: Take value and category
out of ScanKeyword, and replace them with an index into another array
containing those, which will only be accessed in the event of a hit.
That would shrink ScanKeyword to 4 bytes (offset, index), further
increasing locality of reference. Might not be worth it, but I can try
it after moving on to the core scanner.

> I'm for "not change things unnecessarily".  People might well be
> scraping the keyword list out of parser/kwlist.h for other purposes
> right now --- indeed, it's defined the way it is exactly to let
> people do that.  I don't see a good reason to force them to redo
> whatever tooling they have that depends on that.  So let's build
> kwlist_offsets.h alongside that, but not change kwlist.h itself.

Done.

>> I used the global .gitignore, but maybe that's an abuse of it.
>
> Yeah, I'd say it is.

Moved.

>> +# TODO: Error out if the keyword names are not in ASCII order.
>
> +many for including such a check.

Done.

> Also note that we don't require people to have Perl installed when
> building from a tarball.  Therefore, these derived headers must get
> built during "make distprep" and removed by maintainer-clean but
> not distclean.  I think this also has some implications for VPATH
> builds, but as long as you follow the pattern used for other
> derived header files (e.g. fmgroids.h), you should be fine.

Done. I also blindly added support for MSVC.

-John Naylor

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: could recovery_target_timeline=latest be the default in standbymode?
Next
From: Michael Paquier
Date:
Subject: Re: Clean up some elog messages and comments for do_pg_stop_backupand do_pg_start_backup