Re: Changing default -march landscape - Mailing list pgsql-hackers
From | Nathan Bossart |
---|---|
Subject | Re: Changing default -march landscape |
Date | |
Msg-id | ZmpG2ZzT30Q75BZO@nathan |
In response to | Changing default -march landscape (Thomas Munro <thomas.munro@gmail.com>) |
Responses |
Re: Changing default -march landscape
Re: Changing default -march landscape |
List | pgsql-hackers |
On Thu, Jun 13, 2024 at 11:11:56AM +1200, Thomas Munro wrote:
> David R and I were discussing vectorisation and microarchitectures and
> what you can expect the target microarchitecture to be these days, and
> it seemed like some of our choices are not really very
> forward-looking.
>
> Distros targeting x86-64 traditionally assumed the original AMD64 K8
> instruction set, so if we want to use newer instructions we use
> various configure or runtime checks to see if that's safe.
>
> Recent GCC and Clang versions understand -march=x86-64-v{2,3,4}[1].
> RHEL9 and similar and SUSE tumbleweed now require x86-64-v2, and IIUC
> they changed the -march default to -v2 in their build of GCC, and I
> think Ubuntu has something in the works perhaps for -v3[2].
>
> Some of our current tricks won't take proper advantage of that:
> we'll still access POPCNT through a function pointer!

This is perhaps only tangentially related, but I've found it really difficult to avoid painting ourselves into a corner with this stuff.

Let's use the SSE 4.2 CRC32C code as an example. Right now, if your default compiler flags indicate support for SSE 4.2 (which I'll note can be assumed with x86-64-v2), we'll use it unconditionally, no function pointer required. If additional compiler flags happen to convince the compiler to generate SSE 4.2 code, we'll instead build both a fallback version and the SSE version, and then we'll use a function pointer to direct to whatever we detect is available on the CPU when the server starts.

Now, let's say we require x86-64-v2. Once we have that, we can avoid the function pointer on many more x86 machines. While that sounds great, now we have a different problem. If someone wants to add, say, AVX-512 support [0], which is a much newer instruction set, we'll need to use the function pointer again. And we're back where we started.
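To make the two cases above concrete, here is a minimal sketch of the pattern. It is not the actual PostgreSQL code: the names crc32c_sw, crc32c_hw, crc32c_choose, and crc32c_self_check are made up for illustration, and it assumes GCC or Clang (for __builtin_cpu_supports and the target attribute). When the compiler's baseline already includes SSE 4.2 (for example with -march=x86-64-v2), crc32c() is a direct call; otherwise both versions are built and the first call picks one through a function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c_sw(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    crc = ~crc;
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

#if defined(__x86_64__) && defined(__GNUC__)
#include <nmmintrin.h>
#define HAVE_X86_CRC 1
#ifdef __SSE4_2__
#define CRC_TARGET              /* baseline already allows SSE 4.2 */
#else
#define CRC_TARGET __attribute__((target("sse4.2")))
#endif

/* Same checksum, computed with the SSE 4.2 CRC32 instruction. */
CRC_TARGET
static uint32_t crc32c_hw(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    crc = ~crc;
    while (len--)
        crc = _mm_crc32_u8(crc, *p++);
    return ~crc;
}
#endif

#if defined(HAVE_X86_CRC) && defined(__SSE4_2__)
/* Case 1: SSE 4.2 is guaranteed at compile time, so call it directly. */
#define crc32c(crc, d, n) crc32c_hw((crc), (d), (n))
#else
/* Case 2: decide on first use, then go through the function pointer. */
typedef uint32_t (*crc32c_fn)(uint32_t, const void *, size_t);
static uint32_t crc32c_choose(uint32_t crc, const void *data, size_t len);
static crc32c_fn crc32c_ptr = crc32c_choose;

static uint32_t crc32c_choose(uint32_t crc, const void *data, size_t len)
{
    crc32c_ptr = crc32c_sw;
#ifdef HAVE_X86_CRC
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.2"))
        crc32c_ptr = crc32c_hw;
#endif
    return crc32c_ptr(crc, data, len);
}
#define crc32c(crc, d, n) crc32c_ptr((crc), (d), (n))
#endif

/* Sanity check: the dispatched routine must agree with the portable one. */
static void crc32c_self_check(void)
{
    const char msg[] = "123456789";

    assert(crc32c(0, msg, 9) == crc32c_sw(0, msg, 9));
}
```

Raising the baseline only moves the #if boundary: anything newer than the baseline (AVX-512 in the example above) would still need a branch like case 2.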
We could instead just ask folks to compile with -march=native, but then these optimizations are only available for a subset of users until we decide the instructions are standard enough 20 years from now.

The idea that's been floating around recently is to build a bunch of different versions of Postgres and to choose one on startup based on what the CPU supports. That seems like quite a lot of work, and it'll increase the size of the builds quite a bit, but it at least doesn't have the aforementioned problem.

Sorry if I just rambled on about something unrelated, but your message had enough keywords to get me thinking about this again.

[0] https://postgr.es/m/BL1PR11MB530401FA7E9B1CA432CF9DC3DC192%40BL1PR11MB5304.namprd11.prod.outlook.com

--
nathan