Thread: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Umair Shahid
Date:
On Fri, Jun 24, 2016 at 2:14 AM, Umair Shahid <umair.shahid@2ndquadrant.com> wrote:
---------- Forwarded message ----------
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, Jun 23, 2016 at 9:32 PM
Subject: Re: [pgsql-packagers] PG 9.6beta2 tarballs are ready
To: Magnus Hagander <magnus@hagander.net>
Cc: Umair Shahid <umair.shahid@2ndquadrant.com>, Dave Page <dpage@postgresql.org>, PostgreSQL Packagers <pgsql-packagers@postgresql.org>
Magnus Hagander <magnus@hagander.net> writes:
> That makes more sense as the joinrel stuff *has* been changed between the
> two betas. I'm sure someone who's touched that code (Tom?) can comment on
> that part..
It still makes little sense to me, as the previous reports say that the
problem happened during bootstrap, and the planner does not run
during bootstrap.
Could we get a look at debug_query_string in the coredump, to possibly
narrow down where the crash is really happening?
Moving thread to -hackers ...
debug_query_string is
"INSERT INTO pg_description SELECT t.objoid, c.oid, t.objsubid, t.description FROM tmp_pg_description t, pg_class c WHERE c.relname = t.classname;"
Happening in "setup_description"
> It's still strange that it doesn't affect woodlouse.
Or any of the other Windows critters...
regards, tom lane
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Craig Ringer
Date:
On 24 June 2016 at 05:17, Umair Shahid <umair.shahid@gmail.com> wrote:
On Fri, Jun 24, 2016 at 2:14 AM, Umair Shahid <umair.shahid@2ndquadrant.com> wrote:
---------- Forwarded message ----------
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, Jun 23, 2016 at 9:32 PM
Subject: Re: [pgsql-packagers] PG 9.6beta2 tarballs are ready
To: Magnus Hagander <magnus@hagander.net>
Cc: Umair Shahid <umair.shahid@2ndquadrant.com>, Dave Page <dpage@postgresql.org>, PostgreSQL Packagers <pgsql-packagers@postgresql.org>
Magnus Hagander <magnus@hagander.net> writes:
> That makes more sense as the joinrel stuff *has* been changed between the
> two betas. I'm sure someone who's touched that code (Tom?) can comment on
> that part..
It still makes little sense to me, as the previous reports say that the
problem happened during bootstrap, and the planner does not run
during bootstrap.
Could we get a look at debug_query_string in the coredump, to possibly
narrow down where the crash is really happening?
Moving thread to -hackers ...
debug_query_string is
"INSERT INTO pg_description SELECT t.objoid, c.oid, t.objsubid, t.description FROM tmp_pg_description t, pg_class c WHERE c.relname = t.classname;"
Happening in "setup_description"
I was helping Haroon with this last night. I don't have access to the original thread and he's not around so I don't know how much he said. I'll repeat our findings here.
During debugging I found that:
* A VS 2013 build (performed by Haroon and copied to the test host) crashes consistently with the reported symptoms - "performing post-bootstrap initialization ... child process was terminated by exception 0xC0000005"
* The issue doesn't happen in a VS 2015 build done on the test host
* I couldn't use just-in-time debugging because the restricted execution token setup isolated the process. For the same reason, breakpoints stop working in initdb.c after line 3557.
* To get a backtrace, I had to:
* Launch a VS x86 command prompt
* devenv /debugexe bin\initdb.exe -D test
* Set a breakpoint in initdb.c:3557 and initdb.c:3307
* Run
* When it traps at get_restricted_token(), manually move the execution pointer over the setup of the restricted execution token by dragging & dropping the yellow instruction pointer arrow. Yes, really. Or, y'know, comment it out and rebuild, but I was working with a supplied binary.
* Continue until next breakpoint
* Launch process explorer and find the pid of the postgres child process
* Debug->attach to process, attach to the child postgres. This doesn't detach the parent, VS does multiprocess debugging.
* Continue execution
* vs will trap on the child when it crashes
* It is an access violation (segfault) in postgres.exe when attempting to read memory at 0xFFFFFFFFFFFFFFFF in calc_joinrel_size_estimate() at costsize.c:3940
fkselec = get_foreign_key_join_selectivity(root,
outer_rel->relids,
inner_rel->relids,
sjinfo,
&restrictlist);
with debug_query_string:
0x0000000009bf6140 "INSERT INTO pg_description SELECT t.objoid, c.oid, t.objsubid, t.description FROM tmp_pg_description t, pg_class c WHERE c.relname = t.classname;\n"
Backtrace:
Exception thrown at 0x00000001401A5A81 in postgres.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF.
> postgres.exe!calc_joinrel_size_estimate(PlannerInfo * root, RelOptInfo * outer_rel, RelOptInfo * inner_rel, double outer_rows, double inner_rows, SpecialJoinInfo * sjinfo, List * restrictlist) Line 3944 C
postgres.exe!set_joinrel_size_estimates(PlannerInfo * root, RelOptInfo * rel, RelOptInfo * outer_rel, RelOptInfo * inner_rel, SpecialJoinInfo * sjinfo, List * restrictlist) Line 3852 C
postgres.exe!build_join_rel(PlannerInfo * root, Bitmapset * joinrelids, RelOptInfo * outer_rel, RelOptInfo * inner_rel, SpecialJoinInfo * sjinfo, List * * restrictlist_ptr) Line 521 C
postgres.exe!make_join_rel(PlannerInfo * root, RelOptInfo * rel1, RelOptInfo * rel2) Line 721 C
postgres.exe!make_rels_by_clause_joins(PlannerInfo * root, RelOptInfo * old_rel, ListCell * other_rels) Line 266 C
postgres.exe!join_search_one_level(PlannerInfo * root, int level) Line 69 C
postgres.exe!standard_join_search(PlannerInfo * root, int levels_needed, List * initial_rels) Line 2172 C
postgres.exe!query_planner(PlannerInfo * root, List * tlist, void(*)(PlannerInfo *, void *) qp_callback, void * qp_extra) Line 255 C
postgres.exe!grouping_planner(PlannerInfo * root, char inheritance_update, double tuple_fraction) Line 1695 C
postgres.exe!subquery_planner(PlannerGlobal * glob, Query * parse, PlannerInfo * parent_root, char hasRecursion, double tuple_fraction) Line 775 C
postgres.exe!standard_planner(Query * parse, int cursorOptions, ParamListInfoData * boundParams) Line 312 C
postgres.exe!pg_plan_query(Query * querytree, int cursorOptions, ParamListInfoData * boundParams) Line 800 C
postgres.exe!exec_simple_query(const char * query_string) Line 1023 C
postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 4076 C
postgres.exe!main(int argc, char * * argv) Line 227 C
Local vars:
+ inner_rel 0x0000000009dfd170 {type=T_EquivalenceClass (537) reloptkind=RELOPT_BASEREL (0) relids=0x0000000009d6d718 {...} ...} RelOptInfo *
inner_rows 270.00000000000000 double
+ outer_rel 0x00000001401ded48 {postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel, RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
outer_rows 2.653352065130e-314#DEN double
+ restrictlist 0x0000000009d6f7f8 {type=T_List (656) length=1 head=0x0000000009d6f7d8 {data={ptr_value=0x0000000009d6e980 ...} ...} ...} List *
+ root 0x0000000009dfd800 {type=1 parse=0x000000000067d220 {type=T_AllocSetContext (601) commandType=CMD_UNKNOWN (0) ...} ...} PlannerInfo *
+ sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543) min_lefthand=0x0000000009dfcfd8 {nwords=1 words=0x0000000009dfcfdc {...} } ...} SpecialJoinInfo *
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Michael Paquier
Date:
On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> * Launch a VS x86 command prompt
> * devenv /debugexe bin\initdb.exe -D test
> * Set a breakpoint in initdb.c:3557 and initdb.c:3307
> * Run
> * When it traps at get_restricted_token(), manually move the execution
> pointer over the setup of the restricted execution token by dragging &
> dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
> comment it out and rebuild, but I was working with a supplied binary.
> * Continue until next breakpoint
> * Launch process explorer and find the pid of the postgres child process
> * Debug->attach to process, attach to the child postgres. This doesn't
> detach the parent, VS does multiprocess debugging.
> * Continue execution
> * vs will trap on the child when it crashes
Do you think a crash dump could have been created by creating
crashdumps/ in PGDATA as part of initdb before this query is run?
--
Michael
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Craig Ringer
Date:
On 24 June 2016 at 10:21, Craig Ringer <craig@2ndquadrant.com> wrote:
* To get a backtrace, I had to:
* Launch a VS x86 command prompt
* devenv /debugexe bin\initdb.exe -D test
* Set a breakpoint in initdb.c:3557 and initdb.c:3307
* Run
* When it traps at get_restricted_token(), manually move the execution pointer over the setup of the restricted execution token by dragging & dropping the yellow instruction pointer arrow. Yes, really. Or, y'know, comment it out and rebuild, but I was working with a supplied binary.
* Continue until next breakpoint
* Launch process explorer and find the pid of the postgres child process
* Debug->attach to process, attach to the child postgres. This doesn't detach the parent, VS does multiprocess debugging.
* Continue execution
* vs will trap on the child when it crashes
Also, to save anyone else this hassle, I have saved a process dump (windows core file) and the debug symbols to gdrive. You can get them at:
Note that you will need a Visual Studio version installed. VS Community 2015 works fine. You only need to install the C++ devenv and C++ headers, you don't need MFC or any of the rest. The default install is fine if you don't mind a bigger download. Once installed, open postgres.dmp, then go to debug->options, symbols. There, enable the Microsoft Symbol Server, and also add a new entry for the absolute path to the symbols directory for the archive you unpacked. You should enable the symbol cache directory too, make a directory in your user dir and put it there.
If Haroon shared some gdrive links earlier on the thread I don't have access to, this is the same data just efficiently compressed (32MB instead of 180MB) and packaged up in a single convenient archive with the matching sources and a full working install. You'll need 7zip to unpack it, but that should be on your "install as soon as you install Windows" list anyway.
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Craig Ringer
Date:
On 24 June 2016 at 10:28, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> * Launch a VS x86 command prompt
> * devenv /debugexe bin\initdb.exe -D test
> * Set a breakpoint in initdb.c:3557 and initdb.c:3307
> * Run
> * When it traps at get_restricted_token(), manually move the execution
> pointer over the setup of the restricted execution token by dragging &
> dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
> comment it out and rebuild, but I was working with a supplied binary.
> * Continue until next breakpoint
> * Launch process explorer and find the pid of the postgres child process
> * Debug->attach to process, attach to the child postgres. This doesn't
> detach the parent, VS does multiprocess debugging.
> * Continue execution
> * vs will trap on the child when it crashes
Do you think a crash dump could have been created by creating
crashdumps/ in PGDATA as part of initdb before this query is run?
I see what you did there ;)
Yes, quite possibly, actually. I should've just got Haroon to build me a new initdb without the priv setting and with creation of crashdumps/ .
It might be worth testing that out and adding an initdb startup flag to create the directory, since initdb is such a PITA to debug.
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Michael Paquier
Date:
On Fri, Jun 24, 2016 at 11:33 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> Yes, quite possibly, actually. I should've just got Haroon to build me a new
> initdb without the priv setting and with creation of crashdumps/ .
>
> It might be worth testing that out and adding an initdb startup flag to
> create the directory, since initdb is such a PITA to debug.
I was more thinking about putting that under -DDEBUG for example.
--
Michael
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
"Tsunakawa, Takayuki"
Date:
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
> Sent: Friday, June 24, 2016 11:37 AM
> On Fri, Jun 24, 2016 at 11:33 AM, Craig Ringer <craig@2ndquadrant.com>
> wrote:
> > It might be worth testing that out and adding an initdb startup flag
> > to create the directory, since initdb is such a PITA to debug.
>
> I was more thinking about putting that under -DDEBUG for example.
>
I think just the existing option -d (--debug) and/or -n (--no-clean) would be OK.
Regards
Takayuki Tsunakawa
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Michael Paquier
Date:
On Fri, Jun 24, 2016 at 11:51 AM, Tsunakawa, Takayuki <tsunakawa.takay@jp.fujitsu.com> wrote:
>> From: pgsql-hackers-owner@postgresql.org
>> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
>> Sent: Friday, June 24, 2016 11:37 AM
>> On Fri, Jun 24, 2016 at 11:33 AM, Craig Ringer <craig@2ndquadrant.com>
>> wrote:
>> > It might be worth testing that out and adding an initdb startup flag
>> > to create the directory, since initdb is such a PITA to debug.
>>
>> I was more thinking about putting that under -DDEBUG for example.
>>
>
> I think just the existing option -d (--debug) and/or -n (--no-clean) would be OK.
If the majority thinks that an option switch is more adapted, I won't
fight it strongly. Just please let's not mess up with the behavior of
the existing options.
--
Michael
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Craig Ringer
Date:
On 24 June 2016 at 05:17, Umair Shahid <umair.shahid@gmail.com> wrote:
Given that it's only been seen in VS 2013, it's particularly odd that it's not biting woodlouse.
I'd like more details from those whose installs are crashing. What exact vcvars env did you run under, with which exact cl.exe version?
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Michael Paquier
Date:
On Fri, Jun 24, 2016 at 1:28 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> Given that it's only been seen in VS 2013, it's particularly odd that it's
> not biting woodlouse.
>
> I'd like more details from those whose installs are crashing. What exact
> vcvars env did you run under, with which exact cl.exe version?
Which OS did you use for the compilation? I don't think that this
matters much but woodlouse is using Win7.
--
Michael
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Craig Ringer
Date:
On 24 June 2016 at 12:31, Michael Paquier <michael.paquier@gmail.com> wrote:
I see the same symptoms, with the segfault.
On Fri, Jun 24, 2016 at 1:28 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> Given that it's only been seen in VS 2013, it's particularly odd that it's
> not biting woodlouse.
>
> I'd like more details from those whose installs are crashing. What exact
> vcvars env did you run under, with which exact cl.exe version?
Which OS did you use for the compilation? I don't think that this
matters much but woodlouse is using Win7.
I'll have to wait for Haroon for that info for the crashing builds he did, but I've now reproduced it with:
Windows server 2012 R2, VS 2013 Community Update 5, cross compile tools for x86 to amd64. cl 18.00.40629 for x64, env:
%comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat" x86_amd64"
"where cl" reports
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64\cl.exe
Note that cross compilation is a typical configuration on Windows, where you routinely use 32bit x86 compilers to build 64bit code, except in the newest SDKs.
This host is a clean install, an AWS instance created for the purpose.
It looks like woodlouse probably runs an older VS2013 and uses the native x64 toolchain; its env includes:
C:\\Program Files (x86)\\Microsoft Visual Studio 12.0\\VC\\BIN\\amd64
and does not have x86_amd64 in it.
BTW, I suggested to Haroon that he clone beta2 from git, then do a git-bisect between beta1 (works) and beta2 (fails) to see if he can identify the commit that causes things to start failing. I don't know how far he got with that yesterday.
By comparison, I had no problems on the same host with VS Community 2015, cl 19.00.23918, env "VS2015 x64 Native Tools Command Prompt":
%comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\vcvarsall.bat"" amd64
On a side note I'm unable to build with vs2013 community u5 native tools for some reason. Link errors, unresolved external symbol _ischartype_l. cl 18.00.42629 for x64, env:
%comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat" amd64"
"where cl" reports:
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64\cl.exe
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Craig Ringer
Date:
On 24 June 2016 at 13:00, Craig Ringer <craig@2ndquadrant.com> wrote:
I've now reproduced it with:
I can also confirm that it _doesn't_ crash with the same SDK using a 32-bit build (running under WoW on x64). cl 18.00.40629 for x86, env:
%comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat" x86"
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Craig Ringer
Date:
On 24 June 2016 at 10:28, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> * Launch a VS x86 command prompt
> * devenv /debugexe bin\initdb.exe -D test
> * Set a breakpoint in initdb.c:3557 and initdb.c:3307
> * Run
> * When it traps at get_restricted_token(), manually move the execution
> pointer over the setup of the restricted execution token by dragging &
> dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
> comment it out and rebuild, but I was working with a supplied binary.
> * Continue until next breakpoint
> * Launch process explorer and find the pid of the postgres child process
> * Debug->attach to process, attach to the child postgres. This doesn't
> detach the parent, VS does multiprocess debugging.
> * Continue execution
> * vs will trap on the child when it crashes
Do you think a crash dump could have been created by creating
crashdumps/ in PGDATA as part of initdb before this query is run?
The answer is "yes" btw. Add "crashdumps" to the static array of directories created by initdb and it works great.
Sigh. It'd be less annoying if I hadn't written most of the original patch.
For convenience I also commented out the check_root call in src/backend/main.c and the get_restricted_token(progname) call in initdb.c, so I could run it easily under an admin account where I can also install tools etc without hassle. Not recommended on a non-throwaway machine of course.
The generated crashdump shows the same crash in the same location.
I have absolutely no idea why it's trying to access memory at what looks like (uint64)(-1) though. Nothing in the auto vars list:
+ &restrictlist 0x000000000043f7b0 {0x0000000009e32600 {type=T_List (656) length=1 head=0x0000000009e325e0 {data={ptr_value=...} ...} ...}} List * *
+ inner_rel 0x0000000009e7ad68 {type=T_EquivalenceClass (537) reloptkind=RELOPT_BASEREL (0) relids=0x0000000009e30520 {...} ...} RelOptInfo *
+ inner_rel->relids 0x0000000009e30520 {nwords=658 words=0x0000000009e30524 {...} } Bitmapset *
+ outer_rel 0x00000001401dec98 {postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel, RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
+ outer_rel->relids 0xe808498b48d78b48 {nwords=??? words=0xe808498b48d78b4c {...} } Bitmapset *
+ sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543) min_lefthand=0x0000000009e7abd0 {nwords=1 words=0x0000000009e7abd4 {...} } ...} SpecialJoinInfo *
or locals:
+ inner_rel 0x0000000009e7ad68 {type=T_EquivalenceClass (537) reloptkind=RELOPT_BASEREL (0) relids=0x0000000009e30520 {...} ...} RelOptInfo *
inner_rows 270.00000000000000 double
+ outer_rel 0x00000001401dec98 {postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel, RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
outer_rows 2.653351978175e-314#DEN double
+ restrictlist 0x0000000009e32600 {type=T_List (656) length=1 head=0x0000000009e325e0 {data={ptr_value=0x0000000009e31788 ...} ...} ...} List *
+ root 0x0000000009e7b3f8 {type=1 parse=0x0000000000504ad0 {type=T_AllocSetContext (601) commandType=CMD_UNKNOWN (0) ...} ...} PlannerInfo *
+ sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543) min_lefthand=0x0000000009e7abd0 {nwords=1 words=0x0000000009e7abd4 {...} } ...} SpecialJoinInfo *
seems to fit. Though outer_rel->relids is a pretty weird address - 0xe808498b48d78b48? Really?
I'd point DrMemory at it, but unfortunately it only supports 32-bit applications so far. I don't have access to any of the commercial tools like Purify. Maybe someone at EDB can help out with that, if you guys do?
Register states are:
RAX = 000000000043F7B0 RBX = 0000000009E32218 RCX = 0000000009E78510 RDX = 0000000009E7ABD0 RSI = 0000000009E78510 RDI = 0000000009E32218 R8 = 0000000009E7B3F8 R9 = 0000000009E7B1E8 R10 = 0000000009E7A9C0 R11 = 0000000000000001 R12 = 0000000009E32200 R13 = 0000000000000000 R14 = 0000000009E7B1E8 R15 = 0000000000000000 RIP = 00000001401A59D1 RSP = 000000000043F6E0 RBP = 0000000009E7A9C0 EFL = 00010202
and the exact crash site is
fkselec = get_foreign_key_join_selectivity(root,
outer_rel->relids,
inner_rel->relids,
sjinfo,
&restrictlist);
00000001401A59AB mov r8,qword ptr [r8+8]
00000001401A59AF mov rdx,qword ptr [rdx+8]
00000001401A59B3 movaps xmmword ptr [rax-28h],xmm6
00000001401A59B7 movaps xmmword ptr [rax-38h],xmm7
00000001401A59BB movaps xmmword ptr [rax-48h],xmm8
00000001401A59C0 movaps xmmword ptr [rax-58h],xmm9
00000001401A59C5 lea rax,[rax+38h]
00000001401A59C9 movaps xmm7,xmm3
00000001401A59CC mov qword ptr [rsp+20h],rax
00000001401A59D1 movaps xmmword ptr [rax-68h],xmm10 <---- here
00000001401A59D6 mov qword ptr [rax-48h],r14
00000001401A59DA mov r14,qword ptr [sjinfo]
00000001401A59E2 mov ebp,dword ptr [r14+28h]
00000001401A59E6 mov qword ptr [rax-50h],r15
00000001401A59EA mov r9,r14
00000001401A59ED mov r15,rcx
00000001401A59F0 call get_foreign_key_join_selectivity (01401A5C30h)
with
XMM3 000000000000000040A5720000000000
RAX 000000000043F7B0
XMM7 000000000000000040A5720000000000
RSP 000000000043F6E0
XMM10 00000000000000000000000000000000
I'm about 100% ignorant of x64 asm, but hopefully someone can interpret this usefully. I can tell it's doing a sse "Move Aligned Packed Single-Precision Floating-Point Values" (from memory into a sse register?) but that's about it.
rax-68h is 0x000000000043F748. The memory at that location is
00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0 bf 00 00 00 00 00 00 00 00 c0 a9 e7 09 00 00 00 00 f8 b3 e7 09 00 00
So there you go, a whole bunch of data and I, at least, am still none the wiser.
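A small arithmetic check on the posted values, offered as an observation rather than a diagnosis made in the thread: movaps faults unless its memory operand is 16-byte aligned. The sketch below assumes the posted RAX (0x43F7B0) is the value after the `lea rax,[rax+38h]` instruction, since RIP points at the store that follows it; the helper name is made up for illustration.

```python
MOVAPS_ALIGNMENT = 16  # bytes; movaps requires an aligned memory operand


def is_aligned(address, alignment=MOVAPS_ALIGNMENT):
    """Return True if address is a multiple of alignment."""
    return address % alignment == 0


# RAX as posted in the register dump (assumed to be the post-lea value).
rax_at_fault = 0x000000000043F7B0
# Undo "lea rax,[rax+38h]" to get the value the earlier stores used.
rax_before_lea = rax_at_fault - 0x38

# The four movaps stores before the lea: [rax-28h] .. [rax-58h].
earlier_stores = [rax_before_lea - off for off in (0x28, 0x38, 0x48, 0x58)]
# The store marked "here" in the disassembly: [rax-68h] after the lea.
faulting_store = rax_at_fault - 0x68

for addr in earlier_stores:
    print(f"{addr:#018x} aligned={is_aligned(addr)}")
print(f"{faulting_store:#018x} aligned={is_aligned(faulting_store)}")
```

Under that assumption the four earlier stores land on 16-byte boundaries, while the marked store targets 0x43F748, which is only 8-byte aligned. On x64 Windows a general-protection fault of this kind is typically reported as an access violation at 0xFFFFFFFFFFFFFFFF, which would match the exception text, though nothing in the thread so far confirms this as the cause.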
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Michael Paquier
Date:
On Fri, Jun 24, 2016 at 3:22 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
>
>
> On 24 June 2016 at 10:28, Michael Paquier <michael.paquier@gmail.com> wrote:
>>
>> On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig@2ndquadrant.com>
>> wrote:
>> > * Launch a VS x86 command prompt
>> > * devenv /debugexe bin\initdb.exe -D test
>> > * Set a breakpoint in initdb.c:3557 and initdb.c:3307
>> > * Run
>> > * When it traps at get_restricted_token(), manually move the execution
>> > pointer over the setup of the restricted execution token by dragging &
>> > dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
>> > comment it out and rebuild, but I was working with a supplied binary.
>> > * Continue until next breakpoint
>> > * Launch process explorer and find the pid of the postgres child
>> > process
>> > * Debug->attach to process, attach to the child postgres. This doesn't
>> > detach the parent, VS does multiprocess debugging.
>> > * Continue execution
>> > * vs will trap on the child when it crashes
>>
>> Do you think a crash dump could have been created by creating
>> crashdumps/ in PGDATA as part of initdb before this query is run?
>
>
>
> The answer is "yes" btw. Add "crashdumps" to the static array of directories
> created by initdb and it works great.
As simple as attached..
> Sigh. It'd be less annoying if I hadn't written most of the original patch.
You mean the patch that created the crashdumps/ trick? This has saved
me a couple of months back to analyze a problem TBH.
--
Michael
Attachment
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Haroon
Date:
On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
>> I was helping Haroon with this last night. I don't have access to the
>> original thread and he's not around so I don't know how much he said. I'll
>> repeat our findings here.
Craig, I am around now looking into this. I'll update the list as I get more info.
- Haroon
On 24 June 2016 at 11:27, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Jun 24, 2016 at 3:22 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
>
>
> On 24 June 2016 at 10:28, Michael Paquier <michael.paquier@gmail.com> wrote:
>>
>> On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig@2ndquadrant.com>
>> wrote:
>> > * Launch a VS x86 command prompt
>> > * devenv /debugexe bin\initdb.exe -D test
>> > * Set a breakpoint in initdb.c:3557 and initdb.c:3307
>> > * Run
>> > * When it traps at get_restricted_token(), manually move the execution
>> > pointer over the setup of the restricted execution token by dragging &
>> > dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
>> > comment it out and rebuild, but I was working with a supplied binary.
>> > * Continue until next breakpoint
>> > * Launch process explorer and find the pid of the postgres child
>> > process
>> > * Debug->attach to process, attach to the child postgres. This doesn't
>> > detach the parent, VS does multiprocess debugging.
>> > * Continue execution
>> > * vs will trap on the child when it crashes
>>
>> Do you think a crash dump could have been created by creating
>> crashdumps/ in PGDATA as part of initdb before this query is run?
>
>
>
> The answer is "yes" btw. Add "crashdumps" to the static array of directories
> created by initdb and it works great.
As simple as attached..
> Sigh. It'd be less annoying if I hadn't written most of the original patch.
You mean the patch that created the crashdumps/ trick? This has saved
me a couple of months back to analyze a problem TBH.
--
Michael
Haroon http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Tom Lane
Date:
Craig Ringer <craig@2ndquadrant.com> writes:
> I have absolutely no idea why it's trying to access memory at what looks
> like (uint64)(-1) though. Nothing in the auto vars list:
> + &restrictlist 0x000000000043f7b0 {0x0000000009e32600 {type=T_List (656)
> length=1 head=0x0000000009e325e0 {data={ptr_value=...} ...} ...}} List * *
> + inner_rel 0x0000000009e7ad68 {type=T_EquivalenceClass (537)
> reloptkind=RELOPT_BASEREL (0) relids=0x0000000009e30520 {...} ...} RelOptInfo
> *
> + inner_rel->relids 0x0000000009e30520 {nwords=658 words=0x0000000009e30524
> {...} } Bitmapset *
> + outer_rel 0x00000001401dec98
> {postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel,
> RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
> + outer_rel->relids 0xe808498b48d78b48 {nwords=??? words=0xe808498b48d78b4c
> {...} } Bitmapset *
> + sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543)
> min_lefthand=0x0000000009e7abd0 {nwords=1 words=0x0000000009e7abd4 {...} }
> ...} SpecialJoinInfo *
inner_rel seems to be pointing at garbage, or at least why is the
referenced object tag T_EquivalenceClass not T_RelOptInfo? And why
aren't we being given anything for outer_rel? The value for
outer_rel->relids isn't inspiring any confidence either, and for that
matter inner_rel->relids couldn't possibly have more than nwords==1
given how simple the query is. In short, either the debugger is totally
confused or the code is, because most of these pointers aren't pointing
at anything sane.
TBH, this looks more like a compiler bug than anything else. I wonder
whether it's getting confused by taking the address of a parameter
(although surely we do that elsewhere). It would be worth recompiling
at -O0, or whatever the local equivalent of that is, to see if (1) the
crash goes away or (2) the debugger's printouts get any more reliable.
regards, tom lane
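One way to cross-check the debugger's locals against the raw register dump posted earlier in the thread: under the Microsoft x64 calling convention a floating-point fourth argument arrives in XMM3, and outer_rows is the fourth parameter of calc_joinrel_size_estimate(). The sketch below decodes the low 64 bits of the posted XMM3 value as an IEEE-754 double. It assumes XMM3 still held the incoming argument at the crash site, which the `movaps xmm7,xmm3` copy suggests but does not prove; the helper name is hypothetical.

```python
import struct


def low_quadword_as_double(xmm_hex):
    """Interpret the low 64 bits of a 128-bit register dump as a double."""
    value = int(xmm_hex, 16)
    low64 = value & 0xFFFFFFFFFFFFFFFF
    # Pack as a little-endian 8-byte integer, then reinterpret as a double.
    return struct.unpack("<d", low64.to_bytes(8, "little"))[0]


# XMM3 as posted alongside the disassembly.
xmm3 = "000000000000000040A5720000000000"
outer_rows_from_register = low_quadword_as_double(xmm3)
print(outer_rows_from_register)
```

The decoded value is 2745.0, a plausible row estimate, whereas the locals view showed outer_rows as a denormal around 2.65e-314. That would be in line with the debugger display, rather than the argument itself, being unreliable for this optimized frame.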
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
"Haroon ."
Date:
> On Fri, Jun 24, 2016 at 1:28 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> > I'd like more details from those whose installs are crashing. What exact
> > vcvars env did you run under, with which exact cl.exe version?
This is a Windows server 2012 R2 Standard.
Devenv: Microsoft Visual Studio 2013 Community Version 12.0.31101.0.
Env:
%comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat"" x86_amd64
'where cl.exe'
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64\cl.exe
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\cl.exe
I have also been able to reproduce it on Windows 7 Professional (Service Pack 1) with Microsoft Visual Studio 2013 Community Version 12.0.40629.0.
Env:
%comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat"" x86_amd64
'Where cl.exe'
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64\cl.exe
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\cl.exe
I started a bisect between beta2 (bad) and beta1 (good), given that beta1 works fine. The crash first occurs at the following commit, which appears consistent with the crash in the planner suggested by the crash dump Craig shared.
commit 100340e2dcd05d6505082a8fe343fb2ef2fa5b2a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat Jun 18 15:22:34 2016 -0400
Restore foreign-key-aware estimation of join relation sizes.
This patch provides a new implementation of the logic added by commit
137805f89 and later removed by 77ba61080. It differs from the original
primarily in expending much less effort per joinrel in large queries,
which it accomplishes by doing most of the matching work once per query not
once per joinrel. Hopefully, it's also less buggy and better commented.
The never-documented enable_fkey_estimates GUC remains gone.
There remains work to be done to make the selectivity estimates account
for nulls in FK referencing columns; but that was true of the original
patch as well. We may be able to address this point later in beta.
In the meantime, any error should be in the direction of overestimating
rather than underestimating joinrel sizes, which seems like the direction
we want to err in.
Tomas Vondra and Tom Lane
Discussion: <31041.1465069446@sss.pgh.pa.us>
Tom, any ideas on what could be going wrong here?
Given that it fails in 'setup_description', I tried bypassing that step by commenting it out. It then crashes in 'setup_privileges' and 'setup_schema' instead.
debug_query_string for setup_privileges:
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype) SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_class'), 0, relacl, 'i' FROM pg_class WHERE relacl IS NOT NULL AND relkind IN ('r', 'v', 'm', 'S');
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype) SELECT pg_class.oid, (SELECT oid FROM pg_class WHERE relname = 'pg_class'), pg_attribute.attnum, pg_attribute.attacl, 'i' FROM pg_class JOIN pg_attribute ON (pg_class.oid = pg_attribute.attrelid) WHERE pg_attribute.attacl IS NOT NULL AND pg_class.relkind IN ('r', 'v', 'm', 'S');
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype) SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_proc'), 0, proacl, 'i' FROM pg_proc WHERE proacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype) SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_type'), 0, typacl, 'i' FROM pg_type WHERE typacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype) SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_language'), 0, lanacl, 'i' FROM pg_language WHERE lanacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype) SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_largeobject_metadata'), 0, lomacl, 'i' FROM pg_largeobject_metadata WHERE lomacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype) SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_namespace'), 0, nspacl, 'i' FROM pg_namespace WHERE nspacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype) SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_database'), 0, datacl, 'i' FROM pg_database WHERE datacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype) SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_tablespace'), 0, spcacl, 'i' FROM pg_tablespace WHERE spcacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype) SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_foreign_data_wrapper'), 0, fdwacl, 'i' FROM pg_foreign_data_wrapper WHERE fdwacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype) SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_foreign_server'), 0, srvacl, 'i' FROM pg_foreign_server WHERE srvacl IS NOT NULL;
/*
* SQL Information Schema
* as defined in ISO/IEC 9075-11:2011
*
* Copyright (c) 2003-2016, PostgreSQL Global Development Group
*
* src/backend/catalog/information_schema.sql
*
* Note: this file is read in single-user -j mode, which means that the
* command terminator is semicolon-newline-newline; whenever the backend
* sees that, it stops and executes what it's got. If you write a lot of
* statements without empty lines between, they'll all get quoted to you
* in any error message about one of them, so don't do that. Also, you
* cannot write a semicolon immediately followed by an empty line in a
* string literal (including a function body!) or a multiline comment.
*/
/*
* Note: Generally, the definitions in this file should be ordered
* according to the clause numbers in the SQL standard, which is also the
* alphabetical order. In some cases it is convenient or necessary to
* define one information schema view by using another one; in that case,
* put the referencing view at the very end and leave a note where it
* should have been put.
*/
/*
* 5.1
* INFORMATION_SCHEMA schema
*/
CREATE SCHEMA information_schema;
GRANT USAGE ON SCHEMA information_schema TO PUBLIC;
SET search_path TO information_schema;
debug_query_string for setup_schema:
INSERT INTO sql_implementation_info VALUES ('10003', 'CATALOG NAME', NULL, 'Y', NULL);
INSERT INTO sql_implementation_info VALUES ('10004', 'COLLATING SEQUENCE', NULL, (SELECT default_collate_name FROM character_sets), NULL);
INSERT INTO sql_implementation_info VALUES ('23', 'CURSOR COMMIT BEHAVIOR', 1, NULL, 'close cursors and retain prepared statements');
INSERT INTO sql_implementation_info VALUES ('2', 'DATA SOURCE NAME', NULL, '', NULL);
INSERT INTO sql_implementation_info VALUES ('17', 'DBMS NAME', NULL, (select trim(trailing ' ' from substring(version() from '^[^0-9]*'))), NULL);
INSERT INTO sql_implementation_info VALUES ('18', 'DBMS VERSION', NULL, '???', NULL); -- filled by initdb
INSERT INTO sql_implementation_info VALUES ('26', 'DEFAULT TRANSACTION ISOLATION', 2, NULL, 'READ COMMITTED; user-settable');
INSERT INTO sql_implementation_info VALUES ('28', 'IDENTIFIER CASE', 3, NULL, 'stored in mixed case - case sensitive');
INSERT INTO sql_implementation_info VALUES ('85', 'NULL COLLATION', 0, NULL, 'nulls higher than non-nulls');
INSERT INTO sql_implementation_info VALUES ('13', 'SERVER NAME', NULL, '', NULL);
INSERT INTO sql_implementation_info VALUES ('94', 'SPECIAL CHARACTERS', NULL, '', 'all non-ASCII characters allowed');
INSERT INTO sql_implementation_info VALUES ('46', 'TRANSACTION CAPABLE', 2, NULL, 'both DML and DDL');
And if I comment out all three, i.e. 'setup_description', 'setup_privileges' and 'setup_schema', initdb seems to progress without any errors or crashes.
Regards,
Haroon
--
Haroon http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Craig Ringer
Date:
On 24 June 2016 at 21:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> TBH, this looks more like a compiler bug than anything else.
I tend to agree. Especially since valgrind has no complaints on x64 linux, and neither does DrMemory for 32-bit builds with the same toolchain on the same Windows and same SDK.
I don't see any particular reason we can't proceed with 9.6beta2 and build x64 Pg with MS VS 2015. There's no evidence turning up of a Pg bug here, and compiling with a different toolchain gets us working binaries for the target platform in question.
> It would be worth recompiling at -O0, or whatever the local equivalent
> of that is, to see if (1) the crash goes away or (2) the debugger's
> printouts get any more reliable
Yeah, it probably is. I'll see if I can find time this w/e.
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Tom Lane
Date:
Craig Ringer <craig@2ndquadrant.com> writes:
> On 24 June 2016 at 21:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> TBH, this looks more like a compiler bug than anything else.
> I tend to agree. Especially since valgrind has no complaints on x64 linux,
> and neither does DrMemory for 32-bit builds with the same toolchain on the
> same Windows and same SDK.
If that is the explanation, I'm suspicious that it's got something to do with the interaction of a static inline-able (single-call-site) function and taking the address of a formal parameter. We certainly have multiple other instances of each thing, but maybe not both at the same place.
This leads to a couple of suggestions for dodging the problem:
1. Make get_foreign_key_join_selectivity non-static so that it doesn't get inlined, along the lines of
                    List *restrictlist);
-static Selectivity get_foreign_key_join_selectivity(PlannerInfo *root,
+extern Selectivity get_foreign_key_join_selectivity(PlannerInfo *root,
                    Relids outer_relids,
...
 */
-static Selectivity
+Selectivity
 get_foreign_key_join_selectivity(PlannerInfo *root,
2. Don't pass the original formal parameter to get_foreign_key_join_selectivity, ie do something like
 static double
 calc_joinrel_size_estimate(PlannerInfo *root,
                            RelOptInfo *outer_rel,
                            RelOptInfo *inner_rel,
                            double outer_rows,
                            double inner_rows,
                            SpecialJoinInfo *sjinfo,
-                           List *restrictlist)
+                           List *orig_restrictlist)
 {
     JoinType    jointype = sjinfo->jointype;
+    List       *restrictlist = orig_restrictlist;
     Selectivity fkselec;
     Selectivity jselec;
     Selectivity pselec;
Obviously, if either of those things do make the problem go away, it's a compiler bug. If not, we'll need to dig deeper.
regards, tom lane
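The second suggestion can be sketched outside PostgreSQL as follows. This is an illustration only, assuming the callee receives a pointer to the caller's list pointer and may replace it. `ListSketch`, `strip_one_clause`, `estimate_direct` and `estimate_via_local` are hypothetical names, and the sketch is not expected to reproduce the VS2013 crash. It only shows that the two shapes are equivalent in standard C, which is why a behavioural difference between them would point at the compiler.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PostgreSQL List: only the length matters here. */
typedef struct ListSketch
{
    int         length;
} ListSketch;

/*
 * Single-call-site style callee that may swap the caller's list for a
 * shorter one through the pointer it is given.
 */
static double
strip_one_clause(ListSketch **restrictlist, ListSketch *shorter)
{
    if ((*restrictlist)->length > shorter->length)
        *restrictlist = shorter;
    return 0.5;
}

/* Original shape: takes the address of the formal parameter itself. */
double
estimate_direct(ListSketch *restrictlist, ListSketch *shorter, int *remaining)
{
    double      fkselec = strip_one_clause(&restrictlist, shorter);

    *remaining = restrictlist->length;
    return fkselec;
}

/* Workaround shape: copy the parameter into a local and pass the local's address. */
double
estimate_via_local(ListSketch *orig_restrictlist, ListSketch *shorter, int *remaining)
{
    ListSketch *restrictlist = orig_restrictlist;
    double      fkselec = strip_one_clause(&restrictlist, shorter);

    *remaining = restrictlist->length;
    return fkselec;
}
```

Both functions should report the same remaining length and leave the caller's original list untouched; any compiler that makes them differ is miscompiling one of them.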
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Tom Lane
Date:
"Haroon ." <contact.mharoon@gmail.com> writes:
> And if I comment these out i.e. setup_description, setup_privileges and
> 'setup_schema' it seem to progress well without any errors/crashes.
Presumably, what you've done there is remove every single join query from the post-bootstrap scripts. That isn't particularly useful in itself, but it does suggest that you would be able to fire up a normal session afterwards in which you could use a more conventional debugging approach. The problem can evidently be categorized as "planning of any join query whatsoever crashes", so a test case ought to be easy enough to come by.
regards, tom lane
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
"Haroon ."
Date:
On Sat, Jun 25, 2016 at 6:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If that is the explanation, I'm suspicious that it's got something to do
> with the interaction of a static inline-able (single-call-site) function
> and taking the address of a formal parameter. We certainly have multiple
> other instances of each thing, but maybe not both at the same place.
> This leads to a couple of suggestions for dodging the problem:
> 2. Don't pass the original formal parameter to
> get_foreign_key_join_selectivity, ie do something like
> static double
> calc_joinrel_size_estimate(PlannerInfo *root,
> RelOptInfo *outer_rel,
> RelOptInfo *inner_rel,
> double outer_rows,
> double inner_rows,
> SpecialJoinInfo *sjinfo,
> - List *restrictlist)
> + List *orig_restrictlist)
> {
> JoinType jointype = sjinfo->jointype;
> + List *restrictlist = orig_restrictlist;
> Selectivity fkselec;
> Selectivity jselec;
> Selectivity pselec;
The problem appears to be related to 'taking the address of a formal parameter'. NOT passing the original formal parameter to get_foreign_key_join_selectivity fixes it (dodges the problem) on VS2013. The resulting binaries seem to work fine, as initdb no longer experiences a child process crash. 'vcregress check' does not report any failures either.
Anyway, we have decided to use the VS2015 toolchain for the 9.6beta2 release.
Thanks everyone for the valuable input and help. Appreciate it!
Regards,
Haroon
--
Haroon http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Tom Lane
Date:
"Haroon ." <contact.mharoon@gmail.com> writes:
> On Sat, Jun 25, 2016 at 6:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This leads to a couple of suggestions for dodging the problem:
>>
>> 2. Don't pass the original formal parameter to
>> get_foreign_key_join_selectivity, ie do something like
>>
>> static double
>> calc_joinrel_size_estimate(PlannerInfo *root,
>> RelOptInfo *outer_rel,
>> RelOptInfo *inner_rel,
>> double outer_rows,
>> double inner_rows,
>> SpecialJoinInfo *sjinfo,
>> - List *restrictlist)
>> + List *orig_restrictlist)
>> {
>> JoinType jointype = sjinfo->jointype;
>> + List *restrictlist = orig_restrictlist;
>> Selectivity fkselec;
>> Selectivity jselec;
>> Selectivity pselec;
>>
>>
> The problem appears to be related to 'taking the address of a formal
> parameter'. NOT passing the original formal parameter to
> get_foreign_key_join_selectivity fixes it (dodges the problem) on VS2013.
Thanks for investigating! I'll go commit that change. I wish someone would put up a buildfarm critter using VS2013, though.
regards, tom lane
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Alvaro Herrera
Date:
Tom Lane wrote:
> "Haroon ." <contact.mharoon@gmail.com> writes:
> > The problem appears to be related to 'taking the address of a formal
> > parameter'. NOT passing the original formal parameter to
> > get_foreign_key_join_selectivity fixes it (dodges the problem) on VS2013.
>
> Thanks for investigating! I'll go commit that change. I wish someone
> would put up a buildfarm critter using VS2013, though.
Uh, isn't that what woodlouse is using?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Alvaro Herrera
Date:
Michael Paquier wrote:
> On Fri, Jun 24, 2016 at 11:51 AM, Tsunakawa, Takayuki
> <tsunakawa.takay@jp.fujitsu.com> wrote:
> >> From: pgsql-hackers-owner@postgresql.org
> >> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
> >> Sent: Friday, June 24, 2016 11:37 AM
> >> On Fri, Jun 24, 2016 at 11:33 AM, Craig Ringer <craig@2ndquadrant.com>
> >> wrote:
> >> > It might be worth testing that out and adding an initdb startup
> >> > flag to create the directory, since initdb is such a PITA to
> >> > debug.
> >>
> >> I was more thinking about putting that under -DDEBUG for example.
> >
> > I think just the existing option -d (--debug) and/or -n (--no-clean)
> > would be OK.
>
> If the majority thinks that an option switch is more adapted, I won't
> fight it strongly. Just please let's not mess up with the behavior of
> the existing options.
I think creating crashdumps/ when both -d and -n are specified is a bit odd but reasonable.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> Thanks for investigating! I'll go commit that change. I wish someone
>> would put up a buildfarm critter using VS2013, though.
> Uh, isn't that what woodlouse is using?
Well, it wasn't reporting this crash, so there's *something* different.
regards, tom lane
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Craig Ringer
Date:
On 30 June 2016 at 07:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Tom Lane wrote:
> >> Thanks for investigating! I'll go commit that change. I wish someone
> >> would put up a buildfarm critter using VS2013, though.
> > Uh, isn't that what woodlouse is using?
> Well, it wasn't reporting this crash, so there's *something* different.
It may only affect the i386 to x86_64 cross compiler. If Woodlouse is using native x86_64 compilers perhaps that's why?
We've confirmed it on two different versions of VS 2013, so it's not specific to one minor compiler point release.
It'd be handy if the buildfarm captured the output of:
* cl (no arguments, first line only)
* msbuild /nologo /version
and the env vars:
* VS*COMNTOOLS (* being any 3 digits)
* PROCESSOR_ARCHITECTURE
* PROCESSOR_IDENTIFIER
* PROCESSOR_ARCHITEW6432
since right now it's hard to be totally sure exactly what a VS animal is building with unless there's a log attached due to a failure.
That said, TBH I doubt we can or should cover every VS release in every VS configuration. Especially since there are so many ways you can excitingly break and mangle VS, particularly when installing multiple VS versions on one host. It's a great IDE with a truly awful set of installation and management tools.
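As a rough illustration of the environment capture suggested above, here is a minimal C sketch. It is not part of the buildfarm client: `env_or_unset` and `report_build_env` are hypothetical helpers, and VS120COMNTOOLS and VS140COMNTOOLS are just two examples of the VS*COMNTOOLS pattern (VS 2013 and VS 2015). It does not cover the `cl` or `msbuild` output.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return the variable's value, or "(unset)" when it is not defined. */
const char *
env_or_unset(const char *name)
{
    const char *value = getenv(name);

    return (value != NULL) ? value : "(unset)";
}

/*
 * Print the toolchain-related environment variables named in the message
 * above and return how many of them were set.
 */
int
report_build_env(FILE *out)
{
    static const char *const names[] = {
        "VS120COMNTOOLS",       /* VS 2013 */
        "VS140COMNTOOLS",       /* VS 2015 */
        "PROCESSOR_ARCHITECTURE",
        "PROCESSOR_IDENTIFIER",
        "PROCESSOR_ARCHITEW6432"
    };
    int         nset = 0;
    size_t      i;

    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
    {
        if (getenv(names[i]) != NULL)
            nset++;
        fprintf(out, "%s=%s\n", names[i], env_or_unset(names[i]));
    }

#ifdef _MSC_VER
    /* Compiler version as seen at build time: 1800 is VS 2013, 1900 is VS 2015. */
    fprintf(out, "_MSC_VER=%d\n", _MSC_VER);
#endif

    return nset;
}
```

Calling `report_build_env(stdout)` from a small `main` at the start of a build run would log one line per variable. Note that `_MSC_VER` identifies the compiler version but not whether the native or the cross compiler was used, so the `where cl.exe` output shown earlier in the thread remains necessary for that.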
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Alvaro Herrera
Date:
Craig Ringer wrote:
> On 30 June 2016 at 07:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > > Tom Lane wrote:
> > >> Thanks for investigating! I'll go commit that change. I wish someone
> > >> would put up a buildfarm critter using VS2013, though.
> >
> > > Uh, isn't that what woodlouse is using?
> >
> > Well, it wasn't reporting this crash, so there's *something* different.
> It may only affect the i386 to x86_64 cross compiler. If Woodlouse is using
> native x86_64 compilers perhaps that's why?
Hmm, so what about a pure 32bit build, if such a thing still exists? If so and it causes the same crash, perhaps we should have one member for each VS version running on 32bit x86.
(I note that the coverage of MSVC versions has greatly improved in recent months.)
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Craig Ringer
Date:
On 30 June 2016 at 20:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Craig Ringer wrote:
> > On 30 June 2016 at 07:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > > Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > > > Tom Lane wrote:
> > > >> Thanks for investigating! I'll go commit that change. I wish someone
> > > >> would put up a buildfarm critter using VS2013, though.
> > >
> > > > Uh, isn't that what woodlouse is using?
> > >
> > > Well, it wasn't reporting this crash, so there's *something* different.
> > It may only affect the i386 to x86_64 cross compiler. If Woodlouse is using
> > native x86_64 compilers perhaps that's why?
> Hmm, so what about a pure 32bit build, if such a thing still exists? If
> so and it causes the same crash, perhaps we should have one member for
> each VS version running on 32bit x86.
It's fine for a pure 32-bit build, i.e. 32-bit tools and 32-bit target. I tested that.
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Alvaro Herrera
Date:
Craig Ringer wrote:
> On 30 June 2016 at 20:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> > Hmm, so what about a pure 32bit build, if such a thing still exists? If
> > so and it causes the same crash, perhaps we should have one member for
> > each VS version running on 32bit x86.
>
> It's fine for a pure 32-bit build, i.e. 32-bit tools and 32-bit target. I
> tested that.
Ah, okay. I doubt it's worth setting up buildfarm members testing all cross-compiles just to try and catch possible compiler bugs that way, so unless somebody wants to invest more effort in this area, it seems we're done here.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Michael Paquier
Date:
On Fri, Jul 1, 2016 at 9:57 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Craig Ringer wrote:
>> On 30 June 2016 at 20:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>
>> > Hmm, so what about a pure 32bit build, if such a thing still exists? If
>> > so and it causes the same crash, perhaps we should have one member for
>> > each VS version running on 32bit x86.
>>
>> It's fine for a pure 32-bit build, i.e. 32-bit tools and 32-bit target. I
>> tested that.
>
> Ah, okay. I doubt it's worth setting up buildfarm members testing all
> cross-compiles just to try and catch possible compiler bugs that way, so
> unless somebody wants to invest more effort in this area, it seems we're
> done here.
Sure. To be honest just using the latest version of MSVC available for the builds is fine I think. Windows is very careful regarding backward-compatibility of its compiled stuff usually, even if by using VS2015 you make the builds of Postgres incompatible with XP. But software is a world that keeps moving on, and XP is already out of support by Redmond.
--
Michael
Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
From
Craig Ringer
Date:
On 1 July 2016 at 09:02, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Fri, Jul 1, 2016 at 9:57 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > Craig Ringer wrote:
> >> On 30 June 2016 at 20:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> >>
> >> > Hmm, so what about a pure 32bit build, if such a thing still exists? If
> >> > so and it causes the same crash, perhaps we should have one member for
> >> > each VS version running on 32bit x86.
> >>
> >> It's fine for a pure 32-bit build, i.e. 32-bit tools and 32-bit target. I
> >> tested that.
> >
> > Ah, okay. I doubt it's worth setting up buildfarm members testing all
> > cross-compiles just to try and catch possible compiler bugs that way, so
> > unless somebody wants to invest more effort in this area, it seems we're
> > done here.
> Sure. To be honest just using the latest version of MSVC available for
> the builds is fine I think. Windows is very careful regarding
> backward-compatibility of its compiled stuff usually, even if by using
> VS2015 you make the builds of Postgres incompatible with XP. But
> software is a world that keeps moving on, and XP is already out of
> support by Redmond.
I agree. I'm happier now that we've got evidence it's a compiler bug, though.