initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
On Fri, Jun 24, 2016 at 2:14 AM, Umair Shahid <umair.shahid@2ndquadrant.com>
wrote:
---------- Forwarded message ----------
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, Jun 23, 2016 at 9:32 PM
Subject: Re: [pgsql-packagers] PG 9.6beta2 tarballs are ready
To: Magnus Hagander <magnus@hagander.net>
Cc: Umair Shahid <umair.shahid@2ndquadrant.com>, Dave Page <dpage@postgresql.org>,
PostgreSQL Packagers <pgsql-packagers@postgresql.org>
Magnus Hagander <magnus@hagander.net> writes:
That makes more sense as the joinrel stuff *has* been changed between the
two betas. I'm sure someone who's touched that code (Tom?) can comment on
that part..
It still makes little sense to me, as the previous reports say that the
problem happened during bootstrap, and the planner does not run
during bootstrap.
Could we get a look at debug_query_string in the coredump, to possibly
narrow down where the crash is really happening?
Moving thread to -hackers ...
debug_query_string is
"INSERT INTO pg_description SELECT t.objoid, c.oid, t.objsubid,
t.description FROM tmp_pg_description t, pg_class c WHERE c.relname =
t.classname;"
Happening in "setup_description"
It's still strange that it doesn't affect woodlouse.
Or any of the other Windows critters...
regards, tom lane
--
Umair Shahid
2ndQuadrant - The PostgreSQL Support Company
http://www.2ndQuadrant.com/
I was helping Haroon with this last night. I don't have access to the
original thread and he's not around so I don't know how much he said. I'll
repeat our findings here.
During debugging I found that:
* A VS 2013 build (performed by Haroon and copied to the test host) crashes
consistently with the reported symptoms - "performing post-bootstrap
initialization ... child process was terminated by exception 0xC0000005"
* The issue doesn't happen in a VS 2015 build done on the test host
* I couldn't use just-in-time debugging because the restricted execution
token setup isolated the process. For the same reason, breakpoints stop
working in initdb.c after line 3557.
* To get a backtrace, I had to:
  * Launch a VS x86 command prompt
  * devenv /debugexe bin\initdb.exe -D test
  * Set a breakpoint in initdb.c:3557 and initdb.c:3307
  * Run
  * When it traps at get_restricted_token(), manually move the execution
    pointer over the setup of the restricted execution token by dragging &
    dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
    comment it out and rebuild, but I was working with a supplied binary.
  * Continue until next breakpoint
  * Launch process explorer and find the pid of the postgres child process
  * Debug->attach to process, attach to the child postgres. This doesn't
    detach the parent, VS does multiprocess debugging.
  * Continue execution
  * VS will trap on the child when it crashes
* It is an access violation (segfault) in postgres.exe when attempting to
read memory at 0xFFFFFFFFFFFFFFFF in calc_joinrel_size_estimate() at
costsize.c:3940
fkselec = get_foreign_key_join_selectivity(root,
outer_rel->relids,
inner_rel->relids,
sjinfo,
&restrictlist);
with debug_query_string:
0x0000000009bf6140 "INSERT INTO pg_description SELECT t.objoid, c.oid,
t.objsubid, t.description FROM tmp_pg_description t, pg_class c WHERE
c.relname = t.classname;\n"
Backtrace:
Exception thrown at 0x00000001401A5A81 in postgres.exe: 0xC0000005:
Access violation reading location 0xFFFFFFFFFFFFFFFF.
postgres.exe!calc_joinrel_size_estimate(PlannerInfo * root, RelOptInfo *
outer_rel, RelOptInfo * inner_rel, double outer_rows, double inner_rows,
SpecialJoinInfo * sjinfo, List * restrictlist) Line 3944 C
postgres.exe!set_joinrel_size_estimates(PlannerInfo * root, RelOptInfo *
rel, RelOptInfo * outer_rel, RelOptInfo * inner_rel, SpecialJoinInfo *
sjinfo, List * restrictlist) Line 3852 C
postgres.exe!build_join_rel(PlannerInfo * root, Bitmapset * joinrelids,
RelOptInfo * outer_rel, RelOptInfo * inner_rel, SpecialJoinInfo * sjinfo,
List * * restrictlist_ptr) Line 521 C
postgres.exe!make_join_rel(PlannerInfo * root, RelOptInfo * rel1,
RelOptInfo * rel2) Line 721 C
postgres.exe!make_rels_by_clause_joins(PlannerInfo * root, RelOptInfo *
old_rel, ListCell * other_rels) Line 266 C
postgres.exe!join_search_one_level(PlannerInfo * root, int level) Line 69
C
postgres.exe!standard_join_search(PlannerInfo * root, int levels_needed,
List * initial_rels) Line 2172 C
postgres.exe!query_planner(PlannerInfo * root, List * tlist,
void(*)(PlannerInfo *, void *) qp_callback, void * qp_extra) Line 255 C
postgres.exe!grouping_planner(PlannerInfo * root, char
inheritance_update, double tuple_fraction) Line 1695 C
postgres.exe!subquery_planner(PlannerGlobal * glob, Query * parse,
PlannerInfo * parent_root, char hasRecursion, double tuple_fraction) Line
775 C
postgres.exe!standard_planner(Query * parse, int cursorOptions,
ParamListInfoData * boundParams) Line 312 C
postgres.exe!pg_plan_query(Query * querytree, int cursorOptions,
ParamListInfoData * boundParams) Line 800 C
postgres.exe!exec_simple_query(const char * query_string) Line 1023 C
postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname,
const char * username) Line 4076 C
postgres.exe!main(int argc, char * * argv) Line 227 C
Local vars:
+ inner_rel 0x0000000009dfd170 {type=T_EquivalenceClass (537)
reloptkind=RELOPT_BASEREL (0) relids=0x0000000009d6d718 {...} ...} RelOptInfo
*
inner_rows 270.00000000000000 double
+ outer_rel 0x00000001401ded48
{postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel,
RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
outer_rows 2.653352065130e-314#DEN double
+ restrictlist 0x0000000009d6f7f8 {type=T_List (656) length=1
head=0x0000000009d6f7d8 {data={ptr_value=0x0000000009d6e980 ...} ...} ...} List
*
+ root 0x0000000009dfd800 {type=1 parse=0x000000000067d220
{type=T_AllocSetContext (601) commandType=CMD_UNKNOWN (0) ...} ...} PlannerInfo
*
+ sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543)
min_lefthand=0x0000000009dfcfd8 {nwords=1 words=0x0000000009dfcfdc {...} }
...} SpecialJoinInfo *
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
* Launch a VS x86 command prompt
* devenv /debugexe bin\initdb.exe -D test
* Set a breakpoint in initdb.c:3557 and initdb.c:3307
* Run
* When it traps at get_restricted_token(), manually move the execution
pointer over the setup of the restricted execution token by dragging &
dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
comment it out and rebuild, but I was working with a supplied binary.
* Continue until next breakpoint
* Launch process explorer and find the pid of the postgres child process
* Debug->attach to process, attach to the child postgres. This doesn't
detach the parent, VS does multiprocess debugging.
* Continue execution
* vs will trap on the child when it crashes
Do you think a crash dump could have been created by creating
crashdumps/ in PGDATA as part of initdb before this query is run?
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 24 June 2016 at 10:21, Craig Ringer <craig@2ndquadrant.com> wrote:
Also, to save anyone else this hassle, I have saved a process dump (windows
core file) and the debug symbols to gdrive. You can get them at:
Note that you will need a Visual Studio version installed. VS Community
2015 works fine. You only need to install the C++ devenv and C++ headers,
you don't need MFC or any of the rest. The default install is fine if you
don't mind a bigger download. Once installed, open postgres.dmp, then go
to debug->options, symbols. There, enable the Microsoft Symbol Server, and
also add a new entry for the absolute path to the symbols directory for the
archive you unpacked. You should enable the symbol cache directory too,
make a directory in your user dir and put it there.
If Haroon shared some gdrive links earlier on the thread I don't have
access to, this is the same data just efficiently compressed (32MB instead
of 180MB) and packaged up in a single convenient archive with the matching
sources and a full working install. You'll need 7zip to unpack it, but that
should be on your "install as soon as you install Windows" list anyway.
https://drive.google.com/open?id=0B7JKjZdzBUo1aE5DQnZ5VEpBUEk
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 24 June 2016 at 10:28, Michael Paquier <michael.paquier@gmail.com> wrote:
Do you think a crash dump could have been created by creating
crashdumps/ in PGDATA as part of initdb before this query is run?
I see what you did there ;)
Yes, quite possibly, actually. I should've just got Haroon to build me a
new initdb without the priv setting and with creation of crashdumps/ .
It might be worth testing that out and adding an initdb startup flag to
create the directory, since initdb is such a PITA to debug.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jun 24, 2016 at 11:33 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
Yes, quite possibly, actually. I should've just got Haroon to build me a new
initdb without the priv setting and with creation of crashdumps/ .
It might be worth testing that out and adding an initdb startup flag to
create the directory, since initdb is such a PITA to debug.
I was more thinking about putting that under -DDEBUG for example.
--
Michael
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
Sent: Friday, June 24, 2016 11:37 AM
On Fri, Jun 24, 2016 at 11:33 AM, Craig Ringer <craig@2ndquadrant.com>
wrote:
It might be worth testing that out and adding an initdb startup flag
to create the directory, since initdb is such a PITA to debug.
I was more thinking about putting that under -DDEBUG for example.
I think just the existing option -d (--debug) and/or -n (--no-clean) would be OK.
Regards
Takayuki Tsunakawa
On Fri, Jun 24, 2016 at 11:51 AM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
I think just the existing option -d (--debug) and/or -n (--no-clean) would be OK.
If the majority thinks that an option switch is more adapted, I won't
fight it strongly. Just please let's not mess up with the behavior of
the existing options.
--
Michael
On 24 June 2016 at 05:17, Umair Shahid <umair.shahid@gmail.com> wrote:
It's still strange that it doesn't affect woodlouse.
Or any of the other Windows critters...
Given that it's only been seen in VS 2013, it's particularly odd that it's
not biting woodlouse.
I'd like more details from those whose installs are crashing. What exact
vcvars env did you run under, with which exact cl.exe version?
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jun 24, 2016 at 1:28 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
Given that it's only been seen in VS 2013, it's particularly odd that it's
not biting woodlouse.
I'd like more details from those whose installs are crashing. What exact
vcvars env did you run under, with which exact cl.exe version?
Which OS did you use for the compilation? I don't think that this
matters much but woodlouse is using Win7.
--
Michael
On 24 June 2016 at 12:31, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Jun 24, 2016 at 1:28 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
Given that it's only been seen in VS 2013, it's particularly odd that it's
not biting woodlouse.
I'd like more details from those whose installs are crashing. What exact
vcvars env did you run under, with which exact cl.exe version?
Which OS did you use for the compilation? I don't think that this
matters much but woodlouse is using Win7.
I'll have to wait for Haroon for that info for the crashing builds he did,
but I've now reproduced it with:
Windows server 2012 R2, VS 2013 Community Update 5, cross compile tools for
x86 to amd64. cl 18.00.40629 for x64, env:
%comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio
12.0\VC\vcvarsall.bat" x86_amd64"
"where cl" reports
C:\Program Files (x86)\Microsoft Visual Studio
12.0\VC\bin\x86_amd64\cl.exe
Note that cross compilation is a typical configuration on Windows, where
you routinely use 32bit x86 compilers to build 64bit code, except in the
newest SDKs.
I see the same symptoms, with the segfault.
This host is a clean install, an AWS instance created for the purpose.
It looks like woodlouse probably runs an older VS2013 and uses the native
x64 toolchain; its env includes:
C:\\Program Files (x86)\\Microsoft Visual Studio 12.0\\VC\\BIN\\amd64
and does not have x86_amd64 in it.
BTW, I suggested to Haroon that he clone beta2 from git, then do a
git-bisect between beta1 (works) and beta2 (fails) to see if he can
identify the commit that causes things to start failing. I don't know how
far he got with that yesterday.
By comparison, I had no problems on the same host with VS Community 2015,
cl 19.00.23918, env "VS2015 x64 Native Tools Command Prompt":
%comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio
14.0\VC\vcvarsall.bat"" amd64
On a side note I'm unable to build with vs2013 community u5 native tools
for some reason. Link errors, unresolved external symbol _ischartype_l . cl
18.00.42629 for x64, env:
%comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio
12.0\VC\vcvarsall.bat" amd64"
"where cl" reports:
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64\cl.exe
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 24 June 2016 at 13:00, Craig Ringer <craig@2ndquadrant.com> wrote:
I've now reproduced it with:
I can also confirm that it _doesn't_ crash with the same SDK using a 32-bit
build (running under WoW on x64). cl 18.00.40629 for x86, env:
%comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio
12.0\VC\vcvarsall.bat" x86"
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 24 June 2016 at 10:28, Michael Paquier <michael.paquier@gmail.com> wrote:
Do you think a crash dump could have been created by creating
crashdumps/ in PGDATA as part of initdb before this query is run?
The answer is "yes" btw. Add "crashdumps" to the static array of
directories created by initdb and it works great.
Sigh. It'd be less annoying if I hadn't written most of the original patch.
For convenience I also commented out the check_root call in
src/backend/main.c and the get_restricted_token(progname) call in initdb.c,
so I could run it easily under an admin account where I can also install
tools etc without hassle. Not recommended on a non-throwaway machine of
course.
The generated crashdump shows the same crash in the same location.
I have absolutely no idea why it's trying to access memory at what looks
like (uint64)(-1) though. Nothing in the auto vars list:
+ &restrictlist 0x000000000043f7b0 {0x0000000009e32600 {type=T_List (656)
length=1 head=0x0000000009e325e0 {data={ptr_value=...} ...} ...}} List * *
+ inner_rel 0x0000000009e7ad68 {type=T_EquivalenceClass (537)
reloptkind=RELOPT_BASEREL (0) relids=0x0000000009e30520 {...} ...} RelOptInfo
*
+ inner_rel->relids 0x0000000009e30520 {nwords=658 words=0x0000000009e30524
{...} } Bitmapset *
+ outer_rel 0x00000001401dec98
{postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel,
RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
+ outer_rel->relids 0xe808498b48d78b48 {nwords=??? words=0xe808498b48d78b4c
{...} } Bitmapset *
+ sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543)
min_lefthand=0x0000000009e7abd0 {nwords=1 words=0x0000000009e7abd4 {...} }
...} SpecialJoinInfo *
or locals:
+ inner_rel 0x0000000009e7ad68 {type=T_EquivalenceClass (537)
reloptkind=RELOPT_BASEREL (0) relids=0x0000000009e30520 {...} ...} RelOptInfo
*
inner_rows 270.00000000000000 double
+ outer_rel 0x00000001401dec98
{postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel,
RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
outer_rows 2.653351978175e-314#DEN double
+ restrictlist 0x0000000009e32600 {type=T_List (656) length=1
head=0x0000000009e325e0 {data={ptr_value=0x0000000009e31788 ...} ...} ...} List
*
+ root 0x0000000009e7b3f8 {type=1 parse=0x0000000000504ad0
{type=T_AllocSetContext (601) commandType=CMD_UNKNOWN (0) ...} ...} PlannerInfo
*
+ sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543)
min_lefthand=0x0000000009e7abd0 {nwords=1 words=0x0000000009e7abd4 {...} }
...} SpecialJoinInfo *
seems to fit. Though outer_rel->relids is a pretty weird address -
0xe808498b48d78b48? Really?
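One way to read that odd outer_rel->relids value: the debugger shows outer_rel pointing into the code of build_joinrel_tlist(), so dereferencing it would read instruction bytes rather than a Bitmapset pointer. A minimal sketch, assuming a little-endian x86-64 host as in the report, that lists the eight bytes that would sit in memory for such a value. The helper names u64_to_le_bytes and print_le_bytes are hypothetical and not part of PostgreSQL.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: split a 64-bit value into the byte order in which
 * a little-endian machine (such as x86-64) stores it in memory. */
void
u64_to_le_bytes(uint64_t value, unsigned char out[8])
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char) ((value >> (8 * i)) & 0xFF);
}

/* Print the bytes in memory order, for comparison with a disassembly. */
void
print_le_bytes(uint64_t value)
{
    unsigned char bytes[8];

    u64_to_le_bytes(value, bytes);
    for (int i = 0; i < 8; i++)
        printf("%02x ", bytes[i]);
    printf("\n");
}
```

Calling print_le_bytes(0xe808498b48d78b48ULL) gives the byte sequence 48 8b d7 48 8b 49 08 e8. Sequences starting with 48 8b are typical of 64-bit mov instructions and e8 is the opcode of a near call, which would be consistent with the pointer landing in executable code. It does not by itself say whether the debugger or the program is the confused party.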
I'd point DrMemory at it, but unfortunately it only supports 32-bit
applications so far. I don't have access to any of the commercial tools
like Purify. Maybe someone at EDB can help out with that, if you guys do?
Register states are:
RAX = 000000000043F7B0 RBX = 0000000009E32218 RCX = 0000000009E78510 RDX =
0000000009E7ABD0 RSI = 0000000009E78510 RDI = 0000000009E32218 R8 =
0000000009E7B3F8 R9 = 0000000009E7B1E8 R10 = 0000000009E7A9C0 R11 =
0000000000000001 R12 = 0000000009E32200 R13 = 0000000000000000 R14 =
0000000009E7B1E8 R15 = 0000000000000000 RIP = 00000001401A59D1 RSP =
000000000043F6E0 RBP = 0000000009E7A9C0 EFL = 00010202
and the exact crash site is
fkselec = get_foreign_key_join_selectivity(root,
outer_rel->relids,
inner_rel->relids,
sjinfo,
&restrictlist);
00000001401A59AB mov r8,qword ptr [r8+8]
00000001401A59AF mov rdx,qword ptr [rdx+8]
00000001401A59B3 movaps xmmword ptr [rax-28h],xmm6
00000001401A59B7 movaps xmmword ptr [rax-38h],xmm7
00000001401A59BB movaps xmmword ptr [rax-48h],xmm8
00000001401A59C0 movaps xmmword ptr [rax-58h],xmm9
00000001401A59C5 lea rax,[rax+38h]
00000001401A59C9 movaps xmm7,xmm3
00000001401A59CC mov qword ptr [rsp+20h],rax
00000001401A59D1 movaps xmmword ptr [rax-68h],xmm10 <---- here
00000001401A59D6 mov qword ptr [rax-48h],r14
00000001401A59DA mov r14,qword ptr [sjinfo]
00000001401A59E2 mov ebp,dword ptr [r14+28h]
00000001401A59E6 mov qword ptr [rax-50h],r15
00000001401A59EA mov r9,r14
00000001401A59ED mov r15,rcx
00000001401A59F0 call get_foreign_key_join_selectivity (01401A5C30h)
with
XMM3 000000000000000040A5720000000000
RAX 000000000043F7B0
XMM7 000000000000000040A5720000000000
RSP 000000000043F6E0
XMM10 00000000000000000000000000000000
I'm about 100% ignorant of x64 asm, but hopefully someone can interpret
this usefully. I can tell it's doing a sse "Move Aligned Packed
Single-Precision Floating-Point Values" (from memory into a sse register?)
but that's about it.
rax-68h is 0x000000000043F748. The memory at that location is
00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0 bf 00 00 00 00 00 00 00 00 c0
a9 e7 09 00 00 00 00 f8 b3 e7 09 00 00
So there you go, a whole bunch of data and I, at least, am still none the
wiser.
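A quick arithmetic check on the marked instruction may help here. movaps with a memory operand requires that address to be 16-byte aligned, and a misaligned operand raises a general-protection fault rather than a page fault; Windows commonly reports that as an access violation without a meaningful address. A minimal sketch using only the RAX value and the 68h displacement quoted above (the function names are hypothetical):

```c
#include <stdint.h>
#include <stdio.h>

/* Effective address of "[base - displacement]". */
uint64_t
effective_address(uint64_t base, uint64_t displacement)
{
    return base - displacement;
}

/* Returns 1 if addr is a multiple of alignment (a power of two), else 0. */
int
is_aligned(uint64_t addr, uint64_t alignment)
{
    return (addr & (alignment - 1)) == 0;
}

/* Report whether a movaps store to [rax - 68h] would be 16-byte aligned. */
void
report_movaps_target(uint64_t rax)
{
    uint64_t target = effective_address(rax, 0x68);

    printf("target = 0x%016llx, 16-byte aligned: %s\n",
           (unsigned long long) target,
           is_aligned(target, 16) ? "yes" : "no");
}
```

With the reported RAX = 0x000000000043F7B0 the target is 0x000000000043F748, which ends in 8 and so is only 8-byte aligned. If the register dump reflects the state at the fault, that would point at a misaligned stack slot for the saved XMM10 rather than at a bad pointer in the planner data structures.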
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jun 24, 2016 at 3:22 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
The answer is "yes" btw. Add "crashdumps" to the static array of directories
created by initdb and it works great.
As simple as attached..
Sigh. It'd be less annoying if I hadn't written most of the original patch.
You mean the patch that created the crashdumps/ trick? That saved
me a couple of months back when analyzing a problem, TBH.
--
Michael
Attachments:
dbg-initdb.patch
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index d4a5e7c..3796213 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -199,6 +199,9 @@ static const char *backend_options = "--single -F -O -j -c search_path=pg_catalo
 static const char *const subdirs[] = {
     "global",
+#if defined(DEBUG) && defined(WIN32)
+    "crashdumps",
+#endif
     "pg_xlog/archive_status",
     "pg_clog",
     "pg_commit_ts",
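For readers unfamiliar with how such an array behaves under conditional compilation, here is a minimal self-contained sketch. It is not the actual initdb code: it is a static list of subdirectory names with one entry compiled in conditionally, plus a lookup helper. The macro SKETCH_ENABLE_CRASHDUMPS and the names sketch_subdirs and sketch_has_subdir are hypothetical stand-ins for the DEBUG/WIN32 guard and the subdirs array in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative list only; the real array in initdb.c has more entries. */
static const char *const sketch_subdirs[] = {
    "global",
#ifdef SKETCH_ENABLE_CRASHDUMPS     /* hypothetical guard, see lead-in */
    "crashdumps",
#endif
    "pg_xlog/archive_status",
    "pg_clog",
    "pg_commit_ts",
};

/* Entry count computed at compile time, so it follows the #ifdef. */
#define SKETCH_NUM_SUBDIRS \
    (sizeof(sketch_subdirs) / sizeof(sketch_subdirs[0]))

/* Returns 1 if name is one of the directories that would be created. */
int
sketch_has_subdir(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < SKETCH_NUM_SUBDIRS; i++)
    {
        if (strcmp(sketch_subdirs[i], name) == 0)
            return 1;
    }
    return 0;
}
```

Building the sketch with -DSKETCH_ENABLE_CRASHDUMPS makes sketch_has_subdir("crashdumps") return 1; without it the entry is simply absent, so regular builds are unaffected.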
On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer
<craig(at)2ndquadrant(dot)com> wrote:
I was helping Haroon with this last night. I don't have access to the
original thread and he's not around so I don't know how much he said.
I'll
repeat our findings here.
Craig, I am around now looking into this. I'll update the list as I get
more info.
- Haroon
--
Haroon http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer
<craig(at)2ndquadrant(dot)com> wrote:
I was helping Haroon with this last night. I don't have access to the
original thread and he's not around so I don't know how much he said.
I'll
repeat our findings here.
Craig, I am around now looking into this. I'll update the list as I get
more info.
Apparently my previous message (this same text) didn't make it through ...
-- Haroon
I have been running bisect, it breaks at this commit:
commit 100340e2dcd05d6505082a8fe343fb2ef2fa5b2a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat Jun 18 15:22:34 2016 -0400
Restore foreign-key-aware estimation of join relation sizes.
This patch provides a new implementation of the logic added by commit
137805f89 and later removed by 77ba61080. It differs from the original
primarily in expending much less effort per joinrel in large queries,
which it accomplishes by doing most of the matching work once per query not
once per joinrel. Hopefully, it's also less buggy and better commented.
The never-documented enable_fkey_estimates GUC remains gone.
There remains work to be done to make the selectivity estimates account
for nulls in FK referencing columns; but that was true of the original
patch as well. We may be able to address this point later in beta.
In the meantime, any error should be in the direction of overestimating
rather than underestimating joinrel sizes, which seems like the direction
we want to err in.
Tomas Vondra and Tom Lane
Discussion: <31041.1465069446@sss.pgh.pa.us>
--
Haroon http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes:
I have absolutely no idea why it's trying to access memory at what looks
like (uint64)(-1) though. Nothing in the auto vars list:
+ &restrictlist 0x000000000043f7b0 {0x0000000009e32600 {type=T_List (656) length=1 head=0x0000000009e325e0 {data={ptr_value=...} ...} ...}} List * *
+ inner_rel 0x0000000009e7ad68 {type=T_EquivalenceClass (537) reloptkind=RELOPT_BASEREL (0) relids=0x0000000009e30520 {...} ...} RelOptInfo *
+ inner_rel->relids 0x0000000009e30520 {nwords=658 words=0x0000000009e30524 {...} } Bitmapset *
+ outer_rel 0x00000001401dec98 {postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel, RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
+ outer_rel->relids 0xe808498b48d78b48 {nwords=??? words=0xe808498b48d78b4c {...} } Bitmapset *
+ sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543) min_lefthand=0x0000000009e7abd0 {nwords=1 words=0x0000000009e7abd4 {...} } ...} SpecialJoinInfo *
inner_rel seems to be pointing at garbage, or at least why is the
referenced object tag T_EquivalenceClass not T_RelOptInfo? And
why aren't we being given anything for outer_rel? The value for
outer_rel->relids isn't inspiring any confidence either, and
for that matter inner_rel->relids couldn't possibly have more than
nwords==1 given how simple the query is. In short, either the
debugger is totally confused or the code is, because most of these
pointers aren't pointing at anything sane.
TBH, this looks more like a compiler bug than anything else. I wonder
whether it's getting confused by taking the address of a parameter
(although surely we do that elsewhere).
It would be worth recompiling at -O0, or whatever the local equivalent
of that is, to see if (1) the crash goes away or (2) the debugger's
printouts get any more reliable.
regards, tom lane
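The pattern mentioned above, taking the address of a parameter, can be reduced to a few lines. This is a simplified sketch with hypothetical types and names, not the planner code: a function receives a list pointer by value and passes the address of that parameter to a helper, which may replace the list it points to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a planner list; only the length matters here. */
typedef struct SketchList
{
    int length;
} SketchList;

static SketchList sketch_empty_list = {0};

/* Helper that may swap the caller's list pointer for another list, loosely
 * in the way get_foreign_key_join_selectivity() is handed &restrictlist. */
double
sketch_consume_clauses(SketchList **list_ptr)
{
    double selectivity = 1.0;

    assert(list_ptr != NULL && *list_ptr != NULL);
    if ((*list_ptr)->length > 0)
    {
        selectivity = 0.5;              /* arbitrary illustrative value */
        *list_ptr = &sketch_empty_list; /* caller now sees the reduced list */
    }
    return selectivity;
}

/* Takes the list by value, then hands out the address of that parameter.
 * The compiler must keep 'list' in addressable storage across the call. */
int
sketch_remaining_after_estimate(SketchList *list)
{
    (void) sketch_consume_clauses(&list);
    return list->length;
}
```

Because the callee writes through the pointer, an optimizing compiler cannot keep the parameter purely in a register, which is one place where code generation could differ between optimized and unoptimized builds. With MSVC the closest equivalent of -O0 is /Od.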
On Fri, Jun 24, 2016 at 1:28 PM, Craig Ringer
<craig(at)2ndquadrant(dot)com> wrote:
I'd like more details from those whose installs are crashing. What exact
vcvars env did you run under, with which exact cl.exe version?
This is Windows Server 2012 R2 Standard.
Devenv: Microsoft Visual Studio 2013 Community Version 12.0.31101.0.
Env:
%comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat"" x86_amd64
'where cl.exe'
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64\cl.exe
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\cl.exe
I have also been able to reproduce it on Windows 7 Professional (Service Pack 1)
with Microsoft Visual Studio 2013 Community Version 12.0.40629.0.
Env:
%comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat"" x86_amd64
'where cl.exe'
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64\cl.exe
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\cl.exe
I bisected between beta1 (good) and beta2 (bad), given that beta1 works
fine. The crash first occurs at the following commit.
commit 100340e2dcd05d6505082a8fe343fb2ef2fa5b2a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat Jun 18 15:22:34 2016 -0400
Restore foreign-key-aware estimation of join relation sizes.
This patch provides a new implementation of the logic added by commit
137805f89 and later removed by 77ba61080. It differs from the original
primarily in expending much less effort per joinrel in large queries,
which it accomplishes by doing most of the matching work once per query
not once per joinrel. Hopefully, it's also less buggy and better commented.
The never-documented enable_fkey_estimates GUC remains gone.
There remains work to be done to make the selectivity estimates account
for nulls in FK referencing columns; but that was true of the original
patch as well. We may be able to address this point later in beta.
In the meantime, any error should be in the direction of overestimating
rather than underestimating joinrel sizes, which seems like the direction
we want to err in.
Tomas Vondra and Tom Lane
Discussion: <31041.1465069446@sss.pgh.pa.us>
This appears consistent with the planner crash suggested by the crash dump
Craig shared.
Tom, any ideas on what could be going wrong here?
Given that it fails in 'setup_description', I tried bypassing that by
commenting it out; it then crashes in 'setup_privileges' and
'setup_schema'.
debug_query_string for setup_privileges:
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype)
SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_class'),
0, relacl, 'i' FROM pg_class
WHERE relacl IS NOT NULL AND relkind IN ('r', 'v', 'm', 'S');
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype)
SELECT pg_class.oid, (SELECT oid FROM pg_class WHERE relname = 'pg_class'),
pg_attribute.attnum, pg_attribute.attacl, 'i' FROM pg_class JOIN
pg_attribute ON (pg_class.oid = pg_attribute.attrelid)
WHERE pg_attribute.attacl IS NOT NULL AND pg_class.relkind IN ('r', 'v', 'm', 'S');
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype)
SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_proc'),
0, proacl, 'i' FROM pg_proc WHERE proacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype)
SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_type'),
0, typacl, 'i' FROM pg_type WHERE typacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype)
SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_language'),
0, lanacl, 'i' FROM pg_language WHERE lanacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype)
SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_largeobject_metadata'),
0, lomacl, 'i' FROM pg_largeobject_metadata WHERE lomacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype)
SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_namespace'),
0, nspacl, 'i' FROM pg_namespace WHERE nspacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype)
SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_database'),
0, datacl, 'i' FROM pg_database WHERE datacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype)
SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_tablespace'),
0, spcacl, 'i' FROM pg_tablespace WHERE spcacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype)
SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_foreign_data_wrapper'),
0, fdwacl, 'i' FROM pg_foreign_data_wrapper WHERE fdwacl IS NOT NULL;
INSERT INTO pg_init_privs (objoid, classoid, objsubid, initprivs, privtype)
SELECT oid, (SELECT oid FROM pg_class WHERE relname = 'pg_foreign_server'),
0, srvacl, 'i' FROM pg_foreign_server WHERE srvacl IS NOT NULL;
/*
 * SQL Information Schema
 * as defined in ISO/IEC 9075-11:2011
 *
 * Copyright (c) 2003-2016, PostgreSQL Global Development Group
 *
 * src/backend/catalog/information_schema.sql
 *
 * Note: this file is read in single-user -j mode, which means that the
 * command terminator is semicolon-newline-newline; whenever the backend
 * sees that, it stops and executes what it's got. If you write a lot of
 * statements without empty lines between, they'll all get quoted to you
 * in any error message about one of them, so don't do that. Also, you
 * cannot write a semicolon immediately followed by an empty line in a
 * string literal (including a function body!) or a multiline comment.
 */
/*
 * Note: Generally, the definitions in this file should be ordered
 * according to the clause numbers in the SQL standard, which is also the
 * alphabetical order. In some cases it is convenient or necessary to
 * define one information schema view by using another one; in that case,
 * put the referencing view at the very end and leave a note where it
 * should have been put.
 */
/*
 * 5.1
 * INFORMATION_SCHEMA schema
 */
CREATE SCHEMA information_schema;
GRANT USAGE ON SCHEMA information_schema TO PUBLIC;
SET search_path TO information_schema;
debug_query_string for setup_schema:
INSERT INTO sql_implementation_info VALUES ('10003', 'CATALOG NAME', NULL,
'Y', NULL);
INSERT INTO sql_implementation_info VALUES ('10004', 'COLLATING SEQUENCE',
NULL, (SELECT default_collate_name FROM character_sets), NULL);
INSERT INTO sql_implementation_info VALUES ('23', 'CURSOR COMMIT
BEHAVIOR', 1, NULL, 'close cursors and retain prepared statements');
INSERT INTO sql_implementation_info VALUES ('2', 'DATA SOURCE NAME',
NULL, '', NULL);
INSERT INTO sql_implementation_info VALUES ('17', 'DBMS NAME', NULL,
(select trim(trailing ' ' from substring(version() from '^[^0-9]*'))),
NULL);
INSERT INTO sql_implementation_info VALUES ('18', 'DBMS VERSION', NULL,
'???', NULL); -- filled by initdb
INSERT INTO sql_implementation_info VALUES ('26', 'DEFAULT TRANSACTION
ISOLATION', 2, NULL, 'READ COMMITTED; user-settable');
INSERT INTO sql_implementation_info VALUES ('28', 'IDENTIFIER CASE', 3,
NULL, 'stored in mixed case - case sensitive');
INSERT INTO sql_implementation_info VALUES ('85', 'NULL COLLATION', 0,
NULL, 'nulls higher than non-nulls');
INSERT INTO sql_implementation_info VALUES ('13', 'SERVER NAME', NULL,
'', NULL);
INSERT INTO sql_implementation_info VALUES ('94', 'SPECIAL CHARACTERS',
NULL, '', 'all non-ASCII characters allowed');
INSERT INTO sql_implementation_info VALUES ('46', 'TRANSACTION
CAPABLE', 2, NULL, 'both DML and DDL');
And if I comment these out, i.e. 'setup_description', 'setup_privileges' and
'setup_schema', it seems to progress well without any errors/crashes.
Regards,
Haroon
--
Haroon http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 24 June 2016 at 21:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
TBH, this looks more like a compiler bug than anything else.
I tend to agree. Especially since valgrind has no complaints on x64 linux,
and neither does DrMemory for 32-bit builds with the same toolchain on the
same Windows and same SDK.
I don't see any particular reason we can't proceed with 9.6beta2 and build
x64 Pg with MS VS 2015. There's no evidence turning up of a Pg bug here,
and compiling with a different toolchain gets us working binaries for the
target platform in question.
It would be worth recompiling at -O0, or whatever the local equivalent
of that is, to see if (1) the crash goes away or (2) the debugger's
printouts get any more reliable
Yeah, it probably is. I'll see if I can find time this w/e.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes:
On 24 June 2016 at 21:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
TBH, this looks more like a compiler bug than anything else.
I tend to agree. Especially since valgrind has no complaints on x64 linux,
and neither does DrMemory for 32-bit builds with the same toolchain on the
same Windows and same SDK.
If that is the explanation, I'm suspicious that it's got something to do
with the interaction of a static inline-able (single-call-site) function
and taking the address of a formal parameter. We certainly have multiple
other instances of each thing, but maybe not both at the same place.
This leads to a couple of suggestions for dodging the problem:
1. Make get_foreign_key_join_selectivity non-static so that it doesn't
get inlined, along the lines of
List *restrictlist);
-static Selectivity get_foreign_key_join_selectivity(PlannerInfo *root,
+extern Selectivity get_foreign_key_join_selectivity(PlannerInfo *root,
Relids outer_relids,
...
*/
-static Selectivity
+Selectivity
get_foreign_key_join_selectivity(PlannerInfo *root,
2. Don't pass the original formal parameter to
get_foreign_key_join_selectivity, ie do something like
static double
calc_joinrel_size_estimate(PlannerInfo *root,
RelOptInfo *outer_rel,
RelOptInfo *inner_rel,
double outer_rows,
double inner_rows,
SpecialJoinInfo *sjinfo,
- List *restrictlist)
+ List *orig_restrictlist)
{
JoinType jointype = sjinfo->jointype;
+ List *restrictlist = orig_restrictlist;
Selectivity fkselec;
Selectivity jselec;
Selectivity pselec;
Obviously, if either of those things do make the problem go away, it's
a compiler bug. If not, we'll need to dig deeper.
regards, tom lane
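The two shapes contrasted in suggestion 2 can be sketched outside PostgreSQL. This is a minimal, hypothetical illustration only: `List`, `trim_list`, `estimate_direct`, and `estimate_with_local` are stand-in names invented here and are not the planner's code. Under a correct compiler both variants compute the same result; the hypothesis in this thread is merely that the affected compiler mishandles the first shape when the callee is a static, single-call-site function.

```c
/* Hypothetical stand-in for a list type; not PostgreSQL's List. */
typedef struct List
{
    int length;
} List;

/*
 * Static helper with a single call site per caller.  Like the function
 * discussed above, it may replace the caller's list through the pointer
 * it is given and returns a selectivity-like factor.
 */
static double
trim_list(List **restrictlist)
{
    static List trimmed = {0};
    double sel = 1.0;

    if ((*restrictlist)->length > 0)
    {
        trimmed.length = (*restrictlist)->length - 1;
        *restrictlist = &trimmed;   /* caller now sees the shorter list */
        sel = 0.5;
    }
    return sel;
}

/* Suspect shape: the address of the formal parameter itself is passed. */
static double
estimate_direct(List *restrictlist)
{
    double sel = trim_list(&restrictlist);

    return sel * (restrictlist->length + 1);
}

/* Workaround shape: copy the parameter to a local and pass its address. */
static double
estimate_with_local(List *orig_restrictlist)
{
    List *restrictlist = orig_restrictlist;
    double sel = trim_list(&restrictlist);

    return sel * (restrictlist->length + 1);
}
```

Calling both functions on equal inputs should give equal results with any conforming compiler, so a difference between them would point at code generation rather than at the logic.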
"Haroon ." <contact.mharoon@gmail.com> writes:
And if I comment these out, i.e. 'setup_description', 'setup_privileges' and
'setup_schema', it seems to progress well without any errors/crashes.
Presumably, what you've done there is remove every single join query
from the post-bootstrap scripts. That isn't particularly useful in
itself, but it does suggest that you would be able to fire up a
normal session afterwards in which you could use a more conventional
debugging approach. The problem can evidently be categorized as
"planning of any join query whatsoever crashes", so a test case
ought to be easy enough to come by.
regards, tom lane
On Sat, Jun 25, 2016 at 6:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
If that is the explanation, I'm suspicious that it's got something to do
with the interaction of a static inline-able (single-call-site) function
and taking the address of a formal parameter. We certainly have multiple
other instances of each thing, but maybe not both at the same place.
This leads to a couple of suggestions for dodging the problem:
2. Don't pass the original formal parameter to
get_foreign_key_join_selectivity, ie do something like
static double
calc_joinrel_size_estimate(PlannerInfo *root,
RelOptInfo *outer_rel,
RelOptInfo *inner_rel,
double outer_rows,
double inner_rows,
SpecialJoinInfo *sjinfo,
- List *restrictlist)
+ List *orig_restrictlist)
{
JoinType jointype = sjinfo->jointype;
+ List *restrictlist = orig_restrictlist;
Selectivity fkselec;
Selectivity jselec;
Selectivity pselec;
The problem appears to be related to 'taking the address of a formal
parameter'. NOT passing the original formal parameter to
get_foreign_key_join_selectivity fixes it (dodges the problem) on VS2013.
The resulting binaries seem to work fine: initdb no longer experiences a child
process crash, and 'vcregress check' does not report any failures either.
Anyway, we have decided to use the VS2015 toolchain for the 9.6beta2 release.
Thanks everyone for the valuable input and help. Appreciate it!
Regards,
Haroon
--
Haroon http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
"Haroon ." <contact.mharoon@gmail.com> writes:
On Sat, Jun 25, 2016 at 6:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
This leads to a couple of suggestions for dodging the problem:
2. Don't pass the original formal parameter to
get_foreign_key_join_selectivity, ie do something like
static double
calc_joinrel_size_estimate(PlannerInfo *root,
RelOptInfo *outer_rel,
RelOptInfo *inner_rel,
double outer_rows,
double inner_rows,
SpecialJoinInfo *sjinfo,
- List *restrictlist)
+ List *orig_restrictlist)
{
JoinType jointype = sjinfo->jointype;
+ List *restrictlist = orig_restrictlist;
Selectivity fkselec;
Selectivity jselec;
Selectivity pselec;
The problem appears to be related to 'taking the address of a formal
parameter'. NOT passing the original formal parameter to
get_foreign_key_join_selectivity fixes it (dodges the problem) on VS2013.
Thanks for investigating! I'll go commit that change. I wish someone
would put up a buildfarm critter using VS2013, though.
regards, tom lane
Tom Lane wrote:
"Haroon ." <contact.mharoon@gmail.com> writes:
The problem appears to be related to 'taking the address of a formal
parameter'. NOT passing the original formal parameter to
get_foreign_key_join_selectivity fixes it (dodges the problem) on VS2013.
Thanks for investigating! I'll go commit that change. I wish someone
would put up a buildfarm critter using VS2013, though.
Uh, isn't that what woodlouse is using?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Michael Paquier wrote:
On Fri, Jun 24, 2016 at 11:51 AM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
Sent: Friday, June 24, 2016 11:37 AM
On Fri, Jun 24, 2016 at 11:33 AM, Craig Ringer <craig@2ndquadrant.com>
wrote:
It might be worth testing that out and adding an initdb startup
flag to create the directory, since initdb is such a PITA to
debug.
I was more thinking about putting that under -DDEBUG for example.
I think just the existing option -d (--debug) and/or -n (--no-clean)
would be OK.
If the majority thinks that an option switch is more adapted, I won't
fight it strongly. Just please let's not mess up with the behavior of
the existing options.
I think creating crashdumps/ when both -d and -n are specified is a bit
odd but reasonable.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Tom Lane wrote:
Thanks for investigating! I'll go commit that change. I wish someone
would put up a buildfarm critter using VS2013, though.
Uh, isn't that what woodlouse is using?
Well, it wasn't reporting this crash, so there's *something* different.
regards, tom lane
On 30 June 2016 at 07:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Tom Lane wrote:
Thanks for investigating! I'll go commit that change. I wish someone
would put up a buildfarm critter using VS2013, though.
Uh, isn't that what woodlouse is using?
Well, it wasn't reporting this crash, so there's *something* different.
It may only affect the i386 to x86_64 cross compiler. If Woodlouse is using
native x86_64 compilers perhaps that's why?
We've confirmed it on two different versions of VS 2013, so it's not
specific to one minor compiler point release.
It'd be handy if the buildfarm captured the output of:
* cl (no arguments, first line only)
* msbuild /nologo /version
and the env vars:
* VS*COMNTOOLS (* being any 3 digits)
* PROCESSOR_ARCHITECTURE
* PROCESSOR_IDENTIFIER
* PROCESSOR_ARCHITEW6432
since right now it's hard to be totally sure exactly what a VS animal is
building with unless there's a log attached due to a failure.
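As a rough illustration of the kind of report meant here, a small C program built with the animal's own toolchain could print what the compiler and environment say about themselves. This is only a sketch: the function names and the output format are invented for this example and are not part of the buildfarm client. `_MSC_VER`, `_MSC_FULL_VER`, `_M_X64` and `_M_IX86` are MSVC's predefined macros; note that they describe the compiler version and the target, so on their own they do not distinguish the x86_amd64 cross compiler from the native amd64 one.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return the value of an environment variable, or "(unset)" if absent. */
static const char *
env_or_unset(const char *name)
{
    const char *value = getenv(name);

    return (value != NULL) ? value : "(unset)";
}

/* Name the target architecture from predefined compiler macros. */
static const char *
target_arch(void)
{
#if defined(_M_X64) || defined(__x86_64__)
    return "x86_64";
#elif defined(_M_IX86) || defined(__i386__)
    return "x86";
#else
    return "other";
#endif
}

/* Print a short report; the layout is made up for this sketch. */
static void
print_toolchain_report(FILE *out)
{
#ifdef _MSC_FULL_VER
    /* For reference, VS 2013 reports _MSC_VER 1800 and VS 2015 reports 1900. */
    fprintf(out, "_MSC_VER=%d _MSC_FULL_VER=%d\n", _MSC_VER, _MSC_FULL_VER);
#else
    fprintf(out, "compiler is not MSVC\n");
#endif
    fprintf(out, "target=%s\n", target_arch());
    fprintf(out, "PROCESSOR_ARCHITECTURE=%s\n",
            env_or_unset("PROCESSOR_ARCHITECTURE"));
    fprintf(out, "PROCESSOR_IDENTIFIER=%s\n",
            env_or_unset("PROCESSOR_IDENTIFIER"));
    fprintf(out, "PROCESSOR_ARCHITEW6432=%s\n",
            env_or_unset("PROCESSOR_ARCHITEW6432"));
}
```

Calling `print_toolchain_report(stdout)` from a test program would put these lines into the build log on every run, not only on failures.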
That said, TBH I doubt we can or should cover every VS release in every VS
configuration. Especially since there are so many ways you can excitingly
break and mangle VS, particularly when installing multiple VS versions on
one host. It's a great IDE with a truly awful set of installation and
management tools.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer wrote:
On 30 June 2016 at 07:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Tom Lane wrote:
Thanks for investigating! I'll go commit that change. I wish someone
would put up a buildfarm critter using VS2013, though.
Uh, isn't that what woodlouse is using?
Well, it wasn't reporting this crash, so there's *something* different.
It may only affect the i386 to x86_64 cross compiler. If Woodlouse is using
native x86_64 compilers perhaps that's why?
Hmm, so what about a pure 32bit build, if such a thing still exists? If
so and it causes the same crash, perhaps we should have one member for
each VS version running on 32bit x86.
(I note that the coverage of MSVC versions has greatly improved in
recent months.)
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 30 June 2016 at 20:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Craig Ringer wrote:
On 30 June 2016 at 07:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Tom Lane wrote:
Thanks for investigating! I'll go commit that change. I wish someone
would put up a buildfarm critter using VS2013, though.
Uh, isn't that what woodlouse is using?
Well, it wasn't reporting this crash, so there's *something* different.
It may only affect the i386 to x86_64 cross compiler. If Woodlouse is using
native x86_64 compilers perhaps that's why?
Hmm, so what about a pure 32bit build, if such a thing still exists? If
so and it causes the same crash, perhaps we should have one member for
each VS version running on 32bit x86.
It's fine for a pure 32-bit build, i.e. 32-bit tools and 32-bit target. I
tested that.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer wrote:
On 30 June 2016 at 20:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Hmm, so what about a pure 32bit build, if such a thing still exists? If
so and it causes the same crash, perhaps we should have one member for
each VS version running on 32bit x86.
It's fine for a pure 32-bit build, i.e. 32-bit tools and 32-bit target. I
tested that.
Ah, okay. I doubt it's worth setting up buildfarm members testing all
cross-compiles just to try and catch possible compiler bugs that way, so
unless somebody wants to invest more effort in this area, it seems we're
done here.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Jul 1, 2016 at 9:57 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Craig Ringer wrote:
On 30 June 2016 at 20:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Hmm, so what about a pure 32bit build, if such a thing still exists? If
so and it causes the same crash, perhaps we should have one member for
each VS version running on 32bit x86.
It's fine for a pure 32-bit build, i.e. 32-bit tools and 32-bit target. I
tested that.
Ah, okay. I doubt it's worth setting up buildfarm members testing all
cross-compiles just to try and catch possible compiler bugs that way, so
unless somebody wants to invest more effort in this area, it seems we're
done here.
Sure. To be honest just using the latest version of MSVC available for
the builds is fine I think. Windows is very careful regarding
backward-compatibility of its compiled stuff usually, even if by using
VS2015 you make the builds of Postgres incompatible with XP. But
software is a world that keeps moving on, and XP is already out of
support by Redmond.
--
Michael
On 1 July 2016 at 09:02, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Jul 1, 2016 at 9:57 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Craig Ringer wrote:
On 30 June 2016 at 20:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Hmm, so what about a pure 32bit build, if such a thing still exists? If
so and it causes the same crash, perhaps we should have one member for
each VS version running on 32bit x86.
It's fine for a pure 32-bit build, i.e. 32-bit tools and 32-bit target. I
tested that.
Ah, okay. I doubt it's worth setting up buildfarm members testing all
cross-compiles just to try and catch possible compiler bugs that way, so
unless somebody wants to invest more effort in this area, it seems we're
done here.
Sure. To be honest just using the latest version of MSVC available for
the builds is fine I think. Windows is very careful regarding
backward-compatibility of its compiled stuff usually, even if by using
VS2015 you make the builds of Postgres incompatible with XP. But
software is a world that keeps moving on, and XP is already out of
support by Redmond.
I agree. I'm happier now that we've got evidence it's a compiler bug,
though.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services