07:05 LiquidAcid: Venemo, well, that helped a lot, getting a proper backtrace with Xnine
07:05 Venemo: cool
07:07 LiquidAcid: crashes in iris_disk_cache_init, curious...
07:37 orbea: LiquidAcid: does xnine crash in the same place with llvmpipe?
07:37 orbea: (i forgot about xnine earlier...)
07:39 LiquidAcid: orbea, same thing with LIBGL_ALWAYS_SOFTWARE=true
07:40 LiquidAcid: #8 0xed073b67 in pipe_iris_create_screen (fd=7, config=0xffffcc8c) at ../mesa-20.0.2/src/gallium/auxiliary/target-helpers/drm_helper.h:47
07:40 LiquidAcid: #9 0xed1b563b in pipe_loader_drm_create_screen (dev=0x829c570, config=0xffffcc8c) at ../mesa-20.0.2/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:323
07:40 LiquidAcid: doesn't change the fact that the pipe loader loads iris
07:41 Venemo: LiquidAcid: how do you reproduce this problem? sorry but I'm afraid I missed your intro
07:41 orbea: ah, interesting
07:43 LiquidAcid: Venemo, initially i was just calling ninewinecfg
07:43 Venemo: on my machine, I think 'wine ninewinecfg' just works.
07:43 LiquidAcid: however i couldn't get a good backtrace out of winedbg, and the process memory map suggested the faulting code was somewhere in llvm
07:44 LiquidAcid: Venemo, yeah, that's what all folks keep telling me
07:44 LiquidAcid: i then remembered there being a native test application, and the rest is history
07:45 Venemo: so. what kind of system is this? which kernel, mesa, llvm and wine versions do you use? and how did you install nine?
07:46 LiquidAcid: this is an x86 32-bit chroot, distro is gentoo, mesa-20.0.2, llvm-9.0.1, vanilla kernel 5.4.28
07:47 orbea: wine-5.4 iirc?
07:47 LiquidAcid: yep, wine is 5.4 vanilla
07:47 Venemo: what hardware?
07:47 LiquidAcid: 00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06)
07:48 LiquidAcid: OpenGL renderer string: Mesa DRI Intel(R) HD Graphics 530 (Skylake GT2)
07:48 Venemo: and does the stuff work outside of the chroot?
07:48 LiquidAcid: yes
07:49 orbea: do other graphics contexts work in the chroot? opengl or vulkan?
07:49 Venemo: okay so it would seem this is an intel specific issue that their stuff doesn't work in a chroot
07:49 LiquidAcid: also, it was all working inside the chroot until i migrated from the wine-d3d9 package to what we now have in portage
07:50 Venemo: what is portage
07:50 orbea: gentoo repo
07:50 LiquidAcid: the package manager
07:50 orbea: they have wine-nine-standalone now
07:51 Venemo: I'm afraid I haven't any experience with gentoo, forgive me
07:51 LiquidAcid: well, this seems to be a mesa problem, i just haven't found a way to trigger it without using nine
07:51 Venemo: does iris work with other stuff inside the chroot? is nine the only thing that fails?
07:52 orbea: try glxgears and vkcube
07:52 LiquidAcid: glxgears works, that's actually the first thing i tried before analysing this whole mess
07:53 orbea: it's past my bedtime, good luck!
07:53 LiquidAcid: libGL: MESA-LOADER: dlopen(/usr/lib/dri/iris_dri.so)
07:53 Venemo: is this a debug build of mesa? if yes, can you give me the full backtrace from the crash you see?
07:53 LiquidAcid: sure, one moment
07:54 Venemo: please use a pastebin kind of site
07:54 Venemo: (ie. don't dump the whole thing into IRC)
07:56 LiquidAcid: https://www.math.uni-bielefeld.de/~tjakobi/archive/xnine_bt.txt
07:57 LiquidAcid: i know, i've been using irc for years
07:57 Venemo: sorry, I didn't know
07:57 Venemo: I'm pretty sure this is not a debug build, stuff is <optimized out>
07:58 LiquidAcid: this is with -Og -ggdb -g -gdwarf-2 -gstrict-dwarf
07:58 LiquidAcid: which is the preferred way to debug nine problems according to your wiki
07:58 LiquidAcid: anyway, the optimized out bits don't matter, id_sha1 is 0x10
07:59 Venemo: can you tell me the git hash of the mesa version you use?
08:00 LiquidAcid: https://gitlab.freedesktop.org/mesa/mesa/-/tags/mesa-20.0.2
08:01 LiquidAcid: i also tried with git tip, but no change
08:01 LiquidAcid: my next steps would be to step in build_id_find_nhdr_for_addr() and see why it returns NULL
08:02 Venemo: this makes no sense at all.
08:02 Venemo: so according to your backtrace, it hits an assertion failure here: #5 0xef794a11 in iris_disk_cache_init (screen=0x80a3b98) at ../mesa-20.0.2/src/gallium/drivers/iris/iris_disk_cache.c:259
08:02 LiquidAcid: you mean stepping into the function?
08:03 Venemo: but looking at that line here: https://gitlab.freedesktop.org/mesa/mesa/-/blob/mesa-20.0.2/src/gallium/drivers/iris/iris_disk_cache.c#L259 there is no assertion at all
08:03 LiquidAcid: yes, without assertions enabled it segfaults
08:03 Venemo: can you please make a real debug build
08:03 LiquidAcid: https://gitlab.freedesktop.org/mesa/mesa/-/blob/20.0/src/gallium/drivers/iris/iris_disk_cache.c#L249
08:03 LiquidAcid: this is the assertion that is triggered
08:03 Venemo: the bt says 259
08:04 LiquidAcid: it's still that assertion
08:05 Venemo: ok...
08:05 LiquidAcid: which is pretty obvious, note is NULL here
08:05 LiquidAcid: the big question is, why?
08:06 Venemo: can you try running it with INTEL_DEBUG=shader_time in order to disable the shader cache?
08:06 LiquidAcid: i already did
08:06 Venemo: did it work then?
08:06 LiquidAcid: obviously it solves the problem
08:07 LiquidAcid: or rather, it works around it
08:07 Venemo: okay
08:07 Venemo: there seem to be a number of reasons why build_id_find_nhdr_for_addr can return NULL, I think your best bet is to run the thing with gdb and see which case it hits.
08:11 Venemo: I would do it myself but I don't have a way to reproduce the problem
08:11 LiquidAcid: no that's fine, i can handle gdb just fine
08:12 Venemo: okay.
08:12 Venemo: if you do that, it'd be also interesting to see if glxgears also hits iris_disk_cache_init and if yes, why does it work
08:16 LiquidAcid: yep, it does, but one step at a time
08:32 LiquidAcid: hmm, that's weird
08:33 LiquidAcid: so first of all, with glxgears the function is just called once, but with Xnine it's called twice, and the second time it faults
08:33 LiquidAcid: apparently dl_iterate_phdr() fails when that happens
08:35 Venemo: that's weird
08:36 Venemo: does it try to create the screen twice for some reason?
08:36 LiquidAcid: looks a bit similar to this one: https://bugs.freedesktop.org/show_bug.cgi?id=110757
08:37 LiquidAcid: the first call is triggered by glXChooseVisual
08:38 Venemo: mhm
08:39 LiquidAcid: that happens somewhere in libsdl2, which i haven't built with debug symbols
08:39 LiquidAcid: https://hastebin.com/zomusajofe.shell
08:41 LiquidAcid: i wonder if this is really due to all this being called twice, or due to build_id_find_nhdr_for_addr() behaving differently
08:41 LiquidAcid: the first time:
08:42 Venemo: I don't know
08:42 LiquidAcid: (gdb) p info
08:42 LiquidAcid: $1 = {dli_fname = 0x80cb8c0 "/usr/lib/dri/iris_dri.so", dli_fbase = 0xf57bf000, dli_sname = 0x0, dli_saddr = 0x0}
08:42 LiquidAcid: the second time:
08:42 LiquidAcid: (gdb) p info
08:42 LiquidAcid: $3 = {dli_fname = 0x80a2640 "/usr/lib/d3d/d3dadapter9.so", dli_fbase = 0xef435000, dli_sname = 0x0, dli_saddr = 0x0}
08:43 LiquidAcid: is there something missing in d3dadapter9.so that lets dl_iterate_phdr() fail?
08:43 Venemo: I'm unsure
08:45 LiquidAcid: https://bugs.freedesktop.org/show_bug.cgi?id=110757#c4 <- could this help?
08:45 LiquidAcid: https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/gallium/targets/d3dadapter9/meson.build#L60 <- i don't see this here
08:45 Venemo: worth a shot
08:55 LiquidAcid: huzzah!
08:56 Venemo: if that fixed it, please open a MR to mesa
08:58 LiquidAcid: will do
08:58 LiquidAcid: i'm just wondering why no one else is seeing this
08:59 Venemo: maybe you hit some edge case in your chroot? I don't know honestly
08:59 LiquidAcid: well, what edge case could that be?
09:00 Venemo: haven't a clue
09:00 LiquidAcid: imo you should be able to reproduce this with an iris-capable device by removing the shader cache prior to launching Xnine
09:00 Venemo: why does it work outside your chroot?
09:01 LiquidAcid: seems the shader cache depends on the driver launching it, and there is no nine on the host
09:02 LiquidAcid: remember that data in build_id_find_nhdr_for_addr() depends on the .so
09:02 Venemo: "no nine on the host" ?
09:02 Venemo: what does that mean?
09:04 LiquidAcid: uh, i don't understand the question
09:04 LiquidAcid: mesa in the chroot is built with nine, mesa on the host is NOT built with nine
09:05 Venemo: you said earlier that the problem is not present outside of the chroot
09:05 Venemo: how did you test it outside the chroot if you don't have nine outside the chroot?
09:05 LiquidAcid: correct, iris works perfectly on the host
09:05 LiquidAcid: well, the usual suspects, glxgears, ut2004, doom3, etc.
09:06 Venemo: so you didn't actually test on the host
09:08 LiquidAcid: look, i'm ending this discussion, in particular since i haven't even received any help from you at all, you just keep on questioning my methods and you haven't contributed anything valuable at all
09:09 LiquidAcid: all i hear is, i haven't got a clue, i don't know, i'm unsure -- how is this supposed to be helpful?
09:10 Venemo: sorry I couldn't help more
09:10 Venemo: have a nice day
09:12 Venemo: I'd still be interested in learning the solution if you find it.
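
For context on why a NULL note matters: the lookup discussed above searches the shared object that an address resolves to for a GNU build-id note, so a library without such a note yields nothing to hash. The following is a minimal Python sketch of the note-parsing step only, not mesa's implementation. It assumes little-endian note records with 4-byte padding, as on x86; find_build_id and make_note are hypothetical names introduced for illustration.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type used by GNU toolchains for build IDs


def _align4(n):
    """Round n up to the next multiple of 4 (note fields are 4-byte padded)."""
    return (n + 3) & ~3


def find_build_id(note_bytes):
    """Return the build ID from a raw ELF note region, or None if absent.

    Assumption: little-endian records laid out as
    namesz, descsz, type (4 bytes each), then padded name and desc.
    """
    offset = 0
    while offset + 12 <= len(note_bytes):
        namesz, descsz, ntype = struct.unpack_from("<III", note_bytes, offset)
        offset += 12
        name = note_bytes[offset:offset + namesz]
        offset += _align4(namesz)
        desc = note_bytes[offset:offset + descsz]
        offset += _align4(descsz)
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\x00" and len(desc) == descsz:
            return desc
    return None  # no build-id note in this region


def make_note(name, desc, ntype):
    """Build one note record (hypothetical helper, used only for the demo)."""
    name_z = name + b"\x00"
    header = struct.pack("<III", len(name_z), len(desc), ntype)
    padded_name = name_z.ljust(_align4(len(name_z)), b"\x00")
    padded_desc = desc.ljust(_align4(len(desc)), b"\x00")
    return header + padded_name + padded_desc


if __name__ == "__main__":
    with_id = make_note(b"GNU", bytes(range(20)), NT_GNU_BUILD_ID)
    without_id = make_note(b"GNU", b"\x00" * 16, 1)  # some other note type
    print(find_build_id(with_id))
    print(find_build_id(without_id))
```

A caller that asserts on the result, or dereferences it without checking, fails in exactly the second case, which matches the symptom described in the log.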