01:00Company: so....
01:00Company: Vulkan Push Constants - what's the best equivalent in GL?
01:01Company: because the web says "use a buffer" but my profiler says Mesa spends 20% of the time updating the buffer
01:02Company: so I'm starting to suspect the web is wrong
01:02airlied: push constants are pretty much glUniform equiv
01:03airlied: ubos could be used
01:03Company: ubos is what I'm using
01:03Company: but the glBufferSubData() call causes the 20% CPU usage
01:04airlied: might need to allocate the buffer differently
01:04Company: I tried glBufferData() too, that was even slower
01:05Company: but I can try just using a bunch of uniforms
01:06Company: I could even try one uniform that looks like an array? Does that matter?
01:06airlied: did you try glBufferData with different usage?
01:07Company: nope
01:08Company: I used DYNAMIC_DRAW
01:08Company: could try the other ones
01:09airlied: STREAM_DRAW might be an option for your use case
01:17zmike: yeah should be fast for drivers that implement buffer invalidation
01:17Company: it doesn't tc_buffer_unmap() anymore, but it still spends quite some time in tc_improve_map_buffer_flags() => tc_invalidate_buffer() => si_buffer_create()
01:21Company: I'll play with using glUniforms instead of UBOs
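For reference, the glUniform route being considered here could be sketched roughly like this (fragment only, not runnable standalone: it assumes an existing GL context and linked program, and the uniform names are hypothetical):

```
/* Hypothetical per-draw "push constant" block expressed as plain uniforms.
 * glUniform* data travels with the command stream, so no buffer object
 * upload (and none of the tc_invalidate_buffer machinery) is involved. */
GLint loc_mvp   = glGetUniformLocation(prog, "u_mvp");    /* mat4 */
GLint loc_color = glGetUniformLocation(prog, "u_color");  /* vec4 */

glUseProgram(prog);
glUniformMatrix4fv(loc_mvp, 1, GL_FALSE, mvp);
glUniform4fv(loc_color, 1, color);
glDrawArrays(GL_TRIANGLES, 0, vertex_count);
```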
02:11soreau: another mole bites the dust :P
02:14soreau: daniels: ping on !25358
03:06SolarAquarion: Good evening!
03:06SolarAquarion: mesa/src/compiler/clc/meson.build:92:21: ERROR: File /usr//usr/lib/clc/spirv-mesa3d-.spv does not exist.
03:06SolarAquarion: this is a fun build error
03:07SolarAquarion: why the hell would it be trying to find something in /usr/usr/ there's nothing in /usr/usr/ ever?
03:08mattst88: sounds like you've specified --prefix wrong or something like that
03:08SolarAquarion: mattst88, it's the basic arch-meson wrapper file
03:10mattst88: you'll have to read the code to figure out what's going wrong -- DYNAMIC_LIBCLC_PATH in src/compiler/clc/meson.build
03:11SolarAquarion: mattst88, Dynamic_libclc_path is /usr/lib/clc/
03:11SolarAquarion: i find it in that file, but it's static, and so it aint finding it
03:12SolarAquarion: i wasn't finding libclc during the non-static build either
03:12SolarAquarion: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9930
03:12SolarAquarion: mine
03:14mattst88: what does your /usr/share/pkgconfig/libclc.pc file show for libexecdir=?
03:16SolarAquarion: mattst88, the libclc that was built from vanilla has a broken libexecdir
03:16mattst88: it has libexecdir=/usr//usr/lib/clc ?
03:16SolarAquarion: libexecdir=/usr//usr/lib/clc
03:16SolarAquarion: yes
03:17mattst88: yep, there's the problem
03:17SolarAquarion: that's bizarre.
03:18mattst88: like I said -- sounds like you've specified --prefix wrong or something like that :)
03:18SolarAquarion: mattst88, the start of the issue is in llvm/libclc
03:18mattst88: yes, sounds like the problem is in the packaging of libclc
03:19SolarAquarion: mattst88, i packaged datadir as /usr/usr/ and it was installed to /usr
03:22mattst88: I don't know what you mean exactly, but gentoo configures only with -DCMAKE_INSTALL_PREFIX=/usr
04:40SolarAquarion: mattst88, -DCMAKE_INSTALL_PREFIX=/usr \
04:40SolarAquarion: -DCMAKE_INSTALL_DATADIR=/lib \
04:47soreau: SolarAquarion: can you explain what problem you're having again? what package are you trying to build and what command are you using to do it?
04:56SolarAquarion: soreau, i fixed the issue that i was having. it's because i changed some stuff which caused everything to fail
04:58soreau: glad you got it sorted
05:15SolarAquarion: [1344/3593] Generating src/intel/vulkan/grl/gfx125_bvh_build_leaf_primref_to_procedurals.h with a custom command (wrapped by meson to set env)
05:15SolarAquarion: FAILED: src/intel/vulkan/grl/gfx125_bvh_build_leaf_primref_to_procedurals.h
05:15SolarAquarion: env MESA_SHADER_CACHE_DISABLE=true /build/mesa-git/src/build/src/intel/compiler/intel_clc -p dg2 --prefix gfx125_bvh_build_leaf_primref_to_procedurals -e primref_to_procedurals --in ../mesa/src/intel/vulkan/grl/gpu/bvh_build_leaf.cl --in /build/mesa-git/src/mesa/src/intel/vulkan/grl/gpu/libs/lsc_intrinsics_fallback.cl -o src/intel/vulkan/grl/gfx125_bvh_build_leaf_primref_to_procedurals.h -- -cl-std=cl2.0 -D__OPENCL_VERSION__=200
05:15SolarAquarion: -DMAX_HW_SIMD_WIDTH=16 -DMAX_WORKGROUP_SIZE=16 -I/build/mesa-git/src/mesa/src/intel/vulkan/grl/gpu -I/build/mesa-git/src/mesa/src/intel/vulkan/grl/include -include opencl-c.h
05:15SolarAquarion: (file=input,line=0,column=0,index=0): Type mismatch on symbol "store_uint4_L1WB_L3WB" between imported variable/function %77 and exported variable/function %9718.
05:15SolarAquarion: [1345/3593] Generating src/intel/vulkan/grl/gfx125_bvh_build_leaf_primref_to_quads.h with a custom command (wrapped by meson to set env)
05:15SolarAquarion: FAILED: src/intel/vulkan/grl/gfx125_bvh_build_leaf_primref_to_quads.h
05:15SolarAquarion: env MESA_SHADER_CACHE_DISABLE=true /build/mesa-git/src/build/src/intel/compiler/intel_clc -p dg2 --prefix gfx125_bvh_build_leaf_primref_to_quads -e primref_to_quads --in ../mesa/src/intel/vulkan/grl/gpu/bvh_build_leaf.cl --in /build/mesa-git/src/mesa/src/intel/vulkan/grl/gpu/libs/lsc_intrinsics_fallback.cl -o src/intel/vulkan/grl/gfx125_bvh_build_leaf_primref_to_quads.h -- -cl-std=cl2.0 -D__OPENCL_VERSION__=200 -DMAX_HW_SIMD_WIDTH=16
05:15SolarAquarion: -DMAX_WORKGROUP_SIZE=16 -I/build/mesa-git/src/mesa/src/intel/vulkan/grl/gpu -I/build/mesa-git/src/mesa/src/intel/vulkan/grl/include -include opencl-c.h
05:15SolarAquarion: (file=input,line=0,column=0,index=0): Type mismatch on symbol "store_uint4_L1WB_L3WB" between imported variable/function %77 and exported variable/function %9718.
05:15SolarAquarion: [1346/3593] Generating src/intel/vulkan/grl/gfx125_bvh_build_leaf_create_HW_instance_nodes.h with a custom command (wrapped by meson to set env)
05:15SolarAquarion: FAILED: src/intel/vulkan/grl/gfx125_bvh_bu
05:15SolarAquarion: i should've put this in a pastebin
05:15SolarAquarion: soreau, latest build failure
05:16airlied: yes known problem with llvm17 and intel-clc, no solution yet
05:16SolarAquarion: so intel-clc is currently bugged
05:19soreau: SolarAquarion: you might want to use a pastebin service
05:20soreau: but maybe you should use llvm16 for now or something
05:48airlied: mattst88: did you say reverting fb5ecbb4fe9d9f58afee341116def699f3bb8341 made llvm17 work for you or was that just some intermediate llvm?
05:48airlied: I'm playing with that and I get clang creation errors passing that to llvm 17
05:49mattst88: airlied: reverting works with a snapshot of llvm before 17.0.0, I think.
05:49mattst88: I haven't tried with 17.0.0 final, but I can do that
05:53airlied: mattst88: okay I'm trying with 17.0.2
05:53airlied: but am just rebuilding things to make sure
06:36airlied: mattst88: I think I have a hackaround
06:44airlied: mattst88: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25536
06:44airlied: Kayden, dj-death ^ not sure who else to ping :-P
07:03dj-death: airlied: nice, thanks for that
09:40karolherbst: airlied: mhhh.. I found the reason why llvmpipe has bad CPU utilization... the bigger the block is, the more likely it becomes that llvmpipe starves...
09:41karolherbst: not sure if that's fixable in a sane way
09:41karolherbst: but maybe llvmpipe could decouple subgroups within blocks, and just keep enqueuing "subgroups" from other blocks if one finishes or something? not sure if that's feasible
10:25karolherbst: gfxstrand: mind if we land the three nir patches from https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24764 ? Apparently users are running into this issue and it helps with kernel compilation speeds. We should keep the rusticl patch for when copy_deref gets fixed as well.
11:13karolherbst: airlied: also.. I have a CL app which is like 30x faster with pocl than llvmpipe :')
11:13karolherbst: and it's just running kernels
11:16karolherbst: but it's also doing crypto inside the kernel, so maybe pocl ends up using crypto extensions
11:17karolherbst: but maybe the way we optimize things is also pretty terrible, and maybe the function stuff we are doing now is slowing things a lot
15:13Company: so, I did some sysprof on my renderer, and one thing where a bunch of time is spent is a function that essentially memcpy()s stuff into the vertex buffer
15:14Company: but that function takes <1% on Vulkan and >10% on GL
15:14Company: I copy into a buffer mapped with glMapBuffer(), did I screw things up somehow?
15:34Company: huh fun, on Intel they're both <1%, this is only AMD
15:35HdkR: Are you using glBufferStorage to set the `GL_CLIENT_STORAGE_BIT`?
15:36HdkR: Smells like you're accessing vram across PCIe
15:42zmike: https://basnieuwenhuizen.nl/the-catastrophe-of-reading-from-vram/
15:42Company: so far I'm just creating a GL_ARRAY_BUFFER and mapping it for writing
15:45Company: I played with WRITE_ONLY and READ_WRITE and STATIC_DRAW and DYNAMIC_DRAW but that didn't seem to do anything - though I only looked at framerate and didn't run a full profile
15:45HdkR: Indeed, and it's probably ending up in VRAM. So you're writing a bunch of data across the PCIe bus, which is slow as snot
15:45Company: yeah, that's not supposed to happen
15:46HdkR: That's what glBufferStorage is for. Giving the driver more context about what you're wanting to do
15:46Company: but glBufferStorage is GL 4.4+ only
15:47HdkR: Also glMapBufferRange
15:47Company: am I supposed to use CPU memory and glBufferData() otherwise?
15:47HdkR: Pretty much
15:48HdkR: glBufferData or glBufferSubData depending
15:48HdkR: glBufferData will have the driver orphan buffers and do shadow copies of your data, glBufferSubData will theoretically do magic sub-range tracking and updating but mileage may vary :P
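The orphan-then-upload pattern described above might look like this (fragment only, assumes an existing GL context and buffer object; GL_STREAM_DRAW chosen per the earlier suggestion):

```
glBindBuffer(GL_ARRAY_BUFFER, vbo);
/* Orphan: respecifying the store with NULL lets the driver hand back
 * fresh memory instead of stalling on data the GPU may still be reading. */
glBufferData(GL_ARRAY_BUFFER, size, NULL, GL_STREAM_DRAW);
/* Then upload this frame's data into the fresh store. */
glBufferSubData(GL_ARRAY_BUFFER, 0, size, data);
```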
15:49Company: MapBufferRange looks good
15:50Company: I mainly care about GLES 3 and Mesa/nvidia GL (which is usually GL4)
15:50Company: though the oldest junk people use with Gnome seems to be around GL 3 - we've had a few bugs when people's GPUs didn't do samplers
15:51HdkR: without `GL_CLIENT_STORAGE_BIT` you're likely still going to shoot yourself in the foot
15:52HdkR: But luckily AMD things also usually support GL 4
15:52karolherbst: glBufferStorage can be used with an extension almost everybody provides though
15:52karolherbst: even gl3 implementations
15:53karolherbst: we even had it in the mesa 10 days :D
15:54karolherbst: anyway, any competent GL driver has GL_ARB_buffer_storage
15:54HdkR: It's even in GLES land with GL_EXT variant which is nice
15:54mareko: GL2 has it too
15:54karolherbst: I'd consider any implementation not providing it as a broken one
15:54karolherbst: :P
15:55Company: I shall be using that then, with a fallback
15:55karolherbst: yeah, it makes your life easier
15:56mareko: I think virgl can't do ARB_buffer_storage
15:56mareko: and svga
15:56mareko: that's likely it
15:57karolherbst: mhhh.. yeah.. I guess any VM driver might have issues implementing it properly
15:57karolherbst: though it should be possible in theory
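A rough sketch of the glBufferStorage path discussed above (fragment only, assumes a context exposing GL 4.4 or GL_ARB_buffer_storage, or GL_EXT_buffer_storage on GLES):

```
/* Request CPU-visible, persistently mappable storage so per-frame vertex
 * writes don't land in VRAM across the PCIe bus. */
GLbitfield flags = GL_MAP_WRITE_BIT
                 | GL_MAP_PERSISTENT_BIT
                 | GL_MAP_COHERENT_BIT
                 | GL_CLIENT_STORAGE_BIT;
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferStorage(GL_ARRAY_BUFFER, size, NULL, flags);
void *ptr = glMapBufferRange(GL_ARRAY_BUFFER, 0, size,
                             GL_MAP_WRITE_BIT |
                             GL_MAP_PERSISTENT_BIT |
                             GL_MAP_COHERENT_BIT);
/* ptr stays valid across frames: memcpy() vertex data into it, with a
 * fence (glFenceSync/glClientWaitSync) before reusing in-flight ranges. */
```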
15:58MrCooper: Company: if you draw only once using the vertex data you write, you should probably use GL_STREAM_DRAW
15:58Company: the other option is going to malloc() + glBufferData() when unmapping
15:59Company: MrCooper: I'm reusing the buffer for a later frame
15:59Company: but it's unclear to me how people interpret those STREAM vs DYNAMIC flags
16:00karolherbst: a use case where I used buffer storage was one where I updated the buffer significantly more often than it was used on the GPU (multiple hundred thousand times a frame), but on the GPU side it didn't matter which intermediate state was visible
16:00Company: and how they are meant to map to coherent vs cached vs whatever flags in Vulkan
16:00MrCooper: then does the upload time really matter? A buffer in VRAM might be faster for the draws
16:01Company: I don't know - but currently I get 4x the framerate with my Vulkan code than with my GL code
16:01Company: and I'm trying to figure out why
16:02MrCooper: seems unlikely that 10% CPU usage alone could explain 1/4 frame rate
16:02soreau: Company: tried zink for kicks?
16:03Company: MrCooper: I have more than 1 suspicion of slowdowns - but this here is one thing
16:04Company: and when the framerate goes up here I can go look at the next thing
16:18linkmauve: https://opengles.gpuinfo.org/listextensions.php puts the EXT extension at 36% coverage, that makes me sad all of a sudden.
16:19HdkR: linkmauve: That's okay, you can ignore those proprietary blobs :P
16:19karolherbst: linkmauve: which puts that extension into the top 15% of most supported extensions :P
16:20HdkR: Android ecosystem is wild
16:20karolherbst: yeah, and broken
16:20karolherbst: the official adreno driver doesn't support that ext?
16:20karolherbst: how shameful
16:21karolherbst: I suspect google never required it to be there
16:21boofi: Any reason why I have started getting whole system freezes while playing any demanding game?
16:21boofi: It seems to be a GPU reset that only happens with RADV, not amdvlk
16:22boofi: Oh wait im not supposed to be asking here i forgot
17:18zmike: eric_engestrom dcbaker: is it time yet to create a milestone for the next release to start tagging issues?
17:21mareko: karolherbst: Qemu supports direct buffer mappings, but only Venus uses them because it's required by Vulkan, not VirGL
17:22mareko: karolherbst: how else do you think we run radeonsi+radv with the virtio-gpu kernel driver in the guest :)
17:23karolherbst: mhh, yeah, fair
17:25anholt: wait, do we have progress on virtio-gpu for amd?
17:35bnieuwenhuizen: anholt: not sure about progress but there was https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21658
17:39anholt: I saw that one go up. just seems stalled right now?
17:40mareko: anholt: it will be updated, let me just tell you that development hasn't stopped since then
17:41anholt: glad to hear
17:58Company: I seem to be not smart enough for GL buffers
17:58Company: fastest method seems to be malloc() + glBufferSubData()
17:58Company: and ignoring all the fancy glMapBufferRange() stuff
17:58zmike: unsurprising
17:59Company: I was assuming that's leading to an extra memcpy()
17:59zmike: it is, but if you can't set the memory domain for your buffer then that's still faster
18:00Company: how does that just work in Vulkan but GL can't make it work?
18:00zmike: because memory allocation in vulkan is explicit by design
18:02Company: ah, you mean because GL has to care about syncing everything all the time and that makes things hard to get right
18:04zmike: no I mean in GL you're probably trying to write to buffers that use memory not optimal for cpu writes but in VK if you do that it's because you allocated your buffer wrong
18:05Company: I was assuming that the fancy glBufferStorage() and glMapBufferRange() flags made the memory optimal
18:05Company: that's why I tried avoiding the glBufferSubData() in the first place
18:05zmike: I haven't read the backlog
18:06Company: I'm abstracting the GTK Vulkan renderer to gain a GL backend
18:06Company: so I'm basically trying to find GL calls that match the Vulkan stuff I do as well as possible
18:07Company: and after I did that, I'm getting 600fps on GL and 2500fps on Vulkan
18:07Company: and now I'm investigating
18:08Company: yesterday I was looking at UBOs as push constant replacement being slow
18:08Company: today was buffers being slow
18:18krumelmonster: Hi, I get a segmentation fault within mesa, called from imgui (called from my lovely software I want to open source soon) and I've been told to report it to the mesa project however I'm not certain this is a mesa issue and I have no idea how I would isolate the issue from my codebase https://github.com/ocornut/imgui/issues/6896
18:26anholt: have you run under valgrind? also, make sure you have a debugoptimized or debug build of mesa, since you seem to be using a personal build but don't have debug symbols.
18:27anholt: crocus_resource_create_for_buffer doesn't look like something that should crash on misuse of gl api
18:29krumelmonster: This is arch's 23.2.1-1 with debug symbols from debuginfod -- but actually I will install arch's mesa-debug package now so I can use my IDE's debugger from here on. I'm not running under valgrind or anything of the sort
18:30krumelmonster: Here's the offending line in mesa 23.2.1 https://gitlab.freedesktop.org/mesa/mesa/-/blob/mesa-23.2.1/src/gallium/drivers/crocus/crocus_resource.c#L658
18:31krumelmonster: Do you think I should try to make the issue reproducible by dissecting the imgui draw list?
18:32krumelmonster: Should I already open an issue in the mesa tracker with only the little information I have so far?
18:39vsyrjala: seems to be the only unchecked crocus_alloc_resource(). so calloc() failed somehow?
18:41tnt: krumelmonster: does gdb tell you `res` is NULL ?
18:48krumelmonster: sorry tnt I moved to bare gdb so I could benefit from debuginfod. I'm in frame 0 but `print res` tells me No symbol "res" in current context.
18:49tnt: You can try to disassemble around the crash to see exactly which opcode failed (to make sure the line estimate from gdb is accurate) and print the register values.
18:53krumelmonster: Is there no way I can get to see the locals in gdb like I usually do? info locals says "No locals." for all of the mesa part of this
18:54krumelmonster: If neccessary I could try a different build of mesa, this crash typically happens if I just wait
18:56tnt: Not sure how the arch debug thing works and whether it's supposed to include full locals info.
18:56tnt: But for sure you can build mesa locally as debug build and use that.
18:57krumelmonster: It does usually. It compiles with debug symbols, then strips them and ships them separately. I've never seen this not work like this before
19:04krumelmonster: I assume that debug symbols are enabled by default
19:13airlied: optimisations will screw up debug even with symbols
19:14krumelmonster: should I pass any flags to meson for a particularly debugable build?
19:25airlied: buildtype debug
19:28krumelmonster: This was the default apparently as ctrl+c, `meson --buildtype debug ..` then `ninja` didn't start it over
19:34krumelmonster: While waiting for this, back in gdb I can see that the segfault is caused by movl $0x0,0xbc(%r12) where %r12 is a null-pointer if I'm doing this correctly. At least print $r12 says $7 = 0
19:39krumelmonster: So that makes a lot of sense with ISL_TILING_LINEAR being 0
19:42vsyrjala: there is some special meson command if you want to change any build knobs after the fact. i never remember it so i just rm -rf the build directory instead
19:43pendingchaos: "meson configure"
19:44tnt: But the only way that pointer would be null would be if calloc returned NULL
19:47airlied: sounds like you are using all the ram
19:47airlied: or possibly all the vm space
19:48karolherbst: is this 32 bit by any chance?
19:48krumelmonster: no
19:49krumelmonster: and free says 2.3Gi available. But maybe it's my application that has the memory leak. I'm currently running it with the fresh mesa build in hopes for it to crash soon :)
19:50karolherbst: well.. the thing is, allocations don't fail just because you are out of RAM
19:50karolherbst: unless you configure your system in such a way
19:50karolherbst: and running out of VM would be kinda unlikely, but still possible
19:52krumelmonster: It would be very odd for the calloc to fail, and really, if res were null it would be more likely to segfault at line 657, the line just before
19:52karolherbst: with optimizations everything is possible
19:52karolherbst: could be reordered
19:53airlied: just add an assert and see
19:53karolherbst: could also be something entirely else
19:53karolherbst: *different
19:54krumelmonster: I just hope it crashes soon and that my new build has "better" debug symbols
20:10krumelmonster: It doesn't seem to be willing to crash with my own build. Is there anything else I could do to write a useful report in the mesa gitlab?
20:13psykose: did you build 23.2.1 with the same config as arch or just master?
20:14psykose: the arch dbg for mesa doesn't have locals because it's -g1 so it's bt only https://gitlab.archlinux.org/archlinux/packaging/packages/mesa/-/blob/main/PKGBUILD?ref_type=heads#L136
20:14psykose: regardless to make sure it's almost the same thing you could use the same args as that and same tag
20:15krumelmonster: I built master but *now it crashed
20:15krumelmonster: `info locals` says `res = 0x0`
20:17psykose: means indeed calloc returned a null
20:21krumelmonster: Sadly I missed out on the "meson configure" so the parameters to crocus_alloc_resource are all optimized out... Am I wasting your time?
20:22HdkR: Sounds like if you just enable swap then you'll stop running out of memory
20:23psykose: what does `sysctl vm.overcommit_memory` return for you
20:24krumelmonster: vm.overcommit_memory = 0
20:24psykose: okie, normal
20:24psykose: hm
20:24psykose: not exactly sure then
20:25psykose: are you able to view the memory usage of the system while you run the thing the way that crashes it, and see if you actually do magically run out of 'available' right before it happens?
20:28krumelmonster: not for this past run
20:30krumelmonster: I still have to wait 20 minutes for my actual mesa debug build to finish so that's plenty of time to set up some monitoring
20:33psykose: something also worth a try is doing vm.overcommit_memory=1 just to see if it behaves different
20:33psykose: (note that =0 is also "overcommit enabled", the heuristic is just different)
20:34psykose: sadly i'm not sure whether there are other reasons calloc would give you a null; in glibc's allocator (or whatever else) it's a bit more complicated than just 'does the computer have memory' etc..
20:35psykose: hm
20:35psykose: you might also have ran out of `sysctl vm.max_map_count` perhaps, i think on exhaustion that also fails
20:35karolherbst: there is also this thing to check for malloc corruptions
20:35krumelmonster: Might this be a hardware issue? sublime text and firefox have been crashing recently, and from what I'm doing with this laptop those might be the two applications shoveling the most memory
20:35psykose: it could be
20:35krumelmonster: I'll memtest64 tonight
20:35karolherbst: well.. unlikely though
20:36karolherbst: why would glibc decide to not allocate memory all of a sudden
20:36karolherbst: nah.. 99% of the cases where one thinks it's a hardware issue, it's not :P
20:36karolherbst: you can assume hardware issues don't exist unless you have solid proof
20:37karolherbst: krumelmonster: what's errno right after the failing calloc call?
20:37krumelmonster: I handed out the same type of laptop I have to friends, and there has been a memory defect before, as proven by memtest64. Don't remember how it showed though
20:37karolherbst: it apparently indicates why it fails
20:38karolherbst: not sure if that's part of the posix standard though
20:38karolherbst: ahh it is
20:38karolherbst: `The setting of errno and the [ENOMEM] error condition are mandatory if an insufficient memory condition occurs.`
20:39psykose: would be interesting if it's not set
20:39krumelmonster: karolherbst I guess I'll have to modify crocus_alloc_resource and then hit the issue again in order to retrieve the errno, right?
20:39karolherbst: nah, should be fine
20:39karolherbst: just run with gdb and if you hit the error do "p errno"
20:40krumelmonster: 12
20:40psykose: that's enomem
20:40krumelmonster: well that's awkward
20:41karolherbst: quite so
20:41karolherbst: I guess memory consumption and everything is in order?
20:41karolherbst: might also be some random cgroup stuff or something
20:41psykose: it still doesn't really mean anything that we weren't already guessing
20:41psykose: many things that can cause that..
20:41karolherbst: krumelmonster: do "p calloc(8)"
20:42karolherbst: ehh wait
20:42karolherbst: "p calloc(8, 1)"
20:42karolherbst: or rather 1, 8
20:42karolherbst: whatever
20:42krumelmonster: $3 = (void *) 0x5555562e3bb0
20:42karolherbst: mhh
20:42psykose: do 1, 256
20:43krumelmonster: (gdb) p calloc(8, 1) $3 = (void *) 0x5555562e3bb0 (gdb) p calloc(1, 8) $4 = (void *) 0x55555605e530
20:43krumelmonster: (gdb) p calloc(1, 256) $5 = (void *) 0x555556e84120
20:43psykose: hm
20:43psykose: what's the size of that struct
20:43karolherbst: mhhhh
20:43karolherbst: maybe VM fragmentation going on?
20:44psykose: yeah that was a guess i was having but idk why it wouldn't just go fetch more memory when failing to find a big enough fragment
20:44karolherbst: "p calloc(1, sizeof(struct crocus_resource))"
20:44tnt: or calloc worked and something overwrote the result with null ?
20:44karolherbst: the best feature of gdb is that you can throw random C expressions at it
20:45karolherbst: nah, there is no code for it
20:45karolherbst: unless it's a weirdo OOB write
20:45psykose: if calloc worked and the result was overwritten errno would probably not be 12
20:46karolherbst: but yeah.. you could also do "break crocus_alloc_resource" then "p crocus_alloc_resource(pscreen, templ)" and single step through crocus_alloc_resource
20:46karolherbst: yeah..
20:46karolherbst: and that
20:46krumelmonster: (gdb) p calloc(1, sizeof(struct crocus_resource)) $6 = (void *) 0x0
20:46karolherbst: okay
20:46psykose: yeah
20:46tnt: ah yeah good point about errno
20:46karolherbst: so it does fail to allocate
20:46psykose: could u just print the sizeof by itself so it's clear how big it is for fun
20:47karolherbst: this is what they call remote gdb debugging, isn't it?
20:47psykose: :D
20:47krumelmonster: (gdb) p sizeof(struct crocus_resource) $7 = 520
20:47psykose: thic
20:47krumelmonster: :D
20:47psykose: so 256 is small enough but 520 is too big
20:47psykose: and, it does not just get more mappings
20:47psykose: i wonder if you can step into expr?
20:47karolherbst: you can
20:47psykose: like, step into a `calloc(1, 520)` from it
20:47karolherbst: just have to set a breakpoint first
20:47psykose: and see why it gives you a null
20:48karolherbst: _and_ have debug symbols for glibc :)
20:48psykose: aye
20:48tnt: Can you check the current process mappings ?
20:48karolherbst: ahh good idea
20:48krumelmonster: By the way I ran the software again but with the debug build and I closed some big browser windows so system RAM should be out of question.
20:48HdkR: `cat /proc/<pidof app>/maps | wc -l` Might have hit the mapping max :)
20:48karolherbst: "info maps" or was it "info mem"?
20:48karolherbst: something something
20:48tnt: info proc mappings
20:48karolherbst: ahh
20:49karolherbst: krumelmonster: mhhhh...
20:49karolherbst: okay...
20:49karolherbst: there is this thing
20:49krumelmonster: oh ah wow hm
20:50psykose: krumelmonster: what's info proc mappings say
20:50karolherbst: krumelmonster: "p malloc_trim(0)"
20:50krumelmonster: there's a gazillion of `0x7fffd40f3000 0x7fffd40f4000 0x1000 0x106404000 rw-s anon_inode:i915.gem`
20:50krumelmonster: in info proc mappings
20:50psykose: paste it somewhere
20:50karolherbst: after the trim, do the calloc resource thing again
20:50psykose: get the number first
20:50karolherbst: ahh yeah
20:50krumelmonster: it's so many I'll have to figure out how to even get to the start of the list... Give me a sec
20:50karolherbst: yeah. pastebin the mappings first
20:50psykose: and see if it's at 65536 :p
20:51karolherbst: ohhhhh
20:51karolherbst: uhm
20:51karolherbst: wasn't there a limit for this kinda stuff?
20:51psykose: it's the default for max_map_count which would explain why it can't get more
20:51psykose: yes, sysctl vm.max_map_count
20:51karolherbst: it's 65530 here
20:51krumelmonster: My alacritty scrollback buffer is big but it's not big enough for this
20:51karolherbst: but yeah
20:51karolherbst: yeah...
20:52psykose: fedora these days was looking to default it to infinite and have cgroups instead
20:52karolherbst: ahh
20:52karolherbst: fair enough
20:52karolherbst: but yeah.. having too many mappings would explain it
20:52psykose: i've had to debug a lot of "OOM" issues that had 100's of gbs free
20:52karolherbst: pain
20:52tnt: So now question is where do they come from :)
20:53karolherbst: maybe tons of GL contexts created?
20:53karolherbst: maybe opengl was dlopened 1000 times?
20:53karolherbst: maybe it's just doing a looot of small allocations?
20:53karolherbst: maybe the app causes GPU resources to not be freed?
20:53karolherbst: something along those lines probably
20:54psykose: krumelmonster: set logging file xxx.log
20:54psykose: then type it
20:54psykose: should output to a file
20:56psykose: ah also needs set logging on i think
20:57krumelmonster: I think it's too big for my paste service, just a sec
20:58psykose: curl -F'file=@xxx.log' https://0x0.st
21:00krumelmonster: thanks psykose, the three others I tried said 413 https://0x0.st/HWpg.log
21:01psykose: yep
21:01psykose: you have the exact limit
21:01psykose: outta maps
21:01psykose: my guess from way up top was right
21:01psykose: makes sense in retrospect since there's only like 4 kinds of ENOMEM you can hit
21:01psykose: soo.. uhh
21:01psykose: idk how glibcs allocator ended up in this state
21:02psykose: i guess it's one of those really common pathological cases where it fragments itself to death
21:02psykose: you can avoid it with just `sysctl vm.max_map_count=500000` as root
21:02psykose: at least it hides it :p
21:02krumelmonster: heh
21:02karolherbst: psykose: but most of the mappings are GPU things, no?
21:02psykose: i guess
21:02psykose: hm
21:03psykose: yeah i guess it's weird it's 90% i915
21:03psykose: there's something to fix somewhere
21:03psykose: since this takes time to reproduce, could you do a uhh
21:03psykose: small bump
21:03psykose: to like =100000
21:03psykose: i bet it just takes 2x as long and still crashes?
21:04psykose: if so, something is leaking the maps
21:04krumelmonster: interestingly imgui seems to have a workaround for precisely this part of its draw list code for intel but it's only active on windows https://github.com/ocornut/imgui/blob/1450d23b60a92713de2d1969b665d09ca8ac83b2/backends/imgui_impl_opengl3.cpp#L324
21:05psykose: hm
21:05psykose: idk anything about that personally but it doesn't smell like it's for this
21:06krumelmonster: ok.
21:06psykose: you can try, if that codepath builds, to just make it #if 1 and see if it changes something, but eh
21:07krumelmonster: The reason I'm investigating it is that I want to open source this project this week. Not that it's really usable feature-wise, but I thought I'd tackle the crashes first. Therefore hiding it is not so much of interest :)
21:07psykose: makes sense
21:08karolherbst: krumelmonster: maybe there is a bug in your application and you end up doing weirdo things to GL
21:08psykose: always possible
21:08krumelmonster: Absolutely possible, I have no idea what I'm doing karolherbst
21:09psykose: also for future self reference, the 4 ways i was thinking of are cgroup limit, rlimit(lol), max_map_count, and vm.overcommit_memory=2 (actual checked overcommit)
21:09krumelmonster: Then again I'm pretty sure I don't create any resources whatsoever when the application is just running like when I wait for this issue
21:09psykose: seems to almost always be 3) tho
21:10psykose: and 'actual' "i ran out of memory" moments are less "enomem" and more "oom killer gave you a bad day and nuked your unsaved libreoffice"
21:10krumelmonster: yes I'd always get the latter
21:11tnt: krumelmonster: you can monitor the number of mappings while your application is running and see if it's a slow leak or a lot at once or ...
21:12krumelmonster: Say my application did have a leak of OpenGL resources, could I see this somehow?
21:12psykose: valgrind might know? idk if you tried it
21:12psykose: it's usually quite good at finding memory issues, especially leaks
21:13psykose: there's sanitizers too, asan+lsan is similar for that
21:13krumelmonster: I haven't. I thought we were considering a leak of OpenGL resources, as in glCreate* something in the mainloop and never delete it, or the like
21:14psykose: worth a try
21:15krumelmonster: By the way: Here's a backtrace where it crashed somewhere else most definitely for the same reason http://ix.io/4Ibe
21:16psykose: if it's for the same reason then it's the same reason
21:17anholt: valgrind's massif is more likely to help than sanitizers.
21:18anholt: but probably better would be porting over the iris_dump_bo_list under INTEL_DEBUG(DEBUG_SUBMIT) and using that to figure out what BOs are being allocated, which may be a hint.
21:19anholt: gallium's refcnt leaking stuff might help, but it's a pain to get stood up so I wouldn't start that way, just dump the BOs in use at submit time and try to infer things from their names
21:26Lynne: is there an equivalent of #dri-devel, but for those working on future vulkan extensions
21:26Lynne: I'd like to both complain and praise a proposed extension
21:27krumelmonster: Thank you for all the advice. I'll try to figure out how to read ms_print/massif output. Here's a sample (I didn't wait for the crash) http://ix.io/4Ibj
21:28krumelmonster: well actually the ascii art in the beginning says it all probably^^
21:40krumelmonster: https://0x0.st/HWfr.png makes it look like there's a vao leak
21:42krumelmonster: oh and I think I know who created it... awkward, very awkward...
21:43karolherbst: yeah.. that would do it
21:44psykose: fingers pointing at self? :D
21:44krumelmonster: yes
21:44tnt: 01_shader_audio/src/instreammanager.cpp:155 ?
21:44krumelmonster: yes
21:49krumelmonster: Despite all the time wasted, I hope you can be happy about having taught me many things. I might have been able to find this myself, but rather by taking my code apart until it was fine than with structured use of gdb and valgrind, so thank you for showing me how it's done :)
22:46anholt: lina: have you done an updated git tree of intel gfx prms?
22:58HdkR: I love that Matrix adds another layer of netsplit madness