00:00 Kayden: thought we added that as part of the disk cache stuff
00:00 anholt:hasn't found any such code
00:00 anholt: would love for it to be done on the frontend
00:00 mareko: we have the disk cache, but it doesn't deduplicate live shaders
00:00 Kayden: huh. maybe you're right
00:01 mareko: it requires that shader CSOs do reference counting, which most drivers don't do
00:01 mareko: it's really hard to do in st/mesa
00:04 Kayden: hmm...yeah. I thought st/mesa was refcounting them already
00:04 Kayden: seems like you would have to refcount them for shareable shaders to work?
01:00 zmike: mareko: while on the topic of async shader compiles, has there been any consideration into adding that into tc?
01:14 robclark: side note on async compile.. it makes chrome's gpu-sandbox rather unhappy.. I'm still muddling my way thru that
01:51 baryluk: Does vc4 / v3d work on RPi4 UEFI for anybody? For me it doesn't autoload, and when modprobed, doesn't do anything either (no /dev/dri, no kms, no fb switch, no dmesg output). (Linux 5.11-rc6, RPi4 UEFI 1.22)
02:56 airlied: agd5f: you might need to drop the warning patch from your tree
03:12 xyene: when using EXT_image_dma_buf_import, are there any restrictions/requirements on the physical memory layout of the dmabuf? do things break if the physical memory range is not contiguous?
03:12 mareko: Kayden: wrapping the CSO in a refcounted structure in st/mesa would work
03:13 mareko: Kayden: I don't know if deduplicating gl_program is possible
03:13 xyene: I'm writing a kernel module that hands out dmabufs where the physical pages are not necessarily contiguous, and I'm crashing radeonsi: https://gist.github.com/Xyene/3e479605b465d82e54883c8f699c3346
03:14 mareko: zmike: not considered, it seems orthogonal to what tc is doing
03:15 zmike: ok
03:24 imirkin: xyene: non-contiguous physical pages is pretty common, i'd be surprised if that were enough to upset things. more likely a dma-buf is going where it's not supposed to ... dma-buf's aren't all equal, and aren't all importable whereever you like
03:25 imirkin: the crash itself appears to be a bug in the amdgpu winsys, probably around handling errors
03:25 imirkin: mareko: --^
03:27 imirkin: looks like it's getting a bo == 0x0, so that's probably not going to go too well
03:27 imirkin: but how did it get there? that seems bad.
03:30 imirkin: rbarraud: check channel logs - xexaxo1 had some potentially useful comments for you
03:30 imirkin: https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2021-02-04
03:30 imirkin: (look for 14:59 timestamp)
03:30 mareko: imirkin: it looks like memory corruption
03:31 imirkin: mareko: i think the final frames are some sort of SIGSEGV handler
03:31 xyene: imirkin: thanks for the hints; I'm totally expecting something silly in the kernel module part. For a bit more context, the buffer does seem to import "fine" for a while... sometimes I get some correct frames (for entire minutes), other times it crashes almost immediately
03:31 imirkin: i think the last real one is the amdgpu_cs_add_buffer call (frame #7)
03:32 mareko: malloc crashes, so valgrind is the tool to use
03:32 imirkin: mareko: it's crashing in the sigsegv handling
03:32 imirkin: note how frame #6 is "signal handler called"
03:33 imirkin: i think frame #7 is the one with the problem
03:33 mareko: even before that, the kernel rejects the command buffer
03:33 imirkin: right
03:34 imirkin: but it seems like there are some issues with radeonsi's handling of such a rejection maybe?
03:34 mareko: a rejected command buffer is not a well tested codepath
03:34 imirkin: right
03:34 imirkin: with nouveau, we definitely try to avoid that scenario :)
03:35 mareko: wayland also allows "losing" the context, which drops all future command buffers
03:36 mareko: it's possible the dmabuf is illegal and the driver chokes on it
03:37 imirkin: what would constitute an 'illegal' dmabuf?
03:38 imirkin: xyene: i think your goal should be to understand why the command buffers are being rejected by the kernel
03:38 mareko: all kinds of things, I don't really know
03:39 mareko: bo should never be NULL indeed
03:41 imirkin: xyene: it seems like your additional problems are due to buggy handling of such failures in radeonsi
03:41 imirkin: fixing those, while a good idea, isn't going to get you closer to a working state though
03:42 mareko: it's more likely that the memory is corrupted, which explain the NULL bo, rejected command stream, and malloc crash
04:36 rbarraud: Hi imirkin
04:36 rbarraud: Thanks for your help last night (morning?) :-)
04:36 rbarraud: I'm grovelling thru the system with new insight :-)
04:38 rbarraud: And thanks for the chlog pointer :-)
04:40 rbarraud: xexaxo1: Thanks! :-)
05:15 xyene: imirkin: thanks! I think that put me on the right path... the extant code I'm dealing with looks like it uses one thread to submit frames and another to update the texture to be displayed
05:15 xyene: looks like things go wrong if the texture is updated while eglSwapBuffers is currently running (https://gist.github.com/Xyene/afd675613dcdbf897536550b50d1f462)
05:16 xyene: but I wouldn't expect anything good to happen under these circumstances anyway
09:08 MrCooper: xyene: do the threads calling eglSwapBuffers and glEGLImageTargetTexture2DOES use separate EGL contexts?
11:15 patrik: danvet, can you have a look at that immutable branch from Andy. Seems he really wants it to go through.
11:19 karolherbst: dcbaker: is there an easy way to have a local meson installation to test stuff with without messing up the system or is that painful?
11:19 karolherbst: apparently I have some time today to look into rust again :D
11:38 Sumera[m]: danvet, melissawen: can I use kms_has_vblank(fd) to set test requirements in kms_cursor_crc and kms_vblank?
11:39 Sumera[m]: so that we can just skip the tests if vblank is not supported, like it happens with kms_flip and vblank less mode
11:40 karolherbst: dcbaker: also do you have a nice branch with all the rust stuff I could try out?
11:40 melissawen: Sumera[m], I think so. It was the approach adopted for kms_flip, right?
11:41 Sumera[m]: melissawen: yes, that's why most of the tests skip and don't fail
11:58 melissawen: Sumera[m], looks fair; just to be sure that I got it well.. your goal is not to skip the test case, just skip the wait for vblank, right?
12:15 Sumera[m]: melissawen: um, I am not sure I understood , can you give an example of a test case?
12:16 Sumera[m]: what I meant is, if there's a sub-test that requires vblank, then I want to skip that subtest
12:16 Sumera[m]: a subtest is a test-case right?
12:16 Sumera[m]: * sumera confused
12:30 melissawen: Sumera[m], yes, testcase <-> subtest.. for example, in kms_cursor_crc, if not all, almost all subtests wait for vblank at some point
12:31 Sumera[m]: ah, okay
12:31 melissawen: so if you just skip them, you probably continue to not be able to test the composition
12:33 melissawen: but maybe skipping is your first step, for a next one of enabling test without vblank, is it?
12:35 Sumera[m]: hm, but doesn't composition depend on the vblank wait? As in, if I just skip the wait (and not the testcase) then won't the rest of it fail as well?
12:38 Sumera[m]: if just skipping the wait_for_vblank doesn't affect the rest of the code, then maybe that's a better approach?
12:39 pq: karolherbst, I seem to recall you should be able to simply run meson from its git checkout directly. But if you want to install something from upstream, 'pip3 install --user meson' can get you a personal version.
12:40 karolherbst: pq: ahh, cool
12:50 melissawen: Sumera[m], yes, i think so.. or just for checking if it is possible, since we want to test the vkms_composer proposed for virtual_hw
12:52 Sumera[m]: melissawen: alright, I will go with this approach then
13:04 danvet: patrik, I still don't see the point tbh
13:05 danvet: I replied
13:08 patrik: danvet, thanks, I mostly wanted you to chime in on the thread.
14:32 karolherbst: dcbaker: just tried out the bindgen wrapper and it works great :)
14:33 karolherbst: but now I have to deal with the inclusion of the file somehow...
14:38 karolherbst: ohh, that is addressed by https://github.com/mesonbuild/meson/pull/8236
14:38 karolherbst: let's see
14:39 karolherbst: https://github.com/mesonbuild/meson/pull/8158 actually
15:12 imirkin: mareko: i'm seeing user buffers with buffer.user == null. what do those mean? i think they're new.
15:12 imirkin: mareko: $10 = {stride = 8, is_user_buffer = true, buffer_offset = 0, buffer = {resource = 0x0, user = 0x0}}
15:14 Venemo: has anyone ever managed to get gdb's reverse debugging feature to work with mesa?
15:15 xyene: MrCooper: the thread creating the texture is using eglCreateImage with EGL_NO_CONTEXT and then using that EGLImage with glEGLImageTargetTexture2DOES to update the texture used by the other thread
15:15 xyene: not sure how this works out from a context PoV; the EGLImage creation thread doesn't bind any context
15:25 pq: xyene, glEGLImageTargetTexture2DOES needs the right context current and the correct texture bound.
15:40 baryluk: does somebody have some reference images (or even better high res video) of viewperf11 and 13 somewhere? I have some weird things going on, but I don't know it if is Mesa / GPU bug, or is supposed to be like that.
15:49 imirkin: mareko: looks like 0278d1fa323cf1f289a2c5f4cd803c4203d4a48a is the commit that introduced the badness (gallium: add unbind_num_trailing_slots to set_vertex_buffers)
15:52 imirkin: can't say it's *immediately* apparent to me what the cause of the problem is
15:57 imirkin: ohhh, i think i see it
15:58 imirkin: we need to actively respect unbind_num_trailing_slots in set_vertex_buffers now, or else some masks aren't properly updated
15:58 danvet: airlied, [PATCH] drm/i915: Autoselect CONFIG_CHECKPOINT_RESTORE for SYS_kcmp <- maybe chime in on this one
16:03 imirkin: yeah that fixes it
16:03 imirkin: thanks for listening.
16:10 pendingchaos: since dxvk now sets them by default, I think we will want some way for a nir_opt_algebraic optimization to be inexact but respects DenormFlushToZero/SignedZeroInfNanPreserve
16:10 pendingchaos: any ideas how? we could add a character in front of the first opcode similar to ~, but this seems confusing
16:23 MrCooper: imirkin: maybe should rename the channel to #rubberduck :)
16:23 imirkin: MrCooper: yeah, i'll go there first in the future :)
16:57 zmike: I'm in full support of the #rubberduck community
16:58 zackr: does anyone know if there there any tests/apps to validate DRIVER_GEM support in a drm driver?
17:17 xexaxo1: danvet: how is IGT doing with ^^. IIRC the kernel docs do mention IGT for KMS, but I don't know how we're doing for GEM - worth mentioning something even if it's "no generic GEM tests atm"?
17:19 bnieuwenhuizen: we don't have generic GEM ioctls for creation IIRC so a generic test might be interesting
17:24 xexaxo1: indeed - GEM_CLOSE is common (nothing special) and for a very brief point so was GEM_OPEN
17:32 danvet: xexaxo1, GEM_OPEN still common
17:32 danvet: just a "don't use it" thing
17:32 danvet: zackr, I dont think DRIVER_GEM is even exposed as a uapi concept
17:32 danvet: so kinda hard to test in igt fully generically
17:33 xexaxo1: GEM_OPEN is still a thing? haven't seen any drivers that actually use it
17:37 zackr: danvet: inside the drm fb helpers are using it, yes? so if we implemented in vmwgfx, then switched over to the drm fb helpers and fb stopped working that's a good indication our driver_gem support doesn't go brrr
17:38 zackr: i think... i'm just trying to figure out how to validate it
17:39 danvet: zackr, if igt fb helpers allocated a dumb buffer, they should be using dumb_destroy
17:39 danvet: not gem_close
17:39 danvet: or I'm not sure what you mean ...
17:46 zackr: danvet: the core issue is very simple, vmwgfx is basically the only driver that doesn't expose driver_gem. i'd like us to be a bit better citizens so i'm trying to figure out how much work it would be for us to just implement driver_gem. so i'm just trying to figure out if there's a quick way of going about it without having to do the gl driver in lockstep
17:47 danvet: zackr, export drm_gem_init(), don't set DRIVER_GEM
17:47 danvet: if everything works as I hope it does, this means you can use gem internally
17:47 danvet: with none of the uapi
17:47 danvet: and yeah would be neat if we can remove/inline the bunch of oddball special cases
17:48 danvet: but I think it's not that many really, it's mostly the helpers we've built on the assumption that there's gem
17:48 danvet: so hopefully biggest benefit will be code that can be deleted from vmwgfx and replaced by helpers
18:00 zackr: danvet: thank you
19:44 karolherbst: dcbaker: so uhh... even with the bindgen stuff, how can I wrap around the generated sources if I really need to mess with the generated stuff a little?
19:45 dcbaker: karolherbst: are you trying to use `include()`?
19:45 karolherbst: dcbaker: The rs file I generate doesn't compile on its own
19:45 dcbaker: yeah, this is a huge pain point for rust because cargo and meson have a lot of design mismatches
19:46 karolherbst: I know :/
19:46 dcbaker: The only option right now is to copy files to the build dir
19:46 dcbaker: configure_file(..., copy: true) is your friend :/
19:46 karolherbst: mhhh
19:46 dcbaker: the other option is to compile the generated file into an rlib and then link that with the static sources
19:46 dcbaker: neither of which are great
19:47 karolherbst: well, the generated file won't work on its own, because of silly reasons
19:47 karolherbst: very silly reasons
19:47 dcbaker: I'm thinking that we might need to just have meson's rust targets detect that you've got generated sources and copy the files for you
19:47 karolherbst: so the CL headers forward declare a bunch of types
19:47 dcbaker: but that wont happen this release unfortunately
19:47 karolherbst: which the runtime is supposed to implement
19:47 karolherbst: and bindgen just generates empty structs
19:48 karolherbst: so I disable codegen for them and just include my crates defining those
19:48 karolherbst: but of course other stuff uses those already :/
19:48 karolherbst: it's super annoying
19:48 karolherbst: maybe I find a different solution
19:58 dcbaker: I'd like to get more work done on getting generated sources in rust working for meson 0.58, but we're feature freazing on sunday, so I'm trying to get bindgen in before that
20:03 karolherbst: okay
20:03 karolherbst: I'll play around and see if I can find a viable workaround though
20:26 airlied: danvet: yeah I agree with what you/lucas said, + wow big landmine
20:49 karolherbst: dcbaker: ohhh, there is a "--raw-line" option to bindgen :O
20:49 danvet: airlied, maybe ack the new patch from ickle?
20:49 danvet: I'll throw mine on it too
20:53 danvet: airlied, I guess we need to poke akpm for an ack and then merge
20:54 danvet: maybe even cc: stable or something like that
21:02 karolherbst: dcbaker: ufff :/ I don't think any of those solutions would actually work. Well, except copying a bunch of files around. I implement a couple of structs wihin multiple rs files, but the generated source has to see those... doing the include! with a custom ENV var kind of sounds like the less annoying bit tbh :/
21:03 karolherbst: I could potentially split the project in multiple rlibs though
21:03 dcbaker: env is going to be hugely annoying in meson
21:03 dcbaker: ninja doesn't do env variables by design
21:03 dcbaker: so we're gonna have to wrap rustc
21:03 karolherbst: ahh no, I can't cyclic deps
21:04 karolherbst: ahhh
21:04 karolherbst: this is so annoying
21:04 karolherbst: and I prefer to use CL types instead of wrapping everything
21:04 dcbaker: can you use configure_file() to set the name of the file to include() at configure time?
21:05 karolherbst: I have to mix sources from the source _and_ build dir
21:05 dcbaker: oof
21:05 karolherbst: soo. I generate the CL API stuff
21:05 dcbaker: yeah, I'm going to have to implement the copy thing I think
21:05 dcbaker: that'll be fun
21:05 karolherbst: _but_ when I implement API objects the header expects, I use CL API types
21:05 karolherbst: :)
21:06 dcbaker: but I've got some mesa work that I need to get to first, so it probably wont happen immediately :)
21:06 karolherbst: CL has those "typedef struct _cl_platform_id * cl_platform_id;" typedefs
21:06 karolherbst: but what _cl_platform_id is, is defined by the implementation
21:06 dcbaker: right
21:06 dcbaker: that makes sense, and is also super annoying
21:06 karolherbst: yeah...
21:07 karolherbst: I could probably do not use any of the CL stuff and really wrap it strictly
21:07 karolherbst: mhhh
21:07 dcbaker: I do need to solve the mixed generated/not generated stuff anyway
21:07 karolherbst: but I actually want to use all the enums...
21:07 karolherbst: ahh, okay
21:08 dcbaker: and it's not just rust that has this problem, I think zig will have the same thing
21:08 karolherbst: okay
21:08 dcbaker: not that zig is ready to be actually used yet...
21:09 karolherbst: at least I got rid of most of the lines in my bindings.rs files
21:09 karolherbst: I still hate how it looks...
21:09 karolherbst: rust.bindgen: https://gitlab.freedesktop.org/karolherbst/mesa/-/blob/rusticl/src/gallium/frontends/rusticl/meson.build#L49
21:10 karolherbst: api/bindings.rs: https://gitlab.freedesktop.org/karolherbst/mesa/-/blob/rusticl/src/gallium/frontends/rusticl/api/bindings.rs
21:10 karolherbst: the bindings will go away once we have the copy stuff in place and I can just compile it
21:10 karolherbst: but...
21:10 karolherbst: the rust.bindgen call looks... ufff
21:12 dcbaker: we could add some wrappers around raw_lines and the black/white lists stuff so you could pass in arrays there, not sure if that would make it any nicer though
21:14 karolherbst: not sure either :/
21:14 karolherbst: I probably write some for loop
21:14 karolherbst: but at least there is no need for include! anymore, which is at least some progress
21:15 dcbaker: also not thrilled about adding the terms whitelist/blacklist, or using allowlist/denylist and confusing people 😒
21:15 karolherbst: dcbaker: btw, I saw that bindgen has a --rust-target flag
21:15 karolherbst: might make sense to set it as well
21:16 karolherbst: but I also don't know how this bootstraping in meson is supposed to work anyway.. meaning how to choose the actual rust version
21:16 dcbaker: yeah, unfortunately I'm not sure how to set it
21:16 dcbaker: it doesn't use editions like rustc does
21:16 karolherbst: rust has --edition 2015/2018
21:16 karolherbst: *rustc
21:16 karolherbst: yeah...
21:17 dcbaker: but bindgen uses rustc versions
21:17 karolherbst: maybe just use the version the installed rustc is at?
21:18 karolherbst: but now idea what consequences it has
21:18 karolherbst: but "The default is the latest stable Rust version" soo...
21:18 karolherbst: maybe really just set the version of the used rustc
21:18 dcbaker: yeah, I'm not sure either
21:19 karolherbst: I check git.. to see what the reasons were for adding it
21:19 dcbaker: setting it to the version of rustc you're using (or the next lower one if your is inbetween) seems like a safe bet
21:19 dcbaker: probably should have a switch for that so you can force it to something else though
21:20 karolherbst: maybe
21:20 karolherbst: maybe meson also needs a "you need at least rustc version X" thing?
21:21 dcbaker: Maybe? We do handle editions. you can always open code the rust version check with `rs = meson.get_compiler('rust'); rs.version().version_compare(...)`
21:21 dcbaker:will be back later
21:27 karolherbst: dcbaker: ahh yeah.. bindgen uses the version to know what features to enable. So I think it kind of makes sense to specify a global "you need this rust version" which bindgen would use as well
21:27 karolherbst: otherwise just use the one from the current rustc
21:47 alyssa: anholt: "ERROR - dEQP error: Mesa warning: failed to remap glVertex2hNV" any way to get rid of this spam in deqp-runner output?
21:48 anholt: alyssa: it always prints deqp's stderr, so unless you grep -v it, no.
21:48 anholt: that warning is usually "you've managed to screw up your glapi vs mesa drivers libs"
21:48 alyssa: ack
21:48 alyssa: wait that warning isn't supposed to be there??? :confused_pikachu: :p
21:48 anholt: yes, it's an indication that your env is really busted in a surprisingly unharmful way.
21:49 alyssa: oh boy
21:49 alyssa: I should possibly consider fixing that before trying to do full deqp runs locally...
21:52 alyssa: I'm playing games with LIBGL_DRIVERS_PATH so I guess that'd be why
21:53 imirkin: that warning is a harbinger of extreme badness
21:54 ickle: danvet: same patch for drm/Kconfig: select KCMP ?
21:54 alyssa: imirkin: uh oh
21:54 ickle: or patch 2/2
21:55 imirkin: i agree with anholt except for the "unharmful way" bit
21:55 imirkin: it means that two tables that are supposed to be lined up ... aren't
21:55 imirkin: and so glFoo will randomly call glBar
21:55 danvet: ickle, hm I guess either?
21:55 ickle: one line, might as well sneak it in
21:55 danvet: maybe both together will result in less "wtf why is this cc: stable"
21:55 danvet: yeah
21:55 danvet: also did you see the reply that systemd also wants this?
21:56 ickle: up
21:56 ickle: yup
21:56 danvet: systemd sailing in the shockwave of drm is kinda fun :-)
21:57 danvet: airlied, I guess that explains why we haven't noticed the lack of sys_kcmp that much ^^
21:57 alyssa: imirkin: alyssa@kevin:~/bin$ cat harbinger.sh
21:57 alyssa: scp -i $ODROIDN2_IDENTITY ~/mesa/build/src/mapi/shared-glapi/libglapi.so.0.0.0 root@[$ODROIDN2]:/usr/local/lib/aarch64-linux-gnu
21:57 alyssa: ^^ that did it :p
22:02 imirkin: alyssa: yeah. i used nfs to avoid that problem ;)
22:02 imirkin: i.e. do all the cross-compile on my fast(er) desktop
22:03 imirkin: and then just nfs-mount that on the arm device
22:29 alyssa: anholt: oxidized deqp-runner is a breeze to use, thank you! <3
22:29 Lyude: oh hey that's what I do too
22:30 anholt: alyssa: glad you like it!
22:30 xyene: pq: thanks, I think I have a good handle on what's going wrong now. Thread 1 has a loop roughly like {image = eglCreateImage(context 1); glBindTexture(GL_TEXTURE_2D, texture); glEGLImageTargetTexture2DOES(..., image); eglDestroyImage(image)}
22:30 xyene: Since the image is destroyed, the only reference remaining is the texture binding.
22:31 xyene: Then the second thread's loop uses a second context, and draws with the texture. If this races with thread 1 calling glEGLImageTargetTexture2DOES(..., texture) again, the refcount to the original buffer drops to 0 and is freed, but this might happen while thread 2 is within eglSwapBuffers and things die
22:31 xyene: At least, keeping the EGLImage around and not destroying it keeps the refcount > 0, and the crash doesn't happen
22:48 marex: kusma: hi, I'm looking at bc0222d471b ("compiler/nir: add texcoord replace lowering pass")
22:49 anholt: airlied: looks like asan catches lvp use-after-free in about the same spot that my zink-on-lvp gles cts runs do.
22:49 marex: kusma: are those ishl/iand/etc. opcodes really gonna work on GPUs that have no integer instructions (e.g. some gles1/2 vivante GPUs dont have those)
22:49 anholt: (https://gitlab.freedesktop.org/anholt/mesa/-/jobs/7134965/raw look for update_scene_state)
22:49 kusma: marex: probably not, no
22:49 marex: oh
22:50 kusma: marex: Also, I might end up reverting the automatic nature of the feature, see: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4198
22:50 kusma: marex: the former can probably be fied, though
22:51 alyssa: I like asan.
22:52 marex: kusma: I am trying to figure out what to do about https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8618
22:52 imirkin: marex: should be easy to adjust into multiplies / divides on such gpu's though
22:52 marex: kusma: I was thinking about applying the NIR texcoord lowering pass, but well ...
22:52 imirkin: marex: iirc there's a "native integers" bit somewhere, but not sure where
22:52 marex: imirkin: is that applicable on such GPUs at all ?
22:52 imirkin: well
22:52 imirkin: divide-by-const is also multiply-by-a-different-const
22:53 imirkin: and they all definitely support multiplying float values
22:53 marex: I mean, if I understand this correctly, this shifting is there because of the indirection (?)
22:53 imirkin: and generally don't support (actual) integers at all
22:53 marex: so does that indirection even applies on the old GPUs ?
22:53 kusma: marex: hmm, yeah... So I guess the problem is to map the lowering-bitmask to something that works everywhere...
22:53 imirkin: i haven't looked at the details tbh
22:54 marex:still has barely any clue
22:54 imirkin: if it's trying to use a bitmask as a bitmask in the shader, it's going to be very dissapointed.
22:54 imirkin: that won't work without native integers.
22:54 kusma: marex: By the way, this will all be constants unless we're using dynamic addressing for the texcoord... Not sure if that can happen in your backend or not...
22:55 kusma: So *maybe* constant-folding solves everything for you
22:55 marex: kusma: currently I'm still seeing issues even with the eglretrace
22:55 imirkin: kusma: you mean like gl_TexCoord[indirect] right?
22:55 kusma: imirkin: yeah
22:55 imirkin: i don't know offhand if that's legal in GLSL 120
22:55 imirkin: it might or might not be.
22:56 imirkin: someone would have to rtfs
22:57 kusma: marex: BTW, seems like your version is essentially what I started with :P
22:58 marex: kusma: I was working on that for ... well ... a long time, with help from imirkin and jekstrand, it was difficult all right
22:59 marex: kusma: so I might as well ask a different question, the version I use ends up with correct rendering of both neverball and the eglretrace pointsprite.trace
22:59 marex: kusma: but ...
22:59 MrBIOS:wishes there was a PCI-X video card that did OpenGL reasonably well
22:59 kusma: marex: Yeah, so I *think* my version should work for you in all cases where your version is valid. But in cases where it isn't, you'll be stuck with some shifts that you can't compile in my version. In your version, you'll simply do the wrong thing.
23:00 marex: if I replace https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8618/diffs?commit_id=3ad03b2ab102752b4d9ae6576fc85cf7bd27c8e4#c7cb3abafce1f6875bc8b14e9567930bdc201802_1093_1098 with NIR_PASS_V(s, nir_lower_texcoord_replace, v->key.sprite_coord_enable, false); , the rendering is slightly corrupted
23:00 marex: the eglretrace shows this burst of points, and sometimes the entire point in that burst is full white for a short time
23:00 kusma: marex: Hmm, that sounds odd. Neverball was exactly the case that I wrote this one for :P
23:01 kusma: And it renders seemingly correct for me...
23:01 marex: neverball is correct, yes
23:01 marex: the eglretrace is weird
23:01 kusma: I'm not sure what that's referring to, TBH
23:01 marex: I use MESA_GLSL_CACHE_DISABLE=true ETNA_MESA_DEBUG=nir NIR_PRINT=1 eglretrace pointsprite.trace on https://github.com/ptitSeb/gl4es/blob/master/traces/pointsprite.tgz
23:02 marex: you can open it in qapitrace, it is a lot of pointsprites
23:02 kusma: I'm afraid I can't do that now, and I'm off for the next four weeks...
23:03 kusma: But I'll take your word for it.
23:03 marex: kusma: all right, no worries, I'll keep digging as time permits
23:03 kusma: marex: Great. Did you ever see shift ops left after constant-folding, BTW?
23:03 marex: kusma: just one more question if that's OK with you
23:03 kusma: marex: shoot!
23:04 marex: kusma: constant-folding ? remember, I am still wrestling with the basics of nir, really
23:04 marex: kusma: do I have to set PIPE_CAP_POINT_SPRITE in screen to 1 or 0 if I use the nir_lower_texcoord_replace pass ?
23:05 kusma: marex: well, in the cases where these two passes do the same (no dynamic indexing), all of the inputs here are constants, so it should all constant-fold away.
23:06 kusma: marex: constant folding just optimizes constant instructions by evaluating them before codegen
23:06 marex: so I should see those removed with NIR_PRINT I guess ?
23:07 kusma: so "ishl(1, iadd(1, 1))" in NIR gets replaced with "4"
23:10 marex: kusma: and isn't that 4 a native integer instead of float ?
23:11 imirkin: there's a phase at which all integers are auto-converted to float things
23:11 imirkin: i forget precisely where that happens though
23:11 alyssa: lower_int_to_float
23:12 kusma: marex: yeah, but that should be lowered later on, like imirkin suggests :)
23:17 marex: kusma: aha, etnaviv only does int_to_float on older GPUs
23:18 imirkin: all GLES3-capable GPUs support native integers
23:18 imirkin: GLES3 / GL 3.0 is the cut-off for integer support.
23:28 kusma: marex: Right, then I guess you should be good for integer support in the first place :)
23:33 marex: kusma: hmmmmmm, there is something with that index stuff that is weird on etnaviv
23:33 marex: kusma: maybe that is a problem with etnaviv and native integers though
23:33 kusma: marex: :/
23:34 kusma: marex: well, there could also be some detail I got wrong, it's fairly new code...
23:36 marex: kusma: it could also be etnaviv bug
23:36 marex: or bug in my understanding of it
23:37 imirkin: marex: are you on an ES3.0 GPU or ES2.0?
23:38 marex: https://paste.debian.net/hidden/ba7c08b3/ so this makes the rendering corruption go away
23:38 marex: imirkin: right now, gc2000, it should be gles3 capable with native integers
23:38 marex: (should)
23:39 alyssa: kusma: I should test with bifrost huh
23:39 imirkin: marex: dunno why that'd cause problems then
23:40 marex: imirkin: also, based on what I gathered from the discussion here, it should be optimized out
23:40 marex: hmmmmmm
23:40 marex: lemme dump the shader assembler
23:40 imirkin: and the nir too probably
23:43 imirkin: kusma: btw, you don't appear to have handled the origin bit in that lowering pass?
23:44 imirkin: you're supposed to replace with (s,t) or (s,1-t) -- the flipping is a feature added in GL 3.0 iirc, but it's needed "natively" for rendering on fbo vs winsys
23:51 anholt: mirkin: I guess if this lowering is applying after the existing nir_lower_pntc_ytransform, yeah.
23:51 imirkin: anholt: oh, that will adjust gl_PointCoord?
23:52 imirkin: yeah, i guess it would.
23:52 imirkin: except that this pass fetches a fresh point coord
23:52 imirkin: you can't control it per thing, can you
23:53 imirkin: right, no, it's global
23:53 imirkin: but i don't think it affects gl_PointCoord...
23:54 imirkin: anholt: i'm talking about: glPointParameteri(GL_POINT_SPRITE_COORD_ORIGIN, GL_LOWER_LEFT);
23:54 imirkin: added in GL 3.2, it seems
23:54 imirkin: one of those things which didn't have their own extension... sigh
23:54 anholt: everything to do with point sprites: sigh.
23:54 alyssa: imirkin: That might be partially my fault.
23:55 imirkin: alyssa: that it wasn't in its own ext?
23:55 alyssa: Bifrost has a per-FS flag to flip gl_PointCoord, despite not having support for replacing varyings with it.
23:55 imirkin: you have more pull at khr than i thought :)
23:55 alyssa: lol
23:55 imirkin: so ... flipping gl_PointCoord is *also* a thing
23:56 imirkin: which is just unrelated to this thing.
23:56 alyssa: indpenedently?
23:56 imirkin: yea
23:56 alyssa: <field name="Point sprite coord origin max Y" size="1" start="27" type="bool"/>
23:56 alyssa: seems pretty much this thing
23:56 imirkin: seems that way, right
23:56 imirkin: but it isn't :)
23:56 alyssa: Midgard had full support for point sprites in hw so.. :p
23:56 imirkin: gl_PointCoord was GLES's answer to sprite coord replacement
23:57 imirkin: the reason that sprite coord replacement is so horrid is that it harkens to a fixed function age
23:57 alyssa: Mali isn't really a GLES part, so much as a smattering of GL/GLES/VK/CL/DX functionality that's almost enough to implement anything but not quite there.
23:57 imirkin: GLES obviously didn't have to deal with that, so it wasn't bound to *having* to texture with gl_TexCoord[]
23:57 imirkin: so they just threw in a gl_PointCoord, which is the "point sprite" value
23:58 imirkin: but with desktop GL, you have to replace random gl_TexCoords
23:58 imirkin: apparently someone still gave a shit about this by the time GL 3.2 rolled around, and decided to make it even worse, by allowing this per-gl_TexCoord flipping
23:58 imirkin: but gl_PointCoord is separate from that, and has to be flipped when the fbo is flipped (i.e. winsys fbo)
23:58 alyssa: ahhhhh
23:59 alyssa: so that flag is to avoid variants for fbo/onscreen?
23:59 imirkin: (even though it's the same value, deep down inside, the GLSL-level thing is separate, and they're separate varyings)
23:59 imirkin: presumably yes