00:23 ccr: hrm. this seems wrong .. $22 = {Name = 0x60300002f680 "state.light[-8472].position", Type = PROGRAM_STATE_VAR, Padded = true, DataType = 0, Size = 4, StateIndexes = {101, -8472, 125, 0, 0}, UniformStorageIndex = 0, MainUniformStorageIndex = 0}
00:26 ccr: https://gitlab.freedesktop.org/mesa/mesa/-/commit/2f5d18403a4d51a2cd927c141884361850bad41d seems to have somehow broken Zink at least with my test program.
00:39 mareko: zmike: initially 15% higher score with drawoverhead, later I added more optimizations, so the improvement is higher now; it's important to note that draw_vbo in radeonsi had many conditional blocks for older generations that are not needed on the latest chips, and templated draw_vbo removes those cases
00:40 zmike: mareko: nice
00:40 mareko: zmike: does vulkan have a test similar to drawoverhead?
00:40 zmike: well there's always zink+drawoverhead :D
00:40 zmike: but no, not that I'm aware of
00:41 mareko: if you wanna beat radeonsi in drawoverhead, you need a separate thread for vulkan within zink and maybe some vulkan extensions for multi draws and to remove feature emulation
00:42 zmike: yeah the multi draw thing is pretty big
00:42 bnieuwenhuizen: feature emulation?
00:42 mareko: gl features not in vulkan
00:42 zmike: I've been considering something like adding a thread to tc to dump draws into cmdbufs faster
00:43 zmike: doing effectively a full tc on top of tc seems like pain
00:44 bnieuwenhuizen: I think on the noop-case almost half of the radv overhead also comes from tracing/debug checks
00:44 bnieuwenhuizen: stuff like the RGP traces or extra cache flushes on debug
00:44 mareko: bnieuwenhuizen: can you measure the number of draws per second?
00:45 bnieuwenhuizen: with GL drawoverhead + zink?
00:45 mareko: no, just vulkan
00:46 bnieuwenhuizen: probably could but don't have such a test at the moment
00:46 imirkin: seems like a vk-drawoverhead shouldn't be extremely difficult to compose.
00:46 bnieuwenhuizen: yeah
00:46 imirkin: not all of it will port over, but definitely some of it
00:46 bnieuwenhuizen: just kinda averse to starting it at 2 AM ...
00:47 zmike: haha
00:47 bnieuwenhuizen: so will be a problem for later today
00:47 zmike: mareko: I only checked out the raw draw counts in drawoverhead because I was optimizing the state change ones in the piglit version
00:48 zmike: still using the unigine heaven benchmark for graphical testing
00:48 zmike: I'm catching up, but you've still got a pretty good lead for now
00:48 mareko: radv is missing opts to match radeonsi in heaven perf
00:49 zmike: oh?
00:49 mareko: on gfx10.3
00:49 bnieuwenhuizen: well, he's running it with tess off
00:49 bnieuwenhuizen: so I presume the culling for tess doesn't really matter
00:50 mareko: probably not
00:51 bnieuwenhuizen: (assuming that was the opt you referred to)
00:51 mareko: yes
00:51 mareko: it increases rasterization performance in general, so non-tess cases are affected too, but it's hard to tell how much
00:52 mareko: shaders need to be primitive-bound or attribute-bound to see a benefit, not pixel-bound
00:52 mareko: *shaders=apps
00:55 zmike: tbh the perf gap is the same with or without tessellation
00:56 zmike: relatively speaking
00:56 zmike: mareko: ^
00:57 mareko: ok
00:57 bnieuwenhuizen: well also in his setup we load the GPU < 70% so hence we're looking at CPU stuff :P
01:05 mareko: I'm considering adding a thread into tc that only does draw merging and sits between st/mesa and the driver
01:06 zmike: the overhead of the merging is noticeable in drawoverhead, is it noticeable in real world usage?
01:07 mareko: it's only noticeable if you don't have multi draws
01:07 mareko: it decreases overhead for radeonsi
01:07 zmike: hm
01:08 mareko: for display lists, I think the merging+driver overhead is bigger than st/mesa overhead, so splitting that work might help
01:09 zmike: I was just mentioning it because I saw the merging stuff come up when I was profiling today
01:09 anholt: Plagman: thanks! https://gitlab.freedesktop.org/gfx-ci/tracie/traces-db/-/merge_requests/27
01:09 mareko: zink_draw_vbo emulates multi draws, so it gets no benefit
01:10 zmike: you're looking at the master version
01:10 zmike: I assume
01:10 mareko: 1-2 days old
01:10 mareko: yes
01:11 zmike: I'm talking about https://gitlab.freedesktop.org/zmike/mesa/-/blob/test/src/gallium/drivers/zink/zink_draw.c#L599
01:11 bnieuwenhuizen: zmike got this branch a 100+? patches ahead of master ;)
01:11 zmike: closer to 600 :/
01:12 zmike: this is the multidraw https://gitlab.freedesktop.org/zmike/mesa/-/blob/test/src/gallium/drivers/zink/zink_draw.c#L304
01:12 zmike: though I was experimenting with using indirects here to simulate a real multi
01:12 zmike: so you'll have to imagine it as looping regular draws
01:12 mareko: indirects have a perf hit on our hw compared to normal draws
01:13 zmike: yeah, they have a perf hit everywhere haha
01:13 bnieuwenhuizen: due to CP processing speed?
01:13 zmike: I've abandoned the experiment
01:13 bnieuwenhuizen: and yes we noticed
01:13 mareko: bnieuwenhuizen: I don't know if CP is the culprit, so let's say the pipeline frontend
01:13 imirkin: on nvidia there's a funny quirk in that if you have a *ton* of instances, it's actually faster to do it indirect than direct
01:14 imirkin: (like, say, 100k instances)
01:14 zmike: huh
01:14 imirkin: it's the difference between writing 100k draws to a command buffer
01:14 bnieuwenhuizen: imirkin: so of course the driver converts that case into an indirect on an uploaded value?
01:14 imirkin: and having the CP do the looping itself
01:14 bnieuwenhuizen: oh draws not instance
01:14 imirkin: bnieuwenhuizen: lol, no. but it could/should call the macro in that case
01:14 mareko: zmike: a multi draws vulkan ext should solve that
01:15 imirkin: the "indirect" macro has nothing indirect about it, in practice
01:15 imirkin: i dunno where the cutoff is
01:15 imirkin: bnieuwenhuizen: well, each instance is written as a separate draw command
01:15 imirkin: hence the reason why CP can sometimes be faster
01:15 zmike: mareko: yeah, though that I think just helps with the drawoverhead case and not necessarily with real world usage?
01:16 zmike: I guess I'd have to actually do it to see
01:16 mareko: zmike: it helps viewperf
01:16 imirkin: and thus the world
01:16 mareko: gl world
01:17 zmike: something to look into for sure
01:29 mareko: radeonsi vs zink on master + local commits: https://pastefs.com/pid/264685
01:31 zmike: zink on master still does a full fence on every flush call, calls flush() randomly throughout the driver, and doesn't use barriers correctly, so that's not exactly a fair comparison :p
01:32 mareko: right
01:32 zmike: my test branch is the current bleeding edge, and zink-wip is like the 'stable' version of that
01:57 ccr:rubs his head harder.
02:33 bl4ckb0ne: is Emil Velikov in this chan?
02:34 airlied: xexaxo1: ^
07:40 kusma: ccr: are you sure? I don't think zink uses live_index, so that would be very surprising to me...
07:59 danvet: airlied, forgot to push out the -misc-next merge?
07:59 danvet: just so we stop breaking linux-next so badly :-)
09:30 tzimmermann: danvet, you wanted to merge a patch into misc-fixes before the PR (?)
09:30 danvet: tzimmermann, build testing the push as we speak
09:30 tzimmermann: ok
09:30 danvet: airlied, just remembered you said you're out, I'll process the misc and amdgpu -next pulls
09:45 danvet: tzimmermann, pushed
10:05 xexaxo1: bl4ckb0ne: greetings
10:41 j4ni: danvet: Lyude: need to do something about http://lore.kernel.org/r/20210120105715.4391dd95@canb.auug.org.au
10:42 j4ni: the fix is simple enough, the edid quirks don't affect msm
10:44 j4ni: the problem is that 7c553f8b5a7d ("drm/dp: Revert "drm/dp: Introduce EDID-based quirks"") is in drm-intel-next, not drm-misc-next
10:44 j4ni: the simple fix would be to just apply the msm change to drm-intel-next, but do two wrongs make a right...?
10:47 danvet: j4ni, I think two wrongs make a right here
10:48 danvet: a-b: me for the fix for merging through drm-intel
11:08 j4ni: danvet: http://lore.kernel.org/r/20210120110708.32131-1-jani.nikula@intel.com
11:08 j4ni: I didn't even build it...
11:10 danvet: j4ni, assuming you do build test, lgtm
12:00 tzimmermann: danvet, i forwarded -misc-fixes to 5.11-rc4 but now i see all these patches in the pull request. can you forward drm-fixes as well?
12:01 danvet: tzimmermann, done
12:03 tzimmermann: works now. thanks!
12:10 danvet: airlied, I think we should backmerge -rc5 btw, a pile of semi-embarrassing fixes have landed that we dont have in -next
12:17 danvet: also some smaller conflicts, but pulling in the fixes probably more important
13:08 ccr: kusma, well, I did a bisect and that was the first offending commit. but while digging (with my very limited understanding), I think there is something else going wrong in .. not sure exactly where, but in src/mesa/state_tracker/st_nir_lower_builtin.c::get_variable() there's tokens[1] = nir_src_as_uint(path->path[idx]->arr.index); which seems to return a completely bogus (negative) value.
13:10 ccr: kusma, while I do not understand this much at all, I _suspect_ that it might have something to do with the fact that the index values in the SSA forms of the NIR instructions seem wrong when looked at with gdb
13:10 ccr: but it puzzles me because if something is indeed this broken, it really should show up more and not just in a silly OpenGL test program of mine :/
13:11 kusma: Sounds pretty puzzling, yeah...
13:11 ccr: so maybe I'm doing something "wrong" and it works on GL but combined with Zink it segfaults
13:12 kusma: ccr: well, even an invalid program shouldn't trigger segfaults, so I suppose *something* is up in mesa here :P
13:13 ccr: if you want to try, hg clone https://tnsp.org/hg/forks/gldragon/ && cd gldragon && make ASAN=1 (reveals itself best with ASAN it seems) && ./gldragon -g cube.scene
13:13 kusma: I don't really have the time for that right now, but maybe some other day :)
13:13 ccr: with MESA_LOADER_DRIVER_OVERRIDE=zink etc
13:13 ccr: np :)
13:21 ccr: in any case, it does not seem like just memory corruption, the values are way too consistent for that. one thing I also noticed is that the nir_src_as_uint() call noted hit unreachable("Invalid bit size") in nir_const_value_as_uint() with bit_size being 96. so .. shrug.
13:22 ccr: and reverting 2f5d18403a4d51a2cd927c141884361850bad41d on top of head makes this issue go away, though also makes other weird things happen due to later changes :)
14:38 ajax: i kind of wish for a version of unreachable() that was a build error if the compiler could prove that the statement was in fact reachable
14:38 ajax: or even might be reachable
14:38 imirkin: mareko: you changed pipe_framebuffer_state to no longer take references on the surfaces right? (or the surfaces to not take references on resources)?
14:39 imirkin: ajax: that exists. #define unreachable() ; + -Werror
14:39 imirkin: (oh hm, actually i guess if unreachable does nothing, then the default case is still there...)
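A minimal sketch of the idiom imirkin's second thought points at, assuming GCC or Clang with `-Wall -Werror` (which enables `-Wswitch`): leave the `default:` label out of a switch over an enum, so that adding an enumerator without a matching case fails the build. This is not the compile-time-checked `unreachable()` ajax is asking for, only the closest common approximation. The enum and function names are hypothetical and not taken from Mesa.

```c
#include <assert.h>

/* Hypothetical enum; any closed set of values works the same way. */
enum bit_size_kind {
   BITS_8,
   BITS_16,
   BITS_32,
   BITS_64,
};

/* Returns the size in bytes for a bit_size_kind.
 *
 * There is deliberately no `default:` label. With -Wswitch, the compiler
 * warns when an enumerator has no case, and -Werror turns that warning
 * into a build failure. */
unsigned bit_size_bytes(enum bit_size_kind kind)
{
   switch (kind) {
   case BITS_8:
      return 1;
   case BITS_16:
      return 2;
   case BITS_32:
      return 4;
   case BITS_64:
      return 8;
   }

   /* Only reached when `kind` holds a value outside the enum, e.g. after
    * memory corruption or a bad cast. This is still a runtime check. */
   assert(!"invalid bit_size_kind");
   return 0;
}
```

The trade-off is the one imirkin notes: once a `default:` exists, the compiler considers every value handled and the warning goes away, so the runtime check has to live after the switch instead.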
14:40 imirkin: mareko: i'm seeing an issue where e.g. a bo is bound to a framebuffer state, and then gets deleted by the dri2 st (due to an external destroyImage request) -- should nouveau be taking explicit references to bo's bound to the framebuffer now? or should dri2 st be more careful somehow?
14:50 imirkin: mareko: fwiw we do util_copy_framebuffer_state(&nvc0->framebuffer, fb); in ->set_framebuffer_state. is that not enough?
14:57 MrCooper: surely the driver needs to keep a reference while it has a pointer
14:57 imirkin: MrCooper: yeah, util_copy_framebuffer_state used to do that
14:58 imirkin: and i think it still does actually, since that's the cause of the surface destroy
15:04 imirkin: hrm. actually it might be even more subtle than that.
15:06 imirkin: there appear to be two resources. i don't quite understand what's going on ... ignore for now.
15:12 karolherbst: ehhhh
15:12 karolherbst: something broke glGetString(GL_EXTENSIONS)
15:15 imirkin: yeah, looks like my issue is with bo sharing within the same process. we get two bo's, and appear to be screwing up the "internal" refcounts somehow
15:16 karolherbst: yeah, I saw
15:16 karolherbst: ohh.. wrong channel
15:17 karolherbst: imirkin: are some gtf tests also crashing for you on calling glGetString(GL_EXTENSIONS)?
15:17 karolherbst: GTF-GL43.gtf33.GL3Tests.pixel_buffer_object.pixel_buffer_object_pointerv eg
15:17 karolherbst: tested here only with nouveau and iris
15:17 karolherbst: but could be my problem
15:18 imirkin: karolherbst: that usually points to some sort of context-level screwup
15:18 karolherbst: yeah...
15:18 karolherbst: ohh wait...
15:18 karolherbst: I am stupid
15:18 karolherbst: guess what I forgot to do
15:19 imirkin: overrides
15:19 karolherbst: nope
15:19 karolherbst: didn't update the external repos
15:19 karolherbst: maybe it doesn't change anything..
15:19 karolherbst: maybe it does
15:19 karolherbst: let's see
15:20 karolherbst: ahh no.. still crashes
15:20 imirkin: also iirc glx was broken in some of those branches
15:20 imirkin: i pushed a change 100 years ago to fix it, but it never made it to the branches
15:21 imirkin: although that results in asserts earlier than glGetString
15:21 karolherbst: heh....
15:21 karolherbst: wait..
15:22 karolherbst: ahhhhhh
15:22 karolherbst: gtf built against gles...
15:22 ajax: is that glGetString crashing by calling through a null fptr?
15:23 ajax: usually what that really means is MakeCurrent failed to do its job
15:24 karolherbst: sure, but the GTF is either compiled against a certain GLES version or GL
15:24 imirkin: gtf + gles is also not going to end too well
15:24 karolherbst: and has ifdefs around
15:24 karolherbst: imirkin: it does work, you just have to recompile with GLCTS_GTF_TARGET=gles2
15:25 imirkin: (esp for GL tests)
15:25 karolherbst: or glesNN
15:25 karolherbst: it's very stupid
15:26 karolherbst: they could abort or print an error or just not use ifdefs, but no, it just crashes
15:27 karolherbst: it's probably the 5th time I ran into this issue :D
15:27 bl4ckb0ne: xexaxo1: do you have plan to keep only one build system or should waffle build with both cmake and meson
15:32 imirkin: karolherbst: maybe after the 10th time, you'll make 2 separate build trees :)
15:34 karolherbst: maybe
16:37 milek7: invalid shader: failed to link program: error: uniform `lights' declared as type `light_s[8]' and type `light_s[8]'
16:38 milek7: what could such an error mean, in a GLES context?
16:38 imirkin: sounds like an unfinished thought in the error
16:39 imirkin: however such linkage failures are indicative of two uniforms with the same name having different types in vert/frag programs
16:39 HdkR: Sounds like you have something like `lights_s[8] lights;` rather than `lights_s lights[8];`
16:39 imirkin: or that.
16:42 imirkin: anyways, if they do have the same-looking type in both places, it could be due to precision differences. e.g. if you have "precision mediump float;" in the frag shader but not in vert
16:42 imirkin: (i *think* that causes a link failure, but not 100% sure.)
16:43 milek7: what's weird is that it links in GL
16:43 milek7: but GLES is picky about precision, so maybe that is it
16:43 imirkin: yeah, GLES tends to be a lot pickier
16:43 imirkin: the thinking being that GLES drivers can be lighter and do less conversion
16:43 imirkin: the end result being that GLES drivers are more complicated due to all the checking ;)
16:45 linkmauve: Is there an equivalent of GL_KHR_no_error for GLES?
16:45 imirkin: not sure, but i assume so
16:45 imirkin: it's an EGL/GLX-level feature
16:46 imirkin: https://www.khronos.org/registry/OpenGL/extensions/KHR/KHR_no_error.txt
16:47 imirkin: linkmauve: it has an ES ext number
16:47 linkmauve: Oh, indeed.
16:47 linkmauve: Thanks.
16:48 imirkin: but you need to create the context as a no-error context, with e.g. EGL_CONTEXT_OPENGL_NO_ERROR_KHR, defined in EGL_KHR_create_context_no_error
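A hedged sketch of what requesting a no-error GLES context through EGL could look like. It only builds the attribute list, so it compiles without an EGL SDK; the fallback constant values are copied from the Khronos `egl.h`/`eglext.h` headers as far as I know them, and `build_no_error_attribs` is a hypothetical helper, not an EGL or Mesa function. The GL-side `GL_CONTEXT_FLAG_NO_ERROR_BIT_KHR` is the bit the resulting context reports in its context flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Real code would get these from <EGL/egl.h> and <EGL/eglext.h>. The
 * fallback values are assumptions copied from those headers so that the
 * sketch builds on its own. */
#ifndef EGL_NONE
#define EGL_NONE 0x3038
#endif
#ifndef EGL_TRUE
#define EGL_TRUE 1
#endif
#ifndef EGL_CONTEXT_CLIENT_VERSION
#define EGL_CONTEXT_CLIENT_VERSION 0x3098
#endif
#ifndef EGL_CONTEXT_OPENGL_NO_ERROR_KHR
#define EGL_CONTEXT_OPENGL_NO_ERROR_KHR 0x31B3
#endif

/* Hypothetical helper: fill `out` with an attribute list that asks for a
 * GLES `es_major` context with error checking disabled. int32_t stands in
 * for EGLint here. Returns the number of entries written, including the
 * EGL_NONE terminator. */
size_t build_no_error_attribs(int32_t es_major, int32_t *out, size_t cap)
{
   const int32_t attribs[] = {
      EGL_CONTEXT_CLIENT_VERSION, es_major,
      EGL_CONTEXT_OPENGL_NO_ERROR_KHR, EGL_TRUE,
      EGL_NONE,
   };
   const size_t n = sizeof(attribs) / sizeof(attribs[0]);

   assert(out != NULL && cap >= n);
   for (size_t i = 0; i < n; i++)
      out[i] = attribs[i];
   return n;
}
```

The list would then be passed as the last argument of `eglCreateContext()`, after checking that the display's extension string actually advertises EGL_KHR_create_context_no_error.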
16:49 milek7: https://gist.github.com/Milek7/0cc319fdcaf80ba205d6837eb2e49b08
16:50 imirkin: milek7: and same error?
16:50 HdkR: GLES is picky about precision because it matters there. On GL the qualifier is defined to be a no-op ;)
16:51 milek7: yes
16:52 imirkin: can you add a "precision highp int;" to both just in case?
16:53 milek7: huh, it links now
16:53 milek7: thanks
16:53 imirkin: that's slightly surprising
16:53 imirkin: but i'm not a GLES spec expert
16:53 imirkin: must be that ints default to diff precision in the two stages?
16:54 imirkin: iirc you HAVE to specify float precision in frag
16:55 HdkR: int default precision changes between VS and FS yes
16:56 imirkin: because why make people's lives easier
16:56 imirkin: when you can make it harder?
16:56 HdkR: mediump in FS, highp everywhere else
16:56 HdkR: `4.7.4. Default Precision Qualifiers` in GLES 3.2 spec talks about it
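The rule HdkR cites can be modelled in a few lines of C. This is a sketch under stated assumptions, not Mesa's linker code: it covers only `int` in vertex and fragment shaders, and all type and function names are hypothetical. It shows why a uniform with an `int` member mismatches when neither stage declares a precision, and why adding `precision highp int;` to both stages makes the two declarations agree.

```c
#include <assert.h>
#include <stdbool.h>

enum stage { STAGE_VERTEX, STAGE_FRAGMENT };

/* PRECISION_NONE means "no `precision ... int;` statement in the shader". */
enum precision {
   PRECISION_NONE,
   PRECISION_LOWP,
   PRECISION_MEDIUMP,
   PRECISION_HIGHP,
};

/* Default precision of `int` in GLSL ES: mediump in the fragment stage,
 * highp in the vertex stage. */
enum precision default_int_precision(enum stage s)
{
   assert(s == STAGE_VERTEX || s == STAGE_FRAGMENT);
   return s == STAGE_FRAGMENT ? PRECISION_MEDIUMP : PRECISION_HIGHP;
}

/* Precision that actually applies: the declared one if present, otherwise
 * the stage default. */
enum precision effective_int_precision(enum stage s, enum precision declared)
{
   return declared != PRECISION_NONE ? declared : default_int_precision(s);
}

/* True when an int uniform (or int struct member) ends up with the same
 * precision in both stages, which is what the linker compares. */
bool int_uniform_precision_matches(enum precision vs_declared,
                                   enum precision fs_declared)
{
   return effective_int_precision(STAGE_VERTEX, vs_declared) ==
          effective_int_precision(STAGE_FRAGMENT, fs_declared);
}
```

With no declarations the check fails (highp against mediump), which matches the link error above; declaring highp in both stages makes it pass.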
17:03 marex: imirkin: hi, back to the texcoord/etnaviv topic ... when running the eglretrace, the VS outputs are VARYING_SLOT_{POS,COL0,PSIZ}, and PS inputs are VARYING_SLOT_{COL0,TEX0} , do I understand it correctly that I need to bind VARYING_SLOT_{POS,PSIZ} to VARYING_SLOT_TEX0 ?
17:04 imirkin: no
17:04 imirkin: i mean - you can do it however you want
17:04 imirkin: but what this is saying is that the TEX0 coordinates need to be replaced with the point sprite coord data
17:05 marex: imirkin: I think I might need some explanation of this
17:05 imirkin: imagine a different setup
17:05 imirkin: let's say you have a vertex shader which outputs gl_TexCoord[0]
17:05 imirkin: i.e. VARYING_SLOT_TEX0
17:05 imirkin: and a fragment shader which consumes VARYING_SLOT_TEX0
17:06 marex: then these two get linked together, that much I understand ... I think
17:06 imirkin: when you're drawing with points, and point sprite coord replacement is enabled, and tex0 is meant to be replaced
17:06 imirkin: then that frag shader's TEX0 input should be replaced with the point coord data
17:06 imirkin: irrespective of the fact that the vs actually does output a TEX0
17:07 imirkin: SLOT_POS == gl_Position, SLOT_PSIZ == gl_PointSize
17:07 imirkin: so none of those.
17:10 marex: imirkin: I think I will come back with more questions later
17:10 marex: imirkin: in fact, is there some good introduction to all this I can read ?
17:10 marex: imirkin: I feel like I'm trying to figure out something and I'm starting in the middle
17:11 imirkin: you're thinking of this in the modern world
17:12 imirkin: but point sprite coord replacement is from the old world
17:12 imirkin: of fixed function shaders/etc
17:12 marex: imirkin: I am trying to figure out how to think about it really
17:12 imirkin: now you'd just use gl_PointCoord and move on with life.
17:12 imirkin: but in a fixed function shader, you HAVE to use gl_TexCoord
17:12 imirkin: because that's the only way to texture things
17:13 imirkin: so then you want to texture a point
17:13 imirkin: but a point is a single point
17:13 imirkin: so you can't do varying interpolation
17:13 imirkin: to generate a texture over an area
17:27 ccr:feels like a moron after remembering that he had b_ndebug=true set ..
17:28 Lyude: ccr: in igt, I actually added a check to stop the build from working in meson.build if b_ndebug is set
17:28 Lyude: you're not the first one to make such a mistake :P
17:28 marex: imirkin: what I still don't quite understand is how I get the W=1 into the PS
17:28 imirkin: marex: well, the specific mechanism is your problem
17:28 marex: imirkin: the hardware can only replace X,Y components of the vec4, but Z,W are 0
17:28 ccr: Lyude, :)
17:28 imirkin: marex: right
17:28 imirkin: marex: also ... you're supposed to be able to flip
17:29 imirkin: marex: i.e. replace it with either (s,t,0,1) or (s,1-t,0,1)
17:29 imirkin: so if the hardware's support for this is limited, then you need to touch it up with shader variants
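A plain-C sketch of the values being discussed, with no driver or NIR code involved. `sprite_coord_replace` computes what the fragment shader is supposed to see in the replaced texcoord; `sprite_coord_fixup` models the touch-up a shader variant would apply on hardware that only writes X and Y and leaves Z and W at 0. Both function names are hypothetical, and whether a given GPU needs the flip done in the shader is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>

/* Value the fragment shader should read from a replaced texcoord:
 * (s, t, 0, 1), or (s, 1 - t, 0, 1) when the T coordinate is flipped. */
void sprite_coord_replace(float s, float t, bool flip_t, float out[4])
{
   assert(out);
   out[0] = s;
   out[1] = flip_t ? 1.0f - t : t;
   out[2] = 0.0f;
   out[3] = 1.0f;
}

/* Fixup for hardware that only replaces X and Y. `texcoord` holds what the
 * hardware delivered, (s, t, 0, 0); the result is what the shader should
 * have received. */
void sprite_coord_fixup(float texcoord[4], bool flip_t)
{
   assert(texcoord);
   if (flip_t)
      texcoord[1] = 1.0f - texcoord[1];
   texcoord[2] = 0.0f;
   texcoord[3] = 1.0f;
}
```

In a driver the fixup would only be applied when points are being drawn and replacement is enabled for that texcoord, which is why it ends up keyed into a shader variant.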
17:29 ccr: Lyude, I was momentarily wondering why NIR_PRINT=1 did nothing .. then "hmm .. they're behind #ifndef NDEBUG" :P
17:30 marex: imirkin: well if I had a PS input set to constant 0,0,0,1 and let the HW replace XY in that, that might work ?
17:31 imirkin: marex: can you do the 1-t replacement?
17:31 imirkin: or will it only replace with t?
17:31 marex: imirkin: I am not sure about that yet
17:32 imirkin: i can't remember how you actually flip it...
17:32 imirkin: maybe you can't, and it's only when you're drawing on winsys vs fbo? dunno.
17:33 hakzsam: dcbaker: can you also push https://gitlab.freedesktop.org/mesa/mesa/-/commit/8882abe47eb79f2975762343ed1dc596f45d2602 to staging/21.0 please?
17:35 dcbaker: hakzsam: sure, I'm cutting rc2 right now, and I'll pull that into the branch as soon as I'm done
17:36 hakzsam: thanks
18:01 MrCooper: anholt: let me know how you'd like me to deal with arm64_a530_piglit_shader re https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7988#note_772083
18:03 anholt: just smash in the new test list until we can get piglit fixed.
18:03 ccr: kusma, now that I thought to disable NDEBUG (so assertions are compiled in), the problem is caught with a "../src/compiler/nir/nir.h:2524: nir_src_comp_as_uint: Assertion `nir_src_is_const(src)' failed." at shader compilation phase ( gdb backtrace http://paste.debian.net/1181949/ ), and bisecting points to https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7079
18:34 anholt: 13 minutes of >100% cpu in python to piglit quick --dry-run on cheza.
18:34 anholt: without even processing real test results
18:34 imirkin: i usually rm -rf the vs_in stuff
18:34 imirkin: and another big source of pointless tests
18:35 anholt: true, we do have that ripped out of ci, so my local build is worse than ci
18:35 anholt: but I wonder if we might be able to just ditch process isolation false in ci if we fix our runner.
19:22 jenatali: Anybody want to provide a review/ack for a one-liner MSVC compiler bug workaround, which probably needs to make it to 21.0? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8581
19:24 HdkR: Cute little ICE
19:24 jekstrand: jenatali: My inclination is to say, "fix MSVC" :-P
19:24 jenatali: It's not my fault! I filed the bug :)
19:24 ccr: destroy MSVC!
19:25 HdkR: "Why not both?"
19:25 imirkin: jekstrand: like gcc, presumably some version is already shipped and has the bug though?
19:25 jekstrand: I mean, it's a harmless change to Mesa but uh... seriously?
19:26 jenatali: I mean, I can carry the patch locally, but we need to build x86, and we use MSVC, and it's probably going to take some time to get fixed (and for us to get an updated compiler...)
19:26 jekstrand: jenatali: Yeah, I get it
19:27 jekstrand:still shudders thinking about back in the Fedora 22 days when neither GCC nor clang could build the Vulkan CTS correctly.
19:28 imirkin:remembers gcc "2.96" aka "let's take a random CVS snapshot and ship it", which optimized some for loops by removing them
19:28 ccr:also remembers gcc "2.96"
19:29 jenatali: O.o
19:29 imirkin: admittedly the program *did* run faster
19:29 jekstrand: deletion is a perfectly valid optimization for some for loops...
19:29 ccr: code not generated is code not run!
19:29 bl4ckb0ne: ship it
19:29 imirkin: it wasn't an actual release, but iirc RH6.0 included it? something like that.
19:30 ccr: yep, it was around those times
19:31 ccr: ye olde 2.95 stagnation, egcs fork, etc.
19:32 airlied: linuxthreads vs nptl
19:33 bnieuwenhuizen: jekstrand: probably worth calling out which MSVC version
19:34 bnieuwenhuizen: .. or I need to read the bug nvm
19:34 jenatali: bnieuwenhuizen: I'm assuming you meant me? It's the latest current version, 16.9.4 IIRC
19:34 bnieuwenhuizen: eh oops, yeah
19:34 jenatali: Ah, 16.8.4
19:38 imirkin: wow, those version numbers have gone up a lot
19:42 jenatali: Oh that's the VS version. The MSVC version is 19.28.29336. I can never keep track...
19:44 marex: imirkin: does this https://gitlab.freedesktop.org/marex/mesa/-/commit/c007dc545d8377f789b3cecb9ab49db3971ffa07 look somewhat sane approach to you ?
19:56 mattst88: airlied: not sure if you ever figured out a better way of filtering gitlab mails in gmail, but I realized last night that gitlab actually sets List-Id: based on the project
19:56 mattst88: e.g., List-id: mesa/mesa <...>
19:56 mattst88: so you can filter based on Has words list:mesa/mesa
19:57 mattst88: not sure if it's been doing that all along or whether that's a newish feature
19:57 mattst88: as far as I can tell, there's no way to filter based on X-Gitlab* headers in gmail
19:59 xexaxo1: bl4ckb0ne: considering the meson issues I've spotted/fixed recently, I'd rather keep both in the short term at least
20:00 bl4ckb0ne: makes sense
20:02 airlied: mattst88: interesting, I should update my filter which is currently "View it on gitlab" mesa :-P
20:03 mattst88: haha
20:04 bl4ckb0ne: xexaxo1: cant give much help on the mingw issues for 2.0 though, i dont have any windows computer
20:05 xexaxo1: bl4ckb0ne: don't know if I have one either. just for lols, I wonder how well it'll work with ReactOS
20:17 imirkin: mattst88: just hope you're not involved in 2 identically-named projects on different gitlab instances!
20:45 danvet: mattst88, I think that's been around since a while, at least all my gitlab filters use that
21:05 mattst88: imirkin: haha, that's where the From: gitlab@gitlab.freedesktop.org comes in!
21:05 imirkin: phew, i was worried!
21:05 mattst88: danvet: dang, can't believe I didn't notice a couple of years ago when I was bitching about not being able to filter stuff :)
21:06 jekstrand: I don't know how my filters work. I just let gmail figure it out and it was fine.
21:07 mattst88: oh, like using the "Filter messages like this"
21:07 jekstrand: yup
21:07 mattst88: yeah, it appears to figure it out -- list:(176.mesa.mesa.gitlab.freedesktop.org)
21:07 mattst88: damn
21:07 danvet: yeah that one is definitely pretty old
21:08 jekstrand: You're clearly trying too hard. :-P
21:08 mattst88: yep
22:59 marex: is there any usable documentation for NIR ?
22:59 jekstrand: Not nearly as much as you want
23:00 marex: sigh
23:00 jekstrand: If you want to know about ALU opcodes, nir_opcodes.py is where you want to go. For every opcode, it has a C expression which both describes the opcode's semantics and is used for constant folding.
23:00 imirkin: but a lot more than for the underlying hw you're developing on
23:00 imirkin: so at least that's nice :)
23:01 jekstrand: If you want to know about intrinsics, nir_intrinsics.py is where you want to go. Most things have some amount of description though it can be sketchy.
23:01 imirkin: jekstrand: just curious -- ever run into issues with const folding != hw for things like sin/cos/etc?
23:01 jekstrand: Other than that, read code and ask questions here. You'll get the hang of it.
23:01 zmike: mareko: fyi here's what I have atm https://pastebin.com/9JVFmM3M
23:01 ajax: the best way to get usable documentation, sadly, is to write it.
23:02 jekstrand: imirkin: The graphics specs (as opposed to CL/compute) are all pretty clear that you can constant-fold with different precision than the hardware.
23:02 zmike: though I can't seem to get a genuine, optimized release build somehow, which is not ideal
23:02 imirkin: jekstrand: yeah, coz people who write games care about what specs say
23:02 imirkin: anyways, was just curious if it was something that came up
23:03 jekstrand: imirkin: At least we specify our behavior, unlike D3D where you just try it and see what happens.
23:03 imirkin: heheh
23:03 imirkin: try it on nvidia and see what happens
23:03 jekstrand: Right. Forgot to specify that part.
23:03 marex: jekstrand: I am just trying to set the W=1 after load_input which is VARYING_SLOT_TEX
23:03 imirkin: marex: that's not generically OK
23:04 imirkin: you should only do that when point sprite coord replacement is enabled and you're drawing points
23:04 jekstrand: marex: nir_vec(b, nir_channel(b, load, 0), nir_channel(b, load, 1), nir_channel(b, load, 2), nir_imm_int(b, 0))
23:04 marex: imirkin: so far I didn't even figure out where to put what
23:04 imirkin: i think you meant nir_imm_float(b, 1.0f)
23:04 marex: imirkin: so details like shader key and stuff are for later
23:04 jekstrand: imirkin: They're the same. :P
23:05 jekstrand: Oh, 1.0
23:05 jekstrand: nvm
23:05 marex: jekstrand: nir_channel ... hmmmmm
23:05 jekstrand: Yeah, you want nir_imm_float(b, 1.0)
23:05 imirkin: i mean, they're *close*...
23:05 jekstrand: I was reading 0.0
23:05 marex: and plug that into nir_ssa_def_rewrite_uses_after() somehow ?
23:05 jekstrand:has synchronization problems between eyeballs, brain, and hands. Data races abound. :-P
23:05 jekstrand: marex: Yup
23:06 marex: hmmmm, so at least I was heading in the right direction with this
23:06 marex: thanks
23:07 jekstrand: Usually, nir_ssa_def_rewrite_uses is sufficient but if you're modifying the result of something instead of replacing it, you want _after.
23:07 marex: how do I tell whether the src ssa is TexCoord in the lowering pass though ?
23:07 jekstrand: how do you mean?
23:08 marex: say I have this ...
23:08 marex: vec4 32 ssa_1 = intrinsic load_input (ssa_0) (0, 0, 160, 129) /* base=0 */ /* component=0 */ /* dest_type=float32 */ /* location=1 slots=1 */ /* gl_Color */
23:08 marex: vec1 32 ssa_9 = mov ssa_1.w
23:08 marex: vec4 32 ssa_2 = intrinsic load_input (ssa_0) (1, 0, 160, 132) /* base=1 */ /* component=0 */ /* dest_type=float32 */ /* location=4 slots=1 */ /* gl_in_TexCoord0 */
23:08 jekstrand: It's really annoying if you do this after you've called nir_lower_io()
23:09 marex: jekstrand: I'm doing it in nir_lower_io
23:09 jekstrand: Uh.... Is that always valid?
23:09 marex: jekstrand: I have no idea
23:10 marex: I'd be pushing patches if I had a clue :)
23:10 jekstrand: Let's back up. What driver are you hacking on and what problem are you trying to solve?
23:10 marex: etnaviv
23:11 marex: for TGSI I have this https://gitlab.freedesktop.org/marex/mesa/-/commit/c007dc545d8377f789b3cecb9ab49db3971ffa07
23:11 jekstrand: Oh, good. GLES2. That makes everything easier. :)
23:11 marex: the hardware can only replace XY coordinates of texcoord
23:11 marex: gles1
23:11 marex: so I need to patch ZW (well, W only, Z is already 0) in FS
23:12 marex: the mesa fixed pipeline generates the shader code
23:13 imirkin: jekstrand: he's trying to make point sprite coord replacement work
23:13 imirkin: the hw can only replace texcoord.xy, not zw
23:13 jekstrand: yup
23:13 anholt: For hardware where we have to do sprite-coord-mask-based recompiles, we should really make a mesa/st variant to just lower it away to pntc access.
23:13 imirkin: (and i'm pretty sure it can't even replace .y correctly for the flip case)
23:14 marex: imirkin: is there a test for the flip case ? if so, I can look at the command stream coming from the blob and see what the blob does
23:14 anholt: marex: note that sprite_coord_enable should be a bitfield, not a bool (see p_state.h)
23:15 imirkin: marex: yes, there's a flip case, where instead of T, you replace with 1-T
23:15 imirkin: oh. a test for it.
23:15 imirkin: probably.
23:15 zmike: mareko: actual release build https://pastebin.com/fBxSE4Gn
23:15 marex: imirkin: some piglit or deqp test
23:15 jekstrand: If you were doing this before nir_lower_io, I'd suggest you look at nir_lower_fragcoord_wtrans.c
23:15 imirkin: sec
23:16 zmike: dcbaker: I think this is a ping for you: if I configure mesa for a debug build, I cannot switch to a release build without deleting my build directory
23:16 jekstrand: If you can do it before nir_lower_io, I'd recommend that.
23:16 zmike: it will always do a debug build
23:16 zmike: is this known or should I file a ticket?
23:16 marex: jekstrand: why dont I just call that like lima driver does then ?
23:16 marex: jekstrand: let me read that
23:16 imirkin: marex: hm, GL_POINT_SPRITE_COORD_ORIGIN is the thing
23:16 anholt: marex: wondering if you really need ARB_point_sprite on etnaviv, given existing bugs. you can disable it with PIPE_CAP_POINT_SPRITE in your screen.
23:17 imirkin: marex: however that appears to only be in GL 3.2
23:17 dcbaker: zmike: i haven't heard anything about that, I'd say file an issue
23:17 anholt: Oh, right. need it for gl2
23:17 zmike: dcbaker: 👍
23:17 imirkin: however i think st/mesa still uses that functionality for handling the coord difference on winsys vs fbo buffers
23:18 marex: imirkin: the GPU does not support that , super :)
23:18 jekstrand:is so glad Vulkan doesn't have flipped winsys FBOs
23:18 imirkin: ES2 has gl_PointCoord. not sure how it reacts to this sort of thing
23:18 imirkin: jekstrand: yeah, that's a ridiculous quirk
23:18 imirkin: i don't think DX has that either
23:19 marex: imirkin: in gles2, the pointcoord is handled as pntc
23:19 imirkin: marex: right. but should the y pointcoord be getting flipped for a winsys fbo? not sure.
23:19 imirkin: maybe st/mesa handles it differently for PNTC actually
23:19 imirkin: i.e. same way it does with a few other things
23:24 marex: jekstrand: but nir_lower_fragcoord_wtrans.c checks for load_frag_coord intrinsic, in my case the intrinsic is load_deref
23:24 jekstrand: marex: It's got two paths, one for load_frag_coord and one for load_deref
23:38 marex: jekstrand: is there some way to dump the NIR code between the different lowering passes ?
23:38 jekstrand: marex: NIR_PRINT=1
23:38 marex: I think I might be getting there, somehow, but I would like to verify I didn't do something random
23:38 marex: ah
23:41 marex: wow
23:41 jekstrand: Yeah, there are a LOT of passes. :)
23:43 marex: jekstrand: I think I completely missed the part about the passes
23:51 milek7: hmm it seems I'm getting GL_INVALID_OPERATION from glGetError, but no output from MESA_DEBUG=1
23:51 milek7: is that supposed to happen?
23:57 imirkin: milek7: depends on your build
23:57 imirkin: milek7: MESA_DEBUG=1 doesn't do anything btw
23:57 imirkin: milek7: simplest thing is to run in gdb, and breakpoint in _mesa_error