00:00 imirkin: airlied: ok, looks like his stuff is largely related to (surprise surprise) HDR. that's about 5 steps ahead of me.
00:02 imirkin: i'm just trying to get things up and running. worrying about colorspaces / whitepoints is ... a lot fancier.
01:09 Lyude: maxwell2 starts at nv120 right
01:11 skeggsb: Lyude: yeah
01:33 imirkin: wow, impressive patch on-list for a first contribution
03:08 Lyude: imirkin: is that re: clockgating?
03:08 imirkin: no, re some guy messing around with the disp sor assignment internals
03:08 Lyude: ahh, ok
03:08 imirkin: it'll ultimately probably not go upstream, but it shows a higher-than-usual level of competence :)
05:58 Benau: imirkin would you be interested in looking at this trace (recorded on nouveau, gt240)?
05:58 Benau: https://github.com/supertuxkart/stk-code/issues/3058#issuecomment-362989033
05:58 Benau: seems that texture buffer object is not properly supported
08:59 pmoreau: Lyude: Which branch to use on your GitHub for the latest clock gating version? I have been using wip/kepler+-clockgating-v1r5, but I don’t get anything printed saying clock gating is enabled, and that branch does not contain the 5th patch you were talking about earlier (that would enable the kernel param)
09:07 pmoreau: Lyude: Nevermind, I switched to the series on the ML instead.
11:49 imirkin: Benau: file a bug so i don't forget. that does seem odd. seems to work OK with a kepler board, so probably a nv50-era-specific bug
11:49 imirkin: Benau: do you see any errors in dmesg perchance?
11:49 imirkin: (like invalid opcode, or who knows)
11:49 Benau: wait
11:49 imirkin: of course you report this literally the day after i unplug my G92...
11:49 imirkin: generically speaking, TBO's should work btw
11:50 imirkin: so it's more that there's something odd going on here
11:50 imirkin: than TBO's not working at all
11:52 imirkin: my *guess*, without really doing any serious analysis, is that we're missing a flush
11:54 Benau: well no shader compile error
11:55 imirkin: Benau: are you able to apply a quick patch?
11:55 imirkin: (to mesa)
11:55 Benau: yes sure
11:56 imirkin: actually ... hm. i don't even know what i'd flush.
11:56 imirkin: TIC/TSC cache is still valid...
11:57 imirkin: can you just replay the trace with ST_DEBUG=flush ?
11:57 imirkin: i.e. ST_DEBUG=flush glretrace stk.trace
11:57 imirkin: or wait. not ST_DEBUG. MESA_DEBUG
11:58 imirkin: only works in a debug build though
12:00 Benau: in 4 hours, once i'm back home
12:01 Benau: don't worry i have all the tools to build a debug mesa
12:02 Benau: and btw i have gotten rid of all indirect draw calls in stk, so you can remove some related hacks
12:02 Benau: (if any)
12:02 imirkin: robclark will be happy
12:03 imirkin: indirect draws should work fine on fermi+ though, and plain unsupported on tesla
12:04 Benau: and i see u can close the related bug about stk-indirect hangs
12:04 imirkin: those were fixed iirc
12:04 Benau: ok
12:04 imirkin: i was not aware of any outstanding issues with stk until just now
12:05 imirkin: i did know that it did a bunch of *weird* stuff which tripped up freedreno
12:05 imirkin: and there was a time when indeed it caused nouveau some grief, but that was like 2y ago
12:06 Benau: ok
12:06 imirkin: Benau: if you might also grab piglit, perhaps we just regressed TBO support. run some of the ARB_texture_buffer_object ( / ARB_texture_buffer_ranges) tests
12:06 Benau: ok can do later
12:06 imirkin: we have nothing in terms of CI, so it's all just manual testing, and older boards get less of it
12:07 imirkin: (read: none)
13:58 karolherbst: imirkin: the current gm107 emitter doesn't seem to like surface operations on MS images, but is something like "STORE IMAGE[0], TEMP[1].xyxw, TEMP[0], 2D_MSAA" legal in theory and we just have to work around something missing on the hardware or whatever?
13:58 karolherbst: it doesn't seem like the PTX sust allows ms surfaces anyway
14:04 imirkin_: karolherbst: we don't expose MSAA images on maxwell+
14:04 imirkin_: it's legal in theory, we just don't support it
14:04 karolherbst: imirkin_: uhh, mhh the CTS still uses those
14:04 imirkin_: CTS bug
14:04 karolherbst: I see
14:04 karolherbst: thanks
14:04 imirkin_: or driver bug in exposing the wrong limits
14:05 imirkin_: is this CL or GL?
14:06 karolherbst: GL
14:06 karolherbst: KHR-GL45.multi_bind.dispatch_bind_textures
14:07 imirkin_: k. i know nothing about CL limits, so for all i know it's required there
14:07 imirkin_: hmmm ... i wonder if it does something sneaky. like bind a 1x-msaa image to a image2DMS or something.
14:07 karolherbst: I am sure that the CTS doesn't check for anything
14:07 karolherbst: mhh
14:07 karolherbst: not really I think
14:08 karolherbst: not quite sure though
14:08 karolherbst: external/openglcts/modules/gl/gl4cMultiBindTests.cpp
14:08 karolherbst: 3659+
14:09 karolherbst: uhm
14:09 karolherbst: it does have a max_image_samples though
14:09 imirkin_: so ideally max_image_samples == 1 for us.
14:09 karolherbst: ...
14:09 karolherbst: okay
14:09 karolherbst: CTS expects 0
14:09 imirkin_: (for maxwell+)
14:09 imirkin_: oh. maybe it's supposed to be 0
14:09 karolherbst: then it switches to 2D
14:09 imirkin_: rtfs :)
14:10 karolherbst: :)
14:10 imirkin_: min value for MAX_IMAGE_SAMPLES == 0, so yeah, it's probably right
14:11 imirkin_: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c#n91
14:11 imirkin_: change that to sample_count > 0
14:11 karolherbst: Passed
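
The one-character change suggested above can be modeled outside the driver. This is a minimal sketch in C, assuming the existing check compared `sample_count > 1` (the log does not quote the line) and assuming the frontend derives MAX_IMAGE_SAMPLES by probing sample counts from high to low. `image_samples_supported` and `max_image_samples` are hypothetical names for illustration, not Mesa functions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's "is this sample count allowed
 * for shader images" check. The real check in nvc0_screen.c takes more
 * parameters; only the comparison discussed in the log is modeled. */
static bool image_samples_supported(unsigned sample_count, unsigned threshold)
{
    /* Reject any image binding whose sample count exceeds the threshold. */
    return !(sample_count > threshold);
}

/* Hypothetical model of a frontend deriving MAX_IMAGE_SAMPLES: probe from
 * the highest sample count down and report the first one accepted. */
static unsigned max_image_samples(unsigned threshold)
{
    for (unsigned samples = 16; samples >= 1; samples--) {
        if (image_samples_supported(samples, threshold))
            return samples;
    }
    return 0; /* no multisampled (or 1x) image binding accepted */
}
```

Under these assumptions, a threshold of 1 (`sample_count > 1`) reports 1, while a threshold of 0 (`sample_count > 0`) reports 0, which is the value the CTS test was said to expect before it falls back to plain 2D images.
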
14:12 imirkin_: and i sent a patch for the enhanced layouts fail
14:12 karolherbst: yeah, I saw
14:12 karolherbst: will test it later
14:12 imirkin_: airlied was running into it on r600, figured i'd fix it on nvc0 :)
14:12 imirkin_: but it actually needs testing on fermi
14:12 imirkin_: since iirc fermi works slightly differently
14:12 karolherbst: I was thinking about how to deal with the fp64 stuff, because I don't look forward to having 5+ asm files for that :/
14:12 imirkin_: could write the function with nv50 ir
14:13 karolherbst: CFG will kill us here though
14:13 imirkin_: nah
14:13 karolherbst: but if we are careful enough, this might work
14:13 imirkin_: like an actual Function
14:13 karolherbst: ohhh
14:13 karolherbst: and embed it whenever it is needed?
14:13 imirkin_: no, called
14:13 imirkin_: same as before. out-of-line.
14:13 karolherbst: yeah right, but I meant the function itself
14:13 imirkin_: oh yes.
14:14 imirkin_: note that i've never tested this
14:14 karolherbst: mhh
14:14 imirkin_: so ... start with a small function which just returns the orig value
14:14 imirkin_: or something
14:14 karolherbst: yeah
14:14 imirkin_: before you go to a lot of trouble
14:14 karolherbst: or we wait until robclark finishes the fp64 emulation stuff and we just ask glsl to do that for us :p
14:15 imirkin_: iirc we do better than glsl could
14:15 karolherbst: ohh wait
14:15 karolherbst: airlied wanted to do that
14:15 karolherbst: ...
14:15 imirkin_: i mean, more optimally
14:15 karolherbst: yeah.... sure
14:15 imirkin_: e.g. rsq64h
14:15 karolherbst: mhh
14:15 karolherbst: but this seems to be not enough for the CTS afaik
14:15 imirkin_: right, but ...
14:15 karolherbst: sure
14:15 imirkin_: the impl that DOES work still makes use of those
14:15 imirkin_: as a better first guess.
14:16 karolherbst: I see
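
The "better first guess" idea above can be illustrated with Newton-Raphson refinement of a reciprocal square root. This is a sketch only, not the nouveau implementation: the `guess` parameter stands in for whatever a hardware approximation such as rsq64h would return, and `rsqrt_refine` is a hypothetical name.

```c
/* Refine an estimate of 1/sqrt(x) with Newton-Raphson steps.
 * Assumes x > 0 and a guess reasonably close to the true value.
 * The step comes from applying Newton's method to f(y) = 1/(y*y) - x:
 *     y_next = y * (1.5 - 0.5 * x * y * y)
 * Each step roughly squares the relative error, so a more accurate
 * starting guess needs fewer iterations. */
static double rsqrt_refine(double x, double guess, int iterations)
{
    double y = guess;
    for (int i = 0; i < iterations; i++)
        y = y * (1.5 - 0.5 * x * y * y);
    return y;
}
```

For example, `rsqrt_refine(2.0, 0.7, 5)` starts about 1% off and converges to roughly 0.70710678, whereas a guess that is already accurate to single precision would need only a couple of steps.
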
14:16 imirkin_: dboyan tested his impl extensively, and was within like 1 or 2 ULP's of the CPU result
14:16 karolherbst: anyway, need to get ready for the plane
14:16 imirkin_: kk. safe travels!
14:16 karolherbst: I will think that through and maybe I come up with a good enough result
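
On the "1 or 2 ULP's" figure mentioned above: one common way to measure the distance between a computed double and a reference result is to compare their bit patterns on an ordered integer scale. The sketch below assumes IEEE 754 binary64 doubles and finite, non-NaN inputs; `ordered_bits` and `ulp_distance` are hypothetical helper names, and this is not the test harness that was actually used.

```c
#include <stdint.h>
#include <string.h>

/* Map a finite double to an integer that increases monotonically with
 * the value, so adjacent doubles map to adjacent integers. */
static int64_t ordered_bits(double v)
{
    int64_t bits;
    memcpy(&bits, &v, sizeof bits);   /* well-defined type punning */
    /* Negative doubles sort in reverse bit order; flip them so that
     * -0.0 maps to 0 and more negative values map to lower integers. */
    return bits < 0 ? INT64_MIN - bits : bits;
}

/* Number of representable doubles between a and b (0 if equal). */
static uint64_t ulp_distance(double a, double b)
{
    int64_t ia = ordered_bits(a);
    int64_t ib = ordered_bits(b);
    return ia > ib ? (uint64_t)ia - (uint64_t)ib
                   : (uint64_t)ib - (uint64_t)ia;
}
```

A GPU result would then count as "within 2 ULPs" when `ulp_distance(gpu_result, cpu_result) <= 2`.
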
16:50 karolherbst: got my devconf feedback today, seems like people were happy enough about it :)
16:55 karolherbst: imirkin: uhm.. my tested-by is for Pascal...
16:56 imirkin_: for what?
16:56 karolherbst: "nvc0: collapse output slots to have adjacent registers"
16:57 imirkin_: ok cool
16:57 imirkin_: it worked on kepler too
16:57 imirkin_: just need someone to test on fermi
16:57 imirkin_: since it definitely does things different than kepler
16:57 karolherbst: I see
16:57 karolherbst: RSpliet might be able to test it?
16:57 imirkin_: i never QUITE figured out how different
16:58 imirkin_: see commit 39df725f731f75f488c75a4910169beb352213fb
16:58 imirkin_: (that was a fun one to track down...)
16:58 RSpliet: There is a gf119 in my desktop machine currently. Low on time though...
16:59 karolherbst: imirkin_: ... meh
16:59 RSpliet: Something something getting a Phd on track and maintaining sanity rah rah rah
17:00 imirkin_: still at cambridge?
17:00 karolherbst: RSpliet: give up on sanity, it is only useless most of the time anyway
17:00 RSpliet: imirkin_: yep.
17:00 imirkin_: cool
17:00 RSpliet: karolherbst: "If you want to be number one, you need to be odd"
17:00 karolherbst: ....
17:01 imirkin_: or, to quote a horrible 80's movie, "If you want to be the best, you must lose your mind"
17:01 imirkin_: (not actually an 80's movie, more like 80's-style-movie)
17:01 karolherbst: RSpliet: At first I thought it was a random quote, but coming from you, there had to be a joke... which I just got a few seconds later
17:02 RSpliet: imirkin_: Quite related to that horrible 90's quote "Of all the things I've lost, I miss my mind the most"
17:02 imirkin_: that's a good one
17:06 Benau: imirkin MESA_DEBUG=flush glretrace stk.trace doesn't seem to work...
17:08 imirkin_: as in ... not at all, or just still misrendered?
17:09 imirkin_: also, can you describe the misrender?
17:09 imirkin_: i don't think you posted a screenshot
17:10 Benau: just like in the issue ticket (only meshes rendered with the non-skinned mesh shader are displayed)
17:10 Benau: (which are the wheels)
17:10 imirkin_: and can you confirm that you don't see anything odd in dmesg while the trace runs?
17:10 Benau: wait
17:11 imirkin_: can you point me to the comment in your bug which contains what you see?
17:11 Benau: neither dmesg nor journalctl shows anything useful
17:11 imirkin_: ok - that's good
17:12 Benau: https://github.com/supertuxkart/stk-code/issues/3058#issuecomment-362989033
17:12 imirkin_: do you just see what's in the very first screenshot in the github issue?
17:12 imirkin_: that contains a trace, not what you see it render
17:14 Benau: back
17:14 imirkin_: the link you gave points to your trace, not to what you see it render as
17:14 imirkin_: are any of the screenshots representative of what you see, and if so, which ones
17:15 Benau: you meant the link to the trace i recorded?
17:15 imirkin_: the github bug has a number of screenshots
17:15 imirkin_: are any of them representative of what you see when replaying the trace?
17:15 Benau: ohh
17:15 imirkin_: (i don't have the hw plugged in, so i can't check easily)
17:16 Benau_: seems that the motherboard doesn't like me running it naked without a computer case
17:16 Benau_: hangs again
17:16 Benau_: let me give you a screenshot of what i see
17:16 Benau_: wait
17:17 imirkin_: careful about the stress the GPU puts on the PCIe slot
17:17 imirkin_: if it has a big fan or whatever
17:17 Benau_: nope the gt240 i used is a "cheap" card
17:17 imirkin_: k. i have a gt240 with gddr5 which is 2-wide
17:22 Benau_: https://user-images.githubusercontent.com/3252341/35818844-fe85004e-0a98-11e8-9fd6-2e998e58cae1.png
17:22 Benau_: this is what i see now
17:23 imirkin_: nice, i like it ;)
17:23 imirkin_: are you a developer working on stk? or just an unlucky user?
17:25 Benau_: i'm one of the developers of stk
17:25 imirkin_: oh cool
17:25 imirkin_: if there's any way you could try to narrow down the issue, that'd be super helpful
17:25 Benau_: and the kiki you see was made by me
17:26 imirkin_: i.e. is it really related to TBO's?
17:26 imirkin_: brb
17:27 Benau_: wait
17:27 Benau_: i'll now try to make it avoid uploading the skin matrix
17:27 Benau_: every frame
17:29 Benau_: i think it's related to tbo, mainly because if i use the gles build it works fine
17:29 Benau_: in gles it uses a 2d texture for the skinning matrices
17:32 Benau_: and it seems that even if i don't upload the skinning matrices every frame it still doesn't show the animated model
17:32 Benau_: so maybe not flush related?
17:34 Benau_: also if i edit the skinning shader to set joint_matrix = mat4(1.0); after the texelFetch, the static pose of kiki is shown
17:35 Benau_: (texelFetch failed somehow?)
17:35 Benau_: https://github.com/supertuxkart/stk-code/blob/master/data/shaders/sp_skinning.vert#L111
17:41 Benau_: also even if i sample texelFetch(skinning_tex, 0) it still shows nothing
17:49 imirkin_: Benau_: what if you have it just return vec4(1,1,1,1) instead of the texelFetch?
17:49 imirkin_: (i.e. i dunno that texelFetch is the actual issue.)
17:49 Benau_: well i think doing joint_matrix = mat4(1.0); after the texelFetch is the same?
17:50 imirkin_: oh, i didn't read the shader
17:50 imirkin_: sorry :)
17:50 Benau_: ok
17:50 imirkin_: hmmmm
17:50 imirkin_: and with the joint_matrix = mat4(1) it "works"?
17:51 imirkin_: (i.e. you get a white model or whatever it should be)
17:51 Benau_: no it shows a static pose
17:51 Benau_: not animated
17:51 imirkin_: should it be animated?
17:51 imirkin_: (given that the joint_matrix == 1 always)
17:51 Benau_: yes if you get the correct matrices from skinning_tex
17:51 imirkin_: oh, what if you do
17:51 Benau_: just like the correct apitrace replay
17:52 imirkin_: joint_matrix = mat4(i_weight[0] + i_weight[1] + i_weight[2] + i_weight[3])
17:52 Benau_: wait
17:53 imirkin_: i.e. the equivalent of a vec4(1) being returned for each of the texelFetch's
17:54 imirkin_: also ... can i assume that there's no funny float business in that RGBA32F texture? like no denorms that you expect to come out as 0's, no nan/inf, etc?
17:55 Benau_: joint_matrix = mat4(i_weight[0] + i_weight[1] + i_weight[2] + i_weight[3]) shows a static pose
17:55 Benau_: and i'm sure there's no nan in the tbo
17:56 imirkin_: well phooey
18:18 Benau_: btw does mesa always unroll a for (int i = 0; i < 4; i++) loop the way my skinning shader does?
18:18 Benau_: (if 4 is constant)
18:23 Benau_: time to sleep, see ya tmr
19:33 dagb: imirkin_/Lyude: therm: Clockgating enabled
19:34 Lyude: dagb: nice :)
19:57 imirkin_: dagb: getting lower power usage?
20:16 dagb: imirkin_: nothing major at idle, anyway. Not that I expected it.
20:17 dagb: Powertop claims 19.8W at idle with 10% backlight.
20:23 imirkin_: o well
20:23 imirkin_: oh, but when the gpu's suspended, clockgating doesn't matter :)
20:25 karolherbst: well at 20W you want to hope that your GPU is still on :p
20:25 karolherbst: otherwise you have other serious problems