11:28 karolherbst: Lekensteyn: heh... I can reproduce the runpm bug on a desktop now
18:32 ReinUsesLisp: Hi, how does nouveau handle ARB_clip_control's flipped sampling on nvc0?
18:32 ReinUsesLisp: I'm not talking about the flipped rendering, I've seen that it uses a positive viewport instead of OpenGL's negative
20:35 imirkin_: ReinUsesLisp: nvc0 doesn't really care about any of that stuff
20:35 imirkin_: ReinUsesLisp: st/mesa normalizes it all, the values Just Work (tm)
20:36 imirkin_: the only bit of ARB_clip_control that requires driver support is the half-z stuff, which is a rasterizer setting (rast->clip_halfz)
20:36 imirkin_: basically it controls whether depth is -1..1 or 0..1
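The depth-range choice described above can be sketched as follows. The two formulas are the standard OpenGL window-depth mappings for the -1..1 and 0..1 clip-space conventions. The function name `window_depth` and its `halfz` parameter are hypothetical names chosen for illustration, not nouveau or Gallium API.

```python
def window_depth(z_ndc, near=0.0, far=1.0, halfz=False):
    """Map normalized-device depth to window depth.

    halfz=False: clip-space depth spans -1..1 (classic OpenGL).
    halfz=True:  clip-space depth spans  0..1 (the "half-z" mode).
    `near` and `far` are the depth-range values (glDepthRange style).
    """
    if halfz:
        # 0..1 convention: plain scale and offset
        return (far - near) * z_ndc + near
    # -1..1 convention: remap to 0..1 first, then apply the depth range
    return (far - near) / 2.0 * z_ndc + (near + far) / 2.0


if __name__ == "__main__":
    for z in (-1.0, 0.0, 1.0):
        print(z, window_depth(z), window_depth(z, halfz=True))
```

With the default depth range, z_ndc = 0 lands at the middle of the depth buffer in the -1..1 mode and at the near end in the 0..1 mode.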
20:37 imirkin_: ReinUsesLisp: why do you ask?
20:37 imirkin_: (the hardware *does* have some setting to flip y coords somewhere, but we never use it)
20:38 imirkin_: (there's also a special reg where the setting of that value may be retrieved in a frag shader)
20:39 ReinUsesLisp: while emulating nvc0 hardware I'm getting "flipping" issues that are different on OpenGL and on Vulkan, on OpenGL at the moment I'm flipping gl_Position.y at the end of the vertex shader while on Vulkan I use a negative viewport
20:39 ReinUsesLisp: yes, Y_NEGATE flips the value on an S2R register
20:39 ReinUsesLisp: it doesn't seem to affect rendering though
20:39 ReinUsesLisp: what's the other register?
20:39 imirkin_: the Y_NEGATE is the thing i'm talking about
20:39 imirkin_: note that there's more to the coordinate flip
20:40 imirkin_: you also need to know the window width/height
20:40 imirkin_: since the ultimate coord is height - y
20:40 ReinUsesLisp: does Y_NEGATE affect rendering or just S2R?
20:40 imirkin_: i'm not 100% sure what the proper method of usage of this feature is
20:40 ReinUsesLisp: oh, we are not using the window coordinates at all
20:40 imirkin_: tbh, i am not 100% sure.
20:40 imirkin_: i've never used it
20:40 imirkin_: nor have i analyzed its usage
20:41 ReinUsesLisp: the blob uses Y_NEGATE on OpenGL to implement dFdy
20:41 imirkin_: i just know there's a flag for it on some method somewhere, and the special reg which retrieves its value.
20:41 imirkin_: for a winsys fb, yeah
20:41 ReinUsesLisp: I think it might also use it on NVN when it's using LOWER_LEFT
20:42 ReinUsesLisp: but don't quote me on that one since shaders are precompiled there
20:42 imirkin_: this is handled with uniforms in mesa
20:42 imirkin_: which get set based on the current state of things
20:42 imirkin_: https://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/state_tracker/st_glsl_to_tgsi.cpp#n1538
20:42 imirkin_: since it's generic code, it can't rely on the Y_NEGATE stuff
20:42 imirkin_: and it's not like this is perf-sensitive
20:43 imirkin_: so we've never bothered to care.
20:43 imirkin_: you also need this for interpolateAtOffset
20:43 ReinUsesLisp: how's the window height related to the viewport height?
20:44 imirkin_: i use them interchangeably
20:44 imirkin_: https://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/state_tracker/st_glsl_to_tgsi.cpp#n6415
20:44 imirkin_: er, i meant: https://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/state_tracker/st_glsl_to_tgsi.cpp#n6334
20:44 imirkin_: which will internally affect gl_Position.xy
20:45 imirkin_: (er, just .y obviously)
20:46 ReinUsesLisp: nice, I'll investigate what the blob does on OpenGL and NVN with window coordinates
20:46 imirkin_: and by gl_Position i of course mean gl_FragCoord
20:46 imirkin_: heh
20:46 ReinUsesLisp: hehe, I get you, it's the same abuf address
20:48 ReinUsesLisp: does st/mesa flip in-shader texture coordinates depending on its ARB_clip_control state?
20:48 imirkin_: no
20:48 imirkin_: it always multiplies by a uniform
20:49 imirkin_: which will either contain 1.0 or -1.0
20:49 imirkin_: depending on the flip state
20:49 imirkin_: so it's very much like Y_NEGATE
20:49 imirkin_: afaik, that reg will contain the float 1.0 or -1.0. not 100% sure though.
20:49 ReinUsesLisp: but, does it do that in the shader?
20:49 imirkin_: sure
20:49 imirkin_: look at the first link for e.g. dFdy
20:49 ReinUsesLisp: thanks, wanted to confirm
20:51 imirkin_: it's also done for interpolateAtOffset
20:51 imirkin_: and for adjusting gl_SamplePosition.y
20:51 imirkin_: (i think)
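A simplified model of the uniform-based flip discussed above, as a sketch rather than the actual st/mesa code: a sign value of 1.0 or -1.0 is multiplied into dFdy, and the fragment y coordinate is mirrored against the framebuffer height when the flip is active. All function names here are hypothetical, and which state maps to -1.0 is an assumption made for the example.

```python
def flip_sign(y_flipped):
    """Value the flip uniform would hold: -1.0 when flipped, else 1.0 (assumed mapping)."""
    return -1.0 if y_flipped else 1.0


def frag_coord_y(y, height, y_flipped):
    """Mirror a fragment's y coordinate against the framebuffer height when flipped."""
    return height - y if y_flipped else y


def dfdy(raw_dfdy, y_flipped):
    """Apply the flip sign to a raw screen-space y derivative."""
    return raw_dfdy * flip_sign(y_flipped)


if __name__ == "__main__":
    print(frag_coord_y(0.5, 600, True))  # pixel center of the first row, mirrored
    print(dfdy(2.0, True))
```

The same sign would also be applied to the y component of an interpolateAtOffset offset, which is why the log mentions it alongside dFdy.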
21:11 ReinUsesLisp: what does TRIANGLE_RAST_FLIP do? we are hacking it as a front face flip at the moment
21:14 imirkin_: no, i think that's something else
21:14 imirkin_: where is that?
21:14 ReinUsesLisp: https://github.com/envytools/envytools/blob/715cba01cb983fed0c856382a66943f734e6edc2/rnndb/graph/gf100_3d.xml#L692
21:15 imirkin_: so, faceness CW vs CCW is handled elsewhere
21:15 imirkin_: this, i can only imagine, means flipping the rasterization so that it goes bottom-up
21:16 imirkin_: and then you don't have to mess around with the height thing for gl_FragCoord
21:17 imirkin_: unfortunately the people who did the RE on all this are no longer around, and i don't quite know what it all means myself =/
21:17 imirkin_: maybe mwk knows?
21:19 imirkin_: ReinUsesLisp: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nvc0_state.c#n281
21:23 imirkin_: ReinUsesLisp: btw, as you guys discover stuff, let us know too :)
21:23 ReinUsesLisp: taking a look, the blob seems to use different VIEW_VOLUME_CLIP_CTRL's UNK11 values
21:24 ReinUsesLisp: yes, sure :P
21:24 imirkin_: the view volume thing is ... quite confusing
21:24 imirkin_: there's a value for a hard 0..1 view volume (on depth) which probably makes sense for a winsys buffer.
21:24 imirkin_: dunno what that unk11 thing does precisely...
21:25 imirkin_: it's so hard to test these things.
21:27 Lekensteyn: karolherbst: huh, how? the link speed issue? How can you turn off the power without help from ACPI?
21:27 karolherbst: imirkin_: btw, I will buy a jetson nano
21:27 karolherbst: Lekensteyn: because it's just a pci config register to turn off the link
21:27 karolherbst: anyway, I wasn't able to reproduce the issue after all, my resume path was broken
21:27 karolherbst: _but_
21:27 karolherbst: I am able to shut off the link
21:28 karolherbst: Lekensteyn: ACPI just uses the bridge + 0x248 bit 0x80 to turn it off and 0x100 to turn it back on
21:28 karolherbst: and you can do the same on a desktop system
21:28 karolherbst: no idea if the power is cut as well though
21:28 karolherbst: the GPU isn't accessible anymore at least
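The bit values mentioned above can be written down as a small sketch. Only the offset 0x248 and the bits 0x80 and 0x100 come from the log. The helper names, and the assumption that the bits are set with a read-modify-write of the current register value, are illustrative guesses. The code only computes values and does not touch any hardware.

```python
# Offset in the PCI bridge's config space, as quoted in the log.
LINK_CTRL_OFFSET = 0x248
# Bit that, per the log, ACPI sets to turn the link off.
LINK_OFF_BIT = 0x80
# Bit that, per the log, ACPI sets to turn the link back on.
LINK_ON_BIT = 0x100


def request_link_off(value):
    """Return the register value with the 'off' bit set (assumed read-modify-write)."""
    return value | LINK_OFF_BIT


def request_link_on(value):
    """Return the register value with the 'on' bit set (assumed read-modify-write)."""
    return value | LINK_ON_BIT


if __name__ == "__main__":
    current = 0x0  # placeholder; a real value would be read from the bridge
    print(hex(LINK_CTRL_OFFSET), hex(request_link_off(current)))
```

Whether the bits need to be cleared again afterwards, or whether the write has to be paired with other steps, is not stated in the log.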
21:29 karolherbst: anyway
21:29 karolherbst: that gave me an idea I want to test on my mac mini to just power down the nvidia GPU and see if that reduces heat generation :)
21:30 Lekensteyn: hm, if only documentation was available for those registers...
21:30 karolherbst: yeah...
21:30 karolherbst: _but_
21:30 karolherbst: the ACPI code of the desktop contains references to the Q0L2 field as well
21:30 karolherbst: so....
21:30 karolherbst: actually most of the code is the same as on a laptop
21:30 karolherbst: just the GPU power resource stuff is missing
21:31 karolherbst: Lekensteyn: only problem is, the CPU is a coffee lake one
21:31 karolherbst: so no idea if they just fixed the issue there
21:31 karolherbst: or it doesn't happen on a desktop
21:31 karolherbst: on cannon lake the issue is fixed for sure
21:31 karolherbst: I already tested it on two laptops
21:32 karolherbst: and runpm works on cannon lakes
21:32 karolherbst: so right now I blame sky lake and kaby lake
21:33 karolherbst: Lekensteyn: anyway, I will try and see if there are some errata under NDA or something available and maybe we can get something worked out here
21:34 karolherbst: windows obviously doesn't show this issue
21:36 karolherbst: imirkin_: and I would like to have a post merge mesa CI runner running tests on the nano
21:36 karolherbst: probably only the CTS for now as this is much simpler to manage than piglit
21:37 imirkin_: karolherbst: cool
21:37 karolherbst: and if that works out, I might buy two or three more
21:38 imirkin_: hopefully on RH's dime
21:38 karolherbst: the nanos are like $100 + shipping/taxes
21:38 karolherbst: sure
21:38 karolherbst: I already got informal approval for the one
21:39 karolherbst: and they don't have on-chip storage, so also a microSD card is needed, but those are quite cheap as well
21:39 airlied: expects skeggsb has one sitting unused in a box :-P
21:39 karolherbst: :D
21:39 karolherbst: no idea, don't think so
21:39 karolherbst: or maybe
21:39 karolherbst: anyway, I would like to build some very basic CI for nouveau for that
21:39 karolherbst: and if that works out, it should be easier to get funding for other chipsets
21:39 karolherbst: just that the other chipsets require more money
21:40 imirkin_: what's the nano? maxwell?
21:40 karolherbst: yeah
21:40 karolherbst: https://developer.nvidia.com/embedded/jetson-nano-developer-kit
21:40 karolherbst: 128 cores :D
21:40 airlied: the biggest problem with CI is once you have it you need throughput
21:40 airlied: at which point just getting an x86 machine and a GPU wins
21:40 karolherbst: well depends on what your goal is
21:40 airlied: esp if you are compiling on it
21:40 karolherbst: at first post merge is totally fine
21:41 karolherbst: ohh, I don't plan on compiling on that one
21:41 karolherbst: my initial idea is to build some docker images on some machine and just push them to the nano
21:41 karolherbst: and the nanos are just running those to run the tests
21:41 airlied: but CTS does some stupidly cpu intensive crap as well
21:41 karolherbst: yeah
21:41 karolherbst: that's why multiple nanos
21:41 karolherbst: but
21:41 karolherbst: if the nano is just able to do one run per day, so be it
21:42 karolherbst: it's better than what we have today
21:42 karolherbst: and having a daily report on whether we regressed or not is good enough :)
21:42 karolherbst: anyway, it's $150 for getting something working
21:42 karolherbst: and if it doesn't work out, it's just $150
21:42 karolherbst: and I can use the board for other testing