01:40 imirkin: karolherbst: so?
03:41 Lyude: karolherbst: heh-I think I got the brightness issues fixed. I think the GPU just handles saving/restoring backlight state on the DPCD itself
03:41 Lyude: because the registers appear to be set when turning the screen back on, not sure about suspend/resume yet though
04:15 Lyude: oh i see, it's even weirder than that. you just enable the backlight before the panel is actually powered on?
04:16 Lyude:wonders if we're actually even talking to the panel connected to the gpu, or just some microcontroller talking to the actual panel
10:11 karolherbst: imirkin: ahh.. I set up everything on the jetson but forgot to run the tests... :D
14:14 karolherbst: imirkin: the tests fail on gm20b here
14:15 fincs: Is that nouveau userland + nouveau kernel driver?
14:15 karolherbst: yes
14:15 fincs: Wait a minute, gm20b is tegra
14:16 karolherbst: yes
14:16 karolherbst: I have a jetson nano
14:16 fincs: I thought tegra stuff uses the Linux4Tegra + nvidia kernel bits?
14:16 karolherbst: we have support for that in nouveau
14:16 karolherbst: nvidia people worked on that :p
14:16 fincs: Oh
14:16 karolherbst: but yeah. I have my own kernel build
14:17 fincs: Is it possible to test with nvidia's kernel driver, in order to know for sure who is responsible for the test failing?
14:17 karolherbst: I already tested with nvidia's driver on my gp107
14:17 karolherbst: and there it passed
14:18 karolherbst: ohh you mean nouveau userland + nvidia kernel driver
14:18 karolherbst: mhhh
14:18 karolherbst: no
14:18 fincs: Yes I mean that
14:18 fincs: nouveau userland + nvidia kernel
14:18 karolherbst: in theory nvidia userland + nouveau kernel driver would work, but I don't know where to get the proper binaries from
14:19 fincs: I should bite the bullet and finally get around to rebasing the switch changes on top of a more recent mesa so that I can actually test imirkin's branch on Switch
14:19 karolherbst: but I doubt it supports gm20b anyway
14:19 karolherbst: fincs: where did you test it on before?
14:19 karolherbst: I thought the stuff worked for you
14:19 fincs: Yes, it worked for me but I only tested the compiler bits alongside my non-nouveau GPU code
14:20 karolherbst: ahh, I see
14:20 fincs: That's why I was saying that the compiler bits are correct in imirkin's branch
14:20 karolherbst: I did an mmt trace and gave it to imirkin, but he meant everything is fine in the driver as well :/
14:20 karolherbst: I am sure it's something stupid somewhere
14:22 fincs: Yeah it has got to be something stupid
14:23 fincs: I checked nouveau's gpu init code (nvc0_screen) yesterday too, and I couldn't find anything wrong with it
14:23 karolherbst: anyway.. I will test astc stuff now :)
14:24 fincs: Cool :)
14:24 karolherbst: we only enable it for gk20a right now
14:24 karolherbst: would be cool to enable it for gm20b as well
14:24 fincs: I'm sure it will work
14:24 karolherbst: ohh and etc as well
14:24 karolherbst: yeah... maybe
14:24 fincs: Because official Switch stuff uses ASTC
14:25 karolherbst: ohh, it's more about nouveau's code
14:25 fincs: However, ETC stuff seems to be missing from what official devs are expected to use
14:25 fincs: I still think it's there
14:25 fincs: But it would be nice to have actual confirmation
14:25 karolherbst: I shouldn't compile stuff on the jetson nano :D
14:25 karolherbst: it's so slow
14:25 fincs: IIRC you only need to change a single check to enable ASTC/ETC stuff in nouveau for gm20b
14:25 karolherbst: and even under full load it consumes like 5W
14:25 karolherbst: fincs: yeah.. but... there can always be random silly things
14:25 fincs: My default development workflow is cross compilation :p
14:26 karolherbst: cross compilation is annoying though :/
14:26 fincs: Nah
14:26 karolherbst: I'd rather set up a compile server
14:26 fincs: I eat crossbuilds for breakfast
15:40 fincs: Alright, that was an incredibly painful rebase
15:40 fincs: But it's done
15:40 fincs: But... does it build at all? ( ͡° ͜ʖ├┬┴┬┴
15:49 fincs: "Dependency libdrm_nouveau found: NO found 2.4.75 but need: '>=2.4.81'" <-- heh, time to bump our fake libdrm_nouveau version I guess
15:50 imirkin: karolherbst: thanks for testing
15:50 imirkin: i sorta expected that, but nice to confirm
16:00 imirkin: karolherbst: do you need me to give you a patch for the ASTC/ETC bits, or you can work it out yourself?
16:00 imirkin: just a small change in nvc0_screen.c iirc
16:01 imirkin: the bigger deal is getting deqp built :)
16:01 imirkin: to test
16:01 karolherbst: yeah.. deqp is more annoying
16:01 karolherbst: I already know what to change in mesa
16:01 karolherbst: deqp is still building...
16:02 fincs: Hmm, what happened to #include "util/u_format.h"?
16:03 fincs: Getting non-existent header error in winsys/nouveau/switch
16:03 imirkin: it moved maybe
16:03 imirkin: the format stuff got a significant overhaul
16:03 fincs: Looks like it's util/format/u_format.h now
16:04 imirkin: btw, i wouldn't at all be opposed to having a second winsys upstreamed for nouveau
16:04 imirkin: i dunno how many changes you had to make to core stuff, but if it's reasonable and abstractable, this should all be upstreamable
16:05 fincs: It's non-trivial
16:05 imirkin: ok
16:05 fincs: https://github.com/devkitPro/mesa/tree/switch-19.0.0
16:05 karolherbst: fincs: well.. you essentially just need to port over to have the changes be a "platform" no?
16:06 fincs: Also, upstreaming stuff related to homebrew toolchains isn't exactly a great idea
16:06 fincs: Especially considering Nintendo is in Khronos
16:06 HdkR: Why not? Just stick some Switches in CI :P
16:06 fincs: Homebrew stuff as a whole isn't supposed to exist
16:06 karolherbst: fincs: well at least the offset stuff could be upstreamed
16:06 karolherbst: one patch less
16:06 fincs: Yeah there are small things that can be upstreamed
16:07 karolherbst: fincs: as long as you don't violate laws you are fine
16:07 karolherbst: even patent won't matter
16:07 karolherbst: *patents
16:07 fincs: Technically running custom code on Switch violates DMCA
16:07 karolherbst: and?
16:07 imirkin: ... but also don't put yourself in legal jeopardy
16:07 karolherbst: you don't have to ship binaries with this being enabled
16:07 imirkin: ultimately cases are won not by who's right, but by who has more money
16:07 karolherbst: yeah, that's true
16:07 fincs: That's the thing
16:08 karolherbst: fincs: but the mesa bits won't fall under dmca
16:08 imirkin: and while i don't know your financial situation, i hope you won't be offended by the suggestion that nintendo has more :)
16:08 karolherbst: as there is literally no code to get around anything
16:08 fincs: Hmm meh, EGL internal interface changed yet again
16:09 karolherbst: you just target your own API, no?
16:09 karolherbst: how does nintendo come into play here? ;)
16:09 fincs: Ninty is in Khronos
16:09 karolherbst: and?
16:09 karolherbst: you still ohnly target your own API
16:09 karolherbst: *only
16:09 fincs: Our "own" API which runs on their device, which legally speaking shouldn't even exist
16:09 karolherbst: it could be an "embedded platform API for embedded devices" :p
16:10 karolherbst: and you call the platform embedded
16:10 karolherbst: or something
16:10 karolherbst: but mhh, the winsys code might be a bit more annoying as you'd target an API derived from nvidia's stuff
16:11 fincs: The nvidia stuff is in our fake libdrm_nouveau
16:11 karolherbst: ohh.. I see
16:11 imirkin: fincs: btw, you should also get GL_EXT_texture_shadow_lod in the update :)
16:11 karolherbst: might be good to have a list of libnx APIs used by your stuff
16:11 fincs: And our fake libdrm_nouveau uses wrapper objects coming from libnx (our main switch support system library), which in turn talk to Nvidia objects through ioctl
16:11 fincs: imirkin: I need to catch up with stuff :)
16:11 karolherbst: well.. from inside mesa
16:15 fincs: "remove boolean from state tracker APIs" <-- lmao this is the reason why our EGL driver is so broken
16:15 karolherbst: fincs: why?
16:15 imirkin: that was me, right? :)
16:16 fincs: boolean != bool and compiler complains that the function pointer isn't the same type
16:16 imirkin: boolean = char
16:16 karolherbst: imirkin: that's not required
16:16 imirkin: there's a "typedef boolean char"
16:16 imirkin: while in the larger world it's not required, it definitely was the case in mesa.
16:16 karolherbst: their bool could be something else
16:16 imirkin: (and still is, i didn't kill all the usage, just in the common api's)
16:17 imirkin: bool is _Bool
16:17 karolherbst: fincs: what's wrong with your bool btw?
16:17 fincs: Hey, st_context_iface::flush got a new parameter, and this "notify_before_flush_cb" thing sounds handy for being a better place for our fence code... might consider in the future
16:18 fincs: Mkay, after many many fixes this built
16:18 fincs: ... But does it work? :p
16:18 imirkin: as we all know, anything that compiles *must* work.
16:18 HdkR: There is never a bug, only features
16:19 fincs: I had problems with EGL driver before, like over a year ago
16:19 imirkin: and boy does nouveau have features!
16:19 fincs: It built fine but blew up at runtime :)
16:20 fincs: pkgconfig .pc files look fine
16:20 fincs: libEGL.a looks a bit... slim
16:22 fincs: Hmm what happened to libmesa_util?
16:22 fincs: Did that get broken up or something?
16:22 imirkin: poof
16:22 imirkin: where'd it go?
16:23 fincs: Ah, it's now libmesa_common
16:24 fincs: Ah there we go, libEGL.a more like it
16:25 karolherbst: imirkin: the ETC and ASTC tests are all gles2+, right?
16:26 imirkin: in deqp? or piglit?
16:26 karolherbst: deqp
16:26 imirkin: deqp is all gles2+ yea
16:26 karolherbst: okay, cool
16:26 imirkin: but ... they might only be in some specific "api"'s tests
16:29 fincs: Aaaaand I get a bunch of NIR related linker errors, great
16:30 fincs: Time to debug this
16:30 imirkin: just pushed the NV_viewport_swizzle stuff to master
16:31 fincs: :)
16:32 imirkin: tagr: under the guise of making it work on GM20B ... I'm trying to get GL_NV_viewport_array2 working. I've done everything right, and yet it doesn't work. so ... clearly I didn't do something right. Can you see if there's a trick to making gl_ViewportMask[] actually work for viewports > 1? output 0x3a0 in vertex/tess/gs shaders.
16:33 imirkin: tagr: right now my thinking is that it's something in gr init
16:38 karolherbst: imirkin: huh.. astc doesn't seem to work with mesa master :/
16:38 karolherbst: the deqp tests are just failing
16:38 imirkin: karolherbst: uhhhhh
16:38 imirkin: that's surprising.
16:39 karolherbst: dEQP-GLES31.functional.copy_image.compressed.viewclass_astc_4x4_rgba.rgba_astc_4x4_khr_rgba_astc_4x4_khr.texture2d_to_texture2d eg
16:39 imirkin: used to work
16:39 imirkin: https://hastebin.com/uzahuduler.bash
16:40 karolherbst: ohhh
16:40 karolherbst: that's because GLX and EGL are still broken
16:40 karolherbst: tagr: ^^
16:42 karolherbst: mhhh.. maybe it would work with a newer kernel, but I highly doubt it
16:42 imirkin: older is more likely :)
16:42 imirkin: before the modifier stuff
16:42 karolherbst: ahh, yeah
16:43 karolherbst: but I think tagr debugged that once and wasn't able to reproduce it
16:43 karolherbst: or something
16:43 karolherbst: dunno
16:43 karolherbst: it's broken for me
16:43 fincs: Aaaaand finally, it links successfully
16:43 imirkin: karolherbst: can you just comment out the modifier junk?
16:43 imirkin: or is it kernel-level?
16:44 karolherbst: no, it's in mesa
16:44 fincs: Simple triangle program works on Switch emulator
16:45 karolherbst: imirkin: I just check if my kmsro branch still works
16:45 karolherbst: and just use that
16:46 fincs: Venerable es2gears program also works on emulator
16:49 fincs: Works on hardware
16:49 fincs: Later today I'll try testing the viewport mask stuff
16:50 fincs: Now that I have a working switch-mesa built off imirkin's branch :)
16:51 imirkin: cool
16:53 karolherbst: fincs: you should rebase more often :p
16:53 karolherbst: at least every latest release
16:56 karolherbst: imirkin: btw, the assert I hit is this one: deqp-gles31: ../src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c:258: nvc0_validate_fb: Assertion `!fb->zsbuf' failed.
16:56 karolherbst: I am wondering what we actually need to do to fix this issue
16:56 imirkin: it's not fixable
16:56 imirkin: that case is just not supported
16:56 imirkin: can't have linear scanout with zeta
16:57 imirkin: er, linear color
16:57 karolherbst: okay.. so what has the tegra driver to do to not run into this issue ;)
16:57 imirkin: probably copies
16:57 imirkin: can you try with DRI2?
16:58 imirkin: LIBGL_DRI3_DISABLE=1
16:58 karolherbst: deqp-gles31: ../src/gallium/state_trackers/dri/dri2.c:560: dri2_allocate_textures: Assertion `drawable->textures[statt]' failed.
16:58 imirkin: heh
16:58 imirkin: different error! :)
16:58 karolherbst: :)
16:59 karolherbst: at this point I'd really just merge the kmsro stuff because that's at least working and using the tegra driver has no benefit as it is today afaik
16:59 karolherbst: but...
17:00 karolherbst: tagr didn't want to switch over because we could make use of hw accel stuff for certain things
17:01 karolherbst: anyway, I don't really care how we fix this, just that it gets fixed and we don't leave a broken driver in the tree for over a year
17:06 karolherbst: maybe I rework the commit in a way so we can decide at runtime what to use... dunno how hard that would be though... probably very
17:07 imirkin: gnurou did send me a TK1 board
17:08 karolherbst: :)
17:08 imirkin: unfortunately any kernel i try running on it causes it to die once there's any real ethernet activity
17:08 imirkin: l4t seems fine
17:08 karolherbst: ehhh
17:08 karolherbst: my jetson nano is fine
17:08 karolherbst: with an upstream kernel
17:08 imirkin: so dunno if i'm doing something wrong
17:08 imirkin: or ... what
17:08 imirkin: but i never got it to work in a stable manner. a little unfortunate =/
17:08 karolherbst: maybe you use the wrong dts file :p
17:09 imirkin: this was a while ago, perhaps whatever the issue was is fixed now
17:09 karolherbst: imirkin: I have my own .config file though, maybe that would help
17:09 karolherbst: the tk1 and the nano might be not that different
17:09 karolherbst: uses r8169 on mine
17:10 karolherbst: I still run a 5.4 kernel btw... some arm tree actually, because of some... fixes
17:10 karolherbst: I should update my kernel :)
17:12 imirkin: yeah, it's a standard PCIe device
17:12 imirkin: but something dies
17:13 imirkin: and it seems to only happen when i start using nfs a lot
17:13 karolherbst: maybe not enough power
17:13 imirkin: i'm using the given power supply
17:13 karolherbst: I have this issue as well
17:13 imirkin: and it seems to work with l4t
17:13 karolherbst: yeah... probably not :p
17:13 karolherbst: l4t draws less power
17:13 imirkin: ah
17:13 karolherbst: because they are able to lower power consumption of the GPU ;)
17:13 karolherbst: but if you get close to the limit the device can just shut off
17:13 karolherbst: now I am on PoE which has enough
17:13 karolherbst: USB only gave me like 2A
17:14 imirkin: huh, ok
17:14 imirkin: well there's a separate power brick that they gave me
17:14 imirkin: (that came with it)
17:14 karolherbst: and through PoE the board draws up to 4A or something
17:14 karolherbst: yeah.. maybe it's different in your case
17:14 karolherbst: the nano is super low power
17:14 imirkin: not sure the TK1 can run off PoE
17:14 imirkin: it's a "big" board... fan and everything
17:14 imirkin: mini-itx or whatever
17:15 karolherbst: my switch can deliver up to 30W :)
17:15 imirkin: make that nano-itx.
17:17 imirkin: anyways ... i hope tagr or skeggsb will have something clever to say about this gl_ViewportMask thing. so annoying to be SO close.
17:17 karolherbst: now the test passes with kmsro :)
17:18 imirkin: karolherbst: any problem with having both tegra and kmsro enabled? it'd pick tegra by default, but could force kmsro via GALLIUM_DRIVER or whatever?
17:18 karolherbst: they install into the same file.. but maybe
17:19 karolherbst: I don't really know how all of that works, but kmsro is a bit messy
17:19 karolherbst: imirkin: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2960/diffs
17:20 karolherbst: but yeah, maybe
17:21 karolherbst: anyway running the astc tests without hw support and later with hw support :)
17:25 karolherbst: imirkin: does deqp support gbm btw? I kind of don't really know what their surfaceless target is...
17:26 karolherbst: would be cool if we could run deqp without having to spawn an X server
17:27 karolherbst: also.. do we even need kmsro/tegra for wayland? I think this is really only required for X, no?
17:42 imirkin: karolherbst: i think with the (semi)recent EGL work, surfaceless should work fine
17:42 imirkin: i.e. egl + surfaceless platform
17:42 imirkin: karolherbst: so ... my question is ... could you do that kmsro enablement WITHOUT nuking tegra?
17:43 imirkin: i think it would work fine
17:43 imirkin: it'd still load tegra_dri.so by default. but you could force kmsro with GALLIUM_DRIVER=kmsro
17:43 imirkin: the whole problem is happening due to less-copies, so it's a good problem to have
17:52 karolherbst: imirkin: kmsro would also install a tegra_dri.so file
17:52 karolherbst: because X tells how the driver is called ;)
17:52 imirkin: all the *_dri.so files are the same file
17:52 imirkin: they're hard-linked together
17:52 karolherbst: mhhh.. also true
17:53 imirkin: however X would use the tegra driver
17:53 imirkin: but clients could use kmsro if they wanted
17:53 karolherbst: actually.. let me check that
17:53 imirkin: but would continue to use tegra if they didn't change GALLIUM_DRIVER
17:53 karolherbst: ahh, it's also true for kmsro.. so yeah
17:54 karolherbst: imirkin: well.. the problem is, it's only broken with X in the first place
17:54 karolherbst: and with wayland/gbm we don't need a wrapper driver
17:54 karolherbst: afaik
17:54 imirkin: but X itself isn't broken
17:54 imirkin: so no problem.
17:54 karolherbst: true
17:54 imirkin: i'm just saying - it's a way to add the option without changing the status quo
17:54 imirkin: then both options can be available at once
17:55 karolherbst: the thing is just, longterm it makes no sense to care about the tegra driver
17:55 karolherbst: if people switch to wayland, then it's pointless afaik
17:55 karolherbst: although.. not sure about xwayland here
17:56 karolherbst: maybe that was the reason tagr didn't see any issues, maybe it works with xwayland
17:56 karolherbst: mhhhh
17:58 karolherbst: imirkin: btw, even with kmsro dri2 is broken, so I guess something is just broken there
17:58 imirkin: could be yea
18:13 imirkin: heh. the sample shader in NV_geometry_shader_passthrough is not valid.
18:14 imirkin: it's supposed to have locations explicitly assigned, but it doesn't
18:14 imirkin: fincs: can you confirm?
18:16 imirkin: fincs: also ... stupid question ... what's the point of the passthrough GS? what can you do in a passthrough GS that you can't in an earlier stage?
18:18 karolherbst: imirkin: isn't the point of the extension to make it easier to write passthrough geometry shaders?
18:18 cwabbott: karolherbst: I only know this because I had to do the same thing with freedreno, but yes, surfaceless is supported... gitlab-ci uses surfaceless so you can find the magic commandline arguments/env variables needed under .gitlab-ci/deqp-runner.sh
18:19 imirkin: karolherbst: in unextended GL, only GS can write viewport/layer, and only GS can "multiply" primitives
18:19 imirkin: karolherbst: however with GL_NV_viewport_array2, both of those are possible in all (non-frag) stages
18:19 karolherbst: ohh, I thought you were working on NV_geometry_shader_passthrough
18:19 imirkin: i am looking at it
18:20 imirkin: and trying to determine a use-case for when i'd use it in the presence of GL_NV_viewport_array2.
18:20 karolherbst: ahhh
18:20 karolherbst: cwabbott: okay, good to know
18:20 imirkin: only thing is if you have some weird calculation that you want to do once per primitive, instead of per vertex
18:20 karolherbst: the jetson is soooo slow :(
18:21 imirkin: but that calculation can't even read the whole primitive anyways...
18:22 karolherbst: imirkin: maybe the "Interactions with NV_geometry_shader_passthrough" part of the spec helps with your question?
18:23 imirkin: in NV_viewport_array2?
18:23 karolherbst: yes
18:23 imirkin: erm ... not at all
18:24 karolherbst: sad
18:24 imirkin: the question is, "why is the passthrough GS a useful construct given that NV_viewport_array2 exists"
18:24 karolherbst: ahhh
18:24 karolherbst: when do you need a passthrough gs?
18:24 karolherbst: I really don't know GL that well though
18:25 imirkin: a pure passthrough gs is needed when you want to specify layer/viewport for the primitive
18:25 imirkin: another common use-case is to "multiply" primitives, e.g. send a single prim to multiple viewports
18:26 karolherbst: mhhh
18:26 imirkin: aka the stuff that NV_viewport_array2 does :)
18:26 imirkin: which you can use in VS or TES
18:27 karolherbst: the 5th question in viewport_array2 also has some interesting information here
18:27 imirkin: again, can do that in any stage.
18:27 karolherbst: yeah.. mhh
18:27 karolherbst: dunno
18:28 imirkin: ohhhh wait
18:28 imirkin: wait
18:28 imirkin: wait.
18:28 imirkin: i think i totally misread this.
18:28 imirkin: the passthrough gs *does* have access to all the input vertices.
18:28 imirkin: but it just computes one set of stuff.
18:29 imirkin: yeah ok. and that also solves the arrayness issues. coz there are none.
18:29 imirkin: i like this.
18:29 imirkin: that makes a _lot_ more sense.
18:30 imirkin: and also makes it a lot easier to implement =]
18:30 karolherbst: imirkin: regarding the astc stuff.. can I speed up the testing by.. let's say I only check one test or a subset? Cause deqp takes a _lot_ of time here
18:30 karolherbst: or maybe I should compile with opts
18:30 imirkin: there are approximately a million astc tests
18:30 karolherbst: yeo
18:30 karolherbst: yep
18:31 imirkin: you could manually select a bunch of them
18:31 karolherbst: mhhh
18:31 karolherbst: then I need the list first
18:31 imirkin: also i doubt the software decode of astc is very fast
18:31 imirkin: the list is in one of the deqp-master.txt files
18:31 karolherbst: ahhh
18:31 karolherbst: makes sense
18:31 karolherbst: so a release build will probably help a lot
18:31 karolherbst: of both, mesa and deqp
18:31 imirkin: there's also some option to deqp to make it dump a list
18:39 karolherbst: ahh yeah.. deqp was built with -O0 :(
18:39 karolherbst: mesa with -O2 at least
18:50 fincs: imirkin: "it's supposed to have locations explicitly assigned, but it doesn't" <-- Spec says "in a separable program"
18:51 fincs: "what's the point of the passthrough GS?" <-- Nvidia calls it Fast GS
18:51 fincs: And I guess they shouldn't be lying when they call something "fast"
18:51 fincs: I.e. there's probably some crazy fast path for passthrough GS
18:52 karolherbst: fincs: I think "fast" is related to writing code though :p
18:52 karolherbst: but yeah... there is hw support for it, no?
18:52 fincs: "related to writing code" <-- I don't think so at all
18:52 fincs: Yes, this is something special in hardware
18:52 fincs: Special flag in shader header + some regs for specifying the passthrough mask
18:52 karolherbst: could be faster then
18:52 fincs: "why is the passthrough GS a useful construct given that NV_viewport_array2 exists" <-- I think Nvidia intends gl_ViewportMask to be used from passthrough GS mainly
18:53 karolherbst: if it wouldn't be, they would implement it in the shader
18:54 karolherbst: on the other hand.. why waste transistors on something nobody uses :p
18:54 karolherbst: would be interesting to know if nvidia optimizes some gs shaders though
18:54 fincs: https://webcache.googleusercontent.com/search?q=cache:NYyIIwMPcBoJ:https://gameworksdocs.nvidia.com/GraphicsSamples/CubemapRenderingSample.htm+&cd=1&hl=en&ct=clnk&gl=us
18:54 karolherbst: if a gs does plain copies from an input, you could use gs passthrough on that, right?
18:55 fincs: Not sure if Nvidia autodetects it, probably not
18:55 karolherbst: if not, it's not worth it
18:55 imirkin: my confusion arose from thinking that the passthrough GS couldn't access the whole input primitive
18:57 fincs: Found this: https://on-demand.gputechconf.com/siggraph/2016/presentation/sig1609-kilgard-jeffrey-keil-nvidia-opengl-in-2016.pdf
18:57 fincs: ^ explains passthrough gs
18:58 karolherbst: imirkin: btw, did you test PIPE_CAP_DRAW_INFO_START_WITH_USER_INDICES?
18:58 imirkin: given that it _can_ access the whole input primitive, it makes sense why it's useful
18:58 imirkin: karolherbst: a bit, yeah
18:58 imirkin: and also the code seemed like it worked fine
18:59 karolherbst: imirkin: the doc seems a bit off
18:59 imirkin: ?
19:00 karolherbst: pipe_draw_info::draw doesn't exist, or does it?
19:00 imirkin: heh, no
19:00 karolherbst: :p
19:00 karolherbst: ohhh wait
19:00 karolherbst: it asked about start
19:00 karolherbst: ehh, my mistake
19:00 imirkin: :)
19:01 imirkin: karolherbst: oh hey ... on the nvidia blob, can you run that viewport mask test again
19:01 imirkin: but get rid of this line:
19:01 imirkin: #extension GL_ARB_fragment_layer_viewport: require
19:01 imirkin: i want to see if blob turns on viewport index when GL_NV_viewport_array2 is supplied
19:03 karolherbst: Failed to compile VS: 0(10) : error C7531: global variable gl_ViewportMask requires "#extension GL_NV_viewport_array2 : enable" before use
19:03 karolherbst: ohh wait
19:03 karolherbst: ahh
19:03 karolherbst: wrong thing
19:04 karolherbst: imirkin: Failed to compile FS: 0(9) : error C7532: global variable gl_ViewportIndexIn requires "#version 430" or later
19:05 imirkin: gl_ViewportIndex
19:05 imirkin: oh, that's their stupid error
19:05 imirkin: ok
19:05 imirkin: thanks
19:06 imirkin: you *do* have the #extension GL_NV_viewport_array2: require in there, right?
19:06 karolherbst: yes
19:06 imirkin: ok thanks
19:07 imirkin: i'll remove the frag enable for that stuff then
19:07 imirkin: it wasn't clear in the spec
19:15 karolherbst: *sigh* https://github.com/karolherbst/envytools/commit/0b5cff6b699dc4ba15870776a2529c3ed287b3e0
19:15 karolherbst: could have been so easy
19:16 karolherbst: but no, you start with the fastest :p
19:16 karolherbst: ohh
19:16 karolherbst: I messed up the variant in the new ones
19:17 karolherbst: https://github.com/karolherbst/envytools/commit/1ffd92f9b51609ac32b98aad83f819cda4b326dc
19:17 karolherbst: (verified on TU116 or so)
19:17 karolherbst: dunno if it's actually like this on TU10X mhhh
19:19 imirkin: lol
19:19 karolherbst: but yeah, those regs are still writeable :)
19:19 imirkin: what does GV100 do?
19:19 imirkin: G80:GV100 doesn't include GV100
19:19 fincs: Argh apparently I cannot GL today
19:20 imirkin: fincs: that's better than me ... i can't do GL any day.
19:20 karolherbst: imirkin: ohh, you are right
19:20 imirkin: karolherbst: better do like G80:TU102
19:20 imirkin: and then TU102:
19:20 imirkin: or TU102-, doesn't matter
19:20 imirkin: that way it's obvious you didn't miss anything
19:20 fincs: Where did I fuck up? v
19:20 fincs: https://hastebin.com/uzatunuxiv.cpp
19:21 karolherbst: imirkin: but we usually use the X- syntax, no?
19:21 imirkin: you wrote GL, that was your first mistake
19:21 imirkin: karolherbst: usually the : syntax
19:21 fincs: Yes I know
19:21 karolherbst: mhhh <bitfield pos="4" name="DEVID_WR" variants="NV40-"/>
19:21 imirkin: fincs: what's not working?
19:21 fincs: Displays nothing, triangle doesn't show up, only clear color
19:21 karolherbst: I only see the - one
19:21 imirkin: karolherbst: ah you mean for the infinite end? maybe you're right
19:21 karolherbst: yeah
19:21 imirkin: fincs: that's not ideal.
19:21 imirkin: mesa or blob?
19:21 fincs: mesa/nouveau on Switch
19:21 karolherbst: : works as well though
19:22 fincs: I've tried compiling existing examples and they work perfectly
19:22 imirkin: if it's a debug build, should print various "Mesa Debug" things for when you screw things up
19:22 fincs: It's not
19:22 karolherbst: I keep the commit local though until I can test on TU10x
19:22 imirkin: fincs: core or compat context?
19:22 fincs: Core 4.3
19:23 imirkin: you have to bind a VAO
19:23 imirkin: no VAO = no draw
19:23 fincs: lol seriously?
19:23 fincs: That's so stupid
19:23 imirkin: should have had an error to that effect too
19:23 imirkin: perhaps it's an insufficiently-debug build
19:24 imirkin: you can also hook into the KHR_debug things
19:24 imirkin: so you get side-reports of various errors / potential issues
19:24 karolherbst: okay, I can check that on a TU104 actually
19:25 fincs: Okay I bind the dummy vao and it shows up
19:25 fincs: Now to write the actual test
19:25 karolherbst: MESA_DEBUG=1 :p
19:25 imirkin: =]
19:25 karolherbst: or was it MESA_DEBUG=context rather?
19:25 karolherbst: dunno
19:26 imirkin: it just prints them no matter what with a proper debug build
19:26 imirkin: to stderr
19:26 karolherbst: yeah, that's true
19:29 imirkin: INTERESTING
19:29 imirkin: if i just enable the viewport index output
19:29 imirkin: in the omask
19:29 imirkin: but don't actually write to it
19:29 imirkin: then it doesn't work
19:29 fincs: Viewport mask + index for me results in a GPU crash
19:29 imirkin: let's see what happens if i do the write without the omask...
19:30 fincs: Okay
19:30 fincs: gl_ViewportMask[0] = 0xf;
19:30 fincs: With nouveau/mesa running on Switch, this doesn't actually broadcast
19:31 imirkin: heh
19:31 imirkin: so we're doing something wrong at the shader / context setup level
19:31 imirkin: that's just SUPER DUPER
19:31 fincs: Writing gl_ViewportIndex + gl_ViewportMask results in a GPU Crash
19:31 imirkin: fincs: with nouveau/mesa too?
19:31 fincs: And my application gets booted out with an error message
19:31 fincs: Yes
19:32 imirkin: can you try to "bisect" the setup
19:32 fincs: Wdym
19:32 imirkin: well
19:32 imirkin: presumably there are a bunch of differences between nouveau/mesa and the working thing
19:32 fincs: The working thing uses my API, which is based off NVN
19:32 imirkin: right
19:32 imirkin: but ultimately it generates a pushbuf
19:32 imirkin: which gets sent to the gpu
19:32 fincs: gl_ViewportMask[0] = 2;
19:32 fincs: This results in no primitives being output
19:32 imirkin: yes
19:33 imirkin: that's the behavior i see as well
19:33 fincs: gl_ViewportIndex = 1; gl_ViewportMask[0] = 2; <-- This also results in crash
19:33 imirkin: anyways, perhaps we can copy whatever setup is being done by the nvn-based thing
19:33 imirkin: or do you not have access to that pushbuf?
19:33 fincs: Here, have fun: https://github.com/devkitPro/deko3d/blob/master/source/maxwell/gpu_3d_base.cpp#L59
19:34 imirkin: cool thanks
19:34 fincs: I already tried playing around with this init without success, and I couldn't get my thing to fail
19:34 imirkin: but perhaps you can get the nouveau thing to work? :)
19:34 imirkin: or alternatively
19:34 imirkin: perhaps you can comment out random bits of it
19:34 fincs: Commented out every single unknown thing and it would still work
19:34 imirkin: oh, if it's the SPA version, i'll kill someone
19:34 fincs: !!!
19:34 fincs: It might as well be
19:35 fincs: Does nouveau set SPA version at all?
19:35 imirkin: no
19:35 imirkin: it's never mattered for anything
19:35 fincs: Okay
19:35 fincs: So that's what I'm going to test
19:35 fincs: Please hold on tight
19:35 imirkin: the SPA version in the shader header is also set to some old thing
19:35 fincs: Yes
19:35 imirkin: i tried increasing that without effect
19:35 fincs: And I "fixed" that
19:36 fincs: Let me try patching nouveau to set SPA version
19:37 imirkin: interesting
19:37 imirkin: w << CmdInline(3D, VertexProgramPointSize{}, E::VertexProgramPointSize::Enable{});
19:37 imirkin: i could imagine that mattering too
19:37 imirkin: unlikely tho
19:38 fincs: BEGIN_NVC0(push, NVC0_3D(0x0310), 1); PUSH_DATA (push, 0x0503);
19:38 fincs: That should do it, right?
19:39 imirkin: something like that
19:39 imirkin: but not that
19:39 imirkin: try SUBC_3D(0x0310)
19:39 fincs: Ah
19:39 fincs: Good catch
19:40 fincs: Building
19:40 imirkin: could also be something like this: w << Cmd(3D, SetRenderLayer{}, E::SetRenderLayer::UseIndexFromVTG{});
19:40 imirkin: i did think of that though
19:40 imirkin: and tried force-enabling it
19:40 imirkin: with no effect
19:41 fincs: No dice
19:41 fincs: Maybe spa needs to be set both in reg and in shader header
19:41 imirkin: probably does
19:41 imirkin: i thought you already fixed up the shader header though?
19:41 fincs: Yes but not in nouveau code
19:41 imirkin: o
19:42 fincs: I generate it with slightly different code
19:42 fincs: It's... complicated
19:42 fincs: What did you do to fix it? 0x20061 -> 0x60061?
19:44 fincs: ^ did that, still no dice
19:44 imirkin: i forget
19:44 imirkin: i tried a few
19:44 fincs: - vp->hdr[0] = 0x20061 | (1 << 10);
19:44 fincs: + vp->hdr[0] = 0x60061 | (1 << 10);
19:45 fincs: Hmm
19:45 fincs: Viewport near/far
19:45 fincs: Are they initialized at all?
19:45 fincs: Oh right, it works with ViewportIndex alone so that shouldn't be the issue
19:47 fincs: Argh, what are we missing
19:48 imirkin: well this is odd.
19:48 fincs: At least we now know there's nothing wrong with kernel side init
19:48 fincs: And the problem is 100% within nouveau userland
19:49 imirkin:has an idea
19:49 imirkin: so an interesting observation
19:50 imirkin: is that if i do the write to ViewportIndex "first"
19:50 imirkin: it doesn't help
19:50 fincs: For me it crashes both ways
19:50 fincs: Before and after
19:50 fincs: And like I said, compiler side of things is working
19:52 imirkin: yeah, no, i got nothin'
19:54 fincs: Let me try forcing VP_POINT_SIZE to 1
19:54 fincs: Nope, that's not it
19:55 imirkin: w << Cmd(3D, Unknown514{}, 8 | (getDevice()->getGpuInfo().numWarpsPerSm << 16));
19:55 imirkin: what about something like that?
19:57 fincs: That's one of the unknowns, I tried commenting it out in order to make it fail and it would still work
19:57 imirkin: well, initial context state could differ between nouveau and however that thing is set up
19:58 fincs: Initial context is the same for both nouveau and deko3d when running on Switch
19:58 imirkin: ah right, good point
19:58 fincs: (because they both run off Nvidia blob)
19:58 fincs: Let me try provoking vertex
19:58 imirkin: can you just comment out like half of that setup3DEngine thing
19:58 imirkin: and see if that causes this to break
19:58 fincs: Yeah that's what I did originally
19:59 fincs: Everything with "unknown" commented out
19:59 fincs: As well as *all* of the firmware calls
19:59 imirkin: so ... more commenting needed then :)
19:59 imirkin: that's why i just said bisect
19:59 imirkin: literally comment out half
20:00 imirkin: unfortunately some things are probably required for any rendering
20:00 fincs: That's also a problem
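The "comment out half" bisection imirkin describes can be sketched as below. This is a minimal illustration, not code from nouveau or deko3d: `find_culprit` and `still_works` are hypothetical names, and the sketch assumes exactly one init step is responsible for the behaviour and that every other step can be dropped without breaking rendering (the caveat raised just above).

```python
def find_culprit(steps, still_works):
    """Return the single step whose removal makes still_works() fail.

    steps:       ordered list of init steps (any hashable labels).
    still_works: hypothetical callback; takes the list of steps that are
                 kept and returns True if the feature still works.
    Assumes exactly one step matters and the rest are safe to remove.
    """
    candidates = list(steps)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # "Comment out" the first half of the remaining candidates.
        kept = [s for s in steps if s not in half]
        if still_works(kept):
            # Nothing broke, so the culprit is in the other half.
            candidates = candidates[len(candidates) // 2 :]
        else:
            candidates = half
    return candidates[0]


if __name__ == "__main__":
    init_steps = ["a", "b", "c", "d", "e"]
    # Pretend step "d" is the one the feature depends on.
    print(find_culprit(init_steps, lambda kept: "d" in kept))
```

Steps that are required for any rendering at all would have to be excluded from `steps` up front, since removing them makes every run fail regardless of the feature under test.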
20:02 fincs: Let me try screwing around with one last thing: VIEW_VOLUME_CLIP_CTRL
20:02 fincs: I set that to 0x181D
20:03 fincs: Nope :\
20:04 karolherbst: imirkin: you have no further comments on the commit, right? I just confirmed it on a TU104: https://github.com/karolherbst/envytools/commit/1ffd92f9b51609ac32b98aad83f819cda4b326dc
20:04 karolherbst: that TU104 even boots with 2.5 :)
20:04 karolherbst: but I am too lazy to check all the other PCI regs :D
20:05 karolherbst: but mhhhhh
20:05 karolherbst: this TU104 officially is also only PCIe 3.0 compliant
20:05 karolherbst: ufff
20:05 karolherbst: I smell broken PCIe 4.0 support
20:07 karolherbst: maybe I should declare 0 as 8_0 as well mhhh
20:07 karolherbst: and add a TODO
20:14 imirkin: karolherbst: G80:TU102
20:14 karolherbst: ehhh
20:14 karolherbst: I forgot to push my local changes
20:14 karolherbst: ahh no, I just linked to the old version
20:14 imirkin: lgtm
20:16 fincs: Okay, time to comment out shit
20:22 fincs: I commented out almost *everything* except for the stuff I can't remove
20:22 fincs: And it still works
20:22 fincs: Absolutely minimal 3D init: https://hastebin.com/edufogatur.php
20:22 imirkin: ok
20:23 imirkin: so then it's the other way
20:23 imirkin: we're doing something to break it
20:23 fincs: The only other commands I'm calling are
20:23 fincs: dkCmdBufBindRenderTarget
20:23 fincs: dkCmdBufSetViewports
20:23 fincs: dkCmdBufBindRasterizerState
20:23 fincs: dkCmdBufBindColorState
20:23 fincs: dkCmdBufBindColorWriteState
20:23 imirkin: and i assume SetViewports doesn't do anything too funny?
20:23 fincs: (state setting commands)
20:23 fincs: https://github.com/devkitPro/deko3d/blob/master/source/maxwell/gpu_3d_base.cpp#L329
20:24 fincs: Just writes to ViewportTransform/Viewport registers
20:24 imirkin: right, just the standard stuff
20:24 fincs: And the rasterizer state:
20:24 fincs: https://github.com/devkitPro/deko3d/blob/master/source/maxwell/gpu_3d_state.cpp#L30
20:24 fincs: (ColorState is just blend, ColorWriteState is just enabling which colors to allow writing)
20:25 imirkin: yeah, nothing unexpected
20:25 imirkin: so nouveau is setting something it shouldn't
20:25 imirkin: that screws everything up
20:25 fincs: It's starting to look like it, yes
20:26 imirkin: i'm just going to start by disabling the magic 3d init :)
20:28 fincs: Magic 3d init is identical to mine, except for a single register which I tried copying over to my init and it would still work
20:40 imirkin: ok, i've cleared out basically all of the init we do in nvc0_screen, and other than breaking clears, it doesn't seem to have helped
20:41 fincs: :\
20:41 fincs: Are there any other places where nouveau does "magic" writes?
20:42 fincs: I recall fsh setup was full of magic
20:42 imirkin: don't think so... let's see
20:42 imirkin: BEGIN_NVC0(push, SUBC_3D(0x0360), 2);
20:42 imirkin: PUSH_DATA (push, 0x20164010);
20:42 imirkin: PUSH_DATA (push, 0x20);
20:42 imirkin: heh
20:42 imirkin: so we do
20:42 fincs: Yeah
20:42 fincs: But hmm, it's fsh magic though
20:43 fincs: For a simple fragment shader I write 0x087F6080 / 0x20 there
21:20 imirkin: welp, i've commented out nearly everything
21:20 imirkin: still no go.
21:57 fincs: :\
21:57 karolherbst: :/
21:57 fincs: I guess the problem would not be init but somewhere else
21:57 karolherbst: maybe it's indeed some gr stuff
21:57 fincs: State stuff
21:58 fincs: karolherbst: nouveau running on nvidia blob on Switch fails
21:58 karolherbst: well.. sure
21:58 karolherbst: or what did you test?
21:58 fincs: I finally tested switch-mesa
21:58 karolherbst: ohh...
21:58 karolherbst: I see
21:58 fincs: With imirkin's branch
21:58 karolherbst: ehhh
21:58 karolherbst: well, yeah, then it's something in mesa
21:58 fincs: It has to be, yeah
21:58 karolherbst: mhhh
21:59 karolherbst: imirkin: is there anything obviously missing from the mmt trace?
21:59 fincs: Could it be a GL bug?
21:59 karolherbst: might be that some ioctl is parsed incorrectly
21:59 karolherbst: but it looked fine
21:59 karolherbst: fincs: doubtful, why would it be?
21:59 karolherbst: those are shader only features, no?
21:59 fincs: Should be
21:59 fincs: But it also interacts with multiviewport
21:59 karolherbst: well.. sure
21:59 karolherbst: but I mean you just have to change some shader related bits
21:59 karolherbst: but mhhh
22:00 imirkin: i've commented out like 90% of non-essential things
22:00 imirkin: both init as well as state stuff
22:00 fincs: Would be nice to see a log of calls into nouveau functions
22:00 karolherbst: imirkin: in nvc0?
22:00 imirkin: yes
22:00 karolherbst: mhh
22:00 imirkin: nvc0_state, nvc0_state_validate
22:00 imirkin: nvc0_screen
22:00 imirkin: turned off compute
22:01 imirkin: disabled 3d blits
22:01 fincs: Hmm
22:01 karolherbst: GP104_3D.UNK0220[0x2] = 0x1 ?
22:02 karolherbst: what about this thingy
22:02 fincs: What does the fragment shader header + code look like?
22:02 imirkin: tried adding it.
22:02 karolherbst: mhhh
22:03 karolherbst: I have an idea
22:03 imirkin: another interesting point is that if i manually add in the viewport index write at the start
22:03 imirkin: and add it to the output map
22:03 imirkin: it doesn't help
22:03 imirkin: the write has to be RIGHT BEFORE the viewport mask write
22:04 imirkin: fincs: http://paste.debian.net/1140001/
22:04 fincs: At this point the only thing we haven't checked is the fragment shader
22:04 fincs: Okay, let me compare
22:04 imirkin: in the trace, that write has a (st 0xd)
22:04 imirkin: which i've tried replicating too
22:04 imirkin: as well as disabling the scheduler entirely
22:04 imirkin: and i've tried adding yields all over the place.
22:05 karolherbst: yeah.. that won't help :p
22:05 imirkin: i know
22:05 imirkin: i was desperate though :)
22:05 imirkin: it's clearly getting the value
22:05 imirkin: writing a 0 causes nothing
22:05 fincs: Wait a minute
22:05 fincs: Is that the fragment shader?
22:05 imirkin: oh oops
22:05 imirkin: no
22:05 imirkin: why does frag shader matter?
22:06 fincs: I want to see if I have it different
22:06 fincs: As in the header
22:06 imirkin: http://paste.debian.net/1140003/
22:07 imirkin: other than version stuff, looks the same
22:07 fincs: Yeah... the only thing is that I have 0x000FF000 at HDR[10] but that shouldn't matter...
22:08 imirkin: for VP
22:08 imirkin: i think that was necessary on like nvc0. or not even necessary but cargo-culted. dunno
22:08 fincs: I have that in the fsh too apparently
22:08 imirkin: uhhh
22:08 imirkin: well, it's 0 in the trace here
22:09 fincs: Well yeah
22:09 fincs: This is ignored in fsh as there is no output :p
22:09 imirkin: (i.e. trace of blob)
22:09 fincs: I was just lazy and initialized this field regardless of shader type
22:09 imirkin: hehe
22:10 fincs: Hmm, in that vsh header, you don't have that weirdo 0x02000000 bit set
22:11 fincs: In hdr[0]
22:11 imirkin: as you perhaps recall
22:11 imirkin: that was like the first thing i had played with
22:11 fincs: True
22:11 imirkin: but i'll set it just in case
22:12 karolherbst: ahh ehh, crap, deqp surfaceless doesn't build? :/
22:14 imirkin: used to...
22:14 karolherbst: well.. building master branch
22:14 imirkin: fincs: yeah, no help
22:15 imirkin: i've also set 0x0310
22:15 fincs: Hmm
22:15 fincs: I wonder if it's a shader setup issue
22:15 fincs: nouveau never writes to the VP_A regs right?
22:15 karolherbst: ohhhh
22:15 karolherbst: imirkin: the VK-GL-CTS is the upstream now
22:15 karolherbst: interesting
22:16 karolherbst: not deqp
22:16 imirkin: karolherbst: no, they're different
22:16 karolherbst: nono, really
22:16 karolherbst: I am serious about that
22:16 imirkin: VK-GL-CTS has ES-CTS tests
22:16 imirkin: (and GL-CTS obviously)
22:16 karolherbst: nope
22:16 karolherbst: VK-GL-CTS is the upstream now
22:16 karolherbst: even referenced from deqp docs
22:16 imirkin: ok.
22:17 imirkin: that's news to me.
22:17 karolherbst: https://source.android.com/devices/graphics/deqp-testing
22:17 karolherbst: "This page is not maintained and will be deprecated. See the Vulkan and OpenGL CTS Wiki for current information."
22:17 imirkin: ok
22:18 karolherbst: they even merge from there into the tree
22:18 karolherbst: interesting
22:18 karolherbst: they have a master and "upstream-master" branch
22:18 karolherbst: in deqp
22:18 imirkin: cool
22:18 imirkin: nice that it's all unified then
22:18 karolherbst: yeah
22:18 imirkin: again - news to me :)
22:18 karolherbst: yeah.. to me as well
22:18 imirkin: fincs: so ...
22:19 imirkin: i'm going to send you a commit
22:19 imirkin: can you see if running it makes it work with "nouveau"?
22:19 fincs: Okay
22:19 imirkin: the basic theory is that there are multiple things wrong.
22:20 imirkin: fincs: https://github.com/imirkin/mesa/commits/for-fincs
22:21 imirkin: i.e. we're screwing something up with init
22:21 fincs: "great hacks!" is the only commit right?
22:21 imirkin: and also some sort of underlying thing isn't enabled
22:21 imirkin: yes
22:21 imirkin: it might cause conflicts though. i commented out a lot of stuff.
22:21 imirkin: unfortunately comment-region comments out each line individually
22:22 fincs: Building
22:23 imirkin: i disabled a _lot_ of stuff, so this basically will only work with VERY simple applications
22:23 fincs: Aaaaaand I still only get viewport0 output
22:23 imirkin: bleh.
22:23 imirkin: what else is there left to remove?
22:26 fincs: Let me try setting dummy config for vertex shader A
22:28 fincs: ... I get a crash
22:30 fincs: Whoops, I fucked up the command list
22:30 fincs: Still broken with dummy vsh-A config added
22:35 fincs: Hmm, what's this "MACRO_TEP_SELECT" thing
22:37 fincs: As well as MACRO_GP_SELECT
22:37 fincs: Let me try disabling these macros (replacing them with straight up SP_SELECT(n) writes)
22:38 fincs: Aaaand no dice
22:40 fincs: What the fuck is there left to disable
22:43 imirkin: ;)
22:46 fincs: ... Could the drawing code be bad
22:46 fincs: As in
22:46 fincs: Draw calls
22:49 imirkin: sure, anything is possible
22:49 imirkin: it's just going through the regular nvc0_draw_arrays in this case
22:49 imirkin: (in my case at least)
22:50 fincs: Hmm, I don't see anything wrong :\
22:51 fincs: Maybe something in nvc0_draw_vbo?
22:52 imirkin: nothing jumps out at me
22:52 fincs: I don't even use vertex arrays in my test
22:55 fincs: Meh, I don't see anything suspicious either
23:00 fincs: Argh, this is probably something very stupid
23:00 fincs: Is it possible to trace the state that comes from GL at all?
23:00 fincs: As in, a full dump of the current GL state
23:01 imirkin: not sure what you mean
23:01 imirkin: like the cso's?
23:01 imirkin: GALLIUM_TRACE=foo.xml will trace all the gallium calls and dump various structures
23:01 fincs: Whichever thing gallium/mesa internally uses to keep track of GL state
23:02 karolherbst: nice.. you can get rid of 300 targets in deqp by removing the glslang and spirv-tools external stuff :)
23:03 karolherbst: imirkin: the actual reason I got curious here was that Google's master tree only supports python2, but VK-GL-CTS aka upstream-master was python3+
23:03 imirkin: "GL state" a bit tricky
23:06 karolherbst: huh?!?
23:06 karolherbst: imirkin: the test hangs with nvidia now....
23:06 imirkin: karolherbst: which test?
23:06 karolherbst: the shader_test you linked
23:07 imirkin: i asked you to change something in it, right?
23:07 imirkin: i forget what
23:07 imirkin: oh no - you already did
23:07 karolherbst: yeah...
23:07 karolherbst: well.. nothing happens
23:07 karolherbst: it's odd
23:07 karolherbst: glxinfo does work, so.. mhh
23:09 fincs: I'm this close from just saying fuck it and dumping pushbuffers and parsing them
23:09 karolherbst: I have a libdrm pushbuf parser :p
23:09 karolherbst: https://github.com/envytools/envytools/pull/203
23:09 fincs: I have... a Switch emulator :p
23:09 karolherbst: :D
23:10 karolherbst: I bet that one would not help
23:10 fincs: With a tiny change I did which vomits everything submitted to the GPU
23:10 karolherbst: as I assume no game is using that
23:10 karolherbst: also
23:10 karolherbst: we already have nvidia traces
23:10 fincs: The problem is not what nvidia does
23:10 fincs: The problem is what nouveau does - there's something in here which is fucking things up :p
23:11 karolherbst: imirkin: do you have a libdrm debug build?
23:11 karolherbst: I can adjust depushbuf to correctly parse the full thing
23:11 karolherbst: if anything is missing
23:11 imirkin: ppprobably
23:11 karolherbst: :)
23:12 imirkin: how do i get it to dump stuff again?
23:12 karolherbst: NOUVEAU_LIBDRM_DEBUG=255
23:12 karolherbst: ohh you have to manually add -DDEBUG to the build I think
23:13 imirkin: i have it.
23:13 imirkin: now what do i do?
23:13 karolherbst: pastebin it
23:13 karolherbst: there is probably still stuff missing in the tool, so I'd like to take a look first
23:13 imirkin: http://paste.debian.net/1140016/
23:14 karolherbst: ahh yeah :) I will need to adjust stuff I think
23:15 karolherbst: that's gp108, right?
23:15 imirkin: yes
23:15 fincs: Ok I have dumps
23:15 fincs: Now to parse them
23:15 imirkin: fincs and karolherbst ... race!
23:15 karolherbst: ahh yeah "unknown mode 4" :)
23:16 karolherbst: ohh 4 was this annoying one
23:16 imirkin: 1IC0?
23:16 imirkin: which one's 4?
23:17 karolherbst: yeah.. I think so
23:17 karolherbst: the one line long one
23:18 imirkin: how are you counting these?
23:18 imirkin: is it the one that starts with 0xa?
23:18 fincs: Got this parsed
23:18 imirkin: basically first command goes to addr+0, and all the rest go to addr+4
23:19 karolherbst: demmt has this mthd_data_available field for it
23:22 karolherbst: imirkin: it's NVC0_FIFO_PKHDR_IL
23:22 imirkin: oh
23:22 imirkin: that's just the immediate mode?
23:22 imirkin: i.e. the value is in the pushbuf
23:22 imirkin: er, in the command
23:23 karolherbst: yeah.. I think so
23:23 imirkin: instead of the size
23:23 karolherbst: so instead of the size you have the data
23:23 karolherbst: yeah
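The header modes discussed here can be sketched as a small decoder. The bit layout is an assumption, taken from the `NVC0_FIFO_PKHDR_*` macros in nouveau as recalled (mode in the top three bits, a 13-bit size or immediate value at bit 16, subchannel at bit 13, method address divided by four in the low 13 bits). `decode_header` is a hypothetical helper, not part of depushbuf or demmt.

```python
# Mode numbers are the top three bits of the header word. The names mirror
# the NVC0_FIFO_PKHDR_* macros (layout assumed, see the note above).
MODE_NAMES = {
    1: "SQ",  # sequential: each data word goes to the next method
    3: "NI",  # non-incrementing: every data word goes to the same method
    4: "IL",  # immediate: the value sits in the header instead of a size
    5: "1I",  # one-increment: first word to mthd, the rest to mthd + 4
}


def decode_header(word):
    """Split a 32-bit pushbuf header word into its fields."""
    mode = (word >> 29) & 0x7
    arg = (word >> 16) & 0x1FFF       # size, or the data itself for IL
    subc = (word >> 13) & 0x7
    mthd = (word & 0x1FFF) << 2       # stored as method address / 4
    out = {"mode": MODE_NAMES.get(mode, "unknown mode %d" % mode),
           "subc": subc, "mthd": mthd}
    if mode == 4:
        out["data"] = arg             # immediate mode carries no payload words
    else:
        out["count"] = arg            # number of data words that follow
    return out


if __name__ == "__main__":
    # 0x80000482 is the header value quoted later in this log.
    print(decode_header(0x80000482))
```

For the header `0x80000482` quoted later in the log, this yields mode IL, method `0x1208` and an immediate value of 0, which matches the decoding given there.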
23:23 fincs: Lmao nouveau uploads program data using inline2mem?
23:24 imirkin: why not
23:24 fincs: I find that hilarious
23:24 fincs: I guess it makes more sense on PC
23:24 imirkin: on blob it uses p2mf -- that's the same thing, right?
23:24 fincs: (with the non-uniform memory model)
23:24 fincs: On Switch we have unified mem
23:24 imirkin: right, there could be a lot of optimizations for a UMA situation
23:25 imirkin: you want code in vram
23:25 imirkin: and you don't want the cpu to be sitting there waiting for pcie
23:26 fincs: ... This is only uploading viewport0
23:26 imirkin: huh?
23:26 imirkin: oh, initially
23:26 imirkin: it'll get the rest "later"
23:26 fincs: As in
23:26 fincs: It literally never sets that
23:26 imirkin: inconceivable.
23:27 fincs: This is with your "for-fincs" branch btw
23:27 imirkin: yeah, i'm running with basically that
23:27 karolherbst: imirkin: it's probably still broken but... https://gist.githubusercontent.com/karolherbst/2a082b7492074a7e5485cef24c6e6fd5/raw/65628a055f9c7dd57724d560d48e64d5699bfb4c/gistfile1.txt
23:27 imirkin: something is off in your decoding, or something else
23:27 karolherbst: .. let me check
23:28 imirkin: that was for fincs
23:28 imirkin: although your decoding is also off :)
23:28 karolherbst: yeah
23:28 karolherbst: but
23:28 karolherbst: only the first pb :p
23:28 karolherbst: it gets better later
23:28 fincs: I got until the point where there is a draw call
23:28 fincs: So I should have it all
23:28 karolherbst: "UPLOAD.DST_ADDRESS_HIGH 0" is where it is right
23:28 karolherbst: and later
23:29 karolherbst: I think I messed up the mode 4 bits.. mhh
23:30 imirkin: fincs: you're right
23:30 imirkin: there's something in core mesa
23:30 imirkin: which detects usage of "later" things
23:30 imirkin: this is why adding the viewport index helped
23:30 imirkin: FUUUCK
23:30 imirkin: sorry
23:31 fincs: Hahahahahahahaha
23:31 fincs: I called it
23:31 imirkin: really very sorry i wasted a day of your time on it
23:31 imirkin: ugh
23:31 karolherbst: ufff
23:31 fincs: Np :)
23:31 HdkR: :D
23:31 fincs: So it wasn't nouveau's fault, it wasn't kernel's fault
23:31 fincs: It was
23:31 fincs: mesa/gallium
23:32 imirkin: yes
23:32 imirkin: iirc it's core mesa
23:32 imirkin: i completely forgot about it
23:32 fincs: So we need to make mesa treat gl_ViewportMask in the same way as gl_ViewportIndex
23:32 fincs: I.e. disable clever optimization that is so foot shooty
23:32 imirkin: ah no, it's in st/mesa
23:33 fincs: I don't know where st ends and gallium starts, sorry
23:33 imirkin: st drives gallium
23:33 imirkin: gallium is an API
23:35 karolherbst: huh.. my parsing is correct though... weird
23:35 imirkin: there's something off in there
23:35 karolherbst: 0x80000482 -> method 0x1208 value 0
23:35 imirkin: e.g. i don't see any viewport transform stuff in there at all
23:35 imirkin: should be in there somewhere
23:36 imirkin: and naturally it all works fine now.
23:37 imirkin: skeggsb: tagr: ignore my earlier question about viewport stuff. it was a stupid stupid bug on my end.
23:37 fincs: ( ͡° ͜ʖ├┬┴┬┴
23:37 karolherbst: 0x238c is CB_POS, but my parser doesn't pick it up either
23:37 karolherbst: mhhh
23:38 karolherbst: I think I need to do something about those subchannels
23:39 fincs: I'm still wondering why writing to both gl_ViewportIndex and gl_ViewportMask crashes on my Switch, but not on imirkin's Pascal card
23:39 imirkin: brb
23:42 imirkin: probably an extra check by the blob software that you don't do bad things
23:42 fincs: Blob seems to output the two writes anyway, at least on Switch
23:44 imirkin: fincs: fixed thing here: https://github.com/imirkin/mesa/commits/viewport_array2
23:44 fincs: So the change is in st_atom.c?
23:45 imirkin: yes.
23:45 fincs: Mkay
23:45 fincs: Let's rebase again! :p
23:45 imirkin: (and shader_enums.h to add the bit)
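The optimization and its fix, as described in this exchange, can be modelled roughly as follows. This is a sketch of the idea only, not the actual st_atom.c code: the constant names, bit positions and `MAX_VIEWPORTS` value are placeholders chosen for illustration, and `check_mask` is a hypothetical switch standing in for "before" and "after" the fix.

```python
# Placeholder bit positions; the real values live in Mesa's shader_enums.h.
VARYING_BIT_VIEWPORT = 1 << 0       # shader writes gl_ViewportIndex
VARYING_BIT_VIEWPORT_MASK = 1 << 1  # shader writes gl_ViewportMask (assumed name)
MAX_VIEWPORTS = 16                  # illustrative limit only


def num_viewports_to_upload(outputs_written, check_mask):
    """Model how many viewports the state tracker sends to the driver.

    If the last vertex-processing stage writes no viewport-selecting output,
    only viewport 0 is uploaded. check_mask=False models the behaviour that
    looks at gl_ViewportIndex alone; check_mask=True also counts
    gl_ViewportMask.
    """
    bits = VARYING_BIT_VIEWPORT
    if check_mask:
        bits |= VARYING_BIT_VIEWPORT_MASK
    return MAX_VIEWPORTS if outputs_written & bits else 1


if __name__ == "__main__":
    mask_only = VARYING_BIT_VIEWPORT_MASK
    print(num_viewports_to_upload(mask_only, check_mask=False))
    print(num_viewports_to_upload(mask_only, check_mask=True))
```

Under this model a shader that writes only gl_ViewportMask gets a single viewport uploaded unless the mask bit is also checked, which is consistent with the "only uploading viewport0" observation and with the earlier finding that adding a gl_ViewportIndex write next to the mask write helped.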
23:47 imirkin: i expect i'll have passthrough gs next weekend or so
23:47 fincs: ( ͡° ͜ʖ ͡°)( ͡° ͜ʖ ͡°)
23:47 imirkin: i think it's going to be easier than i first thought -- it doesn't change the array-ness of inputs at all
23:47 fincs: More like, it has no inputs at all lol
23:47 imirkin: just a lot of extra rules + passthrough qualifier, plus have to make synthetic outputs
23:48 imirkin: hm? no, it can have inputs
23:48 imirkin: varyings
23:48 fincs: Okay rebased
23:48 imirkin: it still has a gl_in[] and so on
23:48 imirkin: all subject to the usual rules
23:48 fincs: Buildtime
23:48 imirkin: + a bunch of new ones
23:48 fincs: Gotta love meson's speed ( ͡° ͜ʖ ͡°)
23:50 fincs: It finally works \o/
23:50 fincs: (just tested)
23:52 fincs: https://github.com/devkitPro/mesa/commits/switch-21.x
23:55 imirkin: yay
23:55 imirkin: thanks for tracking it down
23:55 imirkin: that should have been the first thing i checked :(
23:56 fincs: More importantly, thanks to you for implementing this stuff :)
23:56 imirkin: i even remembered that opt while i was writing the core stuff
23:56 imirkin: but then forgot all about it =/
23:56 imirkin: yw. hopefully someone finds it useful
23:57 fincs: Btw
23:57 fincs: mesa 21.0 when
23:57 imirkin: few months
23:57 imirkin: quarterly releases
23:57 fincs: Ah darn
23:57 fincs: We only package stable releases
23:57 imirkin: er
23:57 imirkin: well, 21 is going to be in 2021
23:57 imirkin: but 20.1 will be ... actually moderately soon, i imagine
23:58 fincs: Ah I didn't remember it had to do with the data
23:58 imirkin: it's april now
23:58 fincs: *date
23:58 imirkin: when did 20.0 come out...
23:58 fincs: I hope the dual issue stuff + viewport mask can get in for 20.1
23:58 imirkin: feb 19
23:58 karolherbst: cool subchan is working now :)
23:58 imirkin: so ... all things being equal, may 19
23:59 imirkin: it never works quite like that of course, there's variability in these things
23:59 imirkin: i don't think 20.1 has been branched yet
23:59 fincs: (*cough* covid19 *cough*)
23:59 imirkin: once it is, usually about a month
23:59 imirkin: alright, well i still have to write a pile of piglit tests for this thing