01:40imirkin: karolherbst: so?
03:41Lyude: karolherbst: heh-I think I got the brightness issues fixed. I think the GPU just handles saving/restoring backlight state on the DPCD itself
03:41Lyude: because the registers appear to be set when turning the screen back on, not sure about suspend/resume yet though
04:15Lyude: oh i see, it's even weirder than that. you just enable the backlight before the panel is actually powered on?
04:16Lyude:wonders if we're actually even talking to the panel connected to the gpu, or just some microcontroller talking to the actual panel
10:11karolherbst: imirkin: ahh.. I set up everything on the jetson but forgot to run the tests... :D
14:14karolherbst: imirkin: the tests fail on gm20b here
14:15fincs: Is that nouveau userland + nouveau kernel driver?
14:15karolherbst: yes
14:15fincs: Wait a minute, gm20b is tegra
14:16karolherbst: yes
14:16karolherbst: I have a jetson nano
14:16fincs: I thought tegra stuff use the Linux4Tegra + nvidia kernel bits?
14:16karolherbst: we have support for that in nouveau
14:16karolherbst: nvidia people worked on that :p
14:16fincs: Oh
14:16karolherbst: but yeah. I have my own kernel build
14:17fincs: Is it possible to test with nvidia's kernel driver, in order to know for sure who is responsible for the test failing?
14:17karolherbst: I already tested with nvidia's driver on my gp107
14:17karolherbst: and there it passed
14:18karolherbst: ohh you mean nouveau userland + nvidia kernel driver
14:18karolherbst: mhhh
14:18karolherbst: no
14:18fincs: Yes I mean that
14:18fincs: nouveau userland + nvidia kernel
14:18karolherbst: in theory nvidia userland + nouveau kernel driver would work, but I don't know where to get the proper binaries from
14:19fincs: I should bite the bullet and finally get around to rebasing the switch changes on top of a more recent mesa so that I can actually test imirkin's branch on Switch
14:19karolherbst: but I doubt it supports gm20b anyway
14:19karolherbst: fincs: where did you test it on before?
14:19karolherbst: I thought the stuff worked for you
14:19fincs: Yes, it worked for me but I only tested the compiler bits alongside my non-nouveau GPU code
14:20karolherbst: ahh, I see
14:20fincs: That's why I was saying that the compiler bits are correct in imirkin's branch
14:20karolherbst: I did an mmt trace and gave it to imirkin, but he meant everything is fine in the driver as well :/
14:20karolherbst: I am sure it's something stupid somewhere
14:22fincs: Yeah it has got to be something stupid
14:23fincs: I checked nouveau's gpu init code (nvc0_screen) yesterday too, and I couldn't find anything wrong with it
14:23karolherbst: anyway.. I will test astc stuff now :)
14:24fincs: Cool :)
14:24karolherbst: we only enable it for gk20a right now
14:24karolherbst: would be cool to enable it for gm20b as well
14:24fincs: I'm sure it will work
14:24karolherbst: ohh and etc as well
14:24karolherbst: yeah... maybe
14:24fincs: Because official Switch stuff uses ASTC
14:25karolherbst: ohh, it's more about nouveau's code
14:25fincs: However, ETC stuff seems to be missing from what official devs are expected to use
14:25fincs: I still think it's there
14:25fincs: But it would be nice to have actual confirmation
14:25karolherbst: I shouldn't compile stuff on the jetson nano :D
14:25karolherbst: it's so slow
14:25fincs: IIRC you only need to change a single check to enable ASTC/ETC stuff in nouveau for gm20b
14:25karolherbst: and even under full load it consumes like 5W
14:25karolherbst: fincs: yeah.. but... there can always be random silly things
14:25fincs: My default development workflow is cross compilation :p
14:26karolherbst: cross compilation is annoying though :/
14:26fincs: Nah
14:26karolherbst: I'd rather set up a compile server
14:26fincs: I eat crossbuilds for breakfast
15:40fincs: Alright, that was an incredibly painful rebase
15:40fincs: But it's done
15:40fincs: But... does it build at all? ( ͡° ͜ʖ├┬┴┬┴
15:49fincs: "Dependency libdrm_nouveau found: NO found 2.4.75 but need: '>=2.4.81'" <-- heh, time to bump our fake libdrm_nouveau version I guess
15:50imirkin: karolherbst: thanks for testing
15:50imirkin: i sorta expected that, but nice to confirm
16:00imirkin: karolherbst: do you need me to give you a patch for the ASTC/ETC bits, or you can work it out yourself?
16:00imirkin: just a small change in nvc0_screen.c iirc
16:01imirkin: the bigger deal is getting deqp built :)
16:01imirkin: to test
16:01karolherbst: yeah.. deqp is more annoying
16:01karolherbst: I already know what to change in mesa
16:01karolherbst: deqp is still building...
16:02fincs: Hmm, what happened to #include "util/u_format.h"?
16:03fincs: Getting non-existent header error in winsys/nouveau/switch
16:03imirkin: it moved maybe
16:03imirkin: the format stuff got a significant overhaul
16:03fincs: Looks like it's util/format/u_format.h now
16:04imirkin: btw, i wouldn't at all be opposed to having a second winsys upstreamed for nouveau
16:04imirkin: i dunno how many changes you had to make to core stuff, but if it's reasonable and abstractable, this should all be upstreamable
16:05fincs: It's non-trivial
16:05imirkin: ok
16:05fincs: https://github.com/devkitPro/mesa/tree/switch-19.0.0
16:05karolherbst: fincs: well.. you essentially just need to port over to have the changes be a "platform" no?
16:06fincs: Also, upstreaming stuff related to homebrew toolchains isn't exactly a great idea
16:06fincs: Especially considering Nintendo is in Khronos
16:06HdkR: Why not? Just stick some Switches in CI :P
16:06fincs: Homebrew stuff as a whole isn't supposed to exist
16:06karolherbst: fincs: well at least the offset stuff could be upstreamed
16:06karolherbst: one patch less
16:06fincs: Yeah there are small things that can be upstreamed
16:07karolherbst: fincs: as long as you don't violate laws you are fine
16:07karolherbst: even patent won't matter
16:07karolherbst: *patents
16:07fincs: Technically running custom code on Switch violates DMCA
16:07karolherbst: and?
16:07imirkin: ... but also don't put yourself in legal jeopardy
16:07karolherbst: you don't have to ship binaries with this being enabled
16:07imirkin: ultimately cases are won not by who's right, but by who has more money
16:07karolherbst: yeah, that's true
16:07fincs: That's the thing
16:08karolherbst: fincs: but the mesa bits won't fall under dmca
16:08imirkin: and while i don't know your financial situation, i hope you won't be offended by the suggestion that nintendo has more :)
16:08karolherbst: as there is literally no code to get around anything
16:08fincs: Hmm meh, EGL internal interface changed yet again
16:09karolherbst: you just target your own API, no?
16:09karolherbst: how does nintendo come into play here? ;)
16:09fincs: Ninty is in Khronos
16:09karolherbst: and?
16:09karolherbst: you still ohnly target your own API
16:09karolherbst: *only
16:09fincs: Our "own" API which runs on their device, which legally speaking shouldn't even exist
16:09karolherbst: it could be an "embedded platform API for embedded devices" :p
16:10karolherbst: and you call the platform embedded
16:10karolherbst: or something
16:10karolherbst: but mhh, the winsys code might be a bit more annoying as you'd target an API derived from nvidia's stuff
16:11fincs: The nvidia stuff is in our fake libdrm_nouveau
16:11karolherbst: ohh.. I see
16:11imirkin: fincs: btw, you should also get GL_EXT_texture_shadow_lod in the update :)
16:11karolherbst: might be good to have a list of libnx APIs used by your stuff
16:11fincs: And our fake libdrm_nouveau uses wrapper objects coming from libnx (our main switch support system library), which in turn talk to Nvidia objects through ioctl
16:11fincs: imirkin: I need to catch up with stuff :)
16:11karolherbst: well.. from inside mesa
16:15fincs: "remove boolean from state tracker APIs" <-- lmao this is the reason why our EGL driver is so broken
16:15karolherbst: fincs: why?
16:15imirkin: that was me, right? :)
16:16fincs: boolean != bool and compiler complains that the function pointer isn't the same type
16:16imirkin: boolean = char
16:16karolherbst: imirkin: that's not required
16:16imirkin: there's a "typedef boolean char"
16:16imirkin: while in the larger world it's not required, it definitely was the case in mesa.
16:16karolherbst: their bool could be something else
16:16imirkin: (and still is, i didn't kill all the usage, just in the common api's)
16:17imirkin: bool is _Bool
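The mismatch fincs describes can be reproduced outside Mesa. The following is a minimal sketch, not Mesa code: it assumes `boolean` is a typedef for `unsigned char` (the log only says "char"), and the callback type names and functions are hypothetical. It uses C11 `_Generic` to show that a function pointer returning `boolean` and one returning `bool` are distinct, incompatible types, which is why an out-of-tree callback written against the old signature stops matching after the API switched to `bool`.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: stands in for the old Mesa typedef mentioned in the log. */
typedef unsigned char boolean;

/* Hypothetical callback types, before and after an API moves to bool. */
typedef boolean (*flush_cb_old)(void *ctx);
typedef bool    (*flush_cb_new)(void *ctx);

/* A callback written against the new signature. */
static bool new_style_cb(void *ctx) { (void)ctx; return true; }

/* A callback still written against the old signature. */
static boolean old_style_cb(void *ctx) { (void)ctx; return 1; }

/* Evaluates to 1 for the new pointer type and 0 for the old one.
 * _Generic only accepts both associations because the two pointer
 * types are not compatible with each other. */
#define IS_NEW_STYLE(fn) _Generic((fn), flush_cb_new: 1, flush_cb_old: 0)

/* Checks that each callback resolves to the expected pointer type. */
static void check_callback_types(void)
{
    assert(IS_NEW_STYLE(&new_style_cb) == 1);
    assert(IS_NEW_STYLE(&old_style_cb) == 0);
}
```

Assigning `&old_style_cb` to a `flush_cb_new` variable would draw an incompatible-pointer-type diagnostic from the compiler, which matches the kind of complaint described above.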
16:17karolherbst: fincs: what's wrong with your bool btw?
16:17fincs: Hey, st_context_iface::flush got a new parameter, and this "notify_before_flush_cb" thing sounds handy for being a better place for our fence code... might consider in the future
16:18fincs: Mkay, after many many fixes this built
16:18fincs: ... But does it work? :p
16:18imirkin: as we all know, anything that compiles *must* work.
16:18HdkR: There is never a bug, only features
16:19fincs: I had problems with EGL driver before, like over a year ago
16:19imirkin: and boy does nouveau have features!
16:19fincs: It built fine but blew up at runtime :)
16:20fincs: pkgconfig .pc files look fine
16:20fincs: libEGL.a looks a bit... slim
16:22fincs: Hmm what happened to libmesa_util?
16:22fincs: Did that get broken up or something?
16:22imirkin: poof
16:22imirkin: where'd it go?
16:23fincs: Ah, it's now libmesa_common
16:24fincs: Ah there we go, libEGL.a more like it
16:25karolherbst: imirkin: the ETC and ASTC tests are all gles2+, right?
16:26imirkin: in deqp? or piglit?
16:26karolherbst: deqp
16:26imirkin: deqp is all gles2+ yea
16:26karolherbst: okay, cool
16:26imirkin: but ... they might only be in some specific "api"'s tests
16:29fincs: Aaaaand I get a bunch of NIR related linker errors, great
16:30fincs: Time to debug this
16:30imirkin: just pushed the NV_viewport_swizzle stuff to master
16:31fincs: :)
16:32imirkin: tagr: under the guise of making it work on GM20B ... I'm trying to get GL_NV_viewport_array2 working. I've done everything right, and yet it doesn't work. so ... clearly I didn't do something right. Can you see if there's a trick to making gl_ViewportMask[] actually work for viewports > 1? output 0x3a0 in vertex/tess/gs shaders.
16:33imirkin: tagr: right now my thinking is that it's something in gr init
16:38karolherbst: imirkin: huh.. astc doesn't seem to work with mesa master :/
16:38karolherbst: the deqp tests are just failing
16:38imirkin: karolherbst: uhhhhh
16:38imirkin: that's surprising.
16:39karolherbst: dEQP-GLES31.functional.copy_image.compressed.viewclass_astc_4x4_rgba.rgba_astc_4x4_khr_rgba_astc_4x4_khr.texture2d_to_texture2d eg
16:39imirkin: used to work
16:39imirkin: https://hastebin.com/uzahuduler.bash
16:40karolherbst: ohhh
16:40karolherbst: that's because GLX and EGL are still broken
16:40karolherbst: tagr: ^^
16:42karolherbst: mhhh.. maybe it would work with a newer kernel, but I highly doubt it
16:42imirkin: older is more likely :)
16:42imirkin: before the modifier stuff
16:42karolherbst: ahh, yeah
16:43karolherbst: but I think tagr debugged that once and wasn't able to reproduce it
16:43karolherbst: or something
16:43karolherbst: dunno
16:43karolherbst: it's broken for me
16:43fincs: Aaaaand finally, it links successfully
16:43imirkin: karolherbst: can you just comment out the modifier junk?
16:43imirkin: or is it kernel-level?
16:44karolherbst: no, it's in mesa
16:44fincs: Simple triangle program works on Switch emulator
16:45karolherbst: imirkin: I just check if my kmsro branch still works
16:45karolherbst: and just use that
16:46fincs: Venerable es2gears program also works on emulator
16:49fincs: Works on hardware
16:49fincs: Later today I'll try testing the viewport mask stuff
16:50fincs: Now that I have a working switch-mesa built off imirkin's branch :)
16:51imirkin: cool
16:53karolherbst: fincs: you should rebase more often :p
16:53karolherbst: at least every latest release
16:56karolherbst: imirkin: btw, the assert I hit is this one: deqp-gles31: ../src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c:258: nvc0_validate_fb: Assertion `!fb->zsbuf' failed.
16:56karolherbst: I am wondering what we actually need to do to fix this issue
16:56imirkin: it's not fixable
16:56imirkin: that case is just not supported
16:56imirkin: can't have linear scanout with zeta
16:57imirkin: er, linear color
16:57karolherbst: okay.. so what has the tegra driver to do to not run into this issue ;)
16:57imirkin: probably copies
16:57imirkin: can you try with DRI2?
16:58imirkin: LIBGL_DRI3_DISABLE=1
16:58karolherbst: deqp-gles31: ../src/gallium/state_trackers/dri/dri2.c:560: dri2_allocate_textures: Assertion `drawable->textures[statt]' failed.
16:58imirkin: heh
16:58imirkin: different error! :)
16:58karolherbst: :)
16:59karolherbst: at this point I'd really just merge the kmsro stuff because that's at least working and using the tegra driver has no benefit as it is today afaik
16:59karolherbst: but...
17:00karolherbst: tagr didn't want to switch over because we could make use of hw accel stuff for certain things
17:01karolherbst: anyway, I don't really care how we fix this, just that it gets fixed and we don't leave a broken driver in the tree for over a year
17:06karolherbst: maybe I rework the commit in a way so we can decide at runtime what to use... dunno how hard that would be though... probably very
17:07imirkin: gnurou did send me a TK1 board
17:08karolherbst: :)
17:08imirkin: unfortunately any kernel i try running on it causes it to die once there's any real ethernet activity
17:08imirkin: l4t seems fine
17:08karolherbst: ehhh
17:08karolherbst: my jetson nano is fine
17:08karolherbst: with an upstream kernel
17:08imirkin: so dunno if i'm doing something wrong
17:08imirkin: or ... what
17:08imirkin: but i never got it to work in a stable manner. a little unfortunate =/
17:08karolherbst: maybe you use the wrong dts file :p
17:09imirkin: this was a while ago, perhaps whatever the issue was is fixed now
17:09karolherbst: imirkin: I have my own .config file though, maybe that would help
17:09karolherbst: the tk1 and the nano might be not that different
17:09karolherbst: uses r8169 on mine
17:10karolherbst: I still run a 5.4 kernel btw... some arm tree actually, because of some... fixes
17:10karolherbst: I should update my kernel :)
17:12imirkin: yeah, it's a standard PCIe device
17:12imirkin: but something dies
17:13imirkin: and it seems to only happen when i start using nfs a lot
17:13karolherbst: maybe not enough power
17:13imirkin: i'm using the given power supply
17:13karolherbst: I have this issue as well
17:13imirkin: and it seems to work with l4t
17:13karolherbst: yeah... probably not :p
17:13karolherbst: l4t draws less power
17:13imirkin: ah
17:13karolherbst: because they are able to lower power consumption of the GPU ;)
17:13karolherbst: but if you get close to the limit the device can just shut off
17:13karolherbst: now I am on PoE which has enough
17:13karolherbst: USB only gave me like 2A
17:14imirkin: huh, ok
17:14imirkin: well there's a separate power brick that they gave me
17:14imirkin: (that came with it)
17:14karolherbst: and through PoE the board draws up to 4A or something
17:14karolherbst: yeah.. maybe it's different in your case
17:14karolherbst: the nano is super low power
17:14imirkin: not sure the TK1 can run off PoE
17:14imirkin: it's a "big" board... fan and everything
17:14imirkin: mini-itx or whatever
17:15karolherbst: my switch can deliver up to 30W :)
17:15imirkin: make that nano-itx.
17:17imirkin: anyways ... i hope tagr or skeggsb will have something clever to say about this gl_ViewportMask thing. so annoying to be SO close.
17:17karolherbst: now the test passes with kmsro :)
17:18imirkin: karolherbst: any problem with having both tegra and kmsro enabled? it'd pick tegra by default, but could force kmsro via GALLIUM_DRIVER or whatever?
17:18karolherbst: they install into the same file.. but maybe
17:19karolherbst: I don't really know how all of that works, but kmsro is a bit messy
17:19karolherbst: imirkin: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2960/diffs
17:20karolherbst: but yeah, maybe
17:21karolherbst: anyway running the astc tests without hw support and later with hw support :)
17:25karolherbst: imirkin: does deqp support gbm btw? I kind of don't really know what their surfaceless target is...
17:26karolherbst: would be cool if we could run deqp without having to spawn an X server
17:27karolherbst: also.. do we even need kmsro/tegra for wayland? I think this is really only required for X, no?
17:42imirkin: karolherbst: i think with the (semi)recent EGL work, surfaceless should work fine
17:42imirkin: i.e. egl + surfaceless platform
17:42imirkin: karolherbst: so ... my question is ... could you do that kmsro enablement WITHOUT nuking tegra?
17:43imirkin: i think it would work fine
17:43imirkin: it'd still load tegra_dri.so by default. but you could force kmsro with GALLIUM_DRIVER=kmsro
17:43imirkin: the whole problem is happening due to less-copies, so it's a good problem to have
17:52karolherbst: imirkin: kmsro would also install a tegra_dri.so file
17:52karolherbst: because X tells how the driver is called ;)
17:52imirkin: all the *_dri.so files are the same file
17:52imirkin: they're hard-linked together
17:52karolherbst: mhhh.. also true
17:53imirkin: however X would use the tegra driver
17:53imirkin: but clients could use kmsro if they wanted
17:53karolherbst: actually.. let me check that
17:53imirkin: but would continue to use tegra if they didn't change GALLIUM_DRIVER
17:53karolherbst: ahh, it's also true for kmsro.. so yeah
17:54karolherbst: imirkin: well.. the problem is, it's only broken with X in the first place
17:54karolherbst: and with wayland/gbm we don't need a wrapper driver
17:54karolherbst: afaik
17:54imirkin: but X itself isn't broken
17:54imirkin: so no problem.
17:54karolherbst: true
17:54imirkin: i'm just saying - it's a way to add the option without changing the status quo
17:54imirkin: then both options can be available at once
17:55karolherbst: the thing is just, longterm it makes no sense to care about the tegra driver
17:55karolherbst: if people switch to wayland, then it's pointless afaik
17:55karolherbst: although.. not sure about xwayland here
17:56karolherbst: maybe that was the reason tagr didn't see any issues, maybe it works with xwayland
17:56karolherbst: mhhhh
17:58karolherbst: imirkin: btw, even with kmsro dri2 is broken, so I guess something is just broken there
17:58imirkin: could be yea
18:13imirkin: heh. the sample shader in NV_geometry_shader_passthrough is not valid.
18:14imirkin: it's supposed to have locations explicitly assigned, but it doesn't
18:14imirkin: fincs: can you confirm?
18:16imirkin: fincs: also ... stupid question ... what's the point of the passthrough GS? what can you do in a passthrough GS that you can't in an earlier stage?
18:18karolherbst: imirkin: isn't the point of the extension to make it easier to write passthrough geometry shaders?
18:18cwabbott: karolherbst: I only know this because I had to do the same thing with freedreno, but yes, surfaceless is supported... gitlab-ci uses surfaceless so you can find the magic commandline arguments/env variables needed under .gitlab-ci/deqp-runner.sh
18:19imirkin: karolherbst: in unextended GL, only GS can write viewport/layer, and only GS can "multiply" primitives
18:19imirkin: karolherbst: however with GL_NV_viewport_array2, both of those are possible in all (non-frag) stages
18:19karolherbst: ohh, I thought you were working on NV_geometry_shader_passthrough
18:19imirkin: i am looking at it
18:20imirkin: and trying to determine a use-case for when i'd use it in the presence of GL_NV_viewport_array2.
18:20karolherbst: ahhh
18:20karolherbst: cwabbott: okay, good to know
18:20imirkin: only thing is if you have some weird calculation that you want to do once per primitive, instead of per vertex
18:20karolherbst: the jetson is soooo slow :(
18:21imirkin: but that calculation can't even read the whole primitive anyways...
18:22karolherbst: imirkin: maybe the "Interactions with NV_geometry_shader_passthrough" part of the spec helps with your question?
18:23imirkin: in NV_viewport_array2?
18:23karolherbst: yes
18:23imirkin: erm ... not at all
18:24karolherbst: sad
18:24imirkin: the question is, "why is the passthrough GS a useful construct given that NV_viewport_array2 exists"
18:24karolherbst: ahhh
18:24karolherbst: when do you need a passthrough gs?
18:24karolherbst: I really don't know GL that well though
18:25imirkin: a pure passthrough gs is needed when you want to specify layer/viewport for the primitive
18:25imirkin: another common use-case is to "multiply" primitives, e.g. send a single prim to multiple viewports
18:26karolherbst: mhhh
18:26imirkin: aka the stuff that NV_viewport_array2 does :)
18:26imirkin: which you can use in VS or TES
18:27karolherbst: the 5th question in viewport_array2 also has some interesting information here
18:27imirkin: again, can do that in any stage.
18:27karolherbst: yeah.. mhh
18:27karolherbst: dunno
18:28imirkin: ohhhh wait
18:28imirkin: wait
18:28imirkin: wait.
18:28imirkin: i think i totally misread this.
18:28imirkin: the passthrough gs *does* have access to all the input vertices.
18:28imirkin: but it just computes one set of stuff.
18:29imirkin: yeah ok. and that also solves the arrayness issues. coz there are none.
18:29imirkin: i like this.
18:29imirkin: that makes a _lot_ more sense.
18:30imirkin: and also makes it a lot easier to implement =]
18:30karolherbst: imirkin: regarding the astc stuff.. can I speed up the testing by.. let's say I only check one tests or a subset? Cause deqp takes a _lot_ of time here
18:30karolherbst: or maybe I should compile with opts
18:30imirkin: there are approximately a million astc tests
18:30karolherbst: yeo
18:30karolherbst: yep
18:31imirkin: you could manually select a bunch of them
18:31karolherbst: mhhh
18:31karolherbst: then I need the list first
18:31imirkin: also i doubt the software decode of astc is very fast
18:31imirkin: the list is in one of the deqp-master.txt files
18:31karolherbst: ahhh
18:31karolherbst: makes sense
18:31karolherbst: so a release build will probably help a lot
18:31karolherbst: of both, mesa and deqp
18:31imirkin: there's also some option to deqp to make it dump a list
18:39karolherbst: ahh yeah.. deqp was built with -O0 :(
18:39karolherbst: mesa with -O2 at least
18:50fincs: imirkin: "it's supposed to have locations explicitly assigned, but it doesn't" <-- Spec says "in a separable program"
18:51fincs: "what's the point of the passthrough GS?" <-- Nvidia calls it Fast GS
18:51fincs: And I guess they shouldn't be lying when they call something "fast"
18:51fincs: I.e. there's probably some crazy fast path for passthrough GS
18:52karolherbst: fincs: I think "fast" is related to writing code though :p
18:52karolherbst: but yeah... there is hw support for it, no?
18:52fincs: "related to writing code" <-- I don't think so at all
18:52fincs: Yes, this is something special in hardware
18:52fincs: Special flag in shader header + some regs for specifying the passthrough mask
18:52karolherbst: could be faster then
18:52fincs: "why is the passthrough GS a useful construct given that NV_viewport_array2 exists" <-- I think Nvidia intends gl_ViewportMask to be used from passthrough GS mainly
18:53karolherbst: if it wouldn't be, they would implement it in the shader
18:54karolherbst: on the other hand.. why waste transistors on something nobody uses :p
18:54karolherbst: would be interesting to know if nvidia optimizes some gs shaders though
18:54fincs: https://webcache.googleusercontent.com/search?q=cache:NYyIIwMPcBoJ:https://gameworksdocs.nvidia.com/GraphicsSamples/CubemapRenderingSample.htm+&cd=1&hl=en&ct=clnk&gl=us
18:54karolherbst: if a gs does plain copies from an input, you could use gs passthrough on that, right?
18:55fincs: Not sure if Nvidia autodetects it, probably not
18:55karolherbst: if not, it's not worth it
18:55imirkin: my confusion arose from thinking that the passthrough GS couldn't access the whole input primitive
18:57fincs: Found this: https://on-demand.gputechconf.com/siggraph/2016/presentation/sig1609-kilgard-jeffrey-keil-nvidia-opengl-in-2016.pdf
18:57fincs: ^ explains passthrough gs
18:58karolherbst: imirkin: btw, did you test PIPE_CAP_DRAW_INFO_START_WITH_USER_INDICES?
18:58imirkin: given that it _can_ access the whole input primitive, it makes sense why it's useful
18:58imirkin: karolherbst: a bit, yeah
18:58imirkin: and also the code seemed like it worked fine
18:59karolherbst: imirkin: the doc seems a bit off
18:59imirkin: ?
19:00karolherbst: pipe_draw_info::draw doesn't exist, or does it?
19:00imirkin: heh, no
19:00karolherbst: :p
19:00karolherbst: ohhh wait
19:00karolherbst: it asked about start
19:00karolherbst: ehh, my mistake
19:00imirkin: :)
19:01imirkin: karolherbst: oh hey ... on the nvidia blob, can you run that viewport mask test again
19:01imirkin: but get rid of this line:
19:01imirkin: #extension GL_ARB_fragment_layer_viewport: require
19:01imirkin: i want to see if blob turns on viewport index when GL_NV_viewport_array2 is supplied
19:03karolherbst: Failed to compile VS: 0(10) : error C7531: global variable gl_ViewportMask requires "#extension GL_NV_viewport_array2 : enable" before use
19:03karolherbst: ohh wait
19:03karolherbst: ahh
19:03karolherbst: wrong thing
19:04karolherbst: imirkin: Failed to compile FS: 0(9) : error C7532: global variable gl_ViewportIndexIn requires "#version 430" or later
19:05imirkin: gl_ViewportIndex
19:05imirkin: oh, that's their stupid error
19:05imirkin: ok
19:05imirkin: thanks
19:06imirkin: you *do* have the #extension GL_NV_viewport_array2: require in there, right?
19:06karolherbst: yes
19:06imirkin: ok thanks
19:07imirkin: i'll remove the frag enable for that stuff then
19:07imirkin: it wasn't clear in the spec
19:15karolherbst: *sigh* https://github.com/karolherbst/envytools/commit/0b5cff6b699dc4ba15870776a2529c3ed287b3e0
19:15karolherbst: could have been so easy
19:16karolherbst: but no, you start with the fastest :p
19:16karolherbst: ohh
19:16karolherbst: I messed up the variant in the new ones
19:17karolherbst: https://github.com/karolherbst/envytools/commit/1ffd92f9b51609ac32b98aad83f819cda4b326dc
19:17karolherbst: (verified on TU116 or so)
19:17karolherbst: dunno if it's actually like this on TU10X mhhh
19:19imirkin: lol
19:19karolherbst: but yeah, those regs are still writeable :)
19:19imirkin: what does GV100 do?
19:19imirkin: G80:GV100 doesn't include GV100
19:19fincs: Argh apparently I cannot GL today
19:20imirkin: fincs: that's better than me ... i can't do GL any day.
19:20karolherbst: imirkin: ohh, you are right
19:20imirkin: karolherbst: better do like G80:TU102
19:20imirkin: and then TU102:
19:20imirkin: or TU102-, doesn't matter
19:20imirkin: that way it's obvious you didn't miss anything
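The point about "G80:GV100" not including GV100 is the usual half-open range convention. Below is a small sketch of that convention only, not of the rnndb parser: the enum, its ordering, and the helper name are hypothetical and chosen just to mirror the chipsets named in the discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, illustrative ordering of the chipsets named in the log. */
enum chipset {
    CHIP_G80,
    CHIP_GV100,
    CHIP_TU102,
    CHIP_TU104,
    CHIP_TU116,
    CHIP_END /* one past the newest chipset; models an open-ended "TU102-" */
};

/* Half-open range test: first is included, end is excluded,
 * in the same way "G80:GV100" leaves GV100 out. */
static bool in_variant_range(enum chipset chip,
                             enum chipset first, enum chipset end)
{
    assert(first <= end);
    return chip >= first && chip < end;
}
```

With this model, "G80:GV100" followed by "TU102-" leaves GV100 uncovered, while "G80:TU102" followed by "TU102-" covers every chipset exactly once, which seems to be the reason for the suggested split.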
19:20fincs: Where did I fuck up? v
19:20fincs: https://hastebin.com/uzatunuxiv.cpp
19:21karolherbst: imirkin: but we usually use the X- syntax, no?
19:21imirkin: you wrote GL, that was your first mistake
19:21imirkin: karolherbst: usually the : syntax
19:21fincs: Yes I know
19:21karolherbst: mhhh <bitfield pos="4" name="DEVID_WR" variants="NV40-"/>
19:21imirkin: fincs: what's not working?
19:21fincs: Displays nothing, triangle doesn't show up, only clear color
19:21karolherbst: I only see the - one
19:21imirkin: karolherbst: ah you mean for the infinite end? maybe you're right
19:21karolherbst: yeah
19:21imirkin: fincs: that's not ideal.
19:21imirkin: mesa or blob?
19:21fincs: mesa/nouveau on Switch
19:21karolherbst: : works as well though
19:22fincs: I've tried compiling existing examples and they work perfectly
19:22imirkin: if it's a debug build, should print various "Mesa Debug" things for when you screw things up
19:22fincs: It's not
19:22karolherbst: I keep the commit local though until I can test on TU10x
19:22imirkin: fincs: core or compat context?
19:22fincs: Core 4.3
19:23imirkin: you have to bind a VAO
19:23imirkin: no VAO = no draw
19:23fincs: lol seriously?
19:23fincs: That's so stupid
19:23imirkin: should have had an error to that effect too
19:23imirkin: perhaps it's an insufficiently-debug build
19:24imirkin: you can also hook into the KHR_debug things
19:24imirkin: so you get side-reports of various errors / potential issues
19:24karolherbst: okay, I can check that on a TU104 actually
19:25fincs: Okay I bind the dummy vao and it shows up
19:25fincs: Now to write the actual test
19:25karolherbst: MESA_DEBUG=1 :p
19:25imirkin: =]
19:25karolherbst: or was it MESA_DEBUG=context rather?
19:25karolherbst: dunno
19:26imirkin: it just prints them no matter what with a proper debug build
19:26imirkin: to stderr
19:26karolherbst: yeah, that's true
19:29imirkin: INTERESTING
19:29imirkin: if i just enable the viewport index output
19:29imirkin: in the omask
19:29imirkin: but don't actually write to it
19:29imirkin: then it doesn't work
19:29fincs: Viewport mask + index for me results in a GPU crash
19:29imirkin: let's see what happens if i do the write without the omask...
19:30fincs: Okay
19:30fincs: gl_ViewportMask[0] = 0xf;
19:30fincs: With nouveau/mesa running on Switch, this doesn't actually broadcast
19:31imirkin: heh
19:31imirkin: so we're doing something wrong at the shader / context setup level
19:31imirkin: that's just SUPER DUPER
19:31fincs: Writing gl_ViewportIndex + gl_ViewportMask results in a GPU Crash
19:31imirkin: fincs: with nouveau/mesa too?
19:31fincs: And my application gets booted out with an error message
19:31fincs: Yes
19:32imirkin: can you try to "bisect" the setup
19:32fincs: Wdym
19:32imirkin: well
19:32imirkin: presumably there are a bunch of differences between nouveau/mesa and the working thing
19:32fincs: The working thing uses my API, which is based off NVN
19:32imirkin: right
19:32imirkin: but ultimately it generates a pushbuf
19:32imirkin: which gets sent to the gpu
19:32fincs: gl_ViewportMask[0] = 2;
19:32fincs: This results in no primitives being output
19:32imirkin: yes
19:33imirkin: that's the behavior i see as well
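For reference, the expected (not the observed) behavior of the two masks tried above can be modeled on the CPU. This is a sketch that assumes each set bit i of gl_ViewportMask[0] selects viewport i for broadcast; the function name is hypothetical and nothing here touches the GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the indices of the viewports selected by mask into out[]
 * (at most cap entries) and returns how many were written. */
static size_t viewports_from_mask(uint32_t mask, unsigned out[], size_t cap)
{
    size_t n = 0;

    assert(out != NULL);
    for (unsigned i = 0; i < 32 && n < cap; i++) {
        if (mask & (UINT32_C(1) << i))
            out[n++] = i;
    }
    return n;
}
```

Under that assumption a mask of 0xf should reach viewports 0 through 3 and a mask of 2 should reach viewport 1 only, so "no primitives being output" for a mask of 2 points at the setup rather than at the shader.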
19:33fincs: gl_ViewportIndex = 1; gl_ViewportMask[0] = 2; <-- This also results in crash
19:33imirkin: anyways, perhaps we can copy whatever setup is being done by the nvn-based thing
19:33imirkin: or do you not have access to that pushbuf?
19:33fincs: Here, have fun: https://github.com/devkitPro/deko3d/blob/master/source/maxwell/gpu_3d_base.cpp#L59
19:34imirkin: cool thanks
19:34fincs: I already tried playing around with this init without success, and I couldn't get my thing to fail
19:34imirkin: but perhaps you can get the nouveau thing to work? :)
19:34imirkin: or alternatively
19:34imirkin: perhaps you can comment out random bits of it
19:34fincs: Commented out every single unknown thing and it would still work
19:34imirkin: oh, if it's the SPA version, i'll kill someone
19:34fincs: !!!
19:34fincs: It might as well be
19:35fincs: Does nouveau set SPA version at all?
19:35imirkin: no
19:35imirkin: it's never mattered for anything
19:35fincs: Okay
19:35fincs: So that's what I'm going to test
19:35fincs: Please hold on tight
19:35imirkin: the SPA version in the shader header is also set to some old thing
19:35fincs: Yes
19:35imirkin: i tried increasing that without effect
19:35fincs: And I "fixed" that
19:36fincs: Let me try patching nouveau to set SPA version
19:37imirkin: interesting
19:37imirkin: w << CmdInline(3D, VertexProgramPointSize{}, E::VertexProgramPointSize::Enable{});
19:37imirkin: i could imagine that mattering too
19:37imirkin: unlikely tho
19:38fincs: BEGIN_NVC0(push, NVC0_3D(0x0310), 1); PUSH_DATA (push, 0x0503);
19:38fincs: That should do it, right?
19:39imirkin: something like that
19:39imirkin: but not that
19:39imirkin: try SUBC_3D(0x0310)
19:39fincs: Ah
19:39fincs: Good catch
19:40fincs: Building
19:40imirkin: could also be something like this: w << Cmd(3D, SetRenderLayer{}, E::SetRenderLayer::UseIndexFromVTG{});
19:40imirkin: i did think of that though
19:40imirkin: and tried force-enabling it
19:40imirkin: with no effect
19:41fincs: No dice
19:41fincs: Maybe spa needs to be set both in reg and in shader header
19:41imirkin: probably does
19:41imirkin: i thought you already fixed up the shader header though?
19:41fincs: Yes but not in nouveau code
19:41imirkin: o
19:42fincs: I generate it with slightly different code
19:42fincs: It's... complicated
19:42fincs: What did you do to fix it? 0x20061 -> 0x60061?
19:44fincs: ^ did that, still no dice
19:44imirkin: i forget
19:44imirkin: i tried a few
19:44fincs: - vp->hdr[0] = 0x20061 | (1 << 10);
19:44fincs: + vp->hdr[0] = 0x60061 | (1 << 10);
19:45fincs: Hmm
19:45fincs: Viewport near/far
19:45fincs: Are they initialized at all?
19:45fincs: Oh right, it works with ViewportIndex alone so that shouldn't be the issue
19:47fincs: Argh, what are we missing
19:48imirkin: well this is odd.
19:48fincs: At least we now know there's nothing wrong with kernel side init
19:48fincs: And the problem is 100% within nouveau userland
19:49imirkin:has an idea
19:49imirkin: so an interesting observation
19:50imirkin: is that if i do the write to ViewportIndex "first"
19:50imirkin: it doesn't help
19:50fincs: For me it crashes both ways
19:50fincs: Before and after
19:50fincs: And like I said, compiler side of things is working
19:52imirkin: yeah, no, i got nothin'
19:54fincs: Let me try forcing VP_POINT_SIZE to 1
19:54fincs: Nope, that's not it
19:55imirkin: w << Cmd(3D, Unknown514{}, 8 | (getDevice()->getGpuInfo().numWarpsPerSm << 16));
19:55imirkin: what about something like that?
19:57fincs: That's one of the unknowns, I tried commenting it out in order to make it fail and it would still work
19:57imirkin: well, initial context state could differ between nouveau and however that thing is set up
19:58fincs: Initial context is the same for both nouveau and deko3d when running on Switch
19:58imirkin: ah right, good point
19:58fincs: (because they both run off Nvidia blob)
19:58fincs: Let me try provoking vertex
19:58imirkin: can you just comment out like half of that setup3DEngine thing
19:58imirkin: and see if that causes this to break
19:58fincs: Yeah that's what I did originally
19:59fincs: Everything with "unknown" commented out
19:59fincs: As well as *all* of the firmware calls
19:59imirkin: so ... more commenting needed then :)
19:59imirkin: that's why i just said bisect
19:59imirkin: literally comment out half
20:00imirkin: unfortunately some things are probably required for any rendering
20:00fincs: That's also a problem
20:02fincs: Let me try screwing around with one last thing: VIEW_VOLUME_CLIP_CTRL
20:02fincs: I set that to 0x181D
20:03fincs: Nope :\
20:04karolherbst: imirkin: you have no further comments on the commit, right? I just confirmed it on a TU104: https://github.com/karolherbst/envytools/commit/1ffd92f9b51609ac32b98aad83f819cda4b326dc
20:04karolherbst: that TU104 even boots with 2.5 :)
20:04karolherbst: but I am too lazy to check all the other PCI regs :D
20:05karolherbst: but mhhhhh
20:05karolherbst: this TU104 officially is also only PCIe 3.0 compliant
20:05karolherbst: ufff
20:05karolherbst: I smell broken PCIe 4.0 support
20:07karolherbst: maybe I should declare 0 as 8_0 as well mhhh
20:07karolherbst: and add a TODO
20:14imirkin: karolherbst: G80:TU102
20:14karolherbst: ehhh
20:14karolherbst: I forgot to push my local changes
20:14karolherbst: ahh no, I just linked to the old version
20:14imirkin: lgtm
20:16fincs: Okay, time to comment out shit
20:22fincs: I commented out almost *everything* except for the stuff I can't remove
20:22fincs: And it still works
20:22fincs: Absolutely minimal 3D init: https://hastebin.com/edufogatur.php
20:22imirkin: ok
20:23imirkin: so then it's the other way
20:23imirkin: we're doing something to break it
20:23fincs: The only other commands I'm calling are
20:23fincs: dkCmdBufBindRenderTarget
20:23fincs: dkCmdBufSetViewports
20:23fincs: dkCmdBufBindRasterizerState
20:23fincs: dkCmdBufBindColorState
20:23fincs: dkCmdBufBindColorWriteState
20:23imirkin: and i assume SetViewports doesn't do anything too funny?
20:23fincs: (state setting commands)
20:23fincs: https://github.com/devkitPro/deko3d/blob/master/source/maxwell/gpu_3d_base.cpp#L329
20:24fincs: Just writes to ViewportTransform/Viewport registers
20:24imirkin: right, just the standard stuff
20:24fincs: And the rasterizer state:
20:24fincs: https://github.com/devkitPro/deko3d/blob/master/source/maxwell/gpu_3d_state.cpp#L30
20:24fincs: (ColorState is just blend, ColorWriteState is just enabling which colors to allow writing)
20:25imirkin: yeah, nothing unexpected
20:25imirkin: so nouveau is setting something it shouldn't
20:25imirkin: that screws everything up
20:25fincs: It's starting to look like it, yes
20:26imirkin: i'm just going to start by disabling the magic 3d init :)
20:28fincs: Magic 3d init is identical to mine, except for a single register which I tried copying over to my init and it would still work
20:40imirkin: ok, i've cleared out basically all of the init we do in nvc0_screen, and other than breaking clears, it doesn't seem to have helped
20:41fincs: :\
20:41fincs: Are there any other places where nouveau does "magic" writes?
20:42fincs: I recall fsh setup was full of magic
20:42imirkin: don't think so... let's see
20:42imirkin: BEGIN_NVC0(push, SUBC_3D(0x0360), 2);
20:42imirkin: PUSH_DATA (push, 0x20164010);
20:42imirkin: PUSH_DATA (push, 0x20);
20:42imirkin: heh
20:42imirkin: so we do
20:42fincs: Yeah
20:42fincs: But hmm, it's fsh magic though
20:43fincs: For a simple fragment shader I write 0x087F6080 / 0x20 there
21:20imirkin: welp, i've commented out nearly everything
21:20imirkin: still no go.
21:57fincs: :\
21:57karolherbst: :/
21:57fincs: I guess the problem would not be init but somewhere else
21:57karolherbst: maybe it's indeed some gr stuff
21:57fincs: State stuff
21:58fincs: karolherbst: nouveau running on nvidia blob on Switch fails
21:58karolherbst: well.. sure
21:58karolherbst: or what did you test?
21:58fincs: I finally tested switch-mesa
21:58karolherbst: ohh...
21:58karolherbst: I see
21:58fincs: With imirkin's branch
21:58karolherbst: ehhh
21:58karolherbst: well, yeah, then it's something in mesa
21:58fincs: It has to be, yeah
21:58karolherbst: mhhh
21:59karolherbst: imirkin: is there anything obviously missing from the mmt trace?
21:59fincs: Could it be a GL bug?
21:59karolherbst: might be that some ioctl is parsed incorrectly
21:59karolherbst: but it looked fine
21:59karolherbst: fincs: doubtful, why would it be?
21:59karolherbst: those are shader only features, no?
21:59fincs: Should be
21:59fincs: But it also interacts with multiviewport
21:59karolherbst: well.. sure
21:59karolherbst: but I mean you just have to change some shader related bits
21:59karolherbst: but mhhh
22:00imirkin: i've commented out like 90% of non-essential things
22:00imirkin: both init as well as state stuff
22:00fincs: Would be nice to see a log of calls into nouveau functions
22:00karolherbst: imirkin: in nvc0?
22:00imirkin: yes
22:00karolherbst: mhh
22:00imirkin: nvc0_state, nvc0_state_validate
22:00imirkin: nvc0_screen
22:00imirkin: turned off compute
22:01imirkin: disabled 3d blits
22:01fincs: Hmm
22:01karolherbst: GP104_3D.UNK0220[0x2] = 0x1 ?
22:02karolherbst: what about this thingy
22:02fincs: What does the fragment shader header + code look like?
22:02imirkin: tried adding it.
22:02karolherbst: mhhh
22:03karolherbst: I have an idea
22:03imirkin: another interesting point is that if i manually add in the viewport index write at the start
22:03imirkin: and add it to the output map
22:03imirkin: it doesn't help
22:03imirkin: the write has to be RIGHT BEFORE the viewport mask write
22:04imirkin: fincs: http://paste.debian.net/1140001/
22:04fincs: At this point the only thing we haven't checked is the fragment shader
22:04fincs: Okay, let me compare
22:04imirkin: in the trace, that write has a (st 0xd)
22:04imirkin: which i've tried replicating too
22:04imirkin: as well as disabling the scheduler entirely
22:04imirkin: and i've tried adding yields all over the place.
22:05karolherbst: yeah.. that won't help :p
22:05imirkin: i know
22:05imirkin: i was desperate though :)
22:05imirkin: it's clearly getting the value
22:05imirkin: writing a 0 causes nothing
22:05fincs: Wait a minute
22:05fincs: Is that the fragment shader?
22:05imirkin: oh oops
22:05imirkin: no
22:05imirkin: why does frag shader matter?
22:06fincs: I want to see if I have it different
22:06fincs: As in the header
22:06imirkin: http://paste.debian.net/1140003/
22:07imirkin: other than version stuff, looks the same
22:07fincs: Yeah... the only thing is that I have 0x000FF000 at HDR[10] but that shouldn't matter...
22:08imirkin: for VP
22:08imirkin: i think that was necessary on like nvc0. or not even necessary but cargo-culted. dunno
22:08fincs: I have that in the fsh too apparently
22:08imirkin: uhhh
22:08imirkin: well, it's 0 in the trace here
22:09fincs: Well yeah
22:09fincs: This is ignored in fsh as there is no output :p
22:09imirkin: (i.e. trace of blob)
22:09fincs: I was just lazy and initialized this field regardless of shader type
22:09imirkin: hehe
22:10fincs: Hmm, in that vsh header, you don't have that weirdo 0x02000000 bit set
22:11fincs: In hdr[0]
22:11imirkin: as you perhaps recall
22:11imirkin: that was like the first thing i had played with
22:11fincs: True
22:11imirkin: but i'll set it just in case
22:12karolherbst: ahh ehh, crap, deqp surfaceless doesn't build? :/
22:14imirkin: used to...
22:14karolherbst: well.. building master branch
22:14imirkin: fincs: yeah, no help
22:15imirkin: i've also set 0x0310
22:15fincs: Hmm
22:15fincs: I wonder if it's a shader setup issue
22:15fincs: nouveau never writes to the VP_A regs right?
22:15karolherbst: ohhhh
22:15karolherbst: imirkin: the VK-GL-CTS is the upstream now
22:15karolherbst: interesting
22:16karolherbst: not deqp
22:16imirkin: karolherbst: no, they're different
22:16karolherbst: nono, really
22:16karolherbst: I am serious about that
22:16imirkin: VK-GL-CTS has ES-CTS tests
22:16imirkin: (and GL-CTS obviously)
22:16karolherbst: nope
22:16karolherbst: VK-GL-CTS is the upstream now
22:16karolherbst: even referenced from deqp docs
22:16imirkin: ok.
22:17imirkin: that's news to me.
22:17karolherbst: https://source.android.com/devices/graphics/deqp-testing
22:17karolherbst: "This page is not maintained and will be deprecated. See the Vulkan and OpenGL CTS Wiki for current information."
22:17imirkin: ok
22:18karolherbst: they even merge from there into the tree
22:18karolherbst: interesting
22:18karolherbst: they have a master and "upstream-master" branch
22:18karolherbst: in deqp
22:18imirkin: cool
22:18imirkin: nice that it's all unified then
22:18karolherbst: yeah
22:18imirkin: again - news to me :)
22:18karolherbst: yeah.. to me as well
22:18imirkin: fincs: so ...
22:19imirkin: i'm going to send you a commit
22:19imirkin: can you see if running it makes it work with "nouveau"?
22:19fincs: Okay
22:19imirkin: the basic theory is that there are multiple things wrong.
22:20imirkin: fincs: https://github.com/imirkin/mesa/commits/for-fincs
22:21imirkin: i.e. we're screwing something up with init
22:21fincs: "great hacks!" is the only commit right?
22:21imirkin: and also some sort of underlying thing isn't enabled
22:21imirkin: yes
22:21imirkin: it might cause conflicts though. i commented out a lot of stuff.
22:21imirkin: unfortunately comment-region comments out each line individually
22:22fincs: Building
22:23imirkin: i disabled a _lot_ of stuff, so this basically will only work with VERY simple applications
22:23fincs: Aaaaaand I still only get viewport0 output
22:23imirkin: bleh.
22:23imirkin: what else is there left to remove?
22:26fincs: Let me try setting dummy config for vertex shader A
22:28fincs: ... I get a crash
22:30fincs: Whoops, I fucked up the command list
22:30fincs: Still broken with dummy vsh-A config added
22:35fincs: Hmm, what's this "MACRO_TEP_SELECT" thing
22:37fincs: As well as MACRO_GP_SELECT
22:37fincs: Let me try disabling these macros (replacing them with straight up SP_SELECT(n) writes)
22:38fincs: Aaaand no dice
22:40fincs: What the fuck is there left to disable
22:43imirkin: ;)
22:46fincs: ... Could the drawing code be bad
22:46fincs: As in
22:46fincs: Draw calls
22:49imirkin: sure, anything is possible
22:49imirkin: it's just going through the regular nvc0_draw_arrays in this case
22:49imirkin: (in my case at least)
22:50fincs: Hmm, I don't see anything wrong :\
22:51fincs: Maybe something in nvc0_draw_vbo?
22:52imirkin: nothing jumps out at me
22:52fincs: I don't even use vertex arrays in my test
22:55fincs: Meh, I don't see anything suspicious either
23:00fincs: Argh, this is probably something very stupid
23:00fincs: Is it possible to trace the state that comes from GL at all?
23:00fincs: As in, a full dump of the current GL state
23:01imirkin: not sure what you mean
23:01imirkin: like the cso's?
23:01imirkin: GALLIUM_TRACE=foo.xml will trace all the gallium calls and dump various structures
23:01fincs: Whichever thing gallium/mesa internally uses to keep track of GL state
23:02karolherbst: nice.. you can get rid of 300 targets in deqp by removing the glslang and spirv-tools external stuff :)
23:03karolherbst: imirkin: the actual reason I got curious here was that google's master tree only supports python2, but VK-GL-CTS aka upstream-master was python3+
23:03imirkin: "GL state" a bit tricky
23:06karolherbst: huh?!?
23:06karolherbst: imirkin: the test hangs with nvidia now....
23:06imirkin: karolherbst: which test?
23:06karolherbst: the shader_test you linked
23:07imirkin: i asked you to change something in it, right?
23:07imirkin: i forget what
23:07imirkin: oh no - you already did
23:07karolherbst: yeah...
23:07karolherbst: well.. nothing happens
23:07karolherbst: it's odd
23:07karolherbst: glxinfo does work, so.. mhh
23:09fincs: I'm this close to just saying fuck it and dumping pushbuffers and parsing them
23:09karolherbst: I have a libdrm pushbuf parser :p
23:09karolherbst: https://github.com/envytools/envytools/pull/203
23:09fincs: I have... a Switch emulator :p
23:09karolherbst: :D
23:10karolherbst: I bet that one would not help
23:10fincs: With a tiny change I did which vomits everything submitted to the GPU
23:10karolherbst: as I assume no game is using that
23:10karolherbst: also
23:10karolherbst: we already have nvidia traces
23:10fincs: The problem is not what nvidia does
23:10fincs: The problem is what nouveau does - there's something in here which is fucking things up :p
23:11karolherbst: imirkin: do you have a libdrm debug build?
23:11karolherbst: I can adjust depushbuf to correctly parse the full thing
23:11karolherbst: if anything is missing
23:11imirkin: ppprobably
23:11karolherbst: :)
23:12imirkin: how do i get it to dump stuff again?
23:12karolherbst: NOUVEAU_LIBDRM_DEBUG=255
23:12karolherbst: ohh you have to manually add -DDEBUG to the build I think
23:13imirkin: i have it.
23:13imirkin: now what do i do?
23:13karolherbst: pastebin it
23:13karolherbst: there is probably still stuff missing in the tool, so I'd like to take a look first
23:13imirkin: http://paste.debian.net/1140016/
23:14karolherbst: ahh yeah :) I will need to adjust stuff I think
23:15karolherbst: that's gp108, right?
23:15imirkin: yes
23:15fincs: Ok I have dumps
23:15fincs: Now to parse them
23:15imirkin: fincs and karolherbst ... race!
23:15karolherbst: ahh yeah "unknown mode 4" :)
23:16karolherbst: ohh 4 was this annoying one
23:16imirkin: 1IC0?
23:16imirkin: which one's 4?
23:17karolherbst: yeah.. I think so
23:17karolherbst: the one line long one
23:18imirkin: how are you counting these?
23:18imirkin: is it the one that starts with 0xa?
23:18fincs: Got this parsed
23:18imirkin: basically first command goes to addr+0, and all the rest go to addr+4
23:19karolherbst: demmt has this mthd_data_available field for it
23:22karolherbst: imirkin: it's NVC0_FIFO_PKHDR_IL
23:22imirkin: oh
23:22imirkin: that's just the immediate mode?
23:22imirkin: i.e. the value is in the pushbuf
23:22imirkin: er, in the command
23:23karolherbst: yeah.. I think so
23:23imirkin: instead of the size
23:23karolherbst: so instead of the size you have the data
23:23karolherbst: yeah
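The packet header modes discussed here can be sketched as a small decoder. This is a minimal illustration only: the bit layout is assumed from nouveau's NVC0_FIFO_PKHDR_* macros (mode in bits 31:29, count or inline data in bits 28:16, subchannel in bits 15:13, method address divided by 4 in bits 12:0), and `decode_pkhdr` and `MODES` are hypothetical names, not part of demmt or depushbuf.

```python
# Hypothetical helper; field layout assumed from nouveau's NVC0_FIFO_PKHDR_* macros.
MODES = {
    1: "incrementing",      # each data word goes to the next method
    3: "non-incrementing",  # every data word goes to the same method
    4: "immediate",         # IL: the value lives in the header itself
    5: "increment-once",    # 1I: first word to addr+0, the rest to addr+4
}

def decode_pkhdr(word):
    """Split one 32-bit pushbuf header word into its fields."""
    mode = (word >> 29) & 0x7
    arg = (word >> 16) & 0x1FFF       # count, or inline data in immediate mode
    subc = (word >> 13) & 0x7
    method = (word & 0x1FFF) << 2     # stored as method address / 4
    immediate = (mode == 4)
    return {
        "mode": mode,
        "mode_name": MODES.get(mode, "unknown"),
        "subchannel": subc,
        "method": method,
        "value": arg if immediate else None,
        "count": None if immediate else arg,
    }

hdr = decode_pkhdr(0x80000482)
print(hdr["mode_name"], hex(hdr["method"]), hdr["value"])
```

Under these assumptions, a mode 4 header carries its payload in the field that otherwise holds the word count, so no data words follow it in the pushbuf.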
23:23fincs: Lmao nouveau uploads program data using inline2mem?
23:24imirkin: why not
23:24fincs: I find that hilarious
23:24fincs: I guess it makes more sense on PC
23:24imirkin: on blob it uses p2mf -- that's the same thing, right?
23:24fincs: (with the non-uniform memory model)
23:24fincs: On Switch we have unified mem
23:24imirkin: right, there could be a lot of optimizations for a UMA situation
23:25imirkin: you want code in vram
23:25imirkin: and you don't want the cpu to be sitting there waiting for pcie
23:26fincs: ... This is only uploading viewport0
23:26imirkin: huh?
23:26imirkin: oh, initially
23:26imirkin: it'll get the rest "later"
23:26fincs: As in
23:26fincs: It literally never sets that
23:26imirkin: inconceivable.
23:27fincs: This is with your "for-fincs" branch btw
23:27imirkin: yeah, i'm running with basically that
23:27karolherbst: imirkin: it's probably still broken, but... https://gist.githubusercontent.com/karolherbst/2a082b7492074a7e5485cef24c6e6fd5/raw/65628a055f9c7dd57724d560d48e64d5699bfb4c/gistfile1.txt
23:27imirkin: something is off in your decoding, or something else
23:27karolherbst: .. let me check
23:28imirkin: that was for fincs
23:28imirkin: although your decoding is also off :)
23:28karolherbst: yeah
23:28karolherbst: but
23:28karolherbst: only the first pb :p
23:28karolherbst: it gets better later
23:28fincs: I got until the point where there is a draw call
23:28fincs: So I should have it all
23:28karolherbst: "UPLOAD.DST_ADDRESS_HIGH 0" is where it is right
23:28karolherbst: and later
23:29karolherbst: I think I messed up the mode 4 bits.. mhh
23:30imirkin: fincs: you're right
23:30imirkin: there's something in core mesa
23:30imirkin: which detects usage of "later" things
23:30imirkin: this is why adding the viewport index helped
23:30imirkin: FUUUCK
23:30imirkin: sorry
23:31fincs: Hahahahahahahaha
23:31fincs: I called it
23:31imirkin: really very sorry i wasted a day of your time on it
23:31imirkin: ugh
23:31karolherbst: ufff
23:31fincs: Np :)
23:31HdkR: :D
23:31fincs: So it wasn't nouveau's fault, it wasn't kernel's fault
23:31fincs: It was
23:31fincs: mesa/gallium
23:32imirkin: yes
23:32imirkin: iirc it's core mesa
23:32imirkin: i completely forgot about it
23:32fincs: So we need to make mesa treat gl_ViewportMask in the same way as gl_ViewportIndex
23:32fincs: I.e. disable clever optimization that is so foot shooty
23:32imirkin: ah no, it's in st/mesa
23:33fincs: I don't know where st ends and gallium starts, sorry
23:33imirkin: st drives gallium
23:33imirkin: gallium is an API
23:35karolherbst: huh.. my parsing is correct though... weird
23:35imirkin: there's something off in there
23:35karolherbst: 0x80000482 -> method 0x1208 value 0
23:35imirkin: e.g. i don't see any viewport transform stuff in there at all
23:35imirkin: should be in there somewhere
23:36imirkin: and naturally it all works fine now.
23:37imirkin: skeggsb: tagr: ignore my earlier question about viewport stuff. it was a stupid stupid bug on my end.
23:37fincs: ( ͡° ͜ʖ├┬┴┬┴
23:37karolherbst: 0x238c is CB_POS, but my parser doesn't pick it up either
23:37karolherbst: mhhh
23:38karolherbst: I think I need to do something about those subchannels
23:39fincs: I'm still wondering why writing to both gl_ViewportIndex and gl_ViewportMask crashes on my Switch, but not on imirkin's Pascal card
23:39imirkin: brb
23:42imirkin: probably an extra check by the blob software that you don't do bad things
23:42fincs: Blob seems to output the two writes anyway, at least on Switch
23:44imirkin: fincs: fixed thing here: https://github.com/imirkin/mesa/commits/viewport_array2
23:44fincs: So the change is in st_atom.c?
23:45imirkin: yes.
23:45fincs: Mkay
23:45fincs: Let's rebase again! :p
23:45imirkin: (and shader_enums.h to add the bit)
23:47imirkin: i expect i'll have passthrough gs next weekend or so
23:47fincs: ( ͡° ͜ʖ ͡°)( ͡° ͜ʖ ͡°)
23:47imirkin: i think it's going to be easier than i first thought -- it doesn't change the array-ness of inputs at all
23:47fincs: More like, it has no inputs at all lol
23:47imirkin: just a lot of extra rules + passthrough qualifier, plus have to make synthetic outputs
23:48imirkin: hm? no, it can have inputs
23:48imirkin: varyings
23:48fincs: Okay rebased
23:48imirkin: it still has a gl_in[] and so on
23:48imirkin: all subject to the usual rules
23:48fincs: Buildtime
23:48imirkin: + a bunch of new ones
23:48fincs: Gotta love meson's speed ( ͡° ͜ʖ ͡°)
23:50fincs: It finally works \o/
23:50fincs: (just tested)
23:52fincs: https://github.com/devkitPro/mesa/commits/switch-21.x
23:55imirkin: yay
23:55imirkin: thanks for tracking it down
23:55imirkin: that should have been the first thing i checked :(
23:56fincs: More importantly, thanks to you for implementing this stuff :)
23:56imirkin: i even remembered that opt while i was writing the core stuff
23:56imirkin: but then forgot all about it =/
23:56imirkin: yw. hopefully someone finds it useful
23:57fincs: Btw
23:57fincs: mesa 21.0 when
23:57imirkin: few months
23:57imirkin: quarterly releases
23:57fincs: Ah darn
23:57fincs: We only package stable releases
23:57imirkin: er
23:57imirkin: well, 21 is going to be in 2021
23:57imirkin: but 20.1 will be ... actually moderately soon, i imagine
23:58fincs: Ah I didn't remember it had to do with the data
23:58imirkin: it's april now
23:58fincs: *date
23:58imirkin: when did 20.0 come out...
23:58fincs: I hope the dual issue stuff + viewport mask can get in for 20.1
23:58imirkin: feb 19
23:58karolherbst: cool subchan is working now :)
23:58imirkin: so ... all things being equal, may 19
23:59imirkin: it never works quite like that of course, there's variability in these things
23:59imirkin: i don't think 20.1 has been branched yet
23:59fincs: (*cough* covid19 *cough*)
23:59imirkin: once it is, usually about a month
23:59imirkin: alright, well i still have to write a pile of piglit tests for this thing