00:04 imirkin: fincs: sound like a reasonable test for the viewport stuff to basically have an "inverse" matrix multiply the position that would be output, and then have a corresponding viewport swizzle?
00:04 imirkin: and in the end it should be a no-op
00:05 imirkin: for the viewport swizzle stuff, that is
00:05 fincs: I guess you could do that
00:05 fincs: However I think it's a bit overcomplicated
00:06 imirkin: is there an easier way to test it?
00:06 fincs: Just swizzle manually gl_Position in the vsh instead?
00:06 imirkin: well, the matrix is the way to do that
00:06 imirkin: that way i just have one shader with a configurable matrix
00:06 fincs: Ah
00:06 imirkin: vs 8 shaders that do it "by hand"
00:06 imirkin: right?
00:06 fincs: Well yes
00:07 imirkin: i find viewports to be one of the less understandable aspects of all this stuff, unfortunately
00:07 imirkin: should be interesting to come up with a decent test...
00:07 fincs: If you want a shader that will work with any possible combination of swizzles then yeah...
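
The "inverse matrix plus swizzle is a no-op" idea discussed above can be sketched outside a shader. This is only an illustration in Python, assuming each output component of the swizzle selects one signed input component (eight possible selectors per component); the selector names and helper functions are made up for this sketch and are not taken from piglit or mesa. When the four selectors form a signed permutation, the swizzle matrix is orthogonal, so its transpose is the inverse a single configurable shader would apply before the hardware swizzle undoes it.

```python
# Selector names are illustrative; each maps to (source component index, sign).
SELECTORS = {
    "+x": (0, 1.0), "-x": (0, -1.0),
    "+y": (1, 1.0), "-y": (1, -1.0),
    "+z": (2, 1.0), "-z": (2, -1.0),
    "+w": (3, 1.0), "-w": (3, -1.0),
}


def swizzle_matrix(sel):
    """Build the 4x4 matrix S such that S * pos is the swizzled position."""
    m = [[0.0] * 4 for _ in range(4)]
    for row, name in enumerate(sel):
        src, sign = SELECTORS[name]
        m[row][src] = sign
    return m


def transpose(m):
    """Transpose of a 4x4 matrix (the inverse, for a signed permutation)."""
    return [list(col) for col in zip(*m)]


def apply(m, v):
    """Multiply a 4x4 matrix by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def is_signed_permutation(sel):
    """True if every source component is used exactly once (invertible case)."""
    return sorted(SELECTORS[s][0] for s in sel) == [0, 1, 2, 3]


def roundtrip(sel, pos):
    """Pre-multiply by the inverse (what the shader would do), then swizzle."""
    s = swizzle_matrix(sel)
    pre = apply(transpose(s), pos)
    return apply(s, pre)
```

A swizzle that reads the same source component twice is not invertible, so a test built this way would have to stick to selector combinations for which `is_signed_permutation` holds.
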
00:08 fincs: The viewport swizzle stuff is meant to be used alongside gl_ViewportMask to autogenerate all the 6 faces of a cube
00:08 imirkin: ("all this stuff" = GL)
00:08 imirkin: yeah whatever
00:08 imirkin: functional tests have nothing to do with how it's MEANT to be used :)
00:08 fincs: Yeah but :p
00:09 imirkin: anyways, dinner shortly, but might be able to finish up NV_viewport_swizzle and NV_viewport_array2 tonight
00:09 fincs: :)
00:09 imirkin: (the former just needs tests ... the latter is "done", but needs some actual checking, followed by tests)
00:10 fincs: I'm excited to see support for these missing NV features being added :)
00:11 imirkin: the passthrough shader will require some thought
00:11 fincs: True
00:11 imirkin: if there are any other useful NV_* that you're aware of, esp easy-to-do ones, let me know
00:11 imirkin: (or EXT_*, i'm not picky)
00:13 fincs: Fragment shader interlock :p
00:13 fincs: (probably too cursed though)
00:14 fincs: (ARB_fragment_shader_interlock, dunno if the NV_ version has more stuff in it)
00:15 fincs: Looks identical at first glance (other than ARB<->NV suffixes)
00:17 fincs: 0x1224 has the fragment shader interlock layout: 0=disabled 1=pixel_interlock_ordered 2=pixel_interlock_unordered 3=sample_interlock_ordered 4=sample_interlock_unordered (code forces 3/4 setting if per-sample invocations are enabled)
00:18 fincs: The SASS generated for the beginInvocationInterlock/endInvocationInterlock calls looks absolutely disgusting, I still need to reverse that
00:20 fincs: Also (unrelated) - it looks like there is a special fragment shader mode when the output is a constant color, and nvidia detects/uses it
00:25 fincs: I tried playing with those regs myself but I was unable to make the output change at all; not even if the output is non-constant
01:28 imirkin: hm ok
01:28 imirkin: yeah, the interlock stuff requires ... work
01:28 imirkin: it'd be easy to detect constant color output
01:29 imirkin: is that really a common use-case?
01:29 imirkin: i'd think it's better not to have that opt to get proper coverage via tests (which tend to output a fixed color)
01:29 fincs: Disclaimer: I don't really know what this constant color thing is really doing, and I haven't been able to see any visible changes by playing around with the regs
01:29 fincs: Compiler generates proper fsh anyway in that case, and it's bound
01:30 fincs: The regs are used "in addition", not "replacing"
01:36 imirkin: hm, need to check where that gl_ViewportMask thing is enabled
01:36 imirkin: coz otherwise it writes off the end of the shader header...
01:36 imirkin: good thing i still have the original traces
01:36 fincs: Hm wdym?
01:37 imirkin: the export enable bit in the shader header
01:37 imirkin: it'd be at 0x50 for 0x3a0
01:37 imirkin: but the shader header is 0x50, so ...
01:37 fincs: Heh
01:38 imirkin: either it's longer, or the bit is elsewhere
01:38 fincs: What's the offset for writing to gl_ViewportMask?
01:38 fincs: (in e.g. AST)
01:39 fincs: Ah, 0x3a0
01:39 fincs: Hmm let me think
01:39 imirkin: 0x01000000 19 = 0x1000000
01:39 imirkin: i guess it's there =]
01:41 imirkin: which leads one to ask ... doesn't that conflict with texcoord output enablement?
01:41 imirkin: there should be 8 texcoords, 4 components each...
01:42 fincs: Don't the bits in the shader header correspond directly with hardware attribute indices?
01:42 imirkin: oh no, i'm doing math wrong
01:42 imirkin: there's an extra 0x40 in there
01:43 imirkin: yeah, all good
01:43 imirkin: it fits.
01:43 fincs: ( ͡° ͜ʖ ͡°)
01:43 imirkin: no funny business required.
01:44 fincs: If my calculations are correct, it would be bit0 of "OmapReserved" (i.e. last field)
01:44 fincs: (in https://nvidia.github.io/open-gpu-doc/Shader-Program-Header/Shader-Program-Header.html)
01:44 imirkin: your calculations are off
01:45 imirkin: it's bit 24 of the last word
01:45 fincs: Yeah
01:45 fincs: Same thing
01:45 fincs: "OmapReserved" is the last byte :p
01:45 imirkin: oh. right. the remainder is texcoord
01:45 imirkin: yea
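
The arithmetic in the exchange above can be reproduced with a short sketch. It assumes one enable bit per 4-byte attribute slot, an output map counted from word 13 of the header, and the "extra 0x40" subtracted from the attribute address; the base word is an assumption chosen because it reproduces the numbers quoted in the chat (0x3a0 landing on bit 24 of word 19), not a value read from the NVIDIA document. The function and constant names are hypothetical.

```python
HEADER_BYTES = 0x50               # shader header size mentioned in the discussion
HEADER_WORDS = HEADER_BYTES // 4  # 20 32-bit words

OMAP_BASE_WORD = 13     # assumption: word the output-enable bits are counted from
OMAP_FIRST_ADDR = 0x40  # the "extra 0x40" from the discussion


def omap_bit(addr, first_addr=OMAP_FIRST_ADDR):
    """Return (word, bit) of the output-enable bit for the attribute at addr."""
    slot = (addr - first_addr) // 4
    return OMAP_BASE_WORD + slot // 32, slot % 32


def set_omap_bit(header, addr):
    """Set the enable bit in a list of header words, refusing to overrun it."""
    word, bit = omap_bit(addr)
    if word >= len(header):
        raise IndexError("enable bit would land past the end of the header")
    header[word] |= 1 << bit
    return header
```

Calling `omap_bit(0x3a0, first_addr=0)` shows the initial miscalculation: the bit would land in word 20, i.e. byte offset 0x50, which is just past the end of the header.
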
01:45 imirkin: btw, i'm just noticing this -- bit 25 of the first word is also set on this vertex shader
01:46 imirkin: which exports a layer
01:46 imirkin: and/or viewport mask
01:46 fincs: Hmmm
01:46 imirkin: could be unrelated
01:46 fincs: But that's a different bit from the passthrough gs thing
01:46 imirkin: yes
01:47 fincs: I wonder if that has anything to do with the dual vertex shader shit
01:48 imirkin: the VP_A vs VP_B stuff?
01:48 fincs: Yes
01:48 imirkin: dunno. it's not set on all vertex shaders, i think
01:48 imirkin: i would have noticed by now
01:48 fincs: I still don't know what makes the compiler generate VP_A/VP_B
01:48 imirkin: i've never seen it use VP_A
01:48 fincs: Apparently some Switch games do
01:49 imirkin: i have no idea what it's for, or how it works
01:49 fincs: I found a patent for it
01:50 fincs: https://patents.google.com/patent/US8564616B1/en
02:06 imirkin: fincs: if i do gl_ViewportMask[0] = 0xf
02:07 imirkin: should i expect 4 primitives to be rasterized separately with separate gl_ViewportIndex's
02:07 imirkin: or does it only rasterize it once?
02:12 imirkin: hrmph. well, looks like setting gl_ViewportMask[0] = 0xf has no effect...
02:13 imirkin: ah no, i failed at setting that high bit
02:13 fincs: Now comes the debug phase :p
02:13 imirkin: but setting the bit does nothing.
02:14 imirkin: well, i'm not even 100% sure that what i'm doing is meant to work
02:14 fincs: Did you actually set up 4 viewports?
02:14 imirkin: https://hastebin.com/uwariheqor.cs
02:14 imirkin: ("viewport indexed" calls glViewportIndexedfv)
02:15 fincs: Mmm
02:15 imirkin: and the first one correctly shows up as red in a 50x50 box
02:16 imirkin: but since i'm unfamiliar with the ext, was hoping to get verification that it at least *looks* like it should work
02:16 imirkin: before i go and do some more poking
02:16 fincs: What happens if you write to gl_ViewportIndex instead of gl_ViewportMask?
02:18 imirkin: it works
02:18 imirkin: i mean, e.g. i just set to 1, and i got a green square
02:18 fincs: What happens if you do gl_ViewportMask[0] = 0x8?
02:18 imirkin: blank
02:18 imirkin: hm, i guess the other viewports are somehow not enabled?
02:19 imirkin: let me do some checking....
02:19 fincs: "If gl_ViewportMask[] is written by the final VTG stage, then gl_ViewportIndex in the fragment stage will have the index of the viewport that was used in generating that fragment."
02:20 imirkin: yes
02:20 fincs: So it should have worked...
02:20 imirkin: =]
02:20 imirkin: ok
02:20 fincs: Unless something in piglit is screwing you up
02:21 imirkin: well the fact that writing 0x1 works but 0x2 doesn't to viewport mask suggests there's some enable missing
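
For reference, the behaviour being expected here is that each set bit of gl_ViewportMask[0] selects one viewport to broadcast the primitive to. A minimal sketch of that mapping, assuming 16 viewports and a helper name invented for this illustration:

```python
MAX_VIEWPORTS = 16  # assumption for this sketch


def viewports_from_mask(mask, max_viewports=MAX_VIEWPORTS):
    """List the viewport indices selected by a gl_ViewportMask[0]-style bitmask."""
    return [i for i in range(max_viewports) if mask & (1 << i)]
```

Under that reading, 0xf should hit viewports 0 through 3, 0x8 only viewport 3, and 0x2 only viewport 1, which is why a result that only ever shows viewport 0 points at a missing enable rather than at the mask value.
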
02:21 imirkin: do you have notes on what 0x1138 does?
02:22 fincs: Huh, when is that register written?
02:23 imirkin: search me
02:23 imirkin: but it's in the trace
02:23 fincs: "search me"
02:23 imirkin: aka "i don't know"
02:23 imirkin: https://hastebin.com/savoqeqegu.http
02:24 fincs: I don't seem to have encountered that reg before
02:24 fincs: What card are you testing on?
02:24 imirkin: (i.e. "you could search me and you wouldn't find it")
02:24 imirkin: i'm testing on GP108, but the trace is from some GM20x
02:25 fincs: Hmmm
02:25 imirkin: someone came in here a couple years ago and did some traces
02:25 imirkin: i don't know if they were planning on doing the work to implement in mesa, but if they were, it certainly hasn't materialized
02:26 imirkin: either way, writing 0x1138 = 1 didn't help
02:26 fincs: 0x1138 (or 0x44E as I'd call it) doesn't seem to be used in nvn's main function for configuring shaders
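
The two numbers above appear to be the same method in two notations: a byte offset and a 32-bit word index. A small sketch of the conversion, assuming the only difference is a factor of four (which is consistent with 0x1138 / 4 = 0x44E); the function names are made up for illustration:

```python
def method_index(byte_addr):
    """Convert a method byte offset (e.g. 0x1138) to a word index (0x44E)."""
    if byte_addr % 4:
        raise ValueError("method offsets are expected to be 4-byte aligned")
    return byte_addr // 4


def method_addr(index):
    """Convert a word index back to the byte offset."""
    return index * 4
```
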
02:27 fincs: Hmm are you sure you enabled the viewportmask output correctly in the imap?
02:27 fincs: Err omap
02:27 fincs: Can you force word[19] |= 1<<24;
02:28 imirkin: https://hastebin.com/izaweperem.bash
02:29 fincs: Hmm let me try something
02:29 imirkin: which decodes as https://hastebin.com/egiqumakip.php
02:33 fincs: Other than Nvidia not using ALD.128/AST.128, their SASS is basically the same
02:33 fincs: Let me look at the header
02:35 imirkin: yeah, it's probably better if we didn't. but it works fine.
02:35 imirkin: setting the position isn't the bit that's not working
02:35 imirkin: making the shader header 100% identical to what's in the trace doesn't help either
02:36 fincs: Nvidia's shader header: https://hastebin.com/enuwitixut
02:36 fincs: word0 looks different
02:37 imirkin: interesting. they also have that bit 25 set
02:37 imirkin: i already tried 40461. let's try 60461
02:38 imirkin: https://hastebin.com/tutubulize.bash - still no go.
02:38 imirkin: let me try with a GS in there
02:38 imirkin: see if the issue is something dumb
02:40 imirkin: nope, same problem from GS
02:41 fincs: Hmm
02:41 imirkin: i can push what i have if you're interested in playing with it
02:41 fincs: Yeah I could look at it, but tomorrow (currently it's almost 5am for me)
02:41 imirkin: there's also the occasional problem that we don't init something correctly in the kernel
02:42 imirkin: k, i'll try to clean things up a bit and push
02:42 fincs: Doubt that's the issue
02:42 imirkin: it's rarely the issue.
02:42 imirkin: xfb resume was messed on fermi for a while coz of that.
02:44 fincs: There are 16 viewports, right?
02:45 imirkin: yes
02:45 fincs: Tomorrow I'll try testing with my setup
02:45 fincs: And see if it really is an init/reg problem or if there's something else going on
02:46 fincs: But for now
02:46 * fincs -> bed
02:48 imirkin: nite
03:58 imirkin: fincs: https://github.com/imirkin/mesa/commits/viewport_array2 (the branch also has the swizzle work)
04:54 imirkin: fincs: looks like i'm missing something silly. writing gl_ViewportIndex = 1; gl_ViewportMask = 0xf *does* broadcast to all 4
04:55 imirkin: so there's something silly that is keyed on viewport index that also needs to account for the mask
05:02 imirkin: the sampler shader that i have writes 0x1 to viewport mask. perhaps if something else is written it also enables the viewport index output (even if it's not actually written)
10:51 fincs: Interesting, that makes me suspect Imap/Omap fuckery
11:01 fincs: fsh has ImapViewportIndex set as expected
11:11 fincs: Looks like nvidia sets that weirdo 0x02000000 bit for all vshs
11:11 fincs: I tested with a simple gl_Position-writing shader, so that's most certainly unrelated
11:12 fincs: (maybe that has something to do with the multi-vsh thing)
11:13 fincs: Also, nvn seems to only handle that layer_viewport_relative reg during GSH setup, nowhere else
11:13 fincs: (not sure if that's intentional, or if I misunderstood what this reg is)
11:28 fincs: (Also, I misreported the values of the fragment shader interlock layout register, there's some bit magic I missed)
11:33 HdkR: fincs: Why are you touching fragment shader interlock things?
11:33 fincs: I was asked for a list of other fun features that aren't in nouveau :p
11:35 HdkR: oof FSI is one of those things that only non-real people use :P
11:35 fincs: Yeah
11:35 fincs: I saw what the nvidia compiler generates for the intrinsics and it is really disgusting
11:36 HdkR: Was one of those things that make OIT fast but only Intel ever made it fast
11:37 fincs: There's like, a mini support lib appended to the shader and it does function calls... yuck
11:39 HdkR: It's simple enough to beginInvocation + endInvocation right after each other to see the pain train
11:40 fincs: CAL everywhere :p
11:41 fincs: Anyway, we were trying to figure out if using gl_ViewportMask needs some special enable we're missing
11:42 fincs: According to imirkin, it seems to work if you write to gl_ViewportIndex as well
11:43 HdkR: :)
11:43 fincs: (I suspect Imap/Omap fuckery but I haven't been able to observe any weird bits set)
11:43 HdkR: Don't forget to look at the program header
11:43 fincs: Yeah that's where I'm looking
11:44 fincs: Other than an undocumented bit in word0 set (for all vshs apparently), it's what it should be
12:26 fincs: Okay
12:27 fincs: I ported over imirkin's changes to my setup and my API; and gl_ViewportMask works absolutely fine
12:27 fincs: So this tells me there's some register setting we're indeed missing
12:28 fincs: Let me try to figure out which reg it is
14:16 fincs: More testing - if I write to both gl_ViewportMask[0] and gl_ViewportIndex, I get a nice GPU crash (pushbuffer kickoff failed)
14:16 karolherbst: Lyude: https://gist.githubusercontent.com/karolherbst/01056e8a6b7c6d7d326a8d413f80d3af/raw/336a16b24debd36f7028120409183f8e1e2b8ab5/gistfile1.txt
14:17 karolherbst: that looks exactly like nouveau behaves when I had the PCIe link to 8.0 workaround
14:17 fincs: I'm starting to suspect there is some issue with piglit that is making the test not behave properly
14:17 karolherbst: works for a few cycles and then breaks
14:27 fincs: I think it would be a good idea to write a test using straight up GL without piglit/any other test framework
14:27 fincs: Just to rule that possibility out
14:28 karolherbst: fincs: piglit isn't doing all that much though
14:29 fincs: The test uses multiple viewports though; maybe something was overlooked in that test file
14:30 imirkin: fincs: well, those shader_test files are just a really simple way to write GL programs
14:30 fincs: So far I can't seem to find any magic reg that makes this work or fail
14:30 imirkin: i can make an apitrace for you of what that shader test does. pretty sure there's nothing too odd.
14:31 fincs: There shouldn't be anything odd in theory, yeah
14:31 fincs: At least you now know that the shader compiler side of this is fine
14:32 imirkin: i suspect, but nice to be sure :)
14:32 fincs: (I.e. your code works for me™)
14:32 imirkin: did you write viewport index before the mask, or after? i only tried it before?
14:32 fincs: Oh, I tried writing after
14:32 fincs: Let me try before
14:32 imirkin: i only tried before
14:33 fincs: Nope, still crash
14:33 imirkin: weird.
14:33 imirkin: so ... that actually lends credibility to "forgot to set something up"
14:33 imirkin: coz i don't get any sort of error when i do that
14:33 fincs: Yeah... the question is... what exactly
14:34 fincs: (also fyi that strange 0x02000000 bit in word[0] seems to be set for all vshs)
14:35 fincs: Does nouveau use viewport transform?
14:36 fincs: Looks like it
14:36 fincs: Always on
14:37 fincs: Maybe it's a magic bit in VIEW_VOLUME_CLIP_CTRL?
14:42 karolherbst: fun... the PCI reg changed in turing or volta
14:42 karolherbst: but .. subtle
14:44 karolherbst: _fun_ nvidia has the exact same runpm bug nouveau had
14:45 karolherbst: let's see if my workaround works there as well
14:49 fincs: I've tried disabling every write to unknown registers in my code and despite that, this is still working
14:53 imirkin: i meant like a mmio reg
14:53 fincs: So not a method in the pushbuf?
14:53 imirkin: right
14:54 imirkin: something that would be done in the kernel
14:54 imirkin: or potentially one of the fw methods
14:54 imirkin: which also allow rmw on various context things
14:54 fincs: Stuff related to passing attributes between shader stages needs to be configured in kernel? That would be odd though
14:55 imirkin: eh, more like feature enablement
14:55 imirkin: lots of "CYA" bits in there
14:55 imirkin: aka "cover-your-ass"
14:55 imirkin: although i prefer intel's "chicken" bits
14:55 imirkin: esp their "half_chicken" register
14:55 fincs: What's this about? :p
14:55 imirkin: stuff hw designers put in for the case where they screw something up
14:56 imirkin: so that flipping a bit can turn some bit of questionable functionality off
14:56 fincs: There are quite a lot of hardware reg writes in the initialization sequence
14:56 imirkin: the sure are.
14:57 imirkin: there*
14:57 RSpliet: imirkin: what's the half_chicken register for? Or is it just a 16-bit register full of chicken bits?
14:57 imirkin: yep
14:57 imirkin: 16-bits of chicken
14:57 RSpliet: 16 bits of chicken is usually enough for a sandwich
14:58 fincs: I see 6, plus 1 that is in the middle of conservative raster shit (so it's very likely that one), plus a whole bunch of ones that are in the code for initializing the tiled cache
14:58 fincs: Hmm so there's 6 I have to test then
15:01 fincs: Ok so absolutely none of these do anything to affect the viewport mask stuff
15:02 imirkin: i see firmware writes to 0x418800, 0x419a08, 0x419f78, 0x404468, 0x419a04, and then a second time to 0x419a04
15:02 imirkin: this is on a GM204
15:02 fincs: Same
15:02 imirkin: er, GM206. same idea though.
15:02 fincs: Those are the exact same 6 ones I see and use
15:02 imirkin: so that leaves ... the kernel not setting something up, or something much simpler that we're totally missing
15:03 fincs: Tried out commenting them out and it does nothing
15:03 fincs: My money is on "something much simpler that we're totally missing"
15:03 imirkin: ;)
15:03 imirkin: can you replay traces?
15:03 fincs: This absolutely looks like a stupid bug somewhere
15:03 fincs: I don't have a setup for replaying traces, nope
15:03 imirkin: ok, well i'll make a trace anyways, see if anything obvious in there
15:04 fincs: (And also fyi my host GPU is Maxwell 1st gen, which is *behind* the Switch lol)
15:05 imirkin: here's the apitrace: https://people.freedesktop.org/~imirkin/traces/viewportmask.trace
15:06 fincs: Heh I thought that was going to be a raw pushbuffer dump
15:06 imirkin: you were asking if the piglit was doing something dumb
15:07 fincs: Heh, I have no idea how to use this file, what program should be used to view this file?
15:07 imirkin: https://github.com/apitrace/apitrace
15:09 fincs: Okay I'm looking at it
15:12 fincs: Hmm, triangle_strip is being used
15:12 imirkin: you can also replay it ("apitrace replay")
15:12 imirkin: (or "glretrace")
15:13 fincs: (See above - my desktop gpu is maxwell 1st gen so I can't even use this extension)
15:13 imirkin: for drawing, sure
15:13 imirkin: i meant on the switch. dunno what kind of setup you have there.
15:13 fincs: Switch uses a custom microkernel operating system written from scratch by Nintendo
15:13 imirkin: so no glretrace then ;)
15:14 fincs: Nvidia's Linux kernel driver blob basically runs as a user process, and applications talk to it through IPC
15:14 fincs: We ported mesa/nouveau to this setup and it works way better than it should :p
15:14 fincs: (only the userland bits)
15:14 imirkin: right, of course
15:15 imirkin: i think someone who was involved in the porting of it talked to me about it a while back
15:15 fincs: Yeah
15:15 fincs: Armada did the initial work, then I fixed it so that it would work properly :p
15:15 imirkin: yeah, right. Armada.
15:16 fincs: I'm going to try changing my code to use a triangle strip instead of a plain triangle
15:16 imirkin: can't imagine that would matter
15:17 fincs: Yeah but maybe it gets confused because you're generating two primitives with that
15:17 imirkin: don't see why that would matter, but ok
15:18 imirkin: btw - i rechecked... the sample program actually wrote "1 << viewport" where viewport was a uniform.
15:18 imirkin: so the mmt trace should be fully representative
15:19 fincs: I see a hardcoded 0xf write in the trace you posted
15:20 fincs: "draw rect -1 -1 2 2" <-- is this the cmd that is generating the rect?
15:20 imirkin: yes, the one i posted. but the mmt is from a shader someone else did
15:20 fincs: Ah
15:21 imirkin: https://people.freedesktop.org/~imirkin/traces/gm206/vs-viewportmask.shader_test
15:21 imirkin: although they didn't set up the secondary viewports
15:21 imirkin: so the blob could be getting smart about it? dunno
15:22 fincs: Okay so yeah, with a rect/trianglestrip it still works
15:26 fincs: Do you need to enable any feature in GL in order to use glViewportIndexedf?
15:26 imirkin: it's part of ARB_viewport_array
15:26 imirkin: (which is also core in GL 4.1 or 2, iirc)
15:27 imirkin: remember - adding the index write makes it work
15:27 imirkin: so it's not a GL-level fail, most likely
15:27 fincs: You were testing on Pascal, right?
15:27 imirkin: yes, GP108
15:28 fincs: Hmm I wonder if behaviour is different on Pascal
15:28 imirkin: i don't have any GM20x
15:28 imirkin: could be, but unlikely
15:35 fincs: Ah, I missed nvc0_magic_3d_init as a source of unk methods
15:40 fincs: All the writes in nvc0_magic_3d_init are things I already do... except for the write to 0x02d0 which I don't do
15:43 fincs: Adding this write to my code does nothing
15:43 imirkin: :)
15:44 imirkin: skeggsb: any idea on what might be missing in gr init or somewhere else which would cause viewport mask to behave oddly?
15:44 fincs: So yeah... not even the magic init is the culprit
15:49 imirkin: skeggsb: basically the way the issue appears is that writing viewport mask doesn't seem to actually broadcast anything to viewports other than the first (setting it to 0 *does*, however, cause the primitive not to appear). adding a shader write to the viewport index all of a sudden makes the mask work properly. however blob does not do this. i suspect that writing the viewport index triggers some mode which allows any viewport to be used, which should
15:49 imirkin: also be triggered for the mask but isn't. my guess is some missed bit of gr init since we've ruled out pretty much everything else.
15:50 imirkin: skeggsb: note that this is a GM20x+ feature.
15:50 imirkin: karolherbst: do you have a GM20x around?
15:59 karolherbst: imirkin: yes
16:00 karolherbst: GM204 I think and the gm20b :p
16:00 imirkin: could you try my viewport array stuff?
16:01 karolherbst: imirkin: you mean the MR or something else?
16:03 imirkin: karolherbst: something else
16:03 imirkin: karolherbst: https://github.com/imirkin/mesa/commits/viewport_array2
16:04 karolherbst: why gm20x though? Would pascal be fine or do you want to make sure it works on gm20x as well?
16:04 imirkin: it's already not working on pascal :)
16:04 imirkin: at least on GP108
16:04 karolherbst: I see :)
16:04 imirkin: i want to check that it also doesn't work on GM20x, since the trace i'm going off of is GM206 and fincs is testing on GM20B
16:04 fincs: Fwiw, the compiler bits in that branch at least are fine
16:04 imirkin: (but he's testing with the switch nvidia blob kernel driver, so it's not 1:1 testing)
16:04 fincs: Generated shaders are correct
16:06 imirkin: karolherbst: and a piglit patch ... https://people.freedesktop.org/~imirkin/traces/0001-shader_runner-allow-setting-viewports-directly.patch
16:06 imirkin: karolherbst: and this shader_test: https://hastebin.com/jufacowabu.cs
16:09 karolherbst: imirkin: btw as I have currently the blob running on the gm204 system, should I create traces first?
16:09 imirkin: go for it!
16:09 karolherbst: k
16:09 imirkin: pretty sure i have what i need
16:09 imirkin: but ... can't hurt to have more info
16:09 fincs: Tracing blob with that piglit patch + test?
16:10 imirkin: the trace i have is from a slightly different test with GM206
16:10 karolherbst: need to do some admin stuff on the machine first, so it will take a while, but I assume until the end of the day you should get your answers
16:10 imirkin: karolherbst: also if you have just a general mmiotrace of a gm20x coming up, that'd be useful. (do we have some in the repo?)
16:10 karolherbst: we should have some in the repo
16:10 imirkin: ok cool, i'll check
16:12 imirkin: hrmph. i can no longer access the repo...
16:13 imirkin: aha, but in my checkout there are a couple of gm206 traces =]
16:13 imirkin: oh. even gp108 in there.
16:14 imirkin: and gv100. neat.
16:16 imirkin: so now i'm looking for a reg like this one: https://github.com/envytools/envytools/blob/master/rnndb/graph/gf100_pgraph/tpc.xml#L32
16:21 imirkin: looks like we're missing a write of 0 to 0x419810 -- unlikely that's it though.
16:21 imirkin: yeah, it's 0 anyway
16:23 karolherbst: imirkin: do you even know if the test passes on the blob?
16:23 imirkin: no
16:23 karolherbst: maybe I just test this first, because that's the easiest here right now :)
16:24 imirkin: still need the piglit patch obviously
16:29 karolherbst: mhh.. this is literally the first time I try to use nvidia on wayland.. let's see how that goes
16:30 fincs: Doesn't the blob have no support whatsoever for wayland?
16:30 karolherbst: it does, a little
16:30 fincs: I remember being forced into using X, the last time I attempted to use Linux
16:31 karolherbst: you get no GLX support in XWayland
16:31 karolherbst: which is a big issue
16:33 karolherbst: "EGL vendor string: NVIDIA" \o/ well at least eglinfo seems to work
16:33 imirkin: karolherbst: replay the apitrace i posted earlier
16:33 imirkin: should work with eglretrace i'd think
16:33 karolherbst: piglit works with EGL as well :p
16:34 imirkin: oh right, yea
16:34 imirkin: trace here: https://people.freedesktop.org/~imirkin/traces/viewportmask.trace
16:34 imirkin: should see a frame with 4 squares of different colors
16:34 karolherbst: mhh PIGLIT_PLATFORM was the env var?
16:35 imirkin: PIGLIT_PLATFORM iirc :p
16:35 karolherbst: PIGLIT_PLATFORM=wayland __NV_PRIME_RENDER_OFFLOAD=1 ./bin/shader_runner tmp.shader_test -auto -fbo
16:35 karolherbst: mhhh
16:35 karolherbst: no idea if the nvidia driver is used actually
16:36 karolherbst: but I am getting a Test requires unsupported extension GL_NV_viewport_array2
16:36 imirkin: i'd say "no" then
16:37 karolherbst: oh well... hopefully I don't mess up spawning an Xserver on the nvidia gpu
16:57 karolherbst: finally :D I should create a shell script to start this damn X server
16:58 karolherbst: okay.. nvidia passes that test
16:58 imirkin: sorry, didn't mean for it to be a pain for you
16:58 imirkin: cool
16:59 fincs: Alright so it is not a piglit issue then
16:59 karolherbst: it's not.. just Xorg's args are messy
16:59 karolherbst: and you know.. if it does a vtswitch and you can't switch back...
16:59 karolherbst: If I enable the "gl_ViewportIndex=0" line it still passes
16:59 karolherbst: in case it matters
17:00 fincs: Hmm, I haven't checked what the official compiler does when you write to both Index and Mask
17:00 karolherbst: okay.. now mmt
17:03 imirkin: fincs: afaik it's allowed. although probably less-than-fully-defined if a shader actually writes both
17:03 imirkin: but you could imagine something like if () { index = foo; } else { mask = bar; }
17:04 fincs: Official compiler ignores the Index write, only writes to Mask
17:04 imirkin: what if you do the if/else?
17:05 fincs: (However the bit in the omap for Index is still set)
17:05 fincs: Let me test if/else
17:06 fincs: Oh wait
17:06 fincs: I misread this
17:06 fincs: It does write to both
17:06 fincs: Sorry
17:06 fincs: Doesn't help that viewportindex is so close to gl_Position in AST...
17:07 karolherbst: imirkin: https://filebin.net/9soublg0uyfxveew/file-bin.log.xz?t=9nlxinpx
17:10 fincs: Hmm I wonder why does this crash for me then
17:10 fincs: Need to investigate
17:10 imirkin: karolherbst: which GPU is this? GPsomething?
17:10 karolherbst: gp107, yes
17:11 karolherbst: thought this makes more sense for you as you are on a gp108 :p
17:11 imirkin: my demmt can't decode it =/
17:11 karolherbst: mhhh
17:11 karolherbst: mhhh
17:11 karolherbst: do I still have local commits for that?
17:11 imirkin: ERROR: nvrm_ioctl_host_map56: cannot find object 0xc1d0002d 0xcaf0000a
17:11 karolherbst: ahhh
17:11 karolherbst: okay.. wait
17:12 karolherbst: imirkin: ohh no, I think you just need to update your install
17:13 imirkin: ah, maybe
17:13 karolherbst: imirkin: you need db2163cd929cb3daa33c689f68256739eb23d947 :)
17:13 imirkin: kk
17:14 * karolherbst still has local UVM stuff...
17:14 karolherbst: and yes.. nvidia makes incompatible changes to UVM
17:14 karolherbst: :(
17:14 karolherbst: removing fields and stuff
17:15 imirkin: annoying.
17:15 karolherbst: yes
17:15 karolherbst: well.. we could import the uvm headers for every driver version :/
17:15 karolherbst: ...
17:18 imirkin: karolherbst: the nouveau_vbios_trace repo appears to be weirdly messed up -- its first commit has a different hash than the original repo.
17:18 karolherbst: yes
17:18 karolherbst: I converted it to lfs
17:18 imirkin: ahhh
17:18 imirkin: ok
17:42 imirkin: fincs: i think someone switch-related managed to dump some nvidia method names for that gpu ... do you remember where i can find that?
17:50 fincs: Wait what? Someone did that? I don't remember such a thing
17:52 imirkin: i remember seeing the nvidia method names
17:52 imirkin: associated with each index
17:52 fincs: Huh
17:52 fincs: Are you sure that was the 3d engine?
17:52 imirkin: yes
17:52 fincs: Hmmm
17:52 imirkin: potentially by one of the emulator efforts
17:53 fincs: Are you sure you didn't just get linked to this page (which is really outdated at this point)? https://switchbrew.org/wiki/GPU
17:53 imirkin: yes. those are the nvidia method names iirc
17:53 fincs: Some of them
17:53 imirkin: yeah, but some not. o well.
17:54 fincs: That uses a mix of method names lifted from compute, and ones reworded from NVN command names, and even from nouveau's register database
17:54 imirkin: ah sad ok
17:54 imirkin: i thought it was from a driver dump
17:54 imirkin: sometimes they leave that stuff in
17:54 fincs: If nvidia had left full 3d method names then that would be a treasure lol
17:56 karolherbst: imirkin: will you still need me to test that stuff on maxwell or is the mmt trace enough to figure out what's wrong?
17:57 imirkin: karolherbst: not enough =/
17:57 karolherbst: :/
17:57 karolherbst: mhhh
17:57 karolherbst: weird
17:58 karolherbst: want an mmiotrace as well when executing this test?
17:59 imirkin: nah
18:08 imirkin: karolherbst: so yeah, based on that trace, no clue what's missing
18:08 imirkin: i expect it's some bit of GR init
18:08 karolherbst: maybe...
18:08 karolherbst: is there any mme script involved here?
18:29 Lyude: karolherbst: btw - did you check suspend/resume with your rpm workaround?
18:30 imirkin: karolherbst: no. the mme scripts show up as PM: in the demmt
18:30 imirkin: (so i would have caught it)
18:35 karolherbst: Lyude: I think so, why?
18:36 karolherbst: at least on the XPS I did
18:41 karolherbst: Lyude: just checked on the 2nd gen P1, works there as well
18:45 karolherbst: Lyude: anyway, I dug deeper into the runpm issue on nvidia and it behaves exactly like with nouveau :/
18:45 karolherbst: oh well
18:47 Lyude: karolherbst: pardon? runpm seems to work just fine on mine
18:47 Lyude: w/ your patches
18:50 karolherbst: "<Lyude> karolherbst: btw - did you check suspend/resume with your rpm workaround?"
18:50 karolherbst: or what was your question about now?
18:50 Lyude: karolherbst: no I meant - did you hit some new runpm issue?
18:50 karolherbst: no, I just found out that nvidia is equally broken as nouveau was
18:50 Lyude: ahhh lol
18:51 karolherbst: yeah
18:51 karolherbst: I even changed the PCIe link speed to 2.5
18:51 karolherbst: as this is what nouveau does through devinit
18:51 karolherbst: insta fail
18:51 karolherbst: so...
18:51 karolherbst: and I even applied my workaround to the nvidia driver, kept working for 250 cycles
18:51 karolherbst: then I stopped the test
18:57 karolherbst: no idea what to do about this entire issue anymore
18:57 karolherbst: maybe just leave it be
18:57 karolherbst: the workaround seems to work and nvidia broken...
18:57 karolherbst: so...
18:58 HdkR: Winning against the blob is always a good thing :P
20:56 imirkin: karolherbst: reminder to check the viewport mask thing on your gm20x
20:56 imirkin: i think the tegra x1 would be fine btw
20:56 karolherbst: imirkin: is there a point in checking if it's broken on pascal? Or maybe I just misunderstood the situation
20:57 imirkin: the point is to check if it's broken on maxwell too
20:57 imirkin: or if it's just a pascal thing
20:57 imirkin: it works for fincs on the switch, but that's not exactly an apples-to-apples comparison
20:57 fincs: Yeah, it's not
20:58 imirkin: but it's still nouveau setting up the 3d context / emitting commands / etc. but with the nvidia RM, basically.
20:58 karolherbst: I see
20:58 karolherbst: yeah, I could check on the tegra, would be easier anyway
21:04 * karolherbst hates machines without hostnames
21:06 * karolherbst likes managed switches
21:48 * karolherbst should setup a cross compiler
22:03 fincs: Join/parts spam :\
22:05 imirkin: gr
22:06 fincs: Yay moderation \o/