00:04fdobridge: <marysaka> oh hmm... the STUFF commit doesn't build as a standalone I see :painpeko:
00:05fdobridge: <gfxstrand> Yeah, it needs the CF fixes
00:05fdobridge: <marysaka> well I will combine that I guess
00:05fdobridge: <gfxstrand> sure
00:42fdobridge: <marysaka> MME tests still pass, time for rebase
00:43fdobridge: <marysaka> https://gitlab.freedesktop.org/marysaka/mesa/-/commits/nvk/mme-fermi-v3
00:45fdobridge: <marysaka> I think I should squash the MME builder integration with the commit that adds it but I think I will call it a day on this
00:45jekstrand: kk
00:45fdobridge: <karolherbst🐧🦀> IDEs want those files to be listed... at least some... I think... maybe, dunno 😄
00:46fdobridge: <karolherbst🐧🦀> at least with cmake this was always important to add all header files as well, because otherwise the generator wouldn't include the header files in projects
00:46fdobridge: <gfxstrand> Ok, so for some reason on Maxwell, I'm getting different edge behavior on triangles.
00:47fdobridge: <gfxstrand> I'm drawing two triangles to make a quad and the diagonal is missing fragments.
00:47fdobridge: <karolherbst🐧🦀> huh
00:47fdobridge: <karolherbst🐧🦀> "strange"
00:48fdobridge: <karolherbst🐧🦀> but you know.. since 2nd gen Maxwell we can actually draw quads anyway, so I'm surprised you use triangles in the first place
00:49fdobridge: <karolherbst🐧🦀> also.. there is `NV_fill_rectangle` new since 2nd gen Maxwell as well, which might explain what you are hitting now
00:49fdobridge: <gfxstrand> Because I haven't hooked up magic paths
00:49fdobridge: <gfxstrand> Also, this is 1st gen maxwell
00:50fdobridge: <karolherbst🐧🦀> ohh wait.. fill_rectangle is the quad feature
00:50fdobridge: <karolherbst🐧🦀> yeah...
00:50fdobridge: <gfxstrand> Also, drawing two triangles really should work....
00:50fdobridge: <karolherbst🐧🦀> my point was.. it might work as expected on 2nd gen (edited)
00:51fdobridge: <karolherbst🐧🦀> mhhh.. different defaults or something? stuff changed in weird ways around that time
00:52fdobridge: <karolherbst🐧🦀> do you set `NV9097_SET_LINE_MODE_POLYGON_CLIP_GENERATED_EDGE`?
00:52jekstrand: Yeah, looking at that
00:55fdobridge: <karolherbst🐧🦀> or uhh... let me try to find something
00:55jekstrand: Yeah, that doesn't do anything
00:56jekstrand: SMOOTH_EDGE_TABLE, maybe?
00:56fdobridge: <karolherbst🐧🦀> maybe?
00:56jekstrand: Doesn't look like nouveau sets that either
00:59fdobridge: <gfxstrand> If we don't have watertight meshes, people are gonna get grumpy....
01:01fdobridge: <karolherbst🐧🦀> might want to check `nvc0_blitctx_prepare_state`
01:02fdobridge: <karolherbst🐧🦀> even calls two macros, interesting
01:06jekstrand: Yeah, nothing interesting there.
01:07fdobridge: <karolherbst🐧🦀> currently looking at `nvc0_blit_3d`, especially the `VIEW_VOLUME_CLIP_CTRL` a.k.a `NVA097_SET_VIEWPORT_CLIP_CONTROL`
01:07fdobridge: <karolherbst🐧🦀> especially as we set a bit not documented? wtf
01:09jekstrand: Viewports should be fine
01:09fdobridge: <karolherbst🐧🦀> yeah..
01:10fdobridge: <karolherbst🐧🦀> I think there was something to change "rounding" behavior or something like that on the edges, but I can't remember what it was...
01:10jekstrand: SNAP_GRID_LINE_ROUNDING_MODE?
01:10karolherbst: maybe?
01:11fdobridge: <karolherbst🐧🦀> `NVC0_3D_UNK0318` in nvc0... "fun"
01:11fdobridge: <karolherbst🐧🦀> at least doesn't seem like we ever set it
01:11jekstrand: Yeah and it shouldn't matter
01:12jekstrand: I'm also drawing them with the same orientation so we shouldn't have edge winding order problems.
01:12fdobridge: <karolherbst🐧🦀> yeah.. I mean, the gallium blitter code also draws two triangles and it's not causing those issues
01:12jekstrand: DRAW_CONTROL maybe?
01:13karolherbst: too new
01:13karolherbst: but maybe?
01:13jekstrand: right
01:13jekstrand: Yeah, doesn't exist on maxwell
01:14karolherbst: could be that we just overlap a little in nvc0 to be super safe
01:14jekstrand: If that's the solution, no apps will render properly
01:14karolherbst: yeah...
01:17jekstrand: it's possible I'm getting the guardband wrong.
01:17jekstrand: But that shouldn't matter either
01:17jekstrand: That shouldn't cause edges to separate
01:19jekstrand: I'll poke at this tomorrow. My stomach is grumbling.
01:19karolherbst: uhhh wait...
01:19karolherbst: we don't draw two triangles for blits
01:19karolherbst: we draw one....
01:20karolherbst: uhhh
01:21karolherbst: I know this topic came up...
01:26fdobridge: <karolherbst🐧🦀> @gfxstrand `[Samstag, 21. April 2018] [16:47:35 CEST] <pendingchaos> I'm getting some poor msaa quality. it only seems to effect some of the edges of a triangle is this a know issue?` 🙃
01:26fdobridge: <gfxstrand> Oh boy...
01:27fdobridge: <karolherbst🐧🦀> let me do some more digging and see what I can find
01:27fdobridge: <karolherbst🐧🦀> `the idea of that code is that you draw a single triangle`, yes, but why
01:29fdobridge: <karolherbst🐧🦀> but that discussion was mostly around msaa being funky and the edges having funky issues like that
01:29fdobridge: <karolherbst🐧🦀> or are you doing msaa stuff?
01:30fdobridge: <karolherbst🐧🦀> in which case... "don't do that?" 😄 dunno
01:34fdobridge: <gfxstrand> I suspect it's something to do with MSAA.
01:35fdobridge: <gfxstrand> That's what my gut is telling me.
01:35fdobridge: <karolherbst🐧🦀> mhhh
01:35fdobridge: <gfxstrand> I'll play with it more tomorrow
01:35fdobridge: <karolherbst🐧🦀> 2nd gen maxwell added custom sample locations
01:35fdobridge: <karolherbst🐧🦀> just saying
01:36fdobridge: <gfxstrand> Yeah
01:41karolherbst: HdkR: sooo... drawing two triangles to draw a rectangle, what do you say about that?
02:33fdobridge: <karolherbst🐧🦀> @gfxstrand you won't believe what I just found
02:34fdobridge: <karolherbst🐧🦀> @gfxstrand https://on-demand.gputechconf.com/siggraph/2016/presentation/sig1609-kilgard-jeffrey-keil-nvidia-opengl-in-2016.pdf slides 65+
02:41fdobridge: <gfxstrand> Makes sense
02:41fdobridge: <gfxstrand> I figured it was something like that. I need to poke about
02:41fdobridge: <karolherbst🐧🦀> I just think that those artifacts are something we have to live with or we really have to understand how the GPU draws and how to prevent that from happening
02:42fdobridge: <karolherbst🐧🦀> at least on GPU gens without that fine control of things
02:43fdobridge: <karolherbst🐧🦀> I'm sure we did this single triangle approach in nvc0 for exactly that reason
02:43fdobridge: <gfxstrand> I think I know what's wrong. I'll play with it tomorrow.
02:43fdobridge: <karolherbst🐧🦀> okay, cool
02:44fdobridge: <karolherbst🐧🦀> hopefully you'll figure it out
05:29HdkR: karolherbst: Screenspace or world space? :P
11:36OftenTimeConsuming: karolherbst: Got a hang on an upgraded mesa version and no dmesg errors. I guess I need to turn on debug mode in kernel...
11:47karolherbst: OftenTimeConsuming: yeah.. can also be a random mesa bug. Usually it helps to have a solid and reliable reproducer, or to know what the gpu/kernel is complaining about
11:48OftenTimeConsuming: It's a shame I have neither.
16:11fdobridge: <gfxstrand> Wait, whaaaaaat?!? It's not that the triangles aren't covering some pixels, it's that edge pixels are ending up black.
16:11fdobridge: <gfxstrand> They're not ending up clear color
16:12fdobridge: <gfxstrand> I wonder if it's a sampler issue. Maybe something with derivatives being wrong?
16:13fdobridge: <gfxstrand> Yup! That's it. No cracking when I use a constant color.
16:35fdobridge: <karolherbst🐧🦀> heh
16:35fdobridge: <karolherbst🐧🦀> "fun"
16:36fdobridge: <karolherbst🐧🦀> maybe it's related to the helper invoc thing... which I highly doubt... that reminds me.. I still need to upstream those bits.. *sigh*
16:36fdobridge: <Esdras Tarsis> why is VK_KHR_maintenance2 labeled medium?
16:38fdobridge: <Esdras Tarsis> besides being important for zink this extension is important for d3d9 frontend of DXVK
16:43fdobridge: <gfxstrand> Because it's a grab-bag with a bunch of stuff. Each individual item is probably easy but there's a lot in there.
16:43fdobridge: <gfxstrand> And someone (probably me) needs to do very careful review when it lands to make sure we don't miss anything.
16:44fdobridge: <gfxstrand> Not *that* helper invoc thing but maybe helper invoc related.
16:44fdobridge: <karolherbst🐧🦀> yeah...
16:44fdobridge: <karolherbst🐧🦀> maybe a different bit in that magic reg 🙃
16:45fdobridge: <gfxstrand> The texture op shouldn't be doing derivatives it should be txl.
16:45fdobridge: <gfxstrand> But I'm wondering if the compiler is messing it up somehow
16:46fdobridge: <karolherbst🐧🦀> could be
16:48jekstrand: Yeah, I'm not getting TXL...
16:48jekstrand: I'm getting TEX with levelZero set?!?
16:48jekstrand: Makes sense, I guess.
16:52karolherbst: yeah.. doesn't matter either way
16:53karolherbst: it's all the same in the ISA anyway
16:54jekstrand: Yeah, the question, then, is why do I have so many sources to this texture op?
16:54jekstrand: It should be coords and a tex/samp handle pair
16:55karolherbst: how many do you get after RA?
16:55jekstrand: I've got at least two regs worth I don't need.
16:55jekstrand: 19: tex 2D_ARRAY $r255 $s31 rgba f32 $r0q $r4t $r0t (8)
16:55karolherbst: that's fine
16:55karolherbst: $r4t contains the coords
16:55karolherbst: and the handle goes into $r0t
16:56karolherbst: not sure why it allocates a triple reg, but...
16:56karolherbst: it has to be quad aligned either way
16:56karolherbst: there are "scalar" modes to compress that all, but I'd rather not mess with that part of the code again
16:57karolherbst: "scalar" actually just removes the gaps you need to fill
16:58jekstrand: In any case, it is texturing when not on the edges so the handles are in the right place
16:58karolherbst: yep
16:58karolherbst: lod would be in src1.y
16:59jekstrand: which is 0
17:00karolherbst: it's not read if .LZ is set anyway
17:00jekstrand: sure
17:01karolherbst: ahh.. src1.z actually needs to be set in any case
17:01karolherbst: those are offsets
17:01jekstrand: ah
17:01karolherbst: and I assume they are all 0
17:01jekstrand: yeah
17:01jekstrand: r2 is being set to garbage, though.
17:02jekstrand: Well, it's being set to stuff but the stuff is nonsense
17:02karolherbst: and src0.x is lod clamp in the high bits and array idx in the low bits
17:02karolherbst: src0.yzw are str
17:02karolherbst: lod clamp only if .LC is set obviously
17:02jekstrand: 0: linterp pass f32 $r0 a[0x7c] (8)
17:02jekstrand: 1: rcp f32 $r0 $r0 (8)
17:02jekstrand: 2: pinterp mul u32 $r1 a[0x70] $r0 (8)
17:02jekstrand: 3: pinterp mul u32 $r0 a[0x74] $r0 (8)
17:02jekstrand: 4: mov u32 $r2 c1[0x20] (8)
17:02jekstrand: 5: mov u32 $r3 c1[0x24] (8)
17:02jekstrand: 6: fma ftz f32 $r5 $r1 c1[0x28] $r2 (8)
17:02jekstrand: 7: fma ftz f32 $r6 $r0 c1[0x2c] $r3 (8)
17:02jekstrand: 8: linterp flat u32 $r0 a[0x64] (8)
17:02jekstrand: 9: add s32 $r0 $r0 c1[0x38] (8)
17:02jekstrand: 10: cvt f32 $r4 u32 $r0 (8)
17:02jekstrand: 11: mov u32 $r0 c1[0xa0] (8)
17:03jekstrand: 12: mov u32 $r1 c1[0xa4] (8)
17:03jekstrand: 13: ld u64 $r2d g[$r0d+0x0] (8)
17:03jekstrand: 14: and u32 $r0 $r3 0x000fffff (8)
17:03jekstrand: 15: and u32 $r1 $r2 0xfff00000 (8)
17:03jekstrand: 16: or u32 $r0 $r0 $r1 (8)
17:03jekstrand: 17: mov u32 $r1 0x00000000 (8)
17:03jekstrand: 18: cvt ftz u16 $r4 f32 $r4 (8)
17:03jekstrand: 19: tex 2D_ARRAY $r255 $s31 rgba f32 $r0q $r4t $r0t (8)
17:03jekstrand: BB:1 (1 instructions) - idom = BB:0, df = { }
17:03jekstrand: 20: exit - # (8)
17:03karolherbst: ohh wait
17:04karolherbst: for 2D the coords are in src0.xy
17:04karolherbst: or are they...
17:04karolherbst: it's 2D_ARRAY anyway
17:04karolherbst: and the coords seem to be in $r5 and $r6, which is correct
17:05karolherbst: so yeah.. the shader looks fine
17:05karolherbst: we really should get the scalar packing working in NAK, because that actually helps with RA a lot
17:06karolherbst: so if you only have a 2D access, you add two scalar sources and are done
17:06karolherbst: anyway... I think the shader is fine
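Instructions 13 through 16 of the dump above load a 64-bit value and merge bits from its two halves into the single word that `tex` reads as the handle. A minimal Python sketch of that masking follows. The masks and register roles come from the dump; the function and constant names are hypothetical, and the guess that the low 20 bits select the texture while the high 12 bits select the sampler is an assumption that the discussion does not confirm.

```python
# Sketch of instructions 14-16 from the dump; names are illustrative only.
LOW_20_MASK = 0x000FFFFF   # applied to $r3 (instruction 14)
HIGH_12_MASK = 0xFFF00000  # applied to $r2 (instruction 15)


def combine_handle(r2: int, r3: int) -> int:
    """Mirror `and`, `and`, `or`: merge the two 32-bit halves of the u64 load.

    r2 and r3 are the low and high words written by `ld u64 $r2d`.
    Which field is the texture and which is the sampler is an assumption.
    """
    low_field = r3 & LOW_20_MASK     # instruction 14
    high_field = r2 & HIGH_12_MASK   # instruction 15
    return low_field | high_field    # instruction 16


if __name__ == "__main__":
    print(hex(combine_handle(0xABC00000, 0x00012345)))
```

Because the two masks do not overlap, the `or` cannot corrupt either field, which matches the observation that texturing works away from the edges.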
17:15jekstrand: Yeah, I'll put it on the ToDo list
17:45jekstrand: How are helper invocations controlled?
17:47karolherbst: what do you mean?
17:48karolherbst: afaik we have 0 knobs on those in the channel
18:32jekstrand: karolherbst: I think it is our favorite bug. :-(
18:32jekstrand: karolherbst: Where do I find your kernel patch?
18:33jekstrand: I think the Mesa patch is in a draft MR somewhere.
18:36jekstrand: Or not
18:40jekstrand: Found the mesa patch
18:46jekstrand: Found the kernel branch
18:46jekstrand: Hooray for IRC logs.
19:01jekstrand: Yup! That fixes it.
19:01jekstrand: Time for some more micro-blogging.
19:03jekstrand: Doing another CTS run now.
19:25karolherbst: :)
19:25karolherbst: I really should clean it up and make it possible to use it in a non messy way (a.k.a. always set that bit)
19:31jekstrand: Yeah
19:37jekstrand: Ok, deqp-runner says it'll be done in another 2 minutes.
19:37jekstrand: This run is looking WAY better than yesterday's
19:38karolherbst: figures, because I always did my runs with that helper invoc patch anyway
19:40fdobridge: <gfxstrand> `Pass: 218742, Fail: 10939, Crash: 1039, Warn: 4, Skip: 1328582, Flake: 1257, Duration: 36:13`
19:40fdobridge: <gfxstrand> Much better!
19:41fdobridge: <gfxstrand> That's 80-90% of the fails gone.
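To compare runs like this one, it can help to turn the summary line into numbers. The following is a rough Python sketch that parses the line quoted above and computes the share of executed tests that failed or crashed. The helper names are hypothetical, and treating every non-skipped test as "executed" is an assumption, not something deqp-runner defines.

```python
import re

SUMMARY = ("Pass: 218742, Fail: 10939, Crash: 1039, Warn: 4, "
           "Skip: 1328582, Flake: 1257, Duration: 36:13")


def parse_summary(line: str) -> dict[str, int]:
    """Extract the integer counters; `Duration: 36:13` is deliberately not matched."""
    pairs = re.findall(r"(\w+): (\d+)(?=,|$)", line)
    return {name: int(value) for name, value in pairs}


def bad_rate(counts: dict[str, int]) -> float:
    """Fraction of non-skipped tests that ended in Fail or Crash (assumed metric)."""
    executed = sum(v for k, v in counts.items() if k != "Skip")
    return (counts["Fail"] + counts["Crash"]) / executed


if __name__ == "__main__":
    counts = parse_summary(SUMMARY)
    print(counts)
    print(f"fail+crash rate: {bad_rate(counts):.2%}")
```

Running the same helper on the previous day's summary line would give the before/after comparison directly.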
19:41fdobridge: <karolherbst🐧🦀> how does it look on turing?
19:41fdobridge: <gfxstrand> Turing is at about 500 crash/fail
19:41fdobridge: <karolherbst🐧🦀> ahh
19:55fdobridge: <gfxstrand> Ok, next: Something is wrong with 3D blits. No idea what yet.
20:23fdobridge: <marysaka> nice!
21:18fdobridge: <gfxstrand> I wish the 3D blit tests didn't suck so much.
21:19fdobridge: <gfxstrand> They test well enough, but for the output they do this weird projection thing to try and visualize the 3D image, and it's impossible to see the individual layers. 😦
22:09fdobridge: <gfxstrand> It appears indexed rendering also doesn't work. I think I'm going to try and solve that one.
22:21fdobridge: <gfxstrand> Actually... the clear in this test appears to be failing.
22:27fdobridge: <gfxstrand> *grumble*
22:28fdobridge: <gfxstrand> Is my copy not happening?!?
22:30fdobridge: <karolherbst🐧🦀> check dmesg 😛
22:31fdobridge: <gfxstrand> nothing in dmesg
22:31fdobridge: <gfxstrand> I've been watching it like a hawk
22:34fdobridge: <karolherbst🐧🦀> mhh, weird
22:34fdobridge: <karolherbst🐧🦀> maybe something in the macro?
22:34fdobridge: <gfxstrand> Shouldn't be.
22:34fdobridge: <marysaka> I rebased and applied your suggestion on isaspec MR btw @gfxstrand, do I need to squash the changes?
22:35fdobridge: <gfxstrand> Yeah, it should just be one patch in the end.
22:35fdobridge: <marysaka> okay all good then thanks 👍
22:38fdobridge: <gfxstrand> Left you one more comment. Sorry I didn't notice that yesterday.
22:39fdobridge: <marysaka> No issue
22:40fdobridge: <gfxstrand> Hrm... multiple clear tests failing: `dEQP-VK.draw.renderpass.multiple_clears_within_render_pass.clear_clear_c_r8g8b8a8_snorm_d_d16_unorm_big_triangle`
22:44fdobridge: <marysaka> should be all good now :AkkoYay:
22:44fdobridge: <karolherbst🐧🦀> @gfxstrand what's the oldest you've got? maxwell 1st?
22:45fdobridge: <gfxstrand> Ok, I gave you an RB tag. Throw it on there and I'll assign marge.
22:45fdobridge: <gfxstrand> I think I got a Kepler.
22:47fdobridge: <gfxstrand> I don't have Pascal. They're not the cheapest yet and we'll likely never be able to reclock them so...
22:48fdobridge: <karolherbst🐧🦀> odd that even the 1030 are still quite pricey (edited)
22:48fdobridge: <karolherbst🐧🦀> ohh.. there are some cheap ones
22:49fdobridge: <gfxstrand> It was enough that I didn't want to bother. I'll probably pick one up at some point for completeness.
22:50fdobridge: <marysaka> I have one around I think if you ever want me to test stuffs I guess
22:50fdobridge: <gfxstrand> Meh. We've got lots of hardware to play with for now.
22:50fdobridge: <gfxstrand> I doubt there will be a lot of Pascal-specific bugs.
22:50fdobridge: <karolherbst🐧🦀> yeah.. there shouldn't be
22:51fdobridge: <karolherbst🐧🦀> pascal is more or less like 2nd gen maxwell
22:51fdobridge: <karolherbst🐧🦀> just compute launches are a bit different
22:51fdobridge: <karolherbst🐧🦀> but they are like turing, soo...
22:51fdobridge: <marysaka> In my brain it's 3rd gen maxwell :AkkoDerp:
22:51fdobridge: <karolherbst🐧🦀> kepler is interesting because it uses the old texture headers
22:51fdobridge: <karolherbst🐧🦀> yeah.. sounds about right 😄
22:52fdobridge: <karolherbst🐧🦀> all that generation mess is quite fuzzy on nvidia anyway
22:52fdobridge: <karolherbst🐧🦀> it's not like they redesign the entire GPU
23:03fdobridge: <gfxstrand> @marysaka I just remembered: We should add some sort of a `bitsize % 32 == 0` assert in the python too. I commented and Rob left a ➕ so I assume that's agreement.
23:03fdobridge: <marysaka> ah right :painpeko:
23:04fdobridge: <marysaka> the question is where should we put it as all the code seems to already assume 32 bits (there is division around by 32 for example)
23:05fdobridge: <marysaka> Maybe in the ISA constructor...?
23:05fdobridge: <gfxstrand> Something like this should do:
23:05fdobridge: <gfxstrand> ```python
23:05fdobridge: <gfxstrand> # We only support multiples of 32 bits for now
23:05fdobridge: <gfxstrand> assert isa.bitsize % 32 == 0
23:05fdobridge: <gfxstrand> ```
23:05fdobridge: <gfxstrand> Right after
23:05fdobridge: <gfxstrand> ```python
23:05fdobridge: <gfxstrand> s = State(isa)
23:05fdobridge: <gfxstrand> ``` (edited)
23:05fdobridge: <marysaka> or actually in validate_isa...
23:05fdobridge: <gfxstrand> I'd put it in the generator because that's where the problem is.
23:05fdobridge: <gfxstrand> I think the ISA constructor theoretically doesn't care.
23:06fdobridge: <marysaka> format and value methods do divide by 32 though.
23:07fdobridge: <gfxstrand> Yeah, I guess they do.
23:07fdobridge: <🌺 ¿butterflies? 🌸> NV rolls out new blocks when they are ready
23:08fdobridge: <marysaka> I don't mind anywhere tbh, I'm not too familiar with that code, I will just do as you want
23:08fdobridge: <gfxstrand> Ok, yeah, I think `validate_isa()` is the right place.
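Putting the suggested assert into `validate_isa()` could look roughly like the sketch below. The `ISA` dataclass is a hypothetical stand-in with only a `bitsize` field, not the real isaspec class, and the real `validate_isa()` presumably does other checks that are not shown here.

```python
from dataclasses import dataclass


@dataclass
class ISA:
    """Hypothetical stand-in for the parsed ISA; only `bitsize` is modeled."""
    bitsize: int


def validate_isa(isa: ISA) -> None:
    # We only support multiples of 32 bits for now
    assert isa.bitsize % 32 == 0, (
        f"bitsize {isa.bitsize} is not a multiple of 32"
    )


if __name__ == "__main__":
    validate_isa(ISA(bitsize=64))      # passes silently
    try:
        validate_isa(ISA(bitsize=48))  # rejected before any code is generated
    except AssertionError as err:
        print("rejected:", err)
```

Checking at validation time means the later divisions by 32 in the format and value methods never see an unsupported size.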
23:08fdobridge: <karolherbst🐧🦀> sure, but rolling out new blocks and trashing the entire API to it are two separate things
23:12fdobridge: <marysaka> Pushed the changes
23:13fdobridge: <gfxstrand> You still need to add my `Reviewed-by:` tag to the commit message. Yeah, that bit's annoying. We might be doing away with it in the not-too-distant future.
23:13fdobridge: <gfxstrand> But we still use them for common code.
23:13fdobridge: <marysaka> oh I wasn't sure if that was fine as you asked another change here
23:13fdobridge: <gfxstrand> Yeah, it's good now.
23:14fdobridge: <marysaka> ok then 👍
23:15fdobridge: <marysaka> okay pushed that, thanks for your help 😄 (edited)
23:17fdobridge: <gfxstrand> There you go. Assuming CI doesn't fail for silly reasons, it should land in an hour or two. 😄
23:22fdobridge: <marysaka> What would be the next steps after that btw? Like should my MR contain your MME builder structure change or is that going to be part of another MR? (or directly nvk/main)
23:23fdobridge: <gfxstrand> I'm happy for your MR to contain all the new MME stuff.
23:23fdobridge: <gfxstrand> No need to split it into multiple MRs by author.
23:23fdobridge: <gfxstrand> We just need something that makes some amount of sense and isn't going to build-break in the middle.
23:24fdobridge: <marysaka> I see
23:31fdobridge: <marysaka> Well I made sure that at least the last 8 commits are buildable, not sure of the one before :AkkoDerp:
23:31fdobridge: <marysaka> I guess I will test that a bit
23:35fdobridge: <karolherbst🐧🦀> know that `git rebase -i origin/main --exec 'ninja -C build'` trick? It is a wonderful way of checking.
23:37fdobridge: <marysaka> oh that's a nice one :linaalert:
23:46fdobridge: <marysaka> well everything builds already :linakira:
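The rebase trick mentioned above replays each commit on top of `origin/main` and runs the build after every one, stopping at the first commit that fails. For anyone who wants to script it, here is a small Python sketch that assembles the same command. The function name and defaults are hypothetical, and it only prints the command rather than running it, since an interactive rebase needs a real checkout and an editor.

```python
import shlex


def rebase_exec_argv(base: str = "origin/main",
                     build_cmd: str = "ninja -C build") -> list[str]:
    """Build the argv for `git rebase -i <base> --exec <build_cmd>`.

    git runs `build_cmd` after each replayed commit and stops the rebase
    at the first commit where the command exits non-zero.
    """
    return ["git", "rebase", "-i", base, "--exec", build_cmd]


if __name__ == "__main__":
    # Print the shell-quoted command; pass the list to subprocess.run()
    # from inside a checkout to actually execute it.
    print(shlex.join(rebase_exec_argv()))
```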