00:00 pmoreau: Phew, finally done rebasing all the clover patches, squashing them and grouping them in a better fashion.
00:00 pmoreau: Now, onto making sure every single commit compiles, and some code cleanup within some of the commits.
00:05 karolherbst: :)
00:05 karolherbst: if you get that merged, I might end up piping that through my nir pass
00:19 pmoreau: karolherbst: FYI, it’s on my clover_spirv_support branch; it’s the same code as on the nouveau_spirv_support, but with the updated clover part.
00:42 wgkj: imirkin: are you there?
00:42 imirkin_: .
00:42 imirkin_: haven't looked at nv40 if that's what you're asking about.
00:43 wgkj: https://bugs.freedesktop.org/show_bug.cgi?id=102349
00:44 wgkj: Is Martin Peres here?
00:44 wgkj: https://bugs.freedesktop.org/show_bug.cgi?id=102352
00:44 imirkin_: he is. mupuf.
00:45 wgkj: mupuf: https://bugs.freedesktop.org/show_bug.cgi?id=102352
00:46 wgkj: i really dont know what to do further. i have those problems since over 2 years, cant use plasma or gnome - probably most live linux OS iso you can download and many other things. I am spending much time in reporting the errors. What else should i do?
00:48 wgkj: karolherbst: you can now do the nouveau work beeing payed. you told me that you would like to fix the memory problems. any news on that? https://bugs.freedesktop.org/show_bug.cgi?id=102430
02:17 karolherbst: imirkin_: those compound RA issues are kind of annoying :/ I have things like "r0 = unpack_64_2x32_split_y ssa_32" in nir and I just convert that to a split, but a phi node then ends up with a split and a non split source :(
02:18 karolherbst: I could try to just do a convert, but...
02:19 imirkin_: i thought i covered that
02:19 imirkin_: bow down before your 32-bit overlords or face their wrath
02:20 karolherbst: well... ssa_32 is the result of a 64bit abs
02:21 karolherbst: ohhh wait
02:21 karolherbst: merge s64 %r35d %r134 %r135 + split u32 { %r38 %r37 } %r35d yeah... that makes sense
02:21 imirkin_: make it look like what the tgsi fe does, or deal with fixing up bugs arising from different usage.
02:21 imirkin_: did you end up with a phi u64?
02:21 karolherbst: no
02:21 karolherbst: the phi uses one result of the split
02:22 imirkin_: hmmmm
02:22 imirkin_: hmmmmmmmmmmmmmm
02:22 imirkin_: hm.
02:22 karolherbst: you see my problem
02:22 imirkin_: i do.
02:22 imirkin_: what happens with tgsi?
02:23 karolherbst: no merge+split
02:23 imirkin_: yeah, merge + split like that should get ... merged
02:23 karolherbst: because TGSI doesn't end up doing that unpack_64_2x32_split_y thing
02:23 karolherbst: mhhh
02:23 imirkin_: right
02:23 imirkin_: coz it's all 32-bit to start with ;)
02:23 karolherbst: I could try to fix it up
02:23 karolherbst: but yeah
02:23 karolherbst: such merge+splits could be just eliminated
02:23 imirkin_: they should be
02:23 imirkin_: maybe i only do it in the other direction
02:23 imirkin_: i.e. split + merge
02:23 karolherbst: should I do it as a nir fixup?
02:24 karolherbst: or where should stuff like that be handled
02:24 karolherbst: while converting to ssa?
02:24 imirkin_: there's a peephole pass
02:24 imirkin_: after ssa
02:24 imirkin_: MergeSomething
02:24 imirkin_: i forget
02:24 imirkin_: one of the first ones to run
02:26 karolherbst: well I see several of those stupid merge+splits
02:26 imirkin_: can you just read the code?
02:26 karolherbst: and they stav even after ssa opts
02:26 karolherbst: yeah
02:26 imirkin_: perhaps it handles a diff case, i dunno.
02:26 imirkin_: it's definitely there and highly related
02:26 imirkin_: so please just rtfs.
02:27 karolherbst: ahh, you mean the MergeSplits thing
02:27 karolherbst: and yeah, it does split+merge, not merge+split
02:27 imirkin_: should be easy to adjust though.
02:27 imirkin_: and/or extend
02:28 karolherbst: thing is, if it is a peephole only running at opt level > 0 it breaks if opts are disabled :(
02:28 imirkin_: yeah, it shouldn't be required.
02:28 karolherbst: allthough
02:28 karolherbst: it passes when I run with opt level 0 anyway
02:29 karolherbst: ohh right
02:29 karolherbst: because there are still the movs
02:45 nrr: Lyude: https://gist.githubusercontent.com/nrr/7fc70f083ab5febb39c40afe8081e943/raw/1b9e9fcbc346c7ff4800966e8c74114c7e8f8fe1/klog-everything it seems like the read faults i'm getting are all PTE except one instance where the reason is UNBOUND_INST_BLOCK
02:46 imirkin_: ouch, those PBENTRY things are bad
02:46 nrr: yeah?
02:46 imirkin_: i guess that was a while ago
02:47 nrr: yeah, that was last month when i got those messages.
03:32 nrr: well, it persists even after upgrading to 4.15.0-0.rc6.git0.1.fc28.x86_64. comically, all it takes is logging into a gnome-shell session under Xwayland, opening up firefox (57.0.1 for those keeping score), and attempting to enter a URL. freezes every time.
12:47 mupuf: wgkj: believe it or not, I actually set up my reverse engineering machine to be able to work on this bug while on vacation... but the power got cut and my UPS did not start the router again when the power went back...
15:12 BootI386: Very, very sad :D
15:12 karolherbst: ...
16:02 pmoreau: karolherbst: I see I have a v2 to review in my inbox :-)
16:25 karolherbst: :)
16:27 karolherbst: have fun
16:29 pmoreau: karolherbst, imirkin_: Is there a way to have messages sent through “pipe_debug_message” displayed as regular debug messages? i.e. would it be possible, with NV50_PROG_DEBUG=1 or whatever, to pipe those messages to stdout/stderr (not rely on the user program to do that)?
16:30 imirkin_: pmoreau: i think with a debug build, mesa prints them
16:31 pmoreau: I don’t have them in my debug build when running an OpenCL program. (I do get them for some tests of the OpenCL CTS though)
16:34 pmoreau: Looking at the code, it seems to be fed directly to the debug callback, and that’s about it.
16:38 imirkin_: with opencl, iirc it gets surfaced out via that callback thing
16:38 imirkin_: (i added that =] )
16:39 imirkin_: basically when you do clCreateDevice() you can pass in a callback
16:39 imirkin_: (or some similar function)
16:41 pmoreau: karolherbst: For patch 2, I was thinking that the prog->type is already displayed right before the dump along different statistics, but that’s through the pipe_debug_message
16:42 pmoreau: imirkin_: Right, but I wanted to know whether there was some debug mode to print it out regardless of whatever the application, regardless of whether it registers a callback or not.
16:42 imirkin_: there is not.
16:43 imirkin_: have a look at _pipe_debug_message -- you can do it there
16:44 pmoreau: That’s what I ended up doing (adding a call to _debug_printf), and then realised I had commented out the code to _pipe_debug_message in nvc0_program.cpp --"
17:39 Lyude: nrr: interesting, also I forgot to leave my computer on last night >:(
17:39 Lyude: let me know how the new kernel goes as well
17:46 nrr: Lyude: haha, not well. still the same messages in the kernel logs.
17:46 Lyude: that's alright, we at least know the problem probably hasn't been fixed in the kernel driver then.... any luck on finding anything more substantial to reproduce it with?
17:48 nrr: aside from the firefox bit i mentioned, no, but now that i have at least that much, and given that i'm running what amounts to the latest kernel, i can hopefully make use of BPF to get some tracing around it so that i can actually write a test case that consistently exercises it.
17:50 Lyude: mm, I have a feeling I might be able to reproduce this if I run more stuff at the same time
17:53 Lyude: nrr: also this might be a bit silly but; if worse comes to worse think there might be some point I could get ssh access to that machine?
17:54 nrr: Lyude: maybe. are you capable of getting on an IKEv2 VPN?
17:54 Lyude: i'd imagine I probably am
17:54 Lyude: does it have a networkmanager plugin?
17:57 nrr: appears so! let me see if i can carve out some time this evening to provision you a set of keys, and i'll give you instructions.
17:58 Lyude: alright, karolherbst btw: you know anything about memory management with nouveau?
18:28 karolherbst: Lyude: skeggsb and imirkin_ should know more
18:28 imirkin_: just skeggsb
18:28 karolherbst: imirkin_: I am sure you still know more than me :)
18:53 karolherbst: imirkin_: what would you consider show stopper for merging that nir stuff, variable indexing + 64 bit values?
18:53 imirkin_: variable indexing - yes
18:53 imirkin_: 64-bit values, just disable them
18:54 karolherbst: so just disable fp64 and int64 extensions / disable caps?
18:55 imirkin_: yea
18:56 karolherbst: ugh... most crashes are due to this: "../src/mesa/state_tracker/st_atom_array.c:595: setup_non_interleaved_attribs: Assertion `array' failed."
18:56 karolherbst: >50%
18:56 karolherbst: and texturegathereroffsets is another 250
18:56 karolherbst: that assert is hit for arb_vertex_attrib_64bit
18:56 imirkin_: yeah i dunno what st_atom_array is or does :)
18:57 imirkin_: probably something with arrays. and atoms. maybe tracking some STate?
18:57 karolherbst: and 4.20 variable indexing is like 33% of crashes
18:57 karolherbst: might be
18:57 karolherbst: maybe the AMD guys know more
18:58 imirkin_: yeah, i mean i dunno where the issues are - but irrespective of where they are, we can't have code that's this broken :)
18:58 karolherbst: right
18:58 imirkin_: there's no rush for any of this
18:58 karolherbst: right
18:59 karolherbst: it would be just nice if we could people to play around with this and report issues we don't catch with piglit/deqp/CTS
18:59 karolherbst: but as long as we have enough there, it is fine to not merge it :)
18:59 imirkin_: right
18:59 imirkin_: and it doesn't have to be perfect even when you do merge it
18:59 imirkin_: fixing the last .1% of the issues takes like ... a lot of the time
19:00 karolherbst: right
19:00 imirkin_: i'm worried about *core* misunderstandings in the impl
19:00 karolherbst: but at least all the glsl tests should pass :)
19:00 imirkin_: i think that variable-indexing failing likely means there is one.
19:00 karolherbst: might be, yes
19:00 imirkin_: (without digging deeper myself)
19:00 karolherbst: I have to take a deeper and more concentrated look
19:00 imirkin_: if it's just some dumb issue somewhere, then wtvr
19:01 karolherbst: ohh, that variable indexing is. let me check
19:02 imirkin_: but it's the sort of thing that can go horribly wrong if the packing is messed up
19:02 imirkin_: or if your indexing logic is fof
19:02 imirkin_: or ... a number of things
19:02 imirkin_: which are EXTREMELY important to get right
19:02 imirkin_: since otherwise random shit fails for totally random reasons
19:02 Lyude: Is anyone else having problems ssh'ing into freedesktop.org
19:02 imirkin_: see email from yesterday
19:02 imirkin_: it's disabled until the PTI thing is worked out
19:03 karolherbst: well it works without MemoryOpts
19:03 imirkin_: yeah, i have a big problem with approaches that talk like that. it leads me to believe you haven't the faintest clue of what's going on.
19:03 imirkin_: which worries me greatly
19:03 karolherbst: I know I knew when I was looking into it
19:04 imirkin_: ;)
19:04 karolherbst: I am sure we shouldn't end up with stuff like this: ld u64 $r0d l[$r8+0x10]
19:04 Lyude: also karolherbst I'm assuming you didn't hit any of this the last time you ran piglit with concurrency? https://paste.fedoraproject.org/paste/~w6dz4Z18a9s4u6oOh6EMA
19:04 karolherbst: which got opted from "ld s32 $r2 l[$r0+0x10]" + "ld s32 $r3 l[$r0+0x14]"
19:05 imirkin_: karolherbst: there's an implicit assumption that things are vec4-aligned in there.
19:05 imirkin_: i.e. that all indirects are a factor of 16
19:05 karolherbst: right, so the code assumes $r8 is a factor of 0x10
19:05 imirkin_: so you can just look at the offset and be good
19:05 karolherbst: we talked about this some days ago
19:05 imirkin_: yes.
19:05 imirkin_: why isn't it?
19:05 karolherbst: because st/nir packs by default
19:06 imirkin_: ok, can you switch it?
19:06 karolherbst: there is no switch, but I could make one
19:06 imirkin_: (why is st/nir involved in this decision in the first place?)
19:06 karolherbst: it runs nir_compact_varyings
19:06 imirkin_: (how does nir know how we pack our locals?)
19:06 karolherbst: st/nir runs nir_compact_varyings that is
19:06 imirkin_: these aren't varyings
19:06 imirkin_: they're locals.
19:07 karolherbst: yeah, but stuff is stored into them
19:07 imirkin_: anyways - don't try to optimize stuff.
19:07 imirkin_: it'll just lead to problems like this.
19:07 karolherbst: "mov u32 $r1 c0[0x10]" + "st s32 # l[$r0+0xc] $r1"
19:07 imirkin_: if tgsi does stuff vec4-aligned, you also do stuff vec4-aligned
19:07 karolherbst: and so on
19:07 imirkin_: where does r0 come from?
19:07 imirkin_: you control that entirely.
19:07 imirkin_: you can ensure it's vec4-aligned.
19:07 karolherbst: "mul s32 $r0 $r0 3" + "mul u32 $r0 $r0 0x0000000c"
19:08 karolherbst: and originally mov u32 $r0 c0[0x50]
19:10 imirkin_: where did the 0xc come from?
19:10 imirkin_: (and/or the 3)
19:10 karolherbst: 0xc is the size of the uniform
19:10 imirkin_: what uniform
19:11 imirkin_: pastebin glsl code, nir code, and the initially-generated nvir
19:11 imirkin_: i will point exactly where you're goign wrong.
19:12 karolherbst: https://gist.github.com/karolherbst/8d29b6fcfba8b0e846ee75645cbded87
19:13 imirkin_: this is longer than i hoped. hold on.
19:13 imirkin_: 11: st s32 # l[0xc] %r4 (0)
19:14 imirkin_: where does that 0xc come from?
19:14 imirkin_: why isn't it 0x10?
19:15 karolherbst: imirkin_: decl_reg vec3 32 r0[9]
19:15 imirkin_: oh.
19:16 imirkin_: THAT is highly unfortunate.
19:16 karolherbst: ;)
19:16 imirkin_: er wait
19:16 imirkin_: no
19:16 imirkin_: no
19:16 imirkin_: it's fine
19:16 imirkin_: ignore me. it's fine.
19:16 imirkin_: so explain again why it's 0xc
19:16 imirkin_: and not 0x10
19:17 karolherbst: r0[1] = imov ssa_468, I store into the second element of the array, which starts at 0xc if I assume each element is a vec3
19:17 imirkin_: why does it start at 0xc?
19:17 imirkin_: why do you assume it's a vec3?
19:17 karolherbst: because the declaration tells me it is a vec3
19:17 imirkin_: <imirkin_> if tgsi does stuff vec4-aligned, you also do stuff vec4-aligned
19:17 imirkin_: solution: don't do that. vec4-align and move on with life.
19:18 karolherbst: well, I think nir relies on that in some parts, I could be wrong though
19:18 imirkin_: how could it
19:18 imirkin_: how would it know?
19:18 imirkin_: that you secretly didn't place the vec3 components extremely adjacent to each other?
19:18 karolherbst: mhh, good point
19:19 imirkin_: and yes. it'd be nice to place them next to one another.
19:19 imirkin_: and perhaps at some later date we could implement that
19:19 imirkin_: but you can't be in the business of fixing everything that's wrong with the world
19:19 karolherbst: okay, I guess I try to treat every array as a fixed vec4 thing then and see how that goes
19:20 imirkin_: the more you do "like tgsi does it", the easier your life will be.
19:20 imirkin_: optimizations are for later.
19:21 imirkin_: and perhaps there are reasons why tgsi does it the way it does it ;)
19:24 karolherbst: imirkin_: it passes now with memoryopt
19:24 imirkin_: :)
19:24 imirkin_: ain't that nice.
19:24 karolherbst: I will stick TODOs into the code then
19:26 karolherbst: but the fix in MemoryOpts should be fairly trivial, we just need to check if the indirect is a factor of either 0x8 or 0x10, but I can work on that later
19:28 imirkin_: we might not always know
19:28 imirkin_: anyways, yeah, it could be more careful
19:29 karolherbst: well, if we don't know, we can't do the opt, right?
19:30 imirkin_: well right now we know - it's always vec4-aligned. problem solved ;)
19:32 karolherbst: :D
19:32 imirkin_: this may be a bigger problem for attribs.
19:32 imirkin_: depending on how the packing works for them
19:33 karolherbst: it's.... anoying, I remember looking into it
19:33 karolherbst: like I get dvec4 input and nir only tells me about one slot being used ;)
19:33 karolherbst: and then you can have it component wise
19:33 karolherbst: so I have to check if the component is in the next slot and so on
19:34 imirkin_: fix the nir.
19:34 imirkin_: you don't want nothing to do with the 64-bit vertex attribs.
19:34 karolherbst: wondering if it can be lowered
19:36 karolherbst: imirkin_: regarding int64, I could just do "nir_lower_int64" :)
19:36 imirkin_: not just int64
19:36 imirkin_: fp64
19:36 karolherbst: and maybe nir_lower_double_ops as well
19:37 karolherbst: uhm... but I would end up with "double add, mul, and fma"
19:37 karolherbst: useless
19:52 karolherbst: skeggsb: https://gist.githubusercontent.com/karolherbst/2d61ed5bc5a9aa9536e8a0b939796686/raw/fd0553a1f27c992f60aca090da645ad28859e7f4/gistfile1.txt
19:53 karolherbst: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
19:53 karolherbst: or maybe you fixed that as well already
19:54 skeggsb: how'd you cause that?
19:54 karolherbst: piglit
19:54 karolherbst: texsubimage
19:54 skeggsb: reproducible?
19:54 imirkin_: hes just that good :)
19:54 karolherbst: mhh, maybe one in 25 full runs
19:55 skeggsb: but always the same test, etc?
19:55 karolherbst: no idea
19:55 karolherbst: I just know that one in 25 runs crashes my system quite hard
19:55 karolherbst: or one in 50
19:55 karolherbst: but the trace looks familiar
20:11 karolherbst: imirkin_: so most of the variable indexing tests pass now, <1% fail
20:24 Lyude: [ 4268.552386] nouveau 0000:01:00.0: fifo: read fault at 0001e30000 engine 00 [GR] client 01 [GPC2/T1_0] reason 02 [PTE] on channel 24 [007e1d3000 Xwayland[1530]] -- reproduced nrr's error at least once me thinks
20:25 karolherbst: Lyude: uhh, somebody had an issue with firefox on wayland
20:25 karolherbst: Lyude: where resizing at the bottom just crashed it
20:25 Lyude: oh no
20:25 karolherbst: or froze it
20:26 karolherbst: maybe you could try that as well :)
20:26 Lyude: any idea who it was?
20:26 karolherbst: no
20:26 karolherbst: would have to dig through the logs
20:26 Lyude: yeah sure, just gimme a card to try. i've got everything up to maxwell 1 in my apartment right now
20:26 Lyude: okie
20:34 pmoreau: Lyude: Maybe this bug: https://bugs.freedesktop.org/show_bug.cgi?id=104222
20:35 imirkin_: Lyude: you have a NV4?
20:35 Lyude: pmoreau: I think we found a match
20:35 Lyude: nrr: I think we found your bug
20:35 Lyude: imirkin_: ok sorry, I should it doesn't get older then fermi xD
20:37 karolherbst: Lyude: can you trigger the bug with firefox?
20:38 Lyude: once this piglit test ends i'll give it a shot
20:38 Lyude: the resizing makes perfect sense btw
20:38 Lyude: resizing windows in xwayland results in a gbm bo being created and the previous one for the window being destroyed
20:38 Lyude: so it makes sense it would race like that
20:39 Lyude: nrr was also seeing that bug in X as well, but I'm pretty sure the glamor codepaths (it does happen on nouveau's ddx as well mind you) would be doing the same thing for a couple of things, including window resizing
20:40 imirkin_: when nouveau lists which process it is, it's lying
20:40 Lyude: no i'm aware
20:40 imirkin_: so just coz it says 'X' doesn't mean it was X
20:40 imirkin_: it means that X opened the fd way back when :)
20:40 Lyude: but i know that this is how xwl does things because I had to deal with window resizing when writing the eglstream code :)
20:40 Lyude:touched most of that code]
20:41 imirkin_: the issue is almost definitely in the GL impl
20:41 Lyude: also,i've already managed to trigger ttm errors all over the place with piglit runs
20:41 imirkin_: or in nouveau's memory management logic
20:41 Lyude: i think it's in the former but i'm not 100% yet
20:41 Lyude: erm, *ladder
20:41 imirkin_: (or both! what a happy coincidence that would be.)
20:41 Lyude: hehe
20:42 Lyude: i'll see if I can make a bit of headway with this mst retraining and then I'll check out this bug in an hour or two
20:44 Lyude: karolherbst: i'll test the resize thing right now though, since that piglit run just finished
20:44 Lyude: (wanted to see which piglit tests actually threw the PTE errors)
20:45 Lyude: karolherbst: bingo!
20:45 Lyude: [ 5888.557036] nouveau 0000:01:00.0: fifo: read fault at 00049b2000 engine 00 [GR] client 0a [GPC2/T1_3] reason 02 [PTE] on channel 14 [007ec97000 Xwayland[1530]]
20:46 Lyude: resizing really quickly most certainly breaks things on wayland
20:47 imirkin_: so ...
20:47 imirkin_: just for the record
20:47 imirkin_: i did recently fix a bug
20:48 imirkin_: whereby we were deleting a buffer too early
20:48 imirkin_: when we were running out of space for shaders and reallocating the code segment
20:48 imirkin_: however i'm pretty sure that bug looked different
20:48 imirkin_: i.e. it wasn't GPCN/T* that triggered it
20:49 Lyude: might be worth a shot anyway, i've seen a lot of varying errors on here (although interestingly enough the firefox reproducer seems to be rather specific here)
20:49 imirkin_: but here's a question --
20:49 imirkin_: does adding MESA_DEBUG=flush fix it?
20:51 Lyude: let's see
21:02 Lyude: imirkin_: nope
21:02 imirkin_: https://github.com/imirkin/mesa/commit/c4fd2ce1514cdc334db48e9522377c438df3ab8a
21:03 imirkin_: that's the commit in question
21:03 imirkin_: but i'm fairly certain it will do nothing for you
23:58 imirkin_: karolherbst: btw, the tg4 thign is PIPE_CAP_TEXTURE_GATHER_OFFSETS iirc. there's a diff between 4-offset tg4 and 1-offset tg4. most hw only supports the latter, and the 4-offset tg4 gets lowered into 4x 1-offset tg4's