04:27 seyeongkim: kernel 4.x with P400, P5000, P6000 cards and more than 64 GB RAM can't boot properly, what area should i research for this issue?
04:28 imirkin: iirc someone reported this earlier
04:28 imirkin: perhaps you, or an associate?
04:28 imirkin: (i can't imagine there are too many of these systems running around)
04:28 imirkin: anyways, iirc the comment was that the 64GB thing had nothing to do with nouveau, the system just wouldn't boot even if nouveau was never loaded
04:30 imirkin: anyways, not sure what would be special about a 64GB limit as far as nouveau is concerned... i could imagine things going south at 4GB or at 1TB.
04:30 imirkin: although the 4GB barrier is fairly well-tested these days
04:30 imirkin: i'd have no trouble believing there were issues with a >1TB system though.
04:31 seyeongkim: Thanks. and the related machine has 1TB
04:31 seyeongkim: need to set kernel parameter mem=64000mb
04:31 imirkin: can you try a smaller number than 1TB (but larger than 64GB)?
04:32 seyeongkim: I'm going to check it again
04:32 imirkin: (the 1TB limit arises from the fact that nvidia gpu's have a 40-bit virtual address space)
04:32 HdkR: Almost feel like it would be an issue when you hit the 40bit VA limit
04:32 HdkR: lol
04:32 imirkin: otoh, that's just the VA ... i forget if the PTE's can address higher system ram locations.
04:33 seyeongkim: if i test it with a 1080, would you presume the same issue there?
04:34 imirkin: yes
04:34 seyeongkim: ok, I'm far from the exact machine so I'm thinking of alternatives..
04:34 seyeongkim: I'll dig that part thanks imirkin HdkR
04:34 imirkin: any G80 or later gpu has this limitation afaik
04:35 seyeongkim: ah one more thing.. imirkin do you know which code file has this limitation? e.g. to manually increase it just for testing
04:35 imirkin: i'm glancing at the gp100 vmm... i can't quite tell if gp100 is 40- or 48-bit
04:35 imirkin: it's an architectural limit
04:35 seyeongkim: ah ok
04:36 seyeongkim: thanks a lot
04:36 imirkin: yeah, looks like GP100 can go up to 47 bits
04:36 imirkin: oh
04:36 imirkin: but limited to 40-bit by default
04:37 imirkin: er no. limited to 47 bits by default
04:37 imirkin: but with an option to use the gm200 setup
04:37 imirkin: (which is 40-bit)
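As a rough check of the numbers in this exchange: a 40-bit virtual address space covers 2^40 bytes, which is where the 1TB figure comes from, and 47 bits gives 128 times that. A minimal Python sketch of the arithmetic follows; the helper name `va_space_bytes` is made up for illustration and is not part of nouveau.

```python
GIB = 1 << 30
TIB = 1 << 40

def va_space_bytes(bits: int) -> int:
    """Number of bytes addressable with a `bits`-wide virtual address."""
    return 1 << bits

# 40-bit is the pre-GP100 figure from the log, 47-bit the GP100 default.
for bits in (40, 47):
    print(f"{bits}-bit VA: {va_space_bytes(bits) // TIB} TiB")

# 64 GiB of system RAM is far below either limit, which fits the remark
# that a 64GB threshold is not an obvious nouveau boundary.
print(64 * GIB < va_space_bytes(40))
```

This only models address-space size; whether the PTEs can point at system RAM above that range is left open in the log.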
04:38 imirkin: seyeongkim: may i ask what you're doing with such a machine and nouveau?
04:38 seyeongkim: actually our customer reported this issue to us , not sure what they really do with this
04:38 imirkin: k. someone was in here a couple days ago talking about the same thing
04:39 imirkin: there can't be too many people trying this, so probably same person
04:39 imirkin: you can check the irc logs (see topic)
04:39 seyeongkim: ah ok
04:39 seyeongkim: thanks
04:40 HdkR: Obviously the solution would be to send imirkin a system that breaches the 40bit VA limit to work around the problem :P
04:41 imirkin: lol
04:41 imirkin: while that'd be self-serving, if you want things fixed, send them to skeggsb
04:47 seyeongkim: ok I checked him in near team :) i don't know why he didn't update the case. Thanks
10:22 tagr: imirkin: so that patch gets rid of the errors, but I suspect it's only hiding the issue, I get a bunch of these: https://hastebin.com/vevopoviki.apache
10:24 tagr: everything is also pretty sluggish, obviously
11:47 imirkin: tagr: yeah that's totally bogus
11:47 imirkin: sorry for sending you such a half-baked patch
12:26 tagr: imirkin: no problem, I'm happy to test anything you think could help
12:28 tagr: imirkin: the patch helped confirm that on Linux 4.16.6, the freezes are actually permanent, so display doesn't refresh (other than the cursor)
12:28 tagr: with 4.14, it recovers after a couple of seconds at maximum, or in many cases is even hardly noticeable
12:29 tagr: which means that something must've changed in that area, right?
12:54 imirkin_: tagr: maybe ... could be something silly though. i think skeggsb had the additional theory that if this is related to the software channel, then using DRI3 would reduce the likelihood of issues.
12:54 imirkin_: the downside of DRI3 is that it doesn't 100% work with the nouveau ddx
12:54 imirkin_: but ... 99.9% :)
12:56 karolherbst: imirkin_: did you see this patch? https://lists.freedesktop.org/archives/mesa-dev/2018-April/192430.html it looks fine to me, although usually I wouldn't even bother.
12:57 karolherbst: and I guess compilers are smart enough already...
12:57 imirkin_: i did
12:57 imirkin_: i meant to apply it
12:57 imirkin_: but clearly that fell through.
12:57 karolherbst: okay
12:58 karolherbst: I can push it as well if you don't have time
13:05 imirkin_: go for it
13:06 karolherbst: k
13:46 tomeu: karolherbst: btw, don't know why, but I needed these changes to build llvm-spirv: https://github.com/tomeuv/SPIRV-LLVM-Translator/commit/9c82149364739b19c85b0db4a0b96dc34c976deb
13:47 karolherbst: tomeu: do you really need the first one?
13:47 tomeu: karolherbst: don't think so
15:42 karolherbst: imirkin_: mhh, something is causing me to have more spilling fails, even for trivial enough shaders
15:48 karolherbst: imirkin_: yeah.. maybe I just wrongly ported the RA fix we need for 64 bit values
15:57 karolherbst: imirkin_: yeah, something in 5428066f5e1ef5ea6ae04c84019f270023cfc6aa breaks stuff for me :(
15:57 karolherbst: or rather this + cwabbotts fix
15:59 karolherbst: imirkin_: duh... I know the issue
15:59 karolherbst: mov
15:59 karolherbst: ohh wait, doesn't make sense
15:59 imirkin_: that should have been a no-op
16:00 imirkin_: that only affects nv50
16:00 karolherbst: yeah... I know
16:00 imirkin_: and even then, only in very rare cases
16:00 karolherbst: that's why I said + cwabbotts fix
16:00 imirkin_: i mean literal no-op
16:00 karolherbst: I know
16:00 imirkin_: like ... the code should do exactly the same thing
16:00 karolherbst: right, but when I revert it, it works
16:00 imirkin_: if reverting it helps in any way, that's highly surprising
16:00 karolherbst: I have to apply https://github.com/karolherbst/mesa/commit/def1d1ddc2e8dca2ae967557f1c20204c7d9a96a on top of it
16:01 karolherbst: last change is relevant
16:01 imirkin_: oh, unless connor's fix was in that logic
16:01 karolherbst: but
16:01 karolherbst: even then
16:01 karolherbst: he basically just added a " && defi->op != OP_MERGE && defi->op != OP_SPLIT) "
16:01 imirkin_: ok, so copy that into the code i refactored
16:02 karolherbst: ... as if I didn't try that already ;)
16:02 imirkin_: i can't imagine why anything else would matter
16:02 karolherbst: me neither
16:02 imirkin_: try to figure it out -- it should literally be the same lines of code executing before and after the change
16:02 karolherbst: except something random is random in a different way
16:02 imirkin_: just refactored into a function.
16:05 karolherbst: imirkin_: .. guess what
16:06 karolherbst: ohh wait, no
16:07 karolherbst: I thought I fixed it, but it was just the correct result still stored in VRAM
16:08 karolherbst: imirkin_: cwabbott's fix is totally unrelated, it compiles fine just with reverting your commit
16:08 karolherbst: I don't need his patch
16:08 karolherbst: (in this case)
16:09 imirkin_: stupid question, but ... you're not compiling for nv50 are you?
16:09 karolherbst: no
16:09 karolherbst: gp107
16:09 karolherbst: and I actually run the kernel
16:09 imirkin_: well, i'll need the details.
16:09 karolherbst: I diff the DEBUG=7 output
16:09 karolherbst: maybe that gives me something
16:13 karolherbst: imirkin_: there are differences like "RIG_Node[%108]($[1]-1): 2 colors, weight inf, deg 12/63" vs "RIG_Node[%108]($[1]-1): 2 colors, weight 7.200000, deg 12/63" right is reverted
16:13 karolherbst: the weight is inf on master
16:13 karolherbst: I mean, without the revert
16:14 imirkin_: hmmmmmm
16:14 imirkin_: that means i'm fucking something up
16:14 imirkin_: can you provide both of those files in full?
16:14 imirkin_: i can't investigate now, but will try to get to it tonight
16:15 imirkin_: i'll also try to stare at the code to see if the issue appears.
16:15 imirkin_: to be clear, this is master vs master + revert, right? no other funny RA-related changes, like connor's
16:15 karolherbst: all RA changes like connor's are removed, it is on my opencl branch though
16:16 imirkin_: ok
16:16 karolherbst: but your change is the newest showing up with git log ... nv50_ir_ra.cpp
16:16 imirkin_: is the branch somewhere i can see?
16:16 imirkin_: [in case i need to double-check anything]
16:16 karolherbst: https://github.com/karolherbst/mesa/commits/tmp
16:17 imirkin_: k. i'll have a look tonight.
16:17 imirkin_: hopefully.
16:17 imirkin_: Lyude: and maybe you can have a look at the DP-MST thing ;)
16:17 karolherbst: imirkin_: well, I try to figure out what changed as well. Your change isn't really that big...
16:25 karolherbst: imirkin_: that fixes it: https://gist.github.com/karolherbst/52ee5433affd605701a6407cb28bdba7
16:25 karolherbst: but
16:25 karolherbst: maybe a smaller patch is needed
16:25 karolherbst: but this is basically the changes you did while moving
16:30 imirkin_: hmmmmmmmm
16:30 karolherbst: duh!!!!
16:31 karolherbst: I found it
16:31 karolherbst: ... bah
16:31 karolherbst: ...
16:31 karolherbst: no :(
16:31 karolherbst: nonono
16:32 karolherbst: imirkin_: https://gist.github.com/karolherbst/52ee5433affd605701a6407cb28bdba7 ;)
16:32 imirkin_: how can that matter?
16:32 karolherbst: return;
16:32 imirkin_: fffffffffffuck
16:33 karolherbst: I like that '// doesn't help' comment though
16:33 imirkin_: doesn't help... but it hurts!
16:33 karolherbst: apparently
16:34 karolherbst: I'll write a fix
16:34 karolherbst: .. well on master
16:40 karolherbst: imirkin_: https://github.com/karolherbst/mesa/commit/3c2476c01aee39c8483636c842779c6dc7103881
16:40 karolherbst: do you want to check if the test still passes?
16:44 imirkin_: yeah..... i'm a bit concerned about sticking the noSpill on there.
16:44 imirkin_: i wanted to leave it off.
16:44 imirkin_: but perhaps it should go on there
16:44 imirkin_: i'll think about it. thanks for tracking it down!
16:47 karolherbst: okay
16:47 imirkin_: [will look tonight, but now with a much higher likelihood of success]
16:48 karolherbst: :)
17:37 Lyude: imirkin_: yep! sorry I didn't get a chance yesterday but I brought back the MST stuff I would need to test it
17:38 imirkin_: yep, no worries. just keeping it near the top of the proverbial stack.
17:38 imirkin_: [until you either do it or tell me to go away]
17:38 Lyude: hehe
18:24 karolherbst: imirkin_: the second arg of popcount is a mask?
18:24 karolherbst: so could I do popcnt $r0 $r0 0xff for chars?
18:26 imirkin_: karolherbst: the two args of popcnt are and'd together
18:26 imirkin_: note that iirc the second arg is lost on maxwell+
18:26 karolherbst: ahh
18:26 karolherbst: okay
18:27 karolherbst: wondering why nvidia still does this then: POPC R0, R0, -0x1;
18:27 karolherbst: ohh wait
18:27 karolherbst: my mistake :)
18:27 karolherbst: I compiled for sm_30
18:28 imirkin_: it's conceivable the 2-arg thing still exists, i haven't investigated extensively
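To illustrate the semantics described above (the two arguments are and'd together before the bits are counted), here is a small Python model. It is only a sketch of the behaviour as stated in this log, not something taken from ISA documentation, and `popc` is a hypothetical helper name.

```python
MASK32 = 0xFFFFFFFF

def popc(a: int, b: int = MASK32) -> int:
    """Model of a two-operand popcount: count set bits of (a & b) in 32 bits."""
    return bin(a & b & MASK32).count("1")

x = 0x12345678
print(popc(x, 0xFF))   # mask keeps only the low byte 0x78, so 4 bits are counted
print(popc(x, -0x1))   # all-ones second operand: plain 32-bit popcount, 13
```

Under this model a second operand of -0x1 is a no-op mask, which would be consistent with the `POPC R0, R0, -0x1` output seen for sm_30.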
19:21 pendingchaos: imirkin_: what is the source of OP_PIXLD used for?
19:25 karolherbst: pendingchaos: you mean what pixld is used for, or what the source should be?
19:27 pendingchaos: I guess the second. all code that creates a PIXLD instruction seems to supply it zero, though https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp#n2658 seems to say it is used for something
19:32 imirkin_: pendingchaos: some PIXLD ops take an arg
19:32 imirkin_: some don't
19:33 imirkin_: pendingchaos: or perhaps it's the RT index. i really don't know tbh.
21:00 karolherbst: imirkin_: what's LDG.CI.U8?
21:00 karolherbst: the CI especially
21:00 karolherbst: load with global address, but it is caches as a const buffer actually?
21:02 karolherbst: *cached
21:03 HdkR: karolherbst: bzz, wrong
21:03 karolherbst: HdkR: ?
21:04 HdkR: That's not what the CI means :P
21:04 karolherbst: are you sure?
21:04 HdkR: yep
21:05 karolherbst: or does ldg.ci just mean to cache more aggressively, because the data never changes?
21:05 HdkR: bzzz
21:06 karolherbst: well I am 100% sure it has something to do with caching :p
21:06 HdkR: :D
21:06 HdkR: oh wait no, I misread
21:06 karolherbst: :D
21:06 HdkR: yes, latter
21:07 karolherbst: mhh interesting, reads from global are only cached inside L2 on Kepler
21:07 karolherbst: but on maxwell with .CI it can be promoted to be cached in L1 as well
21:07 karolherbst: or starting with kepler2 actually