00:02 karolherbst: imirkin: can 64 bit reg access be unaligned? Like can you write to $r1d or does it have to be $r0d?
00:02 karolherbst: I've got a bad feeling about that suredb instruction
00:05 karolherbst: imirkin: uhm... shouldn't the address be a 64 bit value?
00:11 karolherbst: ohh wait it is internal global image offset + 32 bit offset in the op
00:11 karolherbst: k
00:26 karolherbst: uhh okay, so imageAtomicMin has a different time depending on if the value changed or not
00:26 karolherbst: *execution
00:27 karolherbst: imirkin: I think the executions are just running out of sync
00:27 karolherbst: basically
00:29 karolherbst: I guess I can't just add joins without joinats?
00:29 karolherbst: and dealing with that on a CFG level?
00:43 karolherbst: imirkin: nvidia fails those tests
02:09 pabs3: pmoreau: any further thoughts re the GPU lockup issues with my GT21x (not bisectable)? happy to provide debug info if nouveau has an equivalent of intel-gpu-tools for GPU hang dumping
05:50 Onionnion: piece of shit
05:50 HdkR: What a great bot
12:32 docmax: hi
12:54 pmoreau: pabs3: You can always activate some debug output by setting nouveau.debug=debug or nouveau.debug=trace (see https://nouveau.freedesktop.org/wiki/KernelModuleParameters/); you could also do something like nouveau.debug="debug,PDISP=trace".
13:07 pabs3: thanks, I'll try that tomorrow
14:17 karolherbst: imirkin: okay, further analysis: the test relies on the value stored in the image being decreased by 1 per loop iteration, and on the old value != -iteration_count. So if you are at the 5th iteration and the thread changes the value from -4 to -5, it exits, if not (because a different thread already updated it) it tries again next iteration. So what we have is that sometimes (not deterministic) the value stored in
14:17 karolherbst: the image is decreased by 2, not 1 per loop iteration
14:17 karolherbst: now the question: does the test depend on undefined behaviour, or do we have to synchronize the threads somehow?
14:18 karolherbst: the problem is, if the value was reduced by 2 at least once, the loop will go forever, because old value < -iteration_count and hence the loop never ends
14:19 karolherbst: I would say the test is depending on undefined behaviour here and should have a different way trying to check if imageAtomicMin is indeed atomic
14:20 robclark: karolherbst, check what CTS and/or deqp does?
14:23 karolherbst: good idea
14:27 karolherbst: robclark: no loop
14:28 karolherbst: robclark: here is what we do currently: https://gist.github.com/karolherbst/743a498aaf4fd89dc06863a805d90b35#file-a-frag
14:28 robclark: I would tend to think hw vendors would make sure CTS didn't depend on unspecified behavior ;-)
14:28 karolherbst: and it doesn't look like a good idea
14:29 karolherbst: robclark: imageAtomicMin with write faster than others, because of syncing. Threads diverge so much that some threads are in iteration 2 some in 1
14:29 karolherbst: and some threads get the old value from a different thread already finished with iteration 2
14:30 karolherbst: so old = -1 and v = -2
14:30 karolherbst: but mhh
14:30 karolherbst: at some point it should be fine again
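For reference, a minimal Python sketch of the atomic-min semantics under discussion: the operation stores the minimum of the current and the new value and hands back the previous value. The `Texel` class and its lock are hypothetical stand-ins for a single image texel; this is not the piglit shader from the gist and says nothing about how the hardware schedules threads.

```python
import threading


class Texel:
    """Hypothetical stand-in for one integer image texel."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def atomic_min(self, v):
        # Store min(current, v) and return the value seen before the update.
        with self._lock:
            old = self.value
            self.value = min(old, v)
            return old


texel = Texel(-1)
first = texel.atomic_min(-2)   # sees the old value -1 and lowers the texel to -2
second = texel.atomic_min(-2)  # sees -2: another caller already wrote it
```

The first caller can tell it performed the update because the returned value differs from what it wrote; the second caller gets back the value it tried to write and so knows it lost the race.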
14:31 robclark: too bad we don't have full git history of CTS test to see if they already had this discussion
14:32 karolherbst: anyway, I guess with a lot of threads it might take a while to repair all the damage
14:34 karolherbst: robclark: but anyhow, we shouldn't have tests which can hang the machine just because something isn't implemented the right way
14:34 karolherbst: annoys people doing tests and devs
14:35 robclark: well, I mean the ideal answer is better hang recovery, since games/etc might do something like that which causes shader to infinite loop..
14:35 karolherbst: robclark: I am sure glsl doesn't guarantee that all locs are always executed in parallel or is this somewhat required?
14:36 robclark: you mean divergent threads in a warp? That is almost guaranteed that they don't execute in parallel
14:36 robclark: I think only exception are some recent nv gpu's
14:36 karolherbst: yeah but I mean if two threads follow the same path
14:37 karolherbst: robclark: recent nv gpus can do full preemption, maybe even selected threads
14:37 karolherbst: and fault recovery
14:37 robclark: in that case, I'd expect the atomic op to at least be dispatched in parallel.. but tbh I'm not sure what spec says about it
14:37 karolherbst: so more like more diverge
14:37 karolherbst: robclark: yeah, but this is kind of the point here. The test depends on this
14:39 karolherbst: I think
14:39 robclark: karolherbst, read what spec says, if it isn't clear see what ilia/ian have to say.. maybe send patch to piglit list to make it work similar to cts test ;-)
14:40 karolherbst: well first I want to have a decent explanation what is going wrong here
14:40 robclark: btw, check what happens w/ nv blob and this test?
14:42 karolherbst: blob fails all tests
14:43 karolherbst: but it seems like it just optimized the loop away
14:43 karolherbst: dunno, I would have to trace it
17:08 imirkin: karolherbst: hm, i wonder if the current code would allow for 2 threads to fight each other
17:08 imirkin: in a way that no one ever wins
17:13 imirkin: pmoreau, pabs3: it's "disp" now since kernel 4.3
17:13 imirkin: (rather than PDISP)
17:18 imirkin: karolherbst: fwiw the atomicity test passes for me on a GF108
17:52 ejsi: is there any documentation on how optimus is handled in windows?
17:57 ejsi: Is there a channel/mailing list/whatever specifically on nvidia driver reverse engineering?
17:58 imirkin: not that i'm aware of
18:00 ejsi: Alright, I'll ask here then:
18:01 ejsi: I'm working with a few other people on getting optimus cards working with pci passthrough on windows vms
18:02 ejsi: After implementing the _ROM call in an ssdt and putting the rom data in ovmf, the driver loads properly
18:03 ejsi: (And fairly close to baremetal performance, based on the passmark gpu compute benchmark I just ran)
18:04 ejsi: You don't get output to a physical display, but you can use rdp with remotefx
18:05 ejsi: But with
18:05 ejsi: Oops, accidental enter
18:08 ejsi: But the directx benchmarks, for example, fail with "invalid mode"
18:09 ejsi: And when I try to start up the nvidia control panel program, which normally gives an error when run on a display that's not connected to an nvidia gpu, just displays nothing when I run it from an rdp connection with all other gpu drivers disabled
18:10 ejsi: What's going wrong?
18:13 ejsi: (I can provide data from envytools etc. from an identically configured linux guest if there's anything in particular that would be useful)
18:15 imirkin: ejsi: well, i suspect that DX wants real modes to be settable
18:15 imirkin: if the GPU has no display / connectors
18:15 imirkin: then you can add a second fake-o video card into your VM
18:16 ejsi: I tried passing through a full intel igpu
18:16 ejsi: The driver loads
18:17 imirkin: you mean both the intel and nvidia gpus?
18:17 ejsi: Yeah they're both loaded
18:17 imirkin: and it didn't help?
18:18 ejsi: I have 3 video cards on this vm now
18:19 imirkin: yeah, i have no clue how windows drivers work
18:19 imirkin: (in general, or specifically wrt nvidia)
18:19 ejsi: Virgl as the primary (00:02.0?), upt mode intel hd 530 as secondary (00:18.0) and the gtx 960m as secondary (tertiary?) on 01:00.0
18:19 imirkin: i stopped using windows around win98, and haven't looked back ;)
18:20 imirkin: and the virgl gpu presumably exposes real modes?
18:20 ejsi: Yeah, it does
18:20 ejsi: And the remote output seems to use that somehow when the driver is loaded
18:21 ejsi: And it pretends the 960m doesn't exist
18:21 ejsi: The same thing with the intel gpu
18:21 imirkin: sorry, this is way out of my area of knowledge
18:21 imirkin: you need someone who knows how windows works
18:22 ejsi: I was hoping there'd be someone here who reverse engineered the nvidia optimus parts to reimplement bits of it on linux
18:22 imirkin: optimus is just the software integration
18:22 imirkin: there's no hw really
18:23 imirkin: the nvidia driver could behave differently with a mobile part
18:23 imirkin: but this is all just speculation
18:23 ejsi: I keep seeing the words "Optimus Copy Engine" being thrown around a lot in the optimus whitepaper
18:24 ejsi: No technical documentation ofc, in keeping with nvidia tradition
18:25 imirkin: presumably that's just the regular copy engine
18:25 imirkin: which is an interface to request that the gpu copy data around
18:25 imirkin: from place to place
18:26 imirkin: basically a built-in dma copy engine...
18:26 imirkin: helpful for optimus, as well as regular card operations like evicting memory from vram, putting it back, etc
18:28 ejsi: Got it, maybe I should look more at dx documentation first
18:28 imirkin: or wddm
18:29 ejsi: Yeah
18:29 ejsi: tyvm
19:10 karolherbst: imirkin: interesting
19:13 imirkin: and yes, depending on whether it returns a result or not, it might have a diff encoding
19:14 imirkin: on nvc0/gk110 iirc it's the same, just RZ at the end
19:14 imirkin: but on gm107 RED.asdf and ATOM.adsf have diff encodings
19:14 imirkin: (which was the cause of quite some bugs in the initial bringup)
19:55 plutoo: what does writing to FIRMWARE do?
19:55 plutoo: invoke the microcode blob?
20:02 imirkin: such writes cause interrupts to trigger and get handled by the grctx fw
20:03 imirkin: usually it's used to write to context-switched registers
20:03 imirkin: that aren't otherwise accessible
20:03 imirkin: via regular graph methods
20:08 plutoo: how do you know all this stuff
20:08 plutoo: i'm seeing addresses like 0x00418e40/0x00418e58/... being written to FIRMWARE
20:08 imirkin: try it and see? i dunno, i probably read it / it was explained / ...
20:08 imirkin: yeah, so those are pgraph registers
20:09 imirkin: the actual values are passed in via scratch usually
20:09 imirkin: and the method is similar to a mask method
20:09 plutoo: yeah that's exactly what i'm seeing
20:09 imirkin: i.e. read, mask, or, write
20:09 plutoo: macro_14f(0x00418e40, 7, 0xf);
20:09 plutoo: macro_14f(0x00418e58, 0x842, 0xffff);
20:09 plutoo: ... etc
20:09 plutoo: looks like arg0 is the register address, arg1 is value, arg2 is mask
20:10 imirkin: right. so that means "read register 418e40, zero out the low 4 bits, or in 7, write back to 418e40"
20:10 imirkin: the special trick with this is that these are context-switched registers
20:10 imirkin: so the writes have to be done as part of the command stream
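The read, mask, or, write sequence described above can be sketched in Python as follows. This only models the arithmetic: `masked_write` and the starting register contents are made up for illustration, and the real update is carried out by the firmware as part of the command stream, not by host code like this.

```python
def masked_write(old: int, value: int, mask: int) -> int:
    """Clear the bits selected by mask in old, then OR in value (32-bit)."""
    return ((old & ~mask) | value) & 0xFFFFFFFF


# Hypothetical starting contents; the real register values are unknown here.
regs = {0x00418E40: 0xDEADBEEF, 0x00418E58: 0x12345678}

# The two calls quoted in the log: (register address, value, mask).
for addr, value, mask in [(0x00418E40, 7, 0xF), (0x00418E58, 0x842, 0xFFFF)]:
    regs[addr] = masked_write(regs[addr], value, mask)
```

With these made-up inputs, the low 4 bits of the first register are replaced by 7 and the low 16 bits of the second by 0x842, while the upper bits stay untouched.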
20:12 plutoo: are the registers at 0x00418000 documented somewhere?
20:13 imirkin: rnndb. those specific ones aren't.
20:13 imirkin: $ ~/src/envytools/rnn/lookup -a 120 418e40
20:13 imirkin: PGRAPH.GPC_BROADCAST.UNKE00+0x40
20:25 < }8]> hey fellas. im trying to start xorg and get " Failed to load module "nouveau" (module does not exist, 0)", but i have nouveau compiled (as module, kernel 4.15.14) and actually already loaded: nouveau 1449984 0
20:25 < }8]> any ideas whats going wrong?
20:26 < }8]> n/m, i needed xserver-xorg-video-nouveau
20:29 < }8]> is xserver-xorg-video-fbdev required for nouveau?
21:14 orbea: }8]: is it blacklisted?
21:15 orbea: is there a problem with your xorg.conf? (try removing it)
21:44 imirkin: if you still need help, pastebin dmesg and xorg logs
21:46 karolherbst: wow, I like those kind of rewrites: 48396 -> 48485 Pass rate
21:51 karolherbst: just piglit silliness though
22:01 imirkin: ?
22:08 karolherbst: imirkin: sometimes tests get disabled because others crashed/failed
22:13 imirkin: ah
22:28 pmoreau: imirkin: Ah, good to know. We should update the wiki some day.