00:13Lyude: (also, hooray! adding that appears to have completely fixed it)
00:36skeggsb: Lyude: i think my only concern is not being able to reboot after something goes horribly wrong
00:37skeggsb: as long as "echo b > /proc/sysrq-trigger" still works though...
00:37airlied: that one bypasses shutdown
00:38skeggsb: thought it probably would
00:38skeggsb: airlied: do you anticipate any unexpected side-effects from doing what Lyude is suggesting?
00:41airlied: skeggsb: totally, but I've no idea what they will be, at least with amdgpu getting a working shutdown was a bit of a pain
00:41airlied: since userspace isn't frozen or dead when you hit the callback
00:41Lyude: airlied: oh I already did the work for that!
00:42Lyude: i actually reworked nouveau's init/deinit so that it uses the new drm helpers for unregistering the device from userspace before shutting it down so userspace has no chance to break it
00:43Lyude: (drm_dev_alloc(), drm_dev_register(), etc. etc...)
00:58Lyude: anyway, I will send out the patches in just a moment
01:31Lyude: airlied: btw, something I forgot to mention as well: it was probably a bit more difficult for amdgpu to get ->shutdown() working since they're not fully converted to the new DRM probing helpers yet (they use them, but they still have a ->load() callback which means they're still prone to racing)
15:17rhyskidd: karolherbst: do you want me to merge your gk104 compute method doc patch into envytools?
15:20karolherbst: rhyskidd: yeah, go ahead. There will be more though
15:22karolherbst: well, if somebody is really bored, one could add everything from the open-gpu-docs :D
15:22rhyskidd: hehe :D
15:22rhyskidd: i've been adding bits and pieces as relevant to my Pascal boards
18:09pqatsi: Hello guys! Is there anything new about the GP108M? I get this dmesg if I don't use nouveau.modeset=0: https://paste.fedoraproject.org/paste/uhWgBGxuOJfmu02wlDv-og
18:14rhyskidd: pqatsi: there are two major classes of known issues with Pascal (GP10x) cards on laptops
18:14rhyskidd: can you try with nouveau.runpm=0 as a kernel cmdline config?
18:14pqatsi: rhyskidd: yeap
18:45pqatsi: [root@manauara ~]# ( echo -e 'dmesg\n'; dmesg; echo -e 'cmdline\n' ; cat /proc/cmdline) | fpaste -t "Nouveau with runpm=0"
18:45pqatsi: Uploading (251.8KiB)...
18:46pqatsi: System does not crash after gdm login anymore, but there is still a glitch after login, before gnome loads. With runpm=0, login takes longer
18:47rhyskidd: hrmm -- can you also check that you have the latest linux-firmware files for gp108?
18:47pqatsi: When I run " DRI_PRIME=1 glxinfo | egrep '(vendor|render)' ", dmesg goes mad, really mad
18:48pqatsi: rhyskidd: how can I check?
18:48rhyskidd: should be in /lib/firmware/nvidia/gp108/
18:48pqatsi: [leonardo@manauara ~]$ DRI_PRIME=1 glxinfo | egrep '(vendor|render)'
18:48pqatsi: nvc0_screen_create:906 - Error allocating PGRAPH context for M2MF: -16
18:48pqatsi: libGL error: failed to create dri screen
18:48pqatsi: libGL error: failed to load driver: nouveau
18:48pqatsi: And it's still running here, dmesg -ew went mad
18:48pqatsi: rhyskidd: [root@manauara ~]# tree /lib/firmware/nvidia/gp108/ | fpaste
18:49pqatsi: Uploading (1.2KiB)...
18:49rhyskidd: your latest dmesg log is missing the start, but it reports having trouble bringing up the gr engine on your gp108
18:50pqatsi: let me get the boot log
18:50rhyskidd: yup, okay looks all to be there. there was an early version of the gp10x firmware that had errors as supplied by nv, but most major distros now ship the fixed versions
18:51pqatsi: rhyskidd: [leonardo@manauara ~]$ journalctl --boot=0 -a | fpaste
18:51pqatsi: WARNING: your paste size (1165.5KiB) is very large and may be rejected by the server. A pastebin is NOT a file hosting service!
18:51pqatsi: Uploading (1165.5KiB)...
18:51pqatsi: rhyskidd: may I send you the sha1 to check?
18:52pqatsi: [root@manauara ~]# find /lib/firmware/nvidia/gp108/ -type f -print0 | xargs -0 sha1sum | fpaste
18:52pqatsi: Uploading (1.8KiB)...
18:54rhyskidd: sha1's match
18:55rhyskidd: hrmm, i recall we had another gp108 user come on here a little while back with similar issues
18:55rhyskidd: i presume the acpi_* cmdline config are needed to boot linux?
18:55pqatsi: rhyskidd: no, it's just an improvement, but not needed to boot
18:55rhyskidd: i know on my xps9560 i have to use some to ensure gp107 comes up right
18:55pqatsi: I can change everything here. The only thing I can't use now is the nvidia card
18:56pqatsi: rhyskidd: what do you use?
18:56pqatsi: Mine is an Inspiron 7000 line
18:56rhyskidd: i'm always a bit cautious with recommending these, as they tend to float around the internet -- being tried randomly without really getting to the low-level issue
18:56rhyskidd: but it shouldn't hurt trying it
18:57rhyskidd: what are you trying to use the nvidia card for?
18:58pqatsi: rhyskidd: little games and maybe some webgl
18:58pqatsi: and video decoding
18:58rhyskidd: ok, pascal series doesn't have reclocking (only boots at slowest speeds) so your experience with anything much beyond desktop use won't be great
18:59rhyskidd: video decoding almost certainly won't work on the open source driver on pascal
18:59pqatsi: rhyskidd: :(
19:00rhyskidd: caused by nvidia's decision not to ship redistributable firmware files that we require to run the hardware you bought
19:00pqatsi: rhyskidd: is there no way to extract it?
19:00pqatsi: you can't redistribute, but maybe you can extract
19:01rhyskidd: theoretically you can, but without being able to distribute -- upstream support will be limited
19:04pqatsi: rhyskidd: is there something in git that does this?
19:05pqatsi: Would something like https://nouveau.freedesktop.org/wiki/NVC0_Firmware/ work?
19:08karolherbst: imirkin: do you know if the trap handler we had worked at some point? hakzam hid it behind NOUVEAU_NVE4_MP_TRAP_HANDLER in 2016, but I am questioning if it worked back then
19:08karolherbst: I am sure _something_ was missing
19:14RSpliet: pqatsi: likely they changed the interfaces for that generation of cards, so it requires RE'ing beyond just extracting firmwares to make it work
19:43pqatsi: RSpliet: :(
19:55karolherbst: nice. RSpliet now I got code disassembling working with nvidia-uvm :)
19:56karolherbst: apparently whenever we get a COMPUTE.FLUSH = CODE, we can just disassemble whatever the last writes into COMPUTE.DATA were
20:10karolherbst: RSpliet: it seems like on maxwell two nops (00070f00 50b00000) + st 0x0 scheds are enough to wait long enough for all kinds of register writes
20:10karolherbst: (looking at the trap handler from nvidia)
20:11karolherbst: so they basically do the two nops + storing $r0q and $r4q into l
20:51karolherbst: skeggsb: sooo, something is weird. When I run nvidia, I see no fault handling inside the mmiotrace, but I am sure the shader actually accesses invalid memory and crashes
20:52karolherbst: any idea what we need to change to achieve something similar?
21:22pmoreau: karolherbst: Duh: I was reading your last sentences and thinking “wouldn’t fault handling be done through MMIO rather than MM?” and then realised you did write mmiotrace, not mmt. --"
21:24pmoreau: Anyway, great work karolherbst and mslusarz on getting mmt working again! \o/
21:26sigod: karolherbst, any news on the gtx660 firmware context switching bug? I'd look myself, I'm just not sure where
22:19rhyskidd: mwk: any further comments before I merge this https://github.com/envytools/envytools/pull/153 ?
23:02mwk: rhyskidd: one nit
23:40imirkin: karolherbst: it never worked
23:52karolherbst: imirkin: I expected as much
23:52karolherbst: imirkin: anyway, I think I know what value to put inside the TRAP_HANDLER method, but... I don't see what we have to enable besides that
23:54karolherbst: imirkin: I thought that doing a "bpt trap" or something might just work, but not even that
23:57imirkin: so ... remember that you have to write the trap handler via firmware
23:57imirkin: since it's a context-switched reg
23:57imirkin: oh, we have to enable traps
23:57imirkin: i think we disable them
23:57imirkin: there's some bit somewhere
23:58karolherbst: there is a method on the compute class
23:58imirkin: maybe. must be new if there is.
23:58karolherbst: <reg32 offset="0x260c" name="TRAP_HANDLER"/>
23:59karolherbst: since kepler
23:59karolherbst: "#define NVA0C0_SET_TRAP_HANDLER 0x260c" from the open-gpu-doc
23:59karolherbst: and nvidia writes an address into it where something looking like a trap handler is located
23:59imirkin: these are the notes i have from a long time ago
23:59tribals: Hi, folks!
23:59imirkin: for fermi