00:04 imirkin: nouveau_mm.c:76: mm_slab_free: Assertion `i < slab->count' failed.
00:04 imirkin: with Matinee Fight Scene (UE4 demo)
00:04 imirkin: grrr.
00:08 Lyude: 125→~106W now on full load with kepler
00:08 Lyude: this is neat
00:13 imirkin: skeggsb_: ok, so the recovery really sucks
00:14 imirkin: it kills the channel, the process goes into a spin loop, and the display dies
00:14 imirkin: killing the process from another machine restores things though
00:14 imirkin: but that step shouldn't be necessary
00:15 imirkin: skeggsb_: https://hastebin.com/ihahoquked.css
00:15 imirkin: so that all happens pretty quickly. but after that, the Atlantis Demo was in a spin loop taking up 100% cpu
00:15 imirkin: and i didn't even have hw cursor on the screen
00:25 skeggsb_: imirkin: yeah... as i mentioned, the kernel (for the most part) does its job correctly, userspace needs to react appropriately too to make it "seamless"
00:25 skeggsb_: but yes, we *could* potentially kill the process automatically.. not sure that'd be considered good form though, and, we *can* just do the plumbing
00:26 imirkin: well .. we'd have to kill the right process too
00:26 imirkin: e.g. in this case it's not X that'd want killing
00:26 imirkin: but rather the ... "beneficial owner" of the fd :)
00:26 skeggsb_: yeah, that's not idea either :P
00:26 skeggsb_: ideal*
00:27 imirkin: anyways... mostly voicing my complaint that the current situation stinks
00:27 airlied: just invalidate all the mappings on the fd and hope it hits SIGBUS :-P
00:27 imirkin: and iirc it's gotten worse since some of the recovery stuff got added
00:28 imirkin: (since now it pretty much auto-hangs-everything, while before at least sometimes things would keep working)
00:28 skeggsb_: imirkin: i'm honestly not sure how that behaviour could have changed.. either way, the channel wouldn't have made progress before/after, so, it shouldn't be any different
00:29 imirkin: well, X didn't hang before
00:29 imirkin: and i could, many times, kill the process locally
00:29 imirkin: whereas now it feels like i *have* to go to another box
00:31 imirkin: but if i *do* have another box, it seems like recovery is more likely
00:31 imirkin: so i guess you gotta take the bad with the good :)
00:31 imirkin: i guess next time i should see where it's spinning
00:31 imirkin: and see if nouveau kernel module can't kill it based on that
00:32 skeggsb_: i was about to ask if you could actually, that'd be interesting to know
00:33 imirkin: in case it's relevant, note that i live in a no-frills environment ... no compositor, etc.
00:33 imirkin: i do have DRI3 though
00:48 imirkin: should see why nv3x + xvmc never ended up working.... it should have, in principle...
02:35 gnurou_: mangix: that commit indeed looks like it could fix issues on large-memory boards
02:35 gnurou_: mangix: so can I assume your issue is solved now?
03:17 imirkin: hm, that's inconvenient - no way to nuke something from a bin
03:17 Echelon9: so having some progress bringing my GP107 up: https://bugs.freedesktop.org/show_bug.cgi?id=100228
03:18 imirkin: sounds similar to what mangix was seeing
03:18 Echelon9: I know this is a distribution question: but what is the experience of cleanly *uninstalling* the Ubuntu shipping nvidia blob drivers, if i was to temporarily install them to get a mmio trace?
03:19 Echelon9: If i loaded the blob, i'd like to be able to revert to a clean nouveau as is, without having to rebuild this system ...
03:19 imirkin: Echelon9: can you confirm you have the *updated* gp107 firmware data?
03:20 Echelon9: can you provide the link to upstream gp10y firmware and I'll SHA against what I have here?
03:20 imirkin: https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/nvidia/gp107
03:20 imirkin: https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/nvidia/gp107?id=7c7785c3fca909d876e506e337eaa771a8f4dcbc
03:23 Echelon9: yes, I have the updated gp107 firmware
03:27 imirkin: ok. just checking :)
03:28 Echelon9: nvapeek 0x00101000 also stalls the system -- so I can't get the strap info, but I do have a vbios
03:42 imirkin: skeggsb_: ran out of time. this is how far i got: https://hastebin.com/iqaqiqunuf.cpp
03:42 imirkin: not even compile-tested. the bufctx bin thing was a mistake, forgot to undo it.
03:43 imirkin: missing is (a) the logic to take those list_head's and dump them into the pushbuf list in the kick handler
03:44 imirkin: and (b) compiler changes to deal with ... stuff that i discover needs fixing :)
03:44 imirkin: probably another hour's work another day
04:18 Horizon_Brave: hey everyone
06:36 hakzsam: imirkin_: why do you want it to be non-const? it's a template
06:36 hakzsam: you have to keep track of the handles directly in the driver (they are not stored at st/mesa level)
06:37 hakzsam: radeonsi uses a hash table internally for that, which stores a si_texture_handle_object for each texture handle
06:37 hakzsam: *for each 64-bit value
06:38 gnarface: is there a channel like this for the linux intel/mesa driver?
06:39 gnarface: or do they just hide behind closed doors?
06:39 hakzsam: dri-devel or intel-gfx
06:41 gnarface: thanks hakzsam
07:03 pmoreau: imirkin_: On my end there is only https://patchwork.freedesktop.org/patch/161333/ that would require some attention.
08:25 karolherbst: imirkin_: you could take a final look at my gallium_precise branch and maybe even push it if you think it's fine as it is: https://github.com/karolherbst/mesa/commits/gallium_precise
08:26 karolherbst: imirkin_: I've done a piglit and shader-db run and it looked all fine and solid
08:28 karolherbst: skeggsb_: for me piglit sometimes hangs on those textureGatherer tests if run in parallel
08:28 karolherbst: Lyude: nice!
11:12 Celelibi: Is there a way to add a favorite location by hand?
11:14 karolherbst: Celelibi: what do you mean?
11:14 Celelibi: I make navit crash every time I try to do it the regular way.
11:18 Celelibi: So I thought I would add manually the ones I want to use.
11:28 karolherbst: Celelibi: what's navit?
11:28 Celelibi: The chan just above. Sorry. :)
11:29 karolherbst: Celelibi: I have no idea what you are talking about. If you are having a bug with an application caused by the nouveau driver, then you should open a bug and provide information, but maybe you already did this
12:12 vpelletier: hi imirkin_ , skeggsb_
12:12 vpelletier: catching up with the logs
12:40 vpelletier: imirkin_: $ sudo ./nvapeek 0x880068
12:40 vpelletier: ...
12:40 vpelletier: that's all that prints (the 3 dots)
12:40 vpelletier: this is with MSI disabled though
13:25 imirkin_: vpelletier: 88068 iirc
13:26 imirkin_: vpelletier: and yes, it'd be good for MSI to be on
13:26 imirkin_: to see what's in that reg...
13:26 imirkin_: skeggsb_: had another fan fail last night... going to see what set of patches i had applied, might not have had your latest
13:26 vpelletier: I tried again with msi on, and same output
13:26 imirkin_: vpelletier: with 88068 or 880068?
13:27 vpelletier: oh, missed that extra 0
13:30 vpelletier: 00088068: 00817805
13:30 imirkin_: does nvapeek take a size?
13:30 vpelletier: "-b 1" causes "..." only again
13:31 imirkin_: right, ok. it should be 0xff
13:31 imirkin_: ... = 0 btw
13:31 imirkin_: it's a quirk of how some other stuff works which causes the ... to be displayed there
13:31 vpelletier: skipping over "cannot read" which appears as zeros, I guess
13:32 vpelletier: so here, 88068 is the address of the MSB (...which would be consistent with endianness), which happens to be zero
13:32 vpelletier: sudo ./git/envytools/nva/nvapeek -b 1 0x88068 4 -> 00088068: 00 81 78 05
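A minimal sketch of the byte-order point above, in Python. It uses only the values quoted in the log (the 32-bit read of 0x88068 printing 00817805, and the byte dump 00 81 78 05); the variable names are made up for illustration.

```python
import struct

# Value nvapeek printed for the 32-bit read at 0x88068 (taken from the log).
word = 0x00817805

# Most-significant byte first: this matches the "-b 1" dump "00 81 78 05".
big_endian = struct.pack(">I", word)
# Least-significant byte first, for comparison.
little_endian = struct.pack("<I", word)

print(big_endian.hex(" "))     # 00 81 78 05
print(little_endian.hex(" "))  # 05 78 81 00

# With the big-endian layout, the byte at the lowest address (0x88068)
# is the MSB, which is 0x00 here.
first_byte = big_endian[0]
```

So a single-byte read at 0x88068 seeing zero is consistent with the four-byte dump, as vpelletier guessed.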
13:34 vpelletier: nvapoke -b 1 0x88068 0xff ?
13:43 imirkin_: i guess
13:44 imirkin_: ok, so the pci space doesn't get the auto-endian-adjust logic
13:44 imirkin_: or something
13:44 imirkin_: either way, the code does rd08/wr08
13:44 imirkin_: but yeah, set it to 0xff and see if you get another interrupt
13:50 vpelletier: no luck, it seems
13:52 imirkin_: can you read it back?
13:52 vpelletier: as 00
13:52 imirkin_: but no additional interrupt?
13:52 vpelletier: no, still only one
13:52 vpelletier: I tried writing 3 times, no change
13:52 imirkin_: what if you do like...
13:53 imirkin_: nvapoke 88068 ff817805
13:53 vpelletier: I was tempted indeed :)
13:54 vpelletier: no luck, still 1 interrupt
13:55 vpelletier: mmh
13:55 vpelletier: nouveau did revert to software console when emitting that error message
13:55 vpelletier: would the card still try to emit interrupts ?
13:55 imirkin_: yes, that's by design
13:55 imirkin_: wellllll
13:55 imirkin_: hm
13:55 imirkin_: maybe not
13:56 imirkin_: did the value of 0xff stick this time though?
13:56 vpelletier: I did see fan & temperature stuff when I booted with NvMSI=0 and debug level high
13:56 vpelletier: no, still reads as 00
13:58 vpelletier: Spurious interrupts does not change either
13:58 vpelletier: I remember someone telling there is a hardware bug swapping endianness
13:59 imirkin_: yeah, benh referred to it
13:59 vpelletier: I kind of remember it being the address endianness, but I do not remember clearly enough
14:00 vpelletier: and I'm not too comfortable writing to 68800800 "just in case"
14:00 kwizart: hello, is there a way to detect an nvidia card using udev without relying on any driver? using the pci vendor isn't enough as it would match non-vga devices
14:01 imirkin_: vpelletier: no, that's the address... that has to work
14:01 aaronp: If you're scanning PCI, you'll want to look at the device class as well.
14:01 aaronp: But then, what are you going to do with it without a driver? :)
14:01 imirkin_: kwizart: pci class 0300 and 0301 iirc
14:02 Celelibi: karolherbst: Nah, I just misclicked the channel name. Because #navit is just above #nouveau.
14:02 aaronp: Isn't 301 PCI_SUBCLASS_DISPLAY_XGA?
14:02 imirkin_: er no. 0302 it seems.
14:03 karolherbst: Celelibi: ohh, I see
14:03 kwizart: imirkin_, seems like there is 3020 here indeed
14:03 imirkin_: and vendor id = 10de
14:03 imirkin_: it's "VGA Compatible Controller" vs "3D Controller"
14:03 imirkin_: the main difference is that the 3D ones *tend* not to have outputs, although that's not always the case
14:03 imirkin_: just ... more frequently the case.
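The matching rule described above (vendor id 10de plus PCI class 0300 or 0302) can be sketched in Python by reading sysfs directly rather than through udev. This is an illustration only: the function names are hypothetical, and it assumes the usual sysfs layout where each device directory has `vendor` and `class` files containing strings like `0x10de` and `0x030000`.

```python
from pathlib import Path

NVIDIA_VENDOR = 0x10DE
# PCI class codes from the discussion: 0x0300 is "VGA compatible controller",
# 0x0302 is "3D controller".
DISPLAY_CLASSES = {0x0300, 0x0302}


def is_nvidia_gpu(vendor: str, pci_class: str) -> bool:
    """vendor and pci_class are sysfs strings such as '0x10de' and '0x030000'."""
    vendor_id = int(vendor, 16)
    # sysfs reports 24 bits (base class, subclass, prog-if); drop the prog-if.
    class_code = int(pci_class, 16) >> 8
    return vendor_id == NVIDIA_VENDOR and class_code in DISPLAY_CLASSES


def find_nvidia_gpus(root: str = "/sys/bus/pci/devices"):
    """Return PCI addresses of matching devices, or [] if root is missing."""
    found = []
    base = Path(root)
    if not base.is_dir():
        return found
    for dev in sorted(base.iterdir()):
        try:
            vendor = (dev / "vendor").read_text().strip()
            pci_class = (dev / "class").read_text().strip()
        except OSError:
            continue  # skip entries we cannot read
        if is_nvidia_gpu(vendor, pci_class):
            found.append(dev.name)
    return found


if __name__ == "__main__":
    print(find_nvidia_gpus())
```

Checking the class filters out the non-display NVIDIA functions (an HDMI audio function, for example) that a vendor-only match would pick up.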
14:03 kwizart: aaronp, trying to run a systemd fallback script to modprobe nouveau if the nvidia driver isn't here for any reason
14:04 vpelletier: imirkin_: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/sysdev/mpic_u3msi.c#n80
14:04 imirkin_: ooh, reading the code for the msi stuff. interesting idea.
14:05 imirkin_: never occurred to me, that one
14:05 aaronp: kwizart, ah, that makes sense.
14:05 vpelletier: imirkin_: problem, I do not understand what I'm reading
14:05 imirkin_: vpelletier: and do you have a u4-pcie device in your of?
14:05 imirkin_: vpelletier: iirc you had a u3msi situation?
14:06 vpelletier: 32: 1 0 0 0 MPIC-U3MSI 8 Edge nvkm
14:06 imirkin_: if (type == PCI_CAP_ID_MSIX)
14:06 imirkin_: pr_debug("u3msi: MSI-X untested, trying anyway.\n");
14:06 imirkin_: that sounds nice.
14:06 vpelletier: at least the controller (?) displayed in /proc/interrupts is the one from this driver
14:06 imirkin_: vpelletier: right. can you dump your OF somewhere?
14:06 imirkin_: iirc it's in /proc/of or something
14:07 imirkin_: just want the node names, not contents
14:07 imirkin_: hm. a lot of this stuff is behind pr_debug -- can you figure out how to get those prints? i think you can enable it dynamically somehow.
14:08 imirkin_: u3msi: allocated virq 0x%x (hw 0x%x) addr 0x%lx\
14:08 imirkin_: do you see that print?
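On the "enable it dynamically somehow" point: one way is the kernel's dynamic debug facility. The sketch below assumes the kernel was built with CONFIG_DYNAMIC_DEBUG and that debugfs is mounted at /sys/kernel/debug; the helper names are hypothetical. It builds a query that turns on the pr_debug lines in mpic_u3msi.c and writes it to the control file (this needs root).

```python
def dyndbg_query(source_file: str, enable: bool = True) -> str:
    """Build a dynamic-debug query that toggles printing for one source file."""
    flag = "+p" if enable else "-p"
    return f"file {source_file} {flag}"


def apply_query(query: str,
                control: str = "/sys/kernel/debug/dynamic_debug/control") -> None:
    """Write the query to the dynamic debug control file (requires root)."""
    with open(control, "w") as f:
        f.write(query + "\n")


query = dyndbg_query("mpic_u3msi.c")
print(query)  # file mpic_u3msi.c +p
# apply_query(query)  # uncomment on a real system, as root
```

Messages printed before the query is applied (at boot, say) are not shown retroactively, so the MSI would have to be set up again afterwards for the print to appear.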
14:08 vpelletier: https://pastebin.com/raXYcqRi
14:08 imirkin_: hmmm.... ok. so the compatible string must be somewhere deeper.
14:08 vpelletier: [ 5.606224] u3msi: allocated virq 0x20 (hw 0x8) addr 0xf8004080
14:09 imirkin_: return 0xf8004000 | (hwirq << 4)
14:09 imirkin_: ok, so that matches
14:09 vpelletier: this one looks like the one for nvkm, there are 2 others for eth cards
14:09 vpelletier: "virq 0x21 (hw 0x9) addr 0xfee00000" and "virq 0x22 (hw 0xd) addr 0xfee00000"
14:09 imirkin_: hmmmmmmmmmmmmm
14:09 imirkin_: that's not good
14:10 imirkin_: those addresses don't conform
14:10 vpelletier: eth0 has a lot of irq serviced, eth1 has 2 (no cable)
14:10 imirkin_: those addresses are 0xfee....
14:10 vpelletier: they somehow do work
14:10 imirkin_: while the addr of the nvidia thing is the 0xf8004000 thing
14:10 vpelletier: yep
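The address check above is simple arithmetic, sketched here in Python. The formula and the hwirq/address values are the ones quoted in the log; the function name is made up for illustration.

```python
def u3msi_addr(hwirq: int) -> int:
    # Formula quoted in the log: 0xf8004000 | (hwirq << 4)
    return 0xF8004000 | (hwirq << 4)


# nvkm: hw 0x8, logged address 0xf8004080 (matches the formula).
nvkm_addr = u3msi_addr(0x8)
print(hex(nvkm_addr))  # 0xf8004080

# The two ethernet MSIs (hw 0x9 and 0xd) were logged with addr 0xfee00000,
# which this formula would not produce for them.
eth_addrs = [u3msi_addr(hw) for hw in (0x9, 0xD)]
print([hex(a) for a in eth_addrs])  # ['0xf8004090', '0xf80040d0']
```

That is the mismatch being pointed out: the NVIDIA card's address follows the formula, while the ethernet ones come from a different address scheme.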
14:11 kwizart: aaronp, FYI this is the current implementation : https://bugzilla.rpmfusion.org/show_bug.cgi?id=4498#c16
14:11 imirkin_: i'm guessing one is the ht_magic_addr while the other is the u4_magic_addr
14:11 vpelletier: 0001:05:04.0 Ethernet controller: Broadcom Limited NetXtreme BCM5780 Gigabit Ethernet (rev 03)
14:11 vpelletier: 0000:0a:00.0 VGA compatible controller: NVIDIA Corporation G70 [GeForce 7800 GT] (rev a1)
14:11 imirkin_: vpelletier: could i get a lspci -vvvvvnn output for the whole system?
14:11 vpelletier: they are not on the same controller (bridge ?) though
14:11 vpelletier: I think I pasted one before, let me check
14:12 vpelletier: https://pastebin.com/raw/tgwgqaPQ
14:12 imirkin_: yeah well i've long forgotten it
14:12 vpelletier: here it is, with lspci -t at the end
14:12 imirkin_: that's with NvMSI=0 ?
14:12 vpelletier: no idea
14:12 vpelletier: let's dump again
14:12 imirkin_: ;)
14:13 aaronp: kwizart, I wonder if you can tweak the module load order somehow so that nvidia.ko is loaded before nouveau.ko, if it exists.
14:13 vpelletier: https://pastebin.com/bwDXVVwV
14:14 imirkin_: ok, so that was with msi disabled previously
14:14 vpelletier: "MSI: Enable+" on the nvidia card this time, so previous one was likely NvMSI=0
14:14 vpelletier: gah, beat me to it :)
14:14 imirkin_: ahhh... HT = hypertransport
14:14 imirkin_: which is probably just for the "cool" devices
14:14 imirkin_: aha, BUT! the ethernet is not on hypertransport
14:15 vpelletier: https://pastebin.com/pvp4Xym5 dmesg | grep HT
14:16 imirkin_: this is where benh comes in useful.
14:17 vpelletier: *chirp*
14:18 vpelletier: something I do not get at all about this is whether "Apple Inc. CPC945 PCIe Bridge" not having MSI enabled would prevent underlying cards from having MSI
14:19 imirkin_: answering that question would require understanding how MSI works
14:19 imirkin_: and interrupt delivery
14:20 vpelletier: also, I do not understand why MSIs get assigned to hardware irqs (as I understand the "u3msi: allocated virq" lines, all cards have a hardware IRQ plus an MSI...)
14:20 imirkin_: MSI is an alternate way of delivering interrupts
14:20 imirkin_: you give the card some address
14:21 imirkin_: and it writes to that address to signify there's an interrupt
14:21 vpelletier: my extremely superficial understanding of MSI is that it saves copper lines (and inputs on the interrupt controller)
14:21 imirkin_: or ... something.
14:21 imirkin_: that allows it to have an arbitrary quantity of different interrupt lines if necessary
14:21 imirkin_: without the physical wiring for those
14:21 vpelletier: yes, and it fires an IRQ line it shares with other devices to "ring the door"
14:21 imirkin_: but the knock on the door isn't a binary 1/0, but rather an address
14:22 imirkin_: so the receiver of the knock knows who's come a-knockin'
14:22 vpelletier: and then can check the address to understand what it tried to signal with that knock
14:22 vpelletier: mmh
14:23 vpelletier: then maybe there is still an irreducible need for one irq wire per MSI source, just allowing 2**32 interrupt reasons or something
14:23 imirkin_: well, you create these virtual interrupts
14:23 imirkin_: so you know which subdevice needs servicing
14:31 vpelletier: $ cat pci@0\,f0000000/compatible -> u4-pcie
14:31 vpelletier: (going back a bit in the discussion)
14:39 vpelletier: https://pastebin.com/0BT57dCq
14:40 vpelletier: openfirmware nodes for the card and its bus
14:46 vpelletier: "As an example, PCI Express does not have separate interrupt pins at all; instead, it uses special in-band messages to allow pin assertion or deassertion to be emulated."
14:47 vpelletier: ( https://en.wikipedia.org/wiki/Message_Signaled_Interrupts )
15:36 azaki: i'm looking at the nouveau feature matrix which says power management is "work in progress" for various cards, but the page was last edited in october of last year, so i'm wondering how things have progressed since? I'm mostly interested in NVC0 (Fermi)
15:38 karolherbst: azaki: for fermi not so much. But Lyude is doing some work regarding clock gating, which can reduce power consumption for Fermi cards as well, but I think her work is mainly focused on Kepler right now
15:38 Tom^: Lyude: if you need some one on a kepler to test something im here.
15:39 Tom^: Lyude: im on my 4 week vacation, forced to run on nouveau, and bored. =D
15:39 karolherbst: "forced"
15:39 Tom^: yes, literally.
15:39 Tom^: because blob introduces hw breakage!
15:39 Tom^: =D
15:39 karolherbst: create a bug :D
15:40 karolherbst: "GPU doesn't work properly with Nvidia GPU. No issues on Nouveau (tm)"
15:40 azaki: karolherbst: is fermi at least usable for day to day stuff? right now i'm using an old radeon hd 4850 and i think it's starting to fail, i have a very lightly used fermi card (GT630 2GB) which came with an asus OEM machine that i figure i could use in the meantime, since i can't afford a new GPU right now
15:40 Tom^: its quite funny actually
15:40 orbea: its fun watching nouveau work better with older cards than the blob does :)
15:40 karolherbst: Tom^: and especially state, that the performance on nouveau is usually enough, but there is feature X you would like to use again
15:40 karolherbst: azaki: lowest clocks only
15:41 Tom^: but given how the powermizer settings behave and how things happen, i'm fairly sure i've got a wonky temp sensor, so it throttles badly and then can't keep up with redrawing my display at 144hz and shit goes weird.
15:41 karolherbst: azaki: but usually it just works, just a matter of how much you need performance
15:41 Tom^: and since nouveau doesnt care about temp. im all good :D
15:41 karolherbst: :D
15:41 karolherbst: Tom^: you could monitor the temperature
15:42 karolherbst: would be interesting to know what goes wrong
15:42 Tom^: the problem is its a bit annoying to trigger it, takes a 30 - 40 minutes of heavy load.
15:42 karolherbst: Tom^: nvaforcetemp 95
15:42 Tom^: but its still game breaking if you are in a cs:go game and then suddenly cant see anything.
15:43 Tom^: karolherbst: yea but incase it isnt temp?
15:43 Tom^: karolherbst: :P
15:43 karolherbst: maybe we need to add options for this later: "config=NvLeaveMeAloneIKnowWhatIDoMode=2"
15:44 karolherbst: Tom^: gputest furmark
15:44 karolherbst: I'll give your machine 5 minutes
15:46 azaki: karolherbst: thanks. i mostly need it to be stable; performance would be nice but not a huge necessity i guess.
15:46 karolherbst: well try it out then. Nouveau currently has a few issues with applications having multiple OpenGL contexts at the same time
15:46 karolherbst: there is work for this already, no idea when it's ready though
15:46 azaki: hopefully it can last long enough for the gpu shortage to clear out.. right now miners are apparently buying them by the bucketload =\
15:47 karolherbst: yeah and will sell them for nothing in one year
15:48 azaki: yeah, so on top of me not being able to afford it right now, it's also just a bad time to buy a gpu right now. =c
15:48 Tom^: karolherbst: http://i.imgur.com/Mcj7084.jpg time for blob then.
15:48 karolherbst: buy Tom^s gtx 780ti, he will sell his for 50, cause its broken :p
15:48 karolherbst: Tom^: non benchmark mode on blob
15:48 Tom^: i will when the rx580 comes out
15:48 karolherbst: it draws soooo much power
15:49 karolherbst: Tom^: or give it to mupuf
15:49 Tom^: yea i probably will
15:49 karolherbst: we don't have a nvf1, only nvf0
15:49 karolherbst: afaik
15:49 Tom^: yeah, stores claim it arrives 2nd august
15:49 Tom^: so around then
15:49 Tom^: sad thing is tho that means no more nouveau ;_;
15:49 azaki: Tom^: do you mean when it gets back in stock? or do you live somewhere where it's not out yet?
15:49 Tom^: azaki: yea back in stock
15:50 azaki: ah
15:50 Tom^: azaki: its pretty much out of stock worldwide
15:50 karolherbst: Tom^: traitor
15:50 Tom^: karolherbst: ;_;
15:50 karolherbst: I have a special list for people like you
15:50 karolherbst: ranked by "degree of betrayal"
15:50 Tom^: a horrible list
15:51 karolherbst: the best
15:51 karolherbst: there are now 2 people on that list, you being 2nd
15:51 Tom^: =D
15:52 azaki: i'm probably on the 'other' list for considering doing the reverse, going from radeon to nouveau with this fermi card =p
15:52 Tom^: azaki: the good list.
15:52 Tom^: anyhow time for blob then.
15:52 karolherbst: there is no good list
15:52 azaki: i meant that the radeon guys may have their own traitor list =p
15:52 azaki: and that i'm on that one
15:52 azaki: lol
15:53 karolherbst: it's the same as with life: good deeds go unnoticed and bad deeds will be blamed on you your entire life
16:04 Tom^: karolherbst: yeah, it isn't temp http://i.imgur.com/MDTJP6G.jpg , hasn't happened yet.
16:04 Tom^: karolherbst: :p
16:05 Tom^: karolherbst: pic from google but this is pretty much how it looks on both monitors when it happens http://imgur.com/TqoteH9, for ~5 - 10 seconds. then goes back to normal.
16:06 Tom^: karolherbst: once it begins, i can reproduce it pretty much in any gpu heavy application
16:06 karolherbst: Tom^: yeah, only furmark
16:06 karolherbst: otherwise you reduce heat production
16:14 karolherbst: Tom^: uhh, this looks nasty
17:50 mangix: gnurou: yes my issue is solved. i just need bskeggs to push the changes
18:27 karolherbst: mangix: what solves your issue?
18:37 imirkin_: hakzsam: well, i didn't realize it was a template... would be nicer if it were just a real sampler object instead. wtvr.
18:48 Horizon_Brave: hey everyone
18:48 tobijk: hi
21:03 Lyude: huh
21:03 Lyude: TIL nvidia's blob supports reloading
21:16 Lyude: Tom^: thanks for letting me know! not sure what I'll really need atm, i've got most of the cards I was planning on working with here already
21:16 Lyude: but I will let you know if I need anything
21:17 Tom^: Lyude: sure
21:17 mupuf: Lyude: what do you mean with the blob supporting reloading?
21:17 Lyude: mupuf: i can actually rmmod it and
21:17 Lyude: *rmmod it
21:17 Lyude: and nothing seems to actually break
21:18 mupuf: sure, this has always been working
21:18 Lyude: huh. it is just surprising because not many drm drivers in general can do that very well...
21:18 mupuf: it is very important when faking vbios
21:18 mupuf: oh, right, maybe
21:18 mupuf: nouveau works pretty well
21:18 Lyude: we don't really have working module reloading with nouveau do we?
21:18 mupuf: I don't reboot when I develop
21:19 mupuf: and intel also works quite well
21:19 mupuf: not sure about AMD
21:19 Lyude: yeah i915 has always worked pretty well with that, with some exceptions
21:19 Lyude: AMD is very hit or miss
21:19 airlied: amd breaks cause some of their hw is badly designed
21:19 Lyude: sometimes it works, sometimes it doesn't, sometimes it only works the first 3 times, etc...
21:20 airlied: the video decode engines don't reset cleanly
21:20 Lyude: gross
21:20 Lyude: ...that explains some things not working during troubleshooting that I thought were supposed to though at least
21:25 mupuf: airlied: shit :o So, the only way to reset the GPU is an actual reboot?
21:29 Lyude: wonder if you could just cheat and not enable the parts of the GPU that won't reset cleanly when developing on it...
21:30 airlied: mupuf: the only really guaranteed way, there are some other hammers you can use, but power removal is the only way to be sure :-P
21:30 mupuf: OMG
21:30 mupuf: amazing for laptops :p
21:31 Lyude: i think i tried those hammers on many occasions and they didn't seem to really help much
21:32 Lyude: at some point i was doing something vce related and remembered not being able to figure out any actual way to get it to reset
21:34 airlied: mupuf: works good on laptops, esp optimus style since they power the gpu off
21:34 mupuf: ah, right
21:35 airlied: also they may have fixed the engine to be more resettable in later hw
21:59 skeggsb_: Lyude: i reload the nouveau module a *heap*, what makes you think its not fine?!
22:00 Lyude: hehe, i probably just had one of my scripts doing it wrong and never bothered to fix it
22:00 Lyude: is it just rmmoding everything + unbinding the vt console?
22:00 skeggsb_: we don't restore the vga console etc (i believe nvidia have an x86 emulator somewhere that runs the vbios to do that...), but aside from that...
22:01 skeggsb_: unbind + rmmod works fine
22:01 karolherbst: skeggsb_: there are some race conditions on unloading though
22:01 karolherbst: sometimes you can get unlucky and the kernel crashes
22:02 skeggsb_: yeah, that's never ever happened to me
22:02 karolherbst: it happens for me twice a month or so
22:02 skeggsb_: but, you're correct, some subdevs don't clean up after themselves properly...
22:02 skeggsb_: cancelling work/timers etc
22:02 karolherbst: yes
22:03 skeggsb_: therm, surprise surprise, is especially bad for that
22:03 karolherbst: mhh odd
22:03 karolherbst: it mostly happens with gr for me
22:03 Lyude: mh
22:03 Lyude: *hm
22:03 skeggsb_: um, how?! i don't recall gr having anything like that
22:03 Lyude: I will start reloading instead of rebooting and if i notice therm breaking anything I will see if I can patch it
22:04 Lyude: since i'm already working on it anyhow
22:04 karolherbst: skeggsb_: will tell you when it happens again
22:05 skeggsb_: Lyude: another thing that doesn't work (last i checked) is unloading nvidia and loading nouveau without a reboot
22:05 skeggsb_: they do something to display that we don't know about / can't undo, and it refuses to process our push buffers
22:05 Lyude: oh, huh
22:05 skeggsb_: not a terribly important thing to solve, however
22:06 Lyude: any idea if kexec'ing might workaround that?
22:06 imirkin_: Lyude: don't forget to unbind vtcon
22:06 skeggsb_: i doubt it, kexec doesn't reset hw state
22:06 imirkin_: that's the most common mistake in unloading nouveau
22:06 Lyude: imirkin_: yeah i know about that part at least, i've got something in one of my scripts somewhere that does that for me
22:06 imirkin_: ok =]
22:06 imirkin_: well, i do that quite a bit, generally works out ok
22:07 imirkin_: sometimes some idiot inserts a bug into the module being reloaded, but that's hardly nouveau's fault.
22:07 imirkin_: hard to fight the pebkac
22:08 imirkin_: flipping between blob and nouveau *used* to work just fine, but a couple years ago it stopped working... something disp-related. all is well, but the fb flips don't work.
22:08 skeggsb_: i think it was something on nvidia's side that changed there
22:09 skeggsb_: istr there was a driver version bump of theirs where that changed
22:09 imirkin_: yeah
22:09 skeggsb_: was such a long time ago now though...
22:09 imirkin_: 330.x iirc
22:09 Tom^: yeah wasnt there a 3xx blob that broke it?
22:09 imirkin_: 325 was fine
22:09 imirkin_: and the next series was not :)
22:09 Tom^: i remember doing it precisely around that time, and then the broken series arrived
22:09 imirkin_: or maybe 340 was the breaking point
22:10 skeggsb_: they can obviously reload after themselves, so, well, we should be able to make it work again if anyone cares
22:10 imirkin_: probably worth re-testing now that the disp stuff is better documented
22:10 imirkin_: and has had like 10 rewrites in between :)
22:10 skeggsb_: pretty sure nothing has changed that's relevant, but, go for it :P
22:11 Tom^: would be quite neat with libglvnd that we have now, simply have both drivers installed and just load the one you want. :p
22:11 Lyude: that actually would be kind of nice
22:11 imirkin_: Tom^: you can do that with optimus
22:11 imirkin_: that's what karol does
22:11 imirkin_: harder when the board also drives display
22:11 Tom^: imirkin_: yeah doesnt that kind of require me to have something to drive my displays?
22:12 imirkin_: yes
22:12 imirkin_: like i said... with optimus ;)
22:12 Tom^: ;)
22:21 Lyude: is there any way to figure out through sysfs, other than guessing, which vt a GPU is bound to?
22:32 imirkin_: *crickets* ... well, i think you have your answer
22:37 Lyude: maybe. however, would someone with optimus mind doing me a favor and getting me the output of $(ls /sys/class/drm/card*/device/graphics) ?
22:38 imirkin_: fb0 for me (just intel here)
22:38 Lyude: and nothing on the nouveau one?
22:38 imirkin_: no nouveau
22:38 Lyude: oh, gotcha
22:38 imirkin_: i mean - it's just a SKL chip in this box
22:38 Lyude: i am asking btw because i think that might actually be the way to do it
22:38 imirkin_: but that's not the console, that's the fbdev
22:38 imirkin_: you need to know which vtcon
22:39 Lyude: yeah but can a vtcon be... hm
22:39 Lyude: you are right
22:39 Lyude: that's annoying, was going to write a script for it
22:39 imirkin_: of which i somehow have 2. no idea what they are.
22:39 Lyude: the first one is early console I think
22:39 Lyude: second one is actual console
22:39 Lyude: i'm curious though: I would think that if one card is bound to the vtcon, then any other cards shouldn't have an fbcon device because they're not bound
22:39 imirkin_: (S) dummy device vs (M) frame buffer device
22:40 Lyude: unless that is not how that works, which is very possible
22:40 imirkin_: fbcon != fbdev
22:40 Lyude: ah right
22:40 imirkin_: there's a fbdev device for each drm device -- the fbdev api emulation is provided by the drm driver
22:40 Lyude: ah
22:41 imirkin_: not sure what fbcon is or how it fits in
22:41 imirkin_: but nouveau also provides fbcon accel hooks
22:41 imirkin_: dunno if that's just part of the fbdev api or what
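A rough sketch of the "unbind vtcon" step in Python, using the vtconsole names mentioned above ("(S) dummy device" vs "(M) frame buffer device"). It assumes the standard /sys/class/vtconsole layout with `name` and `bind` files; the helper names are hypothetical. It does not answer which GPU a console belongs to, it only picks out the frame buffer console(s).

```python
from pathlib import Path


def find_fb_vtcons(root: str = "/sys/class/vtconsole"):
    """Return vtcon entries whose 'name' file mentions a frame buffer."""
    matches = []
    base = Path(root)
    if not base.is_dir():
        return matches
    for vtcon in sorted(base.glob("vtcon*")):
        try:
            name = (vtcon / "name").read_text().strip()
        except OSError:
            continue  # skip entries we cannot read
        if "frame buffer" in name:
            matches.append(vtcon.name)
    return matches


def unbind(vtcon: str, root: str = "/sys/class/vtconsole") -> None:
    """Write 0 to the console's 'bind' file to detach it (requires root)."""
    (Path(root) / vtcon / "bind").write_text("0\n")


if __name__ == "__main__":
    for vtcon in find_fb_vtcons():
        print("frame buffer console:", vtcon)
        # unbind(vtcon)  # uncomment as root, then rmmod the driver
```

Matching on the name avoids hard-coding vtcon0 or vtcon1, since the numbering differs between machines.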
23:16 mangix: karolherbst: so Gnurou broke 4.11 for my GPU and fixed it in 4.12. It's 3 patches relating to integer overflow (I have 8GB RAM)
23:17 mangix: as for why your branch fails for me, ¯\_(ツ)_/¯
23:19 karolherbst: ahhh
23:19 karolherbst: maybe I miss some patches
23:19 karolherbst: mangix: did you try 4.12-rc?
23:20 mangix: i cherry picked his patches and narrowed down which ones cause my issue
23:20 mangix: they're in your branch
23:20 karolherbst: I see
23:20 karolherbst: maybe another thing broke stuff again...
23:21 mangix: i mean, terminal being quadruples and horizontally squished is already pretty broken to me :P
23:21 mangix: *quadrupled