00:00imirkin_: you saw the existing trace, presumably
00:00imirkin_: 99.99997% of the time, method id's don't change across gens
00:01imirkin_: but yeah, someone needs to figure out the subpixel offset mess
00:06pendingchaos: yeah, I saw the trace
00:11pendingchaos: also: I don't think INTEL_conservative_rasterization is a subset of NV_conservative_raster
00:11pendingchaos: for example, the intel one generates fragments with zero area polygons and has functionality which seems to be similar to NV_conservative_raster_underestimation but taken a bit further
00:11pendingchaos: also, the enable target for the intel one differs in value with the nvidia one
00:11pendingchaos: so I'm not sure how the two extensions interact
00:11imirkin_: yeah, dunno
00:11imirkin_: the names seemed similar ;)
00:12imirkin_: i didn't investigate in depth
00:12imirkin_: i think this is a good "starter" ext, since it's fairly simple to implement. that said, i doubt there are (m)any users
00:12imirkin_: but that's never stopped me from adding stuff :)
00:13imirkin_: pendingchaos: btw, do you have a HDMI 2.0 monitor?
00:14imirkin_: between ben and i, we haven't been able to get both an hdmi 2.0 monitor and a hdmi 2.0-enabled board in the same place
00:18pendingchaos: I think it's an HDMI 1.3
00:19imirkin_: ok. no 4k tv or anything like that? those will generally be 2.0
00:19pendingchaos: just 1920x1080. looking up the monitors, they seem to be 1.3
00:19imirkin_: yeah, i'm sure those won't be HDMI 2.0
00:19imirkin_: o well.
00:19imirkin_: no worries.
05:57airlied: skeggsb, imirkin_ : btw tried mutter/wayland session from f27 on nv98, crash fest
05:57airlied: [ 114.093995] nouveau 0000:05:00.0: gr: TRAP_M2MF 00000002 [IN]
05:57airlied: [ 114.094003] nouveau 0000:05:00.0: gr: TRAP_M2MF 00320151 206f0000 00000000 04000000
05:57airlied: [ 114.094004] nouveau 0000:05:00.0: gr: 00200000  ch 1 [000fbbe000 DRM] subc 4 class 5039 mthd 0100 data 00000000
05:57airlied: [ 114.094004] nouveau 0000:05:00.0: fb: trapped read at 00206f0000 on channel 1 [0fbbe000 DRM] engine 00 [PGRAPH] client 03 [DISPATCH] subclient 04 [M2M_IN] reason 00000006 [NULL_DMAOBJ]
06:59mupuf: airlied: so, we really need to get cracking at a test farm for Nouveau ;)
07:20airlied: [ 99.126403] nouveau 0000:05:00.0: imem: OOM: 00100000 00001000 -28
07:20airlied: a few of those as well
09:26pmoreau: airlied: Try the MMU align patch on the list
09:27pmoreau: airlied: https://patchwork.freedesktop.org/patch/207896/
09:52airlied: pmoreau: oh might try to build a fedora tmrw
10:05pmoreau: airlied: It’s quite possible it won’t get rid of the OOM error message, but it should fix the other ones at least.
16:17karolherbst: skeggsb: how bad is it, that I hit this? https://github.com/skeggsb/nouveau/blob/master/drm/nouveau/nvif/vmm.c#L68
16:18imirkin_: karolherbst: well, that's the VMM_PUT method failing. so ... not great.
16:19karolherbst: I run piglit on 3 GPUs at the same time
16:20karolherbst: RIP: __list_del_entry_valid+0x81/0x90 RSP: ffffaebbc9f17e98 :3
16:20imirkin_: is there any info above that?
16:20karolherbst: kernel BUG at lib/list_debug.c:56!
16:20imirkin_: (the vmm warn on, that is)
16:20karolherbst: imirkin_: let me check
16:20karolherbst: imirkin_: ahh, "DRM: skipped size 0000000000000000" again
16:21karolherbst: but this error is fun: "list_del corruption. next->prev should be 00000000cef7c77f, but was 0000000058ab65f7"
16:22karolherbst: maybe I should not run that many piglits at the same time
16:24pmoreau: karolherbst: Not the same issue, but since it’s vmm related, you might want to run with https://patchwork.freedesktop.org/patch/207896/ if you’re not doing it already.
16:25karolherbst: actually I do
16:25pmoreau: Okay :-)
16:25karolherbst: I think the problem is more like running on three GPUs
16:26RSpliet: karolherbst: wait... a corruption in an in-kernel DLL using worker threads. That's not even a DRM or nouveau-related LL, is it?
16:27karolherbst: who knows
16:28RSpliet: Well, the top of the call trace is process_one_work...
16:28karolherbst: fun thing about such memory corruptions is that everybody can cause them :)
16:28karolherbst: just forget to lock some code and wups, happened
16:28karolherbst: RSpliet: yeah, the stack might be totally pointless here
16:29RSpliet: karolherbst: Well, this probs means it's the linked list *containing* the deferred worker items
16:30karolherbst: RSpliet: that check checks for entry->next->prev == entry
16:30karolherbst: so if you modify entry and entry->next at the same time... fun
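The check described above (entry->next->prev == entry) can be modelled in a few lines. This is only a sketch in Python, not the kernel's code: the `Node` class and the helper names are made up for illustration, and the "unlocked writer" is simulated by overwriting one pointer by hand.

```python
class Node:
    """Minimal doubly linked list node, loosely modelled on a kernel list_head."""
    def __init__(self, name):
        self.name = name
        self.prev = self
        self.next = self

def list_add(new, head):
    """Insert new right after head."""
    new.prev = head
    new.next = head.next
    head.next.prev = new
    head.next = new

def list_del_entry_valid(entry):
    """Return True only if both neighbours still point back at entry."""
    return entry.next.prev is entry and entry.prev.next is entry

def list_del(entry):
    """Unlink entry, refusing to touch the list if the sanity check fails."""
    if not list_del_entry_valid(entry):
        raise RuntimeError("list_del corruption: neighbours do not point back at entry")
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    entry.prev = entry.next = entry

head = Node("head")
a, b = Node("a"), Node("b")
list_add(a, head)
list_add(b, head)  # list is now head -> b -> a -> head

# Simulate a writer without locking clobbering a neighbour's prev pointer.
a.prev = head
corrupted = not list_del_entry_valid(b)  # b.next.prev is no longer b
```

The point of the sketch is that the entry being deleted can be perfectly intact; it is enough for a neighbour to be modified concurrently for the check to fire, which is why the stack trace may say little about who caused the damage.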
16:30RSpliet: karolherbst: yeah, but it's the "list_del_entry_valid" call in process_one_work. The name of process_one_work() sounds like it's popping the top entry from the worklist
16:31karolherbst: yeah, should be
16:31karolherbst: could be a bug inside process_one_work
16:32RSpliet: Shame we don't know whose item that is. Could well be a nouveau work item pushing it onto the list thread-unsafely. Or a write to a negative offset into the payload data structure damaging the LL element
16:32karolherbst: I bet it is nouveau
16:32karolherbst: "Workqueue: events_power_efficient neigh_periodic_work"
16:33karolherbst: maybe just too much concurrency
16:33karolherbst: it is a 12 core machine, doing 36 piglit tests at the same time
16:35imirkin_: karolherbst: hmmm ... skipped size 0 is a frontend thing that happens. if you try a 0-size allocation it just fails it iirc
16:35imirkin_: maybe that's not handled cleanly? dunno
16:35imirkin_: the mesa driver really shouldn't be doing that =/
16:40imirkin_: karolherbst: enable KASAN. iirc someone found issues too, which pointed at nouveau being the cause of the list issues
16:41karolherbst: ahh, good idea
17:43karolherbst: imirkin_: the painful part is, it looks the same in the nvir output :)
17:44imirkin_: yeah, the printer misses some stuff =/
17:44imirkin_: look at all the details of what the tgsi thing does
17:46imirkin_: basically all that code is there for a reason... although ignore the commented out and "raw" bits
17:47imirkin_: those are leftovers of a bygone age
17:47karolherbst: well obviously it passes on pascal :(
17:56karolherbst: imirkin_: I forgot to set texi->tex.mask :(
18:04karolherbst: imirkin_: any idea why that doesn't matter on maxwell?
18:05imirkin_: sounds like a bug tbh :)
18:05imirkin_: although .... hm
18:05imirkin_: that's a store, so ... no outputs iirc
18:06imirkin_: not sure why tex.mask would matter then. hrmph.
18:06imirkin_: oh, i think it's a mask on the actual store
18:06imirkin_: this isn't functionality accessible from glsl
18:06imirkin_: but i think you can mask channels, so like do a store for RGB but not A even though it's an RGBA format
18:07karolherbst: yeah, I see
18:07imirkin_: could get rid of it and always just store RGBA, but ... meh. might as well set tex.mask, easy enough
18:07imirkin_: i'm guessing the maxwell emitter doesn't support this masking
18:09karolherbst: imirkin_: "->mask" doesn't exist in the maxwell emitter :)
18:09karolherbst: "emitField(0x14, 4, 0xf); // rgba"
18:09imirkin_: right, so it just skips it
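A rough model of the channel masking discussed above, as a sketch only. The bit layout (bit 0 = R up to bit 3 = A) and the function name `masked_store` are assumptions for illustration; the log does not state how the hardware orders the mask bits.

```python
# Assumed (hypothetical) bit layout: bit 0 = R, bit 1 = G, bit 2 = B, bit 3 = A.
def masked_store(texel, value, mask):
    """Return a new RGBA texel, taking channel i from value only if bit i of mask is set."""
    return tuple(
        v if (mask >> i) & 1 else t
        for i, (t, v) in enumerate(zip(texel, value))
    )

old = (0.1, 0.2, 0.3, 0.4)
new = (1.0, 1.0, 1.0, 1.0)

rgb_only = masked_store(old, new, 0x7)  # write RGB, leave A untouched
full = masked_store(old, new, 0xf)      # write all four channels
```

With a mask of 0xf every channel is written, which matches what hardcoding `0xf` in the emitter would do; any other mask leaves some channels of the destination as they were.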
18:25karolherbst: imirkin_: not+and -> and not
18:25karolherbst: missing opt
18:26imirkin_: right. an algebraic opt...
18:26karolherbst: although I have a and u32 $r4 $r0 0x3f800000
18:27karolherbst: maybe the immediate prevents it?
18:27imirkin_: and $r0 is the result of a set
18:27imirkin_: which produces 0 / -1
18:27imirkin_: i thought i optimized those... =/
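For context on why such an `and` shows up at all: 0x3f800000 is the IEEE 754 single-precision bit pattern for 1.0, so AND-ing it with a 0 / -1 boolean gives 0.0 or 1.0. The snippet below only checks that arithmetic; the names are illustrative, and whether $r0 really is a set result in this shader is exactly what is being questioned here.

```python
import struct

def bits_to_float(bits):
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]

SET_TRUE = 0xFFFFFFFF   # -1 as an unsigned 32-bit value
SET_FALSE = 0x00000000
ONE_F = 0x3F800000      # bit pattern of 1.0f

true_as_float = bits_to_float(SET_TRUE & ONE_F)
false_as_float = bits_to_float(SET_FALSE & ONE_F)
```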
18:27karolherbst: not s32 $r0 $r0
18:27imirkin_: and the original $r0?
18:28karolherbst: ld u64 $r0d c0[0x0]
18:28karolherbst: originally a 32 bit load
18:29karolherbst: in TGSI we get and not c0
18:29karolherbst: no issue
18:29karolherbst: mov u32 $r0 0x3f800000
18:29karolherbst: and u32 $r4 $r0 not c0[0x0]
18:31imirkin_: oh yeah. that's tough
18:31imirkin_: i think you can't have the not and the immediate in the same op or something
18:32imirkin_: so any small perturbation to the instructions can cause it to go one way or the other
18:32karolherbst: it's just there are two more not+and pairs using the same immediate :)
18:32imirkin_: "solving" that would require a lot more sophistication than we currently have
18:33karolherbst: yeah, makes sense
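The not+and -> "and not" folding can be sketched as a toy peephole pass. Everything here is an assumption for illustration: the tuple-based IR, the SSA-style register names, and in particular the rule that a `not` modifier cannot be combined with an immediate operand, which only mirrors the guess made in the discussion and is not a verified hardware constraint. This is not nv50_ir code.

```python
def use_count(insns, reg):
    """Count how many times reg appears as a source operand."""
    return sum(1 for _, _, *srcs in insns for s in srcs if s == reg)

def fold_not_into_and(insns):
    """Fold `not n, s` + `and d, a, n` into `and d, a, (not s)`.

    Toy rules (assumptions):
    - each register is defined once (SSA-like);
    - the `not` result has exactly one use;
    - the other `and` operand is not an immediate (int).
    """
    nots = {i[1]: i[2] for i in insns if i[0] == "not"}
    out, dead = [], set()
    for op, dst, *srcs in insns:
        if op == "and":
            a, b = srcs
            for x, other in ((a, b), (b, a)):
                if (isinstance(x, str) and x in nots
                        and use_count(insns, x) == 1
                        and not isinstance(other, int)):
                    srcs = [other, ("not", nots[x])]
                    dead.add(x)
                    break
        out.append((op, dst, *srcs))
    # Drop the `not` instructions whose result was folded away.
    return [i for i in out if not (i[0] == "not" and i[1] in dead)]

# Immediate first moved into a register: the fold applies.
via_mov = [
    ("ld", "c", "c0[0x0]"),
    ("not", "n", "c"),
    ("mov", "one", 0x3F800000),
    ("and", "r", "one", "n"),
]
# Immediate used directly in the `and`: the toy rule blocks the fold.
via_imm = [
    ("ld", "c", "c0[0x0]"),
    ("not", "n", "c"),
    ("and", "r", "n", 0x3F800000),
]

folded = fold_not_into_and(via_mov)
unfolded = fold_not_into_and(via_imm)
```

The two inputs differ only in where the immediate lives, yet one folds and the other does not, which illustrates how a small perturbation in the instruction stream can flip the outcome.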
18:33karolherbst: imirkin_: any idea what RG on SUSTGA is?
18:33mslusarz: pendingchaos: your mmt patch looks good to me, but commit message could be improved; "fix tracing" could be applied to more than 1 patch ;), maybe something like "mmt: handle openat"
18:33karolherbst: imirkin_: the mask?
18:34imirkin_: karolherbst: probably.
18:35karolherbst: now I have to figure out how to detect the proper mask
18:35imirkin_: problem solved ;)
18:35karolherbst: ahh, it is the color mask, not the coord mask, right....
18:36karolherbst: 0xf it is then
18:39karolherbst: imirkin_: I doubt it matters whether we use the correct mask for the format or just 0xf. maybe the hardware is slightly dumb, or maybe it makes no difference at all perf wise, unless you really want to do crazy stuff
18:40imirkin_: i'm sure it makes no difference.
18:40imirkin_: i suspect the masking is something accessible from cuda and/or opencl
18:40imirkin_: where it makes a functional difference, not a performance difference
18:40imirkin_: (well, technically, a performance difference in achieving that masking function)
18:42karolherbst: yeah, makes sense
18:43karolherbst: imirkin_: mhh, PTX has no way to specify the channel mask
18:43pendingchaos: mslusarz: it has been done
18:44imirkin_: karolherbst: could well be something the hw guys said "oh, this is easy, maybe it'll be useful". dunno.
20:56user1: how is GTX 760 performance?
21:02imirkin_: thanks for asking
21:17user1: and for gaming people here use Itch if possible?
21:22imirkin_: not a lot of people use nouveau for gaming. if you're looking for hw with open drivers, you're much better off with AMD hardware.
21:30user1: imirkin_, but the firmware isn't open, so is it that much better?
21:30user1: i wasn't going to do hardcore gaming on this card anyway
21:33user1: my goal here is to use as much open firm and software as possible, with the libre-kernel
21:43karolherbst: user1: then any kepler based GPU is fine. We have open firmware except for acceleration for video decoding
21:52user1: sounds good, thanks
21:59imirkin_: user1: the asic design isn't open either
21:59imirkin_: the firmware is bits that ought to be on a ROM on the board but aren't
22:08user1: asic design? as in hardware?
22:09imirkin_: anyways, if your requirement is to use the "libre" kernel (which - btw - is a total misnomer), then a kepler-based nvidia board is your best bet for performance.
22:09imirkin_: that kernel is "let's randomly rip out random functionality for no apparent reason"
22:10imirkin_: so i definitely don't recommend it
22:10imirkin_: the amd driver is plenty open. all software that runs on your cpu is open.
22:10user1: open hardware is pretty limited today, so far i found only Allwinner boards
22:10imirkin_: it just doesn't exist
22:10user1: but you aren't worried about firmware?
22:11imirkin_: it's a fact of hardware. all hardware has firmware, whether you see it or no
22:11imirkin_: differentiating hardware where you can see firmware from hardware where you can't see firmware seems like an arbitrary and unhelpful distinction
22:12imirkin_: you could make the argument that you don't want to use hardware that would use up a precious megabyte of your HDD space instead of shipping that megabyte in a ROM on your board. and that's fine. but it has nothing to do with open vs closed.
22:14user1: how about security? proprietary firmware is not audited
22:15imirkin_: true, but not relevant to what the libre kernel does
22:16imirkin_: it cuts out support for *some* hardware with firmware, but not other hardware with firmware.
22:16imirkin_: if one were to cut out support for all hardware with firmware, you'd be hard-pressed to find a CPU it could run on.
22:19user1: but my chips are supported, i don't see the problem
22:19imirkin_: your CPU has closed firmware in it
22:19imirkin_: it's proprietary and has not been audited.
22:19imirkin_: should the libre kernel drop support for it?
22:20annadane: yes. (:
22:23user1: okay. what other functionality does it remove?
22:24imirkin_: it removes support for arbitrary classes of hardware
22:24imirkin_: like AMD hardware
22:24imirkin_: for no apparent reason
22:24user1: AMD GPU's require blobs
22:24imirkin_: anyways, use what you like. if you want something that works well with open-source drivers, use AMD.
22:25imirkin_: brawndo. it's got what plants crave.
23:19imirkin_: pendingchaos: got mmt going with an older blob version?
23:23imirkin_: pendingchaos: btw, you're already doing better than 99% of the people who come in here and say they want to help nouveau. thank you =]
23:29pendingchaos: I had no older blob installed that worked with my current kernel version, so I used mmt_bin2dedma
23:29pendingchaos: In hindsight, it might have been a bit easier to try to use an older kernel
23:30imirkin_: and then you used the old dedma thing? been so long since i've used it
23:30imirkin_: demmt is so much nicer... when it works =/
23:36imirkin_: Lyude: would i be correct in guessing that you have something of a MST setup available to you for testing?
23:43Lyude: imirkin_: yep!
23:44Lyude: got a remote power cutter on it as well if you need to kick the MST hubs back into order :)
23:44imirkin_: Lyude: i've tried to develop support for MST in xf86-video-nouveau, but several people who have briefly tried it ran into issues. would you be at all interested in helping fix it, or at least test my incremental attempts?
23:45imirkin_: (as you can probably guess, i have no access to nvidia DP + MST setups directly)
23:45Lyude: yeah! I mean I don't know if I have time to write any code but I can answer any questions, i've got a pretty solid understanding of most of MST at this point
23:46imirkin_: well, i'm actually most interested in testing + reporting of what failed (+ stack traces and whatnot)
23:46imirkin_: nouveau should support MST fine, at least in theory
23:46Lyude: ahhhh, yeah sure. and it does pretty well, it's one of the better MST implementations out there from what I've tested
23:46imirkin_: i've tried to add support to the DDX for the appearing/disappearing connectors, but obviously incorrectly
23:46imirkin_: however it's a bit hard to test when the only feedback is "it crashed"
23:46Lyude: mhm, that part is a bit confusing. fbcon doesn't even handle that correctly atm...
23:47imirkin_: or even better, "it didn't work" :)
23:47Lyude: anyway yeah, just send me some stuff to test and i'll let you know how it goes
23:48imirkin_: awesome, really appreciate it! the attempt is at https://github.com/imirkin/xf86-video-nouveau
23:48Lyude: cool, will test it in just a little bit
23:48imirkin_: you'll obviously need an nvidia board with DP, but i sorta assume you can get your hands on one of those as well
23:49imirkin_: and i'm hoping it won't suck up too much of your time, but ... the obvious things are going to be useful. if it crashes, stack traces. if it doesn't crash, xrandr/etc outputs + description of what didn't work
23:50imirkin_: [and if you happen to have the inclination to actually fix my buggy code, certainly wouldn't object to that either!]
23:52imirkin_: pendingchaos: sorry for being pedantic btw. just trying to keep those files clean :)
23:55airlied: imirkin_: is it always crashes or just when they do lots of plugging/unplugging?
23:56imirkin_: airlied: dunno
23:56imirkin_: like i said, the reports weren't extremely detailed
23:56imirkin_: i think on plug or unplug
23:57imirkin_: and in at least one case, i think the failure was caused by kms and not anything in userspace