00:31 imirkin: mwk: https://github.com/envytools/envytools/pull/164
00:31 imirkin: i haven't looked at it yet
00:47 imirkin: this weekend's task - get connector hotplug to work in xf86-video-nouveau
00:54 RSpliet: Lyude: the ones someone replied with in your second backlight patch
00:57 imirkin: he's the guy who created a lot of the current kernel printk-wrapping macros
00:58 imirkin: so he generally knows what he's doing with those
01:15 airlied: imirkin: got an mst hub?
01:18 imirkin: airlied: picked up a couple of monitors
01:18 imirkin: U2415's, they do DP1.2 + chaining
01:19 airlied: ah yeah good enough to get to mst firmware bugs :-p
01:19 imirkin: and woe be unto he who plugs them in a loop
01:19 imirkin: i generally set them up so that only 1 of them speaks DP 1.2
01:20 imirkin: the other stays in DP 1.1 mode
01:20 imirkin: that generally keeps them out of too much trouble
01:20 imirkin: although i get the occasional "blink" (screens go black for about 3s) on a SKL
01:20 imirkin: in the past i used to get watermark errors, but no more
01:23 imirkin: windows blinks every so often too, so ... just reaffirms my hatred of hardware.
11:43 tron42: Hi! I am a developer
11:44 tron42: I own a NVidia card and I use nouveau, I'd like to contribute to the project
11:49 milesrout: ^ what he said
11:49 milesrout: me too
11:49 milesrout: :)
11:56 pabs3: not a nouveau dev, but saw https://nouveau.freedesktop.org/wiki/IntroductoryCourse/
11:57 pabs3: also https://nouveau.freedesktop.org/wiki/FAQ/#index5h3
11:58 pabs3: and https://nouveau.freedesktop.org/wiki/Development/
13:28 RSpliet: tron42, milesrout, pabs3: https://trello.com/b/ZudRDiTL/nouveau
13:29 RSpliet: In case you're looking for inspiration for something to hack on... although you're best off scratching your own itches
13:32 RSpliet: imirkin: thanks for confirming the source of the printk patches. They seemed of good quality, I just wondered whether these patches were a definite win, or rather simply hitting a different point on the mem vs. CPU-cycles trade-off curve
14:30 imirkin: RSpliet: yeah, i have no idea. just mentioning who the dude is.
14:51 imirkin: skeggsb: so apparently it's possible to get 4k@60 on kepler/maxwell1 over hdmi by somehow forcing the output into YUV420 encoding mode. i don't see any references to this anywhere at all - i wonder if it's a pll hack of some sort.
15:24 karolherbst: imirkin: mhh, I mean, this is 1.4b compliant though
15:24 karolherbst: imirkin: wikipedia has a note on that: "Possible by using Y′CBCR with 4:2:2 or 4:2:0 subsampling (as noted)"
15:24 karolherbst: imirkin: @75 might be also possible, but I think not for nv if we cap at 297MHz?
15:26 karolherbst: imirkin: here is the table by the way: https://en.wikipedia.org/wiki/HDMI#Refresh_frequency_limits_for_standard_video
15:26 karolherbst: 5k@30 with 4:2:2 might also work
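The 297MHz cap mentioned above lines up with 4k@60 once 4:2:0 subsampling halves the TMDS clock. A minimal sketch in C of that arithmetic, assuming the standard CTA-861 4k@60 timing (4400x2250 total, blanking included) and 8 bits per component. The function names are made up for illustration and are not part of nouveau.

```c
#include <stdint.h>

/* Pixel clock in Hz from the total (active + blanking) timings. */
uint64_t pixel_clock_hz(uint32_t htotal, uint32_t vtotal, uint32_t refresh_hz)
{
    return (uint64_t)htotal * vtotal * refresh_hz;
}

/*
 * With 8-bit YCbCr 4:2:0, HDMI packs two pixels into each TMDS
 * character, so the TMDS clock is half the pixel clock.
 */
uint64_t tmds_clock_420_hz(uint64_t pixel_clock)
{
    return pixel_clock / 2;
}

/* Returns 1 if the TMDS clock fits under the given limit, 0 otherwise. */
int fits_tmds_limit(uint64_t tmds_hz, uint64_t limit_hz)
{
    return tmds_hz <= limit_hz;
}
```

Under these assumptions `pixel_clock_hz(4400, 2250, 60)` is 594MHz, and the 4:2:0 TMDS clock comes out at exactly 297MHz, which just fits the cap.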
15:29 karolherbst: imirkin: by the way, do you know how that vblanking stuff works? Especially how we listen on the events in userspace?
16:07 RSpliet: karolherbst: Presumably a userspace thread can call a "wait for vblank" ioctl (or DRM generic ioctl with that effect) and will be taken off the scheduler's ready queue and marked blocked on IO or sth. When the VBLANK interrupt hits nouveau, it will ask DRM to put the task back on the ready queue and return w/e it needs (a timestamp? a status code? sth like that...)
16:08 karolherbst: RSpliet: yeah, I am more interested in how we do all that syscall stuff as the GPU reset notifications should be delivered in the same way
16:08 karolherbst: but
16:08 karolherbst: I am not quite sure how all that works in the end
16:09 RSpliet: I'm not sure if it should be...
16:09 karolherbst: I mean, not the same calls
16:09 karolherbst: but the same mechanism
16:10 karolherbst: I am not 100% sure if it even works or if I am simply doing something wrong
16:10 karolherbst: at least the sys calls returned 0 to create and enable those notifications
16:10 karolherbst: but..
16:10 karolherbst: there is no ioctl to wait on it
16:10 RSpliet: No that's not what I mean. I think for vblank you generally want the userspace application to request to be woken up on a vblank and block until it hits
16:11 karolherbst: okay sure, for notifications you want some fd and do something on it like read or select or maybe even epoll
16:11 karolherbst: though epoll should be too much for that
16:11 RSpliet: a GPU reset is an exceptional situation for which I suspect you want to use a different mechanism. More specifically, sending a "signal" (like SIGKILL but rather a... "SIGMYGPUHASBEENRESETGOANDDOSPECIALSTUFF") to the userspace application that registers for this signal
16:12 karolherbst: mhhh
16:12 karolherbst: using signals for that always causes trouble
16:12 karolherbst: we have plenty of fds
16:12 karolherbst: and normally if you think about web servers, they also kind of simply wait until _something_ happens
16:12 karolherbst: and if you are good, you use epoll for that
16:13 RSpliet: Why? I think that's the most reliable mechanism for non-blocking IO initiated by the kernel...
16:13 karolherbst: because you have no control over who gets the signal
16:13 RSpliet: I don't think it's right to poll for an exceptional situation
16:13 karolherbst: RSpliet: epoll_wait ;)
16:14 karolherbst: epoll is just the name of the entire framework
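The fd-based approach can be sketched as below: `epoll_wait` blocks until the fd becomes readable, so nothing is busy-polling. This is a minimal C sketch, not nouveau or mesa code; `wait_readable` is a hypothetical helper, and any readable fd (a pipe, a DRM fd) can stand in for the notification source.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Block until fd is readable or timeout_ms expires (-1 waits forever).
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, int timeout_ms)
{
    struct epoll_event ev, out;
    int ep, n;

    ep = epoll_create1(0);
    if (ep < 0)
        return -1;

    ev.events = EPOLLIN;
    ev.data.fd = fd;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(ep);
        return -1;
    }

    /* The calling thread sleeps here; the kernel wakes it on activity. */
    n = epoll_wait(ep, &out, 1, timeout_ms);
    close(ep);
    if (n < 0)
        return -1;
    return n == 1 && (out.events & EPOLLIN) != 0;
}
```

For a single fd, `poll` or a blocking `read` would do the same job; epoll mostly pays off when many fds are watched at once.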
16:18 karolherbst: anyway, I got told that we already have a mechanism for stuff like that
16:18 karolherbst: I was mainly interested in how all that stuff works in the end
16:18 karolherbst: I am simply calling the NTFY stuff on the channel we've got in mesa
16:18 karolherbst: and it seems like it is successful, I just don't know how to get such an event
16:19 RSpliet: Yeah we better have a mechanism for that :-P Channels die all the time, better notify just the one process whose channel got killed rather than every consumer of nouveau :-D
16:19 imirkin: karolherbst: no clue. but it's part of drm, so listening to vblank events should be easy. rtfm :)
16:20 karolherbst: yeah, point is, I don't want to listen to vblank events
16:20 imirkin: i thought you said you did
16:20 imirkin: what do you want?
16:21 RSpliet: imirkin: understand the userspace<->kernel interaction on a GPU or channel reset
16:21 karolherbst: I want to install one of those NVIF_NTFY thingies from userspace and wait on something happening
16:21 imirkin: karolherbst: and re yuv420... the question is *how* it gets used.
16:21 imirkin: i.e. how do i tell the gpu to output yuv420
16:21 imirkin: i only see settings for 422
16:22 karolherbst: imirkin: do you have access to a 1440p@120 display?
16:22 karolherbst: I doubt you have access to a 5k@30 one, or maybe
16:22 imirkin: 1920x1080 120.00
16:22 imirkin: 3840x2160 60.00
16:22 imirkin: that's the best i can do.
16:22 karolherbst: mhh
16:23 karolherbst: for 1440p@120 and 5k@30 you can use 4:2:2 subsampling with HDMI 1.4
16:23 karolherbst: I think I understood you correctly in that you are not able to get those 4:2:0 modes to work, right?
16:23 imirkin: the question isn't whether it's possible in theory
16:23 imirkin: the question is how to configure the gpu to actually do that
16:24 karolherbst: sure, and if nvidia is able to do 4:2:2 but not 4:2:0 it would be easier to check what to do ;) but yeah
16:24 imirkin: i'm looking at the display docs that nvidia published
16:24 imirkin: and i'm not seeing it.
16:24 karolherbst: but there are 4:2:2 references?
16:24 imirkin: grep for "422"
16:24 karolherbst: I think I saw something... let me check
16:25 imirkin: although curious...
16:25 imirkin: NV987D_HEAD_SET_CONTROL_OUTPUT_RESOURCE_PIXEL_DEPTH_BPP_36_444
16:25 imirkin: but there's no 36_422 or 420
16:25 imirkin: but there is a
16:25 imirkin: NV987D_HEAD_SET_CONTROL_OUTPUT_SCALER_FORCE422_ENABLE
16:25 karolherbst: yeah
16:25 karolherbst: hum
16:25 karolherbst: NV907D_HEAD_SET_CONTROL_OUTPUT_RESOURCE_PIXEL_DEPTH_BPP_16_422?
16:26 imirkin: and naturally there's a
16:26 karolherbst: or NV907D_HEAD_SET_CONTROL_OUTPUT_RESOURCE_PIXEL_DEPTH_BPP_32_422
16:26 imirkin: NV987D_HEAD_SET_CONTROL_OUTPUT_RESOURCE_PIXEL_DEPTH_BPP_24_422
16:26 imirkin: anyways
16:26 imirkin: my point is - it's unclear from the docs
16:26 karolherbst: yeah, seems so
16:26 imirkin: but it does seem like something the hw definitely supports
16:27 karolherbst: but in the end you have to limit the bandwidth, right? And the GPU can't magically push more through the cable than specified
16:27 karolherbst: so I would assume the display has to be able to understand whatever gets pushed there
16:28 imirkin: the GPU has to be told what to push on the cable...
16:28 imirkin: you give the GPU an image to scan out
16:28 imirkin: and tell it "convert it into bits please"
16:28 imirkin: it needs to know what the format of those bits is
16:28 imirkin: otherwise fail
16:28 karolherbst: isn't it that output_resource_pixel stuff?
16:28 imirkin: uh huh
16:29 imirkin: but 420 and 422 are diff formats
16:29 imirkin: so ... how do i make it do 420
16:29 karolherbst: doesn't seem there is a 420
16:29 imirkin: but i know that the hw is able to do it somehow
16:29 imirkin: there are all kinds of references to it
16:29 karolherbst: ahh
16:29 karolherbst: mhh
16:29 imirkin: in online docs about driver support etc
16:29 karolherbst: do you know if nvidia does it on your setup?
16:29 imirkin: i haven't dug into it yet.
16:30 imirkin: but if your suggestion is "just look at a mmiotrace", then i could have thought of that myself :p
16:30 karolherbst: volta seems to have some 420 stuff
16:30 karolherbst: but that isn't really helpful to you either
16:30 imirkin: kepler+ is supported according to online references
16:30 karolherbst: interesting
16:30 imirkin: oh, btw - unrelated - i've borrowed a GM206 (for the DP-MST testing) -- anything you want from me in its regard?
16:31 imirkin: i'll grab a vbios for posterity, but i won't have access to the board for that long - i don't want to keep it on loan longer than necessary
16:31 karolherbst: not really, I have a GM204 myself :p and usually if it comes to display things, Lyude is looking into it
16:31 imirkin: kk
16:32 karolherbst: reminds me, I need to cleanup my no interlaced modes on DP patches
16:33 karolherbst: imirkin: and to stay sane, maybe you want this: https://github.com/karolherbst/nouveau/commit/1e59d8489151fef845f1e111d5d9a55669dc0b04
16:33 karolherbst: no idea if current code can trigger really faulty behaviour
16:33 karolherbst: but...
16:33 karolherbst: it might
16:33 imirkin: i like sanity...
16:34 imirkin: seems like a pretty huge error ... skeggsb --^
16:34 karolherbst: yeah.. it is part of my interlaced fixes
16:34 karolherbst: I thought I would clean it up sooner
16:34 imirkin: karolherbst: oh wait - in current code shouldn't matter
16:34 karolherbst: yeah
16:34 imirkin: coz there's nothing else in that union
16:34 karolherbst: right
16:35 imirkin: but if you add anything else, then fail
16:35 karolherbst: but before you end up adding something and run into the same error
16:35 imirkin: it's still clearly the right thing to do, but not fatal currently
16:35 karolherbst: I am actually wondering why we want to call that mstm thing on all connectors, but maybe we have dunno
16:35 karolherbst: I don't know enough
16:35 karolherbst: uhm, encoders
16:35 imirkin: yeah that's not something i have any clue about
16:36 imirkin: seems like we'd only want to do it for DP/eDP encoders? but what do i know. perhaps that's not a thing that an encoder can be.
16:52 imirkin: pendingchaos: can you send a patch to add sched info to xf86-video-nouveau? it's likely i'll be doing a release this weekend.
16:52 pendingchaos: imirkin: adding .beginsched/.endsched? or using the output of envysched?
16:53 pendingchaos: in other words: should it require envysched to compile the shaders
16:54 pendingchaos: or should I copy over the output into the *nv110.fp and nv110.vp files
16:55 karolherbst: mhh, at least I got the kernel to try to send three notifications instead of 2...
16:55 karolherbst: one is the "gr: TRAP ch 2 [00ffbac000 fault[8462]]" message, the other the "fifo: channel 2: killed" stuff
16:55 karolherbst: and the third is mine... okay
16:56 karolherbst: so how to get it
16:57 karolherbst: mhh, we should fix that CI stuff
16:57 karolherbst: ohh wait
16:57 karolherbst: pendingchaos: you should fix that CI error :p
16:57 imirkin: pendingchaos: it would ideally not require envysched
16:58 pendingchaos: karolherbst: I don't think it was envysched causing it
16:58 pendingchaos: nods at imirkin
16:58 karolherbst: yeah... mhh weird
16:58 karolherbst: but the envytools builds were all passing
16:59 karolherbst: and yours is a few hours after the last passing one
16:59 karolherbst: odd
17:02 karolherbst: pendingchaos: I get the same fail locally
17:03 karolherbst: and the commit before it compiles :)
17:06 karolherbst: pendingchaos: ohhhhhhh
17:06 karolherbst: pendingchaos: name conflict
17:06 karolherbst: pendingchaos: "#include <sched.h>"
17:07 karolherbst: inside nva/nvaspyi2c.c
17:07 karolherbst: so it includes your file instead of whatever it included before
17:08 pendingchaos: nods
17:08 pendingchaos: sorry, I have to go
17:08 pendingchaos: I should be back in a few hours
17:14 karolherbst: imirkin: do you think we should just "abort" in case we detect a gpu reset inside mesa for applications not requesting the robustness bit on context creation? Alternative is that processes just hang and freeze the system. I think this would also make debugging a bit easier because the application just kills itself. Worst case it is X, but everything is lost in that case anyway. And users have to wait less before being able to
17:14 karolherbst: use the machine again. Normally they would just force shutdown
17:14 karolherbst: I guess
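The proposal above boils down to a small policy decision, sketched here in C. Everything in this block is hypothetical (the enum, the struct, the function name); it only illustrates the abort-versus-report idea as stated and is not what mesa does.

```c
#include <stdbool.h>

/* Hypothetical outcomes once userspace learns about a dead channel. */
enum reset_action {
    RESET_ACTION_NONE,   /* channel is fine, keep going */
    RESET_ACTION_REPORT, /* robust context: report the reset to the app */
    RESET_ACTION_ABORT   /* non-robust context: kill the process */
};

/* Hypothetical per-context state. */
struct ctx_info {
    bool robust; /* context was created with the robustness bit */
};

enum reset_action on_channel_state(const struct ctx_info *ctx, bool channel_dead)
{
    if (!channel_dead)
        return RESET_ACTION_NONE;
    /* Robust apps are expected to delete and recreate the context. */
    return ctx->robust ? RESET_ACTION_REPORT : RESET_ACTION_ABORT;
}
```

The alternative raised in the replies (reset and limp along) would replace the abort branch with a recovery path.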
17:15 imirkin: karolherbst: i'd prefer to do the gpu reset and limp along
17:15 karolherbst: well, we do the GPU reset
17:15 karolherbst: but you have to delete the screen
17:15 imirkin: i.e. ensure all the same allocations are there
17:15 imirkin: yeah
17:15 imirkin: all textures are lost/etc. but keep going.
17:15 karolherbst: if an application is requesting a robustness context, that's okay
17:15 imirkin: first off, they'll get reuploaded 99% of the time
17:15 karolherbst: because applications are supposed to delete the context and create a new one
17:15 imirkin: through the normal course of running the application
17:15 karolherbst: for certain drivers
17:16 karolherbst: ohhh
17:16 karolherbst: mhh
17:16 karolherbst: but we have to delete the entire screen...
17:16 karolherbst: this is really really painful
17:16 imirkin: so you'll get a few shit frames
17:16 karolherbst: well
17:16 imirkin: but then it'll be fine again
17:16 imirkin: and if not, the user can exit
17:16 karolherbst: mhhhhh
17:16 karolherbst: screen creation is at dlopen time though. No idea what to do about that yet
17:16 karolherbst: I was thinking about just replacing all the fds and move on
17:17 imirkin: why would the fd's need replacing?
17:17 karolherbst: for robustness that is fine as we can switch to the GL_ERROR_GPU_RESET or whatever that is table
17:17 karolherbst: imirkin: because all the objects are dead basically
17:17 karolherbst: we need to allocate new channels
17:17 imirkin: ehhh ... ok
17:17 karolherbst: basically redo everything we do on screen creation
17:18 imirkin: having the explicit userspace VA management would really help here.
17:18 imirkin: i'd say get that in first
17:18 karolherbst: nouveau_drm_screen_create
17:18 imirkin: and then move on from there.
17:18 karolherbst: mhhh, yeah, but I don't really want to rely on any ETA here. Anyway, we have to get that notification about the GPU reset anyhow
17:19 karolherbst: I am just wondering if I should spend a minute and just quit the application
17:19 karolherbst: or only throw an error?
17:19 karolherbst: dunno
17:19 karolherbst: for robustness that is well defined
17:19 imirkin: worry about that later?
17:19 karolherbst: yeah, maybe
17:19 karolherbst: anyway, I need to implement something for robustness and we can be spec complient here
17:19 karolherbst: *compliant
17:20 imirkin: what about the current impl isn't compliant?
17:20 karolherbst: well, you can't create such a context with nouveau anyhow
17:20 imirkin: pretty sure we're allowed to weasel out of it.
17:20 karolherbst: currently
17:20 karolherbst: yeah, by not allowing to create such a context ;)
17:20 imirkin: right
17:21 karolherbst: I would expect that at least compositors or window managers or important stuff like that make use of it...
17:21 karolherbst: hopefully
17:21 karolherbst: and recreate contexts
17:21 karolherbst: would make the life for users less painful
17:21 imirkin: i'm not saying it's bad to implement...
17:22 imirkin: but having userspace VA management would make it all a lot more tractable to just auto-recover
17:22 karolherbst: well, I am not there yet anyway. Still having to get the notifications about GPU reset into mesa :)
17:22 imirkin: and it's also needed for vulkan
17:22 imirkin: so ... why not do that
17:22 karolherbst: yeah, in the end we want to get there anyway
17:22 imirkin: instead of reimplementing the same thing 20x
17:22 karolherbst: but this reset thing might be just an hour of time to implement
17:22 imirkin: then stop talking about it and implement it
17:23 karolherbst: as we really don't have to do anything really
17:23 imirkin: coz you've been talking about it for >30 mins :p
17:23 karolherbst: :p
17:23 karolherbst: I know
17:23 karolherbst: just wondering what we do for non robustness applications for now
17:23 imirkin: but it's more fun to complain...
17:23 imirkin: we hang indefinitely
17:23 imirkin: stuck while trying to reset the channel
17:23 karolherbst: right
17:23 karolherbst: but if we know the channel is dead, we could just abort immediately
17:24 imirkin: i believe that's what we do.
17:24 karolherbst: or do something else
17:24 imirkin: and fail at it.
17:24 karolherbst: well, mesa won't ever know the channel is dead
17:24 imirkin: some locking thing iirc?
17:24 imirkin: i mean in the kernel
17:24 karolherbst: sure, we know that in the kernel
17:25 karolherbst: mhh, let me check where my application actually gets stuck
17:26 karolherbst: okay mhh, waiting on a fence of course
17:26 karolherbst: uhm, well the kernel isn't stuck actually
17:27 karolherbst: it just killed the channel and reset the engines
17:27 karolherbst: the situation where we get stuck is for shaders being stuck in a loop
17:27 karolherbst: and then we wait inside the kernel
17:27 karolherbst: but for those I just want to trap the shaders
17:28 imirkin: in my experience, the kernel usually gets stuck trying to kill the channel
17:28 karolherbst: mhh, doesn't happen for me
17:28 imirkin: unless i go in and kill the offending process myself
17:28 karolherbst: instead of those inf loop ones
17:28 karolherbst: sure
17:28 karolherbst: the process is stuck
17:28 karolherbst: waits inside nouveau_fence_wait for me
17:28 imirkin: and takes X with it
17:29 karolherbst: ohh
17:29 karolherbst: mhh
17:29 imirkin: i mean - X is stuck
17:29 imirkin: but killing the process fixes X
17:29 karolherbst: right
17:29 karolherbst: and for those cases I want those applications to abort or do whatever whenever we know the channel is dead
17:29 karolherbst: instead of having to wait or to kill the application
17:31 imirkin: but the application is stuck inside nouveau iirc
17:31 karolherbst: mhh
17:31 imirkin: nouveau tries to kill the app
17:31 imirkin: and fails.
17:31 imirkin: or something.
17:31 karolherbst: the thing I currently work with is some CL applications and this is the bt: https://gist.githubusercontent.com/karolherbst/2a3f427c77201c3b4114f8ea66040640/raw/4db4f64f0d6e6b99c254e0616b55590023dce498/gistfile1.txt
17:31 imirkin: that's just stuck in userspace
17:31 karolherbst: yeah
17:32 imirkin: i'm talking about X being stuck (a diff process)
17:32 karolherbst: wondering about what happens in your case so that nouveau gets stuck
17:32 karolherbst: well, is nouveau stuck or X just waiting on the stuck process?
17:32 imirkin: btw, we really should update the comm or pid or whatever when an fd gets passed to another process
17:32 karolherbst: yeah
17:32 imirkin: i dunno.
17:33 karolherbst: anyway, I want to take care of the simple cases first, which I am able to trigger easily and then try to get those X being stuck things under control as well
17:33 karolherbst: but
17:33 karolherbst: from my experience this also happens on prime setups
17:33 karolherbst: or on my laptopt
17:33 karolherbst: *laptop
17:33 karolherbst: where X is stuck, because it just waits on stupid clients
17:33 karolherbst: and other weird effects
17:34 imirkin: skeggsb: would you take a patch which updates the cli->name on every ioctl() [if the pid has changed]
17:38 karolherbst: imirkin: in which scenarios is that relevant though? compositors/sessions after login getting a fd from logind?
17:38 karolherbst: or is it also happening for random X clients?
17:38 imirkin: karolherbst: it's relevant in my scenario... fd is allocated by X, passed to client.
17:38 karolherbst: right, but when does it happen?
17:38 imirkin: all errors/etc show up as coming from X
17:38 imirkin: always.
17:38 karolherbst: ohhh
17:38 karolherbst: painful
17:39 imirkin: (except for DRI_PRIME, of course, which doesn't go through X)
17:39 karolherbst: that explains
17:39 karolherbst: yeah, after I am done with it on my prime system here, I go to my gm204 and check if that helps there
17:39 karolherbst: I have a plasma installation there, soo probably no problem in triggered some nice errors
17:39 karolherbst: *with triggering
17:40 karolherbst: anyhow, that ntfy+ioctl code is really really painful to follow :/
17:46 imirkin: crap... there are like 20 copies of the client name. inconvenient.
17:48 imirkin: cli->name and then there's the nvkm-side cli name...
17:49 karolherbst: maybe we shouldn't save it at all and just look it up in case it becomes relevant?
17:49 karolherbst: we have the pid
17:49 imirkin: yeah, checking how all that works
17:50 imirkin: pretty sure there's a %pSomething to print the current task name
17:50 karolherbst: yeah
17:50 imirkin: ... except the "DRM" client is special
17:51 karolherbst: mhh, printk %p only exists for functions afaik
17:51 imirkin: ?
17:51 karolherbst: ohh wait, maybe there is even more
17:51 karolherbst: huh, no processes
17:51 karolherbst: https://www.kernel.org/doc/Documentation/printk-formats.txt
17:56 imirkin: so nvkm-side, looks like it's only used in nvif_printk and derivatives
17:57 imirkin: which among other things happens in nvif_ioctl (trace-level)
17:57 imirkin: which in turn means it happens a lot
17:58 imirkin: ok. so fine. i'll fix it.
18:22 karolherbst: RSpliet: okay, found the code
18:22 karolherbst: apparently we do a "wake_up_interruptible(&filp->event_wait);"
18:22 karolherbst: and filp is a drm_file
18:24 karolherbst: okay, and it writes into the object I supplied
18:49 karolherbst: ahh okay, so that 's just the normal drm event stuff. I guess there are APIs for that
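With the normal DRM event mechanism, a `read()` on the DRM fd returns one or more events back to back, each starting with a header that holds a type and a total length. A minimal C sketch of walking such a buffer follows. `struct event_hdr` is a local mirror of the two-field `struct drm_event` header so the block compiles without DRM headers, and `count_events` is a hypothetical helper; which type value the nouveau notification uses is not stated in this log, so the caller passes it in.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the DRM UAPI event header (struct drm_event). */
struct event_hdr {
    uint32_t type;
    uint32_t length; /* size of the whole event, header included */
};

/*
 * Walk a buffer as filled by read() on a DRM fd.
 * Returns the number of complete events, or -1 if the buffer is malformed.
 * *matches receives how many events had the wanted type.
 */
int count_events(const unsigned char *buf, size_t len,
                 uint32_t wanted_type, int *matches)
{
    size_t off = 0;
    int total = 0;

    *matches = 0;
    while (len - off >= sizeof(struct event_hdr)) {
        struct event_hdr hdr;

        /* memcpy avoids unaligned access into the byte buffer. */
        memcpy(&hdr, buf + off, sizeof(hdr));
        if (hdr.length < sizeof(hdr) || hdr.length > len - off)
            return -1;
        if (hdr.type == wanted_type)
            (*matches)++;
        total++;
        off += hdr.length;
    }
    /* Leftover bytes smaller than a header mean a truncated buffer. */
    return off == len ? total : -1;
}
```

In a real client the payload after each header would be interpreted according to the event type rather than just counted.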
18:52 karolherbst: pendingchaos: which application was it that was hanging your GPU? hitman? or the new Tomb Raider?
19:24 pendingchaos: karolherbst: Hitman is fine. F1 2015 (IIRC) and Tomb Raider with TressFX cause a hang (though I don't think I have mentioned F1 2015)
19:24 pendingchaos: I will probably rename sched.h to codesched.h or something
19:25 pmoreau: IIRC, hakzsam was investigating some F1 2015 hang a long time ago, some kind of infinite loop going on in one of the compute? shaders.
19:25 karolherbst: pmoreau: ahh, but those are the simplest ones to debug
19:26 pendingchaos: maybe NV<something>WATCHDOG would "fix" that
19:26 karolherbst: pmoreau: we just need the framework around it, but we know everything to figure that out
19:26 pmoreau: Ah, nice :-)
19:26 karolherbst: pmoreau: yeah, I was able to trap a kernel from kernel space to abort stuck shaders
19:26 karolherbst: we just need to dump the state and so on
19:27 karolherbst: and have a buffer for the state
19:28 karolherbst: pmoreau: there is a compute method to set a trap handler on the channel, but it seems to only work for compute shaders
19:28 karolherbst: and is more or less new
19:28 karolherbst: maxwell+ or something
19:28 karolherbst: before that we have a channel bound mmio register
19:29 karolherbst: anyway, need to look into it after I am done with the reset stuff
19:37 pmoreau: Can’t wait to be able to do some actual debugging with Nouveau :-)
19:38 karolherbst: yeah.. that would be nice
19:39 karolherbst: I was totally surprised how well that cuda-gdb stuff works
19:44 pmoreau: I have been using it extensively when programming in CUDA: being able to set breakpoints, step-through at warp, block, or grid level, reading the registers/memory content, getting the PC of the instruction triggering a mem fault
19:46 pendingchaos: imirkin: where should I post the patch? nouveau@lists.freedesktop.org?
19:46 pendingchaos: should I include a subject prefix other than "[PATCH]"?
19:47 imirkin: nouveau@ is fine
19:47 imirkin: cc me
19:47 imirkin: no special prefix necessary
19:47 imirkin: it's not SUCH a high-volume list
19:48 imirkin: F1 2015 definitely hangs when you try to start a race
19:48 imirkin: hakzsam tracked it down to some value not being initialized/set properly
19:48 imirkin: which causes an infinite loop in the shader
19:49 imirkin: however it's some value in some ssbo
19:49 imirkin: so unclear why it's wrong to begin with
19:50 karolherbst: I se
19:50 karolherbst: e
19:52 karolherbst: okay, let's see if reading on the fd is actually giving me the event now :)
19:59 karolherbst: nice
19:59 karolherbst: it works
20:01 imirkin: pendingchaos: what's with the "wt 0x3f" stuff at the beginning?
20:01 imirkin: does nvidia do that?
20:01 pendingchaos: no, envysched does though
20:01 pendingchaos: in case the code is used as a function
20:01 pendingchaos: it probably could be removed
20:02 pendingchaos: (I assume the "wt 0x3f" stuff is needed in the functions in gm107.asm)
20:03 karolherbst: mhhh
20:03 karolherbst: yeah, it makes sense
20:03 karolherbst: wt 0x3f is implicit if you do "(st 0x0)"
20:04 karolherbst: "(st 0x0)" basically means "(st 0x1f wt 0x3f)" I think
20:04 pendingchaos: isn't a delay of zero used to signify dual-issue? I assumed "(st 0x0) (st 0x0) (st 0x0)" was some sort of special sequence the hardware checked for
20:06 karolherbst: st 0x0 is translated to 0x7e0
20:06 karolherbst: and for dual issuing you actually have to use "(st 0x0 yl)"
20:06 karolherbst: otherwise it doesn't dual issue
20:06 pendingchaos: ah
20:07 karolherbst: it is still some magic value
20:07 karolherbst: pendingchaos: https://github.com/karolherbst/mesa/commit/b700edc474dc4856058541bba2158fd9597967d4
20:09 karolherbst: mhh actually a lone wt 0x0 should translate into all write barriers set and max stall
20:09 karolherbst: I am not 100% sure though
20:10 karolherbst: that yield flag is also supposed to increase performance, that's why I am not exactly sure, and nvidia sets it pretty much everywhere except on a few things
20:10 karolherbst: anyhow, if we set it everywhere, perf is getting much worse
20:14 Lyude: RSpliet: there hasn't, but the patches also haven't been merged yet either. I wouldn't think the control flow would be that big of a deal since most of the codepaths that's used in are either not hot paths, or are debugging code. it also really doesn't add that much to begin with
20:15 Lyude: i didn't see anything wrong with it when I looked at it
20:21 rhyskidd: karolherbst: ^
20:21 karolherbst: :D
20:22 karolherbst: yeah... I will work more on that if I get to the trap handling stuff
20:33 pendingchaos: karolherbst: the CI error is fixed
20:33 karolherbst: cool
20:33 pendingchaos: just renamed sched.h to codesched.h
20:49 karolherbst: mhh
20:49 karolherbst: sadly that event doesn't exactly contain a lot of data...
20:50 karolherbst: https://gist.githubusercontent.com/karolherbst/3a0ca1de2025cff2ac27bacf9c2ae659/raw/417fdd8499b6acefd11ea1ab27edefef42d43b13/gistfile1.txt
20:51 karolherbst: which is fine as long as it is the only event type we are subscribing to...
20:51 karolherbst: still
20:52 karolherbst: well, we have that token field at least
20:52 karolherbst: so we can pass in some func pointers through that, or some struct with data
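Passing a struct through the token field could look roughly like the C sketch below. It assumes the token is a 64-bit integer that the kernel hands back unchanged, which this log does not confirm. `struct reset_handler` and all function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical callback bundle smuggled through the token. */
struct reset_handler {
    void (*on_reset)(void *data);
    void *data;
};

/* Pack a pointer into a 64-bit token (assumed width). */
uint64_t handler_to_token(struct reset_handler *h)
{
    return (uint64_t)(uintptr_t)h;
}

/* Recover the pointer from a token delivered with the event. */
struct reset_handler *token_to_handler(uint64_t token)
{
    return (struct reset_handler *)(uintptr_t)token;
}

/* Dispatch an event by its token; a zero token is ignored. */
void dispatch_token(uint64_t token)
{
    struct reset_handler *h = token_to_handler(token);

    if (h && h->on_reset)
        h->on_reset(h->data);
}

/* Example callback: bump an int counter passed as data. */
void count_reset(void *data)
{
    (*(int *)data)++;
}
```

The handler has to outlive the subscription, since the token is only a raw pointer value and nothing keeps the object alive.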
20:53 karolherbst: soo, of course it needs kernel patches + libdrm
20:55 karolherbst: skeggsb: https://github.com/karolherbst/nouveau/commit/def9a55e56ae90fc79d99062e11c6991a73f5d25
20:55 karolherbst: I guess we also should increase the minor version
20:56 karolherbst: or we don't and just check against EACCES
21:16 rhyskidd: pendingchaos: other than perhaps adding some documentation for envysched, it looks good to me
21:17 pendingchaos: ah yeah, the .rst stuff
21:17 pendingchaos: I should probably add stuff for that
21:48 rhyskidd: pendingchaos: good work getting envysched together
21:48 rhyskidd: well done
21:49 pendingchaos: thanks
22:06 pendingchaos: pushed some commits to add stuff to docs/envydis/index.rst and add a -p option to print "sched (st 0x0) (st 0x0) (st 0x0)"
22:52 karolherbst: imirkin: chromium checks for LOSE_CONTEXT_ON_RESET_ARB, nice
22:53 karolherbst: aware of any issues with chromium and nouveau?
23:13 karolherbst: mhhhhhhhh
23:13 karolherbst: we kind of need an event loop thingy with that...
23:14 karolherbst: or uhm, well, we could probably use dup?
23:21 karolherbst: mhh, no, dup doesn't help here
23:51 karolherbst: soo, nice
23:51 karolherbst: https://gist.githubusercontent.com/karolherbst/534fbab94b14c349b32366b1222ea88b/raw/c129550c5760f6d2719907a4f2a6236573e7af6f/gistfile1.txt
23:51 karolherbst: libdrm: https://github.com/karolherbst/drm/commit/cf79f43688d8e1ecc6833df0c38632db96e1327a
23:51 karolherbst: mesa: https://github.com/karolherbst/mesa/commit/67f5a991ba39a5ca844eace09b72e08b0972196d
23:52 karolherbst: imirkin: ^^ this is what we can currently do with the kernel interfaces
23:53 karolherbst: pmoreau: you might be interested as well