02:17 imirkin: AndrewR: i won't be able to test this on the nv42 myself for a bit, but skeggsb suggested changing this to a return false: https://github.com/skeggsb/nouveau/blob/master/drm/nouveau/nvkm/engine/fifo/dmanv40.c#L49 (for pre-nv44 only, but since you only have the one board, wtvr)
02:27 AndrewR: imirkin, unfortunately, I moved this older comp to another room and changed videocard to nv11 (because I have only one dvi->vga adapter)..so, I can't test right now, too :/ But tnx..
02:27 imirkin: ok, well, whichever of us gets to it first
09:34 Marex: karolherbst: so what did you want me to test ?
09:35 karolherbst: if you force to always set PCIe speed to 5_0, just change the pstate and check if the performance under the xrandr setup is much better
09:36 Marex: karolherbst: ah ok, mail me a patch and I'll try it asap
09:37 karolherbst: Marex: is it okay without a patch and just the modification? Because I can't clone the kernel tree here
09:38 karolherbst: Marex: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/drivers/gpu/drm/nouveau/nvkm/subdev/bios/perf.c?id=refs/tags/v4.10.8#n138
09:38 karolherbst: in the "case 0x35:"
09:38 karolherbst: add a "info->pcie_speed = NVKM_PCIE_SPEED_5_0;"
09:40 Marex: karolherbst: https://pastebin.com/6u7Y2e3c ?
09:40 karolherbst: looks good
09:40 karolherbst: after switching pstates, you should see in lspci -vv that the LnkSta goes up to 5.0 GT/s x4 Width
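The LnkSta check karolherbst describes can be scripted. A minimal sketch in Python, assuming the usual `LnkSta: Speed ..., Width x...` line that `lspci -vv` prints for a PCIe device; the `parse_lnksta` helper and the sample lines are illustrative, not real output from Marex's machine.

```python
import re

# Matches the link status line, e.g. "LnkSta: Speed 2.5GT/s, Width x4, ...".
# Newer lspci versions may append notes such as "(ok)" after the speed,
# which the "[^,]*" part skips over.
LNKSTA_RE = re.compile(
    r"LnkSta:\s*Speed\s+([\d.]+)\s*GT/s[^,]*,\s*Width\s+x(\d+)"
)


def parse_lnksta(lspci_text):
    """Return (speed in GT/s, lane count) from `lspci -vv` text, or None."""
    match = LNKSTA_RE.search(lspci_text)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


# Hypothetical sample lines, shaped like lspci -vv output.
before = "\t\tLnkSta:\tSpeed 2.5GT/s, Width x4, TrErr- Train- SlotClk+"
after = "\t\tLnkSta:\tSpeed 5GT/s (ok), Width x4 (ok)"

print(parse_lnksta(before))  # (2.5, 4)
print(parse_lnksta(after))   # (5.0, 4)
```

Running it on the output captured before and after the pstate switch shows whether the link actually went from 2.5 to 5.0 GT/s.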
09:40 Marex: uh now what was that command for debian kpkg to test a patch :)
09:41 Marex: let's see
09:42 Marex: ah test-patches :)
09:45 Marex: karolherbst: why do you think 5GT/s should work btw ?
09:47 karolherbst: doubled bandwidth
09:47 karolherbst: and because we had users with the same problem
09:47 Marex: karolherbst: I mean, why do you think the GPU should physically support that
09:47 karolherbst: because the GPU tells us
09:47 Marex: ha, OK
09:47 karolherbst: the GPU is smart enough to report three caps
09:48 karolherbst: 1. what the GPU supports, 2. what the current configuration supports and 3. ???
10:34 dboyan: karolherbst: Do you still remember the nvboost stability issue on my gk208? Do you have time to look into it or any hint how I can do it myself with the mmio trace (in case I have time)?
10:34 RSpliet: dboyan: May I just put a little note on the side on your proposal based on the discussion we had on insn scheduling yesterday?
10:34 dboyan: RSpliet: sure
10:35 karolherbst: dboyan: I just need more time
10:35 karolherbst: dboyan: give me time :p
10:35 karolherbst: or a GPU with that issue :D
10:35 RSpliet: I'd like you to keep in mind that pre-RA scheduling does not necessarily have to have a big impact on performance. If you manage to reduce the GPR count of a program from say 30 to 27 based on reduced live sets, it means you've done something good
10:36 RSpliet: even though the hardware will not schedule extra warps per SM in parallel and thus the performance isn't directly impacted
10:36 RSpliet: (Because if the same pass reduces reg usage from 34 to 31, you _will_ see a difference in perf :-))
10:37 RSpliet: Consider the use of for instance "shader-db" to measure what your benchmarks do in terms of reg alloc and insn count - along side with gaming benchmarks
10:37 RSpliet: *what your scheduling pass does
10:38 karolherbst: well, more important is to reduce stalls though
10:38 karolherbst: sure you can drastically reduce the GPR count, but especially for pixmark_piano, nvidia uses more GPRs than we do
10:38 karolherbst: and still has ~20% more perf
10:39 karolherbst: but the defs<->uses are far away
10:39 karolherbst: which is super important on nvidia gpus
10:39 RSpliet: karolherbst: the question is whether you can reduce stalls with a pre-RA scheduling pass. Don't you need RA information to know how far you can separate the defs from the uses without impacting GPR count too badly?
10:39 karolherbst: not really
10:40 RSpliet: karolherbst: it seems we differ in opinion. Interesting questions for dboyan to answer as he goes along ;-)
10:40 karolherbst: yes
10:40 karolherbst: we only need to know the amount of live values, and this is also possible prior RA
10:40 karolherbst: just a little odd in SSA form
10:40 karolherbst: but it works
10:40 karolherbst: phi handling will be a bit messy
10:41 karolherbst: or branches in general
10:41 karolherbst: but if you only look at BBs, you can do a lot already and you keep complexity low
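The point about counting live values before RA, restricted to a single basic block, can be sketched as a backward walk over the instructions. This is a toy Python model, not nouveau's codegen: instructions are hypothetical `(dest, sources)` tuples in SSA form, and `max_live_values` is an illustrative name. It ignores phis and branches, and a value that is defined but never used is not counted.

```python
def max_live_values(block, live_out):
    """Peak number of simultaneously live SSA values in one basic block.

    block    -- list of (dest, [sources]) tuples in program order
    live_out -- set of values still needed after the block
    """
    live = set(live_out)
    peak = len(live)
    # Walk backward: a value stops being live above its definition and
    # becomes live above any instruction that reads it.
    for dest, sources in reversed(block):
        if dest is not None:
            live.discard(dest)
        live.update(sources)
        peak = max(peak, len(live))
    return peak


# Same computation, two orderings. "ld" results feed "mul", then "add".
defs_close_to_uses = [
    ("t0", []), ("u0", ["t0"]),
    ("t1", []), ("u1", ["t1"]),
    ("s", ["u0", "u1"]),
    ("t2", []), ("u2", ["t2"]),
    ("r", ["s", "u2"]),
]
defs_far_from_uses = [
    ("t0", []), ("t1", []), ("t2", []),
    ("u0", ["t0"]), ("u1", ["t1"]), ("u2", ["t2"]),
    ("s", ["u0", "u1"]),
    ("r", ["s", "u2"]),
]

print(max_live_values(defs_close_to_uses, {"r"}))  # 2
print(max_live_values(defs_far_from_uses, {"r"}))  # 3
```

Hoisting the three loads away from their uses raises the peak from 2 to 3 live values, which is the trade-off a pre-RA scheduler has to weigh against latency hiding.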
10:43 RSpliet: karolherbst: this is all true. But the fact will be that separating def from usage will always increase your live-set. There's an indication that the ``artificial'' hazards introduced by RA are useful for finding a good balance between GPR usage and distance between def and use
11:27 karolherbst: dboyan: imirkin will mentor for you, right?
11:30 dboyan: yeah, I think so
11:53 karolherbst: dboyan: I was just asking, because there is a proposal which would be better suited for imirkin, but I can take care of this one as well
11:55 karolherbst: but I doubt he wants to do the other one anyway
11:55 karolherbst: so everything is fine
11:57 dboyan: Not sure about the whole process. But iirc one person can only mentor one project, and you know, there aren't many applicants this year
11:57 karolherbst: we have 3 for nouveau
11:57 dboyan: wow, that's surprising
11:57 karolherbst: as I said: it's all fine
11:58 karolherbst: Satchelboi: :O you are even on IRC
11:59 dboyan: karolherbst: just curious. Are you also a registered mentor?
11:59 karolherbst: isn't the list public?
11:59 dboyan: really?
12:00 karolherbst: no idea?
12:00 dboyan: i don't think i've seen a "mentor list"
12:02 karolherbst: well, I would like to mentor a project. I am sure it's fun
12:07 pmoreau: karolherbst: What are the two other projects?
12:08 karolherbst: pmoreau: maxwell video accel and some pascal stuff
12:09 pmoreau: Ah, the Maxwell video accel rings a bell
12:09 pmoreau: Do you remember what kind of Pascal stuff it is?
12:10 karolherbst: I'll look it up
12:11 karolherbst: REing Command Stream on Pascal for valgrind-mmt
12:11 karolherbst: at least this is what I understand
12:12 dboyan: well, pascal is nearly the same with gm200
12:12 dboyan: except something with compute
12:12 karolherbst: yeah, the proposal has a focus on CUDA/compute
12:12 pmoreau: I won’t complain if someone figures out compute on Pascal :-D
12:12 karolherbst: :D
12:12 dboyan: demmt -m 137 is working quite well here
12:13 karolherbst: I see a mentor here!
12:13 dboyan: yeah, but there are only a few command states to figure out, I guess
12:15 pmoreau: karolherbst: How did you find the proposals btw, I can’t seem to properly navigate the GSoC website.
12:15 karolherbst: you click on the organisation
12:15 karolherbst: but you are no mentor as well
12:16 pmoreau: Ah, you have to be a mentor to see them, gotcha
12:16 karolherbst: dunno
12:16 karolherbst: it seems that way
12:16 hakzsam: yes, except if the proposals are public
12:16 pmoreau: Ok
12:17 dboyan: yeah, I think only mentors can see the proposals. At least I don't see them on the site
12:17 karolherbst: hakzsam: I guess you don't want to mentor any of the nouveau proposals?
12:18 hakzsam: karolherbst: "want" is probably not the right term. Not enough time is most likely the reason
12:18 hakzsam: I know just nothing about video accel and dboyan already has a mentor
12:18 hakzsam: and not the worst :p
12:19 dboyan: actually nvidia provides documentation about compute launch descriptor for pascal...
12:19 hakzsam: correct.
12:20 dboyan: although no idea if it's enough to get it working
12:20 hakzsam: should be a good start
12:21 hakzsam: but valgrind-mmt still doesn't work with latest blob, right?
12:21 hakzsam: at least, I think it can't decode compute for some reasons
12:22 dboyan: it can't deassably compute programs
12:22 karolherbst: hakzsam: thing is, I don't know that much about video accel as well, but imirkin does :(
12:22 dboyan: s/deassably/deassembly
12:22 karolherbst: I can help out with the tracing stuff and so on
12:23 hakzsam: dboyan: we only need a way to read/decode the descriptor in a first time
12:23 hakzsam: and because pascal uses the same ISA as maxwell, I don't think they changed a lot of things
12:23 hakzsam: karolherbst: you can always be an unofficial mentor
12:24 karolherbst: hakzsam: how uncool is that
12:24 dboyan: ah, the GP104_COMPUTE is also different
12:24 karolherbst: either full or none at all :D
12:24 hakzsam: karolherbst: well, imirkin is not always here :)
12:24 karolherbst: yeah, I can help out occasionally, and the helps out with the maxwell video stuff
12:24 karolherbst: *he
12:25 hakzsam: dboyan: what's your timezone btw?
12:25 dboyan: UTC+8
12:25 karolherbst: uhh wow
12:25 karolherbst: both have an offset of 6 hours to my location
12:25 karolherbst: but for imirkin it's more like 12
12:25 karolherbst: and 0 for the other one
12:26 karolherbst: imirkin: I guess we could both work with both, and one of us takes responsibility and the mentoring overhead for one or something like this
12:26 pmoreau: hakzsam: One thing that didn’t work with recent version of the blob was the detection of the chipset, due to a change of the structure used (it needs an extra field).
12:26 hakzsam: pmoreau: go ahead and fix it up! :)
12:27 dboyan: hakzsam: I forgot, there are also things like "GP104_COMPUTE.0x2a4 = 0x3000000" in the trace. I don't know if it is the same on maxwell
12:27 hakzsam: dboyan: no clue, maybe the doc describes this field?
12:27 pmoreau: hakzsam: The problem is, does it work if you pass a bigger structure to what an old version expected?
12:27 dboyan: hakzsam: There is no doc for this I guess
12:28 dboyan: this might be one part of REing
12:28 hakzsam: pmoreau: can't you detect the version? I'm not very familiar with valgrind-mmt to be honest
12:28 hakzsam: dboyan: yeah, but hopefully most of the fields are already documented
12:29 pmoreau: hakzsam: Detect the version of the blob and pass in the correct structure? Maybe…
12:29 dboyan: hakzsam: only 6 of them except the already known "UNKxxx" in my trace
12:30 hakzsam: ok
12:31 hakzsam: I think it shouldn't be hard to make it work on pascal though
12:32 dboyan: yeah, i agree. it's only a matter of time
12:47 dboyan: hakzsam: I was wrong just now. There are dozens of other unknown states of GP104_COMPUTE. There is some more work then.
12:47 Satchelboi: karolherbst: I'm here, sorry. Just a little quiet right now.
12:47 dboyan: most of them are zeros
12:48 hakzsam: if they are zeros, all is good :)
12:48 karolherbst: Satchelboi: no problem. As it turns out, I will most likely mentor your project
12:49 karolherbst: Satchelboi: I just don't know what mupuf sent to you, but I think this was just related to getting a commit inside Mesa?
12:50 karolherbst: Satchelboi: we can also discuss things in private, if you feel better this way, just project related things should remain inside this channel :p
12:50 Satchelboi: karolherbst: It was, yeah. I received some prereq work to verify I can do what I proposed for the project.
12:50 karolherbst: what is the work about?
12:51 karolherbst: but it is fun reading your proposal at least :)
12:52 Satchelboi: Pretty easy task. Download and compile mesa locally, make a trivial patch, send it with git-send email and rework until it's accepted.
12:52 karolherbst: is there a ticket or something for this? I am simply interested what it is
12:52 karolherbst: or can you choose anything?
12:53 Satchelboi: I can choose anything, but I think there actually is a ticket. I pulled that proposal idea from the idea list x.org provided on gsoc.
12:53 Satchelboi: Found the ticket: https://trello.com/c/mvf6HDsA/30-vp6-maxwell-support
12:54 dboyan: Satchelboi: I guess you will find https://github.com/imirkin/re-vp2 useful
12:55 Satchelboi: dboyan: Those would be extremely useful actually, thanks. I've been spending the last few days learning my way around the envytools I'll need, so I'll put this in that pile of things too!
13:01 karolherbst: Satchelboi: concentrate on the Mesa patch first :p With everything else I can also help you if really needed. But I was referring to a ticket for the patch, not the VP6 work
13:02 karolherbst: Satchelboi: fyi: my local time is +6 of yours
13:04 Satchelboi: karolherbst: Oh whoops. There was no ticket for the patch, no. Also, time difference noted for any questions or contact.
13:04 Satchelboi: I've been working with someone through this proposal, so I might poke them if something comes up at a bad time too.
13:04 karolherbst: and I also work from 10 to 19 o clock here. But I am sure I can work something out
13:04 karolherbst: Satchelboi: let me guess, his name starts with an "M"
13:05 Satchelboi: No actually, haha
13:05 karolherbst: ohh what :D
13:05 Satchelboi: Lyude has been helping me the whole way through so far, and definitely helped me get farther than I could have hoped to on my own.
13:06 karolherbst: ahh I see
13:08 Satchelboi: For now though, I'll put a rush on working on a patch between classes. I have until April 16th, but I was advised pushing a patch could take up to a week to do.
13:08 karolherbst: yes
13:09 karolherbst: what do you want to work on for the patch?
13:10 karolherbst: Satchelboi: do you think you get a patch ready this weekend?
13:10 Satchelboi: I'm not sure yet to be honest. Since it wasn't specific on which patch to work on, I'll look through the Trello and see if anything is posted there.
13:11 Satchelboi: I'm aiming to have it complete by this weekend for submission, yes. Sooner if possible.
13:12 Satchelboi: Excuse any poor typing right now either please, I just woke up and haven't made it to the coffee part of the morning yet.
13:12 karolherbst: mhh, most of the trello cards are rather big though. No idea how much time you have and I would prefer something kind of related to your proposal
13:13 karolherbst: I know tons of simple patches for the kernel driver though
13:13 karolherbst: or compiler related things
13:13 Satchelboi: I've already been working towards one of the trello cards, but I don't see it being done by this weekend. Suggestions would be a huge help
13:13 karolherbst: Satchelboi: we can do it like this: you think about what you want to work on until tomorrow, and if you don't come up with anything, I will decide :p
13:14 Satchelboi: Sounds good. If I get something, I'll poke you with it for a look over.
13:14 karolherbst: nice
13:14 Satchelboi: I can give you some other places to reach me too if you want, since I don't know how often irc is available for you.
13:14 karolherbst: (I also read the log of the channel... usually, sometimes I forget)
13:15 karolherbst: by the way. I also don't know what you have discussed with Lyude. Not that he actually wants to mentor you and I am stealing you or something like this :p
13:16 Satchelboi: Haha no, I don't think there'll be any contest there. She was helping me get set up and learn my way around things, but I imagine it was a bit of a strain on her own work to do.
13:19 karolherbst: nice
13:22 dboyan: Satchelboi, one piece of advice here. If you want some "trivial" patch really quick, you can fix some typos or coding styles in the code. That'll be easier than implementing some functionality
13:23 dboyan: Satchelboi: https://cgit.freedesktop.org/mesa/mesa/commit/?id=052b3d4e2f159038137504f01e9ff2380a67af8b is one of my patches long ago
13:24 karolherbst: or don't start with boring stuff altogether :p
13:25 Satchelboi: dboyan: I'll take a look at that too for ideas. I'm really not sure where to start for searching for patches to work on
13:26 karolherbst: Satchelboi: do you have any special interests?
13:27 karolherbst: Satchelboi: I could think of something like this for the first patch, just without the video files of nouveau: https://github.com/karolherbst/mesa/commit/d12d1cdbd3f26f4f9437bc4d03e4870279a83354
13:27 karolherbst: aka, change an important constant and see what breaks
13:27 Satchelboi: karolherbst: No interests in particular, just something to work towards.
13:27 Kresp: my system freezes whenever i try to play video with vdpau decoding. tried mpv and vlc - problem happens with both. software decoding works fine. GPU is GT9800 (G92). machine is still running - I can ssh into it, pkill X and run startx to restore it. /var/log/messages has a bunch of "DRM: skipped size 0" messages and one timeout at
13:27 Kresp: drivers/gpu/drm/nouveau/nvkm/engine/fifo/chang84.c:111/g84_fifo_chan_engine_fini()! kernel is 4.10.6; nouveau is 1.0.13. vdpauinfo shows support for H264 decoding. nvidia-firmware is installed (340.32). how do i troubleshoot this further?
13:27 karolherbst: Satchelboi: what I did was basically I changed a constant, and I fixed up all the places where the constant was also used... just directly
13:28 karolherbst: preferable inside the vdpau related source files
13:28 karolherbst: but
13:28 karolherbst: I don't know if there is an issue like this
13:28 karolherbst: most likely though
13:28 Satchelboi: In that case, I'll check to see if just that issue exists.
13:29 karolherbst: but I leave my suggestions until tomorrow. It was just a hint if you think "fixing whitespaces" is too boring
13:29 karolherbst: (it is too boring, that's why I never did something like this to begin with)
13:29 dboyan: yeah, "fixing whitespaces" is something like the last straw
13:30 karolherbst: you get a minus point just for suggesting this already :p
13:31 karolherbst: Satchelboi: everything with *malloc*(magic value * magic value) is a good start
13:31 karolherbst: Satchelboi: https://github.com/karolherbst/mesa/commit/d12d1cdbd3f26f4f9437bc4d03e4870279a83354#diff-d2148248b5e530db60229aa9a7d81e40L1240
13:31 karolherbst: this was basically my starting point
13:32 karolherbst: reducing it is fun, cause you hit all those crazy out of bounds bugs
13:32 dboyan: karolherbst: you're right, I do think reading code and finding interesting things inside is way better. And actually that was one of the two most trivial patches.
13:34 Satchelboi: Awesome, looks like that would make a good point to start at then.
13:34 karolherbst: Satchelboi: okay, change of plans, I won't have enough time today I guess, to find a proper issue, but tomorrow I will have enough. So _maybe_ I may come up with something tomorrow or otherwise the day after
13:35 Satchelboi: That's fine. In the meantime I'll work with what you guys just proposed and see what I can pull out of that.
13:36 karolherbst: nice :) as long as you like the idea, it's perfect
13:38 imirkin: Satchelboi: the idea of the patch is to demonstrate that you know how to operate computers, git, and code editors.
13:38 imirkin: Kresp: unfortunately vdpau is broken on G92
13:39 imirkin: Kresp: i recently acquired a G92 myself to look into what's going on. thus far, without much progress.
13:39 Kresp: imirkin: that's unfortunate
13:39 nyef: imirkin: Broken for nouveau, or broken generally?
13:39 Kresp: any alternatives?
13:39 Kresp: vaapi?
13:39 imirkin: Kresp: use nvidia blob drivers, or get a different gpu, or don't use h264 vdpau decoding.
13:40 nyef: ... Guess that answers my question.
13:40 Kresp: nyef: DXVA copyback works fine, just checked, so probably for nouveau only
13:40 imirkin: for some reason the VLD engines hangs
13:40 imirkin: i haven't had a chance to investigate properly yet
13:40 imirkin: just confirmed that it also happens on my G92
13:41 Kresp: imirkin: there's one problem. 9800GT is EOL'd past 340.xx driver. I want to run GTX960 at the same time. is it possible to have both versions installed and running, older - for 9800GT and newer - for GTX960?
13:42 imirkin: with nvidia blob drivers? not sure. only if 340.x supports GTX 960. it might.
13:43 imirkin: note that nouveau does not presently support any accelerated video stream decoding on GTX 960.
13:44 dboyan: unfortunately, 340 series doesn't seem to support maxwell
13:45 dboyan: oh, it seems to support gm10x
13:45 dboyan: not 960
13:46 imirkin: ah yeah. that sounds right
13:46 imirkin: i think gm20x was in 350.x or so? it was all so long ago.
13:46 imirkin: Kresp: you should be able to run nvidia blob and nouveau at the same time though, if you wanted...
13:46 imirkin: i.e. use a newer blob driver, which will only bind to the new device
13:46 imirkin: and then load nouveau, which will bind to the remaining device.
13:47 imirkin: i've never tried such a thing
13:47 Kresp: thanks. I'll look into it
13:48 dboyan: imirkin: I still have no idea how clockhi/lo get reset across runs. Do we need to solve that before landing ARB_shader_clock?
13:50 imirkin: Kresp: why do you want both gpu's plugged in?
13:50 imirkin: dboyan: i don't think so. do nha's piglits pass?
13:50 Kresp: imirkin: passthrough 960 to windows VM while keeping terminal/browser open on 9800
13:51 imirkin: Kresp: ah. so then actually the 340.x drivers are perfect for you
13:51 imirkin: Kresp: they'll bind to the 9800, and leave the 960 alone, which you can then pass through in a vm :)
13:51 imirkin: you could also use the pci-stub driver to block a driver from binding to it
13:52 dboyan: imirkin: haven't tested yet. But I can test it now
13:52 Kresp: yeah. I wanted to try something similar for a long time, but only now have gotten HW with proper IOMMU support
13:56 imirkin: dboyan: please do run them...
14:08 mupuf: Satchelboi: Sorry, I did not specify anything as to what the patch could be. Even a documentation fix would be acceptable for me
14:12 dboyan: well, CMake is always a magic to me :/
14:13 imirkin: dboyan: yeah, it's basically unusable.
14:16 dboyan: imirkin: both tests (clock{,2x32}.shader_test) pass on my machine
14:19 karolherbst: Satchelboi: also, while doing the GsoC related things, please CC me in every mail to the ML
14:22 Satchelboi: mupuf: Alright, thanks for the clarification
14:22 imirkin: dboyan: excellent =]
14:22 Satchelboi: karolherbst: I'll make sure I do that from now on
14:23 karolherbst: thanks :)
14:23 karolherbst: I think I have to document stuff or be at least aware of what you do in that period. Makes it easier for me
14:23 imirkin: RSpliet: btw, i disagree that pre-RA scheduling won't improve perf. one of the big points of pre-RA scheduling is to make sure to insert enough "unrelated" instructions to hide latency of operations.
14:25 mupuf: karolherbst: whatever works best for you
14:25 Satchelboi: karolherbst: I'll keep a list of what I do too, I can send it along as its updated.
14:27 dboyan: imirkin: then I'll resend 2 and 3 of clock series. I also added support for nv50 in 3/3
14:28 karolherbst: Satchelboi: you can configure git send-email (the sendemail.cc setting) to have a list of CCs it always uses
14:28 karolherbst: makes things easy
14:28 Satchelboi: Oh, well in that case.
14:28 karolherbst: Satchelboi: it isn't as important in the replies, but if you reply to all it should include me as well then
14:29 karolherbst: but having the starting points of the threads is good enough
14:29 imirkin: dboyan: oh cool. i can test it for you.
14:30 RSpliet: imirkin: that's true for both pre-RA and post-RA. Pre-RA has the advantage of greater freedom (but comes with greater responsibility!), but on the downside it's difficult to judge whether there will be false-dependency related hazards introduced by the RA that follows
14:31 imirkin: RSpliet: well, at least it allows the RA to do the right thing ;)
14:31 imirkin: needless to say, both are important.
14:31 RSpliet: of course, with such large amounts of HW threading the question is whether these hazards matter
14:32 karolherbst: if you miss out pre RA scheduling, post RA scheduling hasn't enough freedom anyway
14:32 RSpliet: but it remains an interesting question what strategy is best performed at what stage...
14:32 karolherbst: true
14:32 karolherbst: placement is important in pre RA
14:33 karolherbst: because you have to move stuff less far in post RA
14:33 karolherbst: if you get it roughly right in pre RA, you can make it perfect in post RA
14:34 RSpliet: well, the only certainty I'd accept is that you have to limit your live test pre-RA... that is a bit of a worthless operation post-RA :-D
14:35 RSpliet: *live sets
15:16 Kresp: I installed nvidia-drivers. lspci -k lists both nouveau and nvidia as loaded for the video card. but if I echo device ID to /sys/bus/pci/drivers/nouveau/unbind, nvidia driver does not take over and display just goes dark until reboot. how do I restrict driver from loading at all for one selected device?
15:17 karolherbst: Marex: any luck with the pcie patch?
15:17 dboyan: Kresp: you can blacklist certain modules
15:17 imirkin: Kresp: i thought you wanted to use blob for the 9800GT and nothing for the GTX960
15:18 Kresp: imirkin: no. I want to pass GTX960 back and forth, to VM when I need windows and back to host afterwards
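To see which kernel driver currently owns a device (the same information `lspci -k` shows), the `driver` symlink under `/sys/bus/pci/devices/<address>/` can be read directly. A small Python sketch; `bound_driver` is an illustrative helper name and the PCI address in the usage line is a placeholder, not Kresp's actual device.

```python
import os


def bound_driver(pci_addr, sysfs_root="/sys"):
    """Return the name of the driver bound to a PCI device, or None.

    pci_addr   -- full address such as "0000:01:00.0"
    sysfs_root -- overridable so the function can be tried on a fake tree
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_addr, "driver")
    # The "driver" entry is a symlink that only exists while a driver is bound.
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


# Placeholder address; substitute the one lspci reports for the card.
print(bound_driver("0000:01:00.0"))
```

Checking this before and after an unbind shows whether anything took over the device or it was simply left without a driver.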
15:18 Mortiarty: imirkin, did you have any luck in figuring out the vdpau artifact problem?
15:18 imirkin: Mortiarty: haven't started yet
15:19 Marex: karolherbst: I built the kernel, but didn't reboot the machine yet
15:19 Marex: karolherbst: I'm currently working on some v4l2 stuff ...
15:19 karolherbst: okay :)
15:20 karolherbst: nice
15:20 Marex: no, not really :)
15:24 karolherbst: huh? v4l2 is very important :O
15:25 RSpliet: Marex: for capture devices, or for android/embedded video decoders?
15:33 karolherbst: Satchelboi: I am looking forward to working on the project with you. One thing in advance: if you think I do a crappy job, just kick me until I have no other chance than to do my job properly :p In most cases I won't feel personally attacked when somebody points at me and tells me what I did wrong. To be completely honest, this will be my first mentoring. I'll try to do my best, but don't hesitate to tell me if you
15:33 karolherbst: are unhappy about anything
15:56 Marex: RSpliet: mx6 IPU CSI
15:56 Marex: RSpliet: using that patchset from Steve
17:02 jamm: how does nouveau's codegen work?
17:02 jamm: I'm trying to understand the instruction set used in the shader/*.fp files, but all I got is http://docs.nvidia.com/cuda/cuda-binary-utilities/#maxwell-pascal
17:05 imirkin_: jamm: that's like going to #gcc and asking "how does gcc work"
17:07 imirkin_: jamm: the isa formatting is defined by envydis/gm107.c
17:07 imirkin_: it's roughly modeled after the output of nvdisasm
17:19 jamm: imirkin_: right, sorry for the broad question. I'm trying to understand on a higher level on how it factors into shader execution
17:20 imirkin_: it's a compiler
17:20 imirkin_: it gets stuff on input, and produces an executable
17:20 imirkin_: much like gcc :)
17:25 jamm: ah, right, got confused there for a while.. i think i dove in too early, will have to read up a lot more than i thought, but sounds like fun ^_^
17:26 imirkin_: shaders execute very much like regular CPU programs. they're a sequence of opcodes, they get data on input, and produce data on output.
17:27 jamm: i've written shaders in the past, but that was on windows and webGL so far
17:31 jamm: ah, makes sense. So shader programs (running on GPU) are compiled using shader compilers (like the one on nouveau's codegen) and normal, CPU programs are compiled using our usual gcc,clang etc.
17:32 jamm: it's just that the shader runs on a different language and executed on the GPU chip rather than CPU because the codegen spits out opcodes understandable by the GPU instruction set..
17:32 jamm: i could be missing something above, but am i looking at the right direction?
17:37 imirkin_: that's correct
17:37 jamm: okay, my understanding of codegen was wrong, so the shader compilers (the frontend) compile into an intermediate language, which is then consumed by the codegen, something like a backend?
17:37 imirkin_: sure, you can think of it that way
17:37 jamm: great, thanks! at least i know i am at the right direction, will be looking more into this
18:23 Satchelboi: karolherbst: Looking forward to working with you too! Don't be afraid to kick me for things or be blunt either, it's the most efficient way to tell me I need to change something. Hoping I won't be the cause of too many headaches over the duration of this ;)
18:39 mupuf: Satchelboi: oh, you can bet both of you will have headaches. But ultimately, what matters is to get you up to speed and productive! :)
19:12 karolherbst: Satchelboi: :)
19:31 Satchelboi: I was hoping my last minute proposal submission wasn't the source of any first headaches either >~>
20:05 karolherbst: Satchelboi: nope, as long as it is well written it is fine. Although personally I would prefer if there was a "I will write a proposal for" 2 weeks earlier
20:30 karolherbst: Satchelboi: that way we can deal with the "are you able to do the stuff" prior the deadline and we are all set
20:32 Satchelboi: karolherbst: Expect much more frequent updates on that than the proposal, haha.
20:32 karolherbst: :) hopefully
20:33 karolherbst: Satchelboi: I have to write a report after the first month
20:33 karolherbst: I think
20:33 karolherbst: I have to go through the mentoring stuff in detail once more
20:33 karolherbst: and yes, the most headache will be REing that VP6 engine
20:33 karolherbst: Satchelboi: what nvidia GPUs do you have?
20:34 Satchelboi: karolherbst: The first evaluation isn't due until June 26th it said for the project. I wasn't even expecting an answer for another month
20:34 karolherbst: :p
20:34 Satchelboi: Right now I have a gtx 660, but I have access to several others from a friend. I can ask her for a list of what she has.
20:35 karolherbst: mhh
20:35 karolherbst: there are VP6 and VP7 based GPUs
20:35 karolherbst: VP7 is basically a VP6+
20:35 karolherbst: GM206 GPUs have VP7
20:35 karolherbst: I think it still makes sense to have a GPU with VP6 and start with this
20:35 karolherbst: and later add support for VP7
20:36 Satchelboi: So I'll need GM107 and GM206?
20:36 imirkin_: allegedly the format should be quite similar to what VP3+ all have
20:36 karolherbst: Satchelboi: any maxwell besides GM206 has VP6
20:36 karolherbst: GM107, GM108, GM200 and GM204
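The chipset split karolherbst lists can be written down as a small lookup. A sketch in Python that only encodes what is said in this discussion (GM206 has VP7, the other named Maxwell chips have VP6); the names `VIDEO_ENGINE` and `video_engine` are made up for illustration.

```python
# Maxwell chipset -> video decode engine generation, as stated in the chat.
VIDEO_ENGINE = {
    "GM107": "VP6",
    "GM108": "VP6",
    "GM200": "VP6",
    "GM204": "VP6",
    "GM206": "VP7",  # described as basically a "VP6+"
}


def video_engine(chipset):
    """Return "VP6" or "VP7" for a known Maxwell chipset, else None."""
    return VIDEO_ENGINE.get(chipset.upper())


print(video_engine("gm107"))  # VP6
print(video_engine("GM206"))  # VP7
```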
20:37 Satchelboi: Alright. GM107's are pretty cheap now, so I can grab one if I need to
20:40 karolherbst: or you RE NVENC :P :O
20:40 Satchelboi: I can get a gm107! So all set there
20:40 karolherbst: but I guess it is too late for that now ;)
20:41 karolherbst: decoding is all nice and so, but _encoding_ is even more awesome :p
20:41 karolherbst: well there is also next year
20:41 Satchelboi: There's always time to look into it if the project moves faster than expected too.
20:42 karolherbst: I doubt that, but yes
20:42 Satchelboi: Optimistic thinking!
20:42 karolherbst: I think imirkin_ needed 3 months full time for one VP engine
20:42 karolherbst: your proposal is indeed very optimistic
20:43 imirkin_: well, i didn't know anything about nouveau at the time
20:43 imirkin_: and it was an entirely different format from the already-existing VP4 impl
20:44 imirkin_: whereas i think that VP6+ are identical to VP3/4/5
20:44 imirkin_: (except obviously some new stuff for H.265 added in GM206)
20:44 karolherbst: would be nice, yes
20:44 imirkin_: the programming interface will be different
20:44 imirkin_: since it's a single engine now rather than 3
20:46 karolherbst: nice
20:49 mooch2: i have a gm107
20:49 mooch2: it's my main gpu :^)
20:50 karolherbst: mooch2: but you didn't write the proposal :p
20:51 Satchelboi_: imirkin: I don't know a ton about Nouveau right now, so most of it will be learning the ropes
20:51 Satchelboi_: mooch2: I have something set up for the necessary equipment now luckily
20:52 Satchelboi_: Its a little funny considering my primary desktop is AMD powered right now
20:52 imirkin_: Satchelboi_: probably best not to be using nouveau as your main driver
20:52 imirkin_: makes experimentation easier
20:55 karolherbst: VDPAU decoding can be offloaded as well, right?
20:56 imirkin_: you can always start another X server if need be
20:58 karolherbst: makes sense
20:59 karolherbst: Satchelboi_: by the way, I won't say the project failed if you don't get any patches merged. As long as most of the required bits are reverse engineered I will be happy enough. Implementing is low priority in my eyes
21:00 karolherbst: also I would prefer more, smaller patch series over fewer big ones
21:00 Satchelboi_: imirkin: Don't worry, I'm not using it right now as my driver
21:00 karolherbst: patch series can also be against envytools for the rnndb database to document all the regs and so on
21:01 Satchelboi_: karolherbst: The reverse engineering was the primary focus of it all, so that's going to be my main area of effort
21:01 karolherbst: nice
21:01 karolherbst: because you planned a little more than 1 month for the REing part
21:01 karolherbst: I would just plan 3 months for this and spend any leftover time on implementing
21:02 Satchelboi_: I honestly wasn't sure how long either of those were going to take, so I just went with some blocks that seemed balanced.
21:02 karolherbst: you can then implement stuff after the project is finished as well, that would be fine by me and I am sure nobody else will complain as well
21:02 karolherbst: yeah, there are reevaluating phases as well, just saying how I see things
21:03 imirkin_: well, the way RE often works is that you "finish" the RE, go to implement it, and realize you're only 20% done with the RE :)
21:03 karolherbst: having code is not the requirement for the GSoC project as long as the stuff you create is useful
21:03 karolherbst: and this as well
21:04 karolherbst: implementing is your smallest concern
21:04 Satchelboi_: imirkin_: Oh dear. I'll try and account for that happening then
21:04 imirkin_: [unless you're very experienced with RE, which i'm guessing isn't the case]
21:04 karolherbst: REing that PCIe stuff was fun, cause that happened like always
21:04 karolherbst: ohh I am done, no I am not
21:05 karolherbst: I think my reclocking patches were like 95% REing in total
21:06 imirkin_: for the H.264 stuff i had to go back and forth, find additional flags that need to be set, etc.
21:06 Satchelboi_: imirkin_: Very far from the case
21:06 imirkin_: esp since i'm not an expert on H.264 encoding
21:06 imirkin_: and there still remains an H.264 bug
21:06 imirkin_: across all families
21:06 Satchelboi_: I have some experience with skills I can use for RE, but not with RE itself
21:07 imirkin_: feels like some motion vector buffer is undersized or something, but i haven't been able to make a dent in it thus far.
21:08 karolherbst: increase all buffers :p
21:09 imirkin_: if only i'd thought of that...
21:09 karolherbst: you're welcome :D
21:11 Satchelboi_: karolherbst: I have to head out for a while now and won't be back until much later, so i'll check in with you tomorrow for the patch idea
21:12 karolherbst: much later I sleep
21:12 karolherbst: good :)
23:14 skeggsb: RSpliet: i have a fix for your dmi issue, will push it later today
23:17 nyef: skeggsb: Got a minute or two to talk about the stereoscopy patch series?
23:18 RSpliet: skeggsb: thanks!!
23:18 skeggsb: nyef: i haven't looked any further since last time, still planning on that after i move (this weekend) when i can test it :P
23:19 RSpliet: skeggsb: nothing but a little goof-up?
23:19 skeggsb: m->v.blankus is a u16, which is enough for the result, but not enough for the intermediate calculation steps :P
23:20 RSpliet: ... ah!
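[Editor's note: the kind of bug skeggsb describes can be sketched in C as follows. This is only an illustration, not the actual nouveau code: the function names, the formula (lines * pixels per line * 1000 / clock in kHz) and the sample numbers are all assumptions made up for the example. The point is that the final value fits in 16 bits, but the intermediate products do not, so they must be held in a wider type.]

```c
#include <stdint.h>

/*
 * Hypothetical helper: every intermediate step is stored back into a
 * 16-bit variable, so each product is silently truncated modulo 65536.
 */
static uint16_t blank_us_narrow(uint16_t lines, uint16_t htotal,
                                uint32_t clock_khz)
{
	uint16_t us;

	us = (uint16_t)((uint32_t)lines * htotal); /* truncated here */
	us = (uint16_t)((uint32_t)us * 1000u);     /* and again here */
	us = (uint16_t)(us / clock_khz);
	return us;
}

/*
 * Same hypothetical calculation, but the intermediates live in a
 * 64-bit variable and only the final result is narrowed to 16 bits.
 */
static uint16_t blank_us_wide(uint16_t lines, uint16_t htotal,
                              uint32_t clock_khz)
{
	uint64_t tmp;

	tmp = (uint64_t)lines * htotal;
	tmp *= 1000u;
	tmp /= clock_khz;
	return (uint16_t)tmp;
}
```

[With the made-up inputs lines = 43, htotal = 2200, clock_khz = 148500, the wide version yields 637 while the narrow version yields 0, because 43 * 2200 = 94600 already exceeds the 16-bit range.]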
23:20 nyef: skeggsb: That's fair, I guess. In the meantime, I've done a forward-port to 4.11-rc5 (one function name had been changed) and typofixed one of the patches, and was thinking to send the result out as a v3 series.
23:21 skeggsb: nyef: hmm, if you're going to do that, i'll have another look over today and check for obvious issues - to save you a potential -v4
23:21 nyef: Thank you.
23:22 skeggsb: i didn't notice anything on my first glance through :)
23:22 nyef: Well, that's encouraging. (-:
23:23 nyef: One thing that you might run into is that drm_crtc_get_hv_timing() has been renamed to drm_mode_get_hv_timing() at some point along the way.
23:23 skeggsb: ack
23:23 nyef: (That's the revision for the forward-port.)