02:09 dboyan: cmake is clearly doing something stupid in my deqp build =/
02:10 dboyan: a stray -L eats an input .o file
02:10 Teklad: Why is there a stray -L?
02:10 dboyan: no idea
08:35 leberus__: karolherbst: did you happen to have some time to review all patches? I've seen you commented on one of the patches but i wasn't sure if you have reviewed all of them and that's the only thing you found. Thanks in advance
08:36 karolherbst: leberus__: they look fine otherwise, I am just unhappy with hwmon, not your patches
08:36 karolherbst: I would rather fill in a few structs with func pointers instead of doing this method dispatching
08:40 karolherbst: mhhhhhhhhh I have an idea :O
08:40 karolherbst: there is a binary package for nvidia GPUs for mac os x, maybe we should do a binary comparison of the Linux and Mac OS X drivers
08:40 karolherbst: and check for big blobs which are nearly identical
08:52 leberus__: karolherbst: ok, then i'll make v5 dropping the check you pointed out, so it can be merged afterwards maybe
11:19 RSpliet: karolherbst: you'd be surprised how much NVIDIA shares between OSes
11:19 RSpliet: oh
11:19 RSpliet: nope, still here
11:20 karolherbst: binary format?
11:20 karolherbst: it's not like the binaries will be the same
11:20 RSpliet: they implement a lot of OS primitives *in* the blob so that they can run it OS agnostic. Tiny bit of OSS glue around it to make it play nice with the operating system it actually runs on
11:20 karolherbst: I know?
11:21 karolherbst: but if you compile something for Linux and then for windows, it looks completely different if you look at the binaries
11:21 karolherbst: well not completely
11:21 karolherbst: but there are significant enough changes
11:22 karolherbst: the entire code section will differ significantly
11:22 RSpliet: it's worth a try, but I guess it depends a lot on the compilers they use
11:23 karolherbst: sure
11:24 RSpliet: they seem to link against glibc on Linux, so you might be lucky ;-)
11:26 RSpliet: oh wait, wrong file I'm looking at
11:26 RSpliet: sounded weird to me in the first place
11:28 karolherbst: I am sure the blobs are inside the kernel module anyway
11:28 karolherbst: so, no libc
11:28 RSpliet: exactly
11:28 RSpliet: nv-kernel.o_binary
11:30 karolherbst: name of the mac os x file?
11:30 RSpliet: no that's the one in the linux package, after running with --extract-only
11:30 karolherbst: yeah well
11:30 karolherbst: but yeah, should be inside there somewhere
11:31 RSpliet: elf headers don't give away a lot of info on the used compiler
11:31 RSpliet: other than OS/ABI: UNIX - System V
11:31 karolherbst: doesn't matter
11:31 RSpliet: merely curious
11:51 dboyan_: karolherbst: why not diff the driver binary before and after the support of a certain hardware?
11:52 karolherbst: mhhhhh
13:11 RSpliet: dboyan_: what do you expect to find from that?
13:14 dboyan_: some extra things that might be interesting
13:15 dboyan_: I think only comparing .rodata section is enough
13:33 dboyan_: karolherbst, RSpliet: Actually I compared modified hexdump output of the .rodata section of two consecutive versions of the blob's kernel module. Most of the differences have nearly equal sizes on both sides. And the newer version has a few extra sections.
13:40 karolherbst: yeah, but the big changes are interesting like the falcon blobs
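The comparison dboyan_ describes can be approximated with a small sketch. This is not the script used in the channel (that one diffed hexdump text); it is a simplified, position-based comparison of two byte blobs, and the function name `diff_regions` and the sample data are made up for illustration. A position-based comparison will report everything after an insertion as changed, so it only hints at where large new blobs (such as falcon firmware) might sit.

```python
def diff_regions(old: bytes, new: bytes, chunk: int = 16):
    """Return (start, end) offset pairs where the two blobs differ.

    The blobs are compared chunk by chunk at identical offsets;
    adjacent differing chunks are merged into one region.
    """
    regions = []
    start = None
    length = max(len(old), len(new))
    for off in range(0, length, chunk):
        same = old[off:off + chunk] == new[off:off + chunk]
        if not same and start is None:
            start = off                    # a differing region begins here
        elif same and start is not None:
            regions.append((start, off))   # the region ended at this chunk
            start = None
    if start is not None:
        regions.append((start, length))    # region runs to the end of the data
    return regions


# Hypothetical stand-ins for the .rodata bytes of two driver versions.
old = bytes(64)
new = bytearray(old)
new[20] = 0xFF              # a small in-place change
new += b"extra-firmware"    # data that only exists in the newer version

print(diff_regions(old, bytes(new)))
```

Large regions that exist only in the newer blob are the interesting candidates; small same-size changes are more likely version strings or tables.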
14:05 dboyan_: imirkin: I ran dEQP on my gp107, most of dEQP-GLES31.functional.compute.indirect_dispatch.* tests failed
14:05 dboyan_: also, the test crashed midway
14:06 dboyan_: during one of the dEQP-GLES31.functional.draw_indirect.compute_interop tests
14:07 hakzsam: expected fail IIRC
14:07 hakzsam: ie. the "we submit too fast" issue
14:33 dboyan_: otherwise, compute related tests seemed okay generally
14:34 hakzsam: cool
14:39 technohacker: hello devs :)
14:39 technohacker: i was reading the nouveau FAQ about power management
14:40 technohacker: i have a Geforce GT 525M (NVC0)
14:40 technohacker: is there any way i can get it to stop heating up?
14:41 technohacker: the video card is disabled when idle (as vgaswitcheroo shows)
14:41 technohacker: however, HDMI Audio remains permanently on
14:42 karolherbst: technohacker: is the GPU still warm enough, so that you can assume it isn't completely off?
14:42 technohacker: yes, my laptop case feels warm to the touch even when idle
14:43 technohacker: here is the vgaswitcheroo output
14:43 karolherbst: technohacker: 1. did you try the most recent kernel? 2. did it work before?
14:44 technohacker: 0:IGD:+:Pwr:0000:00:02.0
14:44 technohacker: 1:DIS: :DynPwr:0000:01:00.0
14:44 technohacker: 2:DIS-Audio: :Pwr:0000:01:00.1
14:44 karolherbst: technohacker: can you switch the audio card to DynPwr as well?
14:44 technohacker: i'm on Arch linux, kernel 4.10.12-1
14:45 technohacker: i'm not sure as to how i can switch power states
14:46 technohacker: for 2, i was on nvidia proprietary with bumblebee
14:46 technohacker: so it kept the card switched off
14:47 karolherbst: Lekensteyn: any ideas?
14:48 technohacker: karolherbst: i should mention that laptop-mode-tools is able to power down DIS-Audio
14:48 karolherbst: technohacker: mhhh, does the GPU go off as well if it powers down DIS-AUDIO?
14:49 technohacker: yes, my laptop case goes cool
14:49 karolherbst: okay
14:49 karolherbst: then we found the problem
14:49 karolherbst: laptop-mode-tools messes around
14:49 karolherbst: disable audio pm control in laptop-mode-tools
14:50 technohacker: okay
14:50 karolherbst: technohacker: I guess in vgaswitcheroo the GPU line became DynOff?
14:51 technohacker: yes, DIS is DynOff, DIS-Audio is Off
14:53 karolherbst: okay
14:53 karolherbst: after you disabled it in laptop-mode-tools
14:53 karolherbst: you might have to reboot
14:54 technohacker: (i'm yet to find out where the setting is)
14:54 karolherbst: because DIS-Audio has to be DynOff or DynPwr
14:54 karolherbst: technohacker: somewhere in the modules most likely... they split everything
14:54 technohacker: karolherbst: okay, thanks! :)
14:55 imirkin: dboyan_: indirect probably fails because we manually patch up the launch descriptor
14:55 imirkin: dboyan_: and with the new layout, the patching is in the wrong place
14:56 technohacker: karolherbst: I guess i've found out where laptop-mode is handling the GPU
14:57 technohacker: karolherbst: there was a module for vgaswitcheroo
14:57 karolherbst: technohacker: well, it's not related to the GPU, but the Audio stuff
14:57 karolherbst: ohh
14:57 karolherbst: okay, maybe it does the HDMI stuff over it, no idea
14:57 technohacker: karolherbst: maybe. okay so i guess i'll enable AC PM as well.
14:58 dboyan_: imirkin: you mean mme90c0_launch_grid_indirect?
14:58 technohacker: thanks for the help and keep it up! :D
14:58 imirkin: dboyan_: tbh i don't remember how it works. that could be it.
15:04 hakzsam: imirkin: well, some of those fails are unrelated to compute :)
15:05 imirkin: well not ALL the fails
15:05 imirkin: but the fact that all indirect dispatch fails
15:05 imirkin: certainly seems indicative :)
15:06 hakzsam: yup
15:06 hakzsam: he has to update the offset in the descriptor (just noticed this)
15:08 hakzsam: ie. nve4_launch_grid() in the info->indirect if
15:08 karolherbst: Lyude: sorry that I somehow forget to look in depth into your patches, will do this today
15:09 Lyude: karolherbst: for clock gating? don't worry about it, they aren't done yet anyway :p
15:09 karolherbst: yeah I know, but you asked and I somehow sidedrifted
15:22 Lekensteyn: karolherbst: I guess you found the problem already: for GPU to power off, runpm must be auto for both the GPU and HDMI audio function
15:22 karolherbst: yes
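The diagnosis above (both the GPU and its HDMI audio function must be under dynamic power management) can be checked mechanically. The following is a minimal sketch that parses the vgaswitcheroo lines pasted earlier; the function names are invented for illustration, and the rule "state must start with Dyn" is taken from karolherbst's remark that DIS-Audio has to be DynOff or DynPwr. On a real system the text would typically come from `/sys/kernel/debug/vgaswitcheroo/switch`, which needs root and a mounted debugfs.

```python
SAMPLE = """\
0:IGD:+:Pwr:0000:00:02.0
1:DIS: :DynPwr:0000:01:00.0
2:DIS-Audio: :Pwr:0000:01:00.1
"""


def parse_switch(text):
    """Parse 'id:name:active:state:pci-address' lines into dicts."""
    clients = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split only four times: the PCI address itself contains colons.
        idx, name, active, state, pci = line.split(":", 4)
        clients.append({
            "id": int(idx),
            "name": name,
            "active": active == "+",
            "state": state,
            "pci": pci,
        })
    return clients


def runpm_blockers(clients):
    """Names of discrete-GPU functions that are not in a Dyn* state."""
    return [c["name"] for c in clients
            if c["name"].startswith("DIS") and not c["state"].startswith("Dyn")]


print(runpm_blockers(parse_switch(SAMPLE)))
```

With the output technohacker pasted, only DIS-Audio is reported, which matches the conclusion that the audio function was keeping the GPU powered.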
15:33 dboyan_: So actually I don't need the macros, it's still in nve4_launch_grid
15:34 dboyan_: i think i'll fix it up some time tomorrow
16:50 mooch2: well shit, i got declined for a job at red hat
17:00 karolherbst: Lyude: I left a few comments
17:11 Lyude: karolherbst: ack, I like your idea
17:30 EvilMachine: Hi. I have a GTX295 with two GPUs. I read that you don’t have much access to such cards. Can I help out?
17:30 EvilMachine: (Internally, it’s a SLI setup AFAIK)
17:31 imirkin_: EvilMachine: well, things should generally work on that, including reclocking
17:31 imirkin_: SLI is a whole other ball of wax that hasn't been tackled
17:32 imirkin_: largely because nobody's figured out how to use it even if we knew exactly how it all worked at the hw level
18:02 EvilMachine: imirkin_: Hmm, the closed driver uses it though
18:03 EvilMachine: imirkin_: So if Nouveau managed to analyze the closed code enough to get everything else working, what’s different with the SLI part of the driver code?
18:04 EvilMachine: imirkin_: By “should generally work” you mean for a single core, right?
18:13 imirkin_: EvilMachine: both gpu's, just not splitting up the workload among them
18:14 EvilMachine: Ah, okay
18:14 imirkin_: EvilMachine: the issue is in how to do the splitting. it's not obvious. it's not a matter of understanding how the hw works, it's a matter of figuring out how to do it automatically.
18:14 imirkin_: it's the sort of thing that was pretty straight-forward with single-pass rendering, but less so with multipass rendering
18:15 EvilMachine: So rendering two different things to two different screens works fine, but not onto the same screen? Or just not teaming up on the same single image?
18:15 imirkin_: it's effectively causing the hw to become a tiled renderer
18:15 EvilMachine: But yes, I get how that would be non-trivial.
18:15 imirkin_: just not teaming up on the same single application workload
18:15 imirkin_: two applications could run on separate GPUs
18:15 imirkin_: or a single very sophisticated application could itself split up the workload, with great difficulty.
18:16 imirkin_: [actually i'm not sure that'd be possible right now, but making it possible wouldn't be very difficult]
18:17 EvilMachine: could it be used e.g. in a 3d game, to draw alternating frames on a single screen? or at least on different screens? (or hey, have the one gpu render it and transfer it to the other gpu for merging)
18:17 imirkin_: sure, in theory
18:17 imirkin_: in practice it's hard to tell, in software, one frame from the next.
18:17 EvilMachine: alright, then i don’t think more is even necessary
18:17 EvilMachine: and frankly i’m not sure the closed driver even does more
18:18 EvilMachine: i read something about alternating frames back when i bought it.
18:18 imirkin_: my guess is that the closed driver has several heuristics available to it
18:18 imirkin_: and they test those heuristics with each piece of software
18:18 imirkin_: and have internal configurations that say to use one or another with each one
18:18 EvilMachine: maybe. but if the alternating frames thing works, who cares? as far as i’m concerned, that means “everything works”.
18:19 imirkin_: but how do you tell when one frame finishes and the next begins?
18:19 imirkin_: again, it's the sort of thing that's easy on single-pass renderers
18:19 imirkin_: but more difficult on multi-pass renderers
18:19 EvilMachine: ah, ok
18:19 EvilMachine: because you can’t tell if something is a second pass
18:20 imirkin_: right.
18:20 imirkin_: with tiled hw renderers (like most embedded gpu's), you get stuck with similar issues
18:20 imirkin_: i think using a similar approach could work
18:20 imirkin_: but it'd require a ton of driver work to make it happen
18:21 EvilMachine: hmm, I’d probably put the resulting images / data of each pass in the memory of both gpus, so that it wouldn’t matter which one had to do the next step. but that would not help with parallelism, i guess. :)
18:21 imirkin_: robclark: how hard would it be for another driver to steal the work you've done on that in freedreno?
18:21 imirkin_: EvilMachine: but ... which pass is which
18:21 imirkin_: it's not like you declare "this is pass 1"
18:22 robclark: imirkin, what are we talking about?
18:22 EvilMachine: imirkin_: i’m unclear about why it would matter, given that every result is given to both gpus
18:22 EvilMachine: but i’m not an expert, so nevermind. :)
18:22 robclark: you mean the re-order stuff?
18:22 EvilMachine: (this smells like it’s way over my head :)
18:23 robclark: (or just tiling in general)
18:23 robclark: either way I guess the answer is "not much other than some example code that can be cut/paste/reworked"
18:24 EvilMachine: It sounds to me like the same problem that programmers of functional languages like Haskell have when trying to get things to be in a fixed sequence. Esp. on multi-core systems.
18:26 EvilMachine: (Haskell makes no guarantees on what bit of code is evaluated when. It solves IO, by generating a long list of “actions”, and stringing them through a state monad with “RealWorld” as the state. ^^
18:26 EvilMachine: )
18:26 imirkin_: robclark: yeah.... i guess. basically SLI seems like effectively a tiler, except the memory stuff is even weirder.
18:27 imirkin_: it seems like it could even be possible with a meta-state tracker
18:27 imirkin_: and totally different underlying gpu's, assuming some basic similarities
18:28 EvilMachine: I’m finally gonna switch from closed nvidia drivers now. VDPAU finally stopped working in mplayer, and the patching for driver compatibility has gotten rather silly .
18:28 imirkin_: well, vdpau should kinda-work on nouveau
18:28 imirkin_: h.264 may have artifacts though
18:29 imirkin_: and vc-1 is entirely unsupported on your gpu gen (while nvidia does support it)
18:29 EvilMachine: that’s alright.
18:29 EvilMachine: what is VC-1?
18:29 imirkin_: bluray or hddvd format? i forget
18:29 imirkin_: aka wmv9 maybe?
18:29 EvilMachine: optical drives? what is this? the 90s?
18:30 EvilMachine: yeah, wmv can kiss my …
18:30 imirkin_: the data can live wherever :)
18:30 imirkin_: optically or otherwise. it's just a sequence of bits.
18:30 imirkin_: https://en.wikipedia.org/wiki/VC-1
18:31 imirkin_: "It is widely characterized as an alternative to the ITU-T and MPEG video codec standard known as H.264/MPEG-4 AVC."
18:31 imirkin_: anyways, your hw has bitstream processing for h.264 but not vc-1. so i left vc-1 as unimplemented.
18:32 EvilMachine: I have no tv since 2004, no optical drive since 2008, have sworn to never “buy” media again in 2007, and will only finance creative people directly and for their actual work. Everyone who wants to “sell” me a copy of the result (information/data) of their hard work, will be “paid” by a copy (“specimen”) of the result (money) of my hard work.
18:32 EvilMachine: </rant>
18:33 EvilMachine: No problem. All I ever watch, is MP4 and MKV videos and YouTube.
18:34 EvilMachine: And if necessary, there’s always ffmpeg conversion.
18:34 imirkin_: mkv can contain whatever, including vc1 afaik
18:34 imirkin_: it's a transport format, like avi or mpeg2-ts.
18:35 EvilMachine: True. I’ve yet to see it contain video formats other than H.26[45] and Theora. :)
18:35 EvilMachine: I know MKV very well. Binary markup is my favorite format of file format. :D
18:36 EvilMachine: (I’m probably the only one, who uses MKA as his default format. :)
18:37 EvilMachine: But … blah … I’m gonna go eat something. Thanks for the clarifications, imirkin_. And for all the hard work in making the world a little freer.
18:45 imirkin_: although maybe mesa includes a vc-1 bitstream decoder now? if so, it might be reusable
19:03 Lyude: did some more analysis on some of the mmio traces and what they did for clockgating, and it looks like the nvidia blob does not actually just set up clockgating for all of the engines (PCOPY1 and 2 don't seem to get set at all on fermi), so it looks like actually checking for each engine's existence isn't a bad idea
19:04 karolherbst: Lyude: I think nvidia sets up the clock gating reg when they also init all their engines
19:04 Lyude: i figured as much
19:04 karolherbst: there is only PCOPY0 on fermi anyway
19:04 karolherbst: I think it is the ".ce[0]" in nouveau
19:05 Lyude: based off the fact fermi seems to set the PGRAPH to (A=auto, R=run) both RRRA and AARA
19:05 karolherbst: gf104
19:05 karolherbst: has PCOPY2
19:05 karolherbst: but be aware
19:05 karolherbst: gf106 < gf104
19:05 Lyude: :|
19:05 karolherbst: because gf106 == nvc3 and gf104 == nvc4
19:05 Lyude: nvidia
19:05 Lyude: come on...
19:06 Lyude: eh, regardless I'm pretty sure I'm going to do the same thing as well
19:06 karolherbst: it gets even more fun
19:06 Lyude: just to make sure we don't enable clockgating at a spot that might be too early
19:06 karolherbst: gf116 aka nvcf has no PCOPY1
19:06 karolherbst: only PCOPY0
19:06 Lyude: wat
19:06 karolherbst: same for gf117 (nvd7) and gf119 (nvd9)
19:07 Lyude: numbers are meaningless
19:07 imirkin_: the order of chips is as defined in envytools
19:07 Lyude: that's the lesson I take from all of this, at least. lol
19:07 karolherbst: best part
19:07 imirkin_: https://github.com/envytools/envytools/blob/master/rnndb/nvchipsets.xml
19:07 karolherbst: gk20a aka nvea has only PCOPY2, no PCOPY0 or PCOPY1 :p
19:07 Lyude: why did I feel like I knew that was coming
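The naming mismatch discussed above is easy to see when the chips are sorted both ways. The sketch below only encodes what was stated in this conversation (the marketing-name to chipset-ID pairs, and the copy engines karolherbst listed unambiguously); the dictionary and function names are made up for illustration, and the authoritative ordering is the envytools file imirkin_ linked.

```python
# Chipset IDs as given in the conversation (gf106 == nvc3, gf104 == nvc4, ...).
CHIPSET_ID = {
    "gf106": 0xC3,
    "gf104": 0xC4,
    "gf116": 0xCF,
    "gf117": 0xD7,
    "gf119": 0xD9,
    "gk20a": 0xEA,
}

# PCOPY engines present, only for the chips where the chat was unambiguous.
PCOPY_PRESENT = {
    "gf116": {0},
    "gf117": {0},
    "gf119": {0},
    "gk20a": {2},
}


def has_pcopy(chip, index):
    """True/False if known from the table above, None if not recorded."""
    engines = PCOPY_PRESENT.get(chip)
    if engines is None:
        return None
    return index in engines


by_name = sorted(CHIPSET_ID)                    # alphabetical marketing names
by_id = sorted(CHIPSET_ID, key=CHIPSET_ID.get)  # order by chipset ID

print(by_name[:2])  # gf104 sorts before gf106 by name
print(by_id[:2])    # gf106 comes before gf104 by chipset ID
```

A per-chip presence table like this is one way to express "check that the engine exists before touching its clockgating register" instead of assuming engines follow the numbering.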
19:08 karolherbst: what was the reason again for 0xea > 0x106?
19:08 imirkin_: came later? dunno. it's about the same as GK208
19:08 karolherbst: okay
19:09 karolherbst: even the same as gk110?
19:09 karolherbst: most likely not
19:09 imirkin_: no, more similar to gk208 than gk110
19:09 karolherbst: ...
19:09 karolherbst: k
19:09 karolherbst: why even bother
19:09 imirkin_: the ctxsw fw uses fuc5
19:09 imirkin_: while on gk110 it uses fuc3
19:09 karolherbst: makes sense
19:09 karolherbst: I thought gk110 already uses fuc4?
19:10 Lyude: also, what is the translation between PCOPY and what the kernel calls the engine?
19:10 karolherbst: .ce
19:10 karolherbst: ce: copy engine?
19:10 karolherbst: it's a guess
19:10 karolherbst: don't know for sure
19:12 Lyude: mupuf: have any idea? ^
19:13 Lyude: i think CE is right but I don't see any mention of a PCOPY3, 4, etc. but I see NVKM_ENGINE_CE{0..5}
19:13 karolherbst: should be PCOPY then
19:13 karolherbst: fun GP100 has 6 where gp102 only has 4
19:14 Lyude: ahhh, I see
19:19 pmoreau: PyroSamurai: Hello! Did you have some time to look at SPIR-V, try to run the code on your computer?
19:39 Lyude: so the only thing that leaves to question is why there's no clockgating control for PCOPY3
19:39 karolherbst: Lyude: you mean it isn't reverse engineered yet ;)
19:40 karolherbst: check out which of the unknowns are touched near the PCOPY3 init parts
19:40 Lyude: ah, lol. I'm not sure if that's it though, using my magic mmio grepper script:
19:40 Lyude: https://hastebin.com/adimakedeq.txt
19:41 Lyude: The only unknown register I can see getting set is that one, which gets set on pascal and fermi, although I'm going to run that again with --context to see if it happens around where we setup PCOPY3
19:41 karolherbst: please don't post email addresses :p
19:41 Lyude: oh shit i'm really sorry
19:41 Lyude: i didn't even think of that
19:42 karolherbst: I did the same mistake once
19:42 karolherbst: mhhh
19:42 karolherbst: UNK254, the heck is that
19:42 Teklad: Lyude: Prepare yourself for ungodly amounts of spam.... the end is nigh.
19:42 karolherbst: check which engines are touched around that
19:43 Lyude: good thing those pastes expire eventually
19:43 * Lyude adds something to filter out emails from this script...
19:54 Lyude: karolherbst: (emails stripped automatically this time :) https://hastebin.com/etivitixuh.txt
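Lyude's script is not shown in the log, so the following is only a guess at what such a filter could look like: a regular expression that replaces anything shaped like an email address before the text is pasted publicly. The pattern is deliberately simple and will not cover every valid address; the function name and sample line are hypothetical.

```python
import re

# Loose pattern: local part, '@', domain with at least one dot.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def strip_emails(text, placeholder="<email removed>"):
    """Replace every email-looking token in text with a placeholder."""
    return EMAIL_RE.sub(placeholder, text)


# Hypothetical line of grep output containing a trace submitter's address.
line = "trace-fermi.log by someone@example.com: UNK254 written"
print(strip_emails(line))
```

Running the filter on the whole output just before uploading keeps the scrubbing independent of how the traces were grepped.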
19:55 Lyude: p much the same spot all of the fermi traces set it
19:55 Lyude: the thing is UNK254 seems to be the only unknown register getting set
19:56 Lyude: in that range of clockgating registers I mean
20:00 Lyude: (also, what does the kernel call PGRAPH?
20:05 karolherbst: gr
20:06 karolherbst: mhh, maybe unk254 is related to clk26 and clk27? no idea what those do
20:06 Lyude: for now I think I'm just gonna let that clockgate control be
20:07 Lyude: especially since it's the only one that sets itself to RARA as opposed to AARA
20:07 karolherbst: yeah
20:07 karolherbst: enabling it for gr makes all the difference anyway
20:07 karolherbst: or most of it
21:13 EvilMachine: imirkin: I switched to nouveau, now my GPU fan is very 1337. Because its speed is fixed at 1337 RPM. Is that normal with the unfinished power management? (2×NVA0 aka GTX295 card)
21:14 EvilMachine: Oh, it even went to 1338.
21:18 EvilMachine: Hmm… everybody in bed already… :)
21:22 karolherbst: mupuf: I think I have to show your LED video tomorrow again :D
21:22 mupuf: karolherbst: why?
21:22 karolherbst: ohh, right. I told you about the presentation I wanted to hold about Nouveau here in Hamburg? it's tomorrow
21:23 karolherbst: we should re the rgb LEDs at some point as well :D
21:26 EvilMachine: does anyone know how to get the fan of the GTX295 to settle down?
21:31 EvilMachine: okay, apparently it is already automatic. just way louder than the closed driver, so I have to fix that
21:35 waltercool: guys, small question, is nvidia still helping since acourbot left Nvidia?
21:36 RSpliet: waltercool: there are a few NVIDIA people lurking here that might know more about this than we do
21:38 karolherbst: if somebody else wants to take a look/comment on the slides: https://drive.google.com/file/d/0B78S7GSrzebIYnpSTTkzV3B4WmM/view
21:38 RSpliet: karolherbst: what's the venue?
21:38 karolherbst: just for fun
21:39 karolherbst: it's for the CCCHH in Hamburg in a room which you might call a hackerspace
21:39 RSpliet: cool
21:40 karolherbst: yeah
21:40 RSpliet: Slide 5: I suspect most people care more about the APIs supported than the power management features. I would move that up
21:40 karolherbst: I was kind of asked to talk about it
21:40 karolherbst: hey, perf is more important :p
21:41 RSpliet: I can do nothing, but at a really really high freq :-P
21:41 karolherbst: ohh, I can put something else than the bullet point as well, nice
21:42 karolherbst: ⚠ very important
21:42 karolherbst: why can't latex handle that...
21:42 waltercool: I hope the last part (Fan control) could come soon
21:43 waltercool: that's the only thing delaying reclocking, isn't it?
21:43 karolherbst: well, on maxwell2, yes
21:44 RSpliet: Slide 14: there's a shit-ton of docs that NVIDIA did release (despite only being the tip of the iceberg). Including kfractal's envytools fork with reg specs for 2nd gen Kepler as far as Tegra engines go, VBIOS specs, header defs and all the other stuff in open-gpu-doc. Help with hardware bugs that skeggsb encountered
21:45 karolherbst: ohh right, I could mention that
21:45 RSpliet: not to mention the android tegra kernel driver that clarifies a few things :-P
21:46 RSpliet: and the old "nv" modeset driver for old hardware that helped bootstrap the process
21:46 karolherbst: although the VBIOS specs are very basic
21:46 RSpliet: "nv" was obscured, unreadable, non-intentional help, but nonetheless :-D
21:47 RSpliet: back on slide 5, you might want to clarify hwmon for those not so deep into Linux (fan, power and temperature monitoring)
21:48 karolherbst: do they still work on their android driver?
21:48 RSpliet: I think they'd like to drop it in favour of nouveau... but that's a long process
21:49 RSpliet: slide 6 might be a bit too short and too much a summary
21:49 RSpliet: in my opinion
21:49 karolherbst: that's fine
21:49 karolherbst: I don't want to put readable text on the slides actually
21:50 karolherbst: well
21:50 karolherbst: you shouldn't read while I talk :p
21:50 RSpliet: then why not Klingon? :-D
21:50 karolherbst: good idea
21:50 RSpliet: but more seriously, I think it might be good to talk about the engines/subdevs. Then explain that each engine/subdev has their own MMIO register subspace.
21:51 karolherbst: I plan less than half an hour for it, but I was already told that there will be tons of questions for sure. I can expect around 40+ viewers
21:51 karolherbst: RSpliet: yeah well, I still want to focus on the project, not the hardware so much
21:51 RSpliet: if you want to talk sensors, they're connected through GPIO, which is controlled in the PBUS subdev (off the top of my head, I could be wrong there :-D)
21:51 karolherbst: ohh I could mention GPIOs
21:52 karolherbst: buzzwords are always important
21:52 RSpliet: I think a narrative like that could help understand the significance of "engines"(/subdevs)
21:53 karolherbst: I think most people will get, that a display engine and a rendering engine is kind of important :p
21:53 RSpliet: sure, but the term "engine" is new
21:53 karolherbst: yeah, I know
21:54 karolherbst: I could follow up that talk with a talk about the actual hardware :p
21:54 RSpliet: is "engine" a HW concept, a SW concept? where do you draw the line? is the memory controller an engine? (no, but PFB is :-P)
21:54 karolherbst: it's a hardware concept
21:54 RSpliet: merely warning you that you could be confusing people on that slide and lose them ;-)
21:54 karolherbst: because it says so in the slide title
21:56 RSpliet: Development should be your last slide (too) I think, so they can connect at the end of the talk while others pester you with questions :-D
21:56 skeggsb: Lyude: "CE" is what nvidia call the device, not PCOPY
21:56 karolherbst: yeah, makes sense
21:57 karolherbst: skeggsb: they should listen to us, we have better names
21:57 RSpliet: skeggsb: that's just short for Copy Engine, right?
21:57 skeggsb: i presume so
21:57 karolherbst: does the "P" stand for Power by the way? Power Copy Engine :3
21:57 RSpliet: Privileged I think
21:58 skeggsb: P is a prefix on register names, for Privileged
21:58 skeggsb: yeah :)
21:58 karolherbst: okay, they seriously have to ask us regarding naming stuff
21:58 skeggsb: like, display regs have NV_PDISP_* for stuff that RM is supposed to touch, and NV_UDISP for stuff that's meant for the user of the classes
21:58 skeggsb: in our case, that's KMS, nvidia let userspace directly touch a lot of that stuff
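The prefix convention skeggsb describes can be written down as a tiny classifier. This is only an illustration of the naming rule as stated in the chat (P for privileged registers that the resource manager touches, U for registers meant for the user of the classes); the function name and the register suffixes in the examples are invented and do not refer to real register definitions.

```python
def register_audience(name):
    """Classify a register name by the letter that follows 'NV_'."""
    if not name.startswith("NV_") or len(name) < 4:
        return "unknown"
    # 'P' marks privileged registers, 'U' marks user-facing ones.
    return {"P": "privileged", "U": "user"}.get(name[3], "unknown")


# Hypothetical register names; only the NV_PDISP / NV_UDISP prefixes
# come from the conversation.
for reg in ("NV_PDISP_EXAMPLE", "NV_UDISP_EXAMPLE", "SOMETHING_ELSE"):
    print(reg, "->", register_audience(reg))
```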
21:59 karolherbst: it makes so much sense, that it sounds too boring :p
21:59 karolherbst: mhh
21:59 karolherbst: why they even want to control that stuff from userspace... oh well
21:59 RSpliet: karolherbst: slide 12 you'd have to introduce Gallium or not bother :-P but more importantly, it might be useful to say that codegen sort of predates the LLVM craze ;-)
22:00 skeggsb: karolherbst: their driver (until kms?) used it as the modeset API
22:00 RSpliet: (or keep in the back of your mind, people nowadays seem to assume that all compilers must be LLVM)
22:00 karolherbst: I expect that everybody can use search engines, so I just tell them while I am there
22:00 karolherbst: skeggsb: right
22:00 skeggsb: command submission has the same thing (NV_UDMA?), so the kernel doesn't need to be involved to submit stuff to the GPU
22:02 skeggsb: they've always had that design, and i rather like it. it's the reason for the added "dma object" complexity on pre-MMU GPUs, to prevent userspace touching stuff it's not allowed to, while still giving it "direct" control over its channels
22:02 karolherbst: mhh, I see
22:02 karolherbst: I guess for that you need proper channel/context control?
22:03 skeggsb: what do you mean?
22:03 karolherbst: well, how does it work out, if multiple processes want to poke stuff into the GPU at once?
22:03 skeggsb: there's more than one channel
22:03 Lyude: for engines in nouveau's kernel driver, is engine->func->fini called for things other than unloading the driver/suspending the GPU?
22:03 karolherbst: ohh, I see
22:03 skeggsb: the hw (with sw assist on earlier GPUs) switches automatically
22:04 karolherbst: Lyude: not for unloading, but suspending
22:04 karolherbst: Lyude: dtor is called at unloading
22:04 skeggsb: Lyude: it's called for each subdev/engine to reset them to an "off" state at init too
22:04 karolherbst: afaik
22:04 karolherbst: skeggsb: uhh, didn't know that
22:04 Lyude: Ah, so engine->func->fini, then engine->func->init?
22:04 skeggsb: Lyude: yep
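The call order described in this exchange can be modelled with a table of function pointers. The sketch below is not nouveau code: it is a Python stand-in with invented helper names (`make_engine`, `bring_up`, `suspend`, `unload`) that only mirrors what was said here, namely that fini runs before init to reset the engine to an "off" state, that fini runs on suspend, and that dtor runs at unload (karolherbst qualified the last two with "afaik").

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EngineFunc:
    """A per-engine table of callbacks, similar in spirit to engine->func."""
    init: Callable[[List[str]], None]
    fini: Callable[[List[str]], None]
    dtor: Callable[[List[str]], None]


def make_engine(name):
    """Build callbacks that just record their own invocation in a log."""
    return EngineFunc(
        init=lambda log: log.append(f"{name}.init"),
        fini=lambda log: log.append(f"{name}.fini"),
        dtor=lambda log: log.append(f"{name}.dtor"),
    )


def bring_up(func, log):
    func.fini(log)   # reset to an "off" state first
    func.init(log)   # then initialise


def suspend(func, log):
    func.fini(log)


def unload(func, log):
    func.dtor(log)


log = []
gr = make_engine("gr")
bring_up(gr, log)
suspend(gr, log)
unload(gr, log)
print(log)
```

Keeping the callbacks in one struct per engine is also the style karolherbst said he would prefer for hwmon earlier in the log, as opposed to dispatching on a method argument.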
22:04 karolherbst: skeggsb: sounds like the reason why nouveau can handle being loaded after nvidia?
22:05 Lyude: Also, are the engines only initialized when something is using them? judging by the engine->usecount thing in nvkm_engine_init()
22:05 skeggsb: can? we haven't been able to for a long time actually
22:05 karolherbst: mhh odd
22:05 skeggsb: they modify the display state in some way we can't get back out of
22:05 karolherbst: I do it sometimes, and it didn't break
22:05 karolherbst: oh well
22:05 karolherbst: like I care about the display
22:06 skeggsb: nothing gets processed from the display push buffers for some reason
22:06 skeggsb: not a terribly high priority thing to look into, but i can't imagine it's anything too complicated
22:06 karolherbst: RSpliet: I've updated the slides
22:07 karolherbst: mhhh, I could add a Linus to the slides :D
22:08 mwk: ... a Linus?
22:08 karolherbst: yeah you know, when he said some bad words about nvidia
22:08 RSpliet: presumably one with one of his fingers upright
22:09 RSpliet: I'll let you guess which one
22:09 karolherbst: I think I will put the tex up into some git repository
22:09 mwk: ah, right.
22:09 Lyude: Also, what do intr and dtor stand for?
22:10 karolherbst: interrupt and destructor
22:10 mwk: intr is interrupt handler
22:10 Lyude: ahhh
22:10 DocMAX: anyone running ubuntu 17.04 and gtx 970? i have problem... i get scrambled text in console
22:11 DocMAX: 16.04 worked
22:11 karolherbst: DocMAX: I think you need an even newer kernel now
22:12 karolherbst: DocMAX: https://bugs.freedesktop.org/show_bug.cgi?id=94990 ?
22:15 DocMAX: kernel is very new. 4.10
22:15 DocMAX: 16.04 was much older.. 4.4
22:16 karolherbst: yeah
22:16 karolherbst: on 4.4 maxwell2 wasn't really supported, no hardware acceleration
22:16 karolherbst: that's why it kind of worked
22:16 karolherbst: I think you will need even 4.12 for the fix? not quite sure
22:16 karolherbst: skeggsb knows more
22:17 DocMAX: skeggsb hi
22:20 skeggsb: yeah, looks like 4.12
22:24 DocMAX: so i need 4.12?
22:24 DocMAX: how do i get it on ubuntu 17.04?
22:26 RSpliet: DocMAX: you'd have to ask the folks at #ubuntu for that
22:26 RSpliet: we don't do distro support here ;-)
22:27 DocMAX: oh sorry. i intended to write there
22:27 RSpliet: no worries!
22:28 DocMAX: last word: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11-rc8
22:29 DocMAX: i think i found something
22:31 karolherbst: DocMAX: might work :p
22:39 karolherbst: Jaga: you were responsible for it :O why didn't I know that before :O
22:39 karolherbst: I mean, that Linus calling bad words to Nvidia
23:01 karolherbst: I guess this is the last update now
23:01 PyroSamurai: pmoreau: sorry, I haven't had time to test anything out. Been busy with setting up the server I just built and other HW stuff. Probably won't get to it this week and I
23:02 PyroSamurai: pmoreau: am going to my brother's graduation ceremony for his Master's degree completion :)
23:02 PyroSamurai: pmoreau: on Friday
23:03 DocMAX: i tried 4.11-rc8, no luck
23:03 DocMAX: karolherbst, skeggsb
23:04 karolherbst: it's not 4.12
23:04 DocMAX: no but near 4.12
23:04 karolherbst: well, not near enough
23:04 DocMAX: can i get the nouveau driver separately?
23:04 karolherbst: yeah, if you compile it yourself
23:04 karolherbst: but then you need a recent enough kernel
23:04 karolherbst: otherwise it won't compile
23:05 karolherbst: there should be a PPA for drm-next kernels
23:05 DocMAX: is the patch already in the nouveau sources?
23:05 karolherbst: but I am not sure if the changes are in there already
23:05 karolherbst: it should be
23:07 DocMAX: i found a site with drm-next kernels 4.11
23:08 DocMAX: will try
23:08 karolherbst: 4.11 won't work
23:08 DocMAX: whatever drm-next means
23:08 DocMAX: but there is no 4.12
23:09 khronosschoty: drm-next sounds scary for some reason
23:20 DocMAX: you won't believe me
23:21 DocMAX: http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-next/current/
23:21 DocMAX: GTX 970 WORKS!!!
23:21 DocMAX: Kernel 4.11!
23:22 DocMAX: karolherbst, skeggsb
23:22 karolherbst: I am sure it is a 4.12 one?
23:22 karolherbst: what does uname -a say?
23:23 DocMAX: Linux game 4.11.0-996-generic #201704212201 SMP Sat Apr 22 02:03:24 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
23:23 karolherbst: okay, that's 4.11 + 4.12 drm stuff
23:23 karolherbst: right
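The confusion above comes from reading the kernel version as a proxy for "has the fix". A short sketch makes the pitfall concrete: it parses the release string DocMAX pasted and compares it with 4.12. The helper name is made up, and the comparison is exactly the naive check that gives the wrong answer here, because this drm-next build reports 4.11 while carrying the 4.12 DRM changes.

```python
import re


def kernel_version(release):
    """Extract (major, minor) from a release string such as '4.11.0-996-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


release = "4.11.0-996-generic"          # from the uname output above
naive_has_fix = kernel_version(release) >= (4, 12)

print(kernel_version(release), naive_has_fix)
```

On a running system the release string could be read with `platform.release()` from the Python standard library; the point of the example is that the number alone does not say which DRM tree was merged into the build.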
23:24 DocMAX: i can feel the acceleration
23:24 DocMAX: not like 4.4
23:24 DocMAX: feels like windows
23:25 DocMAX: great performance for a open source driver
23:29 PyroSamurai: now if only we can reach the statement "great performance for a driver. period."