07:25 karolherbst: imirkin: nvenc is just fast :/ 144 fps on fullhd
07:40 mupuf: karolherbst: hey, what happened with the pcie reclocking effort?
07:40 karolherbst: mupuf: ben wants to send me comments
07:40 karolherbst: :D
07:40 karolherbst: so I have to wait
07:41 mupuf: ok!
07:41 karolherbst: I want to get my stuff merged first anyway before starting yet another thing for the kernel module
07:41 mupuf: thanks :)
07:41 karolherbst: I have like 8 branches pending or something
07:41 mupuf: yes...
07:41 mupuf: ttyl!
07:41 karolherbst: anyway, nvenc sounds interesting
07:41 karolherbst: and it would be also something helpful :D
07:43 karolherbst: mupuf: if you want you can look over my envytools stuff though
07:43 karolherbst: the unk50 and unk5c seem to always make sense for now, except for one where I wait for the vbios
07:45 karolherbst: mhh nvenc uses GK104_COMPUTE.UPLOAD.DATA
07:46 karolherbst: mhhh, that doesn't look too complicated though
07:47 karolherbst: is GK104_COMPUTE.UPLOAD.DATA from the host to the gpu?
07:52 karolherbst: what is the gpu doing here? https://gist.github.com/karolherbst/a232847e32712aee778c
07:55 karolherbst: there are 48 code blocks
07:55 karolherbst: and I rendered 24 frames
07:55 karolherbst: coincidence? :D
07:56 karolherbst: yep
07:56 karolherbst: when I render 48 I get 96 of those clocks
07:56 karolherbst: *blocks
07:56 karolherbst: no idea, but this looks pretty easy? ...
07:56 karolherbst: well probably the setup is hard
07:56 karolherbst: but data feeding seems easy
07:58 imirkin: probably compute shaders that generate Y and UV plane data from RGB surface?
07:58 karolherbst: yeah
07:59 karolherbst: would make sense
07:59 karolherbst: mhh
07:59 karolherbst: but you feed YUV data to the nvenc api
07:59 karolherbst: ...
07:59 imirkin: exactly
08:00 imirkin: but you have RGB surfaces you want to encode (aka your screen)
08:00 karolherbst: well I feed ffmpeg a yuv encoded mp4 video :D
08:00 imirkin: which gets decoded to rgb
08:00 karolherbst: and then to yuv on the hardware?
08:01 imirkin: hw deals in yuv
08:01 imirkin: screen deals in rgb
08:02 karolherbst: https://gist.github.com/karolherbst/167b4bbc81338a80689d
08:02 karolherbst: I think this is the encoding part of one frame
08:03 karolherbst: two Codes
08:03 karolherbst: to data uploads
08:03 karolherbst: one data read
08:03 karolherbst: *two data uploads
08:04 imirkin: so like i said
08:04 imirkin: read rgb surface, poop out Y and UV surfaces
08:04 imirkin: then feed them to OBJ90B7
08:05 imirkin: which you're going to have to figure out wtf all those values mean
08:05 imirkin: not sure how it figures out that the data is ready tbh
08:06 imirkin: [the encoded data]
08:06 joi: ets2 seems to slow down with time (~2h) on nouveau, then it starts to glitch and if I keep playing OOM killer kicks in
08:06 joi: it doesn't happen with intel
08:07 joi: so how to make a good bug report?
08:07 karolherbst: ets2?
08:07 joi: euro truck simulator 2
08:11 joi: mesa is from Dec 19 (master)
08:18 imirkin: joi: tricky... this all points to a leak
08:18 imirkin: joi: i did plug a very minor one in the past day or two
08:19 joi: yeah, I saw that
08:19 imirkin: unlikely to be the cause though
08:25 imirkin: karolherbst: actually i don't think it's doing RGB -> YUV... i think it's just a plain blit. weird.
08:26 imirkin: karolherbst: ohhhh... i think it's taking YUV data and splitting it into Y and UV data
08:27 karolherbst: ohhhh
08:27 karolherbst: that would explain why there are two data uploads per frame
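[Editor's sketch] One reading of the guess above is that each frame's YUV data is split into a Y plane and a combined UV plane, which would match the two uploads per frame. The chat does not confirm this, and the real input and output layouts are unknown. The sketch below assumes a planar I420 input and an NV12-style output (Y plane plus interleaved UV plane) purely for illustration; the function name `i420_to_nv12` is hypothetical and this is not what the traced shaders are known to do.

```python
def i420_to_nv12(frame: bytes, width: int, height: int) -> tuple[bytes, bytes]:
    """Split a planar I420 frame into a Y plane and an interleaved UV plane.

    Assumes even width and height and 4:2:0 chroma subsampling.
    """
    y_size = width * height
    c_size = (width // 2) * (height // 2)
    if len(frame) != y_size + 2 * c_size:
        raise ValueError("frame size does not match an I420 layout")

    y_plane = frame[:y_size]
    u_plane = frame[y_size:y_size + c_size]
    v_plane = frame[y_size + c_size:]

    # Interleave the chroma samples as U0 V0 U1 V1 ...
    uv_plane = bytearray(2 * c_size)
    uv_plane[0::2] = u_plane
    uv_plane[1::2] = v_plane
    return y_plane, bytes(uv_plane)


# A tiny 4x2 frame: 8 luma bytes, then 2 U bytes, then 2 V bytes.
frame = bytes(range(8)) + bytes([100, 101]) + bytes([200, 201])
y, uv = i420_to_nv12(frame, 4, 2)
print(len(y), list(uv))
```

If the guess is right, the two outputs would correspond to the two data uploads seen per frame; feeding known data through the real shaders, as suggested in the chat, would be the way to check.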
08:28 imirkin: or... not. dunno. those shaders are confusing. i don't actually know how the image stuff works =/
08:28 karolherbst: yeah well, I could just put in data into those shaders and check what the result is
08:29 karolherbst: :D
08:29 karolherbst: maybe not
08:29 imirkin: karolherbst: either way, you need to start with mmiotrace
08:29 imirkin: karolherbst: extract the firmware uploaded to the engine, and any other engine init that needs to be done
08:30 imirkin: karolherbst: i'm sure it's the same as all the other falcon engines, so basically you just need the firmware
08:40 imirkin: oh i bet those 304 writes cause a semaphore to be written. that's how it knows.
08:58 karolherbst: imirkin: by the way, I got the vbios from the phoronix 780 ti, but I have no clue what is going wrong there :/ I bet it is something funky on his side, anyway I asked for dmesg :D
08:59 RSpliet: isn't dmesg available on his benchmarking website? I think he documents that stuff
08:59 karolherbst: mupuf: any complaints about those two commits? https://github.com/karolherbst/envytools/commits/nvbios_unk50
08:59 karolherbst: RSpliet: no idea, will check
08:59 karolherbst: but he could tell me that too :D
09:00 RSpliet: sure, but he's a journalist, not a user :-P
09:00 RSpliet: wrt feedback on your envytools patch: what do "t0" and "t1" mean. I find those names not very descriptive
09:00 karolherbst: RSpliet: you are right :O
09:00 karolherbst: temperature
09:01 karolherbst: some thresholds, no idea what though
09:01 karolherbst: t1 seems to be near to the sw downclock though
09:02 karolherbst: and t0 somewhere where clocks seem to drop a little, but who knows what they do
09:02 karolherbst: usually something like 84°C and 95°C
09:02 RSpliet: thanks. in the unk5c patch, I'd opt for "rpmN" instead of "speedN" personally, since it carries more information that way (could've been KM/h or light years :-P)
09:03 karolherbst: :D
09:03 karolherbst: ohh right
09:03 RSpliet: hmm, light years is a distance, but light years per year isn't :-D
09:04 karolherbst: ohhh
09:04 karolherbst: I forgot to push my local changes
09:04 karolherbst: but I only changed a bit in unk5c I think
09:04 karolherbst: the unk is most likely the duty of the pwm
09:05 RSpliet: sounds verifiable
09:05 karolherbst: https://gist.github.com/karolherbst/701f256c1cf6a66b3c6d
09:05 karolherbst: I also added the 54 table here
09:05 karolherbst: there is this strong correlation with the unk0e and unk10 values in the 54 table
09:06 karolherbst: ohh I meant 58 table :D
09:07 RSpliet: anyway, naming is bikesheddable but the most annoying part of PM :-P
09:07 RSpliet: choose them wisely, young grasssmoker
09:08 karolherbst: Tom^: you also get those lines do you? "nouveau 0000:01:00.0: clk: base: 705 MHz, boost: 797 MHz"
09:08 karolherbst: the base boost thingy
09:08 Tom^: good morning.
09:08 karolherbst: :D
09:09 Tom^: yea i do, but on the blob atm.
09:10 karolherbst: ohhh
09:10 karolherbst: he really didn't used my branch :O
09:10 karolherbst: *use
09:10 Tom^: who?
09:10 karolherbst: phoronix
09:10 Tom^: oh
09:11 imirkin: he's not exactly a careful guy... if he can get something wrong, he probably will
09:11 Tom^: karolherbst: get your stuff upstream ! :P
09:11 karolherbst: not yet
09:11 imirkin: the whole out-of-tree thing is pretty difficult for most people to grasp
09:11 karolherbst: I know
09:12 imirkin: since it's not how *any* other project does it
09:12 imirkin: ben loves it, but basically everyone else hates it
09:12 karolherbst: mhhh
09:12 karolherbst: I don't mind too
09:12 karolherbst: just installing is a bit messy
09:12 karolherbst: but I don't need that :D
09:12 karolherbst: I just insmod the module and I am done
09:13 imirkin: right
09:13 imirkin: but like idiots, people run "make install" hoping for good times
09:13 karolherbst: right
09:13 imirkin: which i never do on a kernel tree in the first place
09:13 imirkin: since it never leads to good times :)
09:13 imirkin: anyways, gtg
09:16 damex: is there something i can do to make nouveau card (intel + nouveau. hd3k + nvs4200m) to do output on full screen based on resolution of the nouveau screen instead of intel? i am passing output of nouveau to intel and use nouveau output normally.
09:17 damex: it is like when i make some video fullscreen - it becomes 1600x900 on 1920x1080 screen. in the borderless window.
09:19 RSpliet: damex: I'm sorry, but I have trouble parsing your problem description
09:20 RSpliet: could you first clarify: how many monitors do you have plugged in? which GPUs are running which (at which resolution)? is this an Optimus set-up or something else?
09:20 damex: RSpliet, there is laptop with nvs4200m + intel 3k. optimus with hardware mux. i make optimus work like described on nouveau wiki. i pass nouveau outputs to intel card.
09:21 damex: RSpliet, two. 1920x1080 to nouveau (dp) and 1600x900 to intel (lvds laptop screen).
09:22 RSpliet: okay, so your problem is: if you try to run a full-screen application on the DP monitor, the viewport is the size of the intel monitor?
09:23 damex: RSpliet, yes
09:24 RSpliet: that's... hmm, okay, I've never heard of this problem before. Which versions of the kernel and X.org are you using?
09:25 damex: RSpliet, 4.3.3 kernel and 1.17.4 xorg-x11
09:25 karolherbst: Tom^: how many fps did nouveau get in heaven for you? :D
09:26 karolherbst: in avg
09:26 Tom^: karolherbst: http://i.imgur.com/owbfkEt.png
09:27 karolherbst: and on the blob?
09:27 Tom^: http://i.imgur.com/hDl6eiu.png
09:27 RSpliet: damex: okay, nice and up to date :-) which application are you having trouble with btw?
09:28 damex: RSpliet, vlc/mpv/fullscreen videos (flash/html5-something) in firefox/chrome.
09:28 damex: desktop itself scaling just fine
09:28 Tom^: karolherbst: so like 60% of blob perf :P
09:28 karolherbst: yeah
09:28 karolherbst: it is fine
09:28 karolherbst: you don't have those pcie patches yet
09:29 karolherbst: so there is a little more improvement still
09:29 Tom^: some things run better than others tho
09:29 RSpliet: damex: wow all of them!
09:29 RSpliet: hmm
09:29 karolherbst: Tom^: yeah, unigine is a beast
09:29 Tom^: unigine with 4x aa, ultra, tessellation, and cs:go on low everything 1024x768 i get like 100 with dips. :P
09:30 karolherbst: :D
09:30 karolherbst: yeah something is funky with cs:go
09:30 karolherbst: maybe pcie
09:30 Tom^: mmh
09:30 karolherbst: Tom^: try out cs:go on maxed out settings
09:30 karolherbst: and see if you get a noticeable fps drop
09:30 karolherbst: and with noticeable I mean below 15
09:30 karolherbst: :D
09:30 Tom^: its noticeable but not below 15 no
09:30 Tom^: it goes to like 50 - 60
09:30 karolherbst: yeah
09:30 damex: RSpliet, something i can try/check?
09:30 karolherbst: that sounds like pcie issues
09:31 RSpliet: damex: let's focus on VLC for now, have you tried various output modules? (GLX, SDL, XVideo)?
09:31 damex: auto/gl
09:31 karolherbst: Tom^: I bet with higher pcie link you would get much higher fps on low
09:31 damex: hm.. lemme reboot to new kernel (bfs) and try again in vlc
09:33 karolherbst: Tom^: if you have some time we could try that pcie stuff out
09:34 karolherbst: Tom^: branch named tom, you still need your voltage hack though
09:35 damex: RSpliet, sorry but tried again and it works now
09:35 damex: in vlc
09:35 damex: lemme experiment ;)
09:36 Tom^: karolherbst: sure, just gonna go buy some groceries, so il be back in like 30 min
09:38 damex: RSpliet, changes since last try - new git head for intel driver.
09:39 damex: now i see that flash videos get intel resolution. html5/vlc/mpv work fine and get real nouveau fullscreen resolution with xv/gl.
09:39 damex: im sorry but i don't remember last intel commit hash :<
09:39 damex: s/last/previous/
09:54 orbea: I'm having issues with nouveau/mesa using opengl hardware rendering with the dolphin-emu, im on nouveau commit b18bc03, libdrm b38a4b2, and mesa dfce975. See these scrots to see what the issues look like http://ks392457.kimsufi.com/orbea/stuff/pics/scrots/hardware-rendering/ Software rendering doesn't have these issues, but it's also intentionally unplayable with dolphin. is there anything that can be done
09:54 orbea: to remedy this short of using nvidia drivers?
09:54 orbea: also, its a gtx 780 ti card
09:59 imirkin_: orbea: make an apitrace or dff so that i can repro
10:00 orbea:looks up how
10:02 orbea: might have to make an apitrace slackbuild...i'll update you in a bit
10:03 karolherbst: imirkin: my kernel hangs again while doing mmiotraces :/
10:03 imirkin_: orbea: you can also capture stuff in dolphin to make a dff file
10:03 karolherbst: imirkin: do you think the stuff might be in an old trace where I didn't use the nvenc at all?
10:03 imirkin_: orbea: but tbh i've never done that
10:03 imirkin_: karolherbst: unlikely
10:03 imirkin_: karolherbst: iirc that stuff only gets initialized on use
10:04 karolherbst: k
10:13 pmoreau: l1k: Hello! :-) How is the switching work going on? I lost track of what got merged and what still needs work? (I do remember support for Retina MBPs wasn't ready.)
10:15 pmoreau: l1k: I'd like to work back on the Optimus situation for my laptop,
10:15 pmoreau: since 4.4 is getting near, and it was waiting on your fix for Intel+Nvidia pre-Retina MBPs
10:16 Tom^: karolherbst: ok ready to rock n roll.
10:17 karolherbst: Tom^: nice
10:22 Tom^: karolherbst: does it have all the volt pstate patches etc?
10:22 karolherbst: yes
10:22 karolherbst: except your volt hack
10:22 Tom^: oki
10:28 Tom^: karolherbst: ok everything seems to run, benchmarking unigine now.
10:28 karolherbst: nice
10:28 karolherbst: mhh
10:28 karolherbst: could you check with lspci -vv that your lnksta on your 780 ti is either 5.0 or 8.0?
10:30 Tom^: "LnkSta:Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-"
10:30 karolherbst: nice
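[Editor's sketch] The LnkSta line pasted above can be turned into a rough theoretical bandwidth figure, which helps judge whether a link could plausibly be saturated. The helper names `parse_lnksta` and `link_bandwidth_gbytes` are hypothetical; the encoding overheads (8b/10b for 2.5 and 5 GT/s, 128b/130b for 8 GT/s) are the standard PCIe values, and the result ignores protocol overhead, so real throughput is lower.

```python
import re

# Fraction of raw bits that carry payload for each PCIe signalling rate (GT/s).
ENCODING_EFFICIENCY = {2.5: 8 / 10, 5.0: 8 / 10, 8.0: 128 / 130}


def parse_lnksta(line: str) -> tuple[float, int]:
    """Extract (speed in GT/s, lane count) from an lspci LnkSta line."""
    match = re.search(r"Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)", line)
    if match is None:
        raise ValueError("no LnkSta speed/width found")
    return float(match.group(1)), int(match.group(2))


def link_bandwidth_gbytes(speed_gt: float, width: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    return speed_gt * ENCODING_EFFICIENCY[speed_gt] * width / 8


line = "LnkSta:Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-"
speed, width = parse_lnksta(line)
print(speed, width, round(link_bandwidth_gbytes(speed, width), 2))
```

For the 8 GT/s x16 link reported here this comes to roughly 15.75 GB/s per direction, against 8 GB/s for a 5 GT/s x16 link.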
10:37 Tom^: didnt change my unigine score tho :P
10:38 karolherbst: yeah it will only for a handful of stuff
10:38 karolherbst: try cs:go
10:38 imirkin_: only matters if your pci link was being saturated
10:38 pmoreau: \o/ Got DRI3 working! That will be great to easily run things on the other card. :-)
10:38 karolherbst: :D
10:39 Tom^: 1500 mpixels in glxspheres , dont remember what i got before it tho
10:39 karolherbst: well with nvapoke you can simply slow down the link
10:39 karolherbst: it is trivial
10:43 karolherbst: Tom^: so any change in csgo?
10:43 Tom^: im playing , we shall see :P
10:43 Tom^: its not noticeable until 5mins or so
10:44 karolherbst: ohhh, right if that happens later then it won't change much :/ I totally forgot about this
10:48 Tom^: yea the fps drops are still there bu *shrug* i dont think i ever was saturated on my pci link :P
10:49 orbea: heh, making the trace was a lot faster than uploading it will be
10:49 imirkin_: orbea: how big is it?
10:49 imirkin_: orbea: you can use xz -9 to cut it down
10:50 mupuf: karolherbst: sorry, can't check them out right now, I'm flying back home.
10:51 karolherbst: ohh k
10:51 orbea:is not sure why he didn't try to compress it...silly me, still will be slow given the connection to the server I am using
10:52 imirkin_: orbea: before you waste your time, try replaying the trace to make sure that it (a) works and (b) still shows the problem
10:52 Tom^: karolherbst: https://www.pugetsystems.com/labs/articles/Impact-of-PCI-E-Speed-on-Gaming-Performance-518/
10:52 Tom^: karolherbst: so yea, :P
10:52 orbea: right, first time using apitrace, thanks for the tip
10:53 imirkin_: orbea: yeah np. usually it's fine but every so often you hit something dumb
10:53 imirkin_: orbea: actually you almost certainly will with dolphin since it uses ARB_buffer_storage
10:53 imirkin_: orbea: recapture the trace with MESA_EXTENSION_OVERRIDE=-GL_ARB_buffer_storage in the environment
10:53 karolherbst: something is wrong with my dns stuff
10:55 orbea: the apitrace still seems to show the issue
10:55 orbea: 90 mb .tar.xz, heh, uploading
10:55 imirkin_: orbea: oh ok. cool. i guess buffer storage works then.
10:55 imirkin_: orbea: you can up it to google drive or something
10:56 orbea: I dont have a google account, will be uploaded to a kimsufi server in 5 mins or so
10:56 imirkin_: ok, sgtm
11:05 Tom^: karolherbst: yea various tech sites confirms my doubts, you can barely ever saturate it at 8x :P
11:05 karolherbst: yeah well
11:05 karolherbst: with talos principle or wasteland 2 you will notice
11:05 Tom^: i dont own those games
11:06 glennk: i think it was pretty much only hybrid graphics laptops that cared that much about pcie link speeds?
11:06 karolherbst: for prime offloading pcie speed is important though
11:06 karolherbst: yeah
11:06 karolherbst: +200% speed in glxgears :D
11:07 Tom^: hmm so boot on my igp and offload to the 780ti? xD
11:07 Tom^: i wonder if that even works hm
11:07 glennk: well if its a laptop and the display connector is physically wired to the igp you don't get much choice
11:07 karolherbst: glennk: usually it was only a difference of 5%
11:08 karolherbst: glennk: but wasteland dropped 5 fps in the world overview (statically, so if you had 10fps without pcie speed, you get 15 fps with) and talos principle showed a 25% difference
11:08 orbea: imirkin_: here is the dolphin apitrace http://ks392457.kimsufi.com/orbea/stuff/dolphin-emu.trace.tar.xz
11:09 karolherbst: would be nice to know if these games also show a difference on desktop systems though
11:09 imirkin_: orbea: awesome thanks, downloading
11:09 imirkin_: orbea: i have a GK208 here which should be rather similar to your GPU, hopefully it repros here too
11:09 orbea: yea, I hope it its useful :)
11:10 orbea: also, worth mentioning the dolphin is a recent git version too, commit 7ca9a43 from 2015.12.24
11:12 imirkin_: oh wow. so it's *totally* broken on nouveau =/
11:12 orbea: yea :\
11:13 imirkin_: on the bright side, it's some opt going wrong
11:13 imirkin_: e.g. run a debug build with NV50_PROG_OPTIMIZE=1 and it looks like it works
11:13 imirkin_: will try to see wtf we're screwing up
11:14 imirkin_: karolherbst: it'd be nice if you could grab that trace too and replay it on your GK106 -- let me know if it's also all broken or not
11:15 karolherbst: yes
11:15 imirkin_: that way i know to look for generic issues or SM35-specific ones
11:15 imirkin_: (i.e. is the encoding of some instruction wrong, or some opt is wrong)
11:17 karolherbst: orbea: please don't tar single files :D
11:17 imirkin_: looks like something that AlgebraicOpt is triggering one way or the other...
11:17 orbea: better to put it in a dir and tar that?
11:18 karolherbst: no
11:18 karolherbst: just xz the file directly
11:18 karolherbst: it is no minor issue though, but I just forgot to untar it :D
11:18 orbea: would have taken half an hour to upload if it wasn't tarred
11:18 orbea: heh
11:18 karolherbst: orbea: I mean you can xz -9 file
11:19 orbea: oh.
11:19 orbea: didn't realize
11:19 karolherbst: you use tar to convert a directory into one file
11:19 karolherbst: and the tar file is then compressed through xz
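[Editor's sketch] The point above is that tar only bundles several files into one, while xz does the actual compression, so a single trace file can be compressed directly. The same idea in Python with the standard `lzma` module; the helper names `xz_bytes` and `xz_file` are hypothetical, and preset 9 mirrors the `xz -9` suggested in the chat.

```python
import lzma
import shutil


def xz_bytes(data: bytes) -> bytes:
    """Compress a byte string into the .xz container at the highest preset."""
    return lzma.compress(data, format=lzma.FORMAT_XZ, preset=9)


def xz_file(path: str) -> str:
    """Compress one file to path + '.xz' without wrapping it in a tar archive."""
    out_path = path + ".xz"
    with open(path, "rb") as src, lzma.open(out_path, "wb", preset=9) as dst:
        shutil.copyfileobj(src, dst)
    return out_path


payload = b"glDrawArrays\n" * 1000
packed = xz_bytes(payload)
print(len(payload), len(packed))
```

The result is a plain `.xz` stream that any xz tool can unpack in one step, with no untar needed afterwards.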
11:19 karolherbst: mhh
11:19 karolherbst: I don't know
11:19 karolherbst: it looks okayish
11:19 karolherbst: but still weird
11:19 karolherbst: the letters are bad though
11:19 imirkin_: looks like something in my CVT_EXTBF thing
11:19 imirkin_: not *too* surprised i missed something there
11:19 karolherbst: I would say I got the same issues
11:20 karolherbst: what are the issues by the way?
11:20 imirkin_: karolherbst: does it say reading disc in clear letters right on start?
11:20 imirkin_: if not, you got the issues.
11:20 karolherbst: no
11:20 orbea: karolherbst: these are the issues I see on my end http://ks392457.kimsufi.com/orbea/stuff/pics/scrots/hardware-rendering/
11:20 imirkin_: ok cool
11:20 imirkin_: thanks
11:20 karolherbst: the ingame letters are totally not visible
11:20 karolherbst: also the clouds are weird
11:20 karolherbst: ohh yeah, like the screenshot
11:21 orbea: other games have bad colors/broken text too
11:21 orbea: tried both gamecube and wii
11:21 imirkin_: yeah give me a few
11:21 imirkin_: i think i've isolated it
11:21 imirkin_: i should have it figured out in not too long
11:21 orbea: :)
11:23 imirkin_: ohhhhhhhhhhhh
11:23 imirkin_: gr
11:23 imirkin_: urgh.
11:23 imirkin_: float(int & 0xff) != float(s8)
11:23 imirkin_: it in fact needs to be float(u8)
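[Editor's sketch] As described in the two lines above, masking with 0xff always yields a value in 0..255, so converting that byte to float must treat it as unsigned; reading it as a signed byte flips every value from 0x80 upward to a negative number. A small illustration, with the hypothetical helpers `as_u8` and `as_s8` standing in for the two hardware conversions (this models the arithmetic only, not the actual Mesa optimization):

```python
def as_u8(x: int) -> int:
    """Interpret the low byte of x as an unsigned 8-bit value (0..255)."""
    return x & 0xFF


def as_s8(x: int) -> int:
    """Interpret the low byte of x as a signed 8-bit value (-128..127)."""
    low = x & 0xFF
    return low - 0x100 if low & 0x80 else low


for value in (0x7F, 0x80, 0xFF, 0x1234):
    reference = float(value & 0xFF)  # what the shader source asks for
    print(hex(value), reference, float(as_u8(value)), float(as_s8(value)))
```

For 0x7f all three columns agree, which is why such a bug can hide; from 0x80 on, only the unsigned conversion matches the reference.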
11:24 karolherbst: :O
11:27 imirkin_: yep, that fixes it.
11:29 imirkin_: orbea: try this patch: http://hastebin.com/ticiwakawu.md
11:29 imirkin_: orbea: that fixes your trace for me
11:30 orbea: trying, hold on
11:39 orbea: imirkin_: how are you applying the patch? I keep getting wrong p level errors
11:39 imirkin_: patch -p1 probably?
11:40 orbea: http://dpaste.com/04C5B8P
11:41 orbea: -p0 is no better
11:41 imirkin_: what dir are you in?
11:41 imirkin_: are you sure you saved it correctly?
11:41 imirkin_: maybe something helpfully rewrapped it for you?
11:41 orbea: Im in the nouveau src directory
11:43 imirkin_: you should be in the mesa dir
11:43 orbea: oh
11:44 orbea:goes off to patch the right program.... :P
12:04 orbea: imirkin_: the text and graphics look right again, awesome
12:05 imirkin_: orbea: ok great. i'll push it out
12:07 orbea: do you think it would fix stable versions of mesa too? If so I could share it with Pat so he could patch the mesa in slackware
12:07 imirkin_: yeah
12:08 imirkin_: it should apply to the 11.0 series as well
12:08 orbea: cool
12:08 imirkin_: the opt didn't exist before that
12:09 imirkin_: orbea: fix pushed to git master
12:09 imirkin_: orbea: should be included in the next 11.0.x and 11.1.x releases.
12:10 orbea: nice
13:59 l1k: pmoreau: ping? (sorry I was AFK)
14:03 pmoreau: l1k: pong
14:03 pmoreau: No problem
14:03 l1k: hah I was just starting to compose an e-mail to you.
14:04 l1k: I'm wrapping up work on v5 and would have written you over the next 7 days or so to test it.
14:04 l1k: the latest version I have on github is https://github.com/l1k/linux/commits/gmux-latest-drm-intel-nightly-ddcproxy-runpm
14:05 l1k: but it's already outdated
14:05 l1k: (I'm sitting at a different machine now, so can't push... it's complicated ;)
14:05 pmoreau: Awesome! So what got merged?
14:06 l1k: actually so far only some basic groundwork like that commit by Matthew Garrett to fix a race condition.
14:06 l1k: this version on github is 2 weeks old and still uses reprobing.
14:07 l1k: I've ditched reprobing and use deferred probing instead.
14:07 l1k: like danvet had suggested.
14:07 pmoreau: Ok
14:08 l1k: basically the reason why I didn't like deferred probing initially is that I thought I'd have to hardcode DMI IDs of all MacBook Pros and that list would have to be continually amended as new models come out.
14:08 pmoreau: On which kernel version is it based now?
14:09 pmoreau: Does it include the big Nouveau rewrite?
14:09 l1k: but I found a way around that, we can just check if gmux is present. all dual GPU macs have a gmux. which is all MacBook Pros with dual GPUs and the 2013 Mac Pro. the trashcan.
14:10 l1k: I'm currently based on drm-intel-nightly.
14:10 l1k: which is a bit ahead of drm-next
14:10 l1k: which nouveau rewrite, the one for 4.3?
14:10 pmoreau: Nice :-)
14:10 l1k: that seems like ages ago.
14:10 pmoreau: :-D
14:11 l1k: so to detect the presence of gmux I check for its ACPI hardware ID, which is APP000B.
14:11 pmoreau: I remember the discussion about checking for gmux :-)
14:11 pmoreau: That's nicer: only need to hardcode one value for all of them
14:12 l1k: I looked at the entire kernel tree and realized there's an idiom to check for an ACPI HID by calling acpi_get_devices()
14:13 pmoreau: Any progress on retina MBPs? There was still the aux DDC switch that was problematic IIRC.
14:13 l1k: this needs the definition of the callback and leads to lots of duplicate code. there are 7 drivers using this approach, each duplicating the same chunk of code.
14:13 pmoreau: --"
14:14 l1k: so I wrote a helper acpi_dev_present(), basically the ACPI equivalent to pci_dev_present(), this was merged into rafael wysocki's tree and will land in 4.5. it's this commit: https://github.com/l1k/linux/commit/0c81947ce177c1c85e4ef0ab1d93198953c13f11
14:16 l1k: once this has landed I can check for the presence of gmux and check if gmux has registered as a handler, and if it hasn't, the drivers defer probing. works really well and is much simpler than reprobing, so I guess danvet was right all along.
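[Editor's sketch] The deferral rule described above boils down to a two-input decision: a GPU driver postpones its probe only when a gmux exists but has not yet registered as a handler. The model below is a plain-Python illustration of that rule, not kernel code; `ProbeResult` and `gpu_probe_decision` are hypothetical names, and in the kernel the "defer" outcome would correspond to the driver returning -EPROBE_DEFER from its probe function.

```python
from enum import Enum


class ProbeResult(Enum):
    PROCEED = "proceed"
    DEFER = "defer"  # in a kernel driver: return -EPROBE_DEFER


def gpu_probe_decision(gmux_present: bool, handler_registered: bool) -> ProbeResult:
    """Decide whether a GPU driver should probe now or wait for gmux."""
    if gmux_present and not handler_registered:
        # The mux exists but its driver has not loaded yet, so try again later.
        return ProbeResult.DEFER
    # Either there is no gmux (not a dual-GPU Mac) or it is already registered.
    return ProbeResult.PROCEED


for present in (False, True):
    for registered in (False, True):
        print(present, registered, gpu_probe_decision(present, registered).value)
```

The appeal of this scheme, as noted in the chat, is that the only hardcoded value is the gmux ACPI hardware ID rather than a growing list of DMI IDs.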
14:17 pmoreau: Do you think the deferred probing could land in 4.5 as well, or will it wait for 4.6?
14:17 l1k: with the retina MBP we've had some limited success with AUX proxying. it works if nouveau is the inactive GPU on boot, but doesn't if intel is the inactive GPU.
14:18 pmoreau: :-/ The problem lies in the Intel or the Nouveau driver? Or somewhere else?
14:19 l1k: the problem is link training. I've seen in the logs that nouveau tries to write to the DPCD a couple of times that link training pattern 1 is going to be transmitted. and this dpcd write will fail because I dont allow that to be proxied. so nouveau tries a couple of times, then shrugs it off and sets up the output as if it worked. the i915 driver is less forgiving.
14:21 l1k: I was going to send v5 of the patch set to you and a couple other testers over the next few days, then post it to the list in early january. I believe drm-next is already closed so it could land in 4.7 at the earliest. :( if there are no further objections which postpone it further.
14:21 l1k: and this patch set will be DDC switching for pre-retinas plus deferred probing.
14:22 l1k: so this would make gpu switching work on pre-retinas, and then I'll see about retinas.
14:22 pmoreau: drm-next is already closed for 4.6? I thought the merge window hadn't already opened for 4.5…
14:23 l1k: oops. I guess I'm mixing things up.
14:23 l1k: yeah it could make into 4.6 then.
14:23 l1k: right.
14:23 l1k: god I'm losing track of things.
14:23 pmoreau: 4.7 would put it in a **really** long time :-)
14:24 l1k: so about the retinas...
14:24 l1k: I was browsing the DP spec two weeks ago... turns out there'
14:24 l1k: there's a special provision for "a closed, embedded link".
14:25 pmoreau: And we would need the switching to work on retinas before adding the general switching patches from airlied.
14:25 l1k: in this case if we already know pre-calibrated link parameters, we can use a minimal link training sequence.
14:25 pmoreau: Oh? There's some kind of minimal/fail-safe communication setup?
14:25 l1k: basically we just send link training pattern 1 for clock recovery and can then resume normal operation.
14:27 l1k: and we could use that for switching. basically the active GPU would do the link training. then store the known-good values for pre-emphasis etc in vga_switcheroo. the inactive GPU defers probing until these values are available, obtains them, then sets up the link using the shortened link training.
14:27 l1k: but I've looked at the drivers and none of them supports this.
14:28 l1k: I've looked at i915 specifically and it didn't seem to be hard to add this.
14:28 l1k: nouveau is usually a bit more complicated. radeon I haven't looked at yet.
14:29 l1k: I think this is also the method that apple uses, there's a call to some copyEDPConfig function in their driver which literally just copies some bytes in memory, I bet these are the values for pre-emphasis, voltage swing etc.
14:30 l1k: what do you mean above, "the general switching patches from airlied"?
14:32 pmoreau: Sounds nice! (However, I don't know how DP works so I don't grasp everything, but I get the idea.)
14:33 l1k: often I don't know what I'm doing either when hacking on this stuff and then I'm surprised that something actually works
14:33 pmoreau: I submitted (a year ago?) some patch to add 2009 and 2010 MBP's gmux handler so that Nouveau would detect those as Optimus setup and automatically power down the card not in use.
14:34 l1k: yes, I've also had a look at this.
14:34 pmoreau: However I was told and linked to some patches that would handle it in a more generic fashion
14:34 l1k: there are a couple of patches for this problem as well on github:
14:34 pmoreau: But those were held up by the switching on Intel+Nvidia MBP failing.
14:34 l1k: this is for nouveau: https://github.com/l1k/linux/commit/a488fec41fb6341f05492a82c42a038ca04b7322
14:35 l1k: I'm using the apple_gmux_present() that I introduced for deferred probing.
14:35 l1k: I think you used a different approach, you checked for the acpi handle instead.
14:36 pmoreau: Right
14:36 l1k: and I also did this for gmux: https://github.com/l1k/linux/commit/2876c25133f3d63381eea083fc4f680ada92664d
14:37 l1k: because I analysed the schematics of the different MacBook Pros in detail and all models with thunderbolt are no longer able to switch the external port.
14:38 l1k: the MBP5 2008/09 and MBP6 2010 don't have thunderbolt. those can fully switch the external port.
14:38 l1k: there's a DP mux built into these models, NXP CBTL06141, which is controlled by gmux.
14:39 l1k: on those machines with thunderbolt, only *part* of the external DP port is switchable.
14:39 l1k: AUX and HPD is switchable, but the main link is not.
14:39 l1k: crazy, isn't it?
14:39 pmoreau: I don't remember if they changed anything on later models, but I hope they didn't mess it up more.
14:40 pmoreau: Yeah…
14:40 l1k: so to the integrated GPU the external port looks like a phantom display which fails to link-train. it's totally useless, I don't know why appke did this.
14:40 l1k: s/appke/apple
14:41 l1k: basically the only reason I can imagine is that they found HPD to be unreliable on the external port so maybe they let the integrated GPU periodically poll the port, and then wake up the integrated GPU to drive it.
14:41 pmoreau: Is that the case for all retina models or did they fix it at some point later?
14:42 l1k: even the retinas have this, they have two external DP ports, and there's no less than 4 mux chips to switch AUX/HPD between integrated and discrete GPU. but not the main link.
14:42 l1k: it must be deliberate.
14:43 pmoreau: :-/
14:43 l1k: anyway so the patch linked above changes gmux to not switch the external port at all on those machines which have thunderbolt.
14:44 l1k: just pin it to the discrete GPU because the integrated GPU can't drive it anyway. and wake up the discrete GPU on hotplug, because gmux receives the HPD pin of all ports.
14:44 l1k: I've tested runtime pm with my patches and found there are still numerous runtime pm issues.
14:45 pmoreau: Talking of external ports reminds me I still need to have a look and fix when plugging an external screen to my laptop.
14:45 l1k: when the machine is booted with nvidia, then switched to intel, the nvidia card doesn't go to sleep for some reason.
14:45 pmoreau: Which MBP model do you have? Is it a 2011 one?
14:46 l1k: when the machine is booted with intel, nouveau will put the card to sleep after 5 seconds. but when I switch to the nvidia card it's not woken.
14:46 l1k: I have the MBP 9,1 2012
14:46 pmoreau: :-(
14:47 pmoreau: I'll check the pm on mine when I'll test your patches.
14:47 l1k: bottom line, no matter if we use your approach to match the acpi handle, or my approach to match the acpi HID, it doesn't work because of these runtime pm issues.
14:47 pmoreau: Did you have any trouble when plugging an external screen on yours and trying to run some OpenGL program on it?
14:49 l1k: I still don't have X11 running, but I tested the external port today and what I noticed is that if I boot with nvidia, then plug in the external display, nouveau just sits there and doesn't do anything, the screen stays black.
14:49 pmoreau: (It was heavy flickering in my case, which persisted even after unplugging the monitor. Haven't tried back for some time now, so it could have been fixed.)
14:50 pmoreau: Oh! It could be that the behaviour depended on the adaptor used… that rings a bell
14:50 l1k: then when I switch to intel and back to nvidia, it will poll the outputs when waking up the card and *then* it will properly detect and drive the external display.
14:50 pmoreau: Hum…
14:50 l1k: it was an Apple DP-to-VGA adapter.
14:50 l1k: with an ancient eizo T67 monitor. :)
14:51 pmoreau: I'll have to try again, haven't tested back for a year at least. ;-)
14:52 l1k: yeah. lots of stuff in nouveau that still needs to be fixed. but it's getting better. e.g. today when testing switching it complained that the card refuses to wake up from D3, but it still worked. half a year ago everything would freeze at that point.
14:52 pmoreau: :-)
14:53 l1k: karolherbst also discovered a runtime pm issue, apparently a runtime pm ref is taken when unloading the module but not released. lots of stuff that needs fixing...
14:53 pmoreau: I remember that one on the ML and IRC. Did it get merged?
14:54 karolherbst: l1k: yes, I noticed that too
14:54 karolherbst: l1k: I got a fix, but that sometimes crashes my kernel
15:01 l1k: pmoreau: when I send you the patches for testing, should I rebase drm-next or can you also handle it if they're based on drm-intel-nightly?
15:02 l1k: sorry my typing is sloppy this time of night, "should I rebase ON drm-next"
15:03 pmoreau: I usually follow drm-next, but I should be able to work it out for drm-intel-nightly.
15:05 l1k: ok I'm still doing final review and cleaning up bits here and there, applying polish, so I'll mail you in a couple of days...
15:05 l1k: end of this week maybe.
15:06 pmoreau: Sure, no hurry :-)
15:50 imirkin: karolherbst: btw, do you plan on looking into ARB_framebuffer_no_attachments, or is that not something interesting to you?
16:09 Tom^: he is busy playing divinity with me :p
16:14 imirkin: wait so it works ok now?
16:18 Tom^: on blob. :P
16:19 imirkin: oh :)
16:20 imirkin: hakzsam: btw, i forget where you're at with all the compute stuff, but on the off chance you're interested, it should be possible to start working on ARB_compute_shader support (basing your work on my atomic3 branch)
16:20 imirkin: hakzsam: it'd be a substantial undertaking though
16:21 imirkin: esp for someone not wholly familiar with st/mesa and so on. but i think it'd be a good learning experience ;)
16:39 joi: imirkin: http://paste.ubuntu.com/14272110/ <- valgrind log for glretrace of short (1min) ets2 session
16:49 orbea: I have another emulator problem, this time PCSX2 (commit 3fd0b10_2015.12.21) with their opengl hardware renderer. Here is a apitrace http://ks392457.kimsufi.com/orbea/stuff/apitrace/PCSX2.trace.xz its a bit big, but I skipped as much as I could, wait till the massive slowdowns when kos-mos appears. One catch, PCSX2 is 32 bit only and so I had to use a 32 bit apitrace on it... The software renderer is
16:49 orbea: better, but still not very playable. Less intensive games are working well with the software renderer though.
16:51 imirkin: joi: could you let me have that trace? looks like glretrace doesn't close down the context, so a ton of stuff is left unfreed =/
16:52 imirkin: orbea: will have a look... will the problem be obvious?
16:52 orbea: yes, massive slowdowns, stuttering, just like it's stuck
16:52 imirkin: hmmmmmmmm
16:52 imirkin: that might be a "emulator doing something silly" issue. will have a look anyways though.
16:52 orbea: xenosaga is supposed to be performance intensive
16:52 imirkin: so it's not misrendering or anything like that right?
16:53 orbea: looks, fine, just at an unplayable speed
16:53 imirkin: gotcha
16:53 imirkin: that might be tougher to debug
16:53 imirkin: but i'll have a look
16:53 orbea: cool, thanks
17:18 joi: imirkin: http://people.freedesktop.org/~mslusarz/eurotrucks2.trace.xz
17:19 imirkin: joi: cool thanks. not sure why glretrace didn't close things down for you... i think it normally does =/
17:21 joi: imirkin: note that I used --leak-check=full --show-leak-kinds=all - without it Valgrind reports only unreachable memory
17:22 imirkin: joi: oh... hm. right
17:22 joi: which is ~1kB
17:22 imirkin: anything interesting?
17:23 joi: nope, they are included here: http://paste.ubuntu.com/14272110/ as "definitely lost"
17:23 imirkin: ah right
23:30 Tom^: karolherbst: sleep well.
23:50 karolherbst: imirkin: no idea :/ I think I still need some more OpenGL or mesa/gallium internals knowledge before I could jump into that