00:01imirkin: quiliro: getting dmesg + xorg log after the "hang" would be great
01:52quiliro: imirkin: thank you...will test when I get back to that machine
01:52quiliro: thank you likewise Lekensteyn
02:21skeggsb: imirkin: yes, i have thought about it, but i a) have no idea how it's implemented, and b) couldn't come up with a configuration where the blob did it either
02:21skeggsb: i didn't try particularly hard yet though
02:21skeggsb: display was the priority for the moment
10:42karolherbst: mhh, is it normal, that when a machine boots without a display, that I don't get any output even after plugging in a display?
10:42karolherbst: no X started
11:15pmoreau: Given the following program (http://hastebin.com/alijohofog.bash) which is made of two kernels/entry points (global_write and global_read) and a regular function (foo), shouldn’t the code be padded with nops such that (at least) each entry point starts with its own scheduling instruction?
11:17pmoreau: Running the first kernel, then the second one, works and the hardware does not seem to complain, but it still seems weird to me. :-D
11:18mwk: what the fuck, this is legal?
11:19pmoreau: Maybe? :-D
11:20pmoreau: (I manually added the labels to distinguish the functions)
11:21pmoreau: Otherwise it is just a continuous flow of instructions.
11:21karolherbst: mhh looks fine to me
11:21karolherbst: ohh wait
11:21karolherbst: I see what you mean
11:22karolherbst: pmoreau: I doubt it, cause you have the same issues with bras already
11:22karolherbst: and we don't fill BBs with nops too
11:22karolherbst: would be an interesting test though
11:23pmoreau: Except that with bras, you will always have read at least one sched insn somewhere in the past.
11:23karolherbst: mhh, am I a git noob now or why doesn't this work: "git fetch origin linux-4.10; git checkout origin/linux-4.10"
11:24pmoreau: Whereas here, if you start directly with the kernel global_read, you won’t read any sched insn before reaching the third insn.
11:24karolherbst: ohh right, I see
11:24karolherbst: checked what nvidia does?
11:24pmoreau: Not yet
11:24imirkin: skeggsb: yeah, i suspect it's tricky. dunno if intel has it working either - airlied had patches at one point, but i think they messed other things up.
11:24karolherbst: pmoreau: I am sure the hw will default to 0x0 sched opcodes
11:25pmoreau: karolherbst: For the git thing, I guess that you get a detached HEAD rather than a new branch?
11:25karolherbst: I can't checkout that branch
11:25karolherbst: "error: pathspec 'linux-4.10' did not match any file(s) known to git."
11:25imirkin: pmoreau: i think it's perfectly legal
11:25karolherbst: or "error: pathspec 'origin/linux-4.10' did not match any file(s) known to git."
11:25imirkin: sched isn't an opcode
11:26imirkin: opcodes are groups of 4 or 8 64-bit words
11:26pmoreau: karolherbst: Why not just do a `git fetch origin`?
11:26imirkin: which express 3 or 7 instructions
11:26imirkin: (for maxwell or kepler)
11:26karolherbst: pmoreau: cause it doesn't do anything
11:26karolherbst: I cloned with --depth 1
11:26karolherbst: because I really don't want to clone an entire git tree
11:27imirkin: we express them as instructions because it's convenient to do it that way, but it's not actually something that's executed
11:27pmoreau: imirkin: And the scheduling part happens to be the first part of an opcode?
11:27imirkin: of an opcode group, yes
11:28imirkin: always aligned to 64 (or 32) bytes
11:28karolherbst: still I would like to know what nvidia does
11:28imirkin: i.e. if addr & 0x3f == 0, it *must* be a sched
11:28karolherbst: but I guess the hardware is somehow aware of that
11:29pmoreau: I should have a closer look at the emit code, at least more than "there is a code and a code, and there is a lot of bit-shifting!"
11:29karolherbst: or maybe we would have to adjust those scheds to deal with all the rets and calls and whatever
11:30imirkin: i'm not 100% clear on why they decided to put the sched info into a separate block rather than make it part of the insn... ran out of bits i suppose
11:30imirkin: you can also actually branch to an address that has sched info in it. in that case, my observation is that it just executes starting the next "real" opcode.
11:31pmoreau: Oh, fun!
11:31imirkin: i've added logic to avoid that in the nouveau emitters because it's plain confusing
11:31imirkin: but i haven't observed any problems resulting from such branches
11:31imirkin: but when you're debugging a weird problem, any such oddness is a plausible explanation for all your woes...
11:31pmoreau: That logic was added like yesterday or the day before, wasn’t it? I remember seeing some patches that could match :-)
11:32imirkin: for maxwell, yeah
11:32imirkin: for kepler, i did it a very long time ago
11:32imirkin: (or perhaps it was already like that, i forget)
11:32imirkin: (or perhaps i did it for gk110 a very long time ago?)
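The alignment rule from this exchange can be sketched as follows. This is a minimal model, assuming the group sizes stated above (8 64-bit words, i.e. 64 bytes, on Kepler; 4 words, i.e. 32 bytes, on Maxwell) and the observed behaviour that a branch landing on a sched slot starts executing at the next real opcode. `GROUP_BYTES`, `is_sched_slot` and `next_real_opcode` are hypothetical names, not nouveau code.

```python
# Bytes per opcode group, as described in the discussion (assumption, not nouveau source).
GROUP_BYTES = {"kepler": 64, "maxwell": 32}
WORD_BYTES = 8  # every slot in a group is one 64-bit word


def is_sched_slot(addr, chip):
    """True if addr is the first word of a group, which must hold sched info."""
    return addr % GROUP_BYTES[chip] == 0


def next_real_opcode(addr, chip):
    """Model of the observed behaviour: a branch to a sched slot
    starts executing at the following word instead."""
    return addr + WORD_BYTES if is_sched_slot(addr, chip) else addr


def first_sched_at_or_after(entry, chip):
    """Address of the first sched slot at or after an entry point."""
    size = GROUP_BYTES[chip]
    return -(-entry // size) * size  # round up to the group size
```

For an entry point at 0x48 on Kepler, `first_sched_at_or_after(0x48, "kepler")` gives 0x80, which is the situation pmoreau describes: several instructions run before any sched info of the entry point's own group is read.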
11:44karolherbst: yeah... this should be fixed as well
11:45karolherbst: as it seems leaving nouveau without any display attached is kind of risky without X
11:48pmoreau: While I am at it with entry points: currently for compute, all entry points are attached to the main function's node, even if main is never called/is empty. It seems to be working but… it's a bit weird to think of all entry points being attached to a possibly existing function.
11:49karolherbst: uhh, "disp: ERROR 5 [INVALID_STATE]" and then "link training failed"
11:49karolherbst: I guess I will try with skeggsbs 4.10 tree first before reporting anything on bugzilla
11:50pmoreau: Should there be a vector of entry points instead, with main being one of them? Might not change anything, apart from "cosmetic", so it could be not worth it.
11:50imirkin: pmoreau: yeah mostly cosmetic. but also semi-important, since you have to have a CFG for functions too
11:50imirkin: e.g. one function can call another
11:50imirkin: and it's a lot easier to have a CFG with a single root
11:50imirkin: even if it's a bit of a lie
11:53pmoreau: I’m not against keeping things as-is, especially as it means less work and less opportunities to screw things up.
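The "single root" idea can be illustrated with a small sketch. It assumes a hypothetical call graph in which both kernels call `foo` (the log does not say whether they do) and uses plain dictionaries rather than nouveau's actual graph classes.

```python
def reachable(graph, root):
    """Return the set of nodes reachable from root (iterative depth-first search)."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return seen


# Hypothetical call graph: two kernels, one helper, and an empty main.
call_graph = {
    "main": [],
    "global_write": ["foo"],
    "global_read": ["foo"],
    "foo": [],
}
entry_points = ["global_write", "global_read"]

# "A bit of a lie": pretend main calls every entry point, so one root covers everything.
rooted = dict(call_graph)
rooted["main"] = list(call_graph["main"]) + entry_points

covered = reachable(rooted, "main")
```

Without the extra edges, a traversal from `main` only sees `main`; with them, a single walk reaches every function, which is what makes whole-program passes simpler.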
11:53imirkin: so the thing is that functions can call one another
11:54imirkin: and we don't really have a calling convention
11:54imirkin: we could create one and push/pop things via l
11:54imirkin: but we don't right now - at least i don't think. we try to pass all params via regs, and we allow functions to call one another.
11:54pmoreau: Calling convention, as in how to pass arguments around and get the return value
11:55pmoreau: I would assume that if register pressure is low, it is more efficient to pass arguments via regs than local memory
11:56imirkin: of course.
11:56imirkin: but there's the additional issue of caller- and callee-saved regs
11:56imirkin: right now RA is done as if it were one big happy function
11:57imirkin: and i believe the tendency is for top-level functions to not have any params or return values
11:57pmoreau: So, inlining everything, then?
11:57imirkin: since it all comes in via memory
11:57imirkin: not exactly inlining
11:57imirkin: just doing RA as if it were one big function :)
11:57pmoreau: Ah, k
11:58karolherbst: how can that work if you can call functions from multiple places? just use local memory everywhere?
11:58pmoreau: So, even if no inlining is done, the functions that are called aren’t going to use registers from the callee (except if there is no liveness overlap of course)
11:59karolherbst: and isn't inlining everything so that we can apply all SSA opts on the big thing in the end the smartest thing to do?
11:59karolherbst: ohh wait, that only works if you have one entry point...
12:01imirkin: karolherbst: well, functions can recurse, etc
12:01imirkin: inlining isn't always practical
12:01karolherbst: if you are smart enough you can build loops out of this
12:01karolherbst: but I see the issue
12:10imirkin: anyone have a G84-G98 gpu plugged in? i'd like to get 'glxinfo -l -s' output for it against mesa 13.0
12:11karolherbst: skeggsb: black screen with your tree
12:12karolherbst: imirkin: I guess a mcp79 won't help
12:13karolherbst: my tesla can't handle my active DP to HDMI adapter....
12:15karolherbst: nice, EDID can't be read
12:15karolherbst: yeah, with both DP adapters
12:16pmoreau: imirkin: I’ll get you one from my G96
12:19pmoreau: karolherbst: Could you please share your .config with me? I haven’t been able to run any of the kernels I built: always getting some symbols undefined… --"
12:19karolherbst: pmoreau: what symbols?
12:20pmoreau: memcpy and a ton of others in scsi_mode and firewire, or some unknown symbol in the btrfs module but it didn’t give any more explanation
12:21karolherbst: right, I know what you did wrong
12:21karolherbst: there is this config to disable non used symbols
12:21karolherbst: don't use that
12:21pmoreau: imirkin: http://hastebin.com/aqipujizuq.py
12:22pmoreau: Let me have a look…
12:24pmoreau: karolherbst: "Enable unused/obsolete exported symbols"?
12:24pmoreau: It is enabled in my config, so they should get exported
12:25karolherbst: maybe there was another one
12:25karolherbst: let me try to find it
12:25pmoreau: There is "TRIM_UNUSED_KSYMS", but it is set to no
12:26karolherbst: did you enable the expert stuff?
12:26karolherbst: in doubt, make clean
12:26karolherbst: right, you use a plain make to compile the kernel, right?
12:26pmoreau: I used the default config from Arch, just saying no to new options
12:27karolherbst: this should work
12:27pmoreau: I have cleaned, mrpropered at least five times since yesterday, and never managed to get it to work
12:27karolherbst: try defconfig
12:28karolherbst: I don't trust those arch kernel image maintainers anymore after they decided to remove files for no reason
12:28imirkin: pmoreau: thanks!
12:28karolherbst: except arm files make no sense on x86, so remove all arm headers....
12:30pmoreau: Let’s see how it goes with defconfig + reenabling Nouveau and increasing some log levels.
12:30pmoreau: imirkin: Need the output from other cards?
12:32imirkin: pmoreau: out of the ones i care about, i'm missing eg/ni, gen7.5 and gen8
12:32imirkin: karolherbst: you have a hsw right? if you have a mesa 13.0 build, mind getting glxinfo -l -s against that?
12:32pmoreau: I got a hsw (h)as well
12:33imirkin: good one
12:33pmoreau: imirkin: http://hastebin.com/eleqidudit.py
12:33imirkin: excellent thanks.
12:33karolherbst: skeggsb: so basically: DP on my MCP79 is completely broken
12:34pmoreau: Did you have the same issue as poma?
12:35karolherbst: on the linux-4.10 branch
12:35pmoreau: I wanted to try on my laptop 4.9 and Ben’s patches, but I have been running into those kernel issues…
12:36pmoreau: So it is still working with a regular 4.9 branch without the MST and atomic series, I assume?
12:36karolherbst: it works with 4.8.4
12:37karolherbst: except my active DP->HDMI had a few issues, but it could at least read out the EDID
12:37karolherbst: DVI to VGA works fine though
12:38pmoreau: Oh, I thought 4.9 was the first non-working kernel for him, not one of the 4.8 series…
12:38karolherbst: no idea
12:38pmoreau: So I could try with 4.8.4, no need to compile my own kernel for that one
12:38karolherbst: I had a 4.8.4 kernel installed, so I used that one
12:38karolherbst: with kernel nouveau
12:39karolherbst: ohh wait
12:39karolherbst: pmoreau: I have something for ya
12:39karolherbst: pmoreau: https://github.com/karolherbst/nouveau/commits/master_4.8
12:39karolherbst: but without the kms changes
12:39karolherbst: I just dropped those patches cause I couldn't get them rebased
12:39karolherbst: they use some fancy drm-next features
12:40pmoreau: I have a 4.8.6 kernel installed, but thought I needed some 4.9-rcX one
12:40karolherbst: well, depends on what you want to use
12:41pmoreau: But I still would like to test Ben patches on all my cards :-)
12:41pmoreau: There are two different things I want to test: whether I run into the same issue as poma, and the new series from Ben.
12:48karolherbst: pmoreau: I guess I will try pomas issue again, now that I have another adapter, which has a few issues
12:50pmoreau: I only had some adapters working, even before the faulty patch, so I have no idea what it is going to be now.
12:50Lekensteyn: is somebody here in power to close bugs on kernel bugzilla? https://bugzilla.kernel.org/show_bug.cgi?id=104791
12:50Lekensteyn: that one got fixed in 4.8
12:51imirkin: Lekensteyn: pretty sure you can do it
12:51imirkin: oh wait. kernel bugzilla.
12:54Lekensteyn: yeah, exactly.
12:56karolherbst: pmoreau: can't say anything about his patch, it doesn't seem to affect anything of my situations
12:57karolherbst: mini DVI -> VGA: still working, passive miniDP -> DVI: still working, active DP -> HDMI: still broken
12:58karolherbst: pmoreau: fun fact: my miniDP -> DVI , display gets listed as HDMI for whatever reason
12:58imirkin: karolherbst: probably means you get a single link
12:58karolherbst: ohh wait, that's on my intel nvm
12:59karolherbst: it's a passive one as well
12:59imirkin: and it's really miniDP -> HDMI with a DVI-looking connector :)
12:59karolherbst: uhh, I am sure it is a real miniDP to DVI adapter
12:59imirkin: define 'real'
12:59karolherbst: HDMI wasn't a thing back then
12:59imirkin: HDMI was a thing before miniDP was a thing
13:00karolherbst: somehow I totally missed that
13:00imirkin: there are nv4x's with HDMI (very poorly functioning)
13:00imirkin: at least ... i think there are.
13:00imirkin: my faint recollection is that they came out at ~same time
13:00imirkin: (DP and HDMI did)
13:01imirkin: Production of consumer HDMI products started in late 2003
13:01imirkin: The first version, 1.0, was approved by VESA on 3 May 2006
13:01imirkin: [of DP]
13:02karolherbst: 3 years
13:03imirkin: either way, both much newer than Karol's laptop :)
13:03karolherbst: maybe HDMI 1.4 was the first really useful one
13:03imirkin: er, much older
13:03karolherbst: my adapter is pretty old though
13:04karolherbst: well it's from 2009
13:14pmoreau: karolherbst: It did boot! Thanks for the idea of using def.
14:20karolherbst: pmoreau: ... those arch guys :p
15:19Walther: Hello folks! Any information on running GTX970 successfully? I've been trying to debug my install
15:19Walther: https://bugs.freedesktop.org/show_bug.cgi?id=94990 this seems related
15:19mwk:is probably going to uncover a shitload of bugs in NV4 gr context switch code
15:19imirkin: Walther: GTX 970 with 4GB of vram?
15:19mwk: it's hilarious really
15:19Walther: imirkin: yep
15:19imirkin: Walther: yeah, that doesn't work so well for now =/
15:20Walther: yeah, it broke in my linux install some months ago, debugged a bit back then and retried today
15:20Walther: if i understood correctly, the older nouveau actually fell back to sw rendering when it worked or something, and didn't actually even use the GPU :P
15:21Walther: Any ideas on how to get a working setup with this card? Or any guesses on when the fixes will land upstream?
15:22karolherbst: Walther: use the last patch from the bug
15:23karolherbst: it's a hack though and limits your VRAM to 3GB
15:23imirkin: Walther: that's correct.
15:23karolherbst: it isn't really tested, so random oddities might happen
15:23karolherbst: but at least it gives you a working environment with hw accel
15:25karolherbst: Walther: so in the end you have two options, that hack or nouveau.nomodeset=1
15:26imirkin: well, nouveau.noaccel=1 nouveau.nofbaccel=1 is probably a better thing
15:26imirkin: that way you keep modesetting
15:26Walther: hmm. wonder if nouveau nomodeset will give me more than 1024x768
15:26Walther: there's something weird about EDID in logs
15:27karolherbst: uhh right, noaccel and nofbaccel is the better thing
15:27karolherbst: I always assume people have efi_fb...
15:28karolherbst: Walther: with nomodeset?
15:28karolherbst: or in general
15:31Walther: Well frankly with any of the workarounds
15:32karolherbst: well the hack gives you hw accel
15:32karolherbst: but you need to compile nouveau/the kernel
15:32Walther: With nouveau.noaccel and nofbaccel i get Error: no symbol table
15:33Walther: It waits for a bit / asks for any key, and then boots into something and turns display off (no signal)
15:48karolherbst: I see
15:48karolherbst: Walther: would it be possible for you to grab a dmesg from it?
15:49Walther: sure, just a min
15:55Walther: Journalctl from that boot
16:00karolherbst: mhh, unknown connector
16:00karolherbst: Walther: how is the display connected to the gpu?
16:01Walther: DVI, no adapters
16:01karolherbst: Walther: mind cating your vbios from /sys/kernel/debug/dri/0/vbios.rom ?
16:02karolherbst: mhh, 70 is "DisplayPort External Connector"
16:03Walther: sys/kernel/debug/dri empty dir
16:03karolherbst: uhh right
16:03karolherbst: nouveau isn't loaded
16:03Walther: Right, need to nomodeset to get into tty
16:04karolherbst: uhm, odd
16:04karolherbst: Walther: what is your kernel version?
16:04karolherbst: ohhh wait
16:04karolherbst: the value is hex
16:04karolherbst: k, then it is fine
16:05karolherbst: it just complains about the "Virtual connector for Wifi Display (WFD)"
16:05karolherbst: vbios is still interesting though
16:06karolherbst: .. I am silly the version is in the dmesg
16:11karolherbst: skeggsb: "failed to create encoder 1/8/0" does the 8 map to the "reserved" type?
16:15karolherbst: Walther: would it be possible for you to connect the display any other way?
16:25Walther: Sure, there's another dvi port, a hdmi and a displayport, have cables for all but dp
16:25karolherbst: I see
16:26karolherbst: could you verify if it works with the other ones?
16:26karolherbst: I guess there is something in the vbios we don't handle right
16:28Walther: No tty with the other dvi and nouveau.noaccel & nofbaccel
16:31Walther: Hdmi seems to work to tty with correct resolution, startx complains about xf86EnableIOPorts: failed to set IOPL for I/O operation not permitted
16:31karolherbst: ohh well, I guess no video group?
16:31karolherbst: anyway, if it works, then yeah
16:32karolherbst: the vbios will help clarifying this
16:32karolherbst: ohh, you started as root I guess
16:33karolherbst: a user usually needs to be within the video group to start X
16:33karolherbst: which file?
16:41Walther: Kinda weird that with hdmi, nouveau accel off, i cant startx, whereas with nomodeset i can startx as my user
16:42imirkin: not really
16:42imirkin: X uses totally different paths in those cases
16:42imirkin: one way you're ending up with vesa
16:43imirkin: the other with modesetting
16:44Walther: Added myself to video, same permission problem
16:46karolherbst: did you logout and login?
16:47Walther: With sudo startx, ee no screens found
16:47karolherbst: there are two gpus
16:47karolherbst: I mean the vbios.rom in 1
16:48karolherbst: I would suggest to you to maybe use your intel gpu until nvidia gets its ass off and release those pmu firmware files we need to actually set higher clocks
16:49Walther: Sadly that's not really viable, to switch cables each time i boot to linux vs windows
16:49karolherbst: vbios.rom file would still help to figure out why the DVI ports don't work
16:49karolherbst: Walther: ohh right, in this case you can stick with that
16:49karolherbst: you could offload rendering on the intel GPU
16:50karolherbst: but the 970 is pretty fast even on lowest clocks
16:50karolherbst: so I don't know
16:52Walther: Wouldn't mind lower clocks, wouldn't mind offloading
16:52karolherbst: I see
16:52Walther: I just want full resolution X environment with reasonable graphics (youtube working maybe)
16:52karolherbst: right, if you want to play some games, you simply switch to windows I guess
16:53karolherbst: we can look into the DVI issue with the help of your vbios, but if HDMI works, then not that much is broken as it seems. I am sure that the X start issues are related to either wrong configuration
16:53karolherbst: or something important not installed
16:53karolherbst: X log would help
16:55Walther: ha, now i managed to get into fullres X
16:56Walther: nuked the xorgconf i had accidentally leftover from the previous nvidia-xconfig script, and booted with nouveau.noaccel and nouveau.nofbaccel, while using HDMI
16:56Walther: xrandr reports all the happy modes
16:57Walther: wonder what kills the DVI though
16:57imirkin: have you tried both DVI ports?
16:57imirkin: (or is there just one?)
16:58Walther: https://walther.guru/temp/strings-vbios-rom.txt strings of the vbios.rom
16:59imirkin: i think karol wanted the actual vbios.rom file
17:04karolherbst: Walther: the vbios
17:04karolherbst: I always forget to first read all messages
17:04karolherbst: and then to ask stuff
17:05Walther: don't worry :)
17:05karolherbst: DCB 15: type 8 [???] heads 0 1 2 3 CONN 4 [0x70] conntag 15 EXT 0 unk02_6 3 OR 0
17:06karolherbst: type 8 is RESERVED
17:06Walther: this card to be specific https://www.msi.com/Graphics-card/GTX-970-GAMING-4G.html
17:07karolherbst: huh, something is odd
17:11Walther: please elaborate :) Can i help with something?
17:13hussam: heh. WTF (Way Too Fast).
17:26karolherbst: okay, now your issue
17:29karolherbst: Walther: okay, something goes wrong while reading the edid out...
17:30karolherbst: but I have no clue about those things, imirkin knows more
17:35karolherbst: mupuf: GPIO ATX_POWER_LOW triggers vpstate SLOWDOWN_PWR?
17:35karolherbst: or OVER_CURRENT?
17:35karolherbst: there is also ATX_FORCE_LOW_PWR
17:36Walther: karolherbst: big thanks for all your help!
17:37karolherbst: Walther: are you aware of the power budget of your GPU? are you able to confirm that it is 200W max and 220W crit?
17:38karolherbst: on the page "145W" :D sure
17:39Walther: where can I look for that information? any way to fish that from cli?
17:39karolherbst: uhh, only with the nvidia driver
17:39karolherbst: but I hoped you have some GPU tools under windows
17:39Walther: right, let's see
17:39karolherbst: this card is so over spec :D
17:46karolherbst: Walther: does it come with 2x6 pin power ports?
17:46Walther: any ideas on which tools would help me find the wattages? nvidia control panel has nothing, GPU-z only shows percentage TDP
17:46karolherbst: or 1x8?
17:46karolherbst: percentage tdp... how useful
17:47karolherbst: sorry, no idea, I know only nvidia-smi, no idea if there is a windows version of this
17:47karolherbst: Walther: does something like that exist? C:\Program Files\NVIDIA Corporation\NVSMI
17:48Walther: 8+6 if i'm not mistaken
17:49karolherbst: that is 75W+75W+150W = 300W
17:49karolherbst: the page says 145W ;)
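The 300W figure is the sum of what the PCIe slot and each auxiliary connector may deliver. A minimal sketch of that arithmetic, using the standard per-connector limits (75W slot, 75W per 6-pin, 150W per 8-pin); the function name is made up for illustration.

```python
PCIE_SLOT_W = 75                          # power available from the x16 slot itself
CONNECTOR_W = {"6pin": 75, "8pin": 150}   # auxiliary PCIe power connectors


def max_board_power(connectors):
    """Upper bound on board power for a given set of auxiliary connectors."""
    return PCIE_SLOT_W + sum(CONNECTOR_W[c] for c in connectors)
```

`max_board_power(["8pin", "6pin"])` gives 300, while the 2x6-pin layout karolherbst asked about would top out at 225.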
17:50karolherbst: 200W nice
17:50karolherbst: I just found the way to read out that value a few days ago, just wanted a bit more confirmation on this one
17:51karolherbst: mind pasting nvidia-smi -q
17:51imirkin: what do i know?
17:52karolherbst: edid stuff
17:52karolherbst: https://walther.guru/temp/journal3.txt nouveau reads out invalid edid through DVI
17:52karolherbst: hdmi works fine
17:54karolherbst: awesome, thanks
17:55Walther: unrelatedly, the colors look weirdly different via hdmi than dvi, under windows
17:55karolherbst: I get the same feeling on my intel gpu too
17:55karolherbst: the difference is very slim, but there is one
17:55karolherbst: it feels odd
17:56imirkin: bt 601 vs bt 709 or whatever?
17:56Walther: hdmi looks grey-er / flatter / less contrasty / probably something about gamma
17:57karolherbst: hdmi has a lot of fancy consumer stuff as well
17:57karolherbst: I think it even contains some color correction infos
17:57Walther: I always thought DVI was "better"
17:57Walther: well and DP of course better than anything else :)
17:58karolherbst: HDMI is able to transport network and audio
17:58benwaffle: Is this a bad GPU? https://a.pomf.cat/ktlfif.mp4
17:59karolherbst: nope, it wants to be fed, that's all
17:59Walther: looks like a loose connection / bad cable to me
18:00benwaffle: if i switch the plugs on the gpu its still the same
18:01imirkin: interesting. http://hastebin.com/onaxakamuf.sql
18:01imirkin: the read cuts off early
18:02karolherbst: so the edid is actually longer?
18:02imirkin: after 128 bytes
18:04imirkin: i _think_ so
18:04Walther: acer b235HL, 23" IPS fullHD display
18:05imirkin: also, curiously, my /sys/class/drm/card0-DVI-D-1/edid is also truncated [with nouveau]
18:05Walther: we're onto something!
18:05imirkin: also to 128 bytes
18:06imirkin: but i never get such errors as you did
18:06karolherbst: maybe drm handles yours right
18:07karolherbst: but nouveau always exposes just 128 bytes
18:07imirkin: or this edid-decode is broken...
18:07karolherbst: mhh, my edid is 256
18:09karolherbst: imirkin: nope, looks right
18:09imirkin: on intel though?
18:10karolherbst: with parse-edid: https://gist.github.com/karolherbst/507446a39f1267d5f73374bcc1a3b453
18:10karolherbst: x11-misc/read-edid package
18:11karolherbst: Walther: mind installing read-edid?
18:12karolherbst: and use read-edid to dump your display edid?
18:12Walther: Sure, a moment
18:12imirkin: that tool is a bit difficult to use
18:12imirkin: there's also https://cgit.freedesktop.org/xorg/app/edid-decode/
18:12karolherbst: get-edid is the command
18:12karolherbst: ohh right
18:12Walther: Dvi or hdmi?
18:12karolherbst: get-edid uses the i2c bus directly
18:12karolherbst: Walther: dvi
18:12imirkin: ok, so we use drm_get_edid to read it off the i2c bus
18:13benwaffle: Walther: i tried the same cable on a different monitor and the screen is fine, so i think the monitor may be bad
18:13karolherbst: or linux is bad
18:13karolherbst: well, drm actually
18:13karolherbst: first lets dump the edid
18:13karolherbst: then we know more
18:14imirkin: /* if there's no extensions, we're done */
18:14imirkin: if (block[0x7e] == 0)
18:14imirkin: return (struct edid *)block;
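The check quoted above relies on byte 0x7e of the base block holding the number of extension blocks that follow. A small sketch of the size this implies, assuming 128-byte blocks; `expected_edid_size` is a hypothetical helper, not a DRM function.

```python
EDID_BLOCK_SIZE = 128
EXTENSION_COUNT_OFFSET = 0x7E  # byte holding the number of extension blocks


def expected_edid_size(base_block):
    """Total EDID size in bytes implied by the base block's extension count."""
    if len(base_block) < EDID_BLOCK_SIZE:
        raise ValueError("base EDID block must be at least 128 bytes")
    return EDID_BLOCK_SIZE * (1 + base_block[EXTENSION_COUNT_OFFSET])
```

A base block with 0 in that byte yields 128, so a 128-byte read is complete in that case; a value of 1 means a 256-byte EDID is expected.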
18:14imirkin: well, it's also what i assume intel uses...
18:15imirkin: and get-edid retrieved a 128-byte blob for me too
18:15karolherbst: intel could be wrong as well
18:16karolherbst: are you sure your edid is bigger?
18:16karolherbst: my internal screens edid is also only 128 byte big
18:18Walther: sudo get-edid -> looks like no busses have an edid. Sorry!
18:18Walther: It also attempts via VBE and results in illegal instruction
18:19Walther: This is when booted with nomodeset to have the dvi plugged and a working tty
18:19imirkin: no, could be some sort of screwup
18:19Walther: With dvi and just nouveau.noaccel etc it won't give a tty (visibly. No ssh set up)
18:21karolherbst: ohh right
18:21karolherbst: Walther: well, I guess you would need ssh for that
18:22karolherbst: the i2c device nodes need to be exported for that
18:24Walther: Let me see if it handles plugging both dvi and hdmi
18:24karolherbst: I don't think so
18:28Walther: With nouveau.blah kern params it did, actually
18:29Walther: Can see the raw edid in journalctl, get-edid doesn't cooperate
18:30karolherbst: mhh, odd
18:30karolherbst: what did get-edid print?
18:31karolherbst: Walther: well, the thing is, we believe the edid in your log is too short
18:31karolherbst: there is no i2c bus for you?
18:32karolherbst: is the i2c_algo_bit module loaded?
18:32karolherbst: and are there i2c- files within /dev/ ?
18:32Walther: lsmod |grep i2 results in i2c_algo_bit among others
18:33Walther: ls /dev/ | grep i2; empty
18:33karolherbst: the heck
18:34imirkin: i2c-dev is a separate thing
18:34karolherbst: ohh, really?
18:34karolherbst: i2c_i801 ?
18:34karolherbst: wait, let me check
18:35imirkin: no. i2c_dev
18:35karolherbst: I don't have it
18:36karolherbst: or do you mean I2C_CHARDEV?
18:36imirkin: probably built-in
18:36Walther: find . -iname i2c at / results in sys/kernel/debug/tracing/events/i2c and /sys/bus/i2c
18:36karolherbst: I see
18:36imirkin: obj-$(CONFIG_I2C_CHARDEV) += i2c-dev.o
18:36karolherbst: Walther: you need I2C_CHARDEV enabled in the kernel
18:38Walther: (and a handful of /usr/lib/modules stuff)
18:38Walther: Running arch
18:38karolherbst: sure, you still would need that enabled ;)
18:38Walther: Nod. How do i check if it's enabled?
18:39Walther: Been a while since i compiled a kernel
18:39Walther: Heh, haven't even submitted the first eudyptula challenge solution i did like in january
18:39karolherbst: cat /proc/config.gz | gzip --decompress | grep I2C_CHARDEV
18:40karolherbst: then you should be able to modprobe it
18:41karolherbst: i2c-dev or i2c-chardev
18:41karolherbst: no idea how it is called
18:41karolherbst: i2c-dev as it seems
18:41karolherbst: and then the i2c files should appear in /dev
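The same config lookup can be sketched in Python. This assumes the running kernel exposes `/proc/config.gz` (which needs `CONFIG_IKCONFIG_PROC`); the helper names are hypothetical.

```python
import gzip


def kconfig_value(config_text, option):
    """Return the value of CONFIG_<option> ('y', 'm', ...) or None if it is not set."""
    prefix = f"CONFIG_{option}="
    for line in config_text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


def read_running_config(path="/proc/config.gz"):
    """Read the running kernel's config; only works if the kernel exposes it."""
    with gzip.open(path, "rt") as f:
        return f.read()


# Small sample in the same format as a kernel .config file.
sample = "CONFIG_I2C=y\nCONFIG_I2C_CHARDEV=m\n# CONFIG_I2C_MUX is not set\n"
chardev = kconfig_value(sample, "I2C_CHARDEV")
```

A value of `m` means the driver is built as a module and has to be loaded (here with `modprobe i2c-dev`) before the `/dev/i2c-*` nodes show up; `y` means it is built in.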
18:41Walther: Nod, probed, get-edid finds stuff
18:41Walther: A sec
18:42karolherbst: how many?
18:42karolherbst: well, you should dump all valid ones
18:43karolherbst: get-edid -q 2>&1 | parse-edid
18:43karolherbst: or add -b with the bus number to get-edid
18:44Walther: https://walther.guru/temp/edid-raw.txt edid-stderr.txt
18:44karolherbst: huh, that doesn't help :/
18:44karolherbst: what is the edid via hdmi?
18:47Walther: For bus18, 256byte edid successfully retrieved
18:47karolherbst: this one seems good
18:48Walther: (sorry, those aren't really .txt, just muscle memory in shell command > file.txt and scp that)
18:48karolherbst: imirkin: see that crap?
18:48karolherbst: the first 128 bytes are the same, except that silly extension byte
18:49karolherbst: well at least we found the error
18:49imirkin: well, it's expected that edid over HDMI and DVI are different
18:49karolherbst: but even the checksum is the same
18:49imirkin: yeah, that's the monitor being dumb :(
18:49karolherbst: ohhhh wait
18:49karolherbst: it isn't
18:49karolherbst: hdmi: c1 checksum
18:49karolherbst: dvi c2 checksum
18:50karolherbst: which makes total sense
18:50karolherbst: cause c1 + 1 = c2
18:50imirkin: it's flipping the 0x7e byte between 0 and 1? yeah
18:50karolherbst: okay... something is very odd then
18:50karolherbst: why does the checksum function fail
18:50karolherbst: do we have an option to ignore the checksum?
18:50imirkin: time to do some math? :)
18:50karolherbst: the checksum is valid for HDMI
18:57karolherbst: imirkin: another byte is different!
18:58karolherbst: imirkin: byte 0x20: 0x12 for hdmi and 0x0 for dvi
18:58karolherbst: 238 + 0x12 = 0x100 :)
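The arithmetic above follows from the EDID rule that all 128 bytes of a block must sum to 0 modulo 256. The sketch below reproduces it on synthetic blocks: only bytes 0x20, 0x7e and 0x7f are set, every other byte is zero, so the actual checksum values differ from the monitor's real c1/c2. Helper names are hypothetical.

```python
def edid_block_sum(block):
    """Sum of all 128 bytes modulo 256; 0 means the checksum is valid."""
    if len(block) != 128:
        raise ValueError("an EDID block is exactly 128 bytes")
    return sum(block) % 256


def with_checksum(block):
    """Return a copy of block whose last byte makes the sum come out to 0."""
    fixed = bytearray(block)
    fixed[0x7F] = 0
    fixed[0x7F] = (-sum(fixed)) % 256
    return bytes(fixed)


# Synthetic "HDMI" block: blue chromaticity low byte 0x12, one extension block.
hdmi = bytearray(128)
hdmi[0x20] = 0x12
hdmi[0x7E] = 1
hdmi = with_checksum(hdmi)

# Synthetic "DVI" block: extension count dropped to 0 and checksum bumped by 1,
# which would still be valid, but byte 0x20 was also zeroed without compensation.
dvi = bytearray(hdmi)
dvi[0x7E] = 0
dvi[0x7F] = (hdmi[0x7F] + 1) % 256
dvi[0x20] = 0x00
dvi = bytes(dvi)

remainder = edid_block_sum(dvi)  # 238, i.e. exactly 0x12 short of 0x100
```

The remainder of 238 matches the number quoted in the log: adding the missing 0x12 back would bring the sum to 0x100 and make the block valid again.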
18:58karolherbst: those crappy engineers....
18:58Walther: Erm that's bus 17 actually
18:59Walther: (if it matters)
18:59imirkin: airlied: i think that my two commits over at https://github.com/imirkin/mesa/commits/jenkins should improve CTS on nouveau - at least those geometry tests. i think with that, i've fixed all the things you gave me apitraces for.
18:59karolherbst: Walther: okay, here is the deal: the edid over DVI is basically broken
19:00karolherbst: there are two solutions: 1. give drm a fixed edid 2. add/find a way to tell drm to ignore the edid checksum
19:01Walther: There were some instructions on arch wiki about ignoring or forcing modelines, but there were warnings about possible HW damage
19:01karolherbst: adding modelines is painful
19:01karolherbst: anyway, you are better off with the hdmi stuff, cause the edid is double in size and contains more information
19:01karolherbst: that's no edid override
19:02karolherbst: mhhh IgnoreEDIDChecksum
19:02karolherbst: will only work for nvidia though
19:02karolherbst: anyway, I would stick with the HDMI thing
19:03karolherbst: in the end we should have a way to just ignore the checksum
19:03karolherbst: but well
19:03karolherbst: good to have actual confirmation, that sometimes those displays edid are stupid
19:04karolherbst: I already see the situation in my head: "uhh yeah, no extension, fixing up that edid with +1, everything else is the same anyway 11!!1!11"
19:04karolherbst: the field being different is for chromaticity
19:04karolherbst: blue in this case
19:05karolherbst: 0x5012 with hdmi and 0x5000 with dvi
19:05karolherbst: those are x/y values
19:05karolherbst: so 0x50 and 0x12 vs 0x50 and 0x00
19:06Walther: hum, so that's causing the color shift?
19:06karolherbst: might explain the difference in color
19:06karolherbst: I guess so
19:08Walther: kinda annoying, now I don't know which color is more correct :P doing some photography on the side...
19:08Walther: but that's a different issue
19:08Walther: now I'll just need to ensure grub / kernel to always do the nouveau.noaccel and nofbaccel
19:08karolherbst: what a pickle
19:10Walther: your favourite place to store that sort of settings? grub confs or something like /etc/modprobe.d or
19:10karolherbst: I don't use grub
19:11Walther: alright, i'll find it on my own, no biggie
19:11Walther: big thanks for all the help!
19:11Walther: anything we learned that could help others? any workarounds that someone could benefit from?
19:11karolherbst: mhh except trying out different connectors?
19:12karolherbst: mhh we learned that sometimes edid checksums are stupid and wrong
19:15Walther: One definitely weird thing is if i reboot and quickly log in i can see the last time's windows for a split second when starting X. Someone could argue this is a vulnerability
19:20Walther: HA, solved the color shift issue, apparently with hdmi NVIDIA control panel decided to default to a limited dynamic range
19:20Walther: selected full -> now it doesn't seem oddly grey
19:21karolherbst: I see :D
19:43karolherbst: funny how much time you can actually put into vbios stuff
19:45karolherbst: "Silicon Image Microcontroller SI1930uC device for HDMI Compositor/Converter" :O
19:46karolherbst: ohh, wrong table
19:49pmoreau: Trying to get mad24 working in my kernels: MADSP(0, 2, 0) will use 24 bits of the 2nd component, whereas using 0, 4, 8 will result in 16 bits being used. But in `NVC0LoweringPass::processSurfaceCoordsNVE4()`, 2 is marked as being 16 bits, and 0 as 24 bits.
19:52pmoreau: Computing `mad24(1, 0x12345678, 0)`: with 2nd component of NV50_IR_SUBOP_MADSP to 0, I get 0x00005678, and when set to 2, I get 0x00345678.
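[editor's note] The two results quoted above are what you get if the second operand is truncated to 16 or to 24 bits before the multiply. The sketch below is a plain-Python model of that truncation only. It is not a description of the MADSP hardware semantics, and `mad_unsigned` with its `b_bits` parameter is a made-up helper, not a nouveau function.

```python
MASK32 = 0xFFFFFFFF


def mad_unsigned(a, b, c, b_bits):
    """Model an unsigned multiply-add where `a` is truncated to 24 bits,
    `b` to `b_bits` bits, and the result wraps at 32 bits."""
    a_mask = (1 << 24) - 1
    b_mask = (1 << b_bits) - 1
    return ((a & a_mask) * (b & b_mask) + c) & MASK32


print(hex(mad_unsigned(1, 0x12345678, 0, 16)))  # second operand cut to 16 bits
print(hex(mad_unsigned(1, 0x12345678, 0, 24)))  # second operand cut to 24 bits
```

The 16-bit case gives 0x5678 and the 24-bit case gives 0x345678, matching the values observed with the second MADSP component set to 0 and 2 respectively.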
20:15karolherbst: pmoreau: looks like fun
20:16pmoreau: It is :-)
20:16pmoreau: The signed version is more annoying…
20:24karolherbst: I am currently wondering if they added them on purpose, or if those just happened to do something you could use for stuff after actually figuring out what they do....
20:30karolherbst: I think I will finish my dynamic reclocking prototype
20:30karolherbst: I would consider reclocking capped to boost 0 stable and secure for enough gpus that it actually makes sense
20:59pmoreau: Could someone please MMT-trace instruction-set_OpenCL-std.out from https://phabricator.pmoreau.org/diffusion/SPVTES/ using the blob?
21:00pmoreau: And on GK104+
21:06Yoshimo: karolherbst: it is incredible indeed, and i still get lines in my log about unknown tables ;)
21:23karolherbst: Yoshimo: feel free to RE those all :p
21:26Yoshimo: i have no idea how to do that or find the firmware needed to get maxwellv2 fans running, therefore i bow before you and the rest of the team
21:39karolherbst: mhh, pmu memory load counter: [data:image/png;base64 URL with the whole image inlined; truncated]
21:40mgottschlag: heh, now I want to see that image.
21:41karolherbst: ... the hell who puts all the data into the link :D
21:41BrainDamage: it's the whole pic, encoded in the url
21:41karolherbst: yeah, I noticed
21:41karolherbst: ohh blob: is local storage
21:44karolherbst: I should create those locally
22:10Lekensteyn: Who has a recentish Optimus laptop here running kernel 4.8+?
22:10karolherbst: mupuf: https://i.imgur.com/jvr7OAA.png
22:10karolherbst: mupuf: this is for memory
22:11karolherbst: color is forced load with ./ppwr_counters_fake
22:11karolherbst: left is load reported by nvidia userspace
22:11Lekensteyn: Without nouveau loaded, runtime PM seems still able to power off the dGPU completely (via the pcie port) (power/control = auto for both the pcie port and dGPU)
22:12Lekensteyn: however, modprobing nouveau then fails after resuming the devices (e.g. power/control = on)
22:12Lekensteyn: with something like unsupported chipset 0xffffffff
22:12karolherbst: the kernel doesn't cache the values
22:12skeggsb: karolherbst: i wouldn't have expected any change in behaviour related to that area from my work... did you check the devel-kms branch too?
22:12karolherbst: and it can't ask the hardware
22:12skeggsb: perhaps something in drm-next busted it
22:12skeggsb: but, if not, could you bisect please :)
22:13karolherbst: skeggsb: I only used your linux-4.10 branch
22:13Lekensteyn: removing the nvidia pci device(s) + rescan pcie port fixes the probing though
22:13karolherbst: Lekensteyn: right
22:13Lekensteyn: which is weird, what makes rescanning so special?
22:13Lekensteyn: has it sth to do with the resources?
22:13Lekensteyn: (BARs et al?)
22:13skeggsb: karolherbst: with drm-next as your main kernel?
22:13karolherbst: skeggsb: I used your kernel tree
22:13skeggsb: oh, right, the kernel tree linux-4.10
22:14karolherbst: if you tell me what I should check, I could do it tomorrow
22:14skeggsb: just a 4.9 kernel, with the out-of-tree devel-kms branch
22:14karolherbst: it is a bit messy to get internet on that machine, cause the display and the router is too far away and my cables aren't long enough :D
22:15skeggsb: i'm just about to try and plug my laptop into my tv with passive dp, just in case :P
22:15karolherbst: my active dp to HDMI adapter doesn't work _at_all_ with nouveau on my tesla
22:15karolherbst: well, it does detect the display and everything
22:15karolherbst: but the output is totally borked
22:15skeggsb: not a regression though?
22:16karolherbst: didn't work with 4.8 already
22:16skeggsb: ok, good. still, would be nice to fix that too :P
22:16karolherbst: well, didn't test older kernels
22:16karolherbst: at first I thought putting that on a 4k@60 screen is too much for the gpu, but it doesn't work on my 2k screen as well
22:16mupuf: karolherbst: I do not understand your plot
22:16skeggsb:can't find his dp->hdmi adapter...
22:16Lekensteyn: karolherbst: another observation is that I get an audio function (01:00.1) following this (with no driver loaded): power/control=auto (power off); plug in DP cable; power/control=auto (power on); check with lspci -H1 -s1:
22:17karolherbst: mupuf: x: memory clock
22:17karolherbst: mupuf: y: load reported by nvidia userspace
22:17karolherbst: mupuf: line label: load forced on pmu counter
22:17Lekensteyn: karolherbst: similarly, repeating the steps w/o cable hides the audio function again (??)
22:17mupuf: karolherbst: on which domain?
22:17mupuf: this is so fucked up
22:17karolherbst: but fucked up following an algorithm!
22:18mupuf: it would more likely depend on .... the core clock
22:19karolherbst: wait a second
22:19karolherbst: how did you know :O
22:20karolherbst: I guess running a gl application additionally messes up the counters too much
22:20karolherbst: ahh piano!
22:20karolherbst: wow, this is odd
22:21mupuf: well, this makes no sense at all otherwise
22:21karolherbst: the memory load reported by nvidia depends on the memory clock _and_ engine clock
22:21karolherbst: the heck
22:21mupuf: I guess they scale the reported load because the value presented by the counter never gets high
22:21mupuf: and so, it would be misleading to the user
22:21karolherbst: I need that algorithm
22:22mupuf: well, I guess it would be nice, but not a priority IMO
22:22karolherbst: it's needed
22:22karolherbst: otherwise we can't know to which memory clock to clock
22:22karolherbst: I tried several things, nothing worked out there
22:23karolherbst: the higher the engine clock, the higher the reported memory load
22:23karolherbst: I fake a load of 10% on the counter
22:23karolherbst: and nvidia reports 80%
22:23karolherbst: this is so messed up
22:24karolherbst: this counter is totally useless if we don't know how to actually interpret this damn value