00:00karolherbst: "nouveau 0000:04:00.0: Direct firmware load for nvidia/gp106/gr/sw_nonctx.bin failed with error -2" that speaks for itself ;)
00:00cosurgi: karolherbst: firmware-linux version: 20161130-4
00:00karolherbst: that's way too old
00:00karolherbst: like 2 years too old
00:01cosurgi: yeah, now I see that. Let's see what I can get from devuan ceres (experimental)
00:01karolherbst: probably a good idea to have that, the kernel and mesa as new as possible
00:02cosurgi: ahh! There is version 20190114
00:02imirkin_: my guess is that this wasn't helping the modesetting ddx situation either :)
00:02cosurgi: let's take it.
00:03imirkin_: that's very odd
00:03imirkin_: that looks like ~the same error as for the modesetting ddx
00:03imirkin_: i wonder if something in the noaccel path maxes out on the width
00:04imirkin_: or ... something?
00:04imirkin_: 6480x3840 is not ... tiny
00:04karolherbst: imirkin_: mhhh, I guess if we try allocating a channel we are doomed for if we don't get one .. maybe there is some early check so we don't even try?
00:04imirkin_: karolherbst: hm?
00:04cosurgi: you mean the resolution 6480x3840 ?
00:04imirkin_: cosurgi: yeah
00:04karolherbst: imirkin_: in the nv ddx
00:05imirkin_: karolherbst: we fall back just fine
00:05imirkin_: and curiously he only gets this on the SECOND X server
00:05imirkin_: which makes ... zero sense
00:05karolherbst: X is weird
00:05karolherbst: don't think about it
00:05imirkin_: it'd be interesting to find out where it's actually crashing
00:05cosurgi: just to make sure, this file: Using config file: "/etc/X11/mon_hm.conf" is https://pastebin.com/gteBRmXU
00:05imirkin_: however with acceleration it'd be even better
00:05cosurgi: I'm backporting firmware-nonfree package now.
00:05imirkin_: cosurgi: yeah, that file is fine
00:05karolherbst: cosurgi: I would let xrandr handle that, but... your choice ;)
00:05cosurgi: version 20190114
00:06imirkin_: karolherbst: my suggestion :p
00:06karolherbst: cosurgi: make sure initramfs was updated
00:06imirkin_: cosurgi: why worry about xrandr when you can just have it come up the way you want?
00:06cosurgi: karolherbst: imirkin_ told me to use xorg.conf instead. I preferred xrandr earlier.
00:06karolherbst: imirkin_: ahh, I use fancy login managers doing that for me
00:06cosurgi: karolherbst: after installing firmware I will have to reboot? OK.
00:06karolherbst: but I also switch between nouveau and nvidia on a regular basis
00:06imirkin_: cosurgi: yeah, unfortunately
00:07karolherbst: so any config is rather annoying
00:07imirkin_: cosurgi: make sure the firmware is in your initrd
00:07cosurgi: karolherbst: I don't like login managers ;) I disabled the last one which was on Ctrl-Alt-F7.
00:07karolherbst: depends on the one you used
00:08karolherbst: sddm is quite solid, lightdm is also fine
00:08karolherbst: gdm... is meh
00:08karolherbst: can't even display the login prompt on all displays
00:08karolherbst: very stupid
00:08imirkin_: karolherbst: i'm having issues with old versions of gdm actually :)
00:08imirkin_: with screen rotation, something weird happens
00:08karolherbst: I never used gdm
00:08karolherbst: well, on a regular basis
00:09imirkin_: gdm was like the only other login manager than xdm
00:09karolherbst: slim -> lightdm -> sddm is my path
00:09karolherbst: imirkin_: huh? "was"?
00:09imirkin_: had a nice background :)
00:09imirkin_: well, now there are others
00:09karolherbst: I see
00:10imirkin_: and most people didn't need xdmcp
00:10karolherbst: I can always suggest using lightdm. Used xfce at work for some system with 4 displays attached
00:10karolherbst: the login prompt was always shown on the screen where the cursor was as well :)
00:10karolherbst: two benefits in one
00:12karolherbst: mhhh, this is like super weird. I put the channel id into the req object for the drmIoctl thingy. It's there, but not on the kernel side if I printk it... this is super annoying right now
00:12cosurgi: firmware-* has compiled. There are lots of them!
00:12karolherbst: yeah, there are
00:12cosurgi: $ ls -1 *deb | wc -l
00:12karolherbst: you only need the nvidia/gp106 ones
00:12karolherbst: but yeah
00:12karolherbst: there are quite a lot of files
00:12karolherbst: it's just the way it is
00:13karolherbst: ohhh wait
00:13karolherbst: I know what you mean
00:13karolherbst: cosurgi: there should be either a linux-firmware package or a nvidia one
00:13cosurgi: ok. I'll see what I need to install.
00:13karolherbst: I assume the former
00:13karolherbst: some firmwares are split out for whatever reasons
00:14cosurgi: Or maybe just install all of them?
00:14karolherbst: or that, I don't care, it's your system :p
00:15karolherbst: worst case you waste a few MBs...
00:15cosurgi: I currently have those installed with version from year 2016: firmware-linux-free firmware-amd-graphics firmware-misc-nonfree firmware-linux firmware-linux-nonfree
00:15karolherbst: then I would update all of those as well
00:15cosurgi: are we good installing the exact same ones, or do we need something more?
00:16cosurgi: there is nothing with *nvidia* or *nouveau*
00:16karolherbst: check if you have a /lib/firmware/nvidia/gp106 after updating
00:16cosurgi: in name
00:16imirkin_: cosurgi: probably linux-nonfree
00:17cosurgi: hm, why do I have firmware-amd-graphics ?
00:17cosurgi: perhaps I would better remove it.
00:18cosurgi: firmware-linux-nonfree depends on firmware-amd-graphics
00:18cosurgi: ok. explained.
00:18cosurgi: update-initramfs: Generating /boot/initrd.img-4.20.3-absurd.2
00:19cosurgi: I have directory: /lib/firmware/nvidia/gp106
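The check discussed above (is the file the kernel asked for actually on disk?) can be sketched in Python. Only the path nvidia/gp106/gr/sw_nonctx.bin comes from the dmesg line quoted at the start of the log; the helper name and the default root directory are assumptions for illustration. Note that this looks at the installed tree only, not inside the initrd image.

```python
import os

def missing_firmware(names, root="/lib/firmware"):
    """Return the entries of `names` that are not regular files under `root`."""
    return [n for n in names if not os.path.isfile(os.path.join(root, n))]

if __name__ == "__main__":
    # The one file name known from the kernel's error message.
    wanted = ["nvidia/gp106/gr/sw_nonctx.bin"]
    for name in missing_firmware(wanted):
        print("missing:", name)
```

An empty result only says the files are installed; whether they also ended up in the initrd still has to be checked separately, as imirkin_ points out.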
00:19imirkin_: cosurgi: hope these monitors aren't sitting on a $20 desk...
00:19imirkin_: 3x 43" is ... large.
00:19cosurgi: imirkin_: nope. It's a $800 desk ;)
00:19imirkin_: i recently switched my home setup to be the same as my work -- 2x 24"
00:20cosurgi: :) I write a lot of C++ code. when I open gvim on all screens then finally I can see what I code
00:20imirkin_: yeah, i mean that's why i use rotated 24" monitors too
00:20cosurgi: my apologies for using so much screen space
00:21imirkin_: and try to stick to 80 chars, since 2x 80 char emacs windows fit really nicely within 1200px :)
00:22cosurgi: OK. So I suppose it's time to reboot right?
00:22imirkin_: did you stick the firmware into your initrd?
00:22imirkin_: (or are you not afraid you'll have to boot off some weird raid controller, and thus don't use an initrd to load modules)
00:23cosurgi: I can only tell: I hope so. That's what I got here: https://pastebin.com/7r4BuFsU
00:23imirkin_: seems right
00:23karolherbst: the ultimate goal is to have 0 modules :p
00:23cosurgi: to make sure I can reinstall kernel *.deb package
00:24karolherbst: imirkin_: ohh, there was some guide somewhere to create a linux uefi binary with the initramfs included actually. Wanted to check that out at some point, never got to it
00:25cosurgi: ok. See you in 5 minutes, unless something weird happens.
00:25karolherbst: also, how do you do full disc encryption without an initramfs :O
00:28imirkin_: karolherbst: nothing wrong with modules. just something wrong with loading them from initrd
00:29imirkin_: karolherbst: i have an initrd - it decrypts the hd. it doesn't load modules.
00:29imirkin_: i never change it
00:36cosurgi: ok, so here is my new dmesg: https://pastebin.com/fdSAqBnX
00:36cosurgi: I have started the first xserver
00:36prOMiNd: imirkin_, 12 hours took to gather all that information, I can now parse memory tweak index used by vendor but the problem is - they shared indexes!
00:36prOMiNd: its impossible to determine the vendor that way
00:37cosurgi: that's the first xserver log: https://pastebin.com/G1xpjtav
00:37prOMiNd: why should always be that hard when it comes to nvidia
00:37cosurgi: now will try starting second one
00:38cosurgi: hey. wow!
00:38cosurgi: it worked!
00:39cosurgi: let's try a third one
00:40cosurgi: hey! it worked too :)) this time without xorg.conf, just xrandr
00:40cosurgi: let's start a fourth one!
00:40prOMiNd: pfuse you pointed to doesn't provide much information
00:41prOMiNd: none of the strap values are found in *known* registers
00:41cosurgi: works too :)))
00:41cosurgi: the best thing about nouveau is that now switching between virtual consoles is so fast
00:41cosurgi: also, glxgears prints this:
00:42cosurgi: nvc0_screen_create:937 - Error allocating PGRAPH context for 3D: -22
00:42cosurgi: libGL error: failed to create dri screen
00:42cosurgi: libGL error: failed to load driver: nouveau
00:42karolherbst: it shouldn't
00:42cosurgi: that looks disturbing
00:43karolherbst: dmesg please?
00:43karolherbst: ohhh maybe mesa is too old as well :D
00:43cosurgi: ok, so here is my new dmesg: https://pastebin.com/fdSAqBnX
00:43cosurgi: ohhh. OK, I will backport mesa
00:43karolherbst: yeah.. dmesg looks fine
00:43karolherbst: what version of mesa do you have?
00:43cosurgi: that's the first xserver log: https://pastebin.com/G1xpjtav
00:43cosurgi: hold on
00:44cosurgi: libegl1-mesa 13.0.6-1+b2
00:44cosurgi: mesa-utils 8.3.0-3
00:45karolherbst: 13.0 mhhh
00:45cosurgi: mesa-common-dev 13.0.6-1+b2 libglu1-mesa-dev 9.0.0-2.1
00:45karolherbst: imirkin_: do you know how well pascal was supported with 13.0?
00:45karolherbst: but.. I guess it would be probably better going with something newer? dunno
00:46imirkin_: sounds like "not at all"
00:47imirkin_: given that nouveau_dri didn't load at all
00:47imirkin_: however it doesn't really matter
00:47imirkin_: you only want that for acceleration
00:47imirkin_: my guess is that cosurgi's usage doesn't involve a lot of that
00:47cosurgi: I want acceleration.. I code stuff in OpenGl
00:47imirkin_: then boy are you in for a surprise
00:47imirkin_: because you might have an expectation of performance
00:47cosurgi: not really
00:47imirkin_: and conformance
00:47cosurgi: I don't write games
00:48imirkin_: we got neither! :)
00:48cosurgi: I only need to draw some quantum mechanics wavefunctions and stuff like that
00:48imirkin_: as long as you don't get too fancy with your GL code, should be fine
00:48cosurgi: but without acceleration it's too slow.
00:48karolherbst: imirkin_: well, we are quite good regarding conformance though
00:48karolherbst: or not that bad
00:49imirkin_: yeah... i did a wavefunction sim in GL back in HS. even created a red/green glasses version of it. good times.
00:49cosurgi: heheh. Cool! Mine are for writing scientific papers ;)
00:49imirkin_: watching the (1d) waveform quantum-tunnel through a potential...
00:49imirkin_: yeah, i'm sure your usage is a bit more advanced
00:50imirkin_: this was also ... a long time ago
00:50imirkin_: GL 1 days :)
00:50cosurgi: oh, wow :)
00:50cosurgi: ok. So I am backporting mesa now.
00:51imirkin_: and my genius scheme of just renormalizing the wavefunction on every pass, since i could never get the numbers to add up otherwise :)
00:51cosurgi: lol :))
00:51imirkin_: i can't believe how long ago it was
00:51imirkin_:is so old
00:51cosurgi: in fact it's pretty common to check normalization to do a basic verification of the computation algorithm.
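The normalization check cosurgi mentions can be sketched as follows: discretize the wavefunction on a grid, sum |psi|^2 * dx, and compare against the expected value. The Gaussian wave packet, grid size and function names here are assumptions chosen for illustration, not anyone's actual simulation code.

```python
import cmath
import math

def norm(psi, dx):
    """Riemann-sum approximation of the integral of |psi|^2 over the grid."""
    return sum(abs(a) ** 2 for a in psi) * dx

def renormalize(psi, dx):
    """Rescale psi so that norm(psi, dx) becomes 1."""
    scale = 1.0 / math.sqrt(norm(psi, dx))
    return [a * scale for a in psi]

def gaussian_packet(n=2001, half_width=10.0, k=2.0):
    """Unnormalized packet exp(-x^2/2) * exp(i*k*x) sampled on [-half_width, half_width]."""
    dx = 2 * half_width / (n - 1)
    xs = [-half_width + i * dx for i in range(n)]
    psi = [cmath.exp(-x * x / 2 + 1j * k * x) for x in xs]
    return psi, dx

if __name__ == "__main__":
    psi, dx = gaussian_packet()
    print("norm before:", norm(psi, dx))  # analytically sqrt(pi) for this packet
    print("norm after: ", norm(renormalize(psi, dx), dx))
```

If the norm drifts during time stepping, that points at the integration scheme; renormalizing on every pass hides the drift rather than fixing it.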
00:52karolherbst: heh... is there some magic towards ioctl I don't know?
00:53imirkin_: karolherbst: seems eminently likely
00:53karolherbst: I literally copied a working ioctl and everything we have to do for that...
00:53karolherbst: I am sure my patch is correct
00:53imirkin_: cosurgi: anyways, as long as you're not writing a GL test suite, we're fairly conformant
00:54imirkin_: but once you try to measure our conformance, all bets are off :)
00:54karolherbst: imirkin_: https://gist.github.com/karolherbst/8792e93ffa4dc77e8e75e6892f9a814b any thoughts?
00:54cosurgi: hmm.. why do I have libglu1-mesa-dev and libglu1-mesa at ver.9.0.0-2.1 and there is no new version available, while all other *mesa* packages I have at version 13.0.6-1+b2 and there is ver. 18.3.2-1 available?
00:54cosurgi: maybe I should uninstall libglu1-mesa-dev and libglu1-mesa ?
00:55imirkin_: karolherbst: and what problem are you getting?
00:55karolherbst: imirkin_: drm_nouveau_channel_killed is all 0
00:55cosurgi: imirkin_: that's good for me. The most important thing for me is stability. xserver please do not crash. Or if you must crash, please don't freeze the computer (like nvidia), and let other xservers live their own lives.
00:56karolherbst: imirkin_: even though I am sure I write the channel id into that object before it reaches ioctl()
00:56karolherbst: I checked with gdb
00:56karolherbst: It reaches "chan = nouveau_abi16_chan(abi16, req->channel);", but chan is NULL obviously
00:56imirkin_: cosurgi: yeah ... that's unlikely to end well. when you anger the GL, it will freeze the comp
00:56karolherbst: I try to improve the situation :p
00:57imirkin_: cosurgi: best to try not to anger it
00:57imirkin_: and look into saving workspaces in your editor
00:57imirkin_: it's been great for me in emacs
00:57cosurgi: imirkin_: so it's safer to not use OpenGL? :) Maybe I could live with slow-drawing wavefunctions. Then it will not freeze the computer?
00:57karolherbst: it should be fine for most things
00:57imirkin_: cosurgi: yeah, way safer
00:57cosurgi: yeah, gvim saves workspaces great. I use it all the time.
00:57imirkin_: but mostly it'll be more complex applications
00:58imirkin_: ones that use multi-threading
00:58imirkin_: unfortunately many recent UI toolkits think it's hilarious to draw their buttons using opengl
00:59cosurgi: hmm... in my case I have two threads. One to draw stuff in OpenGL and other thread to calculate stuff (possibly using lots of threads). Those two threads use mutex to make sure that the right stuff is drawn. And that's all about threads here.
00:59imirkin_: if you're looking to maximize stability, get rid of nouveau_dri.so
00:59karolherbst: most of the time there isn't even a benefit in doing so I assume :/
00:59imirkin_: cosurgi: yeah, that's fine
00:59karolherbst: cosurgi: it's about multiple GL contexts, this is quite problematic
00:59imirkin_: i meant multiple threads doing GL calls
00:59imirkin_: (in different contexts, all nice and legal, but still breaks our driver)
01:01cosurgi: So I could not run many instances of my program to calculate & draw stuff?
01:01imirkin_: that's fine
01:01imirkin_: just don't do it from a single process
01:01cosurgi: ah, ok.
01:02cosurgi: maybe that explains why one of my coworkers disabled the ability to open multiple OpenGL windows to look at the calculated stuff from different views.
01:03cosurgi: imirkin_: because that would be multiple contexts from the same process?
01:03imirkin_: which would still be fine for nouveau
01:03imirkin_: just not from different threads, at the same time :)
01:03cosurgi: so that would be wrong if each window was a separate thread?
01:04cosurgi: OK. I am not going to code that. I don't need it. One thread for graphics is enough for me :)
01:05cosurgi: so maybe he had other problems with more windows. I don't remember why he did that :)
01:06cosurgi: whoa. build dependencies for mesa are 400MB
01:07imirkin_: cosurgi: there's only one hw context
01:07imirkin_: they write all over each other
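The layout cosurgi describes (one thread that draws, another that calculates, a mutex in between) can be sketched without any real GL calls. Everything here is a hypothetical stand-in: `draw` only records which thread called it, and the class and function names are made up for illustration.

```python
import threading

class SharedFrame:
    """Latest computed data, guarded by a mutex."""
    def __init__(self):
        self._lock = threading.Lock()
        self._data = None

    def publish(self, data):
        with self._lock:
            self._data = data

    def snapshot(self):
        with self._lock:
            return self._data

def run_demo(steps=5):
    shared = SharedFrame()
    done = threading.Event()
    draw_threads = set()
    drawn = []

    def draw(frame):
        # Stand-in for the real GL calls; records which thread issued them.
        draw_threads.add(threading.get_ident())
        drawn.append(frame)

    def compute():
        for i in range(steps):
            shared.publish([i] * 4)
        done.set()

    def render():
        while True:
            # Read the flag first so the final frame is always drawn once.
            finished = done.is_set()
            frame = shared.snapshot()
            if frame is not None:
                draw(frame)
            if finished:
                return

    threads = [threading.Thread(target=compute), threading.Thread(target=render)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return draw_threads, drawn[-1]
```

The point of the design is that only `render` ever calls `draw`, so all drawing comes from a single thread, which is the pattern imirkin_ says is fine.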
01:07imirkin_: and yeah, mesa depends on llvm
01:07imirkin_: which is not small
01:08cosurgi: does it need wayland?
01:09cosurgi: I see libwayland-egl1 in dependencies
01:09imirkin_: not necessarily
01:11imirkin_: only if you want it to work with wayland
01:12cosurgi: I don't like wayland.
01:12cosurgi: I gave it a try last year. It was terrible.
01:13imirkin_: i don't anticipate moving off X11 anytime soon either
01:15imirkin_: maybe in another 10y or so? who knows.
01:15karolherbst: compositors aren't in a great shape... that's true
01:15imirkin_: it'll basically be when it becomes harder to stay on X11 than to move to $other thing
01:16cosurgi: I use compton sometimes.
01:16imirkin_: all fancy-like
01:17cosurgi: when I need transparency or inverse colored windows. After I'm done I kill it. All done with a few keyboard commands, and meta-mouseScroller to change transparency amount.
01:17imirkin_: xcalib -i -a
01:18imirkin_: (ok, that doesn't get you transparency)
01:19karolherbst: imirkin_: I am stupid :/
01:19karolherbst: that's like totally wrong
01:19imirkin_: i actually saw that
01:19imirkin_: but wasn't sure what that was about
01:19karolherbst: well, the ioctl only reads data
01:19imirkin_: so i figured IOR could be right
01:19karolherbst: or well, userspace is only allowed to read
01:19karolherbst: more like that
01:19imirkin_: ah :)
01:20karolherbst: needs IOWR
01:20imirkin_: yeah, didn't know which direction it was in
01:20karolherbst: yeah... I was too optimistic
01:20karolherbst: but from a kernel security point of view it makes sense how it is now
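The IOR vs IOWR mix-up is easier to see with the ioctl number encoding written out. The sketch below reimplements the asm-generic Linux layout (2 direction bits, 14 size bits, 8 type bits, 8 number bits) in Python; some architectures use different values. The type character 'd' is the DRM ioctl base, while the command number 0x55 and the 8-byte struct size are hypothetical. As far as I understand the DRM core, the argument struct is only copied in from userspace when the write bit is set, which would explain the all-zero request on the kernel side.

```python
# asm-generic ioctl encoding (x86, ARM and most others); an assumption elsewhere.
_IOC_NRBITS, _IOC_TYPEBITS, _IOC_SIZEBITS = 8, 8, 14
_IOC_NRSHIFT = 0
_IOC_TYPESHIFT = _IOC_NRSHIFT + _IOC_NRBITS      # 8
_IOC_SIZESHIFT = _IOC_TYPESHIFT + _IOC_TYPEBITS  # 16
_IOC_DIRSHIFT = _IOC_SIZESHIFT + _IOC_SIZEBITS   # 30
_IOC_NONE, _IOC_WRITE, _IOC_READ = 0, 1, 2

def ioc(direction, type_char, nr, size):
    """Build an ioctl request number from its four fields."""
    return ((direction << _IOC_DIRSHIFT) | (size << _IOC_SIZESHIFT)
            | (ord(type_char) << _IOC_TYPESHIFT) | (nr << _IOC_NRSHIFT))

def ior(type_char, nr, size):
    """Userspace only reads: the kernel copies the struct out."""
    return ioc(_IOC_READ, type_char, nr, size)

def iowr(type_char, nr, size):
    """Userspace writes and reads: the struct is copied in and back out."""
    return ioc(_IOC_READ | _IOC_WRITE, type_char, nr, size)

def kernel_copies_in(cmd):
    """True if the direction bits say userspace passes data to the kernel."""
    return bool((cmd >> _IOC_DIRSHIFT) & _IOC_WRITE)

if __name__ == "__main__":
    # Hypothetical command number and struct size, for illustration only.
    for name, cmd in (("IOR", ior("d", 0x55, 8)), ("IOWR", iowr("d", 0x55, 8))):
        print(f"{name}: {cmd:#010x} copies in: {kernel_copies_in(cmd)}")
```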
01:20cosurgi: imirkin_: I have llvm-4.0-dev, mesa dependency is on llvm-7-dev. Do I need to backport llvm-7-dev ? Or would it work with llvm-4.0-dev ?
01:20imirkin_: not sure
01:21imirkin_: for amd, you definitely need the latest
01:21karolherbst: you can disable llvm
01:21cosurgi: ech damn.
01:21imirkin_: but for llvmpipe, which is what you need
01:21imirkin_: i think 4 should be ok
01:21karolherbst: imirkin_: I doubt the code would compile
01:21cosurgi: ok, I'll give it a try
01:21imirkin_: karolherbst: pretty sure llvmpipe supports a much wider range of llvm's
01:21imirkin_: (by design)
01:22imirkin_: cosurgi: llvm is only useful in your case if you (a) want to use software rendering or (b) want to use select buffers
01:22imirkin_: so you can just as well build without it if neither of those apply to you
01:23karolherbst: imirkin_: it works now :)
01:23karolherbst: the ioctl thing
01:23cosurgi: what are select buffers? That's an OpenGL thing where I could draw to multiple screens?
01:23imirkin_: it's an opengl thing before transform feedback was a thing
01:24imirkin_: lets you identify which polygon is on "top" for any particular fragment
01:24imirkin_: and http://docs.gl/gl3/glRenderMode
01:25imirkin_: basically those features are tricky to implement
01:25imirkin_: so the solution is to not to :)
01:25imirkin_: i.e. we just fall back to software rast for those
01:25imirkin_: (and when i say "we", i mean *every* mesa driver, not just nouveau)
01:26cosurgi: ok :)
01:26imirkin_: marek (amd maintainer) and i have been talking about implementing something for it, so we can finally drop that last fallback case
01:26imirkin_: but thus far it's been nothing more than hot air
01:27imirkin_: long story short, it's annoying to implement when you also have a regular geometry shader bound
01:29cosurgi: well damn. the dependencies to compile mesa are a bit more difficult than the other stuff.
01:29imirkin_: what more do you need?
01:30cosurgi: well, the present error relates to package called meson
01:30imirkin_: oh ffs
01:30imirkin_: that's not needed either
01:30cosurgi: tell me one thing: if I finally compile mesa - would graphics acceleration start working only by restart of xserver, and not restart of the computer?
01:31imirkin_: technically you didn't need to restart before either
01:31karolherbst: imirkin_: nice... abort() was called after a channel was killed :) seems like it works now
01:31imirkin_: it was just a lot easier to do that
01:31karolherbst: now to more fancier handling
01:32cosurgi: So I will get to that later when slow OpenGL will annoy me too much. :) Oh, really? No need to restart xserver at all? That's awesome!
01:32imirkin_: you could have properly unbound nouveau from the console, and then reloaded the module
01:32imirkin_: i meant restart the computer to load the firmware
01:32cosurgi: ah, right. ok :)
01:32imirkin_: you probably need to restart the X server, coz the X server has to be aware of the proper acceleration stuff
01:32imirkin_: due to silly iglx stuff which is no longer used in practice
01:32karolherbst: I doubt restarting X is actually required
01:32karolherbst: should just work
01:33cosurgi: so we will learn about that.
01:33imirkin_: karolherbst: X server looks into nouveau_dri.so for the configs
01:33imirkin_: so if the nouveau_dri.so that it has loaded doesn't recognize the gpu, then it won't work
01:33cosurgi: Right now I really need to go to sleep, cos it's 2:33 here. And I'm falling asleep while typing.
01:33imirkin_: nite :)
01:33karolherbst: cosurgi: that's when I become actually productive
01:34karolherbst: uhm.. a few hours earlier
01:34imirkin_: karolherbst: of course you wake up at midnight, so ...
01:34karolherbst: imirkin_: :D I don't
01:34cosurgi: heheh. Me too, sometimes. But today I woke up at 9.
01:34karolherbst: I actually had to wake up at 11am today
01:34karolherbst: just to notice that meeting was moved by 30 minutes
01:34imirkin_: oh man. SO EARLY
01:34karolherbst: could have slept longer
01:34karolherbst: imirkin_: you don't want to know when I was heading out to go to the office today
01:34imirkin_: all those unreasonable PHB's
01:35imirkin_: dude, wtvr. i've headed into the office at 7pm before
01:35imirkin_: nyc is very peaceful and quiet right around 4am
01:35karolherbst: mhhh, yeah, never done that actually
01:35karolherbst: what a lame city :p
01:35cosurgi: and anyway, without mesa my xservers should be more stable right?
01:35imirkin_: cosurgi: same amount of stable
01:36imirkin_: although other applications won't cause your gpu to die
01:36karolherbst: but I kind of got the feeling that within the US, the day ends like 2-3 hours earlier than in europe
01:36karolherbst: super weird
01:36imirkin_: the nouveau ddx actually provides 2d accel for various X drawing APIs
01:36imirkin_: so it's not completely unaccelerated
01:36imirkin_: karolherbst: yeah, people are crazy here. going in to work at like 7am
01:37imirkin_: everything shifts earlier and earlier
01:37karolherbst: XDC was odd, we were going out after the talks, at midnight everything was essentially closed
01:37imirkin_: oh, in mountain view?
01:37karolherbst: yeah :D
01:37imirkin_: everything's closed by like 8pm there
01:37karolherbst: not in the centre :D
01:37imirkin_: it's a real problem. i almost wasn't able to get dinner once on a flight that got in a little late.
01:37karolherbst: or where the bars were
01:38cosurgi: imirkin_: ok, so without mesa my OpenGL stuff under heavy load and three months is more likely to not crash my computer? :)
01:38imirkin_: found a 24hr subway
01:38imirkin_: cosurgi: that's definitely my setup -- i don't do anything heavy, and have months of uptime
01:38cosurgi: ok. So perhaps I'll try that :)
01:39imirkin_: i have GL for accel, but i avoid GL-heavy usage
01:39imirkin_: the only times i crash is when i make a mistake in my mesa development
01:39imirkin_: (the mistake being, "let's debug this crash someone reported")
01:39karolherbst: I was commenting on how the suburbs actually look like the ones in GTA :D
01:39cosurgi: imirkin_: and last question to be sure: using xorg.conf is better to use than without config file+xrandr ?
01:40imirkin_: cosurgi: i prefer it. your call.
01:41imirkin_: either works. i just like for things to come up correctly up-front
01:41imirkin_: karolherbst: silicon valley closes down absurdly early. i think it's true of many suburbs. even in nyc, when i lived in chinatown, everything shut down between 8 and 10. very weird.
01:41cosurgi: they come up correctly up front because inside ~/.xsession I have this line:
01:41cosurgi: xrandr --output HDMI-1 --off --output DP-1 --mode 3840x2160 --pos 2160x0 --rotate left --output DVI-D-1 --off --output DP-2 --mode 3840x2160 --pos 0x0 --rotate left --output DP-3 --mode 3840x2160 --pos 4320x0 --rotate left
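The 6480x3840 figure mentioned earlier follows directly from this xrandr line: three 3840x2160 outputs, each rotated left, placed side by side. A small sketch of that arithmetic is below; the function names are made up, and the 4 bytes per pixel used for the size estimate is an assumption (32-bit colour).

```python
def rotated_size(width, height, rotation):
    """Size of an output after rotation; left/right swap width and height."""
    if rotation in ("left", "right"):
        return height, width
    return width, height

def bounding_box(outputs):
    """Overall screen size for a list of ((w, h), (x, y), rotation) outputs."""
    max_x = max_y = 0
    for (w, h), (x, y), rotation in outputs:
        rw, rh = rotated_size(w, h, rotation)
        max_x = max(max_x, x + rw)
        max_y = max(max_y, y + rh)
    return max_x, max_y

# Modes, positions and rotations taken from the xrandr command above.
LAYOUT = [
    ((3840, 2160), (0, 0), "left"),     # DP-2
    ((3840, 2160), (2160, 0), "left"),  # DP-1
    ((3840, 2160), (4320, 0), "left"),  # DP-3
]

if __name__ == "__main__":
    width, height = bounding_box(LAYOUT)
    print(f"{width}x{height}")
    # Assumes 4 bytes per pixel; the real pitch and alignment may differ.
    print(width * height * 4, "bytes per full-screen buffer")
```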
01:42imirkin_: cosurgi: but the whole RightOf/etc stuff should also get it to be right
01:42imirkin_: (in the xorg config)
01:42imirkin_: if it's in your xsession, that means it comes up wrong, and then is immediately switched
01:43cosurgi: imirkin_: true. And that is the only difference? Or are there other differences like 2d acceleration or X drawing APIs ?
01:44imirkin_: none at all
01:44imirkin_: and you can still do all the same xrandr stuff
01:44karolherbst: imirkin_: it's like they only live to work over there :p
01:44imirkin_: it just seeds the initial config
01:44cosurgi: Previously I couldn't do xrandr changes when I had xorg.conf
01:44imirkin_: that's not wholly true. when you had some stuff in xorg.conf, that disables xrandr
01:44imirkin_: like if you enabled xinerama
01:44imirkin_: or other things which kill randr
01:45cosurgi: I see, that could have been the case.
01:45imirkin_: e.g. you can run "xrandr" now
01:45imirkin_: and see how it shows you the current setup
01:45cosurgi: I even changed it with arandr
01:46cosurgi: and it worked. And this session started with that xorg.conf, which you have seen already https://pastebin.com/gteBRmXU
01:46imirkin_: welcome to the future
01:47cosurgi: verrrry nice :)
01:47imirkin_: i do wonder how a 3x 4k desktop will perform on a lowest-clocked gpu
01:47cosurgi: OK. I must quickly restart the 32-core calculations which I had to abort to do this restart. Then I must sleep :)
01:47karolherbst: huh, why would a game be interested in anything nvc0_hw_get_query_result would return?
01:47karolherbst: weird game
01:48imirkin_: karolherbst: huh?
01:48cosurgi: imirkin_: I don't know how to check the GPU clock
01:48karolherbst: ohh "FOpenGLDynamicRHI::GetOcclusionQueryResult"
01:48cosurgi: or change it
01:48karolherbst: that makes sense
01:48imirkin_: cosurgi: you can't
01:48imirkin_: but that's how these gpu's come up
01:48imirkin_: and we can't change it
01:49cosurgi: ah. ok. So I'm glad that I picked a sufficient one :) At least it looks like so.
01:51karolherbst: mhh, we should probably return from some ioctls earlier if we know the channel is dead as well
01:51karolherbst: no point in waiting if there is nothing to be waiting for
01:53cosurgi: a kernel oops or sth like that
01:53imirkin_: get dmesg?
01:54cosurgi: I just stopped one of my xservers. I wanted to see if it was started with or without xorg.conf
01:55cosurgi: Then all screens went black. Ctrl-Alt-Fn would not work.
01:55cosurgi: But I could ssh remotely and chvt 1 worked.
01:55cosurgi: so nothing else was broken, fortunately, other xservers did not die.
01:55imirkin_: oh, we've seen that before
01:55imirkin_: it's not fatal
01:56cosurgi: what is it?
01:56imirkin_: a bug in the code :)
01:56imirkin_: however skeggsb has been unable to figure out why it happens
01:56cosurgi: more details? :)
01:56imirkin_: [ 4596.880170] WARNING: CPU: 22 PID: 18282 at drivers/gpu/drm/nouveau/nvif/vmm.c:71 nvif_vmm_put+0x65/0x70 [nouveau]
01:56imirkin_: you know where to look in the code... have at it
01:57cosurgi: ah, right. I didn't see this line number :)
01:57cosurgi: if it happens frequently for me, then maybe I could help you with it. It depends on how it will annoy me :)
01:57imirkin_: more importantly, skeggsb has been unable to reproduce it
01:57imirkin_: there's a certain class of bugs out there, which happen to everyone except skeggsb
01:57imirkin_: this is one of them
01:58imirkin_: (skeggsb is the primary author of the kernel module)
01:58cosurgi: ok! good to know :)
01:58imirkin_: (maybe "everyone" is an overstatement. but to some people)
01:58imirkin_: and it's not for a lack of trying
01:59cosurgi: I just did it again.
02:00cosurgi: hmm. no.
02:00cosurgi: It's something else.
02:00cosurgi: Only this now in dmesg:
02:00cosurgi: [ 5146.027434] nouveau 0000:04:00.0: gr: intr 00000040
02:00cosurgi: [ 5146.640644] nouveau 0000:04:00.0: disp: 0x000062c3: INIT_GENERIC_CONDITON: unknown 0x07
02:00imirkin_: that's fine, don't worry about that
02:00cosurgi: But the screens went black, and I had to ssh remotely.
02:00cosurgi: to type `chvt 4`
02:00imirkin_: that means something DP-related went bad
02:00imirkin_: you could have switched on keyboard too
02:01imirkin_: DP involves link-training
02:01cosurgi: not sure if I tried this time.
02:01imirkin_: which can unfortunately fail
02:01imirkin_: i suspect nouveau's handling of such failures is ... imperfect
02:03cosurgi: yeah. Here's what I do: `startx -- -nolisten tcp -dpi 100` (without config), when it starts I launch glxgears (not sure if necessary), then I launch compton, then I inverse colors in glxgears, then I click logoff from the xserver. Then screens go black. This appears:
02:03cosurgi: [ 5294.503925] nouveau 0000:04:00.0: gr: intr 00000040
02:03cosurgi: [ 5295.117722] nouveau 0000:04:00.0: disp: 0x000062c3: INIT_GENERIC_CONDITON: unknown 0x07
02:03cosurgi: and Ctrl-Alt-Fn does not work. Must ssh remotely.
02:03imirkin_: my guess is that all the in-between stuff doesn't matter
02:04imirkin_: i.e. startx + logoff
02:04cosurgi: ok, will try that.
02:04imirkin_: it's possible that compton matters
02:04cosurgi: yes, you are right.
02:04cosurgi: startx+logoff is enough.
02:04imirkin_: i think the X server crashes
02:04imirkin_: check your Xorg.0.log.old
02:05imirkin_: i see the same crash with rotation
02:05imirkin_: and in crashing, the X server doesn't properly restore the vt stuff
02:08cosurgi: damn, I am not sure which /var/log/Xorg.*log it is.
02:09imirkin_: latest one by timestamp. and .old
02:09cosurgi: but I am writing to you from one xserver, and restarting another.
02:09cosurgi: There is no .old
02:10cosurgi:checks timestamps more
02:11cosurgi: ok. I found it.
02:12cosurgi: only these UnloadModule lines appear when I log out.
02:13cosurgi: No idea what is this "[ 5829.243] -2" line but it was there before I did log out.
02:13imirkin_: nouveau prints some errors sometimes ... with less-than-perfect messages
02:18imirkin_: anyways, your thing doesn't crash
02:21cosurgi: but evidently when I exit xserver something is wrong. And fortunately ssh can fix it.
02:24imirkin_: i'm not sure what chvt does that ctrl+alt+fX doesn't
02:24cosurgi: or maybe I have to wait a bit more? I notice that `chvt 11` immediately after exiting xserver does not work. It worked after few seconds. Although previously it wasn't like that.
02:24cosurgi: you are right. It might be keyboard related.
02:24imirkin_: all that stuff is black magic to me
02:25imirkin_: KD_TEXT vs KD_GRAPHICS, fbdev, console, vt's
02:25imirkin_: i never understood any of it
02:25imirkin_: and have no plans to start now
02:29cosurgi: hey, wow. gvim is faster upon full-3-screen refresh!
02:29cosurgi: faster than nvidia.
02:29imirkin_: nouveau's pretty good at 2d
02:29cosurgi: Like 2 seconds faster.
02:29imirkin_: also for reasons i never quite understood, text looks *different* on nvidia vs nouveau
02:29imirkin_: i could never quite put my finger on it
02:29cosurgi: When I resize gvim for 3-screens, the first refresh was pretty slow.
02:29imirkin_: maybe some anti-aliasing or something
02:31cosurgi: fortunately not for me :) I use bitmap fonts.
02:31cosurgi: -fn "-Misc-Fixed-Medium-R-Normal--20-200-75-75-C-100-ISO10646-1"
02:31cosurgi: that's my xterm launch script:
02:31cosurgi: xterm -en UTF-8 -b 0 -bg black -fg darkgray -si -sk -geometry 80x30 -fn "-Misc-Fixed-Medium-R-Normal--20-200-75-75-C-100-ISO10646-1" -sl 80000 -j -sb -rightbar -xrm "xterm*metaSendsEscape:true" -xrm "xterm*Color3:yellow4" -xrm "xterm*Color10:green3" -xrm "xterm*vt100.translations: #override Shift <Key> Insert: insert-selection(CLIPBOARD, CUT_BUFFER1)" -xrm "XTerm*pointerColor:yellow" -xrm "xterm*pointerShape:xterm" -x
02:31imirkin_: yeah, but like ... with emacs standard fonts
02:34cosurgi: ok. calculations restarted.
02:34cosurgi: Now I can really go to sleep. at 3:34 am. uuh
02:34cosurgi: good night :)
02:34cosurgi: thanks a lot for your help!
02:35cosurgi: I'll hang around and keep you informed about bugs ;>
02:35imirkin_: sounds good
02:50cosurgi: hmm, but xterm scrolling is slower :(
02:51imirkin_: scrolling or outputting additional lines?
02:51imirkin_: i've definitely noticed the latter is surprisingly slow
02:51cosurgi: and yeah. outputting additional lines too
02:52cosurgi: can we fix that by messing in nouveau's code? :)
02:52imirkin_: with enough messing, sure
09:15cosurgi: nice! my xservers didn't crash while I was sleeping ;)
09:16cosurgi: now, let this goodness continue for 12 months :)
09:22cosurgi: chromium is slower than with nvidia :( also: libGL error: failed to load driver: nouveau
09:22cosurgi: imirkin_: would mesa drivers speed up chromium? I change font size on some webpage and I have to wait 3 seconds :/
09:23cosurgi: *recompiled mesa devuan package
09:31cosurgi: imirkin_: whoa, xterm scrolling is so slow that you would not believe it. On nvidia that was an instant refresh. I made a video, I am uploading it now.
09:32cosurgi: imirkin_: if it is so slow, then maybe profiling would make it a lot easier to spot which function takes so much time.
10:29cosurgi: imirkin_: https://youtu.be/W2_DnniDaGI
10:30cosurgi: hmm it's very low quality I dunno why. But the refresh rate is real. You can see it.
10:36cosurgi: imirkin_: damn, writing text in chromium is also crazy slow :( The letters appear after I wrote almost entire sentence...
11:45imirkin: cosurgi: i'm not sure what will help, tbh. you've got a giant framebuffer, and it could be that rasterizing the whole thing gets too slow. you can try adding Option "NoAccel" "true" and see if that speeds things up. (counterintuitive, i know, but worth a shot.)
11:45imirkin: (nvidia driver can reclock the memory/etc to proper rates, so it's not really a valid comparison of perf)
11:47cosurgi: ok. in which section of xorg.conf do I put this option?
11:47cosurgi: imirkin: ?
11:48cosurgi: ok. trying...
11:48imirkin: you have an added complication in that you have rotated fb's
11:48imirkin: which means an EXTRA copy
11:50cosurgi: uh. There's no section "Driver", should I make an empty one?
11:50cosurgi: Only: Section "Monitor" x 3 and Section "Device"
11:51imirkin: oh my bad. Device.
11:51imirkin: was going by memory =/
11:51cosurgi: It's the xorg.conf which we have made together. OK
11:51imirkin: and my memory -- apparently -- looks like swiss cheese
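For reference, a minimal sketch of where the option goes. Only the Option line comes from the discussion; the Identifier value is a placeholder, and the Driver line assumes the nouveau DDX is what this Device section already selects.

```
Section "Device"
    Identifier "nvidia-card"      # placeholder; keep whatever the existing section uses
    Driver     "nouveau"          # assumption: the nouveau DDX is already configured here
    Option     "NoAccel" "true"   # disables acceleration, as suggested above
EndSection
```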
11:51cosurgi: ok, restarting one of the xservers...
11:55cosurgi: whoa. I have never seen so slow refresh of the wallpaper.
11:56cosurgi: Like 30 seconds to draw wallpaper on one screen.
11:56cosurgi: dragging a window is like on vesa driver 10 years ago.
11:57cosurgi: imirkin: definitely this did not help ;)
12:02cosurgi: so you say that profiling code would not help, and better to focus on clocking?
12:02imirkin: profiling could help too
12:03imirkin: but the profile could say "you spend all your time waiting for the gpu to complete"
12:03cosurgi: so if you want me to run code with any patches, I am happy to try this.
12:03imirkin: i don't have anything, really
12:03imirkin: also, note that new versions of chrome blacklist nouveau entirely, across all versions and GPUs
12:03imirkin: [for acceleration]
12:03cosurgi: I don't mean code improvement patches. I mean: more dmesg info about how much time each function call takes.
12:04cosurgi: I don't know which functions need profiling. But you probably know?
12:04imirkin: just run 'perf' on the X server, i think
12:04imirkin: see where the time goes
12:04cosurgi: oh, I didn't know that!
12:05cosurgi: how do I do that? Just type `perf startx -- -nolisten tcp -dpi 100` ?
12:05cosurgi: where will the output go?
12:05imirkin: should probably read up on it
12:05cosurgi: man perf is pretty short :/
12:06cosurgi: ok. I'll try using perf.
12:06cosurgi: So next thing: how about clocking?
12:09RSpliet: cosurgi: TL;DR: for your GPU (Pascal gen) you maybe, just maybe, are able to change the frequency of all clocks except DRAM. To the best of my knowledge though there is no fan control or changing of DRAM clocks/parameters in nouveau for these cards as a result of "the firmware problem" (NVIDIA not releasing firmware, us not being able to cryptographically sign firmware such that the GPU gives the firmware full access to all registe
12:12cosurgi: damn. How do I check if maybe I can change the frequency of all clocks except DRAM?
12:12cosurgi: and without fan control I can fry my card, right?
12:13RSpliet: cosurgi: you can monitor temperatures using the sensors tool.
12:13imirkin: cosurgi: nah, these boards have auto-shutoff
12:13RSpliet: as for clocks... you can try and play around with /sys/kernel/debug/dri/<number>/pstate
12:13imirkin: nothing's impossible
12:13RSpliet: Not sure if it's writeable for your card. If it isn't... there might be tricks.
12:13imirkin: but let's say, unlikely
12:14imirkin: it's not writable on pascal
12:14cosurgi: uhh, /sys/kernel/debug/ is empty :/ I need to recompile kernel with some extra flag.
12:14RSpliet: or mount your debugfs there
12:14RSpliet: mount -t debugfs debugfs /sys/kernel/debug
12:15cosurgi: I see numbers 0 and 128
12:16RSpliet: I'm guessing 0, but have a browse around in both...
12:16cosurgi: 0 has dirs like DP-1, DP-2 and all others. 128 has just few files.
12:16cosurgi: wow, /sys/kernel/debug/dri/128/clients has a nice summary of running xservers :)
12:16imirkin: 0 and 128 are identical
12:17imirkin: one's for the modeset node, one's for the render node
12:18cosurgi: so what do I do with /sys/kernel/debug/dri/0/pstate ?
12:19imirkin: you can cat it
12:19imirkin: but that's it
12:19cosurgi: cat: /sys/kernel/debug/dri/0/pstate: No such device
12:19imirkin: not even that then :)
12:19RSpliet: as root
12:19RSpliet: not even sudo'ing afaik
12:19cosurgi: tried as root
12:19imirkin: either way, the things RSpliet is talking about are hypothetical
12:19imirkin: in practice, pascal = no reclocking at all
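The pstate check above can be wrapped in a few lines of Python. This is only a sketch: the path is the one from the discussion (the card index may differ elsewhere), debugfs must already be mounted, and the script has to run as root. The helper name `read_pstate` is made up for illustration.

```python
import errno
from pathlib import Path

# Path taken from the discussion; the card index (0) may differ on other machines.
PSTATE_PATH = Path("/sys/kernel/debug/dri/0/pstate")


def read_pstate(path=PSTATE_PATH):
    """Return the contents of the pstate file, or a short reason why it is unreadable."""
    try:
        return path.read_text()
    except OSError as exc:
        # ENODEV corresponds to the "No such device" message seen above.
        if exc.errno == errno.ENODEV:
            return "pstate not available on this GPU"
        return f"cannot read {path}: {exc.strerror}"


if __name__ == "__main__":
    print(read_pstate())
```

On a card where the file reads fine, the returned text is whatever the kernel exposes; the function does not try to parse it.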
12:20RSpliet: ah ok, in that case there's a whole load of code waiting to be written by someone to make nouveau change your clocks at all ;-)
12:20cosurgi: ok. I see.
12:21cosurgi: well then I will later see what `perf startx -- -nolisten tcp -dpi 100` can do.
12:21imirkin: probably "perf record -g"
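A rough sketch of how that could look, assuming perf is attached to an already running X server by PID instead of wrapping startx (that choice is an assumption, not something stated in the discussion). The helper names and the example PID are hypothetical; `-g`, `-p`, `-o` and `-i` are standard perf options.

```python
import shlex


def perf_record_cmd(pid, output="perf.data"):
    """Build a `perf record` command that samples call graphs from a running process."""
    return ["perf", "record", "-g", "-p", str(pid), "-o", output]


def perf_report_cmd(output="perf.data"):
    """Build the matching `perf report` command for the recorded data file."""
    return ["perf", "report", "-i", output]


if __name__ == "__main__":
    xorg_pid = 1234  # hypothetical PID; look up the real one with e.g. `pidof Xorg`
    print(shlex.join(perf_record_cmd(xorg_pid)))
    print(shlex.join(perf_report_cmd()))
```

Recording runs until interrupted with Ctrl-C, so the slow xterm scroll can be reproduced while it is sampling; the report then shows where the time went.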
12:21cosurgi: question: mesa will not help xterm scrolling at all, right?
12:21RSpliet: Soz, haven't played with nouveau for a long time, so not completely up to date with the state of the driver wrt modern cards.
12:21imirkin: cosurgi: correct
12:21imirkin: you could try the modesetting driver again (with mesa though), in which case it'll use glamor to accelerate
12:21imirkin: glamor is an implementation of the X rendering protocol on top of GL
12:22cosurgi: OK. I will ask how to do that once I have mesa compiled :)
12:22imirkin: of course then you go deeper into the claws of GL
12:22imirkin: and gpu hangs
12:22imirkin: and random misrendering
12:22cosurgi: uh-oh. OK. Won't hurt to see if it affects xterm performance :)
12:23prOMiNd: imirkin, the saga continues, is there any chance to debug nvapi on windows and find the damn register?
12:23imirkin: prOMiNd: i know nothing about this stuff, sorry.
12:23prOMiNd: can you at least be a bit more specific about the pfuse thingie?
12:25imirkin: first of all there are straps (register 0x101000): https://github.com/envytools/envytools/blob/master/rnndb/io/pstraps.xml
12:25imirkin: secondly there are fuses that can be read: https://github.com/envytools/envytools/blob/master/rnndb/bus/pfuse.xml
12:26imirkin: good luck
12:37mwk: prOMiNd: what are you looking for?
12:38imirkin: RAM manufacturer
12:38prOMiNd: mission impossible :)
12:39prOMiNd: to determine memory manufacturer runtime
12:39mwk: wouldn't that be in vbios?
12:39cosurgi: huh? Why RAM manufacturer is of importance?
12:40prOMiNd: long story short
12:40prOMiNd: vbios has support for at least 1 vendor, most have 2 some even 3
12:40prOMiNd: I can determine the vendors supported and the straps they use, but the problem is that some vendors share the same straps (tweak indexes), so it's impossible to determine with 100% certainty
12:40prOMiNd: and because samsung, micron and hynix are optimized differently I can't just put the same optimization on all 3
12:43HdkR: woo peculiarities in memory training :p
12:43prOMiNd: type3=GDDR5, 1 = samsung, 6 = hynix, f = micron
12:44prOMiNd: problem here is 6 and 1 in lowest frequency level share same tweak id = 0 and its either hynix or samsung :)
13:52RSpliet: prOMiNd_: it doesn't have to be different vendors. Some VBIOSes can simply contain timing for multiple types of chips. Example: a graphics card could be released with either 1GiB or 2GiB of DRAM. To keep the DRAM bus width equal between the two, they'd use chips of different density for the two configurations. Both might well come from the same vendor, but density has an effect on DRAM timings. Hence the VBIOS could contain two co
13:54prOMiNd: RSpliet, that is not exactly what I am implying
13:54RSpliet: For as long as I've been looking at this stuff (... a long time), there has not been a reason to differentiate between DRAM vendors despite me at some point believing otherwise. The parameters in the VBIOS can be applied regardless of who the vendor of the chip was.
13:56RSpliet: That is: for specified operation of the card. I understand that overclocking could benefit from such information, but that is no guarantee that NVIDIA has provisioned for this in the VBIOS specification.
14:00RSpliet: prOMiNd: Also, the memory strap translation table you're looking at appears to contain many bogus entries. In practice GPU vendors only populate two or three entries, and just leave the rest at either a default or copy-paste value from a different VBIOS or whatnot.
14:00prOMiNd: RSpliet, there is significant difference in timings for vendors, each vendor has specific settings for training GDDR5
14:01HdkR: i mean, there are only a handful of vendors to handle for these cards :p
14:01RSpliet: prOMiNd: yes, which is why there are DRAM training pattern tables too, describing the patterns
14:01prOMiNd: what I do is simply doing what I did on AMD RX series, but was amused that NVIDIA actually had good factory settings, though some can still be optimized but are vendor specific
14:02prOMiNd: RSpliet, I do know all timing bitfields, they are public for a long time, but the problem is still the vendor
14:02prOMiNd: I can't simply set read-to-read-delay to 6 cycles on micron while samsung handles it pretty well
14:02RSpliet: prOMiNd: yes I helped reverse engineering those timing bitfields. Both in VBIOS and on the graphics card registers.
14:03prOMiNd: awesome! You've done an excellent job :)
14:03HdkR: latest GDDR6 cards can overclock the memory frequency by like +1100 commonly. silly 8.1GHz clock
14:03HdkR: must be some good training
14:04prOMiNd: I'll get there eventually, they have different problems which are low prio until i find the damn vendor :)
14:05RSpliet: thing is, from a nouveau point of view, we don't have to know the vendor or chip revision or w/e. There can be one or multiple configurations in the VBIOS, and the strap value (taken from reg 101000 IIRC) tells the driver which of the configurations to select. All the parameters will fall in place.
14:06RSpliet: Bear in mind it's the OEM populating the VBIOS, not NVIDIA. They know which chips they use, they know the parameters to fill in. A VBIOS does not have to be universal for every possible DRAM chip out there.
14:07RSpliet: (likewise, it's the OEM who controls the value of the strap register. They know "vendor X chip Y on this PCB is configuration 0 in our VBIOS, vendor X chip Z is conf 1"
14:08prOMiNd: yes that is known
14:08prOMiNd: but there must be something out there connecting those
14:08RSpliet: They already are fully connected from a functional point of view.
14:12prOMiNd: they are, but in order to load specific tweak index the grp/vendor should be known and populated to *some* scratch register
14:12prOMiNd: via init
14:12RSpliet: That's what the strap register is for.
14:13RSpliet: And it's hard-wired ("strapped", hence the name)
14:13prOMiNd: there are 15 strap registers :)
14:13prOMiNd: or 16
14:13RSpliet: It used to be 3 or 4, with override functionality or sth like that
14:14RSpliet: Have you tried following nouveau code for gt215 or gk104 that changes a DRAM clock?
14:16RSpliet: For both generations, the stuff in nouveau is like 99.9% correct and functional. We don't bother with load-based automatic reclocking because 99.9% isn't 100%, but it works, and you can change your DRAM clocks just by playing with the pstate debugfs file.
14:17RSpliet: And no, we didn't need to know which vendor is which, it's redundant information. Not saying it doesn't exist anywhere in the information sources (VBIOS tables, registers), just saying it might not exist because so far we haven't seen the information serve a purpose.
14:17manio: imirkin: regarding vdpau - no mplayer -vo vdpau doesn't work properly
14:18manio: but at least it doesn't crash :)
14:19manio: karolherbst: i don't know if it is hard to build mesa but i can try :) - pls give me a path and branch name for cloning :)
14:22imirkin_: manio: what's the issue?
14:22imirkin_: if mplayer -vo vdpau doesn't work, nothing in karol's patches will help you
14:22manio: imirkin_: vdpau on kodi => opengl+vdpau
14:22imirkin_: manio: i mean, what's the issue with mplayer
14:23manio: i've pasted a dmesg for this yesterday evening
14:23imirkin_: that was for kodi, no?
14:23imirkin_: which does GL + vdpau?
14:23prOMiNd: RSpliet, yes you are correct, nouveau doesn't need to know the vendor, it simply follows vbios settings and since the bios sets vendor information somewhere on init reclocking should be trivial
14:23manio: karol was saying something about compiling mesa (his branch)
14:24imirkin_: yes, for the GL + vdpau interaction problems.
14:24manio: 19:37 < karolherbst> manio: willing to build mesa yourself?
14:24manio: 19:37 < karolherbst> I have a branch you might be able to try out
14:24imirkin_: but if vdpau doesn't work on its own, you're sunk
14:24manio: sorry for spamming :(
14:25RSpliet: prOMiNd: even the outcome of the init scripts differ based on the index value in that strap register.
14:26manio: imirkin_: it's strange: in mplayer it looks like it is playing to the nearest key frame
14:26manio: imirkin_: then it is stopping - i have to ff/rev to play again (for a short period of time)
14:26imirkin_: can you pastebin the output of mplayer?
14:26manio: imirkin_: ok, will play with this...
14:27prOMiNd: RSpliet, obviously finding such register is mission impossible, I'll try to apply workaround logic
14:27imirkin_: also, pastebin the output of LIBGL_DEBUG=verbose glxgears
14:27RSpliet: prOMiNd: what? As I said, 0x101000.
14:27imirkin_: (aka the strap register, as i mentioned earlier)
14:28prOMiNd: its 1:1 on both hynix and micron
14:30manio: imirkin_: https://pastebin.com/nbB3SDjN, https://pastebin.com/bE04tbUv
14:31imirkin_: manio: ok, you're not using vdpau for decoding
14:31imirkin_: you're only using it for display
14:31karolherbst: manio: https://github.com/karolherbst/mesa.git branch: mt_fixes_take2
14:31imirkin_: perhaps something's off in there
14:31manio: imirkin_: VO: [vdpau] 1280x720
14:31imirkin_: manio: see https://nouveau.freedesktop.org/wiki/VideoAcceleration/#usingvdpau
14:31manio: isn't it vdpau?
14:31RSpliet: prOMiNd: I'm sorry, I feel like I'm running round in circles now... that's because NVIDIA abstracted away from indexing based on vendor+chip revision. If ASUS only buys hynix chips, RAMCFG index 1 points to a configuration in ASUS' VBIOS tables corresponding with some hynix chip. There's no point for them to have any parameters for micron chips in their VBIOS, so that's not there. MSI has a contract with micron, their index 1 sele
14:31imirkin_: for output, yeah
14:31imirkin_: for decoding, you want ffh264vdpau
14:31imirkin_: as opposed to ffh264
14:32RSpliet: pardon. "selects a configuration in MSI's VBIOS for a micron chip"
14:32manio: imirkin_: but here: https://pastebin.com/eW83Sprx i think it is using vdpau, right?
14:32prOMiNd: RSpliet, bios supports 2/3 vendors except 1080/1080ti which are all micron
14:32prOMiNd: did you see the pastebin I posted?
14:32imirkin_: manio: maybe, maybe not. that's a different issue, due to the separate threading.
14:33RSpliet: prOMiNd: you seem to be missing the point.
14:33manio: will force decoder...
14:33imirkin_: manio: can you look on that wiki page for how to invoke mplayer
14:33prOMiNd: RSpliet, I am missing a lot of points, to be honest :D
14:33prOMiNd: driving in circles for days
14:34RSpliet: bios supports 16 configurations, we don't care about the vendor because the configuration contains all the parameters we need. We determine the configuration without caring about the vendor, because the OEM tells us which configuration to pick based on the CFGRAM bits in that strap register.
14:34imirkin_: manio: at the bottom, see "using vdpau"
14:35prOMiNd: ram_cfg IIRC is used to determine only "type", not vendor
14:35manio: imirkin_: how about this? https://pastebin.com/gbj2iejv
14:35prOMiNd: I see the code
14:35imirkin_: manio: that looks right
14:35prOMiNd: you check only for being DDR2/3/4/5...
14:35imirkin_: manio: does the video play back ok?
14:35RSpliet: prOMiNd: no
14:35imirkin_: or same issue as before, with rare updates?
14:36prOMiNd: well, now I am totally confused
14:36prOMiNd: nfo->strap != ramcfg
14:37RSpliet: I *think* what you labelled "VND" will select a sub entry in the tables we call RAMCFG and RAMMAP.
14:38RSpliet: Either of those then contain an index telling us which entry in "TIMING" we pick.
14:39RSpliet: we pick from that table you're looking at ("MEMMAP" from the top of my head, it's been a year, the strap translation table...) the entry corresponding with the RAMCFG bits in 0x101000
14:39prOMiNd: 0x101000) & 0x0000003c
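The mask quoted above selects four bits of the strap register, which matches the 16 configurations mentioned earlier. A minimal sketch of the extraction, assuming the field is shifted down by two bits to the lowest set bit of the mask; the function name and the sample register value are made up for illustration.

```python
RAMCFG_MASK = 0x0000003C   # mask quoted in the discussion
RAMCFG_SHIFT = 2           # assumption: shift down to the lowest set bit of the mask


def ramcfg_from_strap(strap_value):
    """Extract the 4-bit RAMCFG field from a raw value of strap register 0x101000."""
    return (strap_value & RAMCFG_MASK) >> RAMCFG_SHIFT


if __name__ == "__main__":
    example = 0x00000018  # made-up register value, for illustration only
    print(ramcfg_from_strap(example))
```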
14:40prOMiNd: RAMMAP = MemoryInformationTable which contains strap info, vendor info and type by group
14:41manio: imirkin_: nope :( this time i have a black screen from the very beginning of playing video and xorg is freezing
14:41prOMiNd: then the TIMING table is called TweakTable which is linked with ClockTable by frequency, Clocktable on the other hand contains subindex which uses strap index from MemoryInformationTable
14:42imirkin_: manio: hrm. that's not good
14:42manio: imirkin_: i even started only the mplayer without any window manager - same problem
14:42imirkin_: do you have a rotated screen by any chance?
14:42prOMiNd: that is how basically timings are set
14:42manio: nope - but i am using setoutput provider from xrandr
14:43manio: imirkin_: i'll try without it
14:43RSpliet: prOMiNd: it might be more convenient here to stick with the names picked by nouveau/envytools. Not ideal, but they predate NVIDIA documentation
14:44RSpliet: So to make sure we got this right... the hierarchy is as follows (bear with me one sec)
14:45manio: imirkin_: also doesn't work
14:45imirkin_: that is ... less-than-ideal
14:45imirkin_: anything interesting in dmesg?
14:46manio: imirkin_: are the blobs for this changing? I've extracted it maybe 2yrs ago?
14:46manio: imirkin_: dmesg is clear
14:48RSpliet: Can't believe how outdated envytools is on my machine... building
14:48RSpliet: Should've set up ninja to at least have a decent parallelisation
14:49imirkin_: manio: bleh. sorry, this will require proper debugging =/
14:49manio: imirkin_: ok, thanks anyway :)
14:50manio: imirkin_: so you think it doesn't change much if i try karol's branch in this case?
14:50imirkin_: unlikely, but feel free
14:51manio: ok :)
14:55RSpliet: prOMiNd: oh, looks like we omitted naming the field in the "MEM TYPE" table that translates the RAMCFG field to the required index. For a long time that was an identity mapping afaik
14:56prOMiNd: yeah you use names that weren't known before nvidia released tweaktable values
14:56prOMiNd: but with big help from several sources I managed to assemble decent parser to gather vbios ramcfg connection
14:58RSpliet: So Strap -(RAMCFG)->MEM TYPE -(idx)-> "Timing Mapping table" (RAMCFG) -(timing) -> "Timing table"
14:58RSpliet: and Strap -(RAMCFG)->MEM TYPE -(idx)-> MEM TRAIN -(idx)-> MEM TRAIN PATTERN
14:59RSpliet: that "idx" thing is literally a field that to the best of our knowledge has no semantic meaning. Just an index into the next level of tables.
15:01RSpliet: It's not quite an identity mapping, because it maps 16 strap values into max. 8 "idx" values. In the majority of the cases though "idx" is just the low 3 bits of RAMCFG
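The first lookup chain described above (RAMCFG, then MEM TYPE idx, then timing mapping, then timing table) can be sketched as nested table lookups. All table contents here are invented placeholders and not values from any real VBIOS; the only detail taken from the discussion is that idx is usually the low 3 bits of RAMCFG, so 16 strap values collapse to at most 8 entries.

```python
# All table contents below are invented placeholders, not values from a real VBIOS.

# MEM TYPE: maps each of the 16 RAMCFG values to an idx (common case: low 3 bits).
MEM_TYPE = {ramcfg: ramcfg & 0x7 for ramcfg in range(16)}

# Timing mapping table: idx -> reference into the timing table.
TIMING_MAP = {idx: f"timing_{idx}" for idx in range(8)}

# Timing table: reference -> parameter set (a single dummy field here).
TIMING_TABLE = {f"timing_{i}": {"entry": i} for i in range(8)}


def select_timing(ramcfg):
    """Follow RAMCFG -> idx -> timing reference -> timing entry."""
    idx = MEM_TYPE[ramcfg]
    timing_ref = TIMING_MAP[idx]
    return TIMING_TABLE[timing_ref]


if __name__ == "__main__":
    # RAMCFG 1 and 9 share the same low 3 bits, so they land on the same entry.
    print(select_timing(1), select_timing(9))
```

Nothing in the chain needs to know which vendor made the chip; the index alone picks the entry.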
15:03karolherbst: nice, my dead channel detection works now with glamor as well :)
15:04karolherbst: imirkin_: btw, I get compile errors with your new patch
15:04imirkin_: shows you how much testing i've done.
15:05imirkin_: i'll fix it tonight
15:05karolherbst: okay, cool
15:05imirkin_: super-heavy workload has fallen on me from the sky
15:05imirkin_: and is crushing me
15:05prOMiNd: RSpliet, http://prntscr.com/mb3w7x
15:06prOMiNd: in this case
15:06prOMiNd: memclk strap0 has vendor id 1 = samsung which uses tweak index 5, while memclk strap1 has vendor id f = micron which uses tweak index 10
15:07prOMiNd: so vendor is being used and init population of manufacturer is being set either by group or vendor
15:08prOMiNd: OEM fills that specific register
15:09RSpliet: prOMiNd: Interesting, I've never seen that tool before. Bear in mind that in the VBIOS certain fields can simply be omitted, in which case the correct course of action for the driver is to omit the write sequence associated. It's likely we don't need to know the vendor, just look at bits in the table and do as they indicate us to do. The fact that we'd only be instructed to do something if the vendor happens to be X doesn't matter
15:10prOMiNd: yes, I already told you that OEM fills it and you don't need to know it, but..I do :)
15:10prOMiNd: the tool was written long time ago to modify timings for P104/P106 gpus
15:11prOMiNd: it is of no use in development
15:12RSpliet: The question is whether the tool "stores" the vendor identifier in the VBIOS somewhere, or whether it just leaves hints like only populating this MicronCoreVoltage field on Micron chips
15:14RSpliet: Mind you, I *think* there's a bit in the VBIOS that selects "high"/"low" core voltage. We're oblivious as to what high and low mean in terms of actual volts, so the field might always be populated for each vendor, yet its semantics differ between vendors in ways we can't distinguish.
16:11prOMiNd: tried numerous fields that used to expose different information about micron/samsung but it tends to just be a coincidence
16:11prOMiNd: in order for nvidia to sign a modified p104/p106 vbios you have to touch only the tweaktable, they reject everything else
16:13RSpliet: That tool could actually be very useful for documentation purposes. (AKA. replace all the "unk_04_0b" in code with something vaguely meaningful)
16:18prOMiNd: you don't actually need the tool :)
16:18prOMiNd: it uses those
16:22karolherbst: imirkin_: uff, so supporting recovery from dead channels will be quite the work indeed :/
16:24karolherbst: the biggest problem I think is, that the pushbuf holds a reference to the channel as well and we can't really do it in a thread safe way like that :(
16:28karolherbst: also we can't check inside nouveau_bo_wait for the dead channel as we don't have a reference to it there :( will need to find a way to support that without breaking the API
16:34karolherbst: ohh wait, I am stupid, I think that should be trivial after all
17:11RSpliet: prOMiNd: the "Memory clock table strap entry" description in the official documentation is faaaaaaaaaaar from complete
17:12RSpliet: When they say "reserved", they mean "not publicly documented"
17:12prOMiNd: yeah, it's not. I used your findings to manage the vendor identification
17:13prOMiNd: I don't have any "internal" information, so the tool is based only on public documentation
17:54karolherbst: wuhu, I have a reproducible "MULTIPLE_WARP_ERRORS" :)
17:55imirkin_: coz one isn't enough...
18:08LuMint: hi guys. using gts 450 with nouveau, trying to get temperature readings. how do I go about it?
18:10imirkin_: run 'sensors'
18:24karolherbst: wuhu, and that error is also triggered by an apitrace :) soo let's see what that error is all about
18:25karolherbst: imirkin_: I love when simple games trigger an error. 4 seconds with vsync disabled to replay that trace :)
18:25imirkin_: karolherbst: better than waiting an hour?
18:26karolherbst: well, bisecting the gl call doesn't take days, right :D
18:28karolherbst: mhh, a glDrawArrays
18:29karolherbst: maybe a stupid out of bounds access?
18:53karolherbst: imirkin_: out of bounds tex instructions won't trigger a trap, would they?
18:53imirkin_: no, that's legal
18:53imirkin_: out-of-bounds ubo will though
18:54imirkin_: we can disable shader traps if we want
18:54karolherbst: that would have been my second question :)
18:54imirkin_: what's the problem though?
18:54karolherbst: nothing, just spam inside dmesg
18:54karolherbst: or at least the rendering looks fine
18:54karolherbst: didn't really check though
18:55karolherbst: would like to know what it is though
18:57karolherbst: what was that name of that tool to modify the shader and rerun?
18:57karolherbst: or was there something new?
18:58imirkin_: frameretrace is for a single frame
18:58karolherbst: well, I simply want to modify a shader used for a draw call and see what changes
18:58karolherbst: using an apitrace of course
18:59imirkin_: yeah, i used frameretrace for that
18:59imirkin_: worked fine
19:03karolherbst: mhhh, it just... exited
19:05karolherbst: imirkin_: do I need to add something special to mesa for that to work?
19:06karolherbst: like support for INTEL_performance_query?
19:07imirkin_: just to get extra-fancy features
19:07karolherbst: mhh, weird
19:07karolherbst: it just "crashes" for me
19:07imirkin_: yeah dunno. maybe i did do something? i don't think so tho
19:10karolherbst: mhh, SIGPIPE
19:10karolherbst: seems like it tries to do something with glretrace
19:13imirkin_: oh i remember that...
19:13imirkin_: glretrace dies
19:13imirkin_: wrong version or something
19:13imirkin_: you need to set a path maybe? or bad libs? i forget.
19:13karolherbst: mhhhhhhhhh ohhhh, it uses the system one even though I set PATH?
19:14imirkin_: hopefully not
19:14karolherbst: uninstalling the system one, just to make sure :)
19:14karolherbst: still no luck
19:15karolherbst: but I use the same glretrace from the frameretrace build
19:15imirkin_: can you just run it with that glretrace?
19:15imirkin_: and see if it will play back your trace ok?
19:17karolherbst: it does
19:40karolherbst: imirkin_: when I nop the fp shader, the error doesn't appear in dmesg :)
19:40imirkin_: there's a setting for that btw
19:40imirkin_: you can make fp shaders return a solid color, etc
19:40imirkin_: bunch of neat options
19:41karolherbst: right, but I want to figure out what exactly is causing it, so I have to edit the fp anyway
19:41imirkin_: pastebin the shader?
19:42karolherbst: something inside effect() triggers it
19:44imirkin_: yea dunno
19:44imirkin_: nothing obvious
19:46imirkin_: perhaps the double-nesting makes the return get confused
19:46karolherbst: there is some misrendering though
19:46karolherbst: black boxes in the rendertarget :)
19:47imirkin_: and we all know what causes those...
19:47imirkin_: too much chrome.
19:48karolherbst: if I disable the return the error doesn't appear
19:49karolherbst: returning a vec4(0.0); doesn't cause the error either
19:49karolherbst: sooo, something in that calculation inside it is odd
19:51karolherbst: * color;
19:51karolherbst: if I remove that part, it works fine
19:51imirkin_: could be that we screw up RA
19:52karolherbst: I don't think so
19:52karolherbst: color is VaryingColor
19:52karolherbst: which is a varying obviously
19:53karolherbst: or... maybe we do mess up RA
19:54karolherbst: moved the multiplication outside of the function. Works as well
19:54imirkin_: which means that it has to not get screwed up
19:54imirkin_: for the duration of the loop
19:55karolherbst: I like frameretracer :)
19:55imirkin_: cool, right?
19:56karolherbst: yeah, I should use it more often :D
19:56imirkin_: what application is this in?
19:56karolherbst: kingdom rush: origins
19:59karolherbst: imirkin_: ohh, NV50_PROG_DEBUG=3 while frameretrace :O that should even help us dumping one specific shader
20:01karolherbst: it doesn't appear in the output
20:01karolherbst: imirkin_: any ideas why that is?
20:03karolherbst: imirkin_: okay, do you have some patches to support printing the disassembly?
20:14imirkin_: shader cache?
20:15imirkin_: pipe to envydis?
20:15imirkin_: you mean in the thing
20:15imirkin_: i did
20:15imirkin_: but i didn't do them properly
20:15imirkin_: also i think frameretrace has since gained KHR_debug-ability
20:15imirkin_: which is the vastly better way of doing it
20:18karolherbst: imirkin_: not yet as it seems
20:20imirkin_: i saw patches for it
20:20imirkin_: check with mark
20:20karolherbst: I did, and it wasn't merged yet
20:20imirkin_: anyways, i'll check tonight what i have in my frameretrace branch
21:24karolherbst: imirkin_: well, the mul just reads from a to get that varyings value
21:24karolherbst: no, but does it right before using it
21:24karolherbst: was confused by that "pinterp mul"
21:25karolherbst: probably not a good idea to do it over and over again inside a loop, but...
21:25imirkin_: yeah, we struggle between extending live range and doing it over
21:25imirkin_: i've been reasonably happy to just do it however it ends up in tgsi
21:26imirkin_: since any specific strategy sucks
21:26karolherbst: comparing the generated shaders right now and there is no obvious mistake :/
21:27karolherbst: the thing is, why would that cause a trap?
21:28karolherbst: ohh, maybe the CFG stuff just sucks?
21:30kur1j: I'm trying to figure out if the Nouveau driver included with 18.04 Ubuntu supports the P2000 video card
21:31imirkin_: one way to find out...
21:31kur1j: The card is listed as a "GP104GL"
21:31karolherbst: kur1j: even if, probably not to a degree you'd expect it I guess
21:32kur1j: karolherbst: well I'm just trying to run through the Ubuntu 18.04 installer. It hangs on a nouveau timeout
21:32kur1j: once I get the OS installed I can install the Nvidia driver
21:32karolherbst: kur1j: huh, is that a mobile or desktop card?
21:32imirkin_: kur1j: boot with nouveau.modeset=0
21:32imirkin_: that should fix it right up
21:33kur1j: karolherbst: its a desktop card
21:35kur1j: karolherbst: I'm just surprised it isn't supported. It's listed in the Nouveau firmware list (GP104) when I do modinfo nouveau and it's a GP104, so I just figured it would work
21:36karolherbst: it should... but there can be various things going wrong
21:36karolherbst: like outdated firmwares on the live image
21:36karolherbst: which is most likely the cause here
21:36kur1j: you mean what is included in the 18.04 image?
21:36kur1j: thats what I was trying to find out
21:37karolherbst: do you know the version of the linux-firmware package included?
21:37kur1j: just for my own sanity, for example if they included version X of Nouveau, but I need version Y, then obviously thats the reason
21:37kur1j: I do not
21:38karolherbst: nouveau isn't the issue here (most likely). Nvidia shipped buggy firmwares
21:38karolherbst: which got updated later on
21:38karolherbst: there are still issues, but those are usually limited to laptops
21:38kur1j: filename: /lib/modules/4.15.0-43-generic/kernel/drivers/gpu/drm/nouveau/nouveau.ko
21:38kur1j: shipped buggy firmwares in the card itself?
21:39karolherbst: inside /lib/firmware/nvidia/gp104
21:39kur1j: in Nouveau?
21:39karolherbst: "linux-firmware" package ;)
21:40karolherbst: nothing much you can do about that except what imirkin_ suggested
21:40karolherbst: opening a bug with ubuntu will most likely be super unsuccessful
21:40karolherbst: or maybe it would, dunno
21:41kur1j: linux-firmware/bionic-updates,bionic-updates,now 1.173.3
21:42kur1j: what does nomodeset=0 actually do?
21:43karolherbst: don't use nomodeset
21:43karolherbst: but nomodeset=0 should do nothing
21:43karolherbst: whatever "1.173.3" means... *sigh*
21:44karolherbst: why do distributions try to be smart where it doesn't make sense to be
21:44kur1j: it's pretty obvious that this is the cause, because I can put a GTX 980 in the machine and it works without issue
21:45kur1j: put in the P2000 and it breaks without the nomodeset like imirkin_ mentioned
21:46karolherbst: bug inside linux-firmware
21:46kur1j: welp, we have 60 of these machines coming
21:47kur1j: Dell has bugs
21:47kur1j: won't even install without BIOS changes to VMD
21:47karolherbst: needs 85c5d90fc155d78531efa5d2b02e92aaef7e4b88 backported
21:47kur1j: wait how did you find that out?
21:47karolherbst: kur1j: https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/nvidia?id=85c5d90fc155d78531efa5d2b02e92aaef7e4b88
21:47karolherbst: it's obvious :p
21:47kur1j: karolherbst: rofl
21:48karolherbst: kur1j: I highly doubt that dell has bugs
21:49karolherbst: it's most of the time that either nvidia or nouveau has some :p
21:49kur1j: karolherbst: completely different problem
21:49karolherbst: in any case, don't buy laptops with nvidia gpus
21:49karolherbst: nvidia doesn't care to get it working
21:49karolherbst: I am quite sure I know which problem you are referring to, it's an nvidia bug actually
21:50kur1j: well it's hard to say that when the *only* option for machine learning is NVidia...you have to have nvidia cards in machines haha
21:51karolherbst: well, then live with a shitty battery life on those laptops :p
21:51karolherbst: imirkin_: ... that issue doesn't make any sense at all
21:51karolherbst: because... the CFG is exactly the same in both versions
21:52karolherbst: ohhh wait, there is a "not $p0 bra BB:20" I didn't see
21:53karolherbst: mhh, it's harmless though
21:53karolherbst: one way or the other it ends up with a break BB:16 with no other flow instructions in between
22:16kur1j: karolherbst: the other issue where the system wouldn't boot at all on the Dell system (even with nomodeset) had to do with the Intel VMD technology. Dell sent me instructions on turning those options off and it now seems to boot the Ubuntu installer
22:16kur1j: (if the nouveau version supports the card installed in the machine)
22:17imirkin_: there were some bugs in pascal modesetting which were fixed semi-recently
22:18imirkin_: relating to not putting something into vram when we were supposed to
23:23karolherbst: imirkin_: there are still some secboot issues on mobile chips
23:23karolherbst: like on my gpu
23:38karolherbst: imirkin_: mind taking a look at that ir? I really don't see what could trigger an error on the GPU: https://gist.githubusercontent.com/karolherbst/8ceb0420e6ad0bfd03a375530abdb012/raw/209c58f804084ce0abd8a9040ea1fe2e11b5808a/gistfile1.txt
23:39karolherbst: mhhh, except.. "$p0 pinterp mul" is that actually fine?
23:39karolherbst: not that there is some weirdo stuff going on making that illegal
23:46imirkin_: and what's the error?
23:46karolherbst: nothing useful :/
23:46karolherbst: TRAP and MULTIPLE_WARP_ERRORS
23:46karolherbst: that's basically it
23:46imirkin_: anything before that?
23:47imirkin_: i don't have any great ideas
23:47imirkin_: in such situations, esp with TEXS, i'd dump the shader binary
23:47imirkin_: and ensure that it decodes the expected way with nvdisasm
23:48karolherbst: right. I diff it with the adjusted output where I moved the * color outside the function
23:48karolherbst: but yeah... I could probably check the binaries again
23:48karolherbst: but most of the shader actually looks fine
23:54karolherbst: :/ I have a bad feeling again..
23:55karolherbst: okay.. doesn't seem to be the sched opcodes