09:05librin-thinkpad: karolherbst: hello, I saw You were interested in Civ6 crashes
09:07librin-thinkpad: it segfaults in the register allocator
09:07librin-thinkpad: do You want a backtrace or such?
09:08karolherbst: imirkin wants that :D
09:08karolherbst: best would be to create a bug on bugzilla
09:08librin-thinkpad: I see
09:08karolherbst: + apitrace
09:09karolherbst: we would need that glsl shader to debug this properly
09:09librin-thinkpad: roger that
09:09karolherbst: I thought it may be multithreaded related, but it seems like it isn't
09:09librin-thinkpad: I'll make an apitrace as soon as I get back home from work and file a full report
09:09karolherbst: awesome, thanks
09:10librin-thinkpad: it's WAY too consistent to be MT-related
09:10karolherbst: I guess civ6 will have the same perf issue as civ5: being terribly CPU bound
09:10librin-thinkpad: although, it also does rarely crash before even reaching the main menu, but that is very rare
09:10librin-thinkpad: and it looks like THAT is MT-related
09:11librin-thinkpad: since the crash point of that wasn't consistent
09:11librin-thinkpad: I only saw it three times so far, so I can't be certain
09:11karolherbst: you know the application output of that?
09:12librin-thinkpad: yes, as a matter of fact I do
09:12librin-thinkpad: it's none
09:12karolherbst: I see
09:12karolherbst: anything in dmesg?
09:12karolherbst: if the gpu is still up and running
09:12karolherbst: usually with those mt issues, the entire nouveau stack goes down
09:13karolherbst: except, it is some bug in mesa itself not affecting the drm subsystem or the GPU
09:13librin-thinkpad: although, those three times it happened, nouveau itself puked a list of hex numbers that doesn't fit in my 10k line terminal scrollback
09:14librin-thinkpad: and I hadn't had logging to a file enabled, so I have no idea what came before that
09:14librin-thinkpad: i.e. the game might have outputted something, I just didn't see it
09:14karolherbst: I see
09:14librin-thinkpad: I'll try to catch it
09:14librin-thinkpad: again, when I'm home
09:15karolherbst: skeggsb: are you there?
09:16librin-thinkpad: I do have another question, about another weird issue
09:17skeggsb: karolherbst: yeah, i'm half-around atm
09:17karolherbst: skeggsb: did you see my pci fix for g92?
09:17librin-thinkpad: there's this one game that in itself runs fine, but after running it at least once, several other games randomly bring the GPU down
09:17librin-thinkpad: while those same games run perfectly fine if that other game hasn't been run since last boot
09:17skeggsb: karolherbst: yeah, i did, i'll probably merge it tomorrow
09:18karolherbst: skeggsb: awesome, thanks
09:18librin-thinkpad: does this sound like it falls under "MT issues"?
09:18karolherbst: librin-thinkpad: might be
09:19karolherbst: would be fun to debug this
09:19skeggsb:is working on it as fast as he can
09:19skeggsb: (MT issues, that is)
09:20librin-thinkpad: although, wait, MT issues... is this the same as those "thread sync" issues from ages before or is this something new?
09:22karolherbst: librin-thinkpad: it is more like MC issues
09:44mlankhorst: skeggsb: ping on atomic disable_all?
11:28RSpliet: mwk: 0x04.net is unusually slow (10KB/s)... known issues?
11:37mwk: RSpliet: no idea
11:37mwk: something is clearly wrong
11:56kattana_: any goodies in 4.9?
12:10pmoreau: kattana_: IIRC, there was no merge requests for 4.9 from Nouveau.
12:15kattana_: so not a single improvement on that release?
12:19pmoreau: No. But there will be reclocking improvements, among others, in 4.10.
12:20kattana_: ah ok
12:20kattana_: well I am jumping from 4.3 > 4.9
12:21librin-thinkpad: there were quite a few improvements since 4.3, that I can assure You of ;]
12:42karolherbst: kattana_: goodies are in 4.10 :p
12:42librin-thinkpad: "automatic reclocking when"
14:02dboyan: imirkin, I found something interesting when looking into my bug today
14:03dboyan: 3d and compute on kepler seems to follow different rules when setting texture and samplers
14:06dboyan: 3d encodes tsc and tic id in a single tex_handles[s][i]
14:08dboyan: while in compute, the values in the CB means both tsc id and tic id, they must be the same
14:10dboyan: The blob driver uses slots 0x310-0x312 for the textures, and it sets both tsc and tic of the range, according to mmt trace
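A rough sketch of the difference dboyan describes, in Python for brevity. The bit layout (TIC index in the low 20 bits, TSC index in the 12 bits above it) and the names `pack_handle` and `resolve` are assumptions made for illustration; they are not taken from the log or checked against the driver. In "linked" mode the sampler index is simply taken to be the texture index, so a sampler that was only set up in slot 0 is missed when the texture lives in slot 1 or 2.

```python
TIC_BITS = 20                      # assumed width of the texture (TIC) index
TIC_MASK = (1 << TIC_BITS) - 1


def pack_handle(tic_id, tsc_id):
    """Pack a texture index and a sampler index into one 32-bit handle."""
    if not 0 <= tic_id <= TIC_MASK:
        raise ValueError("tic_id out of range")
    if not 0 <= tsc_id < (1 << (32 - TIC_BITS)):
        raise ValueError("tsc_id out of range")
    return (tsc_id << TIC_BITS) | tic_id


def resolve(handle, linked):
    """Return (tic_id, tsc_id) the way the hardware would read the handle.

    linked=False: both indices come from the handle (the 3D behaviour).
    linked=True:  the sampler index is forced to equal the texture index.
    """
    tic_id = handle & TIC_MASK
    tsc_id = tic_id if linked else handle >> TIC_BITS
    return tic_id, tsc_id


handle = pack_handle(tic_id=2, tsc_id=0)
print(resolve(handle, linked=False))   # texture 2 sampled with sampler 0
print(resolve(handle, linked=True))    # texture 2 sampled with sampler 2
```

With the same handle, the independent lookup uses sampler 0 while the linked lookup reaches for sampler 2, which would explain why copying the sampler entry into the other slots made the test program render correctly.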
14:10AeroNotix: Why do I get "primus: fatal: Bumblebee daemon reported: error: [XORG] (EE) Unknown chipset: NV117
14:10AeroNotix: When trying to run things with nouveau and primusrun? The chipset should be supported
14:13dboyan: AeroNotix, what is your kernel version?
14:13AeroNotix: dboyan: 4.9
14:14karolherbst: AeroNotix: don't use bumblebee when using nouveau
14:14AeroNotix: I've got 4.10 available to test. I can get things running with DRI_PRIME=1, but just interested in why primusrun/bumblebee tells me the chipset is unsupported
14:14AeroNotix: karolherbst: why is that?
14:14karolherbst: because you get worse perf
14:14karolherbst: and because they also don't care
14:15karolherbst: and we don't care as well
14:15AeroNotix: ugh, graphics drivers are such a black box to me.
14:15AeroNotix: karolherbst: why don't you care?
14:15karolherbst: because there is prime offloading
14:15AeroNotix: These kinds of problems are very frustrating
14:15karolherbst: DRI_PRIME is the official way of doing offloading
14:15AeroNotix: How is 99% of users supposed to figure this out?
14:16karolherbst: AeroNotix: there is a wiki page for this
14:16karolherbst: the DDX is pretty much irrelevant for prime offloading with a recent enough X server
14:17karolherbst: just use DRI_PRIME and you are fine
14:17dboyan: imirkin, (following what's said above) nouveau uses tic slots 0-2 and tic 0, however, if I blindly copy tic 0 to 1 and 2 as a hack, my test program renders correctly on my machine
14:17karolherbst: if DRI_PRIME=1 glxinfo prints the nouveau stuff, you are set to go
14:18AeroNotix: karolherbst: oh I've got my gpu/game working. Only issue is that even with DRI_PRIME=1 I still get abysmal performance. I know that will improve. I was just wondering WHY primusrun tells me the chipset is unsupported when the nouveau driver page tells me it is. I don't understand that part.
14:18karolherbst: AeroNotix: because the nouveau ddx you have installed has no support for maxwell and modesetting is used instead
14:19karolherbst: AeroNotix: for good perf, you need the 4.10 kernel, mesa-17 and you have to change the pstates to something higher
14:19karolherbst: then you should get pretty good performance on your maxwell gpu
14:19AeroNotix: karolherbst: yes I am working on getting a clean set up for that. Arch has mesa-13 it seems. How old is that package?
14:19karolherbst: 4.10 is needed for stable reclocking, I added a bunch of fixes
14:19karolherbst: AeroNotix: pretty new
14:19karolherbst: AeroNotix: but the maxwell perf patches aren't in any mesa release yet
14:19karolherbst: and they can give you up to 2x more perf
14:21RSpliet: Wonder why the Tegra K1 has the ability of locally diverging on its two 32b memory "channels" if the cache line size is 64B anyway...
14:21karolherbst: RSpliet: magic
14:22RSpliet: it's an interesting middle ground between full-blown channels and wide ganged-rank set-ups, but I'm curious about the use-case if it's not shared-L2 (and GPUs often need even longer transfers if the stride isn't pathologically worst-case)
14:48dboyan: imirkin: well, I know, that seems to be what's called linked-tsc mode, right?
14:51dboyan: and kepler's compute engine can't turn that mode off, I guess
15:00pmoreau: AeroNotix: Mesa 17 was released today, so it will take a few days before reaching Extra. But it is already available on Testing if you want to try.
15:04AeroNotix: pmoreau: you mean for arch?
15:04AeroNotix: ok, great. Thanks
15:05pmoreau: I assume that Linux 4.10 should be in Testing by the middle of next week, if 4.10 is released on Sunday/next Monday.
15:06AeroNotix: pmoreau: perfect
15:06AeroNotix: Thanks for the information
15:06pmoreau: No problem :-)
15:21imirkin: dboyan: if we're indeed forgetting to disable linked tsc mode, that's huge...
15:24imirkin: dboyan: https://hastebin.com/joqobiqehi.pl
15:24imirkin: oh wait, that's in the wrong place
15:26pmoreau: imirkin: The images generation should be fixed, and an image should be coming within an hour or two. I blame skeggsb for updating the version of drm-next he is building against, without creating a commit whose message contains "drm-next $commit_hash"
15:28pq: imirkin, re: some bug related discussion you had with someone I already forgot; if you ever need help setting Weston up, I'll be happy to help when I'm around.
15:28imirkin: pq: thanks. i've given up for now - wayland doesn't cross-compile due to lack of wayland-scanner or something. i decided it was a sign that i shouldn't worry about it.
15:29pq: ah yes, cross-compiling has always been a weak point
15:29imirkin: not even weston... wayland. which ... as i understand shouldn't have a ton of things to compile. wtvr.
15:30pq: wayland-scanner is a code generator written in C, and it is needed to build libwayland. But the autotools scripts are not smart enough to build it twice automatically, for build and host archs.
15:31pq: essentially it needs a pre-built wayland-scanner for build arch in PATH, and tell the wayland ./configure to use that instead of what it is building.
15:31imirkin: ah, too bad.
15:32pq: I think practically every project using libwayland will also need wayland-scanner for the build.
15:33pq: it's a silly little program, but cross-compile...
15:34imirkin: yeah, that sounds pretty unfortunate. oh well.
15:36pq: imirkin, yeah, it's annoying, but at least it's tiny and not with a huge dependency tree.
15:37imirkin: well, let me know when it's fixed up and i can try again.
15:39dboyan: imirkin, I also copied your hunk into nve4_screen_compute_setup, and it doesn't work
15:40dboyan: I just wonder if it can be turned of on kepler+
15:40dboyan: turned off
15:41pq: imirkin, would you consider migrating from autotools to Meson as a "fix" or "even more broken"? :-o
15:41imirkin: dboyan: yeah, that method's just not there on kepler. i suspect it's part of the compute shader header.
15:42RSpliet: pq: At the risk of sounding like a troll rather than genuinely interested, could you give me a 10000ft overview of why Meson trumps cmake?
15:42imirkin: pq: i haven't played with meson. does it require effort from the package maker's side to support cross-compilation (beyond "don't do stupid shit"), and does it have a ./configure --help equivalent?
15:43pq: RSpliet, I have no idea. All my attempts to do something with cmake have been utter failures. Meson I have not tried first-hand yet.
15:43imirkin: RSpliet: well, the above concerns are why cmake is trumped by autotools
15:44RSpliet: imirkin: I was about to ask you whether you answered my questions :-D Fair... cmake imho is mostly useful because it's simple, but probably not designed with cross-compile in mind
15:44imirkin: autotools is simple too... and way more understandable. and doesn't leave hard-to-delete cache files around.
15:44pq: imirkin, Meson has been advertised as "cross-compiling fully supported". The --help I would be very surprised if it was missing, but I haven't looked that carefully yet.
15:45RSpliet: imirkin: I take your word on that, I have yet to mess with autotools myself
15:47pq: imirkin, as someone who is thoroughly skeptical about every build-system-of-the-day, I've been somewhat caught up with my peers saying it's good, up to the point that I read its manuals, but don't have a project to port to it.
15:49pq: imirkin, Daniel Stone has already done most of the porting of both Wayland and Weston to Meson. Autotools will be kept on the side for a good while though so distributors have time to adjust.
15:50pq: not merged yet, maybe on the next cycle perhaps
15:51pq: I would guess that once Meson support is merged, the autotools build will be even more neglected than it was...
15:52karolherbst: is meson like cmake, just in good? :D
15:52pq: I'm not comfortable making any comparisons, the only thing I have experience with atm is autotools.
15:53karolherbst: although cmake isn't a build system anyway
15:54pq: Meson generates ninja build files, so isn't that also a meta-build-system?
15:54karolherbst: it seems so
15:54karolherbst: cmake can also generate ninja build files though
15:55pq: yup, I'm sure you can find lots of comparisons and flames on that topic ;-)
16:01pq: imirkin, what distribution tooling do you use for cross-compiling? obs, yocto, ...? Just curious.
16:02imirkin: pq: gentoo
16:03imirkin: portage supports cross-compiling
16:07dboyan: imirkin, is there any possibility to work out how to set linked tsc mode in compute shader header?
16:07imirkin: mwk: happen to know perchance how to disable linked tsc mode on kepler+ compute? i assume it's via the launch descriptor, but haven't the faintest clue how to figure out which bit
16:08imirkin: heh. well that was well-timed.
16:08dboyan: yeah, really
16:08imirkin: anyways, mwk is the RE magician
16:12imirkin: aha! found it, i think
16:13imirkin: dboyan: https://hastebin.com/dezacedima.hs
16:15dboyan: rebuilding mesa, should be able to test in two minutes
16:18dboyan: Wow it works, that's real magic
16:25imirkin: dboyan: great find on the fact that it *was* the linked tsc setting
16:29imirkin: dboyan: i sent a slightly more complete patch to the list. end result should be the same though.
16:30pq: imirkin, I have suspicion that the wayland ebuild is not exactly up to par in the case where wayland is not installed already in the build-arch system. I wonder what other project would use a binary code generator tool I could compare with.
16:30pq: I do not see a dependency to wayland in the wayland ebuild to force building a native version
16:30imirkin: that shouldn't be something for the ebuild to do
16:31imirkin: the wayland-scanner binary should be built with a "build" target, and then used from that
16:31imirkin: (i can't for the life of me remember wtf target, host, etc mean... but basically you should be able to specify in autoconf/automake that you need this tool to be run by the builder-arch, not the end-result-arch)
16:32pq: yes, sure, but it's not only wayland.
16:32pq: imirkin, how should it work then? One needs to put the build-arch scanner somewhere so that wayland, weston and all the other projects can find it.
16:32pq: where is that place in Gentoo?
16:32imirkin: then it needs to be a separate package
16:32imirkin: and depended upon as a build tool rather than as a runtime dep
16:32imirkin: (just like gcc, make, etc are also build tools that have to be on the builder's system)
16:33pq: right, so a wayland-scanner.ebuild that just pulls the same sources as wayland.ebuild, but installs only the scanner
16:33imirkin: tbh i don't remember how that's expressed in ebuilds, but it's definitely a thing.
16:33imirkin: xDEPEND. where x != R. i forget what though.
16:34dboyan: imirkin, thanks for the effort. I'm gonna sleep now, will build and send my t-b tomorrow
16:35imirkin: dboyan: no rush. i'm off too.
16:35imirkin: hakzsam: might be time to test out F1 2015 again ;) [with that fix applied, that is]
16:36pq: imirkin, Right. Also yocto does something alike, building all build tools into a different sysroot than all the actually cross-built stuff I hear. Looks like this should actually be solved with Gentoo ebuilds then. Apparently the current "cross-build support" relies on wayland being already installed.
16:37imirkin: pq: yeah, i guess. relying on external build tools that aren't gcc and make is fairly uncommon.
16:37pq: imirkin, dare I suggest, might your issue be solved as simply as installing dev-libs/wayland natively (would not be used by anything), or did you perhaps already try that?
16:38imirkin: pq: yes, you can dare... but then prepare to suffer the consequences! :p
16:38pq: that said, I'm not quite sure how Weston would pick it up...
16:38pq: imirkin, well, I did offer :-)
16:38imirkin: pq: so ultimately what i want to do is test the bug claiming that all colors are fubar'd on wayland big-endian
16:39pq: ah, that was the discussion, yeah
16:39imirkin: pq: i have a ppc g5 with a NV34, which does not support GLES2
16:39imirkin: given what you know about the wayland ecosystem, any recommendations?
16:40pq: do you need to run GLES2 on hw to test the bug?
16:40imirkin: [it does support GL 1.5]
16:40imirkin: well, i think the bug is in mesa
16:40imirkin: so i'll need to run glxgears-or-equivalent
16:40imirkin: and i can force-enable GLES2 for such an application
16:40imirkin: and hope it doesn't use NPOT textures or weird blending
16:41pq: weston has two options: GLESv2 and Pixman renderers. Either way, you'll hit sw rendering on NV34 then. This also means you cannot use hw-GL Wayland apps, since weston cannot use hw-GL.
16:42imirkin: ok, so... solution is ...
16:42pq: I think Weston also has known big-endian problems with GL.
16:42imirkin: don't use weston? :)
16:42imirkin: (which is why i was going around shopping for simple compositors)
16:43imirkin: basically i'd like the *simplest* compositor imaginable which still lets me do hw-GL stuff
16:44pq: hmm... I can't name any for certain. I know other compositors very poorly.
16:45pq: imirkin, would hw-GL on KMS be of any value? I.e. without Wayland, for starters.
16:47pq: I should get back tomorrow, it's quite late here. .o/
16:49imirkin: pq: well, the bug is in wl_*
16:49imirkin: or so the theory goes
16:49imirkin: pq: https://bugs.freedesktop.org/show_bug.cgi?id=99638#c13
16:50pq: I'll need to ask around if there is any compositor that could do hw-GL with less than GL 2.0.
16:51pq: ahh, Peppa pig, that it was
16:51imirkin: just so we're clear - i couldn't care less if the *compositor* used hw GL for its compositing duties
16:52imirkin: [in fact, i'd prefer it didn't]
16:52imirkin: coz that's just one more place for things to go wrong
16:52pq: imirkin, I do not think anyone has written a compositor supporting hw-GL app while it itself did not use hw-GL.
16:52karolherbst: pq: xfwm does compositing without GL at all, no idea if it uses hw accel though
16:52karolherbst: but I think it does
16:53pq: it means the compositor would need to be doing glReadPixels all the time, which is both slow and slow in the wrong place.
16:53pq: karolherbst, *Wayland* compositors.
16:53karolherbst: I see
16:53imirkin: pq: why would the compositor know anything about GL? it'd just be handed a buffer that it could memcpy, no?
16:54pq: imirkin, no, it would get handed an EGLImage it has no way to access aside from making a GL texture from it.
16:54imirkin: pq: ok, clearly i don't understand the wayland interface. that's fine though.
16:55imirkin: i have to go anyways
16:55pq: me too, I'll get scolded soon :-P
18:58imirkin_: hakzsam: can i convince you to crash your box by running the F1 2015 trace with my patch?
20:05imirkin_: Tom^: did you say you grabbed a copy of Civ6? i hear that the reason for the crash is actually something in nouveau codegen compiler, not multithreading
20:12librin: imirkin_: I'm filing a bug w/ a trace for that crash at this very moment, btw
20:12imirkin_: librin: oh awesome, thanks!
20:12imirkin_: i am aware of a handful of ways of crashing codegen, i'm hoping it's something simple.
20:13imirkin_: note that i did recently introduce a bug, and even more recently, fixed it, relating to compute shaders. probably a 1-week period where the bug was "live"
20:14imirkin_: this was the fix: https://cgit.freedesktop.org/mesa/mesa/commit/?id=399e267f0e633df41eb1922f7c5f0958a40d6a52
20:14imirkin_: breakage was a few days earlier - https://cgit.freedesktop.org/mesa/mesa/commit/?id=e4a698cb97224ef22469b0d8fd703cf164d380f1
20:21Echelon9: throwing this question out to the envytools admins on GitHub: Are you happy for me (a member of envytools) to turn on Travis-CI for the main repository and have it run automatically on each proposed PR?
20:21Echelon9: not a hard *requirement* but a clear visual green/red on any future proposed PRs
20:22Echelon9: the necessary tooling and config should all be there, it just need the switch to be flicked
20:23imirkin_: Echelon9: that's just a github ui thing right? if i just push commits, that won't affect me?
20:26librin: imirkin_ if that bug was fixed a few days ago, it's not related, as Civ6 still crashes in the same way on literally minutes old mesa build from HEAD
20:27librin: and here Ya go: https://bugs.freedesktop.org/show_bug.cgi?id=99799
20:27imirkin_: ok cool. just checking.
20:27imirkin_: ooh. that's a new one.
20:28librin: as for the other bug mentioned in this report, I probs can report it tomorrow; need to gather a lot more information for that report than I needed here
20:28imirkin_: although it does feel oddly familiar... if it's not too much trouble, could you run the game with NV50_PROG_DEBUG=1 and capture the shader that causes the crash?
20:28librin: >capture the shader that causes the crash
20:28librin: elaborate, please
20:28imirkin_: well, when you set that env var, you'll get a ton of junk printed
20:29imirkin_: the last group of junk should correspond to the shader that kills everything
20:29librin: coming right up
20:29imirkin_: it should start with TGSI, and follow with some more things.
20:29imirkin_: in this case, the pre-RA nv50 ir, and then the crash (which happens during RA)
20:30imirkin_: and we can take the TGSI and feed it into the compiler directly
20:30imirkin_: thus avoiding the trace annoyingness.
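One way to pull "the last group of junk" out of a large NV50_PROG_DEBUG log is sketched below in Python. It assumes that each TGSI dump opens with a line holding only the shader type (for example `FRAG`); that set of header names and the function name `last_shader_dump` are guesses for illustration, so adjust them to what the real output looks like.

```python
import re

# Assumed header lines that open a TGSI dump; the exact set is a guess.
TGSI_HEADERS = re.compile(r"^(FRAG|VERT|GEOM|TESS_CTRL|TESS_EVAL|COMP)\s*$")


def last_shader_dump(log_text):
    """Return the text from the last TGSI header line to the end of the log.

    Returns None when no header line is found at all.
    """
    lines = log_text.splitlines()
    # Walk backwards so the first header we meet is the last one in the log.
    for index in range(len(lines) - 1, -1, -1):
        if TGSI_HEADERS.match(lines[index]):
            return "\n".join(lines[index:])
    return None
```

Usage would be something like `print(last_shader_dump(open("game.log").read()))`, where `game.log` is a hypothetical file holding the captured debug output.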
20:31airlied: imirkin_: btw did you run cts subroutine tests? some of those used to lose their minds on nouveau
20:31airlied: just wondering if you see it
20:32imirkin_: airlied: no... cts tends to crash in the middle, and i haven't been able to get it to spit out the full list of tests so that i can run it more piecemeal
20:32nyef```: ... Is there a random-tester for the nouveau compiler?
20:32imirkin_: nyef```: it's called "opengl games" :)
20:32airlied: imirkin_: use piglit to run it?
20:32imirkin_: airlied: that requires a test list
20:32imirkin_: like gles31-master.txt
20:32imirkin_: there is no such file for the glcts
20:32nyef```: imirkin_: That's not a random-tester, that's an incompletely-deployed, poorly-automated, hand-crafted test suite.
20:32imirkin_: [in the public repo]
20:33airlied: ah maybe the list didnt make it out
20:33imirkin_: i tried generating it (there are options for that), but to no avail
20:33airlied: i havent tried the open source one yet
20:34imirkin_: airlied: anyways, there are definitely lots of ways to get the nouveau compiler to go bonkers
20:34imirkin_: most of those require test-style shaders rather than real opengl game style shaders
20:35imirkin_: nv50_ir::Interval::overlaps (this=this@entry=0x70
20:35imirkin_: that's probably not *great* ;)
20:37imirkin_: airlied: i have fixed a handful of bugs that the GL CTS suite pointed out though. mostly trivial stuff.
20:37imirkin_: with dboyan's help, looks like we might have finally nailed down a huge source of compute shader fail, which will be nice.
20:38librin: imirkin_: running the game with NV50_PROG_DEBUG=1 gave exactly zero bytes of output
20:38imirkin_: librin: oh, needs to be a debug build (--enable-debug)
20:38librin: "how do I enable that on gen... err, funtoo, again?"
20:39imirkin_: at least on gentoo
20:39librin: oh right, I see the use flag for that
20:39librin: hope it does what I think it does
20:40librin: although this all would be so much easier if the game didn't take DNF to load, dang it
20:40librin: Duke Nukem FOREVER
20:41imirkin_: it'll be even slower with debug
20:42librin: FWIW, I didn't see any difference between running it under a -O2 and -O0 mesa build
20:42librin: so I doubt it'll be that much slower
20:42imirkin_: well, debug enables code paths
20:42imirkin_: that are not there
20:43imirkin_: airlied: anyways, i suspect that full CTS conformance will remain elusive for quite a while for nouveau. there are a handful of things we kinda skimp on, like 3d images.
20:51imirkin_: [and fp64 precision, but hopefully dboyan will improve some of that]
20:53Echelon9: imirkin_: Travis-CI is just a github ui thing as proposed. No impact to your workflow if you push commits directly.
20:53imirkin_: Echelon9: and presumably something we can turn off easily should it go awry
20:54Echelon9: imirkin_: Always
20:54Echelon9: easy to turn off
20:54imirkin_: i don't see any downside then
20:54imirkin_: just flip it on and see how it goes.
20:55Echelon9: i've only ever seen it be helpful on other FOSS projects I work on
20:56imirkin_: cool. envytools doesn't really "do" the github flow of things, tbh i'm not a fan, and i doubt others love it too much either. but this doesn't enforce the github flow, and presumably provides useful info as it'll build master...
21:02librin: imirkin_: I am not really sure which part to cut away from the dump, so I'm posting the whooooole dump.
21:02librin: sound good?
21:02imirkin_: probably too much
21:02librin: 635.5KiB of a dump
21:02imirkin_: is there a TGSI dump in the last 1k lines or so?
21:03librin: I don't see "TGSI" in the last few k of lines, actually
21:03imirkin_: that's normal
21:03librin: hence confused on where to cut from
21:03imirkin_: do you see FRAG
21:03imirkin_: (for example)
21:03librin: and MAIN:
21:03imirkin_: and you see the lines that follow FRAG
21:03imirkin_: which look like a assembly dump
21:03librin: the very last 1.7K block of lines is a MAIN:
21:04imirkin_: that's a big program.
21:04librin: before that there goes a FRAG: block
21:04imirkin_: fine, paste the whole thing, i'll sort it out.
21:04librin: the FRAG block is 997 lines, wew
21:04librin: and okay
21:05imirkin_: that's about 900 more lines than i wanted it to be =/
21:05librin: also, s/1.7/1.3/
21:05imirkin_: i guess that makes sense given how big the shader is
21:05imirkin_: esp since nv50 ir is scalar while tgsi is vector
21:05librin: the reg count goes up to something like 4.9k as I can see
21:05imirkin_: those are virtual reg ids
21:06librin: yeah, I know
21:06imirkin_: they're allocated willy nilly, that's not very unusual
21:06librin: in case it can only handle 2^32 regs, that's probably a problem
21:06imirkin_: it's a very sparse quantity
21:06librin: but heck if I know
21:06imirkin_: i don't think i've seen the count ever go over 100k
21:07librin: wait, I missed by several orders of magnitude there
21:07librin: either way, sec
21:10librin: imirkin_: okay, it's up on the bug report
21:11librin: and that's my cue to hit the hay; it's late and I've got work tomorrow
21:11imirkin_: hm. odd. i wouldn't have expected this shader to hit any issues.
21:11imirkin_: librin: can you mention what GPU you have?
21:12imirkin_: ok thanks
21:12imirkin_: only 64 regs, could be it's failing in the RA -> undo RA -> spill -> RA dance
21:12librin: (GeForce GTX 770)
21:13nyef```: RA? ... "Register Array", or something else?
21:13imirkin_: nyef```: register allocation
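The "RA -> undo RA -> spill -> RA" loop can be illustrated with a toy allocator. This Python sketch is a deliberately simplified linear scan over live intervals and is not how nouveau's allocator works internally; the function names, the half-open `(start, end)` interval format and the "spill the longest-lived value" heuristic are all assumptions chosen for illustration. A real compiler would also insert loads and stores for a spilled value instead of just dropping it from the problem.

```python
def try_allocate(intervals, num_regs):
    """One allocation attempt. Returns {name: reg}, or None on failure.

    intervals maps a value name to a half-open live range (start, end).
    """
    assignment = {}
    active = []  # (end, reg) for values that are still live
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Drop values whose live range ended at or before this start.
        active = [entry for entry in active if entry[0] > start]
        used = {reg for _, reg in active}
        free = [reg for reg in range(num_regs) if reg not in used]
        if not free:
            return None          # out of registers: this attempt fails
        assignment[name] = free[0]
        active.append((end, free[0]))
    return assignment


def allocate_with_spills(intervals, num_regs):
    """Retry allocation, spilling one value after every failed attempt."""
    live = dict(intervals)
    spilled = []
    while True:
        result = try_allocate(live, num_regs)
        if result is not None:
            return result, spilled
        # "Undo RA": the failed attempt is thrown away entirely.
        # "Spill": take the longest-lived value out of registers, then retry.
        victim = max(live, key=lambda n: live[n][1] - live[n][0])
        spilled.append(victim)
        del live[victim]


regs, spilled = allocate_with_spills({"a": (0, 10), "b": (1, 3), "c": (2, 4)}, 2)
print(regs, spilled)
```

With two registers and three overlapping values, the first attempt fails, the long-lived `a` is spilled, and the second attempt succeeds. Each extra round of this loop is another chance for a bug to surface, which fits the suspicion voiced above.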
21:13nyef```: Ah, okay.
21:14nyef```:shudders in memory of some backend bugs triggered by changes to regalloc logic.
21:14librin: nyef```: how come You can have all those ` in Your name, yet I can't register a nick with dots/periods/full_stops >:[
21:14imirkin_: yeah, it's all a little sensitive.
21:16nyef```: librin: Because I've been disconnected three times since I actually logged in, and ERC likes to add ` to the end of my name every time it fails to log in, but doesn't cycle back to my originally-specified nick.
21:39skeggsb: imirkin_, dboyan: you are aware of ftp://download.nvidia.com/open-gpu-doc/Compute-Class-Methods/1/ and ftp://download.nvidia.com/open-gpu-doc/qmd/1/ ?
21:40skeggsb: compute classes, and "launch descriptor" headers
21:42imirkin_: skeggsb: i forgot about those... where do you see the launch descriptor headers?
21:42skeggsb: qmd is the headers
21:42skeggsb: Queue Meta-Data, I believe
21:42imirkin_: oh. and there it is.
21:42imirkin_: #define NVA1C0_QMDV00_06_SAMPLER_INDEX MW(382:382)
21:42imirkin_: #define NVA1C0_QMDV00_06_SAMPLER_INDEX_INDEPENDENTLY 0x00000000
21:42imirkin_: #define NVA1C0_QMDV00_06_SAMPLER_INDEX_VIA_HEADER_INDEX 0x00000001
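Reading `MW(382:382)` as "bits 382 through 382 of a multi-word structure" (an assumption about the header's notation), the field lands in 32-bit word 11, bit 30, since 382 = 11 * 32 + 30. The Python sketch below shows that arithmetic; `mw_set`, `mw_get` and the 64-word descriptor size are hypothetical and only serve the example, while the two field values are the ones quoted from the header.

```python
WORD_BITS = 32

SAMPLER_INDEX = (382, 382)             # (hi, lo) from MW(382:382)
SAMPLER_INDEX_INDEPENDENTLY = 0x0
SAMPLER_INDEX_VIA_HEADER_INDEX = 0x1


def mw_set(words, hi, lo, value):
    """Write value into bits hi:lo (inclusive) of an array of 32-bit words."""
    width = hi - lo + 1
    if value < 0 or value >> width:
        raise ValueError("value does not fit in the field")
    for i in range(width):
        word, shift = divmod(lo + i, WORD_BITS)
        if (value >> i) & 1:
            words[word] |= 1 << shift
        else:
            words[word] &= ~(1 << shift) & 0xFFFFFFFF


def mw_get(words, hi, lo):
    """Read bits hi:lo (inclusive) back out of the word array."""
    value = 0
    for i in range(hi - lo + 1):
        word, shift = divmod(lo + i, WORD_BITS)
        value |= ((words[word] >> shift) & 1) << i
    return value


desc = [0] * 64                        # hypothetical descriptor size
mw_set(desc, *SAMPLER_INDEX, SAMPLER_INDEX_VIA_HEADER_INDEX)
print(divmod(382, WORD_BITS), hex(desc[11]))
mw_set(desc, *SAMPLER_INDEX, SAMPLER_INDEX_INDEPENDENTLY)
```

Handling the field bit by bit is slow but keeps the sketch correct for fields that straddle a word boundary.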
21:42skeggsb: yep :)
21:49imirkin_: either way, i'm pretty sure i whacked the right bit. but yeah, this is all pretty interesting... should update gk104_compute.xml with that info.
21:50skeggsb: i need to add pascal compute support to mesa too at some point, but, given the fw situation, not too high a priority
21:50skeggsb: i have a test app that's good enough for my immediate needs :P
22:17karolherbst: skeggsb: well there are other things more important than pascal right now until we get the firmware :D
22:18Yoshimo: automated reclocking for older cards for example
22:18karolherbst: I have a RFC state of thing for this already
22:19karolherbst: but yeah, this is for gt215+ only
22:19imirkin_: still not bad.
22:19imirkin_: probably worth cleaning up.
22:19karolherbst: I have the last reclocking patches on my top priority right now
22:20karolherbst: it helps a little with overheating
22:20karolherbst: aka adjusting the clock when the temperature changes
22:20karolherbst: found a bug with this yesterday
22:22Yoshimo: how far have you come with claiming openGL 4.5 on the different models?
22:22karolherbst: imirkin_: I was pretty sure that this patch would prevent the driver from hanging when you reclock while suspended: https://github.com/karolherbst/nouveau/commit/b9aec38ec28fdb8d76337aadbd50ef6149f028fd
22:22karolherbst: but something else is odd
22:23imirkin_: Yoshimo: well, i still have no clue what the conformance process entails. however i doubt we'll be able to claim conformance in the next year.
22:23karolherbst: but I don't even want to prevent changing pstates while being suspended anyhow
22:23imirkin_: off chance we might be able to on maxwell actually...
22:23karolherbst: I need a deeper thought with this
22:24karolherbst: imirkin_: do we fail with too many things?
22:24Yoshimo: conformance test is one thing, are the features for it done?
22:24imirkin_: karolherbst: well, there's tons of little fails... but the big ones for fermi/kepler are going to be 3d image support
22:25imirkin_: either we do the address calculations in the shader, or we detile before passing it in.
22:25imirkin_: (detile along the z axis)
22:25karolherbst: skeggsb: I also have a tested-by now for my pci fix: https://github.com/karolherbst/nouveau/commit/0a7e150242b9e86e6fcb28a48cae25e744c6ed5f
22:26karolherbst: silly issue, why are 6 and 0 so close :D
22:26imirkin_: Yoshimo: feature-wise, it's basically there. just a few of these odd, rarely-used corners left.
22:27imirkin_: we've been requested not to return GL 4.5 version upstream as it may create difficulties for distros
22:29Yoshimo: if it is not entirely done, don't claim support officially, i understand that
22:29imirkin_: well, it's not the 4.5 bits that aren't done
22:29imirkin_: like i said... just some corners of corners of features
22:29imirkin_: like shader images
22:30imirkin_: a few others, i think
22:31imirkin_: query buffers still aren't perfect
22:31imirkin_: lots of little things :)
22:31imirkin_: generally just stuff that conformance tests care about, not real applications
22:31Yoshimo: is this amount of small things what caused the request from distros?
22:32imirkin_: the request was because KHR may get mad at distros for shipping a GL 4.5 impl that didn't go through the conformance process.
22:33Yoshimo: ah so not a technical issue, i see
22:34imirkin_: in general nouveau is pretty conformant
22:35Echelon9: imirkin_: Appears I need to be moved from "member" permissions on GitHub organization to "owner" to edit the necessary webhook permissions to link Travis
22:35imirkin_: Echelon9: give me the thing to enter in?
22:37imirkin_: [i'm on the add webhook page... what url/etc settings do i need?]
22:39Echelon9: let me see. I've only ever done the request from the Travis-CI side, which automates the setup
22:40karolherbst: you don't need to set up anything
22:40karolherbst: just log in to travis via the envytools user
22:40karolherbst: and enable it there
22:40imirkin_: there's a thing to add the travis ci service in github
22:40imirkin_: which requests a user/token/domain
22:41karolherbst: I will do it
22:41japele: can somebody tell me the difference between the 0x022438 and 0x02243c nouveau registers?
22:41karolherbst: imirkin_, Echelon9: it should work now
22:42karolherbst: imirkin_: https://travis-ci.org/profile/envytools
22:42imirkin_: japele: 22438 = PUNITS.DESIGN_PART_COUNT
22:42imirkin_: 2243c = undocumented
22:42imirkin_: oh. 2243c is on GP100+
22:43imirkin_: something subtle and memory-related.
22:44karolherbst: why is there my user now configured for travis.... whtvr
22:45karolherbst: Echelon9: next idea for the travis integration: check if the xml files are well-formed and that lookup doesn't throw errors
22:46karolherbst: no idea what check_rnndb does though
22:46karolherbst: maybe it does something like this already
22:47Echelon9: karolherbst: I'd been thinking along those lines... will take a closer look at check_rnndb
22:49japele: imirkin: OK, using register 2243c instead of 22438 to detect the ram parts in ramgf100.c fixes the bug on my fermi GF108 where only half the amount of ram is detected
22:49karolherbst: "fixes" or fixes
22:50karolherbst: I am sure there are tons of regs which may fix the bug as well
22:50imirkin_: japele: hmmm... let's see if the gk20a headers have anything clever to say. probably not since they don't have these fuses...
22:50karolherbst: japele: what values have both regs for you?
22:51karolherbst: imirkin_: on my gpu both regs are 0x3
22:52imirkin_: 22438 = top_num_fbps
22:52imirkin_: no mention of 2243c
22:52japele: on mine, 22438 is 0x1 and 2243c is 0x2
22:52karolherbst: japele: did you do an mmiotrace?
22:53karolherbst: nvidia doesn't read 0x02243c here
22:53imirkin_: has info on 2243c for pascal...
22:53japele: Yes, but I haven't sent it yet
22:53karolherbst: japele: I would like to take a look inside the trace
22:53imirkin_: In most prior NVIDIA architectures
22:53imirkin_: (except GF108) each logical FBP mapped to one FBPA.
22:54imirkin_: heh. and of course you happen to have a GF108 afaik, right?
22:56japele: imirkin: yes. karolherbst: should I send it to mmio dot dumps?
22:56imirkin_: japele: that'd definitely be good.
22:56karolherbst: or give me a link
22:56karolherbst: whatever works for you
23:02snkcld: just curious.. but why can't an optimus device be used for gpu passthrough for a guest vm?
23:02snkcld: i dont exactly understand why its not possible... the GPU has a framebuffer that it outputs to right? so why not just have that framebuffer sent to the guests memory ?
23:02imirkin_: snkcld: should work
23:02snkcld: oh, haha
23:02snkcld: well, in that case, i take back what i said
23:02imirkin_: it'll require jumping through some hoops
23:02snkcld: oh, dang
23:02snkcld: like what?
23:03imirkin_: since most usually the vbios is stored in ACPI for those
23:03snkcld: as opposed to what?
23:03imirkin_: which means you'll either need to construct a sufficient ACPI for the guest system to include that
23:03imirkin_: or to pass it in via an oprom you invent
23:03imirkin_: or some other mechanism
23:03snkcld: so in a discrete card, the vbios exists elsewhere?
23:03imirkin_: snkcld: as opposed to being in the option rom
23:04snkcld: ok, ok
23:04imirkin_: depending on the driver in the guest, it might be easy to convince it to load your vbios
23:04imirkin_: like if the guest is also running nouveau, you can boot with config=NvBios=foo
23:04imirkin_: which will load whatever you tell it
23:05imirkin_: if the guest is running windows, it might take a bit more convincing :)
23:05snkcld: imirkin_: what is an example of what foo would be?
23:06imirkin_: path to where request_firmware() can load the file
23:06snkcld: oh, so, on the host i could like, dump the acpi data
23:06snkcld: and extract the vbios portion
23:07imirkin_: well, easiest by just loading nouveau
23:07snkcld: then tell the host to use that file for the bios
23:07imirkin_: and then dumping /sys/kernel/debug/dri/N/vbios.rom
23:07snkcld: oh, ha
23:07snkcld: that does seem easier
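A minimal sketch of the dump step described above, in Python. It assumes nouveau is loaded on the host and debugfs is mounted at /sys/kernel/debug. The function name `dump_vbios` and both paths are illustrative, not part of nouveau. The 0x55 0xAA check is only a sanity check and is reported rather than enforced.

```python
from pathlib import Path


def dump_vbios(src, dst):
    """Copy a vbios image from src to dst.

    Returns (size_in_bytes, has_rom_signature). The signature test only
    looks at the first two bytes and does not validate the image further.
    """
    data = Path(src).read_bytes()
    has_signature = data[:2] == b"\x55\xaa"
    Path(dst).write_bytes(data)
    return len(data), has_signature
```

On a real host the source would be something like /sys/kernel/debug/dri/N/vbios.rom (read as root, with N being the card index). The destination needs to be somewhere request_firmware() can find it in the guest, commonly under /lib/firmware.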
23:07snkcld: so option rom is like, the basic driver for a device that comes with the device....
23:08snkcld: like a primitive driver before the OS's modules?
23:08imirkin_: option rom is executed by the bios
23:08snkcld: how does bios know where the option rom is located?
23:08imirkin_: it's a PCI thing
23:08imirkin_: when it scans the PCI bus, it knows which devices have option roms
23:08snkcld: that makes sense, ok
23:09imirkin_: this is how various devices can do things before boot
23:09imirkin_: like PXE, display things on the screen, etc
23:09snkcld: so is it similar to the kernel having "init_module"
23:09snkcld: for each built in module
23:09imirkin_: yeah. just happens before boot ever starts :)
23:09snkcld: it just iterates over each function and executes it
23:09snkcld: yea yea
23:09imirkin_: it works a little different with EFI, but same basic idea i think
23:10snkcld: so the device's option ROM can register itself to handle certain interrupts?
23:10imirkin_: can do whatever it wants.
23:10snkcld: like if its a keyboard, it can register itself in the option rom?
23:10imirkin_: it's just code running on the CPU in real mode.
23:10snkcld: yea, yea
23:10imirkin_: what could possibly go wrong
23:11snkcld: so the machine starts, the bios goes over each function 0 of each bus + device number, if a function 0 responds, it then asks the device for option rom?
23:11imirkin_: something like that.
23:11imirkin_: its presence is indicated in the pci config space i think
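For illustration, here is a rough Python sketch of reading the Expansion ROM Base Address register out of a raw PCI configuration header. The register offsets (0x30 for type 0 headers, 0x38 for type 1 bridge headers) come from the PCI specification. The function name `rom_bar_info` is made up for this example. A value of zero only means the register is unimplemented or not yet assigned; firmware sizes the ROM by writing to the register, which this sketch does not attempt.

```python
import struct

ROM_BAR_OFFSET = {0: 0x30, 1: 0x38}  # keyed by PCI header type


def rom_bar_info(config):
    """Return (rom_base_address, decode_enabled) or None if no device.

    `config` is at least the first 64 bytes of a function's PCI
    configuration space.
    """
    if len(config) < 0x40:
        raise ValueError("need at least 64 bytes of config space")
    (vendor_id,) = struct.unpack_from("<H", config, 0x00)
    if vendor_id == 0xFFFF:
        return None  # nothing responded at this bus/device/function
    header_type = config[0x0E] & 0x7F  # bit 7 is the multi-function flag
    if header_type not in ROM_BAR_OFFSET:
        raise ValueError("header type without an expansion ROM register")
    (rom_bar,) = struct.unpack_from("<I", config, ROM_BAR_OFFSET[header_type])
    # Bits 31:11 hold the address, bit 0 enables decoding of the ROM.
    return rom_bar & 0xFFFFF800, bool(rom_bar & 1)
```

On Linux the same bytes can be read from /sys/bus/pci/devices/<address>/config, which is one way to try this on a real device.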
23:12snkcld: ok ok
23:12snkcld: the "going over each function 0" happens in that space right?
23:12snkcld: the whole mailbox thing
23:12imirkin_: tbh i've never written a bios
23:13snkcld: imirkin_: so if the vbios stuff gets passed in... the guest should in theory work with the device?
23:14imirkin_: in theory.
23:14imirkin_: in practice, nothing is so simple.
23:14imirkin_: the option rom is an x86-executable binary
23:14imirkin_: however the vbios also needs to be parseable by the driver
23:14imirkin_: so there are actually often 2 different "APIs" for accessing it
23:15snkcld: oh yea, those are different things...
23:15imirkin_: one, via the option rom, which returns x86 asm
23:15imirkin_: this asm is basically an interpreter for vbios opcodes
23:15imirkin_: the other is a way to get at the vbios directly
23:15snkcld: ok so the option rom on the graphics card is meant to be executed by the _graphics bios_
23:15snkcld: not the cpu's bios?
23:16imirkin_: it _is_ the graphics bios
23:16imirkin_: which is executed by the cpu's bios
23:16snkcld: ok ok
23:16snkcld: so option rom = vbios (at least in some cases e.g. optimus)
23:16imirkin_: well, in the optimus case, there is no option rom at all
23:16imirkin_: and the vbios lives in ACPI
23:16snkcld: just the vbios...
23:17imirkin_: i.e. you call acpi methods, and they return vbios data to you
23:17snkcld: uh huh
23:17snkcld: and between these 2 methods, when is one used or the other?
23:18imirkin_: the driver tries to read the vbios from vram, then from the prom [not the same thing as the option rom], then acpi (i think... or some other order)
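The fallback behaviour described above can be sketched as a simple ordered search. This is not nouveau's code: the function name, the source names and the validity check are all hypothetical, and the order follows the "vram, then prom, then acpi" guess from the line above, which was itself stated with some doubt.

```python
def first_valid_vbios(sources, is_valid):
    """Try each (name, reader) pair in order and return the first good image.

    `sources` is an ordered list of (name, callable returning bytes or None).
    `is_valid` is a callable that accepts or rejects a candidate image.
    Returns (name, image), or None when every source fails.
    """
    for name, read in sources:
        try:
            image = read()
        except OSError:
            continue  # this source is unavailable, try the next one
        if image and is_valid(image):
            return name, image
    return None
```

A caller would pass something like `[("vram", read_vram), ("prom", read_prom), ("acpi", read_acpi)]`, where the three readers are placeholders for whatever access method each source needs.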
23:19snkcld: you mention the driver... but what about before boot? is the vbios etc not accessed before boot?
23:19snkcld: oh sorry
23:19snkcld: i misread the sentence
23:19imirkin_: that's all done by the option rom, or by an integrated bios that knows about all the peripherals.
23:20snkcld: so between vram, prom, and acpi, theres some order which the option rom will look to get the 'vbios'
23:20imirkin_: option rom will always use its own way
23:20imirkin_: the driver does the fallback thing.
23:20snkcld: the driver...
23:20snkcld: is this the OS's driver?
23:20snkcld: honestly, im not even trying to think about that at this point in time haha
23:20snkcld: the boot process is complex enough
23:21snkcld: but thank you
23:21imirkin_: well, you don't HAVE to have working graphics during boot
23:21snkcld: but is what i said correct?
23:21snkcld: the option rom goes looking for a vbios too?
23:21imirkin_: e.g. that's what happens when you plug a random board into a mac which doesn't have an OF blob in its option rom.
23:21imirkin_: yes, it does.
23:22imirkin_: but it knows where to find it
23:22snkcld: and if it finds a vbios, it then jmp's into it
23:22snkcld: then the vbios might set up interrupt handlers for video?
23:22imirkin_: option rom can't not find a vbios
23:22mjg59: snkcld: It's unusual for firmware-level drivers to use interrupts
23:22imirkin_: i mean, you could have catastrophic failure where it's not there, but that's where you get catastrophic boot failure :)
23:23snkcld: mjg59 the wiki page is saying that vbios uses int 10h
23:23imirkin_: mjg59: well, presumably it installs the usual real mode int 10h stuff
23:23mjg59: snkcld: They'll usually do a bare minimum of modesetting and link training
23:23snkcld: i see
23:23mjg59: snkcld: Unf ok that's not untrue but
23:23snkcld: but its rare right?
23:24mjg59: How familiar are you with software interrupts?
23:24snkcld: mjg59: i think i have a fair grasp
23:24snkcld: like system calls via int0x80
23:24mjg59: On x86 systems in BIOS mode, yeah, video cards will usually install themselves as int10 handlers
23:24snkcld: passing the syscall number in eax etc
23:24mjg59: But they'll rarely make use of hardware interrupts
23:24imirkin_: [int10h is a software interrupt of course]
23:24snkcld: of course, yea
23:24snkcld: my bad
23:24imirkin_: it's just a way to invoke the software without an explicit "RPC" interface
23:25mjg59: UEFI doesn't use the int10 interface
23:25snkcld: so int10 is triggered in software, then the bios looks at what code to run for int 10... it finds a memory address which is mapped into vbios?
23:25mjg59: (Ok there's a set of corner cases where that's not actually true but they're horrifying and have mostly died out)
23:25imirkin_: int10 is triggered *by* software. the video card will have installed its handler for it to implement the functionality.
23:25mjg59: snkcld: Yeah the video card installs an address in the IDT and the CPU jumps to that address when int10 is called
23:25imirkin_: so the BIOS will call int10h to configure video settings, etc
23:25snkcld: imirkin_: yea by, thats what i meant
23:26mjg59: That address will typically point at the option ROM, but not necessarily on systems with integrated graphics
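As a worked example of the lookup described above: in real mode the table of handlers (usually called the interrupt vector table) starts at physical address 0 and holds one 4-byte entry per vector, stored as a 16-bit offset followed by a 16-bit segment. The Python sketch below decodes one entry from a snapshot of low memory. The function name and the sample values are illustrative only.

```python
import struct


def ivt_handler(memory, vector):
    """Decode a real-mode interrupt vector from a copy of low memory.

    Returns (segment, offset, linear_address) for the given vector number.
    """
    if not 0 <= vector <= 0xFF:
        raise ValueError("vector must be in the range 0..255")
    offset, segment = struct.unpack_from("<HH", memory, vector * 4)
    return segment, offset, (segment << 4) + offset
```

For int 10h the entry sits at address 0x40. If it held segment 0xC000 and offset 0x0123, the handler would be at linear address 0xC0123, inside the region where a video option ROM conventionally lives.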
23:26snkcld: imirkin_: so that happens _after_ the bus enumeration etc then
23:26mjg59: snkcld: Yes
23:26imirkin_: snkcld: yeah. BIOS is a pretty fragile system :)
23:26imirkin_: a wonder it ever worked at all
23:26snkcld: its confusing
23:26japele: karolherbst: mmiotrace sent...
23:26mjg59: Once an option ROM is found, the system bios will jump to offset 0x3 in the ROM and execute it
23:26mjg59: That code then installs the int 10 handler
23:26snkcld: japele: please tell me this is in regards to the gtx 1050
23:26mjg59: And does initial modeset
23:27imirkin_: (why 3 you ask? because all ROMs have a 0x55aa header which takes up 2 bytes, followed by a 1-byte size field)
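To make the layout concrete, here is a small Python sketch that checks the start of a legacy x86 option ROM image. The first two bytes are the 0x55 0xAA signature, the third byte gives the image size in 512-byte units, and the init entry point is at offset 3. The function name and the returned dictionary keys are made up for this example.

```python
ENTRY_OFFSET = 3  # signature (2 bytes) + size byte, then the entry point


def parse_option_rom_header(rom):
    """Validate the signature and return basic facts about the image."""
    if len(rom) < ENTRY_OFFSET:
        raise ValueError("image is too short to hold a ROM header")
    if rom[0] != 0x55 or rom[1] != 0xAA:
        raise ValueError("missing 0x55 0xAA signature")
    return {
        "size_bytes": rom[2] * 512,  # size field counts 512-byte blocks
        "entry_offset": ENTRY_OFFSET,
    }
```

This also answers the question of how the system bios recognises a ROM: it looks for the signature before jumping to the entry point.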
23:27karolherbst: japele: thanks
23:27mjg59: BIOSes *may* then use further int10 calls
23:27mjg59: Say, to set a graphics mode for bootsplash
23:27snkcld: does it know it found the ROM due to the header?
23:27snkcld: awesome ok
23:28snkcld: so machine starts, pci stuff is done, at which point memory addresses are mapped to devices and option roms have been executed, then, after all that, int 10h points to a vector inside the graphics card option rom
23:29mjg59: In the old days X would make int10 calls
23:29snkcld: lol wow
23:29snkcld: so the vbios is the option rom? i forgot
23:29mjg59: And included an x86 emulator so that would work
23:29mjg59: On plug-in cards, the vbios is in the option rom
23:29snkcld: yea yea
23:29snkcld: so youre saying its _in_ the option rom
23:30japele: snkcld: I have a fermi geforce GT60M (gf108) on optimus laptop, do you have less ram detected?
23:30snkcld: im confused though because from my understanding, a bios "provides run time services"
23:30snkcld: does the graphics card "provide runtime services"? or am i just being too pedantic?
23:30imirkin_: it does provide runtime services
23:30snkcld: in my mind, a bios is a thing that responds to interrupts i guess
23:30mjg59: When nouveau does modesetting it executes scripts that are in the vbios
23:30imirkin_: the system bios also provides runtime services
23:30snkcld: uh huh
23:31mjg59: int10 executes an interpreter that runs the same scripts
23:31snkcld: i see
23:31snkcld: so nouveau runs int 10?
23:31snkcld: or it just reads the data from the vbios and does it itself?
23:31mjg59: nouveau and int10 run the same scripts
23:31mjg59: There's a similar setup on Radeon
23:32imirkin_: apparently the radeon scripts are a lot more involved than the nvidia ones...
23:32skeggsb: and there's a *lot* more to the process than the scripts, they're just a very small part of the modeset
23:32mjg59: Yeah, conceptually similar but quite different in the details
23:32snkcld: are these "scripts" the bytecode that is interpreted by firmware?
23:32imirkin_: mmm... not by firmware
23:32imirkin_: by the cpu
23:32snkcld: oh, ok ok
23:33snkcld: so theyre binary code then right? like
23:33snkcld: theyre just straight up ran by the cpu
23:33imirkin_: no, they're bytecode
23:33imirkin_: that requires an interpreter
23:33imirkin_: but that interpreter runs on the cpu
23:33snkcld: oh yes yes of course
23:33imirkin_: while firmware implies that it runs on the device
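To illustrate the bytecode-versus-binary distinction, here is a toy interpreter in Python. The three opcodes are invented for this example and are not the real vbios init opcodes. The point is only that the script is data, and an interpreter running on the cpu walks through it and performs the register writes it describes.

```python
import struct

# Hypothetical opcodes, chosen only for this illustration.
OP_WRITE = 0x01  # operands: reg (u32), value (u32)
OP_MASK = 0x02   # operands: reg (u32), and_mask (u32), or_value (u32)
OP_END = 0xFF    # no operands, stops the script


def run_script(script, regs):
    """Interpret `script` against the register dict `regs` and return it."""
    pc = 0
    while pc < len(script):
        op = script[pc]
        pc += 1
        if op == OP_END:
            return regs
        if op == OP_WRITE:
            reg, value = struct.unpack_from("<II", script, pc)
            pc += 8
            regs[reg] = value
        elif op == OP_MASK:
            reg, and_mask, or_value = struct.unpack_from("<III", script, pc)
            pc += 12
            regs[reg] = (regs.get(reg, 0) & and_mask) | or_value
        else:
            raise ValueError("unknown opcode 0x%02x" % op)
    raise ValueError("script ended without OP_END")
```

In this picture, the driver and the int 10h handler are two different interpreters that can run the same script, which matches what is said just above about nouveau and int10.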
23:33snkcld: is this... AML?
23:33snkcld: ive heard of like, AML or something
23:33imirkin_: no, AML is different
23:33snkcld: ah, ok
23:33imirkin_: it's the language of ACPI
23:33karolherbst: imirkin_: nvidia reads out 0x02243c on the gf108
23:34imirkin_: ah right yeah. i did the lookup for c0, not c1
23:34imirkin_: either way... this is going to be one for skeggsb
23:34snkcld: so, right before boot. the video card option rom has been executed during the "pci-phase" (or whatever it is called). and int 10 has been registered by said option rom
23:34karolherbst: japele: nice catch
23:34imirkin_: who clearly doesn't have enough weird issues on his plate, he needs another one
23:35karolherbst: japele: can fix it :p
23:35imirkin_: well, i dunno how working the full 2GB of vram will be in that case
23:35imirkin_: it sounds like there's going to be interactions with LTC and who-knows-what-else
23:36snkcld: and the vbios is data _inside_ the option rom right
23:36imirkin_: sometimes, not always.
23:36snkcld: oh, yea
23:36snkcld: could be in acpi
23:36snkcld: either way, its somewhere that only the option rom knows about, and must be able to register the bios to point to for int 10?
23:38snkcld: ok ok, so, once thats all said and done, and the boot process begins... we modify the display by accessing 0x80000 right?
23:38snkcld: or do we do it via int 10 arguments?
23:40snkcld: i guess both work, right
23:41japele: karolherbst: imirkin_: well, thanks guys for this and also for your work on open source projects! :p
23:42karolherbst: japele: you can help out as well :p
23:51kattana_: which language is needed for the opencl and vulkan part in nouveau?
23:52skeggsb: imirkin_: airlied suggested i write the vulkan driver in rust...
23:52skeggsb: i'm fairly certain he wasn't serious ;)
23:52imirkin_: skeggsb: that'd be amusing though
23:52imirkin_: (rust appears to be the subject of an enormous amount of hype... oh well)
23:53skeggsb: my biggest fear about such a thing would be that it just disappears one day
23:53imirkin_: my biggest hope about it is that it disappears one day
23:53imirkin_: preferably today.
23:53skeggsb: there are many other reasons not to of course, but, yes
23:54imirkin_: although i'm about to embark on a project in go. should be interesting.
23:55imirkin_: somehow i've been left with a very negative opinion of rust though
23:55imirkin_: i think it's coz something in their installation process tries to download stuff from the internet
23:55imirkin_: which is pretty much a no-go for me
23:55orbea: that is when I stopped with rust too, but only cargo does it
23:55skeggsb: i've only very briefly played with it, found it interesting, but not interesting enough to go through the pain of learning it properly
23:56skeggsb: whether or not it'd be actually useful, i don't know :P