00:58mogorva: yesterday i was seeking help with apitrace and Steam games running under Wine. The method that imirkin suggested actually worked, it was the game that crashed so early that apitrace didn't produce a trace.
00:59mogorva: fortunately i found another game that crashes in a similar way under Wine with nouveau (backtrace: http://pastebin.com/b2EvnxKh) and i managed to create a trace with apitrace: https://drive.google.com/open?id=0B-tTbLKBl-tOZHFhRHJoSWJJN0U&authuser=0
01:00mogorva: this game, named 'How to Survive' crashes during the initial loading screen, the generated trace file is about 26 MB.
07:28mogorva: why am I getting a cropped image when replaying a trace created with apitrace? The game was running @1024x768 but the window created by glretrace is only half of that or smaller. Only the left half of the screen is rendered in that window.
07:28mogorva: this looks suspicious: error: drawable failed to resize: expected 6750824x768, got 616x768
07:44imirkin: mogorva: dunno, i don't think i've ever seen that
07:44imirkin: mogorva: i'll have a look at your other trace. odd that apitrace didn't handle an early crash by the game... maybe it doesn't flush its buffers?
08:09mogorva: this is the first warning/error message from glretrace: unsupported glXSwapIntervalMESA call error: drawable failed to resize: expected 6750824x768, got 616x768
08:09mogorva: it is a windows game running in Wine btw.
08:22imirkin: that reads like some uninitialized value or something... odd.
08:22imirkin: you could file a bug with the apitrace people... i just use it, don't know much about how it works
08:23mogorva: do they accept bug reports for games under Wine? I mean it could be a bug in Wine as well
08:27imirkin: i'm not really sure what the issue you're having is tbh :)
08:27imirkin: but i'll look at the trace you uploaded tonight-ish
08:27imirkin: you said replaying that trace crashes mesa right?
08:28mogorva: no, it's only the game that crashes
08:29mogorva: and the backtrace points to nouveau
08:29imirkin: wait, so replaying the trace doesn't crash?
08:29imirkin: then i won't be able to repro =/
08:34mogorva: there's no crash when replaying the trace, it simply ends where the game crashes in Wine
08:35imirkin: are any of the games with this issue free games?
08:35imirkin: or ones with demos that also trigger the issue?
08:35mogorva: this is the output from glretrace: http://pastebin.com/7Mk7A74V
08:36mogorva: yeah, Skydrift has a demo on Steam, but it is the one that crashes so early that I couldn't get a trace: http://store.steampowered.com/app/91100/
08:37imirkin: that's fine
08:37imirkin: so all i do is get that and run it in wine and i should see the crash? what wine ver btw?
08:38mogorva: i tried with Wine 1.7.45+git
08:38imirkin: ok. i have 1.7.45, should be able to repro then.
08:38mogorva: so was the trace that I created completely useless?
08:40imirkin: if it doesn't crash when replayed, yes
08:40imirkin: the idea was for me to be able to debug the issue locally
08:41mogorva: it crashes wine, not the whole host/X11/kernel whatever...
08:42imirkin: but if the trace doesn't crash in the driver
08:42imirkin: then i can't debug the driver with the trace :)
08:42mogorva: i don't see anything wrong in dmesg after replaying the trace though
08:44imirkin: you said you got a segfault
08:45mogorva: no, that was another game, a few days ago, i think the King's Bounty game
08:45imirkin: so what happens here?
08:46imirkin: actually i gtg... but basically i have to have a way to see the same thing you do in order to debug any issues in nouveau
08:46mogorva: Skydrift (under wine) crashes on start
08:46imirkin: crash == segfault, no?
08:53mogorva: at least now I know how to get an apitrace in a Steam game under Wine :)
09:31imirkin_: mogorva: ok, that's just the crash i wanted, in nouveau. so if i grab the skydrift demo, i should see that? anything special i need to do?
09:33mogorva: imirkin_: the demo will install some MSVC++ runtimes and Directx, the game should start afterwards
09:34imirkin_: ok, i'll ping you if i run into trouble
09:34imirkin_: [in like 10h or so]
09:34mogorva: but it shows only a black screen for 4-5 seconds without displaying anything then it crashes
09:34imirkin_: the condition it crashes on shouldn't be able to happen at all, which is why i'm def a bit confused :)
09:34mogorva: instead, you should see some splash screens and the game should load to the main menu properly
09:35imirkin_: i know that you want it to render properly
09:35imirkin_: but i just want it to not crash
09:35imirkin_: if one fixes the other, so much the merrier
09:36mogorva: well, the game has rendering issues as well (trees are blocky) but that is present with the binary drivers as well
09:36mogorva: i'm just curious if you can reproduce the crash, because it affects more than one game on my system
09:38imirkin_: you *are* aware that you need a 32-bit mesa for wine, right?
09:38mogorva: i'm on 32-bit
09:38imirkin_: that solves the issue nicely :)
10:02mogorva: imirkin_: actually the crash might be due to a regression in Mesa. Now I compiled Mesa 10.5-branchpoint and the game starts properly with that.
10:03imirkin_: you could do a git bisect :)
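[editor's note] For readers unfamiliar with the suggestion above: git bisect is a binary search over the commit history between a known-good and a known-bad revision. The following is only a minimal Python sketch of that idea, not git's actual implementation. The commit names, the regression position, and the crashes() test are hypothetical stand-ins for "build Mesa at this commit and see whether the game crashes".

```python
from typing import Callable, List, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search for the first bad commit.

    Assumes commits are ordered oldest to newest, commits[0] is known
    good and commits[-1] is known bad (the same premise git bisect uses).
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical history: ten commits, regression introduced at index 6.
history: List[str] = [f"c{i}" for i in range(10)]
regression_index = 6
tested: List[str] = []


def crashes(commit: str) -> bool:
    """Stand-in for 'build this commit and run the game'."""
    tested.append(commit)
    return history.index(commit) >= regression_index


culprit = first_bad_commit(history, crashes)
print(culprit, "found after testing", len(tested), "commits")
```

The point of the approach is that the number of builds to test grows with the logarithm of the number of commits in the range, which is what makes it practical across a whole release cycle.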
10:06mogorva: imirkin_: would you mind sharing your configure options and CFLAGS when compiling mesa? Here's mine, are they sane? http://pastebin.com/UKqyYuab
10:06imirkin_: yours are insane
10:07mogorva: thx :)
10:07imirkin_: mostly that you have a lot of them, and a bunch are defaults
10:07imirkin_: --enable-osmesa -- you really don't need that
10:07imirkin_: do you actually need svga? that's the vmware thing
10:08imirkin_: you can nuke all the opencl stuff, no opencl on nouveau
10:08mogorva: i know of at least 1 game that makes use of osmesa under wine
10:08mogorva: it's Civilization III
10:08imirkin_: the rest aren't horrid, but partly unnecessary
10:08imirkin_: how is it able to make use of that osmesa?
10:08imirkin_: osmesa is basically swrast with glFoo wrapped as notglFoo or something like that
10:09mogorva: if i don't compile opencl in, there won't be gallium-llvm drivers
10:09imirkin_: well, i never build opencl
10:09imirkin_: also, i challenge you to find a GLES1 application that is not piglit
10:10imirkin_: also all the stuff like --enable-shared-glapi should be unnecessary
10:10imirkin_: no reason to override the defaults on that
10:10mogorva: i noticed if i skip --enable-opencl then ../lib/gallium-pipe is not populated
10:11imirkin_: wtf is that dir?
10:11imirkin_: i've never heard of it
10:11mogorva: isn't that used for software rendering?
10:11imirkin_: maybe with the million enable flags you have
10:12imirkin_: but certainly not in my builds
10:12imirkin_: swrast_dri.so is used for swrast
10:12imirkin_: (and swrastg_dri.so)
10:12imirkin_: which are all actually hardlinks to one another nowadays
10:16mogorva: i'd be glad if you could reproduce the crash in Skydrift demo in current Mesa git, that would mean I didn't screw up my mesa compilation
10:17imirkin_: yea, tonight
12:06hakzsam: imirkin_, Hi, do you plan to review more patches of my series about nv50 perf counters? :) I'm going to remove the double query thing locally but I won't submit a v2 only with those (minor) changes
12:19mupuf: oh oh oh
12:19mupuf: sounds promising!
12:21hakzsam: you are looking at the GK20A headers?
12:22mupuf: not much has changed since the last time, as far as I can tell
12:22mogorva: imirkin_: the offending commit should be 6b284f08ab399154ad10e2166440b44cbbdcb2c5 , reverting it on current head fixes the crash for me.
12:22hakzsam: mupuf, yeah, NVIDIA is not going to publish all registers :)
12:23hakzsam: what about perfkit btw?
12:23mupuf: they already released a lot more than this :)
12:36mogorva: gtg now, cya tomorrow
18:03skeggsb_: buhman: ping
18:05skeggsb_: buhman: http://cgit.freedesktop.org/~darktama/nouveau/log/?h=hack-gk106m <<< worth trying to see if nouveau works on your laptop with that
18:05skeggsb_: ^^^ that goes for anyone seeing the GK106M graphics engine hangs on load
18:06skeggsb_: aaronp: that's what i've got so far, if that helps your digging
18:07buhman: skeggsb_: :DDD
18:09skeggsb_: buhman: it fixes the w541 i have here with the issue, so hopefully for you too
18:26buhman: skeggsb_: I'm not sure if I'm doing something stupid, but it seems to reliably and unrecoverably crash
18:27buhman: I've not been able to capture dmesg after the crash, but before it starts to whine: https://ptpb.pw/z2cV
18:32skeggsb_: buhman: add nouveau.runpm=0, maybe there's something else busted for you there too..
18:34skeggsb_: we've got some *very* bad behaviour in runpm failure paths, i'm working on fixing those too, but it's tricky.. just attempting to avoid things going wrong atm :)
18:34skeggsb_: but, i'd suspect that's the crash you're seeing
18:39imirkin: skeggsb_: that BIND_ERROR seems a bit worrying no?
18:39skeggsb_: i'm pretty sure that's a post-runpm fuckup
18:39skeggsb_: based on the acpi crap right before it
18:40imirkin: definitely runpm going on
18:40imirkin: but it doesn't appear to be *failing*
18:40skeggsb_: i'd almost bet on it failing right before the crash that buhman can't capture
18:40airlied: oh I should at least pull that patch to add a delay in power off
18:42skeggsb_: airlied: why the acpi warnings there btw? those could get annoying in the logs with runpm
18:43skeggsb_: i seen those on the w541 too
18:43imirkin: they've been there for ~everyone since forever =/
18:51skeggsb_: interesting, i just seen the same issue as buhman on two boots in a row
18:51skeggsb_: runpm=0 "fixes" it
18:52imirkin: more magic writes left?
18:52skeggsb_: no, i think this is just broken runpm tbh
18:53buhman: 01:52:45 skeggsb_ runpm=0 "fixes" it
18:53buhman: I agree
18:53buhman: https://ptpb.pw/n5DC :D
18:54airlied: skeggsb_: DSM is incorrectly specified some places
18:55imirkin: airlied: is it specified correctly anywhere?
18:55airlied: spec says one thing
18:55airlied: implementations do something else
18:55imirkin: i've never seen it not give a warning
18:55airlied: there are other DSM implementations that I assume get it right
18:55airlied: since DSM is just device specific method
19:08buhman: imirkin: what if I told you nouveau is significantly slower than intel on my hardware (since you asked earlier)
19:08buhman: …and also incorrect output
19:08imirkin: buhman: then i would not be surprised.
19:08imirkin: buhman: but boot with nouveau.pstate=1
19:08imirkin: you should be able to change clocks, and at higher clocks it should probably be faster
19:09imirkin: incorrect output i'm very interested in, but make sure you have the latest mesa
19:09imirkin: and by 'latest' i mean 'git'
19:09buhman: how do I make a sane report for incorrect output?
19:10imirkin: apitrace + screenshot of incorrectness
19:10imirkin: what is it in, perhaps i've already looked at it
19:11imirkin: the white blocks?
19:11buhman: no, the whole scene is weird
19:11buhman: shadows and backdrop aren't where they're supposed to be
19:11imirkin: o... hm. i haven't noticed any oddity on any of my runs...
19:11buhman: layers in the wrong order
19:12imirkin: just a screenshot is probably good enough for that, no need for a trace
19:12imirkin: (since i can easily run it myself)
19:14buhman: hmm, maybe it's because it's doing <1fps
19:20buhman: imirkin: https://ptpb.pw/q6so.png https://ptpb.pw/6Mki.png
19:20buhman: I think that's actually an engine-related thing; it seems to just give up on drawing the frame after 1 second
19:23imirkin: well, i've been running it on a GF108 and getting decent framerates out of it
19:23imirkin: (like 4fps. heh.)
19:23imirkin: but i decrease the resolution a lot
19:23imirkin: like 640x480 or so
19:23imirkin: going at 2880x1620 without clocking your card up, i'm sure it's like 0fps :)
19:23buhman: well, nvidia can do like 20fps at those settings
19:24imirkin: did you clock your card up?
19:24imirkin: did the clock change take effect? check dmesg for errors
19:25imirkin: it'll often complain about a voltage change fail
19:25imirkin: but in any case, earlier indications are that clock-for-clock we're still *way* below nvidia
19:25imirkin: like 60-80% of the perf
19:26buhman: is anything said when it's successful?
19:26imirkin: no, but if you cat the file, the -- line should have the thing you expect
19:26imirkin: (or maybe it's the AC/DC line now)
19:27buhman: which file?
19:27imirkin: ... the pstate file
19:27imirkin: the one you echo things into to change clock speeds...
19:27imirkin: did you boot with nouveau.pstate=1 ?
19:28imirkin: if you don't reclock, you're stuck at boot clocks for your chip, which are likely to be like 50mhz on kepler. and very low memory speeds.
19:30buhman: https://ptpb.pw/KEle err that?
19:30imirkin: yeah. that.
19:30imirkin: if you're ready for a hang, try echo'ing 0f into that file
19:31imirkin: the 0a level is likely to work reliably though
19:31imirkin: 0f is likely to hang your box, but worth a try first to find out
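[editor's note] The pstate file discussed above lists one performance level per line, plus a line describing the current state. The following is a rough Python sketch of how such a listing could be read to check whether a clock change took effect. The line layout and the sample text are assumptions made for illustration, not the contents of buhman's paste; only the level names 0a and 0f and the memory speeds 1620 and 3008 come from the conversation. The file's location depends on the kernel version, so the sketch parses a string rather than opening a path.

```python
import re
from typing import Dict


def parse_pstate(text: str) -> Dict[str, Dict[str, str]]:
    """Map each level id to its clock domains.

    Assumes lines of the form 'id: name value MHz name value MHz ...',
    where a value may be a single number or a 'min-max' range.
    """
    levels: Dict[str, Dict[str, str]] = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or unrelated lines
        level, rest = line.split(":", 1)
        # Collect (domain, clock) pairs such as ('core', '324-540').
        levels[level.strip()] = dict(re.findall(r"(\w+) ([\d-]+) MHz", rest))
    return levels


# Hypothetical listing; the last line stands for the current state.
SAMPLE = """\
07: core 324 MHz memory 324 MHz
0a: core 324-540 MHz memory 1620 MHz
0f: core 540-1006 MHz memory 3008 MHz
AC: core 324 MHz memory 324 MHz
"""

levels = parse_pstate(SAMPLE)
current = levels["AC"]
print("current memory clock:", current["memory"], "MHz")
```

In this made-up listing the current-state line still shows the lowest clocks, which is what one would expect to see before any level has been written to the file.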
19:33buhman: well, that was fun
19:33buhman: it didn't just hang, but had some fun gitter before completely hanging
19:33imirkin: yeah, the memory doesn't come back properly :(
19:34imirkin: when it's your primary gpu it's a hang, on secondary i guess it can limp along for a little while
19:34imirkin: anyways, try 0a -- that's likely to work
19:34imirkin: it tends to be fine for most people
19:34buhman: it does
19:34imirkin: should make heaven go a lot faster
19:34buhman: if 1620 is fine, why is 3008 such a big deal?
19:34imirkin: things tend to be pickier at higher speeds
19:35buhman: it's ~1-2x faster
19:36buhman: ~1.5fps -> 2fps
19:36imirkin: hmmm... that's it? :(
19:36imirkin: it does have to copy those frames over to the intel device
19:36imirkin: but that can't be THAT slow
19:36buhman: nvidia can do it 10x faster ;p
19:36buhman: (and with more stuff, like tessellation)
19:37buhman:nudges imirkin to merge the juice
19:37imirkin: yeah, but that's with nvidia being the primary right?
19:37imirkin: yeah, so then it doesn't have to copy
19:37imirkin: at least not like that
19:37imirkin: er, not like it does with prime
19:37buhman: does nouveau support not-prime?
19:38imirkin: dunno... no laptop device with nvidia here, no clue how that stuff is all hooked up
19:38imirkin: in theory it should support the thing nvidia supports too, but dunno if that pans out in reality
19:42skeggsb_: the nvidia can't display at all on those laptops, binary driver or not
19:42skeggsb_: it's not wired up to the laptop panel
19:43skeggsb_: with a funky bios option, the DP output *can* be switched to the nvidia, but it's not the default, and it still leaves the panel on intel
19:43buhman: ooh really?
19:43skeggsb_: if it's the same as the w541, yes
19:43skeggsb_: and yeah, your kernel log shows no panel in the dcb table either, so i'd say it's the same
19:44skeggsb_: oh, and it's only the dock DP too, not the one on the laptop..
19:44skeggsb_: very messed up
19:44skeggsb_:doesn't like it
19:47buhman: imirkin: I think I should have said 'no' earlier
19:47buhman: I use 'xrandr --setprovideroutputsource modesetting NVIDIA-0' along with other magic
19:47imirkin: ah ok
19:54buhman: is vdpau decoding supposed to look like a vincent van gogh painting?
19:55imirkin: esp if you're watching a video of van gogh's paintings...
19:56imirkin: anyways, some h264 videos have issues
19:56imirkin: mostly works though
19:57buhman: -vo gl: https://ptpb.pw/vRLW.png -vo vdpau: https://ptpb.pw/drn0.png
19:57buhman: (same frame)
19:58imirkin: h264, right?
19:58imirkin: yeah... that happens in some videos
19:58imirkin: but not all!
19:59buhman: well, it also tears more than usual
19:59imirkin: that's the prime aspect of it... afaik there's no sync with prime
20:00buhman: I'm not convinced
20:01buhman: imirkin: if I step individual frames at like 1fps, it still tears
20:03buhman: does the prime sync involve possibly copying partial frames?
20:03buhman: surely there's some sort of double buffering
20:03imirkin: i wouldn't be quite so sure
20:03imirkin: airlied would be the expert
20:04skeggsb_: the intel driver doesn't support dma-buf fences, it has no way of knowing when the nvidia gpu has finished rendering
20:30imirkin: mogorva: well, looks like skydrift demo doesn't crash
20:30imirkin: anyways, i have a pretty good idea of how to fix that crash issue
20:31mogorva: imirkin: i verified that the game doesn't crash if I revert that commit
20:31imirkin: mogorva: yea
20:32mogorva: imirkin: so what's your idea?
20:33imirkin: mogorva: does this help? http://hastebin.com/minivekuga.diff
20:48mogorva: imirkin: steam doesn't even start with your patch, it crashes: http://pastebin.com/cPYzZQUt
20:52imirkin: note to self: test patches first
20:54imirkin: well... it's _something_ like that which is necessary
20:55imirkin: i'll have to figure out what exactly though
20:55mogorva: any idea why you got a different result (no crash in the app)?
20:57imirkin: coz it's a different app
20:57imirkin: oh wait. my system mesa is 10.5.x!
20:59imirkin: another equally untested patch: http://hastebin.com/pewodaquri.diff
20:59mogorva: would you mind retesting the demo in current git?
21:00imirkin: yea... i need to get a 32-bit build going
21:09mogorva: imirkin: the second patch works, steam is able to start and the game doesn't crash :)
21:11imirkin: does it render properly too?
21:15mogorva: AFAICS, yes it does
21:16mogorva: imirkin: do you still have skydrift demo installed?
21:17mogorva: could you try this demo on steam: http://store.steampowered.com/app/200050/ it's one of the unity-based games that crashes on start, unrelated to the crash in skydrift, but the crash is not present with the binary drivers
21:18imirkin: with wine i presume?
21:18mogorva: sure, in the same prefix where skydrift is installed
21:18mogorva: it doesn't produce a useful backtrace, probably its own exception handler catches the segfault on start
21:19imirkin: my steam is having some issues... not sure what i did to break it, but can you give me a steam://install/asdf url?
21:19imirkin: i tried steam://install/200050 but that didn't work
21:23mogorva: try 'wine Steam.exe -applaunch 200050'
21:23imirkin: nah... it's the browser that's broken
21:23mogorva: if the steam store page doesn't load in wine use 'winetricks corefonts'
21:24mogorva: wget https://raw.githubusercontent.com/Winetricks/winetricks/master/src/winetricks
21:25mogorva: chmod +x winetricks
21:25imirkin: i know about winetricks
21:25imirkin: i get an app crash
21:25imirkin: steamwebhelper.exe dies
21:26mogorva: sometimes it occurs here as well, but only on exit
21:26mogorva: does the browser work with the fonts installed?
21:26imirkin: mogorva: no
21:26mogorva: also make sure you have the default Windows XP profile selected in winecfg
21:27imirkin: joi: wtf... for some reason demmt is only decoding VP_START_ID for g84 traces instead of also FP_START_ID... looking at the code it *should* work though
21:27imirkin: oh yeah. i have windows 8 i think
21:27imirkin: let me switch that back
21:27imirkin: yeah, much better now
21:28imirkin: what game is 200050?
21:29imirkin: ok, got it. downloading
21:31mogorva: the crash in 200050 is a different problem, probably not a regression, and your advised patch doesn't fix it
21:39imirkin: joi: fp, data: 0x7f4bccd51010, anb: 0x40080000, m: 0x40000000
21:40imirkin: joi: because we allocate one codepage for all of fp/vp/gp
21:40imirkin: joi: what was wrong with the code that you had?
21:59imirkin: mogorva: hmmm... that game seems to work for me
21:59imirkin: mogorva: that was Naval War Arctic Circle right?
22:00imirkin: i didn't really play too much, but no crash... did i have to do something in particular?
22:00imirkin: it did kinda die on exit
22:00mogorva: no, it should crash on start
22:00imirkin: not with mesa 10.5.6
22:19mogorva: imirkin: i think i found it: it's when anti-aliasing is enabled that NWAC crashes with nouveau
22:21mogorva: in ~/Documents/My Games/Naval War/NWACPrefis.ini set antialiasing to 2 or 4, save the file and restart the demo
22:31imirkin: yeah, no idea what's happening there.
22:31imirkin: it's dying in something unrelated to graphics entirely
22:32imirkin: er wait, all my debuginfo is wrong. gr.
22:32imirkin: gdb doesn't detect the 32-bit-ness of things
22:44imirkin: joi: thoughts on http://hastebin.com/kazikewaci.coffee ?
22:44imirkin: that makes it start working for me
22:58imirkin: mogorva: ok, so for the thing where opt makes a difference
22:58imirkin: i'm going to have a series of patches for you
22:58imirkin: coz i don't have the hw plugged in, so this is a guess-and-check situation
23:00imirkin: er hmmmm.... crap. it's generating ops that it knows are bad :(
23:00imirkin: need to read the code harder
23:01imirkin: hm, or not
23:01mogorva: what do you mean by opt makes a difference? compiler optimizations? i'm ready to test your patches :)
23:01imirkin: mogorva: try this: http://hastebin.com/qupixujide.md
23:02imirkin: you had some set of games where running with optimizations produced crap results but disabling them made things work
23:02imirkin: (or perhaps just one game)
23:03mogorva: you mean https://bugs.freedesktop.org/show_bug.cgi?id=91056 ?
23:05imirkin: if that patch doesn't work, try setting all of that column to 0
23:05imirkin: i.e. all of the a column
23:06mogorva: your latest patch is for the game in that bug report, right?
23:08mogorva: did you find anything useful in the valgrind log that I created ?
23:08imirkin: yeah, that's what i'm basing my patch on :)
23:08imirkin: it starts with cvt rzi s32 $r0 f32 a[0x24]
23:09imirkin: which seems a little fishy. i dunno. probably ok. but good to check.
23:10mogorva: thanks for taking your precious time to help
23:10imirkin: np :)
23:20mogorva: imirkin: your patch doesn't resolve the rendering issue, nor does changing all values to '0' in column a
23:20imirkin: ok, so that's not it
23:21imirkin: could be that i'm even looking at the wrong shader =/
23:26imirkin: i wonder if it has something to do with the propagation of the indirect loads
23:26imirkin: let's try to forbid that
23:27imirkin: mogorva: ok, so instead of the other change, try: http://hastebin.com/biyogugizu.pl
23:38mogorva: imirkin: your latest patch works marvellously :) the flashing brownish polygons disappeared. I mean when shader opt is enabled
23:40imirkin: mogorva: grrrrr
23:40imirkin: so now to figure out which instruction actually hates these
23:54imirkin: i guess someone will need to run some tests to figure out exactly what is allowed and disallowed
23:54imirkin: and i suspect i'm going to be the one picking the short straw =/