00:08 HdkR: was
00:08 karolherbst: AndrewR: because a context is big
00:08 HdkR: Newer hardware context switches are fast :P
00:08 karolherbst: HdkR: it's still slow compared to a CPU contextswitch
00:08 HdkR: Oh yea, definitely
00:09 AndrewR: karolherbst, but still, I was under the impression nvidia hw was more multi-context than early ati hardware?
00:09 karolherbst: it is
00:10 karolherbst: it's still slow
00:10 AndrewR: so, basically not very useful, just there for some CAD apps or something?
00:10 karolherbst: you use it to separate different processes
00:10 karolherbst: because there you have nothing better
00:10 karolherbst: but inside the same process you can also context switch in software
00:12 AndrewR: karolherbst, yeah, but still, are applications not very well isolated in current nouveau, or even with this mt series applied? (not as the latest OGL robustness extensions require?)
00:13 AndrewR: (because I run seamonkey with OpenGL compositing, it worked ok for years ... but it was with single-process mozilla core ...)
00:20 AndrewR: (sorry, a bit sleepy now, might fall into bed now ... thanks everyone and have a relatively good time!)
00:37 karolherbst: imirkin: mhhhh, the join gets misplaced :/
00:38 karolherbst: so with your code we get a "not $p0 bra BB:11; join BB:12" and everything is fine
00:38 karolherbst: but without that explicit bra, we end up with "not $p0 join BB:11"
00:47 imirkin: yes.
00:48 karolherbst: unfortunate :/
00:49 imirkin: AndrewR: G92 should run DX9 games just fine. unfortunately we don't actually have reclocking working on there
00:53 karolherbst: imirkin: yeah... due to the lack of a better solution right now (and because the kepler code does the same): "Reviewed-by: Karol Herbst <kherbst@redhat.com>" for the series
00:55 imirkin: karolherbst: thanks
00:55 karolherbst: I still wished we had a better solution for some of the "hacks" we do :/
00:55 imirkin: yes.
00:56 imirkin: i have a similar wish.
00:56 imirkin: i might play some more with it
00:56 karolherbst: we might have to rethink/rework quite a lot for volta/turing anyway :/ but maybe that's best for later when skeggsb starts publishing stuff
00:58 karolherbst: I think the better way to enforce that we only have one explicit branch per BB is to create join BBs
00:58 karolherbst: or mhhh
00:59 karolherbst: I didn't look at how that joinat stuff actually works, but it seems to convert that conditional bra into a join, which isn't the right thing actually
01:00 karolherbst: but we want to have a new BB with a join if the BB ends with an implicit bra to the joined BB
01:11 imirkin: karolherbst: yeah, it's conceivable i mess something up with the joinAt
01:11 imirkin: or ... something
01:11 karolherbst: imirkin: no, I think the joinat handling code might be not entirely correct
01:11 karolherbst: or well
01:11 karolherbst: it worked for pre kepler
01:12 imirkin: fermi is same as kepler
01:12 karolherbst: but since maxwell only has this joined jump
01:12 karolherbst: uhm
01:12 karolherbst: I meant pre maxwell
01:12 imirkin: maxwell is still the same
01:12 imirkin: just have to have a separate NOP op
01:12 karolherbst: isn't a join on maxwell essentially a bra?
01:12 imirkin: SYNC == NOP.S essentially
01:12 imirkin: a join on fermi+ is a bra
01:13 imirkin: a join on nv50 is a "resume simd mode"
01:13 karolherbst: ohh, I see
01:13 imirkin: so we have "anterior" joins
01:13 imirkin: by default we place them nv50-style
01:13 imirkin: but then there's logic which moves them into the joining sections
01:14 karolherbst: ahh
01:15 karolherbst: can we have a "join add f32 $r0 $r1 $r1" on maxwell? I am never entirely sure about where this join stuff looks how
01:18 imirkin: no
01:18 imirkin: can only have a nop join on maxwell
01:19 imirkin: fermi/kepler can have a join flag on (almost) any op
01:19 karolherbst: okay, so that was the difference
01:19 karolherbst: mhh
01:19 karolherbst: so we might want to add a join nop if we encounter a conditional bra at the top of the joinat target?
01:19 karolherbst: uhm..
01:19 karolherbst: rephrasing
01:20 karolherbst: so we might want to add a join nop at the top of the joinat target if not all bras into the joinat target are explicit?
01:22 imirkin: well
01:22 imirkin: the way it works is it always adds an explicit join nop
01:22 imirkin: and then that gets combined with the previous instruction if possible
01:22 imirkin: even on fermi/kepler you can't stick a join onto *any* instruction
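
The scheme described here (always emit an explicit join nop, then fold it into the previous instruction where the hardware allows) can be sketched as a toy model. This is not nouveau code: the string-based instruction format, the `append_join` helper and the `NO_JOIN_FLAG` set are all hypothetical, and the chat only says that Fermi/Kepler accept a join flag on "(almost) any op" without listing the exceptions.

```python
# Toy model only; op names and NO_JOIN_FLAG are hypothetical placeholders.
NO_JOIN_FLAG = {"bar", "texbar"}  # assumed ops that cannot carry a join flag


def append_join(insns, arch):
    """Append an explicit 'join nop', then fold it into the previous op if allowed."""
    out = list(insns) + ["join nop"]
    if arch == "maxwell" or len(out) < 2:
        # Per the chat, Maxwell can only have the join on a nop.
        return out
    prev = out[-2]
    if prev.split()[0] in NO_JOIN_FLAG:
        # Even on Fermi/Kepler not every instruction can carry the flag.
        return out
    # Fermi/Kepler: the join becomes a flag on the previous instruction.
    return out[:-2] + ["join " + prev]


print(append_join(["add f32 $r0 $r1 $r1"], "kepler"))
print(append_join(["add f32 $r0 $r1 $r1"], "maxwell"))
```

On the assumed Kepler path the add absorbs the join; on Maxwell the join stays on its own nop, which matches the difference discussed above.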
01:23 karolherbst: ahh, right, I see it now
01:23 karolherbst: "not $p0 bra BB:11" + "join" -> "not $p0 join BB:11"
01:24 karolherbst: that just doesn't make much sense if join is inside BB:12, does it?
01:25 imirkin: well, it makes sense if the joinat is at BB:12
01:26 imirkin: er, BB:11
01:26 karolherbst: it's at BB:12
01:26 imirkin: i was just happy the damn thing started working :)
01:26 imirkin: i had yet another bug unrelated to all this
01:27 karolherbst: :/
01:27 karolherbst: hopefully fixing all that fixes random bugs as well
01:36 karolherbst: imirkin: sounds like NVC0LegalizePostRA::propagateJoin does that stuff?
01:36 karolherbst: that "if (exit->op == OP_BRA)" is a bit optimistic, isn't it?
01:37 karolherbst: should at least check if target.bb == bb, no?
01:37 imirkin: why?
01:37 imirkin: oh
01:37 imirkin: probably ;)
01:40 karolherbst: it passes now without that explicit branch, but the join just disappeared
01:40 karolherbst: ohh right
01:40 karolherbst: because it removes that join
01:41 karolherbst: mhhh
01:45 karolherbst: imirkin: maybe you should make it an explicit loop with precont and cont?
01:45 karolherbst: then I think "NVC0LegalizePostRA::tryReplaceContWithBra" would prevent that stuff from happening
01:46 karolherbst: or maybe not, dunno
03:57 imirkin: karolherbst: that propagateJoin() thing makes the assumption that the final bra of a bb goes to the joinAt location
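
The assumption described here, and the "target.bb == bb" check suggested earlier, can be illustrated with a small toy model. It is a sketch under stated assumptions, not the actual `NVC0LegalizePostRA::propagateJoin` code: the `Insn` and `Block` classes and the `check_target` switch are hypothetical, and the block numbers only mirror the BB:11/BB:12 case from the chat.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Insn:
    op: str                        # "bra", "join", ...
    target: Optional[int] = None   # destination block id for flow ops
    pred: Optional[str] = None     # predicate text, e.g. "not $p0"


@dataclass
class Block:
    id: int
    insns: List[Insn]


def propagate_join(pred_block: Block, join_block: Block, check_target: bool) -> bool:
    """Turn the final bra of pred_block into a join (toy model).

    With check_target=False the final bra is assumed to go to the join
    location; with check_target=True it is only converted if it really does.
    """
    if not join_block.insns or join_block.insns[0].op != "join":
        return False
    if not pred_block.insns:
        return False
    exit_insn = pred_block.insns[-1]
    if exit_insn.op != "bra":
        return False
    if check_target and exit_insn.target != join_block.id:
        return False
    exit_insn.op = "join"
    return True


# The case from the chat: the join sits in BB:12, the conditional bra goes to BB:11.
bb10 = Block(10, [Insn("bra", target=11, pred="not $p0")])
bb12 = Block(12, [Insn("join")])
print(propagate_join(bb10, bb12, check_target=True), bb10.insns[-1].op)
print(propagate_join(bb10, bb12, check_target=False), bb10.insns[-1].op)
```

Without the check, the model produces "not $p0 join BB:11" even though the join belongs to BB:12; with it, the bra is left alone.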
03:57 karolherbst: yeah, seems that way
03:58 karolherbst: why are joins actually needed here?
03:58 karolherbst: I mean, for the atom lowering
03:58 imirkin: pops the "join stack"
03:58 imirkin: oh
03:59 imirkin: well, you want them anytime you have divergence
03:59 karolherbst: mhh, I don't really know if that's _that_ important on newer archs though
03:59 imirkin: yeah, no clue
03:59 karolherbst: or what would be the downsides of it if they just join at the next time it's required
03:59 imirkin: from the sounds of it, not important on volta
03:59 imirkin: but afaik it's important everywhere else
04:00 karolherbst: right, for volta+ it's different again
04:00 karolherbst: mhh, k
04:00 imirkin: well -- when would they join?
04:00 imirkin: it'll just keep going in diverged mode
04:00 imirkin: until it hits a join
04:00 imirkin: or ... something
04:00 karolherbst: worst case, they never do
04:00 imirkin: right.
04:01 karolherbst: it sounds like it's way more important on tesla, I just simply don't really know how that all works out on later gens
04:01 karolherbst: but probably it's not that great
04:03 karolherbst: anyway, we need to be more careful and only allow that join merging if the CFG is structured all paths go through a bra
04:03 joepublic: So I can clock this GeForce 7200 GS to go (much) faster than the default state by requesting state 20 - but I can't slow it back down. Is that normal?
04:03 imirkin: not really
04:03 karolherbst: *and all paths
04:03 imirkin: joepublic: the lower modes don't work?
04:04 imirkin: note that the boot state isn't recoverable via reclocking (necessarily)
04:04 joepublic: by default, it shows a 20 and an AC, write 20 and it shows a 20* and a DC, writing AC or DC is a no-go.
04:05 karolherbst: imirkin: well, if all paths go through a bra to that joinat target, we can replace all bras with joins, right?
04:05 karolherbst: joepublic: AC/DC is your power supply
04:06 imirkin: joepublic: all you see is 20? no other perf levels?
04:06 joepublic: then it starts in a default mode which is slow, and lists only mode 20 which is about 3x as fast.
04:06 imirkin: yeah
04:06 joepublic: just the 20 and the AC or DC
04:06 imirkin: the boot clocks can be whatever, not even a defined level
04:06 imirkin: but we reclock to specific parameters specified in the vbios perf tables
04:06 imirkin: in your case, you only have 1
04:07 imirkin: AC/DC shows the current state.
04:07 joepublic: Ok. I understand. Thanks for your knowledge and info.
04:20 AndrewR: imirkin, I think in my case it is more about wine (3.21) + nouveau + game making something ..not optimal .. I can partially reclock this GPU with simple one-line hack: https://pastebin.com/ZWNT789p
04:22 imirkin: yeah, could be we're doing something dumb
04:22 imirkin: or the game triggers some ... unfortunate behavior
04:22 imirkin: did it become worse recently? i pushed some patches to, uh, improve ... register allocation on nv50
04:23 AndrewR: imirkin, https://torrent-igruha.org/344-01-mafiya-2.html - if you have some 8gb of free space you can try for yourself ... (it's all in russian, not sure if the language switcher works with wine ... so, use your intuition).
04:23 imirkin: i.e. make it return correct results
04:23 imirkin: AndrewR: i'm russian
04:23 imirkin: so i'll probably work it out
04:23 AndrewR: imirkin, I only tried it ..yesterday
04:24 AndrewR: imirkin, well, not all russians (by birthname/location) like/know/want to use their first lang ....
04:24 imirkin: :)
04:24 imirkin: fair enough. in my case, i'm quite comfortable with it.
04:25 imirkin: but ... not sure i have time to deal with this. however if it's a DX9 game, you should try the nine state tracker, it might fare better
04:26 AndrewR: imirkin, yeah, time to rebuild new mesa and new (another) brand of wine ... but also, not right now, probably (wanna eat first)
04:54 AndrewR: also, for seeing kind of artefacts I saw on virtio-gpu on nouveau, you can download (warning, 2.5 gb) and play with it a bit ....
05:01 AndrewR: like: qemu-system-x86_64 -enable-kvm -m 1G -display sdl,gl=on -soundhw es1370 -usb -usbdevice mouse -vga virtio -smp 2 -cdrom /mnt/sdb1/slax-29-11-2018-test0.iso
05:01 AndrewR: unfortunately, it uses 4.12.0 (!) kernel, I was too lazy to update this part regularly ...
07:42 AndrewR: so, building with this patch on top of staging-3.21 https://github.com/sarnex/wine-d3d9-patches
07:42 AndrewR: because build from https://slackbuilds.org/repository/14.2/system/wine-staging/ apparently was missing nine patches!
08:06 AndrewR: it says: Native Direct3D 9 is active but ... mafia2 just quits after showing splash screen ....
12:29 pabs3: is this kernel warning interesting at all? http://paste.debian.net/hidden/92ac08f5/
13:48 karolherbst: pabs3: yeah, it shouldn't happen
13:48 karolherbst: but it doesn't seem to cause any issues afaik
13:49 karolherbst: AndrewR: does csmt_force=0 help?
13:50 karolherbst: AndrewR: apparently nine uses threading and nouveau ain't that great there
13:53 AndrewR: karolherbst, no, same brief black screen and then back to terminal.
13:53 karolherbst: mhh any crash report?
13:53 karolherbst: csmt_force as an environment variable I meant... or maybe disable csmt inside winecfg? not quite sure how all that works out nicely
13:56 AndrewR: https://pastebin.com/dtuwT0tc
14:00 karolherbst: AndrewR: and it just quits?
14:01 AndrewR: karolherbst, yes :/
14:02 karolherbst: yeah no clue. If there would be a crash, I could look into it but this way? Maybe ask inside #d3d9 ?
14:22 orbea: AndrewR: turn off vsync in the game maybe
14:23 AndrewR: orbea, already tried, apparently no (positive) effect
14:23 orbea: hmm, there was an xorg bug that got fixed a while ago, maybe it's different
14:24 orbea: you could try making sure you have recent versions of libdrm, mesa and xorg-server, it might help
14:32 karolherbst: there are sometimes those weirdo games exiting for no apparent reason at all :/
14:32 karolherbst: mostly requires a full wine trace to figure out what's wrong
14:32 orbea: yea...
14:32 karolherbst: maybe there is a game log?
14:43 AndrewR: karolherbst, no, at least not in game directory ...
14:44 orbea: AndrewR: could also try ~/.local/share or something
14:55 AndrewR: orbea, nothing interesting there, some wine associations.. i think I'll delay any debug for now. It sort of work with normal wine, and I want to sleep too much.
14:55 AndrewR: thanks all.
15:07 Tom^: AndrewR: https://forum.winehq.org/viewtopic.php?p=50202#p50202
15:07 Tom^: AndrewR: seems like a broken game in wine. :p
15:10 orbea: but AndrewR said it worked in normal wine for him so... :P
15:11 Tom^: oh i see
15:49 imirkin: skeggsb: bah! we don't have the "disable" mask at the time we call nvkm_lockvgac() in nvkm_devinit_preinit.
15:50 imirkin: skeggsb: why not do it earlier? PMC enable bits?
16:36 karolherbst: imirkin: no changes in shader-db with that constantfolding fix
16:36 karolherbst: actually, I would be super surprised if it would have :/ but I think even more surprised nothing actually broke
16:37 karolherbst: I mean, why did nobody notice before
16:39 AndrewR: GALLIUM_HUD=fps,samples-passed,instructions wine /mnt/sdb1/Mafia\ 2/pc/mafia2.exe resulted in like 23 MSamples and nearly 300 Minstructions for 1440x900 screen .... is it normal?
16:40 karolherbst: well, more or less
16:41 karolherbst: depending on what you mean by normal
16:41 karolherbst: for demanding games? sure
16:41 karolherbst: but 300M is quite low for those
16:41 karolherbst: things are getting interesting for 1G+
16:43 karolherbst: AndrewR: I am sure we could do better actually
16:43 karolherbst: but that means looking at the shaders and see where we could do potential optimizations
16:44 karolherbst: but the benefit is quite low
16:58 imirkin: karolherbst: i didn't expect there to be changes. a change would have been a sign of a bug in your change which neither of us noticed.
16:58 imirkin: however since you're messing with the flow of the opt, who knows :)
16:59 imirkin: skeggsb: btw, see my analysis of the issue here - https://bugs.freedesktop.org/show_bug.cgi?id=108980#c7
17:02 karolherbst: imirkin: well, use after free can lead to all kinds of weirdo bugs though, although here it was pretty unlikely to happen :/ if there had been a change, that would have meant that something odd was going on because that should have never happened
17:03 imirkin: yeah
17:03 imirkin: i would have wanted any changes to be analyzed under a microscope
17:03 imirkin: but as there were none... :)
17:03 karolherbst: yeah
17:05 imirkin: AndrewR: is that 23 MSamples/s or /frame?
17:05 imirkin: 1440*900 is 1.3M pixels
17:06 imirkin: and a DX9 game probably has a multi-pass renderer
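
The arithmetic behind the 1.3M figure, and what the reported 23 MSamples would mean per pixel, can be worked out as below. Treating the HUD value as a per-frame count is an assumption; the chat leaves open whether it is per second or per frame.

```python
# Figures quoted in the chat; that 23 MSamples is a per-frame count is an assumption.
WIDTH, HEIGHT = 1440, 900
SAMPLES_PASSED = 23_000_000

pixels = WIDTH * HEIGHT
samples_per_pixel = SAMPLES_PASSED / pixels

print(f"{pixels} pixels on screen")
print(f"{samples_per_pixel:.1f} samples passed per pixel")
```

Under the per-frame assumption this is roughly 18 samples passed per screen pixel, which is the kind of ratio a multi-pass renderer could produce. If the counter is per second instead, the per-frame figure depends on the frame rate.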
17:13 karolherbst: imirkin: btw, do I get a r-by from you for that patch as well? Or is there something left I should look into?
17:13 imirkin: didn't i have some feedback? or is that all addressed now?
17:14 imirkin: "Please make this whole (outer) if/else sequence have { }. Even though"
17:15 karolherbst: ohh, I have that fixed locally, right...
17:17 karolherbst: imirkin: https://github.com/karolherbst/mesa/commit/cf4b03f35466ef61021afe149772710ff72a3ef6
17:18 imirkin: karolherbst: r-b
17:18 karolherbst: thanks
20:37 imirkin: Lyude: this happened on driver unbind with 4.19.8: http://paste.debian.net/hidden/fb718bb0/
20:38 imirkin: and the initial issue is triggered by a DP 1.1 -> 1.2 switch on the dell monitor
20:39 Lyude: imirkin: huh, can you poke me tomorrow when I'm at work and have access to my nvidia hw?
20:39 imirkin: (which also makes the panel hang off of DP-3-8 instead of the DP-3 root)
20:39 Lyude: i'm looking at a bunch of mst stuff already
20:40 imirkin: yeah, that's why i'm telling you about it :)
20:40 imirkin: i'm not looking for you to do anything immediately, but more of an fyi
20:46 Lyude: imirkin: I appreciate it, thanks!
23:11 imirkin: Lyude: i just pushed some changes for the nouveau ddx which should make DP-MST work better
23:11 imirkin: the only crash i managed to get was also when nouveau kernel module wigged out when switching between DP 1.1 and DP 1.2
23:12 imirkin: so i figure that's acceptable
23:12 imirkin: (esp since i can't repro it)
23:13 imirkin: i'd appreciate it if you could just use it as your main ddx when working on nouveau mst stuff, and let me know if there are any issues
23:26 Lyude: imirkin: honestly
23:26 Lyude: most of the time I'm not even using x
23:27 pabs3: karolherbst: it happened here only once, but it indeed causes no visible issues (yet).
23:29 karolherbst: pabs3: a use after free inside codegen?
23:31 pabs3: ok, I wasn't sure how to interpret the warning, but your explanation makes sense
23:33 karolherbst: pabs3: ohh, you were referring to that dmesg warning?
23:33 pabs3: yes
23:33 karolherbst: right... that's something we might want to fix, but yeah..
23:34 pabs3: any other info I should supply? (it's Linux+mesa from Debian testing)
23:38 imirkin: Lyude: ok. well, if you do get a chance, esp to retest the scenarios where you had experienced crashes
23:38 imirkin: i did make a number of fixes
23:38 imirkin: so it's not the same code you had tested earlier
23:40 imirkin: it's pushed on master now, on the off chance someone actually follows that
23:40 imirkin: i'd like to make a proper release some time next week
23:46 pabs3: sorry, I got disconnected