04:33 imirkin: skeggsb: any insight into https://bugs.freedesktop.org/show_bug.cgi?id=98149 ?
04:34 imirkin: skeggsb: for some reason we're no longer picking up the BIOS FP mode
04:34 imirkin: i've looked through the code, but the conditions are ... complex. and nvbios doesn't have the relevant fields decoded, as far as i can tell
06:58 mupuf: ok, now all the machines do boot with wtrpm and they all have access to the NAS
06:58 mupuf: I wonder if I can put ssh keys in ldap though
06:58 mupuf: otherwise, I have to update multiple copies of the same .authorized_keys file
11:41 lapion: Does anyone in here have a system with 8GB ram and an nvidia card that has non-relocatable ram-window at the 7-8GB ?
11:41 lapion: and nvidia fx1800 /
11:43 lapion: my system keeps freezing up. if I have more than 6GB installed
12:58 RSpliet: hah... today I learned. Tegra K1 on L4T doesn't support OpenCL
12:59 RSpliet: pmoreau: we can do better than that, can't we? :-D
13:01 mupuf: RSpliet: hehe, nice
13:03 pmoreau: RSpliet: Yes we can! Well, at some point in the future that is… :-D
13:04 mupuf: I'm sure they said the same :D
13:05 pmoreau: Eh eh
13:11 RSpliet: oh... same for the TX1
13:11 RSpliet: >_>
16:43 pmoreau: Loads (from global memory) with an offset aren’t possible on pre-Pascal hardware, are they?
16:46 imirkin: erm
16:46 imirkin: they should be possible
16:46 imirkin: /*0cd0*/ LD.E R20, [R2+0x124]; /* 0xc4800000921c0850 */
16:47 imirkin: [that's with SM35, but there's an encoding for it on SM20 as well]
16:47 pmoreau: Oh ok. Then, maybe the compiler doing weird stuff.
18:04 imirkin: skeggsb: what's the implication of that bar thing?
18:06 lapion: Does anyone in here have a system with 8GB ram and an nvidia card that has non-relocatable ram-window at the 7-8GB ?
18:07 lapion: my laptop with nvidia quadro fx1800m keeps freezing up. if I have more than 6GB installed
18:08 imirkin: lapion: shouldn't the ram get relocated somewhere up high by the bios?
18:12 imirkin: anyways, take my comment with a grain of salt - i'm pretty weak on all the PCI memory window details, and who's supposed to configure what.
18:13 imirkin: you may consider asking linux-pci@vger.kernel.org - they know pci :)
18:13 imirkin: [or put it another way, if they don't know, no one does]
18:48 pmoreau: skeggsb: Ping for the backlight stuff (both Maxwell+Pascal support and the unique sysfs/debugfs interface fix)
18:51 imirkin: mwk: when's the HDL version coming? :)
18:51 mwk: imirkin: :p
18:54 imirkin: pretty soon you'll be making ASICs
18:55 mwk: wonders how many NV3s could fit in the area of a modern GPU
18:55 ajax: a mere 3.5M transistors in an nv3
18:56 imirkin: yeah, but they were bigger...
18:56 imirkin: they don't make 'em like they used to
19:05 imirkin: Riastradh: did you have a chance to put something together re netbsd for the wiki?
19:35 Riastradh: imirkin: 'Fraid not. But I have it on my todo list and might have time tonight. (And feel free to poke me again if I haven't gotten to it.)
19:35 imirkin: will do
19:35 imirkin: doesn't have to be a work of art btw
19:35 karolherbst: imirkin: anything I can help with regarding the one issue I've found?
19:35 imirkin: just has to be better than a blank page :)
19:35 imirkin: karolherbst: keep pinging me...
19:35 karolherbst: will do
19:35 imirkin: karolherbst: remind me which piglit it was?
19:36 karolherbst: uhhh, hakzsam actually listed them in his mail
19:36 imirkin: karolherbst: yes, but if you want me to look at stuff, the way to do that isn't to tell me to look through my email for something
19:37 karolherbst: right
19:37 imirkin: also patchwork link to your patch
19:37 karolherbst: spec/glsl-4.30/execution/built-in-functions/cs-all-bvec{2,3,4}-using-if
19:37 karolherbst: spec/glsl-4.30/execution/built-in-functions/cs-any-bvec{2,3,4}-using-if
19:37 imirkin: one's enough
19:37 karolherbst: I've told you about the cs-all-bvec4-using-if first
19:38 karolherbst: no clue if the others are much different
19:38 imirkin: i'll look at one, and move up from there.
19:38 karolherbst: thanks
19:38 imirkin: where's the patch?
19:38 imirkin: (patchwork link)
19:39 karolherbst: uhm, never use patchwork, might take a while
19:39 imirkin: nvm, i'll find it
19:39 karolherbst: https://patchwork.freedesktop.org/patch/114182/
19:39 imirkin: thanks
19:48 imirkin: karolherbst: looks like a legit RA issue. still investigating.
19:52 orbea: Maybe this is a GLES3 nouveau bug? I can't reproduce it on intel and the glupen64 dev can't reproduce it on nvidia. trace - http://ks392457.kimsufi.com/orbea/stuff/trace/libretro/Zelda%20OoT%2064_GLES3.trace.xz issue report: https://github.com/loganmc10/GLupeN64/issues/74
19:53 imirkin: orbea: unlikely to be anything GLES3-specific
19:53 imirkin: all that stuff is handled by a shared frontend
19:53 orbea: ah
19:53 imirkin: entirely possible that the backend's GLES3 mode triggers some sort of fail in nouveau though
19:54 orbea: the other idea is to try an older mesa which my laptop has
19:54 imirkin: feels like blending gone wrong
19:55 imirkin: orbea: basically bugs can live in one of 3 places -
19:55 imirkin: (a) mesa core, (b) st/mesa, (c) the backend driver
19:55 imirkin: intel doesn't use st/mesa
19:56 imirkin: so if it works on intel, chances are mesa core is fine
19:56 imirkin: the next step is to test a different gallium driver, such as llvmpipe
19:56 imirkin: if that works, the issue is in the backend driver (e.g. nouveau). if it fails similarly, either llvmpipe and nouveau have the same bug, or it's a bug in st/mesa
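A rough sketch of the triage imirkin describes above, written as a small Python function. The function name, its arguments, and the returned strings are made up for illustration; only the decision logic comes from the chat, and the branch for "fails on intel too" is an assumption, since the chat only covers the case where intel works.

```python
def triage(works_on_intel: bool, works_on_llvmpipe: bool) -> str:
    """Guess where a rendering bug lives: mesa core, st/mesa, or the backend.

    Hypothetical helper that mirrors the reasoning in the chat.
    """
    if not works_on_intel:
        # Assumption: the chat only says that a pass on intel (which does
        # not use st/mesa) makes mesa core unlikely. A failure there
        # leaves mesa core as a suspect.
        return "mesa core not ruled out"
    if works_on_llvmpipe:
        # Another gallium driver renders correctly, so st/mesa is
        # probably fine and the backend driver (e.g. nouveau) is suspect.
        return "backend driver"
    # llvmpipe fails in a similar way: either both gallium drivers share
    # the bug, or the bug is in st/mesa.
    return "st/mesa, or the same bug in both gallium drivers"


if __name__ == "__main__":
    # orbea's case: fine on intel, fine on llvmpipe, broken on nouveau.
    print(triage(works_on_intel=True, works_on_llvmpipe=True))
```

As imirkin notes a bit later, in rare cases differing driver features can send st/mesa down different code paths, so the result is a likely location rather than a certain one.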
19:56 orbea: I'll have to look into compiling llvm into my mesa I guess. Last time I did a while ago it did not go well
19:56 imirkin: well, i'm checking it out now
19:57 orbea: will still be useful for the future :)
19:57 imirkin: llvmpipe is a lot slower than nouveau =/
19:57 imirkin: looks like everything works fine with llvmpipe
19:57 orbea: so probably backend?
19:57 imirkin: that is the most likely situation
19:58 imirkin: in very rare scenarios it could be diff features of llvmpipe/nouveau causing st/mesa to take different logic paths
19:58 imirkin: now to use retracediff to nail down which calls are causing the fail
20:00 imirkin: aha, also the fail doesn't happen on nv50
20:01 orbea: interesting
20:01 imirkin: and the two rasterizers are a lot more similar, so it's a better comparison than to llvmpipe (for retracediff purposes)
20:02 imirkin: hm. let's hope.
20:03 imirkin: aha. i was getting a ton of diffs, but turned out the nv50 board was just in a funky state that happens sometimes.
20:05 imirkin: looks like call 64635 in that trace is the first bad one...
20:07 imirkin: which of course happens to enable blend right before it
20:21 imirkin: weird.
20:30 karolherbst: also ping on this series: https://patchwork.freedesktop.org/series/13485/
20:31 karolherbst: hakzsam: if you feel I left something important out from the comments you felt strong about, just tell me
20:32 hakzsam: I will have a second look tomorrow :)
20:32 karolherbst: awesome
20:32 hakzsam: still debugging that fucking F1 2015 game
20:32 karolherbst: but it seems like you are getting somewhere
20:33 imirkin: karolherbst: right, i see what's going on
20:33 karolherbst: awesome :)
20:33 imirkin: long story short - longstanding bug that i tried to fix a long time ago, but did slightly incorrectly
20:34 imirkin: specifically this change: dfb0ca16065c1d251101bb094f2cfd08cf3cda15
20:35 karolherbst: mhhh
20:35 imirkin: the *real* problem is that you can't merge phi values
20:35 imirkin: it's just not possible.
20:35 karolherbst: :(
20:36 imirkin: so something needs to stick a mov in
20:36 karolherbst: that makes me quite unhappy
20:36 karolherbst: ohh
20:36 karolherbst: okay
20:36 karolherbst: so a fake mov instead
20:36 karolherbst: and just get the phi defs be used by nothing?
20:36 karolherbst: or can that phi be removed?
20:37 imirkin: not fake
20:37 imirkin: real mov.
20:37 imirkin: your change is fine though
20:37 imirkin: but we need to fix RA first
20:37 karolherbst: well
20:37 imirkin: i'm pretty sure it was possible to hit it without your change as well
20:37 karolherbst: I am sure it will work if we replace it with a mov though
20:38 imirkin: to be clear, i'm talking about OP_MERGE
20:38 karolherbst: ohh I see
20:41 karolherbst: so in the end, RA has the issue and my patch should be fine? Or should CSE put a mov whenever it tries to merge phis?
20:43 imirkin: CSE is fine.
20:43 imirkin: LoadPropagation and condenseSrcs need to be careful.
20:48 imirkin: karolherbst: http://hastebin.com/irogiyibar.php
20:48 imirkin: i need to do a quick audit of the remaining code
20:49 imirkin: but this fixes up the case you gave
20:51 imirkin: otoh i dunno
20:52 karolherbst: I hate it when patch produces those .orig files even when the patch applied just fine :(
20:52 imirkin: i have another theory
20:52 imirkin: yeah ok. this fixes it too
20:52 imirkin: but is kinda lame :(
20:52 karolherbst: :D
20:53 imirkin: http://hastebin.com/hofuheqodi.hs
20:53 karolherbst: :D
20:53 karolherbst: it is
20:54 imirkin: yeah, so i think that's the "correct" fix actually
20:54 karolherbst: this changes the reg id though, needs a more careful thought maybe?
20:54 karolherbst: mhh
20:54 karolherbst: maybe it is fine, no idea what that functions does anyway
20:54 imirkin: but i like my first fix better
20:55 karolherbst: mhh
20:55 karolherbst: which is more future proof?
20:55 karolherbst: well, I will run a full shader-db with both patches
20:55 karolherbst: and see what is the difference
20:55 imirkin: ideally it would have RA'd better in the second situation
20:55 imirkin: but for some reason it didn't :(
20:56 imirkin: i think there's a bunch of low-hanging fruit in that area
20:56 karolherbst: yeah
20:57 karolherbst: there are
20:57 imirkin: where we can do dumb things to improve certain common situations
20:57 karolherbst: we should implement some layouting code for those wide reg things as well
20:57 imirkin: so yeah. i think i'm gonna stick with the second patch for now.
20:57 imirkin: ?
20:58 karolherbst: like exports
20:58 karolherbst: where RA adds silly movs
20:58 imirkin: mmm... that should be handled. when it goes wrong, it's going to be quite difficult to fix.
20:59 karolherbst: I encountered that so many times
20:59 karolherbst: sometimes I write passes which should only improve stuff
20:59 karolherbst: but then RA messes up and adds silly movs for tex/export/other things just to make those d/t/q sources happy
21:01 karolherbst: imirkin: your short patch has no difference at all
21:01 karolherbst: the big one: "total gprs used in shared programs : 379277 -> 379251 (-0.01%)"
21:01 karolherbst: "total instructions in shared programs : 2818666 -> 2818677 (0.00%)"
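For reference, the percentages in the two report lines above can be reproduced from the raw totals. This is a minimal sketch, assuming the report shows relative change rounded to two decimals; `pct_change` is a hypothetical helper, not part of shader-db.

```python
def pct_change(before: int, after: int) -> str:
    """Relative change from `before` to `after`, as a two-decimal percentage."""
    change = (after - before) / before * 100.0
    return f"{change:.2f}%"


if __name__ == "__main__":
    # Totals quoted in the chat for the larger patch.
    print("gprs:        ", pct_change(379277, 379251))
    print("instructions:", pct_change(2818666, 2818677))
```

In absolute terms that is 26 fewer GPRs and 11 more instructions across the whole shader set, so the instruction change is too small to show up at two decimals.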
21:03 imirkin: yeah, that seems likely.
21:07 imirkin: karolherbst: http://hastebin.com/operagepix.php -- that should help a bunch i suspect
21:07 imirkin: need to think about whether it's correct or not
21:09 karolherbst: mhh
21:09 karolherbst: the benefit on my patch didn't change at all
21:09 karolherbst: ohh
21:10 karolherbst: instruction count isn't that good, meh
21:11 karolherbst: imirkin: choose: https://gist.github.com/karolherbst/a61a86e4cb9dfb2e4097b2900fa36c70
21:13 imirkin: karolherbst: i'm trying to work out a potential improvement...
21:16 karolherbst: k
21:16 karolherbst: and your last patch crashes shaders
21:16 karolherbst: run: ../../../../../src/gallium/drivers/nouveau/codegen/nv50_ir_util.cpp:197: void nv50_ir::Interval::unify(nv50_ir::Interval&): Assertion `this != &that' failed
21:17 imirkin: heh
21:17 imirkin: i have some updates
21:17 imirkin: sec
21:19 imirkin: http://hastebin.com/rosulaporu.php
21:19 imirkin: how about this?
21:20 karolherbst: well, it doesn't crash yet :)
21:20 imirkin: maybe soon :)
21:23 karolherbst: uhh
21:23 karolherbst: looks like a jackpot
21:24 karolherbst: https://gist.github.com/karolherbst/78c0439d81cfdaffd70fc806a4e68c1d
21:24 karolherbst: hihi: helped inst ../nvidia_shaderdb/gputest_pixmark_piano/7.shader_test - 1 3744 -> 3723
21:24 imirkin: neat-o
21:24 karolherbst: more perf
21:25 RSpliet: perfect, I love playing pixmark piano :-P
21:25 imirkin: funnest game ever
21:25 karolherbst: imirkin: let me check the hurts though
21:25 karolherbst: it may be something silly again
21:26 karolherbst: imirkin: hurt inst ../nvidia_shaderdb/bioshock_infinite/22.shader_test - 0 48 -> 54 :D
21:26 karolherbst: that shader doesn't like you
21:26 karolherbst: yo :D
21:27 RSpliet: karolherbst: think you can obtain old vs new assembly too?
21:27 karolherbst: mhh
21:27 karolherbst: silly SSO thing
21:28 imirkin: perhaps i don't like that shader :p
21:28 karolherbst: https://gist.github.com/karolherbst/65ff2144b36701573138999bf2bfd09c
21:29 karolherbst: 13 vs 13+14 in post ra
21:29 karolherbst: this looks odd
21:30 karolherbst: uhh, more locals
21:30 karolherbst: I bet it is the orbital again
21:30 karolherbst: uhh
21:30 karolherbst: actually not
21:31 karolherbst: ../nvidia_shaderdb/talos_principle/3945.shader_test - type: 1, local: 48, gpr: 61, inst: 718, bytes: 6568
21:31 imirkin: yeah, so ... right. that can happen.
21:32 karolherbst: funny though: helped gpr ../nvidia_shaderdb/talos_principle/3945.shader_test - 1 62 -> 61
21:32 karolherbst: hurt local ../nvidia_shaderdb/talos_principle/3945.shader_test - 1 36 -> 48
21:32 karolherbst: I guess that's our layout issue again or something silly like that
21:32 imirkin: it can be a lot of things
21:32 imirkin: those random moves help sometimes
21:34 karolherbst: I mean it could use a few more regs instead of doing spilling
21:35 imirkin: if only it were so simple.
21:41 imirkin: anyways, all this stuff is pretty questionable...
21:42 karolherbst: right
21:43 karolherbst: well, I will head to bed now anyway
22:32 imirkin: anyone have a non-GK110/GK208 nvc0+ handy?
22:33 imirkin: and could check orbea's trace?
22:35 hakzsam: imirkin, send me an email, I can check tomorrow