04:33imirkin: skeggsb: any insight into https://bugs.freedesktop.org/show_bug.cgi?id=98149 ?
04:34imirkin: skeggsb: for some reason we're no longer picking up the BIOS FP mode
04:34imirkin: i've looked through the code, but the conditions are ... complex. and nvbios doesn't have the relevant fields decoded, as far as i can tell
06:58mupuf: ok, now all the machines do boot with wtrpm and they all have access to the NAS
06:58mupuf: I wonder if I can put ssh keys in ldap though
06:58mupuf: otherwise, I have to update multiple copies of the same .authorized_keys file
11:41lapion: Does anyone in here have a system with 8GB ram and an nvidia card that has non-relocatable ram-window at the 7-8GB ?
11:41lapion: and nvidia fx1800 /
11:43lapion: my system keeps freezing up if I have more than 6GB installed
12:58RSpliet: hah... today I learned. Tegra K1 on L4T doesn't support OpenCL
12:59RSpliet: pmoreau: we can do better than that, can't we? :-D
13:01mupuf: RSpliet: hehe, nice
13:03pmoreau: RSpliet: Yes we can! Well, at some point in the future that is… :-D
13:04mupuf: I'm sure they said the same :D
13:05pmoreau: Eh eh
13:11RSpliet: oh... same for the TX1
16:43pmoreau: Loads (from global memory) with an offset aren’t possible on pre-Pascal hardware, are they?
16:46imirkin: they should be possible
16:46imirkin: /*0cd0*/ LD.E R20, [R2+0x124]; /* 0xc4800000921c0850 */
16:47imirkin: [that's with SM35, but there's an encoding for it on SM20 as well]
16:47pmoreau: Oh ok. Then maybe the compiler is doing weird stuff.
18:04imirkin: skeggsb: what's the implication of that bar thing?
18:06lapion: Does anyone in here have a system with 8GB ram and an nvidia card that has non-relocatable ram-window at the 7-8GB ?
18:07lapion: my laptop with nvidia quadro fx1800m keeps freezing up if I have more than 6GB installed
18:08imirkin: lapion: shouldn't the ram get relocated somewhere up high by the bios?
18:12imirkin: anyways, take my comment with a grain of salt - i'm pretty weak on all the PCI memory window details, and who's supposed to configure what.
18:13imirkin: you may consider asking firstname.lastname@example.org - they know pci :)
18:13imirkin: [or put it another way, if they don't know, no one does]
18:48pmoreau: skeggsb: Ping for the backlight stuff (both Maxwell+Pascal support and the unique sysfs/debugfs interface fix)
18:51imirkin: mwk: when's the HDL version coming? :)
18:51mwk: imirkin: :p
18:54imirkin: pretty soon you'll be making ASICs
18:55mwk:wonders how many NV3s could fit in the area of a modern GPU
18:55ajax: a mere 3.5M transistors in an nv3
18:56imirkin: yeah, but they were bigger...
18:56imirkin: they don't make 'em like they used to
19:05imirkin: Riastradh: did you have a chance to put something together re netbsd for the wiki?
19:35Riastradh: imirkin: 'Fraid not. But I have it on my todo list and might have time tonight. (And feel free to poke me again if I haven't gotten to it.)
19:35imirkin: will do
19:35imirkin: doesn't have to be a work of art btw
19:35karolherbst: imirkin: anything I can help with regarding the one issue I've found?
19:35imirkin: just has to be better than a blank page :)
19:35imirkin: karolherbst: keep pinging me...
19:35karolherbst: will do
19:35imirkin: karolherbst: remind me which piglit it was?
19:36karolherbst: uhhh, hakzsam actually listed them in his mail
19:36imirkin: karolherbst: yes, but if you want me to look at stuff, the way to do that isn't to tell me to look through my email for something
19:37imirkin: also patchwork link to your patch
19:37imirkin: one's enough
19:37karolherbst: I've told you about the s-all-bvec4-using-if first
19:38karolherbst: no clue if the others are much different
19:38imirkin: i'll look at one, and move up from there.
19:38imirkin: where's the patch?
19:38imirkin: (patchwork link)
19:39karolherbst: uhm, never use patchwork, might take a while
19:39imirkin: nvm, i'll find it
19:48imirkin: karolherbst: looks like a legit RA issue. still investigating.
19:52orbea: Maybe this is a GLES3 nouveau bug? I can't reproduce it on intel and the glupen64 dev can't reproduce it on nvidia. trace - http://ks392457.kimsufi.com/orbea/stuff/trace/libretro/Zelda%20OoT%2064_GLES3.trace.xz issue report: https://github.com/loganmc10/GLupeN64/issues/74
19:53imirkin: orbea: unlikely to be anything GLES3-specific
19:53imirkin: all that stuff is handled by a shared frontend
19:53imirkin: entirely possible that the backend's GLES3 mode triggers some sort of fail in nouveau though
19:54orbea: the other idea is to try an older mesa which my laptop has
19:54imirkin: feels like blending gone wrong
19:55imirkin: orbea: basically bugs can live in one of 3 places -
19:55imirkin: (a) mesa core, (b) st/mesa, (c) the backend driver
19:55imirkin: intel doesn't use st/mesa
19:56imirkin: so if it works on intel, chances are mesa core is fine
19:56imirkin: the next step is to test a different gallium driver, such as llvmpipe
19:56imirkin: if that works, the issue is in the backend driver (e.g. nouveau). if it fails similarly, either llvmpipe and nouveau have the same bug, or it's a bug in st/mesa
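[The triage order described above can be sketched as a small decision function. This is only an illustration of the reasoning in the chat: the name `locate_bug` and its two boolean inputs are hypothetical and not part of Mesa or any tool, and the "fails on intel" branch is an assumption, since the chat only covers the case where intel works.]

```python
def locate_bug(works_on_intel: bool, works_on_llvmpipe: bool) -> str:
    """Guess which layer most likely holds a bug seen on nouveau.

    Hypothetical helper; it only encodes the rule of thumb from the chat:
    (a) mesa core, (b) st/mesa, (c) the backend driver.
    """
    if not works_on_intel:
        # Assumption: the chat only says that a pass on intel makes
        # mesa core unlikely, so a failure there rules nothing out.
        return "mesa core not ruled out"
    if works_on_llvmpipe:
        # Another gallium driver works, so st/mesa is probably fine.
        return "backend driver (e.g. nouveau)"
    # Both gallium drivers fail in a similar way.
    return "st/mesa, or the same bug in both llvmpipe and nouveau"


if __name__ == "__main__":
    # The situation in this log: fine on intel, fine on llvmpipe.
    print(locate_bug(works_on_intel=True, works_on_llvmpipe=True))
```

[As noted a few lines later in the chat, this is a heuristic: in rare cases different driver features can make st/mesa take different paths.]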
19:56orbea: I'll have to look into compiling llvm into my mesa I guess. Last time I did a while ago it did not go well
19:56imirkin: well, i'm checking it out now
19:57orbea: will still be useful for the future :)
19:57imirkin: llvmpipe is a lot slower than nouveau =/
19:57imirkin: looks like everything works fine with llvmpipe
19:57orbea: so probably backend?
19:57imirkin: that is the most likely situation
19:58imirkin: in very rare scenarios it could be diff features of llvmpipe/nouveau causing st/mesa to take different logic paths
19:58imirkin: now to use retracediff to nail down which calls are causing the fail
20:00imirkin: aha, also the fail doesn't happen on nv50
20:01imirkin: and the two rasterizers are a lot more similar, so it's a better comparison than to llvmpipe (for retracediff purposes)
20:02imirkin: hm. let's hope.
20:03imirkin: aha. i was getting a ton of diffs, but turned out the nv50 board was just in a funky state that happens sometimes.
20:05imirkin: looks like call 64635 in that trace is the first bad one...
20:07imirkin: which of course happens to enable blend right before it
20:30karolherbst: also ping on this series: https://patchwork.freedesktop.org/series/13485/
20:31karolherbst: hakzsam: if you feel I left out something important from the comments that you felt strongly about, just tell me
20:32hakzsam: I will have a second look tomorrow :)
20:32hakzsam: still debugging that fucking F1 2015 game
20:32karolherbst: but it seems like you are getting somewhere
20:33imirkin: karolherbst: right, i see what's going on
20:33karolherbst: awesome :)
20:33imirkin: long story short - longstanding bug that i tried to fix a long time ago, but did slightly incorrectly
20:34imirkin: specifically this change: dfb0ca16065c1d251101bb094f2cfd08cf3cda15
20:35imirkin: the *real* problem is that you can't merge phi values
20:35imirkin: it's just not possible.
20:36imirkin: so something needs to stick a mov in
20:36karolherbst: that makes me quite unhappy
20:36karolherbst: so a fake mov instead
20:36karolherbst: and just let the phi defs be used by nothing?
20:36karolherbst: or can that phi be removed?
20:37imirkin: not fake
20:37imirkin: real mov.
20:37imirkin: your change is fine though
20:37imirkin: but we need to fix RA first
20:37imirkin: i'm pretty sure it was possible to hit it without your change as well
20:37karolherbst: I am sure it will work if we replace it with a mov though
20:38imirkin: to be clear, i'm talking about OP_MERGE
20:38karolherbst: ohh I see
20:41karolherbst: so in the end, RA has the issue and my patch should be fine? Or should CSE put a mov whenever it tries to merge phis?
20:43imirkin: CSE is fine.
20:43imirkin: LoadPropagation and condenseSrcs need to be careful.
20:48imirkin: karolherbst: http://hastebin.com/irogiyibar.php
20:48imirkin: i need to do a quick audit of the remaining code
20:49imirkin: but this fixes up the case you gave
20:51imirkin: otoh i dunno
20:52karolherbst: I hate it when patch produces those .orig files even when the patch applied just fine :(
20:52imirkin: i have another theory
20:52imirkin: yeah ok. this fixes it too
20:52imirkin: but is kinda lame :(
20:53karolherbst: it is
20:54imirkin: yeah, so i think that's the "correct" fix actually
20:54karolherbst: this changes the reg id though, needs a more careful thought maybe?
20:54karolherbst: maybe it is fine, no idea what that function does anyway
20:54imirkin: but i like my first fix better
20:55karolherbst: which is more future proof?
20:55karolherbst: well, I will run a full shader-db with both patches
20:55karolherbst: and see what is the difference
20:55imirkin: ideally it would have RA'd better in the second situation
20:55imirkin: but for some reason it didn't :(
20:56imirkin: i think there's a bunch of low-hanging fruit in that area
20:57karolherbst: there are
20:57imirkin: where we can do dumb things to improve certain common situations
20:57karolherbst: we should implement some layouting code for those wide reg things as well
20:57imirkin: so yeah. i think i'm gonna stick with the second patch for now.
20:58karolherbst: like exports
20:58karolherbst: where RA adds silly movs
20:58imirkin: mmm... that should be handled. when it goes wrong, it's going to be quite difficult to fix.
20:59karolherbst: I encountered that so many times
20:59karolherbst: sometimes I write passes which should only improve stuff
20:59karolherbst: but then RA messes up and adds silly movs for tex/export/other things just to make those d/t/q sources happy
21:01karolherbst: imirkin: your short patch has no difference at all
21:01karolherbst: the big one: "total gprs used in shared programs : 379277 -> 379251 (-0.01%)"
21:01karolherbst: "total instructions in shared programs : 2818666 -> 2818677 (0.00%)"
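[The percentages in the two totals above follow from a plain relative change rounded to two decimals. A minimal sketch of that arithmetic follows; `format_change` is a hypothetical helper written for illustration, not the actual shader-db reporting code.]

```python
def format_change(before: int, after: int) -> str:
    """Format a before/after total with its relative change in percent.

    Hypothetical helper: it reproduces the arithmetic behind the quoted
    totals, not the real shader-db script.
    """
    pct = (after - before) / before * 100.0
    return f"{before} -> {after} ({pct:.2f}%)"


if __name__ == "__main__":
    print("total gprs used in shared programs :", format_change(379277, 379251))
    print("total instructions in shared programs :", format_change(2818666, 2818677))
```

[The gpr total drops by 26 out of 379277 (about 0.007%, shown as -0.01%), while the instruction total grows by 11 out of 2818666, which rounds to 0.00%.]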
21:03imirkin: yeah, that seems likely.
21:07imirkin: karolherbst: http://hastebin.com/operagepix.php -- that should help a bunch i suspect
21:07imirkin: need to think about whether it's correct or not
21:09karolherbst: the benefit of my patch didn't change at all
21:10karolherbst: instruction count isn't that good, meh
21:11karolherbst: imirkin: choose: https://gist.github.com/karolherbst/a61a86e4cb9dfb2e4097b2900fa36c70
21:13imirkin: karolherbst: i'm trying to work out a potential improvement...
21:16karolherbst: and your last patch crashes shaders
21:16karolherbst: run: ../../../../../src/gallium/drivers/nouveau/codegen/nv50_ir_util.cpp:197: void nv50_ir::Interval::unify(nv50_ir::Interval&): Assertion `this != &that' failed
21:17imirkin: i have some updates
21:19imirkin: how about this?
21:20karolherbst: well, it doesn't crash yet :)
21:20imirkin: maybe soon :)
21:23karolherbst: looks like a jackpot
21:24karolherbst: hihi: helped inst ../nvidia_shaderdb/gputest_pixmark_piano/7.shader_test - 1 3744 -> 3723
21:24karolherbst: more perf
21:25RSpliet: perfect, I love playing pixmark piano :-P
21:25imirkin: funnest game ever
21:25karolherbst: imirkin: let me check the hurts though
21:25karolherbst: it may be something silly again
21:26karolherbst: imirkin: hurt inst ../nvidia_shaderdb/bioshock_infinite/22.shader_test - 0 48 -> 54 :D
21:26karolherbst: that shader doesn't like you
21:26karolherbst: yo :D
21:27RSpliet: karolherbst: think you can obtain old vs new assembly too?
21:27karolherbst: silly SSO thing
21:28imirkin: perhaps i don't like that shader :p
21:29karolherbst: 13 vs 13+14 in post ra
21:29karolherbst: this looks odd
21:30karolherbst: uhh, more locals
21:30karolherbst: I bet it is the orbital again
21:30karolherbst: actually not
21:31karolherbst: ../nvidia_shaderdb/talos_principle/3945.shader_test - type: 1, local: 48, gpr: 61, inst: 718, bytes: 6568
21:31imirkin: yeah, so ... right. that can happen.
21:32karolherbst: funny though: helped gpr ../nvidia_shaderdb/talos_principle/3945.shader_test - 1 62 -> 61
21:32karolherbst: hurt local ../nvidia_shaderdb/talos_principle/3945.shader_test - 1 36 -> 48
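[The "helped"/"hurt" lines pasted above share a regular shape, so they can be collected per shader to spot cases like this one, where gpr use improves while local memory use gets worse. The sketch below assumes the line format is exactly what appears in this log; `Change`, `LINE_RE` and `parse_change` are hypothetical names, and the meaning of the number after the " - " is not stated in the chat, so it is kept as an opaque `tag`.]

```python
import re
from typing import NamedTuple, Optional


class Change(NamedTuple):
    verdict: str   # "helped" or "hurt", as printed in the log
    metric: str    # e.g. "inst", "gpr", "local"
    shader: str    # path to the .shader_test file
    tag: int       # number after " - "; meaning assumed unknown here
    before: int
    after: int


# Assumed format: "<verdict> <metric> <path> - <tag> <before> -> <after>"
LINE_RE = re.compile(
    r"^(helped|hurt)\s+(\w+)\s+(\S+)\s+-\s+(\d+)\s+(\d+)\s+->\s+(\d+)$"
)


def parse_change(line: str) -> Optional[Change]:
    """Parse one helped/hurt line; return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    verdict, metric, shader, tag, before, after = m.groups()
    return Change(verdict, metric, shader, int(tag), int(before), int(after))


if __name__ == "__main__":
    lines = [
        "helped gpr ../nvidia_shaderdb/talos_principle/3945.shader_test - 1 62 -> 61",
        "hurt local ../nvidia_shaderdb/talos_principle/3945.shader_test - 1 36 -> 48",
    ]
    for change in map(parse_change, lines):
        print(change.metric, change.after - change.before)
```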
21:32karolherbst: I guess that's our layout issue again or something silly like that
21:32imirkin: it can be a lot of things
21:32imirkin: those random moves help sometimes
21:34karolherbst: I mean it could use a few more regs instead of doing spilling
21:35imirkin: if only it were so simple.
21:41imirkin: anyways, all this stuff is pretty questionable...
21:43karolherbst: well, I will head to bed now anyway
22:32imirkin: anyone have a non-GK110/GK208 nvc0+ handy?
22:33imirkin: and could check orbea's trace?
22:35hakzsam: imirkin, send me an email, I can check tomorrow