00:08 whompy: duddasankha: If you ask your question directly, you're more likely to get responses on IRC.
09:52 dboyan: imirkin_: I fixed the "unknown instruction" discovered yesterday in envydis
09:53 dboyan: https://github.com/envytools/envytools/pull/83
09:57 dboyan: I also made more comments about rcp assembly program on my branch. https://github.com/dboyan/mesa/blob/fp64-rcprsq1/src/gallium/drivers/nouveau/codegen/lib/gk110.asm
15:45 dboyan: well, it seems 2 or 3 rounds of newton raphson step is nearly enough for drcp, i can only see error in the last bit. I really wonder why the blob is emitting such long code, even longer than dsqrt
15:46 dboyan: it called a really complicated function for 2 rounds after doing 2 newton raphson steps. Haven't read what that function really did yet.
16:17 imirkin: dboyan: cool. definitely like the new comments, but imho, it's still not enough. this is not some obvious algorithm... it's very subtle stuff. even the newton-raphson steps would be good to document as saying that each step is RCP_{x+1} = ... RCP_{x} ... etc
16:29 dboyan: imirkin, okay, will document the formula of newton-raphson step. Actually i can't really follow the calculation in double, just copied the blob's logic there...
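For reference, the Newton-Raphson iterations imirkin asks to have documented can be written as below. These are the textbook forms for an input a, not formulas taken from dboyan's branch or from the blob; the symbols x_n, y_n and e_n are introduced here for illustration only.

```latex
% Reciprocal: x_n approximates 1/a
x_{n+1} = x_n \,(2 - a\,x_n)

% Error term e_n = 1 - a\,x_n squares on every step (quadratic convergence)
e_{n+1} = 1 - a\,x_{n+1} = (1 - a\,x_n)^2 = e_n^2

% Reciprocal square root: y_n approximates 1/\sqrt{a}
y_{n+1} = \tfrac{1}{2}\, y_n \,(3 - a\,y_n^2)
```

Because the error squares each step, the number of correct bits roughly doubles per iteration, which fits the observation that two or three steps from a decent initial estimate nearly reach double precision.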
16:32 imirkin: dboyan: that's not great... you should have your own specific logic
16:33 imirkin: dboyan: you can look for my newton-raphson steps in the commit i referenced the other day
16:33 imirkin: dboyan: if it doesn't add up to what the blob is doing, then do your own thing. (a) their stuff could be wrong, or more likely (b) their stuff could be incredibly clever
16:33 pmoreau: imirkin: I have `choose:-1 (out %r15, in %r7) { mov u32 %r8 0x00000001; set u8 %p9 eq u32 %r7 %r8 }` which makes `nv50_ir::ConstantFolding::findOriginForTestWithZero()` SEGFAULT as there is no insn associated to %r7.
16:34 imirkin: dboyan: you don't have to match their cleverness :)
16:34 imirkin: pmoreau: ok. i likely didn't anticipate it being able to come from a function input
16:34 pmoreau: imirkin: Am I supposed to move %r7 into some tmp variable and/or fix findOriginForTestWithZero to check for values without insn and ignore those?
16:34 imirkin: pmoreau: findOriginForTestWithZero should return NULL in that case.
16:35 pmoreau: Sounds good, will send a patch then
16:35 imirkin: (or whatever the "i couldn't find an origin" indicator is, i don't remember)
16:35 pmoreau: Seems like NULL, looking at the function. :-)
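The fix agreed on above (return the "no origin" indicator when a value has no defining instruction) might look roughly like the toy model below. This is a Python sketch, not the nv50_ir C++ code: the `Insn` and `Value` classes and the `find_origin` helper are hypothetical stand-ins, and the real `findOriginForTestWithZero` does more than this.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Insn:
    op: str
    srcs: list = field(default_factory=list)  # list of Value (hypothetical model)


@dataclass
class Value:
    name: str
    insn: Optional[Insn] = None  # None models a function input: nothing defines it


def find_origin(value: Value) -> Optional[Insn]:
    """Follow mov chains back to the defining instruction, or None if there is none."""
    while True:
        insn = value.insn
        if insn is None:
            # Function inputs (like %r7 in the example) have no defining insn:
            # report "no origin" instead of dereferencing a null pointer.
            return None
        if insn.op != "mov":
            return insn
        value = insn.srcs[0]


r7 = Value("%r7")                          # function input, no insn
p9 = Value("%p9", Insn("set", [r7]))       # defined by a set instruction
print(find_origin(r7), find_origin(p9).op)
```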
16:35 dboyan: imirkin: I guess there was some sort of optimization in place, i'll validate that later. Worst case scenario: I'll replace the last 2 rounds with basic formula
16:35 imirkin: pmoreau: you probably know this, but the glsl path doesn't generate functions. so all the function stuff is pretty untested.
16:36 imirkin: dboyan: yeah sounds good. i'd rather have less optimal code we understand than fancy code we don't
16:36 pmoreau: imirkin: I didn’t know. So the TGSI code that you have as input has all functions already inlined into main? O.O
16:36 imirkin: dboyan: it may also be helpful for you to write a simulator... i've had good success with using numpy - it has all the necessary precision but is still easy to manipulate.
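A numpy simulator along the lines imirkin suggests could start from something like the sketch below. It is an assumption-laden toy: the initial estimate is a single-precision reciprocal standing in for whatever the hardware's approximate rcp op returns, the refinement is the plain textbook Newton-Raphson step without FMA, and the names `drcp_sim` and `ulp_error` are made up here. It does not model denormals, infinities, or the blob's extra code.

```python
import numpy as np


def drcp_sim(a, steps=3):
    """Approximate 1/a in float64 using Newton-Raphson refinement."""
    a = np.asarray(a, dtype=np.float64)
    # Stand-in seed (assumption): a float32 reciprocal, widened to float64.
    x = (np.float32(1.0) / a.astype(np.float32)).astype(np.float64)
    for _ in range(steps):
        # x_{n+1} = x_n * (2 - a * x_n)
        x = x * (2.0 - a * x)
    return x


def ulp_error(approx, exact):
    """Distance between approx and exact, measured in units of the spacing at exact."""
    return np.abs(approx - exact) / np.spacing(np.abs(exact))


values = np.array([3.0, 7.0, 0.1, 1e10, 1.2345e-7])
for steps in range(4):
    err = ulp_error(drcp_sim(values, steps), 1.0 / values)
    print(steps, err.max())
```

Printing the maximum error per step count makes it easy to see how many rounds are needed before only the last bit or so differs from the reference division.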
16:37 imirkin: pmoreau: yes. it's all just a single MAIN function.
16:37 pmoreau: Oh, ok. Good to know
16:37 imirkin: pmoreau: that said, the function stuff *was* written, and i believe tested a bit with TGSI (which does support subroutines, even though we don't use them)
16:37 imirkin: pmoreau: but i've never personally even seen it in action.
16:38 pmoreau: I saw that the TGSI backend supported subroutines, as I took inspiration from it to get functions working.
16:39 imirkin: pmoreau: yeah, i think calim and/or curro wrote it for eventual opencl support
16:39 pmoreau: Ok
16:39 dboyan: imirkin: I simulated the logic in C first before writing the assembly, translating the assembly into equivalent C code by hand.
16:40 dboyan: So I know at least that the logic works
16:40 imirkin: pmoreau: but there's no doubt been a lot of code written by, e.g., me that doesn't really take functions into account
16:40 imirkin: dboyan: ok. well that's a start :)
16:40 pmoreau: imirkin: No problem! That way, I get to interact with some parts of the code I wouldn’t have looked at otherwise, if it worked. :-)
16:46 imirkin: dboyan: but we can't really just be going around copying blob's code as-is :) using their generated code to figure out what algorithm you need to use is fine, but then you need to use that algorithm and not copy their code..
16:50 dboyan: I think I only "copied" about 5-10 lines about numeric. Other parts are not the same
16:50 imirkin: dboyan: btw, does the rsq64 thing make use of the rsq64h op? if not, then all the lowering could be written in glsl directly
16:51 dboyan: the blob does use rsq64h
16:51 imirkin: ok. then we can't farm it out to glsl
16:51 imirkin: [not easily]
16:53 dboyan: I think 3 newton-raphson steps can be precise enough, but the blob uses a lot of code after 2 steps. I haven't read what that huge code did.
16:54 dboyan: The result of rsq can't be denorm except for 0, that's what I'm really happy about
16:54 dboyan: denorm handling in drcp was a nightmare until I found my trick
16:58 pmoreau: imirkin: Are BRA allowed in the middle of a BB in NVIR?
16:58 pmoreau: I guess not
16:58 karolherbst: pmoreau: if you put a bra inside a BB you have two BBs ;)
16:58 pmoreau: :-D
17:01 karolherbst: if you put those there you have to adjust the edges and the cfg, but it's perfectly fine if you do it right
17:03 pmoreau: I’ll create some proper BBs, as it otherwise confuses my own code as well.
17:04 imirkin: pmoreau: no.
17:04 imirkin: the idea of a BB is that you have a bunch of phi ops, followed by a bunch of regular ops, followed by a bunch of flow ops
17:05 imirkin: bb->getPhi() gets you the first phi node, bb->getExit() gets you the first flow op (iirc)
17:05 imirkin: and iirc bb->getFirst() gets the first regular op. double-check that though.
17:05 karolherbst: getFirst is the first op
17:06 karolherbst: I think the first non phi one is getEntry?
17:06 imirkin: yeah could be.
17:06 karolherbst: yeah, it's getEntry
17:06 karolherbst: and getExit is the last one
17:07 karolherbst: BasicBlock isn't aware of flow instructions
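karolherbst's remark that a bra inside a BB gives you two BBs can be illustrated with a toy splitter. This is a hypothetical Python model over instruction strings, not the nv50_ir API: the `FLOW_OPS` set and `split_at_branches` are invented for the example, and it does not rebuild the CFG edges, which karolherbst notes also have to be adjusted.

```python
# Hypothetical set of opcodes that end a basic block in this toy model.
FLOW_OPS = {"bra", "ret", "exit"}


def split_at_branches(insns):
    """Split a flat instruction list so every flow op ends its basic block."""
    blocks, current = [], []
    for insn in insns:
        current.append(insn)
        opcode = insn.split()[0]
        if opcode in FLOW_OPS:
            # A flow op closes the current block; later ops start a new one.
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


blocks = split_at_branches(["mov %r8 1", "bra L1", "add %r9 %r8 %r7", "ret"])
for i, bb in enumerate(blocks):
    print(f"BB:{i}", bb)
```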
17:08 * imirkin wonders how logic ops handle the flags register as an input...
17:13 Scootyloo: Hi um, just asking whether pascal cards (eg gtx 1050) should be treated as unknown chipset or not?
17:14 Scootyloo: I'm trying to use a GTX1050 but it's currently stuck with VGA resolution
17:14 Scootyloo: [ 1.156023] nouveau 0000:01:00.0: unknown chipset (137000a1)
17:15 pmoreau: Having proper BBs works better! Switch is now working, so I just need to add support for it in the linker.
17:15 nyef: Scootyloo: My next questions would be what kernel version you're using and what lspci -nn has to say about the card.
17:16 Scootyloo: Sure one sec
17:16 imirkin: Scootyloo: GP107 needs a patch for modesetting support
17:16 nyef: ... mostly because I can't identify a board type straight from the chipset ID like that.
17:16 imirkin: Scootyloo: such as this - https://lists.freedesktop.org/archives/nouveau/2017-February/027318.html
17:17 imirkin: nyef: the high 12 bits (ignoring the 2 highest iirc) give the chipset id.
17:17 imirkin: in this case, 0x137
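The extraction imirkin describes can be checked against the value from the dmesg line. The sketch below follows his description literally (top 12 bits, with the 2 highest ignored); he hedges it with "iirc", so the exact mask the driver uses is an assumption here, and `chipset_id` is a made-up helper name.

```python
def chipset_id(boot0: int) -> int:
    """Chipset id from the 32-bit value printed in 'unknown chipset (...)'."""
    top12 = (boot0 >> 20) & 0xFFF   # high 12 bits of the 32-bit word
    return top12 & 0x3FF            # ignore the 2 highest of those bits (per the chat)


print(hex(chipset_id(0x137000A1)))  # the value from Scootyloo's log
```

For `0x137000a1` this yields `0x137`, matching the chipset id given in the chat.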
17:17 Scootyloo: 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1c81] (rev a1)
17:17 Scootyloo: 4.9.10
17:18 nyef: Hunh. Not even in pci.ids yet?
17:18 imirkin: Scootyloo: you'll want 4.10-rcN anyways - iirc 4.9 had a sad bug regarding cursor handling on pascal
17:18 imirkin: Scootyloo: but you also need a patch like the one i linked to
17:18 imirkin: there's no upstream GP107 support yet
17:18 Scootyloo: imirkin: I see, thanks!
17:19 imirkin: Scootyloo: note that there will be no acceleration support at this time
17:20 Scootyloo: imirkin: got it, cheers
17:24 imirkin:&
17:27 librin: imirkin: You mentioned that the Civ6 issue might be mem corruption
17:29 librin: which reminds me, there are some weird things going on with Civ6 and mem management on the OpenGL side that I noticed
17:30 librin: I'm not sure if it's actually related, but maybe I should detail those observations in that bug report, just in case it is?
18:35 pmoreau: Damned! I forgot that I need to handle extensions as well in my linker… :'(
23:02 imirkin: librin: i suppose it's possible the game could be corrupting stuff, but it seems much more likely that nouveau's doing its own corrupting
23:03 imirkin: librin: hm yeah. actually with your shader, i still get some illegal shenanigans in valgrind. just not enough to kill the whole thing.
23:07 imirkin: librin: i assume fixing that will fix your issue
23:07 imirkin: unfortunately it's the underlying issue again, which will need careful analysis
23:07 imirkin: there's a chance i may be able to do that tomorrow
23:15 xaero: how well does nouveau work with a 970m at present?, does the 900 desktop series firmware cover that?
23:18 imirkin: xaero: looks like that's likely to be a GM204 - should work, but likely not in a way that's particularly useful to an end-user
23:19 imirkin: there's some investigation into letting mobile chips reclock since they're usually cooled by a fan controlled by the system, however it has yet to come to fruition
23:20 imirkin: [the main thing that the lack of "secure" firmware prevents us from doing is controlling the fan]
23:21 librin: imirkin: the game hits at least a few "Mesa: User error: GL_OUT_OF_MEMORY in glTexImage" && "Mesa: User error: GL_OUT_OF_MEMORY in glTexImage2D" while loading
23:21 imirkin: librin: ouch, that's not great =/
23:21 imirkin: is it a 32-bit game?
23:22 librin: also, sometimes it suddenly allocates insane amount of memory and makes nouveau crash in the memory allocator, running out of memory arenas
23:22 librin: but that is rare
23:22 librin: x86-64 only
23:22 librin: that allocation happens on loading
23:23 imirkin: weird.
23:24 librin: imirkin: I can't remember for sure but it was something about slub free/ unused arena(?) list being NULL, or something like that
23:24 librin: can only vaguely remember
23:25 librin: apparently, I have very bad luck triggering this when I do have a debugger attached
23:27 librin: imirkin: can the game actually corrupt mesa's / nouveau's memory?
23:30 imirkin: sure, i mean it's all in the same memory space
23:30 imirkin: and vice-versa too :)
23:46 karolherbst: it's not like we have no memory issues
23:48 imirkin: i can barely remember anything ;)