00:00 karolherbst: *access
00:18 orbea: Maybe there is a nouveau specific bug lurking here? https://github.com/iXit/Mesa-3D/issues/308#issuecomment-366777592
00:19 imirkin_: probably.
00:20 imirkin_: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nvc0_state.c#n445
00:20 imirkin_: this does seem roughly right though...
00:25 imirkin_: aw crap
00:26 imirkin_: hm. no.
00:30 imirkin_: hmmm
00:30 imirkin_: i think marek and i have different interpretations of how this should work...
00:31 imirkin_: which is of course not documented.
00:36 orbea: heh...
03:43 imirkin: orbea: can i assume you'd be willing to test a patch?
03:43 imirkin: (i don't have one yet)
03:43 orbea: imirkin: yea, ofc
03:43 imirkin: ok cool. i'll ping you when i have something
03:44 orbea: cool :)
09:57 pmoreau: skeggsb: :-( Looks like I did not test your MMU series enough; I assume (maybe wrongly) that this new bug (regression in 4.15 on MCP79) https://bugs.freedesktop.org/attachment.cgi?id=137460 was caused by the series.
09:57 pmoreau: I’ll try to reproduce it tonight and bisect it.
10:52 karolherbst: imirkin: actually we could also just nuke the if clause
10:52 karolherbst: but meh
12:59 rhyskidd: mwk: thanks for the review of the nvapy patches
15:23 karolherbst: imirkin: I am wondering if I should CC stable with the load patch, but I have the feeling that this issue is like never happening in real world scenarios...
15:27 karolherbst: imirkin: meh :( I am hitting the load_output thing in a tomb_raider shader :(
15:29 imirkin: i haven't been cc'ing anything to stable
15:30 imirkin: it's too much of a hassle to worry about
15:30 imirkin: the process is broken, and i have too many other things to worry about to try to fix it (and it's not like i'm the one paying the release folk)
15:51 karolherbst: imirkin: yeah
16:56 karolherbst: imirkin: fixed the tess crash :)
16:57 karolherbst: https://github.com/karolherbst/mesa/commit/28e899e114d0a34f8ae14651eee5ff0eccde0cb9
16:57 imirkin: cool
17:07 karolherbst: imirkin: all shader-db + nouveau_shaderdb shaders compiled without crashes :)
17:07 karolherbst: mhh
17:07 karolherbst: the stats are interesting: https://gist.githubusercontent.com/karolherbst/aeae4a96a203e8ae0794d2000d732540/raw/2add85bc5cee819b34f3d62182160f7e8bc04dd9/gistfile1.txt
17:08 karolherbst: especially this: total local used in shared programs : 20476 -> 8208 (-59.91%)
17:08 karolherbst: and this is without running any notable nir opts in the driver (st_glsl_to_nir runs a few though)
17:09 imirkin: curious indeed.
17:09 imirkin: i've never really spent a lot of time looking at that
17:10 imirkin: (i mean, reducing the local storage)
17:10 karolherbst: well, now we can dig into what each path does better and improve the other one :)
17:10 imirkin: i did do one thing to make it way better, but that was a very long time ago
17:10 imirkin: (more like "not totally horrible")
17:11 karolherbst: I guess nir does a better job packing stuff
17:11 karolherbst: and I still have disabled the other packing "feature"
17:11 imirkin: ah yeah, that could definitely be the case
17:11 karolherbst: where array entries aren't vec4s
17:11 imirkin: i don't think i ever made use of the masks
17:12 karolherbst: checking the compilation time, but I think the TGSI path should be a lot faster
17:12 karolherbst: or maybe that was just because of the debug builds and me using C++ containers
17:12 imirkin: maybe, maybe not. there's a ton of BS in glsl_to_tgsi that can be disabled.
17:12 imirkin: debug vs not-debug is a HUGE difference
17:12 imirkin: in nv50_ir alone.
17:13 karolherbst: yeah
17:14 karolherbst: wondering what happens when the cause of the +gprs count is some silly thing, like constants at the top of the shader and so on
17:14 karolherbst: the instruction count reduction is promising
17:14 imirkin: welllll
17:14 imirkin: heh
17:14 imirkin: solving the gpr count issues will increase ops :)
17:15 imirkin: i've seen it before
17:15 karolherbst: well
17:15 imirkin: in some of my attempts at improving things
17:15 karolherbst: not if you have silly movs at the top
17:15 karolherbst: and you could just move them down
17:15 imirkin: yeah, in some cases true
17:15 karolherbst: wow
17:15 karolherbst: that difference in compile time
17:15 karolherbst: 2:57.50 with nir
17:15 karolherbst: 0:33.44 with tgsi
17:15 imirkin: with same "debug" settings?
17:16 karolherbst: both meson release builds
17:16 imirkin: huh, ok.
17:16 karolherbst: well
17:16 karolherbst: I changed away from debug
17:16 imirkin: i think nir does some crazy validation stuff in debug builds too
17:16 karolherbst: no idea if the cflags are fine
17:16 imirkin: (revalidates all assumptions after every opt pass)
17:16 karolherbst: -O3
17:17 imirkin: what are you compiling that only takes 33s?
17:17 imirkin: well, i mean -DDEBUG vs not
17:17 karolherbst: yeah, doesn't seem to be set
17:17 karolherbst: NV50_PROG_DEBUG=3 doesn't print anything as well
17:17 imirkin: k
17:18 karolherbst: I guess this is because nir runs a lot of opts a lot of times
17:18 karolherbst: I am sure if I disable the opts in glsl_to_nir it will be quite a bit faster
17:18 imirkin: opt_algebraic, while flexible, i think is relatively slow compared to the nouveau approach
17:18 imirkin: also i think they run to a fixed point
17:18 karolherbst: well not that we care much with a nir shader cache...
17:18 imirkin: which is mildly better, but ... obviously slower. esp if you don't give much thought to optimization order.
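To illustrate the "run to a fixed point" idea mentioned above, here is a minimal sketch in plain C++. It does not use any NIR API: `Shader`, `Pass`, `run_to_fixed_point`, `halve_even` and `drop_ones` are hypothetical names made up for this example, and whether NIR really iterates this way is only imirkin's recollection in the log. The point is that every sweep reruns all passes, so the cost scales with the number of sweeps, and the pass order affects how many sweeps are needed.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a shader: just a list of integer "instructions".
struct Shader {
    std::vector<int> insns;
};

// A pass returns true if it changed the shader (i.e. made progress).
using Pass = std::function<bool(Shader &)>;

// Runs every pass in order, and repeats the whole sweep until one full
// sweep makes no progress. Returns the number of sweeps performed.
int run_to_fixed_point(Shader &s, const std::vector<Pass> &passes) {
    assert(!passes.empty());
    int sweeps = 0;
    bool progress;
    do {
        progress = false;
        for (const Pass &p : passes) {
            if (p(s))
                progress = true;
        }
        ++sweeps;
    } while (progress);
    return sweeps;
}

// Toy pass: halve every non-zero even value once.
bool halve_even(Shader &s) {
    bool changed = false;
    for (int &v : s.insns) {
        if (v != 0 && v % 2 == 0) {
            v /= 2;
            changed = true;
        }
    }
    return changed;
}

// Toy pass: drop every value equal to 1.
bool drop_ones(Shader &s) {
    auto it = std::remove(s.insns.begin(), s.insns.end(), 1);
    bool changed = it != s.insns.end();
    s.insns.erase(it, s.insns.end());
    return changed;
}
```

The last sweep always does work without changing anything, which is the price of detecting that the fixed point has been reached.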
17:19 imirkin: that's a surprising amount of difference though
17:19 karolherbst: yeah
17:19 imirkin: from what i had seen, the nir pathway was fairly fast
17:19 imirkin: i wonder if your converter's doing something dumb
17:19 karolherbst: maybe it is my fault or something
17:20 karolherbst: but usually I just use std::vector for stuff...
17:20 karolherbst: ohh wait
17:20 imirkin: either directly, or indirectly by doing something bad with the ir
17:20 karolherbst: unordered_map for the defs
17:20 karolherbst: this might hurt
17:20 imirkin: that's a hash map
17:20 karolherbst: I think you can do better than a hash_map for int -> Value* mappings
17:21 karolherbst: especially if the ints are below 10k
17:21 karolherbst: usually
17:21 imirkin: not for sparse arbitrary ints
17:21 imirkin: you can for small packed ints (it's called an array)
17:21 karolherbst: maybe nir tells us the highest ssa value id
17:21 karolherbst: then I could just preallocate an array
17:21 imirkin: but map lookups aren't costing you 2 minutes.
17:21 karolherbst: 10k pointers should be fine, no?
17:22 imirkin: run perf and see where it's going.
17:22 imirkin: instead of guessing.
17:22 karolherbst: yeah
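The hash map versus array trade-off discussed above can be sketched as follows. This is an illustration only, not the converter's actual code: `Value` is a hypothetical stand-in for the codegen value type, `SparseDefs` and `DenseDefs` are made-up names, and the array variant assumes the number of indices is known up front (whether NIR exposes that bound is the open question in the log). As imirkin notes, profiling should decide whether the lookup matters at all.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the codegen value type.
struct Value {
    int id;
};

// Hash-map variant: handles sparse, arbitrary integer keys.
class SparseDefs {
public:
    void set(unsigned index, Value *v) { defs_[index] = v; }

    // Returns nullptr when no value was recorded for this index.
    Value *get(unsigned index) const {
        auto it = defs_.find(index);
        return it == defs_.end() ? nullptr : it->second;
    }

private:
    std::unordered_map<unsigned, Value *> defs_;
};

// Array variant: only valid when keys are small and packed in [0, count).
class DenseDefs {
public:
    // Preallocates one pointer slot per index (assumed upper bound).
    explicit DenseDefs(std::size_t count) : defs_(count, nullptr) {}

    void set(unsigned index, Value *v) {
        assert(index < defs_.size());
        defs_[index] = v;
    }

    // Returns nullptr when no value was recorded for this index.
    Value *get(unsigned index) const {
        assert(index < defs_.size());
        return defs_[index];
    }

private:
    std::vector<Value *> defs_;
};
```

With 10k entries the dense table costs 10k pointers of storage, in exchange for a lookup that is a single bounds-checked index.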
17:29 karolherbst: imirkin: https://gist.githubusercontent.com/karolherbst/34206414981e8ef4b7de4e50472f6864/raw/5ce98aab63b0868d53cf40c7b8d3c4a76112c606/gistfile1.txt
17:30 karolherbst: nir_validate_shader ...
17:35 karolherbst: imirkin: with NIR_VALIDATE=0 40 secs
17:37 karolherbst: imirkin: wow, codegen is good
17:38 karolherbst: imirkin: I've disabled running the opts in glsl_to_nir: https://gist.githubusercontent.com/karolherbst/c703d8625213f995d99095d6d3120551/raw/0a37061ba3d08db03a949b6abfac77ae1e86d3ee/gistfile1.txt
17:38 karolherbst: :D
17:39 karolherbst: okay, didn't disable all calls to st_nir_opts
17:39 karolherbst: now compiles are failing
17:41 imirkin: validate should only be called in debug builds though
17:41 imirkin: but yeah, it's super-duper slow
17:42 karolherbst: right
17:42 karolherbst: but 30 secs vs 40 secs is good enough for me
17:53 karolherbst: imirkin: https://gist.githubusercontent.com/karolherbst/b0c5e1db326aa31890cb37143f79e598/raw/3d9322b614c867e9d62eac3f3958ecc36b2f1399/gistfile1.txt
17:53 karolherbst: so yeah, BB:0 is filled with values we might want to move down somewhere
17:53 karolherbst: especially those constant movs
18:06 imirkin: skeggsb: you want to carry https://patchwork.freedesktop.org/patch/202322/ or should i get it included in drm-misc?
21:08 imirkin: karolherbst: just spend the 2 minutes to add this ... + /* TODO: nir doesn't support tg4 with multiple offsets */
21:09 karolherbst: imirkin: ohh, so we have support for this now?
21:10 imirkin: i just mean... add the nir support
21:10 imirkin: it'll take like 2 mins
21:10 karolherbst: it didn't look like it though
21:10 imirkin: (ok, legitimately, more like 10 mins)
21:10 imirkin: just need to extend the offsets stored in the op
21:11 karolherbst: I think there were some other problems as well, because I am sure I thought about just doing that
21:11 karolherbst: but yeah, we probably want to have this
21:12 imirkin: it's just a silly thing to not support
21:12 imirkin: since supporting it is trivial
21:13 karolherbst: yeah, let me take a look at it again
21:15 imirkin: if you're unsure how to proceed, ask some of the intel guys
21:15 imirkin: they should be able to tell you fairly exactly what needs to be done
21:16 imirkin: i only have approximate knowledge
21:17 karolherbst: well the thing is that the tex intrinsics have multiple sources, each typed
21:17 karolherbst: but you can't have multiple of the same type
21:17 pmoreau: karolherbst: Grrr, I was planning to go to sleep, but I guess I should at least review some patches, or I’ll end up never doing it. :/
21:17 karolherbst: I could add nir_tex_src_offset1, nir_tex_src_offset2, nir_tex_src_offset3.... though
21:19 imirkin: consult with them
21:19 imirkin: oh, and remind me to review connor's patch on like ... friday. so that i still remember over the weekend.
21:20 karolherbst: okay
21:21 karolherbst: pmoreau: :p
21:21 imirkin: coz i can't now (it's the sort of thing that'll require me to read a lot of code and think, so ... need time)
21:22 karolherbst: yeah, most likely
21:22 karolherbst: although, I think this part isn't as complicated in the end
21:23 imirkin: well, it's not directly related to the series
21:23 karolherbst: right
21:23 karolherbst: but without it a lot of things break :)
21:23 imirkin: yea
21:23 karolherbst: but I think with that patch in place we could even clean up some code in the TGSI path
21:24 karolherbst: or maybe not, because it is usually all 32 bit there anyway
21:28 pmoreau: karolherbst: What does “sdt” mean in “DataType sdt”? (I don’t think you answered me about it in my previous comments.)
21:29 karolherbst: source data type, but I think I should convert that to sty like in other code
21:30 pmoreau: Okay, I was wondering whether it was that or not. But I would personally prefer the slightly more verbose srcTy or dstTy that are already used in the code.
21:31 karolherbst: yeah
21:35 pmoreau: karolherbst: Quick question: for example in handleSAT, you introduce a 64-bit MAX. I guess that one isn’t going to go through your handleMINMAX, or is it?
21:36 karolherbst: pmoreau: doesn't have to ;)
21:38 pmoreau: Okay, handleMINMAX is on integers and handleSAT is creating a MAX between floats.
21:38 karolherbst: :)
21:42 pmoreau: God, it took me 5 sec to realise handleLogOp had nothing to do with the log operation.
21:42 pmoreau: But Log -> logical instead
21:42 imirkin: logic ;)
21:43 karolherbst: :)
21:44 pmoreau: Would it be possible to rename it to avoid the confusion? O:-)
21:45 karolherbst: I am sure I copied that name
21:45 pmoreau: I guess if it really was about the log operation, it would be named handleLOG and not handleLogOp.
21:45 karolherbst: or maybe not
21:45 karolherbst: dunno
21:45 karolherbst: yeah
21:45 karolherbst: I could call it handleLogOps
21:46 karolherbst: although that wouldn't help
21:46 karolherbst: or we go with the classic scheme
21:46 karolherbst: handleANDXOROR
21:46 karolherbst: or handleANDORXOR
21:46 karolherbst: :p
21:46 pmoreau: GO FOR IT!!!
21:46 pmoreau: :-D
21:46 imirkin: ANXOR
21:46 karolherbst: handleORANDXOR for total confusion
21:46 karolherbst: imirkin: not bad
21:50 karolherbst: imirkin: you know how I can get the array values out of a ir_rvalue with type vec2[4]?
21:50 RSpliet: handleBoolOps ... ?
21:51 karolherbst: RSpliet: those are bool ops?
21:51 RSpliet: and, or, xor I'd say are...
21:51 karolherbst: well
21:51 karolherbst: kind of
21:51 karolherbst: but still
21:51 karolherbst: mhh
21:51 karolherbst: I would call them rather bit ops
21:51 karolherbst: or so
21:51 RSpliet: Just throwing it out there ;-)
21:52 pmoreau: SPIR-V went with bit operations, fwiw https://www.khronos.org/registry/spir-v/specs/1.2/SPIRV.html#_a_id_bit_a_bit_instructions
21:54 RSpliet: Sounds valid, but that includes shift/rotate ops too...
21:54 RSpliet: I'm sure there's a million options though, plenty of bikeshedding food. I'm just throwing a proposal out there ;-)
21:54 imirkin: karolherbst: not offhand
21:54 imirkin: karolherbst: check what glsl_to_tgsi does
21:54 karolherbst: imirkin: seems like I have to downcast
21:54 karolherbst: imirkin: good idea, right
21:55 imirkin: probably make lvalue's out of each of them
21:55 imirkin: and then pass those in as offsets
21:57 karolherbst: it does weird things
21:57 karolherbst: canonicalize_gather_offset
21:57 imirkin: =/
21:58 karolherbst: yeah...
21:58 karolherbst: I hoped for ir->offset->as_array()->component[i] or something :)
21:58 karolherbst: well
21:58 karolherbst: element rather than component
21:58 karolherbst: but you get the idea
21:59 karolherbst: ir->offset->as_constant()->get_array_element(i)
21:59 karolherbst: but I guess that only works with constant offset values
22:04 karolherbst: imirkin: do you agree with me, that the glsl to tgsi code looks a bit weirdo?
22:04 karolherbst: ohhh
22:04 karolherbst: I think I see what is going on now
22:05 imirkin: tg4 with multiple offsets is required to be constant iirc
22:05 karolherbst: the magic is all in emit_asm
22:05 karolherbst: imirkin: really? I see
22:05 imirkin: with a single offset, it can be nonconst
22:05 karolherbst: imirkin: you are right
22:06 karolherbst: "The specified values in offsets must be set with constant integral expressions."
22:06 karolherbst: okay
22:06 imirkin: it's almost like i implemented support for these things :p
22:16 karolherbst: :D
22:25 karolherbst: imirkin: PIGLIT: {"result": "pass" } :)
22:25 imirkin: ship it
22:26 hakzsam: freecoder: hi, I know nothing about frameratracer, but yeah I know the performance counters infrastructure
22:26 karolherbst: not sure I like the patch though :D https://github.com/karolherbst/mesa/commit/caa3229309eb60663fdc0ad50199678391f9b3ac
22:28 hakzsam: freecoder: feel free to reach me by email, I should be able to answer your questions
22:29 imirkin: hmmmm i dunno, seems fine
22:30 karolherbst: imirkin: well, I think I would like to rename nir_tex_src_offset to nir_tex_src_offset0 while at it, but mhh
22:31 karolherbst: imirkin: for what use case was textureGatherOffsets added by the way?
22:31 karolherbst: seems too specific to have just happened to be added
22:32 rooted123: hi
22:35 rooted123: i have a question
22:36 rooted123: does anyone here know how to develop drivers?
22:36 rooted123: what should I know to develop on Nouveau?
22:37 imirkin: karolherbst: no clue. only nvidia gpu's support it, too.
22:37 imirkin: everyone else just emulates it with 4x tg4
22:38 imirkin: i don't use it, i just implement it :)
22:38 karolherbst: rooted123: hard to say, usually you should just start and learn on the way?
22:38 karolherbst: rooted123: there are many areas and each area requires different skills
22:38 karolherbst: or knowledge or motivation or whatever
22:38 karolherbst: imirkin: :D right
22:39 imirkin: (i guess llvmpipe also "implements" it, but ... that's a bit different)
22:39 rooted123: I learned java, python, html, css, jquery and javascript at school, and I sometimes like to "play" with OllyDbg on some crackmes; I know a little C but NO C++
22:40 karolherbst: rooted123: being able to write in a specific language isn't really important, what is important is to understand what is happening/required on a hw level for stuff
22:40 karolherbst: sometimes
22:41 rooted123: I would like to work on areas that do NOT involve risks or danger to my hardware
22:42 karolherbst: ohh, that won't be a problem
22:42 rooted123: is it safe?
22:42 karolherbst: I think a broken fan is more likely to damage your GPU than any driver work you would do
22:43 imirkin: never say never, but i don't think anyone here has broken their gpu as a result of writing code
22:43 rooted123: ammm, I've worked in a computer repair shop for 5 years
22:43 karolherbst: did any nouveau dev actually break a GPU?
22:43 imirkin: i think mupuf ended up killing a few in the oven, but ... don't do that.
22:43 karolherbst: I think there were some "failed" experiments with fans, where the GPU just didn't break
22:44 mupuf: Just one out of 12!
22:44 karolherbst: fan as in a hair dryer
22:44 karolherbst: mupuf: what? who?
22:44 karolherbst: ohhh
22:44 karolherbst: I didn't see imirkin's message
22:44 karolherbst: well
22:44 mupuf: Sry, i'm on my phone
22:44 karolherbst: I've heard you can repair GPUs in ovens if the soldering is a bit broken
22:44 airlied:did it with a phone once as well
22:44 mupuf: Yeah, typical for the teslas
22:45 airlied: worked long enough to get my stuff off it
22:45 karolherbst: teslas are those 140 C GPUs, right? :D
22:45 rooted123: I don't have linux installed on my PC, but I do on my 2 laptops; I only use my computer for forensics and administrative tasks
22:45 imirkin: gotta get it *just* right though :)
22:45 rooted123: i have GTX 750
22:45 mupuf: karolherbst: yep :D
22:46 karolherbst: mupuf: I think the first overclockers were super careful, and then they saw the GPU reaching 130 with the stock stuff anyway :D
22:46 mupuf: imirkin: yeah, i was doing pwm to control the temperature slopes by opening/closing the oven :D
22:46 rooted123: I had a bad experience 7 years ago installing nouveau on my old computer
22:46 karolherbst: mupuf: :D
22:47 rooted123: but I want to give it a chance on Kali Linux, my favorite distribution
22:48 mupuf: Ahh, finally some winter in Helsinki! Just came back and got greeted with -11°C. That's a bit of a shock when clothed for the 10°C of Belgium
22:49 rooted123: but I would like to learn nouveau through practice and I would like to participate in development
22:49 karolherbst: mupuf: I will have this at the weekend maybe :)
22:49 mupuf: Tomorrow, we'll get to -20
22:49 karolherbst: rooted123: right, best way: run linux and nouveau, run into bugs, fix them
22:49 karolherbst: mupuf: have fun
22:50 rooted123: can you recommend some books or tutorials?
22:50 karolherbst: no and no, I basically just jumped into it
22:50 karolherbst: I was even too lazy to read those links from the wiki
22:51 karolherbst: https://nouveau.freedesktop.org/wiki/IntroductoryCourse/
22:51 karolherbst: https://nouveau.freedesktop.org/wiki/Development/
22:51 karolherbst: maybe we should update the wiki at some point
23:00 rooted123: Can I use 2 nvidia graphics cards with nouveau?
23:00 rooted123: is there a notable GPU performance gain?
23:01 imirkin: nouveau can drive any number of nvidia GPUs
23:01 imirkin: there's no cross-GPU acceleration though
23:01 rooted123: okey
23:01 imirkin: i.e. no SLI or whatever