07:49 ilios: hi, i have a quick question. Are there any specific MMIO commands to stop a running CUDA kernel? For example, a CUDA kernel stops when cuda-gdb is attached to the running CUDA process. Briefly, i want to simulate that by just issuing the MMIO commands.
11:39 pmoreau: karolherbst, skeggsb: Re https://github.com/skeggsb/nouveau/commit/c510bfa30c0f41212d8ff85a9dc79f849fe3ddb3 there are a couple of other places in Nouveau using nv_encoder->or (and not nv_encoder->dcb->or), for example https://github.com/skeggsb/nouveau/blob/master/drm/nouveau/nv50_display.c#L2489 and other places in nv50_display.c.
11:40 pmoreau: Is this expected? I am wondering if those (or something similar) could be responsible for the EVO timeout regression on a G98 board https://bugs.freedesktop.org/show_bug.cgi?id=105319
11:42 karolherbst: pmoreau: this is expected
11:43 karolherbst: normally you call nv50_outp_acquire/nv50_outp_release when touching display stuff
11:43 pmoreau: Okay
11:43 karolherbst: and it sets the sor for you
11:43 karolherbst: but with the backlight we don't
11:43 karolherbst: it might be though that some places are left out
11:43 karolherbst: or need more fixing
11:44 pmoreau: Ah okay, good to know about the nv50_outp_acquire/nv50_outp_release dance.
11:45 pmoreau: Any particular reason not to use that in the backlight code, and instead do the `ffs(nv_encoder->dcb->or) - 1` thing?
12:19 skeggsb: pmoreau: because the backlight controls are fixed, and can't be routed
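For readers following along, the `ffs(nv_encoder->dcb->or) - 1` idiom quoted above can be sketched as below. This is only a minimal illustration, assuming `dcb->or` is a bitmask with one bit per output resource. `find_first_set` is a portable stand-in for `ffs`, and `first_or_index` is a hypothetical helper, not a nouveau function.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for ffs(): 1-based position of the lowest set bit,
// or 0 when no bit is set.
int find_first_set(std::uint32_t mask) {
    if (mask == 0) return 0;
    int pos = 1;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        ++pos;
    }
    return pos;
}

// Hypothetical helper (not part of nouveau): zero-based index of the
// lowest output resource named in the mask, or -1 for an empty mask.
int first_or_index(std::uint32_t or_mask) {
    return find_first_set(or_mask) - 1;
}
```

With a mask of 0x4 this yields index 2; the "- 1" only converts the 1-based result of `ffs` into a zero-based index.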
12:23 pmoreau: skeggsb: Understood. What is the benefit of rerouting/mapping the outputs rather than using the “fixed” access?
12:24 pmoreau: (I think I have seen some talk about rerouting outputs in context of gmux, but even there I’m not sure I get it :-D)
12:24 skeggsb: because there's more possible outputs than there are SORs on newer GPUs
12:25 pmoreau: should probably read more on that subject
12:25 skeggsb: ie. one board i have sitting here has 3xDP, 1xHDMI, 1xDVI-D
12:25 skeggsb: but only 4 SORs
12:25 pmoreau: I see, makes sense
12:26 pmoreau: Thanks for the explanation :-)
12:27 skeggsb: you're right though, we *should* be using acquire()/release() stuff for LVDS/eDP at init/fini time, and ensuring that they remain identity-mapped in the routing code
12:27 skeggsb: but, that can wait, the goal was to fix the bug in the least invasive way possible for backport
12:27 skeggsb: the rest is basically to make it "clean"
12:29 pmoreau: Hum, the EVO timeout bug was for a screen connected via eDP (laptop), so maybe some missing cleanup. I’ll look in that direction.
12:29 skeggsb: is there anything before it? a method dump?
12:30 pmoreau: It’s this bug: https://bugs.freedesktop.org/show_bug.cgi?id=105319
12:30 pmoreau: I don’t recall a method dump before the first timeout message
12:31 skeggsb: yeah, specific commit is going to be most useful there
12:32 skeggsb: um, he mentioned in #8 that that patch fixed the issue already
12:32 skeggsb: it doesn't make sense that what went into 4.16 (and Cc'd stable) didn't help.. it was the same patch
12:33 pmoreau: I’m not sure he meant what he wrote, as the line below that comment, he is asking whether he should test the patch or not.
12:33 skeggsb: oh, right
12:33 skeggsb: yes, i can read that comment in a different way too :P
12:34 pmoreau: :-D
12:36 pmoreau: There are a few people on Tesla for which the ALIGN_DOWN fix was not enough. Hopefully one of them can bisect; I wasn’t able to reproduce that bug either.
12:51 karolherbst: random thought: maybe we want to compile nv50 by default?
12:53 ilios: Hello, sorry for asking the same question again. Are there any specific MMIO commands to suspend (not terminate) a running CUDA kernel? For example, a CUDA kernel is suspended when cuda-gdb is attached to the running CUDA context. I want to suspend (but not terminate) the CUDA kernel by just issuing the MMIO commands. Is it possible?
12:58 karolherbst: pmoreau: did you test building your branch with autotools by the way?
12:59 pmoreau: karolherbst: I did for the v4, but not for the v5. I should probably try it again, with and without the dependencies.
13:00 pmoreau: karolherbst: Re “maybe we want to compile nv50 by default?”: where is nv50 not compiled by default?
13:00 karolherbst: mesa
13:00 karolherbst: or is it? it doesn't get displayed
13:00 pmoreau: Ah
13:00 karolherbst: "Gallium drivers: r300 r600 svga swrast"
13:00 pmoreau: Probably not then
13:00 karolherbst: but we build the classig nouveau driver by default
13:00 karolherbst: :)
13:00 karolherbst: *classic
13:00 pmoreau: oO
13:01 pmoreau: Interesting
13:02 pmoreau: ilios: No idea if this has been RE’ed. :-/
13:04 karolherbst: build errors on your add_clover_spirv_backend_v2 branch :)
13:05 karolherbst: pmoreau: ../../../../../src/gallium/state_trackers/clover/llvm/invocation.cpp:36:10: fatal error: llvm-spirv/SPIRV.h: No such file or directory
13:05 karolherbst: :)
13:05 karolherbst: that's the error I expected, that's why I tried building it
13:05 karolherbst: I looked at your "[PATCH v5 14/21] clover/llvm: Allow translating from SPIR-V to LLVM IR" patch
13:05 karolherbst: I think you need to fix the Makefile.am files
13:07 pabs3: pmoreau: any issues with G98/GT21x over the weekend?
13:07 pmoreau: pabs3: Sadly no issues :-/
13:09 pmoreau: karolherbst: Ah yes, it should include ${LLVM_SPIRV_CFLAGS} for libclspirv
13:09 pabs3: hmm ok
13:10 pmoreau: And libclspirv_la_LDFLAGS needs ${LLVM_SPIRV_LIBS} as well
13:10 pmoreau: pabs3: Which one was your issue: the EVO timeout or the NULL pointer dereference?
13:11 pmoreau: karolherbst: I was going to send a v6 addressing curro’s comment, I’ll send a fix for autotools build as well.
13:11 pmoreau: ^ tonight
13:12 karolherbst: thanks!
13:12 pabs3: you linked to the EVO timeout bug, this was my log https://paste.debian.net/hidden/7bfa5226/
13:14 pmoreau: pabs3: Okay. Would you be able to bisect the kernel to find the faulty commit? Also is this a laptop or desktop, and how is the screen connected (if desktop)?
13:15 pmoreau: karolherbst: Thanks for testing with autotools. :-) Maybe I should update the scons build as well. :-/
13:17 karolherbst: pmoreau: :D right
13:17 pabs3: pmoreau: I don't have a good commit, but I guess I could pretend 4.2 was good. desktop, via DVI. card also has VGA and HDMI connectors
13:18 pmoreau: Ah right, for you it never worked, it’s not like the other bug report, which is a regression in 4.15.
13:18 karolherbst: pabs3: would you be willing to do a git bisect on the kernel? it might take your whole day though :(
13:19 pmoreau: karolherbst: Since it’s behind a define, it shouldn’t break scons, and I don’t think people using scons are using clover, but still.
14:02 pmoreau: xexaxo1: Is there anything I need to do regarding scons in my series (adding SPIR-V support to clover, where I added new dependencies to clover, and a new target there as well)? There does not seem to be any scons specific file in src/gallium/state_trackers/clover.
16:22 freecoder: hi all, i'm getting this error while building out-of-tree module on 4.14.29 vanilla kernel - https://hastebin.com/raw/guxinomiko
16:22 freecoder: any help on how to fix this?
16:39 karolherbst: imirkin: I got a 50% perf hit in pixmark_piano with NIR.. 80% of this I could fix only by moving the immediate loads to the defs...
18:00 pmoreau: freecoder: Build against a more recent kernel; the out-of-tree module should be compiled against a rather recent version of the kernel. You can find which version it is based on by looking through the commits for commit messages like "drm-next $some_commit_hash" or v4.16-rc5
18:07 freecoder: pmoreau, what is the significance of "drm-next $some_commit_hash" in the commit message?
18:58 pmoreau: freecoder: That you should be using, at least, the commit $some_commit_hash from the branch drm-next (https://cgit.freedesktop.org/~airlied/linux/?h=drm-next)
19:45 freecoder: pmoreau, oh i see. thanks, i will see if that works
19:49 pmoreau: BTW, the last such commit is v4.16-rc5, so try cloning Linus’ tree and check out that tag: the code should build fine against that.
20:20 imirkin_: karolherbst: register usage matters i guess? :)
20:21 karolherbst: well it was 54 vs 63 registers
20:21 karolherbst: but
20:21 karolherbst: there was some spilling :)
20:22 karolherbst: the fixed version still spilled
20:22 karolherbst: but not that much
20:28 karolherbst: imirkin_: I would like to spend a bit of this week, and maybe next, figuring out what we can improve in codegen. I am sure with nir we get a few patterns more often. At least nir seems to be better at eliminating simple if-else branches with type conversions
20:28 karolherbst: so a set+then:mov+else:mov -> slct
20:29 karolherbst: especially for boolean types
20:29 imirkin_: i think we cover that one :)
20:29 imirkin_: at least sometimes
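The `set+then:mov+else:mov -> slct` rewrite mentioned above can be illustrated at the source level. This is a rough sketch in plain C++, not codegen IR, assuming a comparison that feeds an if/else whose branches only move a value into the same destination. Both function names are hypothetical.

```cpp
#include <cassert>

// Branchy form: a comparison (the "set") followed by one mov per branch.
int branchy(int a, int b, int x, int y) {
    int dst;
    if (a < b) {
        dst = x;  // then: mov
    } else {
        dst = y;  // else: mov
    }
    return dst;
}

// Folded form: a single select on the same condition, no control flow.
int selected(int a, int b, int x, int y) {
    return (a < b) ? x : y;
}
```

Both functions return the same value for every input, which is what makes the fold legal when the branches have no other side effects.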
20:29 karolherbst: yeah, there were some funky cases we don't
20:30 karolherbst: anyway, first I want to optimize the most common patterns with the nir stuff and then check back where we can improve our TGSI path
20:31 karolherbst: imirkin_: with TGSI we just end up loading the immediates when parsing the sources of an instruction, right?
20:34 imirkin_: and consts yeah
20:34 imirkin_: CSE will happen too
20:34 karolherbst: right
20:35 karolherbst: I am more interested how a loadImm works after you already generated the instruction ...
20:36 karolherbst: there are a few cases in the TGSI path which do insn->setSrc(s, fetchSrc())
20:36 imirkin_: insn gets added after though
20:37 imirkin_: (fucking better)
20:37 imirkin_: otherwise it's in for a world of pain
20:37 imirkin_: we've had some bugs like that
20:37 karolherbst: ohhh...
20:38 karolherbst: I was always wondering why TexInstruction *tex = new_TexInstruction(func, OP_TXQ); was done actually....
20:38 karolherbst: and then bb->insertTail(tex);
20:38 karolherbst: I think I just found the answer why
20:38 karolherbst: or one of them
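The ordering problem discussed above can be modeled with a toy example. This is a sketch under stated assumptions, not the nv50_ir API: `ToyBlock`, `toy_fetch_src`, `insert_then_fetch` and `fetch_then_insert` are hypothetical names, and the only behavior modeled is that fetching an immediate source appends a load at the tail of the block.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy basic block: instructions are kept in program order.
struct ToyBlock {
    std::vector<std::string> insns;
};

// Toy stand-in for fetching a source: appends a load of the immediate
// at the tail of the block and returns the name of the loaded value.
std::string toy_fetch_src(ToyBlock &bb, int imm) {
    std::string name = "%imm" + std::to_string(imm);
    bb.insns.push_back("mov " + name + " " + std::to_string(imm));
    return name;
}

// Problematic order: the user is inserted first, then its source is
// fetched, so the load lands after the instruction that reads it.
ToyBlock insert_then_fetch(int imm) {
    ToyBlock bb;
    bb.insns.push_back("txq %dst <unset>");
    std::string src = toy_fetch_src(bb, imm);
    bb.insns[0] = "txq %dst " + src;
    return bb;
}

// Safe order: fetch the source while the user is still detached, then
// insert the user at the tail so it follows the load.
ToyBlock fetch_then_insert(int imm) {
    ToyBlock bb;
    std::string src = toy_fetch_src(bb, imm);
    bb.insns.push_back("txq %dst " + src);
    return bb;
}
```

In the first variant the block reads a value before it is defined, which is the kind of breakage that creating the instruction first and calling insertTail only after its sources are set avoids.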
20:38 imirkin_: ]=
20:38 karolherbst: mhh, maybe I just add a second bld which always points before the new instructions
20:39 karolherbst: ...
20:39 karolherbst: I really don't want to have too many setPositions there
20:39 karolherbst: or have to be aware to get the src before creating instructions always
20:39 imirkin_: https://youtu.be/t_TdCs9GA4w?t=28
20:39 karolherbst: :D
20:51 karolherbst: imirkin_: getImmediate can only be used in SSA form right?
20:51 imirkin_: yep
21:18 pendingchaos: lachs0r: can you test the second revision of the tests?: https://github.com/pendingchaos/piglit/tree/nv_conservative_raster_v2_rc1
22:34 pendingchaos: lachs0r: *retest the second revision
22:35 pendingchaos: lachs0r: *rerun the second revision
22:38 pendingchaos: lachs0r: *run the second revision
23:14 karolherbst: that moment where you have to do "gdb gdb"...
23:15 karolherbst: imirkin_: mind taking a look at the nvidia_shaderdb/Civilization\ VI/1501528914/47.shader_test shader?
23:15 karolherbst: that one crashes for me with gk106
23:15 imirkin_: not in front of an nvidia gpu
23:16 karolherbst: ahh, k
23:16 karolherbst: nv50_ir::Interval::overlaps (this=this@entry=0x78, that=...) at ../../../../../src/gallium/drivers/nouveau/codegen/nv50_ir_util.cpp:167
23:16 imirkin_: does it crash with upstream, or with your various patches?
23:16 imirkin_: yeah that's bad :)
23:16 imirkin_: don't do that ;)
23:16 imirkin_: this == 0x78 can't end well
23:16 karolherbst: :)
23:17 karolherbst: I was on my nir branch, dunno if cwabbott_s patch could cause this
23:17 karolherbst: retesting with master
23:17 karolherbst: but I think I know that issue
23:17 imirkin_: iirc you have a bunch of cts patches too
23:17 imirkin_: or not on that branch?
23:18 karolherbst: no, I don't
23:23 karolherbst: okay, it also crashes on plain master
23:24 imirkin_: yaaay
23:25 karolherbst: mhh at least I know in which node it crashes
23:25 karolherbst: maybe there is something obvious
23:27 karolherbst: valgrind is annoyed
23:27 imirkin_: good
23:27 karolherbst: ohhh yeah
23:27 karolherbst: that is bad
23:27 karolherbst: invalid read in getUniqueInsn :)
23:27 karolherbst: https://gist.githubusercontent.com/karolherbst/fac3e49c174929480df008ce90a62779/raw/3a534ff06a9c466d5bd7bbaab8f809dbc4c188a2/gistfile1.txt
23:31 karolherbst: uhm
23:33 karolherbst: imirkin_: some bug inside the spillCodeInserter
23:33 karolherbst: it deletes an instruction although something still references one of its values
23:34 karolherbst: well has to be the def
23:34 imirkin_: yeah
23:34 imirkin_: that can happen :(
23:34 karolherbst: I will put an assert
23:34 imirkin_: i've been trying to figure out how to fix those issues for a long time
23:34 imirkin_: i know what the issue is...
23:35 imirkin_: i just don't know how to fix it
23:35 imirkin_: first part of this was a sad attempt at defeating it: https://github.com/imirkin/mesa/commit/0eddaaea62d8ef150720e87cb8d2ad2f4e4095bf
23:36 imirkin_: (second part should not have made it in there ... dunno what i was testing)
23:36 karolherbst: ahh mhh
23:36 karolherbst: I think this patch causes some errors elsewhere though
23:37 imirkin_: aka "sad attempt"
23:38 karolherbst: I put an assert(insn->getDef(0)->uses.empty()); but it didn't trigger
23:39 karolherbst: :(
23:39 imirkin_: it's tricksy
23:39 imirkin_: i don't fully remember the situation by now
23:40 imirkin_: i just remember being greatly saddened
23:40 karolherbst: ohh, I have an idea
23:40 imirkin_: the main issue is around how RIG node merging is done
23:40 karolherbst: wow, that would be ugly
23:40 imirkin_: and especially *unmerging*
23:40 imirkin_: the unmerging is terminally broken
23:40 imirkin_: and i'm not sure how to fix it
23:40 imirkin_: (and you do unmerging when spilling)
23:41 karolherbst: ha
23:41 karolherbst: insn->getDef(d)->defs.size() == 1 should be true before removing an instruction, no?
23:41 imirkin_: but it's not
23:41 karolherbst: exactly
23:41 imirkin_: because of the unmerge fail.
23:42 karolherbst: mhh
23:42 karolherbst: I am wondering about that actually
23:46 karolherbst: uhm...
23:46 karolherbst: imirkin_: should the spillCodeInserter be doing _anything_ with phi instructions?
23:47 karolherbst: "1511: phi u32 %r3471 %r3470 %r3802 (0)" is the instruction it tries to delete
23:48 karolherbst: mhh wait, actually it makes kind of sense... but
23:49 imirkin_: it's a merged node
23:49 imirkin_: note that it goes through ALL the defs
23:49 imirkin_: and it skips the fake one
23:49 imirkin_: fake ones*
23:54 karolherbst: mhh weird
23:54 karolherbst: there is nothing actually using that def though
23:55 imirkin_: unlikely.
23:55 imirkin_: post-RA is confusing.
23:56 karolherbst: I mean, the instructions using that were already fixed by reading from lmem
23:57 imirkin_: yes
23:57 imirkin_: that is correct.
23:57 imirkin_: it's a really annoying issue.
23:57 mooch2: mwk, which hardware are you emulating in your harddoom branch?
23:58 karolherbst: imirkin_: I hope it isn't something silly like phi defs are stored in some list and we don't clean that up when spilling phi defs?
23:58 imirkin_: it's tricky
23:58 imirkin_: i won't be able to explain it over irc
23:59 imirkin_: you need to do a lot of debugging to get a proper understanding of what's going on
23:59 imirkin_: but ... trust me when i say it's the unspill that's broken
23:59 imirkin_: despite anything else that you might see.