00:07karolherbst: imirkin: can I do something like this? slct u64 $r4d ne u32 $r4d $r6d $r8
00:07imirkin: you have to split it up i think
00:07imirkin: i don't think slct likes u64
00:08karolherbst: that might explain it
00:08karolherbst: do we want to allow it as input?
00:10karolherbst: should I fix something up like this in from_nir, or should it be done in the lowering stages?
00:12imirkin: what are you even trying to do?
00:13karolherbst: I compare a 32bit value against 0 and select one of two 64bit values
00:14karolherbst: it seems like a slct like this exists in PTX
00:14imirkin: hm ok
00:14imirkin: well i don't think that op exists
00:14imirkin: i'd rather it get cleaned up in from_*
00:15karolherbst: nvdisasm says "ICMP.NE.U32 R4, R4, R6, R8;"
00:16karolherbst: let me see what nvidia does
00:19karolherbst: nvidia splits it up as well
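The split discussed above (one 64-bit select done as two 32-bit selects sharing the same condition) can be sketched as follows. This is a minimal Python model of the arithmetic only, not codegen or from_nir code. The helper names (`slct32`, `split64`, `merge64`, `slct64_split`) are hypothetical, and the "pick the first source when the condition is non-zero" reading of `slct ... ne` is an assumption taken from the description in the log.

```python
MASK32 = 0xFFFFFFFF

def slct32(cond, a, b):
    """32-bit select: return a if cond != 0 (assumed 'ne' semantics), else b."""
    return (a if (cond & MASK32) != 0 else b) & MASK32

def split64(value):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return value & MASK32, (value >> 32) & MASK32

def merge64(lo, hi):
    """Recombine two 32-bit halves into one 64-bit value."""
    return ((hi & MASK32) << 32) | (lo & MASK32)

def slct64_split(cond, a, b):
    """64-bit select expressed as two independent 32-bit selects."""
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)
    return merge64(slct32(cond, a_lo, b_lo), slct32(cond, a_hi, b_hi))
```

Both halves use the same 32-bit condition, so the pair always picks the same source and the recombined result equals a true 64-bit select.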
03:14lessthan500: I have a friend who might get me a 50 dollar off coupon to buy a new laptop for 499.99 dollars. I am thinking of buying an Acer Aspire E 15 E5-575G-57D4 notebook. However, I have not found any Debian 9 info about that actual model. If I buy it, I will install Debian 9 on it but under no circumstances proprietary drivers. My main concern is the GeForce 940MX. How well is it supported out of the box by Debian 9 and nouveau?
03:15karolherbst: might be good, might be bad, you could run into issues we didn't fix yet. No way of telling until having the hardware
03:25lessthan500: by hardware you mean a GeForce 940MX card or the Acer Aspire E 15 E5-575G-57D4 notebook?
03:38karolherbst: there are some power management aspects depending on the laptops firmware as well
03:38karolherbst: and then a 940MX != 940MX
03:38karolherbst: you can't trust the model numbers
03:41lessthan500: I am really concerned by your answer, since hardware compatibility is a deal breaker, the Acer Aspire E15 E5-575G-57D4 laptop no longer seems a great buy for the money.
10:34Aristar: well something improved a ton between kernels 4.12 and 4.14 on the nouveau module. not a single crash in 3 days even with 200'ish browser tabs and in QT/Plasma5 (on this old nv50/G86M ). kudos to the devs here and elsewhere :) ((also yes I have compared kernels using identical userspace/X.org/Mesa/DRM which had reproducible hard lockups))
10:46rhyskidd: Aristar: great to hear
11:20enyc: Aristar: i wonder about older nvidia's actually... have many e.g. mx440 which would be nice if video output accel worked alright using nouveau...
11:20enyc: various cards that just aren't supported in nvidia drivers [good riddance to those!] any more either =)
11:47perfinion: wow what a day when spammers cant even change the #OFTC to the right channel or even network
11:55Asu: this shitty bot spammed in #rtlsdr too
13:03feep: I find the patchwork of colors that heads that message very inclusive and appropriate
13:03feep: we're all one big social quilt~
13:04feep: though I'm not sure what skin color the bright turquoise represents?
13:04feep: probably mermaids
13:05feep: excuse me, that was insensitive of me.
13:27liamdawson: oh boy that was an interesting lot of messages and channel invites
16:05imirkin: enyc: are you seeing issues with MX440's with nouveau?
16:14karolherbst: imirkin: mhh, for f2b conversion I do this: "slct u32 %r11 eq f64 0x00000000 0x00000001 %r10d" (I put all immediates into the instruction for demonstration), and I kind of assumed this should just work
16:14imirkin: use set.
16:14karolherbst: the tgsi code uses set
16:15karolherbst: because of f64?
16:15karolherbst: or should I prefer set over slct?
16:15imirkin: depends on the situation
16:15imirkin: in this situation, you should prefer set.
16:15imirkin: (a) nir bools are like tgsi bools -- 0xffffffff which is what set produces
16:15imirkin: (b) f2b should result in false for a denorm
16:16imirkin: (b) f2b should result in false for -0.0 (i.e. 0x80000000)
16:16imirkin: (d) i can't count.
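The points in the list above can be illustrated with a small Python model. It is a sketch under assumptions, not the nouveau implementation: it works on 32-bit float bit patterns (the log's own example, 0x80000000, is a 32-bit pattern even though the instruction quoted earlier uses f64), it models denormals as being flushed to zero, and the function names are hypothetical. It contrasts a raw integer compare of the bits with a float compare that yields the all-ones boolean.

```python
import struct

TRUE = 0xFFFFFFFF   # boolean "true" as an all-ones 32-bit value
FALSE = 0x00000000

def f32_bits(x):
    """Return the IEEE-754 single-precision bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def f2b_bitcompare(bits):
    """Naive version: integer compare of the raw bits against 0."""
    return TRUE if bits != 0 else FALSE

def f2b_set(bits, flush_denorms=True):
    """Float compare against 0.0, producing an all-ones boolean.

    Assumption: denormals are flushed to zero before the compare.
    """
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 0 and (mantissa == 0 or flush_denorms):
        return FALSE  # +0.0, -0.0, and (flushed) denormals
    return TRUE       # normal numbers, infinities, NaN
```

For -0.0 the raw bits are non-zero, so the naive bit compare reports true while the float compare reports false, which is the mismatch the list points out.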
16:16karolherbst: okay, I see
17:14karolherbst: imirkin: mhh, I have a problem I can't really track down. In "firstname.lastname@example.org@execution@clipping@vs-clip-distance-const-reject" I am sure I produce shaders with equal outcome compared to TGSI, and even the headers are equal, still the tail fails. Any ideas?
17:15karolherbst: although I am already wondering if the entire header is printed out or just until 0x4f
17:15karolherbst: but they should be only 0x50 bytes big...
17:21karolherbst: ugh.. the flags aren't printed
17:22karolherbst: I need to set info->io.clipDistances as well
17:22karolherbst: well, I kind of do, but not in all cases as it seems
17:34karolherbst: mhh, that also fixed those interpolation fails, nice
17:56imirkin: karolherbst: you have to enable clip distances... iirc in the shader header
17:56karolherbst: yeah, already found it
17:56karolherbst: I just relied a little bit too much on what gets printed out
17:59karolherbst: ew, how did I end up with this: set u8 $p0 neu u32 $r255 $r255
18:01Bl4ckb0ne: hi, is there any updates about Vulkan support ?
18:02karolherbst: Bl4ckb0ne: one could say I kind of work on it... or at least on the foundation we need to get it
18:03karolherbst: still takes a while though
18:03karolherbst: and I am sure we won't have vulkan in 6 months
18:04karolherbst: I am working on spir-v support for nir, which is kind of required for vulkan, but I won't work on actual vulkan support
18:04Bl4ckb0ne: yeah, the spec looks hard
18:05karolherbst: yeah, nir
18:05karolherbst: it is a new ir in mesa
18:06karolherbst: mesa can convert from SPIR-V to nir
18:06Bl4ckb0ne: oh nice
18:06karolherbst: and if a driver supports nir, it gets spir-v support ;)
18:06karolherbst: more or less
18:06karolherbst: I need to rephrase
18:06karolherbst: I am working on nir support for nouveau... this way it makes sense
18:07karolherbst: anyhow, dealing with nir is much simpler than dealing with spir-v directly
18:07Bl4ckb0ne: im playing with vulkan these days, but i havent got the chance to play with spir-v
18:07Bl4ckb0ne: seems to be quite a piece of work
18:08karolherbst: well usually you don't have to deal with spir-v directly
18:08karolherbst: you just compile your shaders to spir-v
18:15Bl4ckb0ne: I guess
18:35karolherbst: imirkin: mhh, I have a CF problem and I don't really see anything strange in what I generated. A phi node isn't generated for a value, which needs one. I am quite sure that I got the edges wrong, but not what exactly
18:35karolherbst: nir: https://gist.githubusercontent.com/karolherbst/df2af4f81902ed40cd81fdf1fa0c7d7a/raw/f1fa270f01c5d94c1dcd8257be0781d947f59d20/gistfile1.txt
18:35karolherbst: tgsi: https://gist.githubusercontent.com/karolherbst/49ef124312ae104789535f019a447bb1/raw/c6cc354ff9c589b9a6dcd19a0e012fa14df2c266/gistfile1.txt
18:36karolherbst: "35: eq %r16 bra BB:11 (0)"
18:36karolherbst: %r16 should get replaced by a phi value, but isn't
18:38imirkin: when are you printing that out?
18:39imirkin: is this pre-opt/pre-ssa or post-opt?
18:39karolherbst: pre ssa
18:39karolherbst: the first print for DEBUG=3
18:39imirkin: so then that's fine...
18:39imirkin: pre-ssa means ... pre-ssa
18:39imirkin: aka no phi nodes
18:39karolherbst: but while converting to ssa, the phi node doesn't get inserted
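Why a wrong or missing CFG edge can make a phi node disappear can be shown with the textbook placement rule: phi nodes go into the iterated dominance frontier of the blocks that write a value. The sketch below is a generic Python illustration of that rule, not codegen's actual SSA pass. All function names are hypothetical, and the CFG is assumed to be a dict that lists every block as a key.

```python
def dominators(succ, entry):
    """Iterative dominator sets; succ maps every block to its successors."""
    nodes = set(succ)
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, preds

def dominance_frontier(succ, entry):
    """DF(x): blocks y where x dominates a predecessor of y but not y strictly."""
    dom, preds = dominators(succ, entry)
    df = {n: set() for n in succ}
    for y in succ:
        for p in preds[y]:
            for x in dom[p]:
                if x == y or x not in dom[y]:
                    df[x].add(y)
    return df

def phi_blocks(succ, entry, def_blocks):
    """Blocks needing a phi for a value written in def_blocks."""
    df = dominance_frontier(succ, entry)
    result, work = set(), list(def_blocks)
    while work:
        block = work.pop()
        for y in df[block]:
            if y not in result:
                result.add(y)
                work.append(y)
    return result

# Diamond CFG: 0 -> {1, 2}, both rejoin in 3; the value is written in 0 and 1.
diamond = {0: [1, 2], 1: [3], 2: [3], 3: []}
# Same CFG with the 1 -> 3 edge left out.
broken = {0: [1, 2], 1: [], 2: [3], 3: []}
```

With the full diamond, `phi_blocks(diamond, 0, {0, 1})` places a phi in block 3. With the edge dropped, block 3 has a single predecessor, its dominance frontier membership vanishes, and no phi is placed even though the value is written on two paths.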
18:41imirkin: is eq %r16 a thing btw?
18:42imirkin: don't you have to make it into a FILE_PREDICATE?
18:43karolherbst: it is
18:43karolherbst: it gets converted into a predicate later
18:43karolherbst: from_tgsi does the same
18:43karolherbst: eq %r1 bra BB:9 (0) in the TGSI version
18:44karolherbst: it compares against 0 aka false
18:54karolherbst: trying to visualize the flow: https://gist.githubusercontent.com/karolherbst/c4a7c59d85258ef35a492e3e0ab0466f/raw/8aaa025f6bf6ce36673a717fd6e9012019079dfe/gistfile1.txt
18:55karolherbst: nir gives me empty else branches, so that's why there are a few more BBs in the nir version
18:58imirkin: so you have to be a little careful with edge types
18:58imirkin: the way tgsi generates them isn't exactly to spec
18:58imirkin: which is unfortunate
18:58imirkin: but i've been too chicken to change it to be "correct"
18:59karolherbst: I noticed
18:59karolherbst: I tried to be more like spec, but everything broke
18:59karolherbst: so now I try to be like tgsi
19:00karolherbst: I think I should take a deeper look at what TGSI does when there is an else branch
19:00karolherbst: maybe something gets fixed up or produces different edges or something...
19:02karolherbst: anyway, the write in BB:0 isn't "detected" lets say
19:02karolherbst: and I don't see why it wouldn't
19:03karolherbst: maybe BB:2 -> BB:6 already needs to be a forward edge? would be strange though
19:04dupondje: whats a quick & easy tool to check performance of nouveau vs nvidia driver?
19:04dupondje: want to see what difference there is on my laptop :)
19:04karolherbst: run something with nouveau and then run it with nvidia ;)
19:04karolherbst: there is no tool for it
19:05dupondje: I have no games :)
19:05karolherbst: well do you need performance then?
19:05karolherbst: or are you just interested in general?
19:07dupondje: yea indeed, general performance :)
19:10dupondje: ./GpuTest /test=fur /width=800 /height=600 /benchmark
19:10dupondje: found this :D
19:11dupondje: 22 FPS for my integrated card (Intel)
19:11dupondje: 10 FPS for nouveau
19:11dupondje: ah well :)
19:14imirkin: dupondje: which GPU?
19:14imirkin: did you clock it up?
19:14imirkin: by default it'll be super-slow
19:15imirkin: echo 0f > /sys/kernel/debug/dri/1/pstate
19:15dupondje: I didnt touch anything :)
19:17dupondje: what clock does it run by default?
19:19imirkin: whatever it boots to
19:19imirkin: either the lowest or second-lowest perf level
19:24karolherbst: on maxwell it is usually the lowest
19:24karolherbst: with furmark we are pretty close to nvidia
19:24karolherbst: dupondje: try pixmark_piano
19:24dupondje: that killed my laptop :P
19:24karolherbst: it didn't
19:24karolherbst: it just got slow ;)
19:24karolherbst: very much so
19:25dupondje: [104055.799716] watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [kworker/7:0:10152]
19:25karolherbst: well right
19:25dupondje: had to reboot :( the echo 0f was hanging
19:25karolherbst: yeah, that is inconvenient
19:25karolherbst: you can only do that while your GPU is awake
19:25karolherbst: aka not suspended
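The "only while the GPU is awake" caveat can be turned into a guard before writing the pstate file. The following is a minimal Python sketch, not a supported tool. The pstate path is the one quoted in the log, and the `runtime_status` path is an assumption based on the PCI address that appears in the kernel messages here (0000:01:00.0); both differ per machine. The function name `set_pstate` is hypothetical, and the check is inherently racy since the GPU could still suspend between the read and the write.

```python
from pathlib import Path

# Assumed paths; adjust the DRI index and PCI address for the actual system.
PSTATE = "/sys/kernel/debug/dri/1/pstate"
RUNTIME_STATUS = "/sys/bus/pci/devices/0000:01:00.0/power/runtime_status"

def set_pstate(level, pstate_path=PSTATE, runtime_status_path=RUNTIME_STATUS):
    """Write a perf level to pstate, refusing if the GPU is not runtime-active."""
    status = Path(runtime_status_path).read_text().strip()
    if status != "active":
        # Writing while the GPU is suspended is what hung the machine in the log.
        raise RuntimeError(f"GPU runtime status is {status!r}, not writing pstate")
    Path(pstate_path).write_text(level + "\n")
```

Calling `set_pstate("0f")` as root mirrors the `echo 0f > ...` command from the log, but bails out early when the kernel reports the device as anything other than `active`.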
19:26dupondje: well realized that after running the command :)
19:26dupondje: then it was too late :D
19:27imirkin: oh crap, sorry, forgot to mention that :(
19:28imirkin: karolherbst: you really need to get skeggsb to take your patches
19:28karolherbst: I know
19:28karolherbst: he knows
19:28karolherbst: he tries to review those for the next window afaik
19:32karolherbst: imirkin: ohh by the way, I guess it is right to say that nvidia usually JIT compiles everything, even in the context of CUDA and getting PTX binaries, right? Just wondering because I had a discussion with somebody and that person kind of assumed that Nvidia doesn't do JIT
19:32dupondje: Dec 17 20:27:20 lt-jeanlouis kernel: [ 322.945183] nouveau 0000:01:00.0: DRM: evicting buffers...
19:32dupondje: Dec 17 20:27:25 lt-jeanlouis kernel: [ 328.065181] asynchronous wait on fence nouveau:GpuTest:e1810722 timed out
19:32dupondje: Dec 17 20:27:25 lt-jeanlouis kernel: [ 328.065183] asynchronous wait on fence nouveau:GpuTest:e181071f timed out
19:32dupondje: Dec 17 20:27:25 lt-jeanlouis kernel: [ 328.065234] asynchronous wait on fence i915:gnome-shell/1:ead timed out
19:32dupondje: Dec 17 20:27:35 lt-jeanlouis kernel: [ 338.049266] ------------[ cut here ]------------
19:32dupondje: Dec 17 20:27:35 lt-jeanlouis kernel: [ 338.049292] WARNING: CPU: 3 PID: 581 at /build/linux-tt6jd0/linux-4.13.0/drivers/gpu/drm/nouveau/nouveau_bo.c:1216 nouveau_bo_move_ntfy+0x9b/0xa0 [nouveau]
19:32dupondje: and dead :P
19:33dupondje: known issue (and maybe fixed in newer kernels? or ?)
19:33dupondje: did go to 54 fps for some seconds btw :D
19:33karolherbst: most likely something not implemented in memory reclocking
19:34karolherbst: we are kind of aware of what is missing
19:34karolherbst: just need time to implement it
19:34dupondje: it doesn't clock to the highest clock because of that missing firmware issue?
19:35imirkin: no, missing firmware is a thing for GM20x+
19:35karolherbst: imirkin: .... okay, that might be a codegen bug in the end...
19:36imirkin: karolherbst: PTX isn't the nvidia ISA. they do include pre-compiled versions of PTX to some isa's inside those cubin files or whatever. i dunno how much online recompilation they do.
19:36dupondje: imirkin: ah ok :) so why doesnt it clock higher then by default? :)
19:36karolherbst: imirkin: right, but if you don't have the machines isa, you need to compile, right?
19:36dupondje: karolherbst: need full stacktrace? or
19:36imirkin: dupondje: we don't control that
19:36karolherbst: just asking to be 100% sure
19:36imirkin: karolherbst: yes.
19:36imirkin: it is not *at all* a 1:1 conversion
19:36karolherbst: I know
19:37imirkin: it's obviously closer than, say, fortran
19:37karolherbst: I just wanted to check that what I answered wasn't bs
19:37imirkin: but ptx doesn't have registers, etc
19:37karolherbst: uhm... well it has the .reg type
19:37karolherbst: but yeah
19:37karolherbst: in the end PTX is too high level to be considered an assembly language
19:37karolherbst: even though they try to say it is
19:38imirkin: yeah. it's a fairly low-level language which exposes a lot of the ISA weirdness that would be a pain for a compiler to auto-use
19:39imirkin: but it still requires a proper compilation phase, and they *definitely* run opts on it
19:39dupondje: FYI :)
19:39karolherbst: I also called a PTX to SASS thing a compiler for that reason
19:39karolherbst: not an assembler
19:39karolherbst: because it clearly isn't
19:40karolherbst: imirkin: okay thanks for that answer :)
19:41imirkin: but obviously what one calls a compiler vs assembler is in the eye of the beholder...
19:41imirkin: by today's standards, the early C compilers were more like assemblers
19:41karolherbst: but ptxas is more complex than those
19:41karolherbst: so yeah
19:42karolherbst: anyway, it is kind of important in the end to have clarified that
19:42imirkin: dupondje: hm, yeah, some cross-driver sync issue between i915 and nouveau
19:42imirkin: i think i've heard of those, but i dunno if they've been resolved
19:44dupondje: guess ill try some fresher kernel once :)
19:45dupondje: WARNING: Power management is a very experimental feature and is not expected to work. If you decided to upclock your GPU, please aknowledge that your card may overheat. Please check the temperature of your GPU at all time! => thats still valid? :)
21:11Toughy: Hello, I have some problem with my GeForce 710M card in an optimus setup
21:11Toughy: After trying to install / uninstall nvidia drivers
21:11Toughy: I now get these errors in system log when I open the compute device in opencl:
21:11Toughy: 12/17/17 11:07:26 PM nouveau 0000 1:00.0: DRM: resuming kernel object tree...
21:11Toughy: 12/17/17 11:07:26 PM nouveau 0000 1:00.0: bus: MMIO write of ffffff1f FAULT at 6013d4 [ IBUS ]
21:11Toughy: 12/17/17 11:07:26 PM nouveau 0000 1:00.0: DRM: resuming client object trees...
21:11Toughy: 12/17/17 11:07:32 PM ACPI Warning \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160831/nsarguments-95)
21:11Toughy: 12/17/17 11:07:32 PM ACPI \_SB_
21:13imirkin: Toughy: that's not unexpected
21:13Toughy: And then mesa opencl icd says there are no compute devices
21:13Toughy: It used to work before, and I don't know what I did to break it
21:13imirkin: opencl + nouveau never worked
21:15Toughy: But I was able to enumerate devices in the Mesa platform ...
21:15imirkin: it's conceivable that the icd returned a device of sorts
21:15imirkin: but it would not have worked
21:17Toughy: Ok, thank you
21:40karolherbst: imirkin: could it be that codegen doesn't like empty else branches?
21:41karolherbst: mhh, although no, it looks fine on a second look
21:44karolherbst: noooo... I found the issue, meh
21:44karolherbst: BB:3 was missing another edge
21:45karolherbst: I modified the TGSI a little and added empty else branches
21:45karolherbst: I used exactly the same edges, just one edge is missing
21:56karolherbst: uhm wait a second... it gets added, I just removed it in my paste
22:00karolherbst: imirkin: I guess that terminated = thing is also kind of important? I didn't really give it any attention
22:02karolherbst: mhhh, I thought there is some special flag, besides fixed
22:03imirkin: conceivable, but i don't remember
22:03imirkin: look at what tgsi does
22:03imirkin: and do the same thing.
22:04karolherbst: but that gets only set for the very last instruction, aka exit
22:04karolherbst: mhh, let me check something