00:00 imirkin_: i wonder if it should also happen with mov's that have >32bit sources
00:00 pmoreau: vB (which is %104) is indeed not a compound, as it is never split nor merged, as 64-bit adds and movs seem to be split post-RA. But vA, (which is %105) is compound.
00:01 pmoreau: Maybe 64-bit add should be split pre-RA? I’m not sure why it’s done post-RA, on the contrary to other ops.
00:01 imirkin_: yea dunno
00:02 pmoreau: Here is the info you asked for: https://hastebin.com/lipaticeru.rb
00:04 imirkin_: right, so 104 comes from that phi
00:04 imirkin_: i think the 64-bit support for things is being somewhat abused
00:04 imirkin_: although that's curious ... hmmm
00:05 imirkin_: the real fix is for phi to normalize the whole compoundedness madness
00:05 imirkin_: which it doesn't seem like it does
00:06 imirkin_: this is when i wish calim were still around :)
00:06 pmoreau: :-)
00:06 pmoreau: I can try to understand what is going on, but I am quite sure you still are way more familiar with that code than I am.
00:07 imirkin_: anyways, in theory this is all legal code that the RA doesn't deal properly with
00:07 imirkin_: in practice, you're bringing this on yourself with the 64-bit phi's
00:07 imirkin_: you can either fight the power, or you can kneel before your 32-bit overlords
00:08 pmoreau: I wouldn’t be against those 64-bit values being split into 32-bit ones.
00:08 imirkin_: just means you have to throw in some splits + merges
00:08 imirkin_: i have an opt pass that eliminates redundant ones
00:09 pmoreau: I can try that
00:10 imirkin_: (since tgsi generates them around every 64-bit op)
00:10 imirkin_: (and it seemed silly to have a ton of split-immediately-followed-by-merge ops)
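A minimal sketch of what a pass like the one imirkin_ describes might do, on a toy IR rather than nvir. The `Op` tuple and `eliminate_redundant_merges` are hypothetical names for illustration, not Mesa code: a merge whose inputs are exactly the two halves of an earlier split is replaced by the original 64-bit value, and the split is dropped once nothing uses its halves.

```python
from typing import NamedTuple


class Op(NamedTuple):
    name: str    # "split", "merge", or any other opcode
    dsts: tuple  # values this op defines
    srcs: tuple  # values this op reads


def eliminate_redundant_merges(ops):
    """Forward merge(split(x)) to x and drop splits that become dead."""
    split_src = {}  # (lo, hi) halves -> the 64-bit value they came from
    rename = {}     # merge result -> original 64-bit value
    out = []
    for op in ops:
        srcs = tuple(rename.get(s, s) for s in op.srcs)
        if op.name == "split":
            split_src[op.dsts] = srcs[0]
        if op.name == "merge" and srcs in split_src:
            # This merge just rebuilds a value we already have.
            rename[op.dsts[0]] = split_src[srcs]
            continue
        out.append(Op(op.name, op.dsts, srcs))
    used = {s for op in out for s in op.srcs}
    # A split is dead if neither half is read any more.
    return [op for op in out
            if not (op.name == "split" and not used & set(op.dsts))]


program = [
    Op("add", ("a",), ("x", "y")),
    Op("split", ("lo", "hi"), ("a",)),
    Op("merge", ("b",), ("lo", "hi")),
    Op("mul", ("c",), ("b", "z")),
]
optimized = eliminate_redundant_merges(program)
```

In this toy version the `mul` ends up reading `a` directly, which is also why running such a pass before RA can leave a non-compound 64-bit value behind.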
00:18 pmoreau: I need to disable that pass though, as it occurs before RA, so I was left with a non-compound still :-D
00:19 imirkin_: crap
00:19 pmoreau: But I no longer hit the assert, and the program compiles properly
00:19 imirkin_: so this can happen with tgsi?
00:19 imirkin_: do you still have 64-bit phi's?
00:19 imirkin_: you should have 32-bit phi's
00:19 imirkin_: which should prevent SOME of those merges from getting eliminated
00:21 pmoreau: I don’t have enough backlog on the console to get back to the program pre-RA --"
00:21 pmoreau: one sec
00:21 imirkin_: https://hastebin.com/ihonogaqus.pl
00:21 imirkin_: oh. &| less
00:24 pmoreau: I still have 64-bit phi
00:24 imirkin_: "don't do that"
00:27 pmoreau: If you had something like `... uint64_t a = 0x0000003400000012; if (whatever) a = 0x43000000000043; ...` in a shader, how does TGSI handle it?
00:28 imirkin_: you can answer that question yourself fairly easily
00:28 imirkin_: but TGSI is all expressed in terms of 32-bit quantities
00:28 imirkin_: and the from_tgsi logic keeps that in mind
00:28 imirkin_: so it loads a pair of 32-bit values, merges them, performs the op, and splits the result back into a pair of 32-bit values
00:28 pmoreau: True, just run it with shader_runner
00:29 imirkin_: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp#n4039
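The load-pair, merge, operate, split pattern described above can be sketched with plain integers. This only illustrates the arithmetic; `split64`, `merge64` and `add64_on_pairs` are made-up helper names, not functions from nv50_ir_from_tgsi.cpp.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def split64(value):
    """Return the (lo, hi) 32-bit halves of a 64-bit value."""
    return value & MASK32, (value >> 32) & MASK32


def merge64(lo, hi):
    """Rebuild a 64-bit value from its (lo, hi) 32-bit halves."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


def add64_on_pairs(a, b):
    """64-bit add whose inputs and output are (lo, hi) pairs:
    merge both operands, do the 64-bit op, split the result again."""
    result = (merge64(*a) + merge64(*b)) & MASK64
    return split64(result)


# The two constants from pmoreau's question, as 32-bit pairs.
first = split64(0x0000003400000012)
second = split64(0x43000000000043)
```

Everything outside `add64_on_pairs` only ever sees 32-bit quantities, which is the property the TGSI frontend relies on.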
00:31 pmoreau: Indeed, I should try that
00:32 imirkin_: that probably doesn't map too well onto spirv
00:32 imirkin_: since you probably get the stupid 64-bit phi directly
00:32 imirkin_: although ... you have a pass to de-phi stuff right?
00:33 pmoreau: Right, I de-SSA'fy the program before feeding it to Nouveau
00:34 imirkin_: ok
00:34 imirkin_: so then yeah ... try to do stuff as 32-bit things
00:35 imirkin_: but really phi nodes should be able to handle one side being a compound and the other one not
00:38 pmoreau: Thank you for your help!
01:08 Lyude: [98347.512702] nouveau 0000:03:00.0: fifo: read fault at 0006446000 engine 00 [GR] client 04 [GPC0/T1_1] reason 02 [PTE] on channel 13 [007f255000 systemd-logind[1230]] <-- does PTE here mean "Page table exception"?
01:09 imirkin_: mmmm
01:09 imirkin_: no
01:09 imirkin_: i think it's "missing page table entry"
01:09 imirkin_: it can also say [PGD] which is ... i'll let you guess
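For comparing fault lines across several reports, a small parser along these lines may help. The regular expression is written against the single line Lyude pasted and is only assumed to fit other faults; the `REASON_NOTES` table holds just what was said in the chat.

```python
import re

# Pattern derived from the one example line above; other kernels or
# fault types may print a different layout.
FAULT_RE = re.compile(
    r"fifo: (?P<access>\w+) fault at (?P<addr>[0-9a-f]+) "
    r"engine (?P<engine_id>[0-9a-f]+) \[(?P<engine>[^\]]*)\] "
    r"client (?P<client_id>[0-9a-f]+) \[(?P<client>[^\]]*)\] "
    r"reason (?P<reason_id>[0-9a-f]+) \[(?P<reason>[^\]]*)\] "
    r"on channel (?P<channel>\d+)"
)

# Only the meaning given in the conversation is recorded here.
REASON_NOTES = {"PTE": "missing page table entry"}


def parse_fault(line):
    """Return the fields of a fifo fault line as a dict, or None."""
    match = FAULT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["addr"] = int(fields["addr"], 16)
    fields["channel"] = int(fields["channel"])
    fields["note"] = REASON_NOTES.get(fields["reason"], "unknown")
    return fields


example = ("[98347.512702] nouveau 0000:03:00.0: fifo: read fault at "
           "0006446000 engine 00 [GR] client 04 [GPC0/T1_1] reason 02 "
           "[PTE] on channel 13 [007f255000 systemd-logind[1230]]")
fault = parse_fault(example)
```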
01:10 Lyude: mm, close enough, two people have poked me as of late with nouveau issues, one is kepler1 and the other is kepler2
01:10 Lyude: think there might be a bug here
01:11 Lyude: output is different though, so i can't say for sure
01:11 imirkin_: yeah, it's a use-after-free or something
01:11 imirkin_: or use-without-saying-we-use-it
01:12 imirkin_: somewhere in the GL driver
01:12 imirkin_: i think T1_1 == texture unit
01:12 imirkin_: not 100% sure
01:14 imirkin_: also those errors go back to since before the dawn of time
01:14 imirkin_: however apparently people think it's funny to use GL to accelerate 2D rendering that's faster to do with the cpu anyways
01:15 imirkin_: so the issues get hit a lot more
01:16 imirkin_: (and they do it in ways that are very different than most games do, so ... yeah. much sad.)
01:16 imirkin_: basically if i see anything relating to systemd or kde or gnome in a bug, i just ignore it
01:17 imirkin_: not enough hours in the day. esp not mine.
01:17 skeggsb: Lyude: is this on f26?
01:20 imirkin_: skeggsb: not sure how github comments work, but iirc i left one or two comments in your fermi series
01:20 skeggsb: Lyude: if so: https://github.com/skeggsb/nouveau/commit/689c51f14b7966595fd07a20c9b1b9d06de18bdd#diff-80583d7615163257812eee78fab2e279
01:20 imirkin_: ideally you get emails about them.
01:20 skeggsb: Lyude: make sure the kernel that's being used is new enough to have that
01:20 skeggsb: imirkin_: yeah, i've seen them, thanks :)
01:20 imirkin_: ok cool.
01:20 skeggsb: + your LUT stuff
01:20 skeggsb: there's unexplained issues there too that i'd like to figure out properly before we do anything
01:21 imirkin_: such as?
01:21 skeggsb: ie. 8bpp doesn't work on >=gf119 still even if you switch to lores
01:21 imirkin_: uhhhh
01:21 imirkin_: really?
01:21 imirkin_: works on my GK208
01:21 skeggsb: what'd you use to test?
01:21 imirkin_: modetest
01:21 imirkin_: with my patches to add support for C8
01:21 imirkin_: (which are on the ML)
01:21 skeggsb: i'm using fbcon, so, not sure if there's something else wrong there too
01:22 imirkin_: well, it seems like the first modeset is extra-bad
01:22 imirkin_: which isn't helping
01:22 skeggsb: i think there's stuff missing from your log for some reason, because EVO wouldn't allow the base channel update if all that stuff was missing from the core channel
01:22 imirkin_: perhaps
01:22 imirkin_: the log was started when i ran modetest
01:22 imirkin_: the previous "position" of the monitor was off
01:23 imirkin_: but i didn't have the option on when the nouveau module loaded
01:23 skeggsb: ah, right
01:24 imirkin_: you probably know, but i have multiple GPUs plugged in
01:24 imirkin_: and i was reloading the nouveau module a bunch
01:24 imirkin_: but the GPU in question is *not* the primary GPU
01:24 imirkin_: [the primary is GK208, secondary is G92 ... and NV17 + NV5, but that's not relevant here]
01:31 skeggsb: imirkin_: how do i make modetest select C8? despite being in the list, it's telling me "unknown format C8"
01:31 imirkin_: you have to apply my patch
01:35 skeggsb: interesting... that appears to be correct..
01:35 * skeggsb wonders why fbcon doesn't show up unless messing with LUT values
01:36 imirkin_: the real question is ... wtf is the diff between low res and hi res
01:36 imirkin_: i thought it was 64 (or 128) values vs 256
01:37 imirkin_: but ... for C8 ... what does it do?
01:37 skeggsb: not sure.. but it's also weird that on >=gf119 the HIRES works for 8bpp, but needs to be LORES on earlier boards
01:37 imirkin_: indeed it is :)
01:38 skeggsb: i'd also like to know wtf that +0x6000 is all about
01:38 imirkin_: oh
01:38 imirkin_: so ... if you look at the name
01:39 imirkin_: that's the range of the thing
01:39 imirkin_: it's just a bias or somethin
01:39 imirkin_: and the reason that it was a 0x20 stride was that it's actually a 1024 lut
01:39 imirkin_: not 256 - you were filling in every 4th value :)
01:45 imirkin_: oh, and i still haven't sent my "wtf to do about non-256-sized LUTs" question
01:46 skeggsb: yeah, i knew why there was a 0x20 stride :)
01:46 skeggsb: just not why we +0x6000 to each entry
01:46 imirkin_: if you look in the display class headers
01:46 imirkin_: they talk about different ranges
01:47 imirkin_: i'm guessing it's a property of the range of that lut type
01:47 skeggsb: ah, yes, sorry
01:47 imirkin_: not that that's a perfect explanation, but ... :)
01:47 skeggsb: it'll do :)
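The stride arithmetic from the LUT discussion can be checked quickly. The 0x20 stride, the 1024-entry table and the 0x6000 bias come from the chat; the 8-byte entry size is an assumption made here so that a 0x20-byte stride lands on every 4th entry, and what the bias means is left open, as it was in the conversation.

```python
LUT_ENTRIES = 1024   # from the chat: "it's actually a 1024 lut"
STRIDE_BYTES = 0x20  # from the chat: the stride used when filling it
LUT_BIAS = 0x6000    # from the chat: added to each entry, meaning unclear
ENTRY_BYTES = 8      # ASSUMPTION: size of one LUT entry in bytes


def lut_slot(i):
    """Index of the hardware LUT entry written for software entry i."""
    return i * STRIDE_BYTES // ENTRY_BYTES


def biased(value):
    """Apply the per-entry bias; how value is scaled is not specified."""
    return value + LUT_BIAS


slots = [lut_slot(i) for i in range(256)]
```

With these numbers a 256-entry software table touches slots 0, 4, 8, ..., 1020, which matches "you were filling in every 4th value".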
01:54 imirkin_: btw, don't know if you have the hw for it, but if you do - hdmi 2.0 might be a nice quick win
01:55 imirkin_: i'm sure it's just a couple of bits that need to be toggled somewhere
01:55 imirkin_: (i have a tv, but no hw that does 2.0)
01:57 skeggsb: i have the opposite :P
01:57 imirkin_: ok, so we just need a really long hdmi cable
03:17 imirkin: karolherbst: i don't suppose you had a chance to try out the bindless patchset with DOW3?
04:58 karolherbst: imirkin: not yet
09:44 SciresM: Hey. Anyone here familiar with TSEC/falcon ucode?
12:18 karolherbst: pmoreau: were you able to start to work on that LoweringHelper?
12:56 karolherbst: uhh nice, I get the same tessellation fails with and without nir in unigine heaven
12:58 karolherbst: and on master the issue looks different...
13:29 karolherbst: mupuf: tell your CEO to not be a douchebag please :p
13:30 karolherbst: selling shares before making that CPU sec bug public, not cool
14:12 mupuf: karolherbst: lol
14:50 karolherbst: mupuf: next time, don't tell management :D
15:57 karolherbst: pmoreau: https://github.com/karolherbst/mesa/commit/c6000147625af634276ac01ac6a3f807050a9f0b :)
15:59 pmoreau: Awesome! I wanted to ask you whether you had made any progress on it! :-)
16:00 karolherbst: well, currently I am moving stuff from my nir alu commit into it :)
16:00 karolherbst: I doubt I get to fix all the other 64 bit ops now, because I want to get those patches merged first before adding new features
16:00 imirkin_: why isn't this going into buildutil?
16:01 imirkin_: you know what ... nevermind.
16:01 karolherbst: imirkin_: do we really want to put everything in there?
16:01 imirkin_: i don't have time to think about this now
16:01 karolherbst: I mean, I don't care that much where to put it in the end
16:01 karolherbst: k
16:12 karolherbst: by the way: is pramin the only way the GPU can access sysram through the mmio regs? or is there some other fun regs one might be able to use
16:12 imirkin_: huh?
16:12 imirkin_: GPU can access sysram via dma...
16:12 karolherbst: well right
16:13 karolherbst: but imagine you are on an arm platform and have only the iommu to map the GPU regs into your address space
16:13 karolherbst: and pramin doesn't work
16:13 imirkin_: uhm
16:13 karolherbst: and you can't be root :)
16:13 imirkin_: you're confusing a number of things here
16:13 imirkin_: ok, you want to do mmio writes from non-privileged userspace?
16:14 karolherbst: yeah
16:14 imirkin_: and without a driver that helps you along with that?
16:14 imirkin_: if you have a driver loaded, it can expose whatever it wants to userspace
16:14 karolherbst: right
16:14 karolherbst: there is that nvgpu driver ;)
16:15 imirkin_: not sure how arm vs x86 change things
16:15 imirkin_: the answer is plainly that "you can't"
16:15 imirkin_: if you're not CAP_SYS_ADMIN you don't have raw access to devices
16:15 karolherbst: well, right
16:16 karolherbst: well I talked with some switch guy and they managed to map the mmio stuff into userspace, but it seems like the pramin stuff is disabled
16:16 imirkin_: i think PRAMIN is disabled on tegra, period
16:16 imirkin_: it has no vram
16:16 karolherbst: okay
16:17 karolherbst: but are there other areas where one might be able to access sysram?
16:17 imirkin_: this is a security thing?
16:17 imirkin_: i.e. if you do this, will userspace be able to access all memory?
16:17 imirkin_: then the answer is definitely 'yes'
16:17 karolherbst: well, the idea is to kind of "root" the switch
16:17 imirkin_: although with an iommu ... only the things that the iommu allows
16:18 imirkin_: anyways ... going back to my original point, i don't have time right now :)
16:18 karolherbst: k :)
19:07 karolherbst: pmoreau, mupuf: https://fosdem.org/2018/schedule/event/nouveau/
19:08 pmoreau: Cool! :-)
19:29 Lyude: skeggsb: no, they seem to be running f27
19:29 Lyude: skeggsb: full kernel log here
19:29 Lyude: https://gist.githubusercontent.com/nrr/abed8234b53f144f5cd25287064e1422/raw/43a943b0e31ea01b01895e96514e6ae33d1eb791/full-dmesg
19:37 karolherbst: Lekensteyn: I see you made some progress on the HDMI audio stuff :)
20:07 Lekensteyn: imirkin_: should not be different between battery plugged in or not. If it does, then I suspect that an old version of TLP is installed which changes the runtime PM settings for a device. https://github.com/linrunner/TLP/issues/244
20:07 Lekensteyn: karolherbst: no progress AFAIK :(
20:07 Lekensteyn: no new information, so nothing to act on
20:07 karolherbst: Lekensteyn: mhh, weird
20:07 karolherbst: then somebody used your copyright or you wrote some kernel module
20:08 karolherbst: Lekensteyn: https://bugs.freedesktop.org/show_bug.cgi?id=75985#c21 and following
20:08 karolherbst: last two attachments
20:12 Lekensteyn: karolherbst: it looks like someone used bbswitch as base
20:13 karolherbst: yeah, makes sense
20:13 imirkin_: Lekensteyn: "TLP"?
20:13 imirkin_: wow, people do all kinds of crazy
20:13 karolherbst: but that nouveau patch looks interesting
20:14 karolherbst: there is just this silly "if (drm->client.device.info.chipset != 0x134)" check
20:15 mupuf: karolherbst: darn, you are fast :D
20:15 karolherbst: especially this: pci_scan_single_device(pdev->bus, 1);
20:16 karolherbst: mupuf: how so? :D I was still 3 minutes too late
20:16 karolherbst: or did you mean the fosdem thing?
20:16 mupuf: I meant FOSDEM, yes
20:16 karolherbst: luc wrote to me
20:16 Lyude: btw skeggsb nrr is the one who was having the issues on their f27 machine
20:17 mupuf: :)
20:17 nrr: o/
20:17 Lekensteyn: imirkin_: TLP just changes configuration (in sysfs) iirc, trying to save power (I just use udev rules instead)
20:17 imirkin_: yeah
20:17 imirkin_: sounds like powertop, but different
20:17 karolherbst: Lekensteyn: do you want to look into that nouveau patch?
20:18 karolherbst: Lekensteyn: it looks really promising
20:18 Lekensteyn: karolherbst: sure
20:19 Lekensteyn: one thing to watch out for is that it can result in more power usage (the PCI root port has two devices: GPU and audio functions)
20:20 Lekensteyn: when I tried to wire it up a while ago, I had two issues: (1) by default the audio function had no power saving enabled (2) I somehow ended up in an oops/broke something I cannot remember
20:21 karolherbst: well
20:21 karolherbst: functionality/stability > power consumption
20:21 karolherbst: if this ends up fixing HDMI audio for everybody, then so be it
20:22 karolherbst: mupuf: I think it would be nice to have that NIR stuff merged until FOSDEM :D
20:25 mupuf: hehe, better would be a submission to Khronos for 4.4 :D But whatever works :p
20:27 imirkin_: whether the NIR stuff makes it in or not, it's probably time well spent for Karol as he learns the ins and outs of both isa's and nvir.
20:27 karolherbst: :)
20:27 karolherbst: well
20:27 karolherbst: yeah
20:27 mupuf: Any time spent on nouveau is a win... given how much love it needs :)
20:28 imirkin_: (not to mention how shaders work and hook up to one another and what all the options are)
20:28 imirkin_: i learned a *ton* implementing GS on nv50
20:28 imirkin_: even though most of the support was already there
20:29 karolherbst: well, I still would like to see it merged
20:29 imirkin_: but my point is that the fact that you're learning about all this stuff is a huge benefit
20:29 karolherbst: I would rather have both SPIR-V -> NIR -> NVIR and SPIR-V -> NVIR than just one of those
20:30 imirkin_: since i obviously don't have the time to do things i once did
20:30 karolherbst: right
20:30 airlied: moving off tgsi may have some other benefits
20:30 karolherbst: you already said that your time will be getting more and more limited
20:30 airlied: esp avoiding glsl to tgsi
20:31 * imirkin_ likes glsl to tgsi :p
20:31 airlied: imirkin_: no you dont
20:31 karolherbst: airlied: well currently the nir to nvir thing will do more harm to the generated shaders
20:31 imirkin_: eh. it's simple and i understand it. or can figure it out.
20:31 karolherbst: airlied: but I think with nir we get linked optimized shaders by default, right?
20:31 airlied: imirkin_: how many times have you tried to remove renumber?
20:31 karolherbst: at least it looked like this
20:32 imirkin_: airlied: yeah, it's not perfect :) but what is
20:32 karolherbst: I noticed with glxgears, that I get a lot of recompiles, like for every frame or so
20:32 karolherbst: or reuploads of the binary
20:32 airlied: imirkin_: just saying your like of it may be newfound :-)
20:33 airlied: born again glsl-tgsi
20:33 imirkin_: airlied: i dunno. there's obviously parts of it that i hate
20:33 imirkin_: which are over-complicated, etc
20:33 imirkin_: iirc when i tried to remove some of the stuff it ended up having undesired effects
20:33 imirkin_: not on correctness
20:33 karolherbst: well, with nir -> nvir we would get something less complicated than tgsi -> nvir
20:33 imirkin_: but on the overall runtime / whatever
20:34 karolherbst: the complete texture and CFG stuff is a lot less painful with nir than with tgsi
20:35 airlied: nir also gives some hope on 16bit
20:35 karolherbst: regarding 16bit I am more worried about codegen than nir vs tgsi
20:35 imirkin_: should be straightforward in tgsi... just tag temp's as 16-bit or 32-bit
20:44 airlied: imirkin_: nobody wants to do it
20:45 imirkin_: airlied: i'll happily do it when i have a way of testing it all the way down to hardware
20:45 airlied: imirkin_: i thought you had limited time :-p
20:45 karolherbst: imirkin_: when do you expect this to happen?
20:45 imirkin_: airlied: i do. it won't take very long to do though... day maybe.
20:46 imirkin_: and it strikes me as a fun project, which is what tends to get prioritized in my limited time :p
20:46 karolherbst: imirkin_: I don't trust your estimates on time :) you also said adding nir support will take one afternoon ;)
20:46 imirkin_: but afaik 16-bit is only a thing on pascal [and newer]
20:46 imirkin_: karolherbst: i said it should take one afternoon. and i stand by that.
20:46 imirkin_: karolherbst: note that in the process you had to learn a *ton* of stuff about ... all sorts of things.
20:46 imirkin_: so it's taking you longer
20:47 karolherbst: well, I mainly learn about nir
20:47 imirkin_: and how shaders pass data around
20:47 karolherbst: but I still doubt you would be able to do it in around 6 hours
20:47 imirkin_: and how nvir expects things on input
20:47 imirkin_: and 30 other things
20:47 karolherbst: all those nir intrinsics are sometimes a lot of work to add, even if you understand all of the above
20:48 karolherbst: well, not a lot
20:48 karolherbst: but some may take 30 minutes if you are fast
20:48 karolherbst: and then you still have to debug stuff and do testing and everything
20:48 airlied: imirkin_: including the glsl to tgsi code?
20:49 imirkin_: airlied: well, i guess it depends. the impls i saw talked about would map very cleanly onto what i proposed.
20:49 imirkin_: airlied: depends on how it ends up though. if it totally doesn't map, then yeah, it'll be a huge thing.
20:52 airlied: i wish fp64 had only taken 6 hours
20:52 imirkin_: yeah, that'd have been nice.
20:53 imirkin_: but that was always going to be a giant pain :)
20:53 imirkin_: i think some of your glsl_to_tgsi opinions may be colored by that experience
20:53 imirkin_: atomics and images were pretty straightforward
20:53 imirkin_: (except figuring out the stupid binding stuff was ... confusing.)
20:54 karolherbst: meh... the 64bit sub lowering can't handle "sub s64 %r40d 0x0000000000000000 c0[0x0]"
20:54 imirkin_: (and i probably got it wrong in the end anyways)
20:54 imirkin_: karolherbst: yeah, don't do that. add with neg.
20:54 imirkin_: karolherbst: or ... just use neg ;)
20:54 karolherbst: sub isn't the issue here
20:55 karolherbst: in TGSI I end up with "sub s64 %r40d 0x0000000000000000 %r39d"
20:55 imirkin_: o.
20:55 imirkin_: right.
20:55 karolherbst: weird thing is, before the SSA passes I have "sub s64 %r40d 0x0000000000000000 %r39d" with nir as well
20:56 imirkin_: const prop?
20:56 imirkin_: er, load prop
20:56 karolherbst: most likely
20:56 imirkin_: do you mov or load the c0?
20:56 karolherbst: I load it
20:56 karolherbst: should I move it?
20:56 imirkin_: no
20:56 imirkin_: you should load
20:56 karolherbst: mhh
20:56 karolherbst: tgsi movs :)
20:56 karolherbst: ohh wait
20:56 karolherbst: it loads as well
20:56 imirkin_: really?
20:56 karolherbst: but it gets translated into a mov
20:57 karolherbst: mhh
20:57 karolherbst: well
20:57 imirkin_: in nvdisasm? maybe.
20:57 imirkin_: but at the nvir level it's a load
20:57 karolherbst: the main cause is, that I currently do 64bit loads with nir, because it works so far (tm)
20:57 imirkin_: yeah, pmoreau had the same issue
20:57 karolherbst: in tgsi, it ends up having 2 32bit loads + merge
20:57 imirkin_: i told him "DONT DO THAT" :)
20:57 karolherbst: ;)
20:57 karolherbst: well
20:57 imirkin_: the main issue
20:57 imirkin_: is with 64-bit phi's
20:57 karolherbst: I don't see why we shouldn't fix up the code to handle it
20:57 karolherbst: ahhh
20:57 imirkin_: that have one arg which is a "pure" 64-bit val
20:58 imirkin_: and one arg which is a merge of 2 vals
20:58 imirkin_: which makes one side a compound
20:58 imirkin_: and the other side not a compound
20:58 karolherbst: ugh
20:58 imirkin_: and RA keels over
20:58 imirkin_: (with an assert)
20:58 imirkin_: now ... it should be possible to just make it all happy somehow
20:58 karolherbst: I think I may have seen that at some point
20:58 imirkin_: but then you have to gain a true understanding of wtf a compound value means
20:58 imirkin_: which i have not.
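A toy check for the situation imirkin_ describes, where a phi has one source built by a merge and another that is a "pure" 64-bit value. This treats "compound" as simply "defined by a merge", which is a simplification assumed for the sketch (the chat itself says the real meaning of Value->compound is not fully understood), and the tuple-based IR is hypothetical.

```python
def mixed_compound_phis(ops):
    """Return the results of phis whose sources mix merge-defined
    values with values defined any other way.

    ops is a list of (opcode, dsts, srcs) tuples in a toy IR.
    """
    defined_by = {dst: name for name, dsts, _ in ops for dst in dsts}
    flagged = []
    for name, dsts, srcs in ops:
        if name != "phi":
            continue
        # True for sources produced by a merge, False otherwise.
        kinds = {defined_by.get(src) == "merge" for src in srcs}
        if len(kinds) == 2:
            flagged.append(dsts[0])
    return flagged


program = [
    ("load", ("pure",), ()),            # one 64-bit load
    ("load", ("lo",), ()),
    ("load", ("hi",), ()),
    ("merge", ("pair",), ("lo", "hi")),  # 64-bit value made of two halves
    ("phi", ("v",), ("pure", "pair")),   # mixes both kinds
    ("merge", ("pair2",), ("lo", "hi")),
    ("phi", ("w",), ("pair", "pair2")),  # both sides are merges
]
suspects = mixed_compound_phis(program)
```

A frontend that always wraps 64-bit values in splits and merges, as the TGSI one does, would never produce a phi like `v`.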
20:59 karolherbst: maybe I reduce those 64 bit loads to 32 bit then. Should be trivial
20:59 karolherbst: but nir has a native concept of 64 bit types and it makes the code much easier in the end
20:59 karolherbst: or I just lower it away
21:00 karolherbst: because in the end it isn't the fault of the converter that our backend can't handle it
21:01 karolherbst: imirkin_: but wouldn't we get the issue after doing MemoryOpt anyway?
21:01 karolherbst: or is it smart enough to not trigger those issues
21:02 imirkin_: ultimately RA should be able to handle it
21:02 imirkin_: in practice, you're generating code that the TGSI frontend never did, so it's untested.
21:02 imirkin_: like i told pmoreau, you can either fight the power, or bow down before your 32bit overlords.
21:02 karolherbst: I see
21:03 karolherbst: then I will just lower those 64bit loads away for now. This keeps the nir converter code clean and if somebody wants to fix it, it is easily removed
21:03 imirkin_: esp since it's like an extra couple lines of code ... seems easy to just stick in merges/splits all over the place.
21:04 imirkin_: no - you have to do this for every op that consumes a 64-bit arg
21:04 imirkin_: or produces a 64-bit value
21:04 karolherbst: not even tgsi is doing that
21:04 imirkin_: oh?
21:04 imirkin_: [check again]
21:04 karolherbst: or do you mean it also uses merges for 64 bit ops
21:04 karolherbst: because this would make things super ugly
21:05 imirkin_: can you please just RTFS?
21:05 karolherbst: because I get 64 bit srcs, which I would have to split and merge again
21:05 karolherbst: I do
21:05 karolherbst: tgsi produces 64 bit subs
21:05 imirkin_: no.
21:05 karolherbst: it does
21:05 imirkin_: the source
21:05 imirkin_: not the nvir that's generated.
21:05 imirkin_: the source that generates it.
21:06 karolherbst: you mean the from_tgsi thing, right?
21:06 imirkin_: yes
21:06 karolherbst: yeah, it produces 64 bit subs
21:06 imirkin_: uh huh
21:06 karolherbst: for I64ABS
21:06 imirkin_: and what does it surround them with?
21:06 karolherbst: well right, there is a merge to merge two 32 bit vals together, but this is totally not needed with nir
21:06 imirkin_: and what did i say above?
21:06 karolherbst: that's why I said I would need to do a split and a merge again with nir
21:07 imirkin_: you can either fight the power
21:07 imirkin_: or you can just do what the tgsi fe does
21:07 karolherbst: well I would rather fix the code than to write ugly code like this
21:07 imirkin_: go for it.
21:07 imirkin_: i told you the situation where it breaks
21:07 imirkin_: repro it (should be trivial to cause it)
21:07 imirkin_: and then futz with RA until you understand wtf Value->compound means.
21:07 karolherbst: well it happens in the 64 bit sub lowering
21:08 karolherbst: this has nothing to do with RA yet
21:08 imirkin_: then it's a different thing
21:08 karolherbst: yes
21:08 imirkin_: but my point is that you have issues lying ahead of you
21:08 karolherbst: I am aware
21:08 imirkin_: this is why i said it would take an afternoon
21:08 imirkin_: because that involves not fighting the power
21:08 imirkin_: but instead doing whatever the tgsi fe is doing
21:08 imirkin_: since that is known to work
21:08 karolherbst: the current issue is: "sub s64 %r40d 0x0000000000000000 c0[0x0]" -> "sub s32 { $r2 $c0 } $r255 c0[0x0]" + "sub s32 $r3 $r255 $r255 $c0"
21:09 imirkin_: ok
21:09 imirkin_: and what's the problem with that?
21:09 karolherbst: the second sub
21:09 imirkin_: i ask again...
21:09 imirkin_: what is the problem with that?
21:10 karolherbst: the lowering doesn't split the c0[0x0] value
21:10 imirkin_: ah.
21:10 imirkin_: right, that's a problem.
21:10 imirkin_: it must think the c0[0x0] is a 32-bit value
21:10 imirkin_: are you creating it with the wrong value type?
21:10 karolherbst: it says s64, doesn't it?
21:10 imirkin_: doesn't mean shit
21:11 imirkin_: what's the type of the getSrc(1)
21:11 karolherbst: no idea, but I doubt the code even checks for that at all
21:11 imirkin_: yeah. except when lowering sub.
21:23 karolherbst: ugh... you are right, I create every symbol as 32bit ones...
21:27 karolherbst: but uhm... the issue should be something else
21:28 karolherbst: now it passes, weird
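The arithmetic behind the sub lowering discussed above, as a sketch in plain Python rather than nvir. It shows why both operands need their high half: the low subtraction produces a borrow, and the high subtraction must consume that borrow together with the high words of both sources. `lower_sub64_missing_hi` models one reading of the bad output (the second source's high word treated as zero); that reading is an assumption.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sub32(a, b, borrow_in=0):
    """32-bit subtract; returns (result, borrow_out)."""
    raw = a - b - borrow_in
    return raw & MASK32, 1 if raw < 0 else 0


def lower_sub64(a, b):
    """64-bit a - b built from two 32-bit subtracts with a borrow."""
    lo, borrow = sub32(a & MASK32, b & MASK32)
    hi, _ = sub32((a >> 32) & MASK32, (b >> 32) & MASK32, borrow)
    return (hi << 32) | lo


def lower_sub64_missing_hi(a, b):
    """Same lowering, but b is wrongly treated as a 32-bit value,
    so its high word is dropped (assumed model of the bug)."""
    lo, borrow = sub32(a & MASK32, b & MASK32)
    hi, _ = sub32((a >> 32) & MASK32, 0, borrow)
    return (hi << 32) | lo


operand = 0x0000000100000005
good = lower_sub64(0, operand)
bad = lower_sub64_missing_hi(0, operand)
```

The two results only differ when the second operand has a non-zero high word, which could explain why a mistyped source goes unnoticed with small test values.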
22:24 Lyude: would a GK107 (specifically one of the quadros) potentially have a different type of vram than a gk104?
22:25 Lyude: I remember on maxwell2 they had some variations with the type of memory they used
22:25 imirkin_: sure
22:25 imirkin_: DDR3 vs GDDR5
22:25 imirkin_: some GK107's have GDDR5 of course
22:27 Lyude: dang, both the cards in question appear to have GDDR5 so I guess differences in memory are out of the question (trying to debug https://gist.githubusercontent.com/nrr/abed8234b53f144f5cd25287064e1422/raw/43a943b0e31ea01b01895e96514e6ae33d1eb791/full-dmesg )
22:27 imirkin_: no, they're totally in the question
22:28 imirkin_: but the DDR3 vs GDDR5 thing would be a pretty obvious difference ;)
22:28 Lyude: hehe
22:28 imirkin_: but that issue has nothing to do with vram...
22:28 imirkin_: it has to do with resource management
22:28 imirkin_: we access a texture that's not mapped into vram
22:28 imirkin_: "oops"
22:36 Lyude: btw (I just tried searching for it in nouveau's source, don't see any clear definition other than the fact it's related to the mmu), what is a PTE?
22:36 imirkin_: page table entry
22:37 imirkin_: (same as in any mmu...)
22:38 Lyude: ah, ok! just checking
22:42 nrr: imirkin_: the texture thing sounds vaguely familiar, but i'll admit that i'm not 100% on how modern unix windowing systems work. there've been a couple of cases where i've tickled out this read fault by simply uncovering, e.g., a chrome window.
22:43 imirkin_: nrr: well, it's either a straight up bug, where we submit a batch whereby we make use of a resource that we don't claim we use
22:43 imirkin_: or something more subtle, whereby we think the command is done but it's not and we unmap it, or there's some kind of mmu coherency issue, or ... $other thing
22:44 Lyude: yeah, supposedly this seems to happen randomly and I haven't managed to reproduce this yet with my gk104
22:44 nrr: sure. i've been down that kind of use-after-free road with stuff that runs on the CPU (and come to love getting the register dump on panic as well as a chance to bang against things with kdb), but when it comes to GPU hardware and how this world works, it's all greek to me. (:
22:45 nrr: Lyude: yeah, which is part of the reason why i'm not pushing too terribly hard against you to help me with it, save for getting you the right debugging information or perhaps writing up a test case that'll reliably trigger it.
22:45 nrr: (that last part is what i truly want.)
22:45 Lyude: nrr: nah don't worry, I am a nouveau dev and having our hw actually work is important :)
22:45 nrr: <3
22:46 imirkin_: nrr: well, my approach is to just avoid using the gpu for anything it's not needed for
22:47 Lyude: btw nrr, whenever I/someone else figures out what the problem is here, would you want to volunteer to try out some powersaving patches?
22:47 imirkin_: (i.e. basic desktop usage)
22:47 nrr: Lyude: beyond a doubt
22:48 Lyude: sweet, the kepler1/2 work is already done and maxwell1 will be done pretty soon after I fix my laptop dock
22:49 Lyude: oh nrr I'm assuming this is a no, but you're not running this overclocked or anything correct?
22:50 nrr: absolutely not. it generates enough heat as is. (:
22:54 Lyude: btw, I'm guessing running piglit with concurrency enabled still crashes on nouveau?
22:55 imirkin_: yeah, for everyone except skeggsb
22:55 imirkin_: (not really. but it still does for me.)
22:55 imirkin_: (i haven't checked in a bit though)
22:58 Lyude: nrr: mind going through your kernel logs and see if the error your kernel shows ever changes?
22:59 nrr: Lyude: when i get back home, sure.
22:59 Lyude: cool
23:00 Lyude: leaving this machine running overnight with piglit + glmark2 on loop so hopefully something will break
23:02 Lyude: nrr: also jfyi, you should also give running a newer kernel (maybe one of the 4.15 kernels) a try and see if that helps with your issue
23:04 karolherbst: it works for me as well :)
23:04 karolherbst: I mean piglit
23:04 karolherbst: even on my pascal here
23:07 Lyude: sweet
23:13 nrr: Lyude: sure. i hadn't been enterprising enough to jump into building a new kernel for this box quite yet, but that was slowly climbing up my list of things to do.
23:14 Lyude: nrr: fedora should have some rpm builds for you if you check the current rawhide kernel
23:21 nrr: yep! looks like kernel-4.15.0-0.rc6.git0.2.fc28 finished building a little over an hour ago.
23:22 Lyude: hope you're ready for the KPTI slowdown though :(
23:23 nrr: i mean, i'm gonna have to get ready at work for it, so what better way to get acquainted with it than at home? :/
23:23 Lyude: yep, sigh.