00:52 rhyskidd: mupuf: thanks. i'll ship it
01:20 rhyskidd: i've also rebased the DisplayPort / BIT d table code so that it could land once reviewed
08:25 pmoreau: mupuf: I said VBIOS? I wanted to say BIOS --'
08:30 mupuf: pmoreau: makes more sense :)
08:31 pmoreau: :-)
11:01 karolherbst: mupuf: you need to perform secboot to load signed PMU images, otherwise it is pointless
11:02 karolherbst: even though the PMU is the host for secboot it bootstraps itself to the LS image later on in the process.
11:04 karolherbst: or the image in the vbios operates in HS mode
11:04 karolherbst: which might allow us to exploit it
11:20 mupuf: karolherbst: interesting
11:20 karolherbst: or neither, and there are regs we don't know of :p
11:21 karolherbst: anyway, those firmwares should be fairly small, so we might be able to get an idea on what they are doing quite easily
11:54 karolherbst: imirkin: can we do something like "or u32 $r16 $r14 0x40000000" on kepler?
11:54 karolherbst: I mean the immediate
11:54 karolherbst: if I disable loading an immediate there, the div stuff passes
11:55 karolherbst: only difference in the shader
11:55 karolherbst: even the register ids are the same
11:55 karolherbst: the immediate was loaded always at the dest of the or
11:55 karolherbst: and the or was right after the mov
11:56 karolherbst: duh...
11:56 karolherbst: 00000528: 001c303d ca001000 or b32 $r15 $r12 not -0x80000
11:56 karolherbst: this looks oddish
11:58 karolherbst: LOP.OR R15, R12, -0x80000; /* 0xca001000001c303d */ in nvdisasm
11:58 karolherbst: ha
11:58 karolherbst: that's not "not -0x80000"
12:03 karolherbst: yeah, the NOT handling is a bit wrong here
12:07 karolherbst: if I emit LOP32I.OR for ors with short imms it passes
12:07 karolherbst: what an annoying bug
12:12 karolherbst: imirkin: same issue with fermi and kepler2 ISA
12:17 karolherbst: so with that I hope I can take a look at tesla :)
12:39 karolherbst: imirkin_: seems like 0x7ffff is the highest value we can use
12:40 karolherbst: or... wait
12:46 karolherbst: yeah, no idea what is exactly wrong here
13:17 karolherbst: imirkin: well my current fix is to always use the limm form for logicops... no idea if there is a problem with that.
13:18 karolherbst: maybe we should treat 0x80000 as a limm value?
13:18 karolherbst: maybe logic ops are special here
13:45 imirkin: so generally, using limm form loses you the ability to use some of the other modifiers / functions
13:46 imirkin: however if that's not the case, there's no reason not to use it
13:49 karolherbst: ahh, right
13:49 karolherbst: imirkin: the short form can emit two NOTs, the long just one
13:49 karolherbst: which might not even matter, because the second NOT is on the immediate
13:50 karolherbst: and I think it is only there for non imms
13:50 imirkin: right
13:50 karolherbst: but still, wondering why it doesn't work with 0x80000
13:51 karolherbst: there is a 0x2 | 0x8 in the imm field in the opProperties
13:52 imirkin: that last bit is questionable
13:52 imirkin: although ... hm. it's weird.
13:52 karolherbst: so
13:52 karolherbst: a limm is > 0x7ffff in insnCanLoad
13:52 karolherbst: if (reg.data.s32 > 0x7ffff || reg.data.s32 < -0x80000) return false
13:53 imirkin: right, i'm just trying to remember how it works
13:53 imirkin: a lot of the encodings stick the high simm bit somewhere "else"
13:53 imirkin: however here it seems like we're relying on the not AND on the negative, which i don't think is possible.
13:54 karolherbst: / last bit indicates if full immediate is supported
13:54 karolherbst: so this is the 0x8
13:54 karolherbst: and sets the imm mask to 0xffffffff. so this is fine
13:55 karolherbst: meaning the bug is inside the emitter indeed
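For reference, the range check quoted from insnCanLoad above describes a signed 20-bit window. A minimal Python sketch of that window follows; the helper name `fits_short_imm` is made up for illustration and is not a nouveau function.

```python
# Hypothetical helper mirroring the check quoted in the log:
#   if (reg.data.s32 > 0x7ffff || reg.data.s32 < -0x80000) return false
def fits_short_imm(value: int) -> bool:
    """Return True if value lies in the signed 20-bit range [-0x80000, 0x7ffff]."""
    return -0x80000 <= value <= 0x7ffff


print(fits_short_imm(0x7ffff))   # largest value inside the window
print(fits_short_imm(0x80000))   # one past the top of the window
print(fits_short_imm(-0x80000))  # smallest value inside the window
```

Under this reading, 0x80000 only fits when the instruction accepts a full 32-bit immediate, which is what the 0x8 flag discussed above is said to indicate.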
13:57 karolherbst: code[1] |= (u32 & 0x80000) << 8;
13:58 karolherbst: this sets the not flag, right?
13:58 imirkin: the neg, but yeah
13:58 karolherbst: well, logicops have no neg, right?
13:59 karolherbst: ...
13:59 karolherbst: yeah it makes sense
13:59 karolherbst: kind of
14:01 imirkin: neg == not
14:01 imirkin: iirc nvdisasm also slightly lies
14:01 imirkin: i'd recommend writing some very targeted shader_test files and see what comes out
14:02 karolherbst: good idea
14:08 karolherbst: imirkin: .... guess what the result is.... it makes even kind of sense, kind of
14:08 karolherbst: 0 | 0x80000 = 0xfff80000
14:08 karolherbst: because, why not :D
14:08 imirkin: right... sign-extended
14:09 karolherbst: yeah
14:09 karolherbst: I guess we could exclude the limm form for 0x80000? Or just always use limm, because it doesn't matter for logic ops
14:10 karolherbst: imirkin: anyway, this wasn't triggered in TGSI, because nir just unrolled the loop
14:11 imirkin: i'd say always use limm if it doesn't matter
14:11 karolherbst: okay
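The `0 | 0x80000 = 0xfff80000` result reported above is what you get if the 20-bit immediate field is sign-extended before the OR. A small Python sketch of that arithmetic follows; the function names are illustrative, and the sign-extension model is an assumption taken from the observation in the log, not from hardware documentation.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def short_or(lhs: int, imm20: int) -> int:
    """OR a 32-bit value with a 20-bit field that gets sign-extended (assumed model)."""
    return (lhs | sign_extend(imm20, 20)) & 0xFFFFFFFF


print(hex(short_or(0, 0x80000)))  # 0xfff80000, matching the shader_test result
print(hex(short_or(0, 0x7FFFF)))  # 0x7ffff, top bit clear so nothing changes

# The emitter line quoted earlier, code[1] |= (u32 & 0x80000) << 8,
# moves bit 19 of the immediate to bit 27 of the second instruction word.
flag = (0x80000 & 0x80000) << 8
print(hex(flag))                   # 0x8000000
print(bool(0xCA001000 & flag))     # True: the dumped word ca001000 has that bit set
```

This is consistent with 0x7ffff being the largest value that survives the short form unchanged.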
14:14 karolherbst: now I have some MS texture load/store stuff to fix, but I think the reason is kind of simple, because in TGSI the sampler stuff is put into coords.w but in nir it is a separate source or something like that. Have to take a more careful look at this
14:16 imirkin: you mean MS image load/store?
14:16 imirkin: if so - word of warning - it could just be broken in nir. intel doesn't support it.
14:16 imirkin: [not saying it *is* broken, but it's an option to be cognizant of]
14:17 imirkin: gtg
14:31 karolherbst: imirkin: ohh right, intel doesn't enable it. I might disable MS images as well when using NIR then.
16:16 karolherbst: imirkin: ... bin/arb_shader_image_load_store-shader-mem-barrier subtests: Fragment shader/control memory barrier test/modulus=8+ are passing although the probed value differs. But with setting the cache mode to CACHE_CV the probed values are correct :(
16:17 karolherbst: _CG also does the trick
16:17 imirkin_: yeah, so there's an issue
16:17 imirkin_: it seems like we need to do a CCTL in between an atomic op and a load from that memory address
16:18 imirkin_: since the load may otherwise get a cached value
16:18 karolherbst: mhhh
16:18 karolherbst: I see
16:18 imirkin_: one solution is to ALWAYS do a CCTL after an atomic
16:18 karolherbst: I guess this messes up perf
16:18 imirkin_: https://github.com/imirkin/mesa/commit/b250f6e610e6c6d966c1ae8511adce79cfdbc690
16:18 karolherbst: ohh, we could always insert them and then opt it away
16:19 karolherbst: ahh
16:19 imirkin_: yeah. something like that.
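The "always insert, then optimize away" idea above can be sketched on a toy instruction list. This is only an illustration: the tuple format and opcode names are invented here, and a real pass would also have to deal with control flow and addresses that are not known at compile time.

```python
# Toy instruction stream: (opcode, address) tuples. Opcode names are illustrative only.
def insert_cctl(instrs):
    """Conservatively place a cache-control op after every atomic."""
    out = []
    for op, addr in instrs:
        out.append((op, addr))
        if op == "atom":
            out.append(("cctl", addr))
    return out


def drop_unneeded_cctl(instrs):
    """Keep a cctl only if some later load reads the same address."""
    out = []
    for i, (op, addr) in enumerate(instrs):
        later_load = any(o == "load" and a == addr for o, a in instrs[i + 1:])
        if op == "cctl" and not later_load:
            continue
        out.append((op, addr))
    return out


prog = [("atom", 0x10), ("load", 0x10), ("atom", 0x20), ("store", 0x30)]
print(drop_unneeded_cctl(insert_cctl(prog)))
```

Here the cctl after the first atomic stays because a load of the same address follows, while the one after the second atomic is removed.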
16:20 karolherbst: wondering why that isn't an issue on maxwell though
16:20 karolherbst: I mean with the piglit test
16:20 imirkin_: probably has to do with the specifics of how atomic image stuff is implemented
16:22 karolherbst: ahh, yeah, maybe
16:22 karolherbst: anyway, I think I just fixed the last difference between kepler2 and maxwell. fermi/kepler1 is triggering those weird spilling issues I wanted to fix some months ago anyway
16:30 imirkin_: cool
17:03 kwizart: hi there, do the video firmwares still install into /usr/lib/firmware/nouveau using any current kernel (say 4.14+)?
17:04 imirkin_: firmware is generally picked up from /lib/firmware
17:04 imirkin_: however different distros can set things up differently
17:04 imirkin_: i think mesa might assume /lib/firmware for the userspace video decoding firmware though
17:04 kwizart: well, /lib or /usr/lib is the same on fedora
17:05 kwizart: ha, that's my other question: is the firmware loaded along with nouveau.ko or when requested by a userspace application?
17:05 imirkin_: depends on which firmware you're talking about
17:06 imirkin_: all firmware is loaded on demand, however the demands can be different :)
17:06 kwizart: for the video firmwares, the one extracted from the nvidia binary driver
17:06 imirkin_: it's loaded when you first try to make use of it
17:06 kwizart: so I guess I don't have to bundle any video firmware into the initramfs (unless, well no)
17:07 kwizart: thx
17:07 imirkin_: unless you plan to play hw-accelerated videos from the initramfs software
17:08 imirkin_: ;)
17:08 kwizart: yes :)
17:08 imirkin_: compared to what a lot of those distro initramfs's do... wouldn't even sound that crazy
17:10 karolherbst: *sigh* I think this will get me into a world with a lot of pain: "rdsv u32 $r10d sv[LANEMASK_EQ:0]"
17:10 imirkin_: ?
17:11 imirkin_: oh
17:11 karolherbst: :)
17:11 imirkin_: you need to fix that shit up
17:11 karolherbst: you know, those issues are hard to spot :D
17:11 karolherbst: yeah
17:43 stoatwblr: allo folks. I'm testing out Ubuntu bionic on a quadro NV440 4 head card (nv43, dual controller) and finding that nouveau segfaults if I try to enable xinerama even for just 2 monitors. It's fine if I use the onboard intel controller with the same config, so this looks to be a nouveau bug. libdrm-nouveau_2.4.90-1, xserver-xorg-video-nouveau_1:1.0.15-2, libxinerama_2:1.1.3-1 - what other data do you need?
17:44 stoatwblr: also, it's limited to 4096*4096 and I'd like to get 5120*1080 out of it (4 screens wide)
17:44 imirkin_: stoatwblr: the hw is limited to a 4Kx4K fb for scanout (and rendering)
17:44 stoatwblr: imirkin: which is why I'm trying to get xinerama working
17:44 imirkin_: xinerama doesn't change what the hw is capable of
17:45 stoatwblr: but even with 3 displays, the third one stays black and desktop won't extend to it after enabling with arandr.
17:45 imirkin_: xinerama can be a bit tricky to configure
17:45 imirkin_: xinerama + randr = non-starter
17:45 imirkin_: they preclude one another
17:45 stoatwblr: I had this all working under the nvidia driver, but ubuntu has dropped the -308 driver and later ones don't support nv43
17:45 stoatwblr: yes, I know
17:46 imirkin_: https://nouveau.freedesktop.org/wiki/MultiMonitorDesktop/
17:46 stoatwblr: xinerama also means no hw acceleration, but I'm used to that.
17:46 imirkin_: this should be a relatively accurate guide
17:46 stoatwblr: I've been using that
17:46 imirkin_: ok
17:46 stoatwblr: as soon as I enable xinerama, bewm.
17:46 imirkin_: can i see your xorg config + log
17:46 stoatwblr: pastebin or elsewhere?
17:46 imirkin_: doesn't matter
17:46 stoatwblr: ok
17:46 imirkin_: i just need to be able to view it
17:47 stoatwblr: I'll have to drop out to regenerate the log
17:47 imirkin_: however you make that happen is fine by me. but i'm not going to download files from somewhere.
17:47 stoatwblr: understood.
17:47 imirkin_: i.e. has to be viewable in a browser
17:47 stoatwblr: no surprise
17:48 gyroninja: Is there anything I can do to help get my issue resolved? Do I need to collect more information or something? https://bugs.freedesktop.org/show_bug.cgi?id=105432
17:49 stoatwblr: sorry, had to change consoles to do this.
17:51 imirkin_: gyroninja: GK106 right?
17:52 imirkin_: if so, there's some precedent that some GTX 660's die with the nouveau ctxsw fw. perhaps your board is similar enough.
17:52 imirkin_: they work with the blob ctxsw fw though
17:53 stoatwblr: config https://pastebin.com/MeUF0Hq8 log: https://pastebin.com/P4BbK4Yb
17:54 gyroninja: imirkin_: I don't know what GK106 means. Is there a way for me to check
17:54 imirkin_: gyroninja: lspci -nn -d 10de:
17:54 gyroninja: Yeah it's GK106
17:54 imirkin_: stoatwblr: what's with the virtual thing? i've never seen that...
17:55 imirkin_: uhhhhh wtf
17:55 stoatwblr: it was an attempt to set the fb window wider, didn't work and I'd commented it out previously.
17:55 imirkin_: NOUVEAU(G0): Chipset: "NVIDIA NV43"
17:55 imirkin_: that cannot be helping
17:55 imirkin_: can you throw in AutoAddGPU Off in serverflags iirc?
17:55 stoatwblr: i did say nv43
17:55 imirkin_: i'm more concerned about G0
17:55 imirkin_: should just be NOUVEAU(0) and NOUVEAU(1)
17:55 imirkin_: G0 = "gpu", which is not what you want
17:55 stoatwblr: ah
17:56 imirkin_: er
17:56 imirkin_: hold up
17:56 * imirkin_ is confused
17:56 stoatwblr: I'm glad I'm not the only one.
17:56 imirkin_: do you have 3 or 4 monitors?
17:57 stoatwblr: 4
17:57 stoatwblr: 1280*1024 in a horizontal stripe.
17:57 stoatwblr: worked previously under nvidia xinerama, but never under nouveau :(
17:57 imirkin_: that's not really relevant
17:58 stoatwblr: 4*1280=5120... :) that's where the desire for > 4096 comes from.
17:58 imirkin_: yeah i get it
17:59 imirkin_: ok. can you throw in AutoAddGPU off into serverflags?
17:59 stoatwblr: if I comment out the xinerama line then I get 4 independent x sessions as you'd expect.
17:59 stoatwblr: ok
18:00 imirkin_: and nuke the virtual thing
18:00 imirkin_: basically... copy https://nouveau.freedesktop.org/wiki/MultiMonitorDesktop/ and make minimal modifications to it
18:00 imirkin_: like the pci id's
18:00 imirkin_: and that's it
18:01 imirkin_: stoatwblr: fwiw people have gotten this to work with 20 monitors. so the 4096x4096 thing isn't a problem
18:01 imirkin_: at least afaik
18:01 imirkin_: not 1000% sure it was with xinerama
18:02 imirkin_: [might have been as totally separate screens]
18:02 stoatwblr: https://pastebin.com/R8mUVe49 - and "wtf" - you'll see why
18:02 stoatwblr: yeah, I know, xinerama shouldn't matter with virtuals
18:03 imirkin_: i just have no clue what all this virtual stuff is
18:03 imirkin_: hrmph
18:04 imirkin_: failed the modeset
18:04 stoatwblr: with the virtuals out it's saner
18:04 stoatwblr: as in not running out of memory
18:04 imirkin_: that's good
18:04 stoatwblr: now let's try with more screens
18:04 imirkin_: i like not running out of memory
18:07 stoatwblr: yeah....
18:07 stoatwblr: ok, 4 screens wipes out again
18:07 imirkin_: how about 3?
18:08 stoatwblr: https://pastebin.com/kNyFAqUe
18:08 stoatwblr: let me test
18:09 stoatwblr: no, same error
18:09 stoatwblr: it seems it doesn't like trying to use the second controller
18:10 stoatwblr: fun'n'games :/
18:11 imirkin_: ooh, progress!
18:11 imirkin_: it's dying somewhere in nouveau
18:11 imirkin_: can you see if there's any errors in dmesg?
18:11 stoatwblr: I had the same error when I tried xinerama from intel onto the nouveau controllers
18:13 stoatwblr: https://pastebin.com/afiH79yw -
18:15 imirkin_: hmmmmmmmm
18:15 imirkin_: j'accuse... skeggsb
18:15 imirkin_: what kernel is this?
18:16 stoatwblr: Linux Magenta 4.15.0-12-generic #13-Ubuntu SMP Thu Mar 8 06:24:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
18:16 imirkin_: could you try 4.14?
18:16 imirkin_: 4.15 had some pretty substantial VM rework
18:16 imirkin_: perhaps something got messed up for the early gpu's
18:17 stoatwblr: I'll have to work out how to step back, within the ubuntu structure but sure
18:18 imirkin_: ah ok
18:18 * imirkin_ just builds his own kernels
18:18 stoatwblr: yeah, I used to, but I got lazy about 15 years ago.
18:18 imirkin_: didn't even make it to 2.6?
18:19 stoatwblr: was still making them occasionally, but not on my desktop box.
18:19 imirkin_: it does have its downsides... for some reason my self-compiled 2.0.x kernels didn't work on my hw while the RH ones did. in the end, turned out to be some weird IDE setting. but in general ... fairly easy.
18:20 stoatwblr: back in the early days of ide drivers working with Mark Lord and Andre Hedrick we'd be trying 4-5 in a night and they took a lot longer to compile back then.
18:20 imirkin_: mmm ... i dunno. computers were slower, but kernel was smaller
18:20 imirkin_: on average, i think it's been fairly constant amount of time
18:20 stoatwblr: aye, but 2 hours to make a kernel...
18:21 imirkin_: :)
18:21 imirkin_: anyways, a quick workaround for you
18:21 stoatwblr: I still remember how fast my first 486 running Linux seemed
18:21 imirkin_: is to set NoAccel
18:21 imirkin_: i only ever ran linux on pentium's. congratulations on being older than me :)
18:21 imirkin_: (p60? or p66? i forget.)
18:22 stoatwblr: sx20
18:22 stoatwblr: and then dx33
18:23 imirkin_: oooh, with fpu!
18:23 stoatwblr: my very first linux box was a 4MB 386dx25 and we were mad enough to get it running on a 2Mb 386sx16.
18:23 stoatwblr: back when ram was $100/Mb
18:24 imirkin_: and only had 30 pins :)
18:24 stoatwblr: yup
18:24 imirkin_: but at least you didn't have the joys of ESDI?
18:25 stoatwblr: not on linux, but I played with VME systems for dayjob back then.
18:25 imirkin_: (actually you could probably find ISA controllers...)
18:26 stoatwblr: isa mfm 20MB seacrates, and a $3000 200Mb scsi drive for the Usenet box.
18:26 imirkin_: stoatwblr: does the NoAccel thing help? in the Device section
18:26 imirkin_: iirc just do Option "NoAccel" "1"
18:26 stoatwblr: let me try...
18:26 imirkin_: (in each device section)
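Put together, the two settings suggested in this conversation would look roughly like the xorg.conf fragment below. It is a sketch under assumptions: the Identifier and BusID values are placeholders rather than values from stoatwblr's config, and the Device section has to be repeated for each device section in the real file.

```
Section "ServerFlags"
    Option "AutoAddGPU" "off"
EndSection

# Placeholder identifier and bus id; take the real ones from the existing config / lspci.
Section "Device"
    Identifier "Card0"
    Driver     "nouveau"
    BusID      "PCI:1:0:0"
    Option     "NoAccel" "1"
EndSection
```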
18:27 stoatwblr: yes, it's up, kinda
18:27 imirkin_: sideways? :)
18:27 stoatwblr: more like the way xinerama is smeared.
18:28 stoatwblr: screen 1 and 2 are delineated and 3/4 are one area
18:28 imirkin_: look at the way ServerLayout is done in that example
18:28 imirkin_: i guess your thing should work too, but i've never done that
18:29 stoatwblr: I take that back. I had the 3 screen config setup. it pulled screen 4 into screen 3
18:29 imirkin_: that's definitely weird :)
18:30 imirkin_: bbiab. let me know how it goes.
18:30 stoatwblr: and now it's treating it as one big desktop rather than one desktop with 4 sections.
18:30 stoatwblr: odd
18:30 stoatwblr: 'tis ok after login though
18:31 stoatwblr: the funny thing is that I tried noaccel on the module and that didn't work
18:32 stoatwblr: and nofbaccel
19:18 imirkin_: stoatwblr: so ... all good?
19:22 stoatwblr: yup. thanks
19:39 imirkin_: nice
19:39 stoatwblr: yup
19:39 stoatwblr: it'd be good to get to the bottom of the problem though.
19:39 imirkin_: yeah
19:39 imirkin_: i dunno why accel dies
19:40 imirkin_: ttm validate failing means it can't satisfy some buffer placement thing
19:40 imirkin_: or something similarly annoying
19:42 imirkin_: stoatwblr: if you're interested in making an investment, there are a number of modern boards that will support 4 screens much better
19:42 imirkin_: i recommend AMD for open-source support
19:43 imirkin_: you'd get accel and all that, too
19:43 stoatwblr: the original intent was to have 4*2 monitors but desktop systems tend to only have 1 pcie x16 in them. I have a second quadro nv440 card in a baggie. :)
19:43 stoatwblr: yeah, I keep meaning to do that, but then also think it'd be about the same cost to just buy a 4k monitor.
19:43 imirkin_: most semi-modern mobos have 2x x16
19:44 stoatwblr: this system is a dumpster-dive from £orkplace
19:44 imirkin_: that could limit your selection
19:45 stoatwblr: 32gb i7-3660 asus p8-b75m
19:45 imirkin_: that's quite a bit nicer than my home setup
19:45 stoatwblr: i7-3770S actually
19:45 imirkin_: i'm still on i7-920 with 6gb ram :)
19:45 imirkin_: it'll be due for an upgrade in a year or two
19:45 stoatwblr: we're a space lab, the boxes are maxxed out on ram so people can do particle physics on them.
19:46 imirkin_: that's always fun.
19:46 stoatwblr: careful with the 920s and older, they have a nasty tendency to hand-grenade
19:46 imirkin_: fun fact: you can go to dell.com and order a machine with 4TB of ram.
19:46 stoatwblr: yup
19:47 stoatwblr: I have a couple of servers with that much in them but we prefer to let people max out their desktop first. when you let them loose on servers they assume they're the only ones on the things.
19:47 imirkin_: i remember people on lkml running into issues with pages being counted by ints, coz they had > 16TB...
19:47 stoatwblr: and the software of choice tends to be IDL
19:48 stoatwblr: not so long ago we had a well-regarded researcher come and chat to us because he found that the rh 64-bit systems were giving wrong answers vs the rh 32-bit systems
19:49 stoatwblr: after a bit of digging the answer was floating points and iterations.
19:49 imirkin_: 80-bit vs 64-bit fp?
19:49 stoatwblr: nope, all 80 bit
19:49 imirkin_: can you get x87 in x86-64 mode?
19:50 stoatwblr: but they were looping in simulations and taking the output of each loop as input for the next one, rounding errors were adding up and the rounding was subtly different for 32 vs 64bit systems.
19:50 imirkin_: ah
19:50 stoatwblr: so the answer was they were _both_ wrong.
19:50 imirkin_: "yay"
19:51 stoatwblr: and when they went back to first principles and did it the right way they ended up with solar system models that fell apart after 200 million years instead of 11 million.
19:51 stoatwblr: kinda funny to see a model eject jupiter at high velocity
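The effect described above, rounding error building up when each iteration feeds on the previous result, shows up even in a ten-step loop. The Python snippet below is a minimal illustration only; it has nothing to do with the actual simulation code or with IDL.

```python
import math
from fractions import Fraction

step = 0.1
total = 0.0
for _ in range(10):
    total += step  # every addition rounds to the nearest double

print(total == 1.0)                          # False: the rounding errors have accumulated
print(math.fsum([step] * 10) == 1.0)         # True: fsum tracks the lost low-order bits
print(sum([Fraction(1, 10)] * 10) == 1)      # True: exact rational arithmetic
```

Over millions of iterations the same kind of drift can dominate the result, and two platforms that round intermediate values slightly differently will drift apart from each other as well as from the true answer.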
19:52 imirkin_: i did some numerical stuff a long time ago. it's hard.
19:52 imirkin_: esp when you don't know what you're doing, like me.
19:52 stoatwblr: well stuff like this makes you realise that you're not alone in the "don't know what you're doing" stakes.
19:53 stoatwblr: using IDL as a calculation language is a good example of that, but that's rampant across space science/particle physics.
19:53 stoatwblr: it's a bit like beating in nails using a spoke shave.
19:54 stoatwblr: actually, beating in machine screws.
19:56 imirkin_: everyone should just use arbitrary precision and be done with it
19:57 stoatwblr: we spend a lot of time having to explain the limits of what computers can do. this wasn't an issue even a decade ago.
21:59 karolherbst: imirkin_: nvidia uses LOP32I.OR as well for 0x80000
21:59 karolherbst: even for 0x1
22:05 imirkin_: should probably follow suit
22:07 karolherbst: yeah, I just wanted to check with some PTX code before I send out the patches
22:08 karolherbst: imirkin_: no idea if you got my message due to the net split: yeah, I just wanted to check with some PTX code before I send out the patches
22:09 imirkin_: checking...
22:17 karolherbst: imirkin_: ... in nvc0 setImmediate doesn't do mod handling unlike the gk110 variant of it? sounds annoying ;(
22:17 karolherbst: :(
22:17 imirkin_: gk110 does?
22:18 karolherbst: emitForm_L(i, 0x200, 0, i->src(1).mod); -> setImmediate32(i, s, mod); -> mod.applyTo(imm);
22:19 imirkin_: fancy.
22:19 karolherbst: yeah, quite so
22:20 karolherbst: there is a "if (i->src(1).mod & Modifier(NV50_IR_MOD_NOT)) code[0] |= 1 << 8;" though
22:20 karolherbst: in the nvc0 case
22:20 imirkin_: right
22:20 imirkin_: and we don't want that afaik
22:20 karolherbst: which I don't think works...
22:20 imirkin_: i.e. that should not be in the limm case
22:20 karolherbst: or maybe it does
22:20 karolherbst: ahh
22:21 karolherbst: maybe I test the kepler1 case in more depth on monday
22:21 karolherbst: the gk110 code looks fine
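One way to read the exchange above: when the long-immediate form is used, a NOT modifier on the immediate can be folded into the constant at emit time, and in that case the separate NOT bit must not be set as well. The Python sketch below only illustrates that arithmetic; the function names are made up and this is not the nvc0 or gk110 emitter code.

```python
MASK32 = 0xFFFFFFFF


def fold_not_into_imm(imm: int) -> int:
    """Apply a bitwise NOT modifier to a 32-bit immediate before encoding it."""
    return ~imm & MASK32


def or32(lhs: int, imm: int) -> int:
    """32-bit OR, standing in for the logic op with a full-width immediate."""
    return (lhs | imm) & MASK32


folded = fold_not_into_imm(0x80000)
print(hex(folded))                           # 0xfff7ffff
print(hex(or32(0, folded)))                  # 0xfff7ffff: NOT applied once, as intended

# Folding the modifier AND setting a NOT flag would invert the value twice.
print(hex(fold_not_into_imm(folded)))        # 0x80000: the NOT cancels out
```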
23:43 karolherbst: :( I forgot to split up 64 bit not instructions