00:00 karolherbst: why that 0x1000
00:00 karolherbst: Thog: well, it does something at least if it gets to line 85
03:59 rhyskidd: i'm really happy that motivated people wanting to run homebrew on the switch, might just lead to reclocking on a Pascal dGPU with nouveau :)
04:00 imirkin: it'd be a long road
04:00 imirkin: switch is maxwell, and tegra clocks work totally differently
04:01 imirkin: and you're still stuck using their kernel, so they'd be using blob kernel driver
04:01 imirkin: but it does increase interest in nouveau, so you get that trickle-down effect
04:01 rhyskidd: yes, but if there's a signed bootloader crypt mis-implementation -- those sorts of fails tend to apply across whole classes of devices
04:01 rhyskidd: :)
04:02 imirkin: pretty sure that esp some of the earlier firmware they released is rootable
04:02 imirkin: no one here has looked into it much though
04:03 rhyskidd: Lyude: any interest in DP Test Pattern Generators for the lulz? https://github.com/envytools/envytools/pull/135
04:04 imirkin: DISPLAY_OVER_PCIE? that sounds dangerous...
04:05 rhyskidd: sounds fun ...
05:27 jn__: 06:01 < imirkin> and you're still stuck using their kernel, so they'd be using blob kernel driver
05:27 jn__: fail0verflow's kernel uses slightly modified nouveau, and nintendo's kernel is not linux
05:28 jn__: (https://github.com/fail0verflow/switch-linux)
05:28 imirkin: jn__: nouveau should work pretty much as-is
05:28 imirkin: however i believed that switchbrew was going to be mostly using the existing OS environment / services
05:28 imirkin: just with home-made games
05:29 imirkin: but since the (userspace bit of the) driver is in the SDK, something has to compose the commands
05:29 imirkin: getting an upstream linux running is cool too, of course
05:30 jn__: right, pluto from switchbrew is using the blob sdk on nintendo's OS
05:31 imirkin: (and when i said "switchbrew" i actually meant "homebrew")
05:31 imirkin: keep in mind that the last hand-held console i used was the gameboy, so i'm a bit out of touch with how the latest and greatest operate :)
05:32 jn__: heh
05:33 jn__: that's actually the same for me, now that i think about it
05:33 imirkin: console hacking back then: https://en.wikipedia.org/wiki/Game_Genie#/media/File:Game_Genie.jpg
05:33 jn__: (last *handheld* console was a gameboy, unless you count the wiiu gamepad)
05:42 HdkR: pfft, wii u gamepad is just a fancy wifi chip + video decoder. It can't be a handheld :P
05:58 jn__: HdkR: i can hold it in my hand and it runs code from flash
05:58 jn__: ;)
13:53 gQuigs: I'm remotely helping with debugging a Quadro P6000 full hang. current workaround is to add mem=64000mb to the kernel command line... which still doesn't let the GUI display, but lets the installation continue (preseed)
13:54 gQuigs: I'm curious if there is a nouveau memory option I can try to limit its memory instead... it shows up as a GP102, btw with 24GB of ram
14:16 imirkin_: gQuigs: if the system has more than 1T of ram, i suspect you're going to have a bad time.
14:17 imirkin_: gQuigs: try booting with nouveau.modeset=0 -- that will prevent nouveau from doing anything
14:18 gQuigs: imirkin_: it has 1TB of ram exactly...
14:19 imirkin_: gQuigs: well, i guess it doesn't matter how much ram it has
14:19 gQuigs: nomodeset by itself didn't do it, still needed mem=64GB line
14:19 imirkin_: gQuigs: if you have ram above the 40-bit line, you're in trouble
14:19 gQuigs: but I'm guessing that's not on nouveau
14:20 imirkin_: (you could have 1MB of ram, but if it's placed above 1T, then you might still run into trouble. BIOSes don't do that usually though.)
14:20 gQuigs: it works with all the ram once the nvidia driver is installed
14:21 gQuigs: (1 TB RAM, plus 24 GB on video card)
14:22 gQuigs: imirkin_: by trouble do you mean specifically with any graphics on linux, nouveau specifically, or something else?
14:22 imirkin_: gQuigs: nvidia VM is 40-bit
14:23 imirkin_: so i don't think it can address physical memory above 40 bits
14:23 imirkin_: i don't know if nouveau handles that nicely
14:23 imirkin_: (did pascal bump it to 48? it might have, actually...)
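The point about the 40-bit line can be sketched as a simple range check. This is only an illustration of the arithmetic, not nouveau code: the helper name `fits_in_address_space` and its parameters are hypothetical, and whether a given GPU generation uses 40 or 48 bits is left open, as it is in the discussion above.

```python
TIB = 1 << 40  # 1 TiB, i.e. the first address that needs a 41st bit
MIB = 1 << 20

def fits_in_address_space(base, size, bits=40):
    """Return True if [base, base + size) is addressable with `bits` address bits.

    Hypothetical helper for illustration only; `bits=40` mirrors the
    "nvidia VM is 40-bit" remark and is an assumption, not a checked fact.
    """
    return base + size <= (1 << bits)

# RAM that ends exactly at the 1 TiB boundary is still addressable.
print(fits_in_address_space(0, TIB))
# Even 1 MiB is out of reach if it is placed above the boundary.
print(fits_in_address_space(TIB, MIB))
# With a 48-bit space the same range would be fine.
print(fits_in_address_space(TIB, MIB, bits=48))
```

What matters is where the memory is mapped, not how much of it there is, which is why a small region placed above 1T could still cause trouble.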
14:33 gbisson: Hi all, can't seem to get Nouveau working with my GTX860M on my laptop, I get some MMIO read faults at bootup. Seems that other people reported the same on bugzilla, is there any update on this issue?
14:33 gbisson: https://bugs.freedesktop.org/show_bug.cgi?id=100423
14:33 gbisson: https://bugs.freedesktop.org/show_bug.cgi?id=104835
14:34 imirkin_: gQuigs: if linux doesn't boot with the full 1T of ram with nomodeset, then you have non-nouveau issues
14:35 imirkin_: nomodeset (or nouveau.modeset=0) will literally make nouveau do nothing.
14:35 imirkin_: besides maybe a printk. not even sure about that.
15:08 Thog: karolherbst, Hello so I have tried some of the debug registers (tried to read the pc, set breakpoint,...) but nothing worked...
15:08 karolherbst: Thog: yeah.. I kind of feared that would happen
16:15 E5ten: I'm on a macbook air from 2010, and it has this bizarre integrated nvidia GPU, the geforce 320m. On any kernel newer than 4.14, booting into KDE plasma results in a black screen, which I think is probably an issue caused by a newer version of nouveau in the kernel. Does anyone know of this happening to other people, or a possible fix? (or even if it's probably not a nouveau issue in which case I'll ask for help elsewhere)
16:15 E5ten: Thanks.
16:19 gnarface: E5ten: try it without compositing
17:09 Lyude: rhyskidd: oooh, maybe :)
18:41 pmoreau: E5ten: Have you tried kernels >=4.15.11?
22:49 karolherbst: imirkin_: is there some way to debug fp32 precision? for length({1.70141173e+38, 487.993225}) I get 0.0 as a result :(
22:50 imirkin_: not extremely surprising...
22:50 imirkin_: hold on
22:51 imirkin_: that first value is close to max float
22:51 imirkin_: so if you do sqrt(x^2 + y^2) you're going to end up with idiocy
22:51 karolherbst: yeah, I am not doing that
22:51 imirkin_: ?
22:51 karolherbst: I do more or less this: https://github.com/pocl/pocl/blob/3f5a44a64ab7c7d5907c8cbf385ad7f13eff659a/lib/kernel/vecmathlib-pocl/length.cl#L49-L57
22:52 karolherbst: I don't have the "if (maxp == 0.0f) return 0.0f;" check, but I doubt that matters here
22:52 imirkin_: ok, so you scale it by the thing
22:52 karolherbst: yeah
22:52 imirkin_: so you end up getting the length of, basically, {1, 0}
22:52 karolherbst: yep
22:52 karolherbst: expected result is also 1.70141173e+38
22:52 karolherbst: obviously
22:52 imirkin_: and then scaling up, obviously
22:53 imirkin_: can i see the generated code?
22:53 karolherbst: right
22:53 imirkin_: my only thought is that rcp(1.7....) == 0
22:53 imirkin_: and then x * rcp(x) != 1 :)
22:53 karolherbst: https://gist.githubusercontent.com/karolherbst/79c3e2bbd5d201c14ab7b4f7bb459c25/raw/e23100e19c622d63c67bb4f4bc2085421ee026be/gistfile1.txt
22:53 imirkin_: you can test that out fairly easily with a shader_runner thing
22:54 karolherbst: mhh
22:55 imirkin_: >>> hex(np.float32(1.70141173e+38).view(np.uint32))
22:55 imirkin_: '0x7effffffL'
22:55 karolherbst: wow, the shader generated by nvidia is brutal
22:55 imirkin_: so i think that IS maxfloat.
22:56 karolherbst: yeah
22:56 imirkin_: >>> hex((np.float32(1.0) / np.float32(1.70141173e+38)).view(np.uint32))
22:56 imirkin_: '0x400000L'
22:56 imirkin_: that's a denorm
22:56 imirkin_: (i think)
22:57 karolherbst: nvidia's code: https://gist.github.com/karolherbst/4e621f3449b60d3d1b3704efd74fbe7f
22:57 imirkin_: now, nvidia hw can handle denorms, just get rid of the FTZ's
22:58 imirkin_: right. clever.
22:58 imirkin_: note how it checks if it's "too big"
22:58 imirkin_: and scales it down by 0.25 in that case
22:58 karolherbst: yeah
22:58 imirkin_: or scales up by 16M if it's too small
22:58 imirkin_: and then hopefully undoes that ;)
22:59 imirkin_: actually it might not need to
22:59 karolherbst: but mhh
22:59 karolherbst: this isn't even in the ptx
22:59 imirkin_: anyways, i think you get the gist of it.
22:59 karolherbst: just a div.full.f32
22:59 imirkin_: right
22:59 imirkin_: i think we have to do that
22:59 imirkin_: in order to have precise f32 division
22:59 karolherbst: :(
22:59 imirkin_: which in glsl we totally don't
22:59 karolherbst: fun...
23:00 imirkin_: normally you have a fdiv op which takes care of such idiocy
23:00 imirkin_: but all we get is RCP
23:00 imirkin_: so we get to peer behind the curtain of sadness
23:00 karolherbst: :(
23:01 imirkin_: btw, note how the nvidia code doesn't use the FTZ flag?
23:01 imirkin_: you should probably also make it not use FTZ unless denorm flushing is requested
23:02 karolherbst: mhh yeah
23:02 karolherbst: CL is a bit weird there
23:02 karolherbst: those are the details I don't really want to care for now, but it seems like I have to
23:03 imirkin_: floats suck.
23:04 karolherbst: oh well :(
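The failure discussed above can be reproduced on the CPU with a rough model. This is a sketch under stated assumptions, not the generated shader: `f32` emulates binary32 rounding via `struct`, `rcp` with `flush=True` is a hypothetical stand-in for an RCP with the FTZ flag, and `scaled_length` follows the scale-by-max idea from the linked pocl code only loosely.

```python
import math
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def bits(x):
    """Return the binary32 bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

MIN_NORMAL = 2.0 ** -126  # smallest normal binary32 value

def rcp(x, flush):
    """Model of a reciprocal; with flush=True, denormal results become 0 (assumed FTZ behaviour)."""
    r = f32(1.0 / x)
    if flush and abs(r) < MIN_NORMAL:
        return 0.0
    return r

def scaled_length(v, flush):
    """length(v) computed as max * sqrt(sum((c / max)^2)), with c / max done as c * rcp(max)."""
    m = max(abs(f32(c)) for c in v)
    if m == 0.0:
        return 0.0
    r = rcp(m, flush)
    s = 0.0
    for c in v:
        t = f32(f32(c) * r)
        s = f32(s + f32(t * t))
    return f32(m * f32(math.sqrt(s)))

v = (1.70141173e+38, 487.993225)
print(hex(bits(f32(v[0]))))              # bit pattern of the large component
print(hex(bits(rcp(f32(v[0]), False))))  # its reciprocal, a denormal
print(scaled_length(v, flush=True))      # reciprocal flushed to zero
print(scaled_length(v, flush=False))     # denormal reciprocal kept
```

With the reciprocal flushed, every scaled component becomes 0, so the model returns 0.0 just like the reported result; with denormals kept, it lands within rounding error of the large component. For reference, FLT_MAX itself is 0x7f7fffff, so 0x7effffff is one binade lower, but its reciprocal is still below the smallest normal value.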
23:22 karolherbst: imirkin_: popcnt is 32 bit only?
23:31 imirkin_: all integer ops
23:31 imirkin_: including popcnt
23:41 karolherbst: imirkin_: ohh, I see
23:56 imirkin_: karolherbst: just popcnt the two halves, and add
23:56 karolherbst: yeah, already done that
23:56 imirkin_: popcnt used to take 2 args, those 2 would be and'd together
23:56 imirkin_: i think in recent ISA's that's gone
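The "popcnt the two halves, and add" suggestion can be sketched as follows. The function names are hypothetical and this models the arithmetic only, not any particular ISA encoding; `popcnt32_and` mirrors the remark that the older two-argument form AND'd its operands before counting.

```python
MASK32 = 0xFFFFFFFF

def popcnt32(x):
    """Count set bits in the low 32 bits of x (stand-in for a 32-bit-only popcnt op)."""
    return bin(x & MASK32).count("1")

def popcnt32_and(a, b):
    """Model of the older two-argument form: count the bits of (a & b)."""
    return popcnt32(a & b)

def popcnt64(x):
    """64-bit popcount built from two 32-bit popcounts, one per half."""
    lo = x & MASK32
    hi = (x >> 32) & MASK32
    return popcnt32(lo) + popcnt32(hi)

print(popcnt64(0xFFFFFFFF00000001))
print(popcnt32_and(0b1100, 0b1010))
```

Since the halves do not overlap, the two counts simply add, and no carry handling is needed.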