02:19gnurou: RSpliet: I am currently struggling to get the Pascal ones loading in Nouveau (because much has changed against Maxwell). Once this is done, expect a release, but no acceleration support
02:52mooch2: mwk: can you please make some docs on nv1's display engine? particularly the partial vga compatibility would be nice, so that i can start implementing nv1 in 86box
03:50mooch2: i've got a nice 1498 mhz clock on my geforce 750ti
04:31Satchelboi_: That's a really good clock for that card
04:36mooch2: well, i had to turn it down for furmark
04:36mooch2: but for dolphin, it's great
05:48mwk: mooch2: the vga compatibility part is horrible
05:48mwk: the only part that's even remotely sane is the dac
05:50mwk: huh, the palette part is actually documented
05:50mwk: I forgot I did that
05:55mwk: mooch2: ok, I'll try to throw in some documentation about DAC pixel formats and PFB operation soonish
05:57mwk: VGA compat is a much more complex matter though
05:58mwk: there are 3 parts to it
05:59mwk: one is the palette register emulation... this stuff goes straight to DAC and is documented
06:00mwk: second is the VGA memory window handling, which probably works like a normal VGA, except for the weirdo access windows at b1800+ in physical memory
06:00mwk: ideally, I'd slap a hwtest on it
06:01mwk: and third is the text mode emulation, which sort-of implements text mode, but is entirely unlike any actual VGA card
06:02mwk: and batshit crazy
06:03mwk: and only the DAC part is actually part of the display subsystem, the other two parts just write stuff to memory that PFB will later read
06:23mooch: mwk: didn't you say that some of the crtc regs work?
07:03mwk: mooch: a few of them, yes
07:03mwk: the ones that are used by the text mode emulation
07:04mwk: + some ATC, SEQ, GC ones as well
07:04mwk: all GC ones even
09:23RSpliet: gnurou: ok thanks... no accel support sounds a bit odd, is that because of work on the mesa side or because of limitations in the distributed fw?
09:24gnurou: RSpliet: fw limitations. And I am sad to say I do not expect the situation to improve any time soon
09:27karolherbst: pascal fw?
09:27karolherbst: I see
09:28karolherbst: "any time soon" means half a year or more?
09:28gnurou: err hold on, I made a typo
09:28gnurou:is fighting against flu
09:28karolherbst: well okay
09:28karolherbst: this is no big deal for pascal
09:28karolherbst: we support like nothing in pascal right now
09:29karolherbst: most of the P vbios tables are different
09:29gnurou: so of course fw will enable basic acceleration, but no reclocking
09:29karolherbst: maxwell PMU images are a bit more important right now
09:29karolherbst: important as in, we need them asap
09:29RSpliet: gnurou: ah okay that's no shocker
09:29gnurou: by basic acceleration, I mean GR will be enabled
09:29RSpliet: not good, shame on your managers, but having GR enabled is first priority now
09:30gnurou: and that's what I meant to say, is that there are no Maxwell PMU releases on the horizon
09:30karolherbst: gnurou: so most likely no images in 2017?
09:30gnurou: ... if at all!
09:31karolherbst: that's enough information for me
09:32karolherbst: to be honest, if we don't get images from nvidia for the desktop GPUs, I don't see a point in supporting nvidias images for the tegras then
09:32gnurou: which is exactly what I am arguing internally
09:32gnurou: I am not taking sides. I will submit code and images, please act according to your conscience
09:32karolherbst: you would have to add and support the changes to nouveau code
09:33karolherbst: I am not saying we wouldn't include it, but we won't code it
09:33karolherbst: most likely
09:33gnurou: not that NVIDIA is expecting the community to manage Tegra support. At least we have always submitted code for it
09:34gnurou: but not supporting dGPU at the same level is a missed opportunity IMHO
09:34karolherbst: but getting the PMU done in a good way expects some changes to the nouveau code as well
09:34gnurou: yeah, and not light ones
09:34karolherbst: originally we wanted to have the same interface for nvidias and nouveaus pmu image
09:34karolherbst: and completely rewrite our pmu code
09:34karolherbst: so that the interfaces match
09:35gnurou: note that these changes will also be required for Pascal FW as well, since the falcon reset code has moved into SEC2
09:35gnurou: so it won't be totally Tegra-centric
09:35gnurou: far from it actually
09:36karolherbst: it isn't about secboot really, I am more talking about the host <-> PMU communication
09:36gnurou: yeah - and that part will be needed in Pascal to enable GR
09:36gnurou: secboot will prepare the SEC2 falcon, and SEC2 will boot FECS and GPCCS upon receiving a given message
09:36karolherbst: the easy way would be: nvidia releases documentation about the pmu interface, and we adjust our pmu images to that, so we have the same
09:37karolherbst: I see
09:37gnurou: that's the funny thing, PMU images and message formats change all the time
09:37gnurou: internally, firmware is bound to RM and they evolve in lockstep, so this is not a problem
09:37karolherbst: for nouveau as well
09:37karolherbst: kind of
09:37gnurou: but yeah - I am already supporting 5 different RM versions in my code
09:38gnurou: the changes are small if the code is architectured to handle this, but still
09:38karolherbst: falcon images are shipped inside the kernel
09:38karolherbst: it's messy
09:38gnurou: and my calls to standardize this a bit are... well you can imagine
09:39karolherbst: is the interface different for chipsets within the same release?
09:39gnurou: no, thank the Gods of Poor Software Engineering
09:39karolherbst: okay, so within one nvidia release, every chipset has the same pmu interface
09:39karolherbst: this is sane enough then
09:40karolherbst: depends on how much the interface changes between releases
09:40gnurou: considering that the GM20B FW comes from r352
09:40gnurou: other Maxwells from r361
09:40gnurou: and Pascal will likely come from r367
09:40karolherbst: yeah, but how much do the actual message interfaces differ
09:40karolherbst: if there is a new "method" okay who cares
09:40gnurou: not that much, and I managed to confine the differences into small source files
09:41gnurou: but on some versions you have a different number of queues, etc
09:41gnurou: and the way to do ACR changes as well
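A minimal sketch of the approach gnurou describes: keep the per-release differences (queue count, ACR method) in small descriptor tables selected by the firmware's RM release. The release names r352, r361 and r367 come from the conversation; the queue counts, the ACR method names, and every identifier in the code are hypothetical placeholders, not values from nouveau or NVIDIA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PmuInterface:
    """Per-release description of the host <-> PMU interface (illustrative)."""
    release: str
    num_queues: int   # hypothetical value, for illustration only
    acr_method: str   # hypothetical name, for illustration only


# One small entry per supported RM release; all numbers are made up.
INTERFACES = {
    "r352": PmuInterface("r352", 4, "acr_v1"),
    "r361": PmuInterface("r361", 5, "acr_v1"),
    "r367": PmuInterface("r367", 5, "acr_v2"),
}


def select_interface(release: str) -> PmuInterface:
    """Return the descriptor for a firmware release, or fail loudly."""
    try:
        return INTERFACES[release]
    except KeyError:
        raise ValueError(f"unsupported firmware release: {release}") from None


def describe_changes(old: str, new: str) -> list:
    """List the fields that differ between two releases, changelog style."""
    a, b = select_interface(old), select_interface(new)
    changes = []
    for field in ("num_queues", "acr_method"):
        before, after = getattr(a, field), getattr(b, field)
        if before != after:
            changes.append(f"{field}: {before} -> {after}")
    return changes


print(describe_changes("r352", "r361"))
```

With a table like this, the driver code that talks to the PMU only consults the selected descriptor, and the per-release changelog discussed here falls out of comparing two entries.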
09:41karolherbst: okay, so it would be a piece of cake to make changelogs
09:41gnurou: yeah, explaining differences is not too difficult
09:41gnurou: still a PITA
09:41karolherbst: well true
09:41karolherbst: but in the end, it would take like 1 hour per major release?
09:42karolherbst: just to write it down I mean
09:42gnurou: less than that?
09:42karolherbst: I was pessimistic
09:42karolherbst: okay, so this is no big deal. Only to come up with the first draft of the interface I figure
09:44gnurou: yeah, I will send everything as soon as I get the Pascal FW to run...
09:44gnurou: which is also a PITA
09:44karolherbst: are there plans to do updates of the already released firmwares so that they match the same version?
09:45karolherbst: but I guess this opens another can of worms
09:45karolherbst: external firmwares are always crappy
09:53RSpliet: at least post-pascal we can start disassembling firmware with llvm or gcc
09:53karolherbst: I doubt that volta will get the new ISA already
09:54RSpliet: Why? NVIDIA has already presented their own RISC-V processor design
09:54karolherbst: and even that, compiling our own stuff is somehow more important
09:54RSpliet: karolherbst: http://lists.llvm.org/pipermail/llvm-dev/2016-August/103748.html
09:54karolherbst: RSpliet: and? Does it mean it will be shipped with the next gen already?
09:54RSpliet: karolherbst: it means they are further with developing the core than you might think
09:55RSpliet: there's no guarantee of course, but it's not an open source project where you need to present early in order to attract more developers
09:55karolherbst: I know
09:55karolherbst: I still doubt volta will have those
09:55RSpliet: it's a closed source project that you don't present until the last moment to keep your competition in the dark ;-)
09:55RSpliet: time will tell
09:56RSpliet: if it does, we only need to figure out the brownfield extensions they did for vdec and the likes
09:56karolherbst: what good is this if we can't use our own stuff anyway
09:56RSpliet: opcode encoding schemes and a large body of scalar ops are documented and implemented in toolchains
09:57RSpliet: it eases reverse engineering quite a bit
09:57RSpliet: handy if you're genuinely curious about how the hardware works
09:57karolherbst: well true, but you know..
09:58karolherbst: Okay sure, maybe in 10 years you can actually brute force those keys and deploy your own images or so
09:59RSpliet: I like your practical mindset, but nouveau for me has also just been highly educational :-)
09:59karolherbst: :D true
10:03karolherbst: gnurou: one thing I was thinking a bit about: would it be possible to get a small signed LS image which really only contains the fan control, which also just returns back to the unsigned caller, and call this image from unsigned code?
10:04gnurou: karolherbst: technically, yes... practically, probably won't happen :(
10:05karolherbst: well, you could also give us an image where we put the reg we want to read/write into a register, call the signed function, and do our stuff ;)
10:05gnurou: and regarding updates to match newer versions: that would not be very useful, since nouveau.ko would still have to update older firmwares anyway to avoid breaking user-space
10:05karolherbst: we really would like to rather code our own stuff
10:05gnurou: I would like that too
10:06karolherbst: but I guess there is an internal reason for those images and this wouldn't really comply to it
14:39RSpliet: karolherbst: I think there's only one viable route to getting that sorted
14:40RSpliet: and that is by scandalously outpacing the blob on Kepler and 1st gen Maxwell
14:42RSpliet: should be quite achievable for DirectX 9 games...
15:34NanoSector: how's GTX 950M supported on Linux nowadays, does Bumblebee support it?
15:36imirkin: my notes suggest that's a GM107...
15:37NanoSector: which is bad?
15:37imirkin: should be generally fine with nouveau. occasional rendering artifacts.
15:37NanoSector: but no reclocking right?
15:37imirkin: with 4.10, should reclock
15:37NanoSector: time to try, then
15:38imirkin: iirc there are weirdo artifacts in unigine valley. never diagnosed. they appear to be random, unfortunately.
15:38imirkin: which probably means we're not initializing something, or not flushing something, or who knows
15:39NanoSector: yeah. my GPU being unsupported was my main concern for not moving to Linux
15:39imirkin: with updated mesa, you should get basically the same level of feature support as on kepler
15:39imirkin: just a little buggier.
15:41NanoSector: time to ditch windows this weekend then
15:42NanoSector: does nouveau give better optimus performance than bumblebee nowadays btw? :x the latter was really abysmal with my Kepler, often being slower than intel graphics
15:42imirkin: sounds like that was most likely due to you not reclocking the kepler gpu
15:42NanoSector: provided you reclock
15:42imirkin: or due to you using glxgears as a measure of performance
15:43imirkin: whereas in such a case it's a measure of pcie bus bandwidth
15:43NanoSector: no no I mean if nouveau is faster than nvidia + bumblebee nowadays
15:43imirkin: highly unlikely.
15:44NanoSector: i see
15:44imirkin: nvidia has a 20-50% lead over nouveau. however much worse the bumblebee approach is, i doubt it's that much worse.
15:45NanoSector: what's the bottleneck for nouveau actually, mesa or the kernel driver?
15:45imirkin: by bottleneck you mean "where the improvements have to happen"?
15:45imirkin: if so, it's in mesa
15:46imirkin: from a strict data flow analysis, with nouveau the bottleneck is the gpu
15:46imirkin: nvidia is able to make the gpu do things faster :)
15:46NanoSector: or do you mean incomplete reclocking support?
15:46imirkin: by reading the docs the hw engineers provided?
15:46imirkin: and then acting on that documentation
15:47imirkin: i'm not talking about reclocking...
15:47NanoSector: I always thought the proprietary drivers were just a bunch of per-game hacks to improve performance
15:47imirkin: and there's a bunch of stuff left that nouveau could improve. notably, instruction scheduling is a sorely missing feature in our compiler.
15:47imirkin: well, i'm sure they have those too
15:47imirkin: but that's hardly the whole thing.
15:48imirkin: i suspect their data handling strategy is superior
15:48imirkin: and they make use of various little features that we're oblivious to
15:49NanoSector: maybe there's things nouveau does in software that the hardware can do?
15:49NanoSector: as example
15:49imirkin: no, just not driving the hw as effectively as possible
15:49imirkin: one thing that comes to mind is use of ZCULL (which is akin to HiZ in other GPUs)
15:49imirkin: [nouveau doesn't use it]
15:49imirkin: and 75 other things we don't know about.
15:50imirkin: nouveau is basically 2 full-time's-worth-of-engineers (maybe) without hw docs competing against a team of 100s of full-timers with hw docs. not exactly a fair competition.
15:51NanoSector: therefore it's great how far you have gotten
15:51imirkin: so we do what we can.
15:51hakzsam: imirkin_: 2?
15:51imirkin: hakzsam: i figure all us part-time volunteer contributors add up to a full-timer's worth of effort...
15:51imirkin: plus ben, obviously
15:53imirkin: also nouveau supports a much wider array of hw than nvidia.
15:53imirkin: riva tnt -> current are supported in one form or another
15:53imirkin: while nvidia is fermi+ for their current drivers
18:08pmoreau: (Short notice: if anyone had troubles accessing the images at nouveau.pmoreau.org using Firefox >= 51.0 due to revoked certificates, this has now been solved; I have regenerated the certificates using Let's Encrypt rather than StartCom SSL.)
19:57mooch: are there any other nvidia emulators besides 86box, mame (lol okay), xqemu, and rpcs3?
19:57mooch: like, i need an emulator that emulates the vesa portion of the nvidia card
19:57mwk: ... vesa portion?
19:58mwk: you mean the extra vga regs?
19:59mooch: literally the only ones i've found have been 86box, and spc/at
19:59mooch: and i'm not sure spc/at is open source
20:00mooch: nope, doesn't seem to be
20:00mooch: the author is from belarus tho
20:04mooch: mwk: do you know of any other emulators that emulate the extra vga regs of an nvidia card?
21:29gregory38: hello a quick question
21:29gregory38: does nouveau support openCL (1.2)
21:38imirkin_: gregory38: nope
21:38gregory38: ok thank you
21:38imirkin_: nouveau supports compute shaders on fermi+ though.
21:39imirkin_: [GL compute shaders]
21:39gregory38: Ok. But I have a full program written in openCL
21:40gregory38: I will test it on Nvidia closed driver when I reboot
21:40gregory38: thanks for the info
21:40imirkin_: yeah sorry
21:41gregory38: don't be sorry
21:42gregory38: having a gl driver is already a huge achievement
21:42imirkin_: unfortunately there's no path from llvm's opencl c compiler to nouveau's codegen
21:42imirkin_: ultimately it should be a spirv-style api, but that's not piped through yet
21:44imirkin_: someone was working on TGSI output from llvm, but that was left incomplete after 2 separate attempts
21:44gregory38: oh too bad
21:46gregory38: anyway, not important, I'm pretty sure openCL is rather slow
21:46gregory38: (at least the current implementation)
21:46gregory38: (of my app)
21:47imirkin_: not 100% sure that clover supports CL 1.2 either, but that should be fixable... hopefully
21:47imirkin_: at least i think it has images now
21:47gregory38: what is clover ?
21:48imirkin_: state tracker exposing OpenCL (and converting it to gallium api's)
21:48gregory38: oh ok
21:50gregory38: by the way, you said that persistent buffers are always in GART
21:51imirkin_: with nouveau
21:51gregory38: so potentially you could put them in the vram
21:51imirkin_: for non-coherent ones
21:51gregory38: and the user (the app) can access it
21:51gregory38: or does it require a temporary kernel buffer
21:53gregory38: I guess if it is only possible for non-coherent, it means there are 2 duplicated buffers. one in host and one in vram
21:53imirkin_: tbh i haven't really thought about it
21:55gregory38: I was curious whether (at least for small buffers) it wouldn't be faster to memory map the PCIe BAR in user space
21:55gregory38: so an application can directly write into it
21:55gregory38: instead of writing in host and then reading it from the GPU
21:55imirkin_: could be
21:56imirkin_: i haven't really investigated it
21:57imirkin_: it's easy to flip if you want
21:57imirkin_: the logic's in nouveau_buffer.c
21:58gregory38: yes, the file was already open
21:58gregory38: need to look at your competitor ;)
21:58gregory38: dunno what amd does
21:59imirkin_: well, they have various restrictions
21:59imirkin_: like they can only map 256MB of vram at a time
22:01gregory38: but do we have more? lspci -vv
22:01gregory38: gives a smaller bar (or is it unrelated)
22:01gregory38: Region 0: Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
22:01gregory38: Region 1: Memory at f0000000 (64-bit, prefetchable) [size=128M]
22:01gregory38: Region 3: Memory at f8000000 (64-bit, prefetchable) [size=32M]
22:01gregory38: Region 5: I/O ports at e000 [size=128]
22:01imirkin_: we can map as much as we want with some kind of craziness that i don't fully understand
22:04gregory38: or maybe the above is the gart memory
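The region lines pasted above can be parsed to compare the BAR sizes with the 256MB figure mentioned for AMD. A minimal sketch, assuming the `lspci -vv` line format exactly as it appears in the paste; the function and variable names are hypothetical.

```python
import re

# Matches memory regions only; the "I/O ports" line is deliberately skipped.
REGION_RE = re.compile(
    r"Region (?P<index>\d+): Memory at (?P<base>[0-9a-f]+) "
    r"\((?P<width>\d+)-bit, (?P<prefetch>non-prefetchable|prefetchable)\) "
    r"\[size=(?P<size>\d+)(?P<unit>[KMG])\]"
)
UNIT_BYTES = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_memory_bars(text):
    """Return one dict per memory region found in lspci -vv output."""
    bars = []
    for m in REGION_RE.finditer(text):
        bars.append({
            "index": int(m["index"]),
            "base": int(m["base"], 16),
            "width": int(m["width"]),
            "prefetchable": m["prefetch"] == "prefetchable",
            "size": int(m["size"]) * UNIT_BYTES[m["unit"]],
        })
    return bars


LSPCI = """\
Region 0: Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at f0000000 (64-bit, prefetchable) [size=128M]
Region 3: Memory at f8000000 (64-bit, prefetchable) [size=32M]
Region 5: I/O ports at e000 [size=128]
"""

bars = parse_memory_bars(LSPCI)
largest = max(bars, key=lambda b: b["size"])
print(f"largest memory BAR: region {largest['index']}, "
      f"{largest['size'] >> 20} MiB")
```

On this card the largest memory region is Region 1 at 128M, which is smaller than the 256MB window mentioned for AMD, so the BAR size alone does not explain how more VRAM can be mapped.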
22:10gregory38: hum it seems AMD uses GTT when GL_CLIENT_STORAGE_BIT is set (for write)
22:10gregory38: otherwise VRAM
22:11gregory38: with extra flags for write-combining and cpu access
22:54imirkin_: mlankhorst: could you explain how the vram mapping stuff works for g80+? i.e. how is it that we're not constrained to bar size?
22:55imirkin_: iirc you made it that way
23:54snkcld: if i have optimus and am using KMS, should i see just the one "modsetting" provider in "xrandr --listproviders" ?