00:00mwk: it has 3 fields: format (float / short / byte rgba / byte argb), component count (1-4 or 0 to disable), stride
00:00mwk: on NV15+, it's somewhat sane
00:00mwk: it auto-converts invalid component counts and formats to some valid ones, but meh
00:01mwk: but on NV10, when you read back that register, component count of 0 is converted to some other count (depending on the attribute)
00:01gnarface: imirkin: uh.... when you say "G92" that doesn't count this does it? NOUVEAU(0): Chipset: "NVIDIA NV92"
00:01nyef`: gnarface: Okay, I can see where that would be neat.
00:01gnarface: i'm hoping NV92 != G92
00:02mwk: so if you naively context-switch that thing by reading and writing that register, it'll randomly reenable some vertex arrays
00:02imirkin: gnarface: sorry, i do precisely mean that
00:02gnarface: imirkin: well i haven't tried it yet, but i assume then that i should not bother, and get a different firmware version?
00:03mwk: I found a slightly hacky workaround for that, but there's also another problem
00:03gnarface: or am i just S.O.L. ?
00:03mwk: on readout, one bit of the format for secondary color is taken from the format of the primary color instead
00:03mwk: and I have no idea how to fix that
00:03imirkin: gnarface: well, you can try it and see what happens. reports are that it hangs.
00:04nyef`: Well! On my second reboot since I started trying to hack the infoframes into the kernel, and I'm getting obvious 3D displays out of my panel... Though at least one of the modes causes the panel to complain about not being supported.
00:04gnarface: imirkin: what i meant was, should i try a newer firmware version, or are the reports that they all hang on this GPU?
00:05mwk: if it's enabled, I suppose I could submit inline vertex data for secondary color and see how it's decoded...
00:06mwk: and if it's not enabled, the format shouldn't matter either way
00:07imirkin: gnarface: i don't think that'll help
00:09karolherbst: mwk: :( I guess for "simple" hardware, they didn't care much about keeping things sane
00:09karolherbst: so they just hacked something up which somewhat works
00:12mwk: well, anyho
00:12mwk: the whole reasonably available celsius state is now modeled in hwtest
00:12imirkin: time to start drawing? :)
00:13mwk: not yet
00:13mwk: time to start Transforming and Lighting :)
00:15mwk: well, technically I'm not modeling the NV17+ weirdo new memory
00:15mwk: which is likely zcull
00:15mwk: but... meh, base Celsius comes first
00:16mwk: and I'm going to have so much fun figuring out the T&L float formats and operations on them
00:17mwk: formats, plural.
00:17imirkin: yeah, when i think of fun, that's definitely the first thing that comes to my mind as well -- float formats.
00:17mwk: maybe I should take a look at NV20 first
00:18mwk: Celsius T&L should use pretty much the same underlying operations as Kelvin, except on Kelvin they're programmable
00:19mwk: but that would require another week or two of mapping Kelvin state
00:35nyef`: Okay, the mode test works for all 3D modes *except* for the frame-packing modes.
00:36nyef`: For which the panel claims that the mode is not supported.
00:38nyef`: Which is not particularly surprising, given the decoded EDID.
00:39nyef`: I guess I'll have more information on that front in about a day.
00:49imirkin: which one's frame packing?
01:05imirkin: ouch. 40% of blob perf on dota2. i wonder why we do so poorly
01:11nyef`: imirkin: Frame packing is the one where you have to double your required clock.
01:11imirkin: ah ok
01:11nyef`: It transmits an image for one eye, then a vblank-sized gap, then an image for the other eye.
01:11imirkin: seems like it could support that with 1920x1080@30 *2
01:11imirkin: or @24
01:12nyef`: The *panel* doesn't support it.
01:12imirkin: i get it... i just mean it's odd that that's the case.
01:13nyef`: I think it's that they didn't bother with the decoding for it, or maybe the memory.
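[Editor's sketch] The frame-packing layout nyef` describes above (one eye's image, a vblank-sized gap, the other eye's image, at double the clock) can be worked through numerically. This is a minimal illustration, not code from nouveau or the kernel: the `Mode` class and `frame_pack` helper are hypothetical names, and the 1080p24 totals (2750 x 1125) are the commonly quoted CEA-861 figures, taken here as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    """Hypothetical, simplified display mode (no sync or porch details)."""
    hactive: int
    htotal: int
    vactive: int
    vtotal: int
    refresh_hz: int

    @property
    def vblank(self) -> int:
        return self.vtotal - self.vactive

    @property
    def pixel_clock_khz(self) -> int:
        return self.htotal * self.vtotal * self.refresh_hz // 1000


def frame_pack(mode: Mode) -> Mode:
    """Stack two eye images with a vblank-sized gap between them."""
    gap = mode.vblank  # the "vblank-sized gap" from the discussion
    return Mode(
        hactive=mode.hactive,
        htotal=mode.htotal,
        vactive=2 * mode.vactive + gap,  # left eye + gap + right eye
        vtotal=2 * mode.vtotal,          # twice the lines per frame
        refresh_hz=mode.refresh_hz,      # same frame rate, so the clock doubles
    )


# Assumed 2D 1080p24 timing: 2750 x 1125 total.
mode_1080p24 = Mode(hactive=1920, htotal=2750, vactive=1080, vtotal=1125, refresh_hz=24)
packed = frame_pack(mode_1080p24)
print(packed.vactive, packed.vtotal, packed.pixel_clock_khz)
```

Under these assumptions the packed mode has 2205 active lines and needs exactly twice the 2D pixel clock, which matches the "double your required clock" remark.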
01:20nyef`: The real question, to my mind, is why the frame-packed modes are showing up as options. The panel EDID doesn't seem to claim to support them, after all.
01:21imirkin: they're not
01:22imirkin: oh, you mean in the OSD or whatever?
01:22nyef`: Yeah, the OSD says "Not Supported". The EDID doesn't claim for them. The kernel still presents them as options.
01:25imirkin: oh. that's weird. i wonder where they come from.
01:25imirkin: all that logic is pretty new and untested, so could just be broken
01:26nyef`: Yeah, the only driver that even supports the whole 3d thing is the i915 driver.
01:26nyef`: And that, the only userspace bits that use it appear to be in intel-gpu-tools.
02:26gnarface: imirkin: yep, it went splat. :( http://paste.debian.net/907514/
02:26gnarface: imirkin: didn't lock the whole machine, but it also didn't work. that pastebin is what appeared in dmesg
02:29gnarface: imirkin: instead of the video stream opening, i just got a black screen. cursor was still responsive. host didn't notice anything (thinks the game worked)
02:32gnarface: imirkin: i don't suppose there is any info i could provide to help resolve this issue?
02:33gnarface: the log file still shows it failing over to software decoding, which makes the failure to render anything... weird
02:48gnarface: oh, actually, correction, the steam log at the host does show it trying to use hardware decoding, but then all the rendering times are 0 (which i assume means it just fails to render a single frame after initializing the hardware decoding libs)
03:52nyef`: I guess the next step is to figure out the audio thing.
05:14imirkin: gnarface: i've debugged it a bit with someone a while back. no clue why it's happening.
05:17imirkin: it's *some* bit of init missing
05:18imirkin: or some bit of init we do incorrectly
05:18imirkin: or ... something else still
05:18imirkin: upshot is - it doesn't work. sorry!
09:30pmoreau: karolherbst: I am trying to reclock my GM206, using the latest master from Ben (which I assume contains all needed patches), but when cat'ing as root the pstate file, I get "cat: pstate: No such device".
09:30pmoreau: Did I miss anything?
11:39Chiitoo: Hies! What are the requirements for a chipset to be added to the list of 'NvFamily NVKnownFamilies'? I noticed 'Xorg.0.log' telling me my chipset was unknown, though it works (to a degree). (NV126 (GM206) GTX 960)
11:41mwk: Chiitoo: mostly nobody cares about that list
11:41mwk: it's not used in any driver logic, it's just for display purposes
11:42Chiitoo: Mostly what I thought, then, heh. It was only a little bit distracting when testing things out, but it wasn't that bad in the end.
11:42mwk: in fact, it hasn't been updated since the Fermi days
11:43Chiitoo: So it would seem. :]
11:43mwk: probably someone gave up since nvidia no longer mapped family names to chipsets since around that age :p
11:45mwk: now a geforce 800m card could have a Fermi, Kepler, or a Maxwell chip
11:45mwk: it's a mess
11:50karolherbst: Chiitoo: you could update your nouveau ddx
11:50karolherbst: or leave it as it is, you still get opengl acceleration
11:50karolherbst: and 2d
11:50karolherbst: but 2d is based on glamor without the nouveau ddx
11:50mwk: karolherbst: by all logic, it seems to be mpopaddret REG2 IMM16S ENDMARK
11:51mwk: does this match the code?
11:51karolherbst: it uses SP
11:51karolherbst: most likely
11:51karolherbst: needs, encodes, whtvr
11:52mwk: well, it's a pop instruction, of course it uses $sp!
11:52karolherbst: ohh, so this is implicit?
11:52mwk: well... pop, by definition, takes something from the stack pointer, then bumps it
11:52karolherbst: ohh I see
11:53pmoreau: karolherbst: I am trying to reclock my GM206, using the latest master from Ben (which I assume contains all needed patches), but when cat'ing as root the pstate file, I get "cat: pstate: No such device". What could I be doing wrong?
11:53mwk: and the "add" part refers to the implicit "add immediate to sp" part
11:53karolherbst: pmoreau: yes, because it isn't enabled on gm2xx hardware, cause it is _messy_ to do
11:53karolherbst: pmoreau: basically you have to disable secboot, use the nouveau pmu image, reload nouveau with secboot enabled
11:54pmoreau: But shouldn’t the cat print the available ones, and echo say "unsupported device"?
11:56pmoreau: I thought that even if reclocking wasn't enabled, you could still cat the file to see the available states, just not change them
11:56karolherbst: I think the subdev stuff isn't hooked up
11:56karolherbst: pmoreau: https://github.com/karolherbst/nouveau/commit/e81c44ecebf21631e6e42ffcec9da8494b0027ae
11:57pmoreau: I see, so I should try with that branch instead
11:59karolherbst: wouldn't work
12:00karolherbst: secboot is performed on the PMU, but the PMU is needed to reclock memory
12:00karolherbst: I still need to write the code to load the nouveau pmu image after performing secboot and to reset the PMU so that it works
12:00karolherbst: but I think with gnurou's last patches it should be pretty easy by now
12:01pmoreau: Couldn’t you still reclock without the PMU firmware? You'd just have to be careful because the fan can’t be controlled once secboot has been performed
12:02karolherbst: memory reclocking is done on the PMU
12:03karolherbst: well sure, you can reclock the engines, but that doesn't give you much
12:03karolherbst: pmoreau: the fans can't be controlled at all, even without performing secboot
12:04pmoreau: Ah! Ok
13:27mwk: alright, the "base case" for Celsius T&L is established
13:27mwk: and it's already a mess to simulate, even with everything set to bypass mode
14:29mwk: crazy Celsius multiplication implemented
14:30mwk: sort of IEEE-754, but denormals are 0, rounding is always to 0, NaN and Inf are apparently the same thing, and second-highest exponent is apparently mostly treated as an Inf/NaN, but not always
14:35nyef`: Well, I may just have found the ELD bug thing on gt215.
14:35nyef`: And it probably doesn't affect gf119.
14:44nyef`: Hrm. Well, that fixed the ELD, but I'm still not seeing HDMI audio as an output...
14:45nyef`: ... which may be my fault with the other changes I have going on in this kernel.
14:45nyef`: Or I could "just" be missing a config option somewhere.
14:48mwk: scratch that, Inf and NaN are not the same
14:49mwk: however, Inf apparently doesn't have a sign
14:49mwk: and Inf * 0 == 0, but NaN * 0 == NaN
14:50mwk: and the second-highest exponent thing is actually some sort of post-processing that happens after xfrm, so I can't actually see proper intermediates... lovely
14:55nyef`: The perverse part of me wants to know: Is Inf * -0 == -0 ?
14:55mwk: nope, +0
14:55mwk: in general, any 0 * anything is +0
14:56mwk: I don't think you can get a -0...
14:57mwk: nope, not from multiplication at least
14:57nyef`: I remember when I ran into a -0 situation, they tended to disappear easily.
14:57nyef`: I forget which CPU it was on, though.
14:58nyef`: Might've been HPPA.
14:59mwk: Difference in reg CELSIUS_PIPE_OVTX.POS.W: expected 00000000 real 80000000
14:59mwk: correction, -0 is possible
14:59mwk: but how...
15:03mwk: well, this happened when I first attempted a calculation that would feed a mul result into another mul
15:03mwk: so there's a very evil possibility that intermediate results in xfrm are in higher precision than the inputs/outputs
15:05mwk: or not
15:05mwk: that's just a straight mul
15:07mooch: what the bloody hell is going on with this
15:08mwk: ahh, lovely
15:08mwk: so any 0 * any 0 is +0
15:08mwk: but you can get a -0 by underflowing things
15:08mwk: actually, any 0 * anything [except NaN] is +0
15:09mwk: and any Inf * anything [except 0, NaN] is +Inf, but you can get a -Inf by overflowing
15:10mwk: which means that the bypass mode is broken, since you can produce a -0 in OVTX.POS.[XYZW], but if you write -0 there with bypass, it'll get squashed to +0
15:10mwk: but -0 handling is probably the least of your problems if you're context-switching Celsius
15:17nyef`: Hrm. Is this not working because it shows up as extra codecs on the same card and the userland not supporting that, or is something else going on?
15:20nyef`: Yeah, that looks likely. Now, does my main system have one or two HDA PCI devices?
15:20nyef`: ... Two.
15:24nyef`: And a cross-check against my test system will have to wait until this afternoon/evening, but should at least answer the question of if the fix is good.
15:25mwk: ah fuck, I'm not doing just a mul, I'm doing a mul and add with 0
15:26mwk: of course the add is going to disturb things
15:27nyef`: Yeah, that'd do it. Mul preserves the sign, but the add trashes it?
15:27mwk: could be
15:32mwk: alright, mul2mul seems to be working now
15:32mwk: turns out overflowing results in the highest finite number, not Inf
15:32mwk: so... adding time
15:33nyef`: Hrm. Is there a mode switch on that, for clamped/clipped arithmetic instead of overflow, or is that intrinsic to the hardware?
15:34mwk: if there was, I'd have found it by now
15:34nyef`: I don't know what the actual term is for arithmetic that does that.
15:34nyef`: Fair enough. (-:
15:34mwk: this is not a general-purpose FPU
15:34mwk: the results are only ever supposed to be used by the rasterizer
15:35nyef`: Long pre-OpenCL, huh?
15:35mwk: so, who cares what happens if the result is close to Inf... Inf or maxfin, rasterizer won't draw anything sane either way
15:35mwk: long pre-shaders, even
15:36mwk: the main problem is that I'm not even sure what operations that thing is performing, since it's all fixed-function hardware
15:36mwk: I can only select from some set of configurations, and can't look at individual ops
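[Editor's sketch] The multiplication rules mwk reports above can be collected into a small bit-level model. This is only an illustration of the behavior described in the log, not hwtest code: `celsius_mul` and the helpers are hypothetical names, the NaN payload is arbitrary, and flushing an underflow to a signed zero is an assumption based on the "you can get a -0 by underflowing" remark. The "second-highest exponent" post-processing is not modeled.

```python
import struct

SIGN = 0x80000000
MAN_MASK = 0x007FFFFF
POS_INF = 0x7F800000
MAX_FINITE = 0x7F7FFFFF  # largest finite magnitude


def f2b(x: float) -> int:
    """float -> IEEE-754 single-precision bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def classify(bits: int) -> str:
    exp = (bits >> 23) & 0xFF
    if exp == 0xFF:
        return "nan" if bits & MAN_MASK else "inf"
    if exp == 0:
        return "zero"  # denormals are treated as 0
    return "normal"


def celsius_mul(a: int, b: int) -> int:
    """Multiply two single-precision bit patterns using the rules from the log."""
    ka, kb = classify(a), classify(b)
    if ka == "nan" or kb == "nan":
        return 0x7FFFFFFF           # NaN * anything is NaN (payload is a guess)
    if ka == "zero" or kb == "zero":
        return 0                    # any 0 * anything (incl. Inf) is +0
    if ka == "inf" or kb == "inf":
        return POS_INF              # Inf * anything else is +Inf
    sign = (a ^ b) & SIGN
    exp = ((a >> 23) & 0xFF) + ((b >> 23) & 0xFF) - 127
    man = ((a & MAN_MASK) | 0x800000) * ((b & MAN_MASK) | 0x800000)
    if man & (1 << 47):             # product in [2, 4): renormalize
        man >>= 24
        exp += 1
    else:                           # product in [1, 2)
        man >>= 23
    # The shifts above drop the low bits, i.e. round toward 0.
    if exp >= 0xFF:
        return sign | MAX_FINITE    # overflow gives the highest finite number
    if exp <= 0:
        return sign                 # assumed: underflow gives a signed zero
    return sign | (exp << 23) | (man & MAN_MASK)


print(hex(celsius_mul(POS_INF, f2b(-0.0))))        # Inf * -0
print(hex(celsius_mul(f2b(3e38), f2b(-10.0))))     # overflow case
print(hex(celsius_mul(f2b(1e-30), f2b(-1e-30))))   # underflow case
```

Working on raw bit patterns rather than Python floats keeps the truncation and the sign handling explicit, which is where this format departs from IEEE-754.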
15:37Brainzman: Anyone know how to fix the loud fan problem?
15:42Yoshimo: karolherbst: so what have you found out about maxwell this weekend so far?
15:43nyef`: ... There's a "Maxwell's Demon" joke around here somewhere, isn't there?
15:44mwk: nyef`: well, now that you said it...
15:44mwk: it's PDAEMON, but close enough
15:45nyef`: Heh. It's in charge of cooling, too, isn't it? (-:
15:45mwk: of course it is
15:47karolherbst: it isn't called pdaemon on recent hardware anymore though
15:48pmoreau: Brainzman: IIRC mupuf was looking into some loud fan issues, but I am not sure how far he got.
15:48karolherbst: pmoreau: he got far, it's a mess. I am sure he would write up hacky patches to fix it to a degree it's fine for the user
15:49pmoreau: I remember it being a mess
15:52Brainzman: pmoreau : thanks
16:03mwk: yay, addition is a special snowflake...
16:03mwk: sort-of round to 0, but doesn't seem to bother with things like guard digits
16:07mwk: ... add supports denormals?
16:07mwk: now that's a surprise
16:12mwk: ... more like fails horribly on denormals
16:20tacchinotacchi: i hate to come and ask this question once again but
16:20tacchinotacchi: where is a source file where I can see a working example of reclocking?
16:24karolherbst: tacchinotacchi: what do you mean?
16:24karolherbst: there is no "one" source file for reclocking, there are many files involved
16:26tacchinotacchi: I know, it's just that I come here once in a while asking for help, as I'd like to help in reclocking for fermi, and often I've been told to "look at this file" where I could see some working reclocking code for kepler
16:26tacchinotacchi: karolherbst: but I don't remember what file it was, so I'm asking
16:27karolherbst: mhh, on fermi there is mainly memory reclocking missing
16:27tacchinotacchi: last time I checked core reclocking was completely absent
16:27karolherbst: in drm/nouveau/nvkm/subdev/fb/ramgf100.c is the fermi stuff
16:27karolherbst: it's disabled
16:27tacchinotacchi: but it was a while ago
16:27karolherbst: but it should work
16:28tacchinotacchi: ok this is one of the files I was talking about, thanks
16:28karolherbst: engine reclocking alone doesn't give you much though, but we should enable it maybe. I wrote patches a while ago
16:29karolherbst: ask RSpliet for most things though
16:31tacchinotacchi: do you think I need to know graphics APIs well to get a grip on this?
16:31tacchinotacchi: I mean on power management code
16:32karolherbst: the main task here would be to reverse engineer how it works
16:32karolherbst: the reclocking sequence
16:32karolherbst: it's mainly building a working script for the PMU
16:35tacchinotacchi: I'm sure this is a stupid question, so I'm prepared to receive an RTFM, but why can't we just intercept the data sent by the blob and send the same to get a given frequency?
16:37karolherbst: tacchinotacchi: because it is different for every frequency
16:38karolherbst: there are memory timings
16:38karolherbst: there are different memory types
16:38karolherbst: and there are multiple flags inside the vbios which affect what has to be done
16:38karolherbst: so sure, you can intercept it for one gpu, but then it only works on this one and maybe a handful others
16:38tacchinotacchi: i've just found this nice doc, I'll dig into it https://media.readthedocs.org/pdf/envytools/latest/envytools.pdf
16:42karolherbst: which won't help, because it's autogenerated from envytools I think
16:43karolherbst: it's "ours" not nvidias or so
16:44tacchinotacchi: it will help me because I don't know as much as you devs do
16:45karolherbst: in general yes
16:45tacchinotacchi: i take it that I also have to know how ram timings work?
16:56nyef`: I guess I don't need to worry about this stuff for the MCP89, do I? (-:
16:56nyef`: Or do I?
17:02pmoreau: I *think* reclocking should be working on MCP89
17:03nyef`: Okay, I'll probably try it within the next couple of days.
17:17fbe: hi, are nouveau and efi-fb compatible? do i need efi-fb when booting via uefi to see a console or does nouveau do this work for me? i don't really understand the interaction between framebuffer and nouveau (do i still need framebuffer support with nouveau?)
17:19nyef`: fbe: I'm given to understand that the efi framebuffer will kick in before nouveau, so that you can see what's going on, and then nouveau will take over once it loads.
17:22mwk: fbe: nouveau is supposed to take over from efifb once it loads
17:22nyef`: So, you probably don't need efi-fb, but it should do no harm, and plausibly some good.
17:26fbe: ok, so i should see my normal ttys once the kernel loads nouveau (if compiled into the kernel directly) if i have no framebuffer support, right?
17:27mwk: uh, no framebuffer support?
17:27fbe: or do i need framebuffer support for seeing the ttys?
17:27fbe: ah okay =) thank you very much mwk / nyef`
17:31imirkin: fbe: nouveau kicks efifb out and replaces it with "nouveaufb"
17:32imirkin: nyef`: not sure if reclocking works on MCP89. RSpliet made it work on MCP77/MCP79 though, and i suspect that MCP89 wouldn't be extremely different.
17:32imirkin: nyef`: most of the difficulty in reclocking tends to be around changing memory timings, and those chips have none :)
17:33nyef`: Okay, so I have some testing, and possibly some digging to do. But, yeah, that's why I figured I wouldn't need to worry about memory timings. (-:
17:44Edward_Black: Hello nouveau community! I have many questions (trying to better understand stuff about taking firmwares out of nvidia package)
17:45Edward_Black: Basically, I've read (yeah...) that 1070 support is limited due to lack of firmware. On nouveau site I've read that it's possible to extract firmware out of nvidia driver distributions... So... Is it the same kind of firmware being talked about here?
17:46Edward_Black: And if yes, did anyone successfully get a 1070/1080 GPU work both with 2D and 3D (or rather, in my case, 2D and video are more important :) ) off the ground this way?
17:50karolherbst: now is fuzzing time
17:52Edward_Black: karolherbst sorry, I don't understand. English not first language of mine.
17:53karolherbst: Edward_Black: I wasn't referring to you
17:54Edward_Black: Okay :)
18:02pmoreau: Edward_Black: NVIDIA changed the way they upload the firmwares to the GPU in newer versions of their driver, and old versions do not have the necessary firmwares for Maxwell v2 or Pascal cards.
18:03Edward_Black: pmoreau so basically, I'm SOL until further notice ?
18:03pmoreau: And no one found out yet where those firmwares are now hiding
18:04Edward_Black: Okay, thanks. I'll stick around and will be keeping an eye for updates. Will use linux via embedded graphics for now
18:36mwk: well, success
18:36mwk: I figured out add
18:37mwk: so they have 3 guard digits *too many*, but it doesn't do a whole lot of good, since the sticky bit doesn't actually have the sticky behavior
18:40mwk: oh, and the intermediate result between mul and add appears to have extra dynamic range
18:40mwk: so yeah... a MAD
19:06tacchinotacchi: mwk: figured out add on what architecture?
19:06mwk: not a MAD, I just screwed up sign handling :)
19:07mwk: tacchinotacchi: NV10 fixed-function vertex processing
19:07mwk: they have their own brand of floating point
19:07mwk: or rather, at least 3 different brands
19:08mwk: so... I've got mul and add figured out
19:08mwk: time to do a matrix multiplication :)
19:08mwk:wonders what kind of horrors will happen once the "transform bypass" bit is turned off...
19:09tacchinotacchi: are you a hardware engineer? :P
19:12mwk: nope, I just like to RE it
20:34tacchinotacchi: I'm wondering, to come up with a list of registers, do you look at the physical chip?
20:36imirkin: you just poke at stuff over the PCI bus
20:37tacchinotacchi: so we can write directly to registers?
20:50imirkin: you can perform reads and writes over the PCI bus. what that does internally is ... tricky
20:50imirkin: sometimes it's just register, other times it's a "real" thing
21:03tacchinotacchi: do RAMs in general (not necessarily in the gpu) pick their timings by themselves, or is it done by the cpu/motherboard/whatever?
21:04imirkin: it's done by the BIOS for ram on the motherboard, generally firmware for most devices that have ram. gpu's are a lot more configurable, so it's done by both vbios scripts and drivers can then further reconfigure.
21:09nyef: Where'd my C-x go?
21:12nyef: Okay, tried 3D with another panel... And still with the bogus modes being set. And managed to find semi-usable audio controls, but it's not producing HDMI sound even with a good ELD.
21:12imirkin: are you playing to the right audio device?
21:13nyef: Went through every output labelled "HDMI" in the sound properties window.
21:13imirkin: also, fyi, you have to have a video mode set for the audio to work, i think
21:13nyef: Any video mode, or a specific mode?
21:14imirkin: the output can't be off.
21:14nyef: So, doing the test through the GUI is fine? (-:
21:14imirkin: i think it can in theory, but not the way nouveau operates these things
21:14imirkin: as long as the GUI is appearing on the HDMI screen, yes :)
21:15imirkin: anyways, i'd look at aplay -L output, and then try stuff with aplay -D hw:0,1 or whatever.
21:15imirkin: i don't trust GUIs
21:15imirkin: they lie.
21:15nyef: Everything lies.
21:16imirkin: yeah, but understanding what some gui does takes ages. cmdline programs are much more straightforward, thus easier to use :)
21:38mwk: cute, a 4-way fused add operation
21:39mwk: I suppose that's what Kelvin's DP4 instruction is going to use
21:39mwk: well, I can multiply matrices!
21:44mwk: now... the perspective division
21:44imirkin: that should be fun.
21:44mwk: I suppose that corresponds to RCC on Kelvin
22:06mwk: wtf, bit 0 of W coord is masked to 0...
22:08mwk: on input
22:16Jeansf: hehe, can laneID be negative really?
22:23mwk: RCC is, obviously, hilariously broken
22:23mwk: RCC is, supposedly, clamped RCP
22:24mwk: ie. clamp the result to 2**(-64)..2**(64), or the corresponding negative range
22:24mwk: except, it clamps the exponent of the result, but not the mantissa
22:25mwk: so once you go past 2**64, it repeats [2**64, 2**65) for every exponent
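[Editor's sketch] The difference between a properly clamped reciprocal and the exponent-only clamp mwk describes can be shown in a few lines. This is an illustration under assumptions, not a model of the real hardware: `rcc_ideal` and `rcc_exponent_only` are hypothetical names, the reciprocal itself is computed with ordinary Python floats rather than the chip's arithmetic, the low-end clamp is assumed to mirror the high end, and zero, Inf, NaN, or inputs whose reciprocal does not fit in single precision are not handled.

```python
import math
import struct


def f2b(x: float) -> int:
    """float -> IEEE-754 single-precision bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def b2f(bits: int) -> float:
    """IEEE-754 single-precision bit pattern -> float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


EXP_LO = 127 - 64  # biased exponent of 2**-64
EXP_HI = 127 + 64  # biased exponent of 2**64


def rcc_ideal(x: float) -> float:
    """What a clamped RCP is supposed to do: clamp the magnitude of 1/x."""
    r = 1.0 / x
    mag = min(max(abs(r), 2.0 ** -64), 2.0 ** 64)
    return math.copysign(mag, r)


def rcc_exponent_only(x: float) -> float:
    """Behavior described in the log: clamp the exponent, keep the mantissa."""
    bits = f2b(1.0 / x)
    sign = bits & 0x80000000
    exp = (bits >> 23) & 0xFF
    man = bits & 0x007FFFFF
    exp = min(max(exp, EXP_LO), EXP_HI)  # the mantissa is left untouched
    return b2f(sign | (exp << 23) | man)


x = 1.0 / (1.5 * 2.0 ** 70)          # reciprocal is about 1.5 * 2**70
print(rcc_ideal(x) / 2.0 ** 64)          # properly clamped result, in units of 2**64
print(rcc_exponent_only(x) / 2.0 ** 64)  # mantissa survives the clamp
```

With the exponent-only variant, every input whose reciprocal exceeds 2**64 lands somewhere in [2**64, 2**65) depending on its mantissa, which is the repeating pattern noted above.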
22:29Jeansf: https://devblogs.nvidia.com/parallelforall/faster-parallel-reductions-kepler/ the one guy has negative laneid to access thread96
22:30Jeansf: it seems like mbcnt can take negative vsrc1 too
22:32Jeansf: VDST[LANEID] = (EXEC & (1ULL<<INLANEID)) ? ADDR[INLANEID] : 0 , quite weirdo when