00:14 Hijiri: My friend told me that there is fermi reclocking support in kernel 4.10, is that true?
00:20 nyef: Hijiri: From about six hours ago: <imirkin_> currently there's reclocking support for nv4x, GT21x, GKxxx, and GM10x <imirkin_> oh, and G94-G200 <imirkin_> although that's less-well-tested
00:21 Hijiri: nyef: ok, thanks
00:21 nyef: You might try a 4.10 and see if it works, and if it doesn't feel free to pitch in to get it working for 4.11 or 4.12 or so.
00:23 Horizon_Brave: anyone using Debian Stretch?
00:24 Hijiri: I am
00:24 Hijiri: nyef: I don't think my card is in that list
00:24 nyef: Hijiri: My GF119 also isn't in the list, but my current priority isn't getting reclocking working for it, unfortunately.
00:25 Horizon_Brave: for amd64 by chance? what kernel is it on right now? 4.9? when it goes stable is it going to remain at 4.9?
00:25 Hijiri: Horizon_Brave: I think stretch is meant to be 4.9
00:25 Hijiri: I'm going to hop on the next testing anyway
00:25 Hijiri: whenever that is
00:26 Hijiri: nyef: I might have asked in this channel before, but what are the most important parts of the "IntroductoryCourse" wiki page to read to get started with driver development?
00:27 imirkin: Hijiri: your friend lied to you, maliciously.
00:27 nyef: I... don't know? I also don't know how up-to-date the wiki is.
00:27 imirkin: Hijiri: that said, someone has been playing around with it a bit, but it will most likely not work on your board
00:28 imirkin: nyef: when in doubt, not very.
00:28 * nyef is still fairly new here.
00:28 Hijiri: oh, ok
00:28 Hijiri: are there other resources I should check out then?
00:28 Horizon_Brave: wow..didn't just lie to you...he lied maliciously..that's serious stuff guy
00:28 imirkin: what are you looking for?
00:28 Hijiri: I want to help with reclocking support on Fermi
00:29 imirkin: your best bet is to talk to RSpliet who has been working on it on and off for the past ... 2 years?
00:29 imirkin: that seems like a lot. maybe more like 1?
00:29 Hijiri: sounds like a tough problem
00:29 imirkin: his super-duper experimental tree is available at https://github.com/RSpliet/kernel-nouveau-nv50-pm -- i believe it kinda-sorta works for one of his boards.
00:30 imirkin: unfortunately there's no "easy 5-step guide to getting reclocking working". if there were, the person writing the guide would have just done it :)
00:32 nyef: Looks like a year and a half?
00:32 Hijiri: should I just ping him on IRC?
00:33 dboyan_: imirkin: Bad news. I did see misrendering with disk cache enabled.
00:34 dboyan_: It was in Portal 2, when I looked into one of the portals, it displays things at the back of that portal
00:34 dboyan_: Any idea how I should debug that?
00:35 imirkin: Hijiri: yeah, that's good. he doesn't have a ton of time to work on it... he's busy with ... work? postdoc? i haven't kept up =/
00:35 imirkin: dboyan_: good news - i have a trace of that, from when we had a little oopsie on nv50
00:35 imirkin: (bad news - i haven't the faintest clue where it is)
00:36 imirkin: aha! i think i found it.
00:36 Horizon_Brave: lol...a little oopsie...
00:36 Horizon_Brave: sounds like an interesting story...?
00:37 Hijiri: RSpliet: How would I help with adding reclocking support to Fermi? I've never done kernel development, which would probably be a problem, but I do have a GF114
00:37 imirkin: not really. just a bug.
00:39 imirkin: Hijiri: mostly it takes an incredible amount of patience, access to hardware, and lots of tracing/analysis.
00:41 Horizon_Brave: you lost me at patience....
00:41 dboyan_: imirkin: http://imgur.com/Qemfd1g and http://imgur.com/fOn9nQB
00:41 dboyan_: It only happens at some portal.
00:43 imirkin: oops
00:43 imirkin: do you have a trace?
00:44 dboyan_: wait a minute
00:44 imirkin: i don't want it
00:44 imirkin: just asking if you do :)
00:47 imirkin: if you don't - make one, find the relevant draw call where it messes up, and figure out the diff between the shader cache case, and the non-cache case
00:51 dboyan_: I just made one. 640M in size...
00:51 imirkin: ok, mine's still uploading =/
00:56 dboyan_: imirkin: Do you think it has anything to do with fixups? That's where I suspect most
00:56 imirkin: it's unlikely - the fixups VERY rarely do anything
00:56 imirkin: although tbh i don't fully remember what they are
00:57 dboyan_: or maybe flags or headers?
00:59 imirkin: that's more likely.
01:00 nyef: When I'm passing a bag of bytes (or two) through nvif, should I be padding out to a 32-bit boundary?
01:00 dboyan_: I think the very same tgsi ends up as different program binaries, and I forgot to take something into account
01:01 imirkin: nyef: that's a question for skeggsb, who's not around. take a look at the nvif_unpack macro, that should yield some hints.
01:01 imirkin: my suspicion is that "no", but i'm not sure.
01:01 imirkin: dboyan_: could be. if you like, i could look at your code if you clean up your commits a bit
01:01 imirkin: perhaps i can spot the issue
01:04 dboyan_: gotta go now, I'll dig into that later
01:05 Horizon_Brave: imirkin: is what you just described a very general way that you guys actually develop drivers for prop. hardware and fireware?
01:05 Horizon_Brave: firmware**
01:06 nyef: Ah well, I'll just add the question to the cover notes once I get that far.
01:21 imirkin: Horizon_Brave: what description are you talking about?
01:24 nyef: ... patience, hardware, and tracing/analysis?
01:24 imirkin: that's how you RE things
01:25 imirkin: you develop drivers against undocumented hw very much like you do against documented hw... except you also have to write the docs.
01:26 imirkin: [or gain the same level of understanding as you would from reading the docs]
01:26 nyef: There's also poking at the hardware registers to see what effect they have, which is a bit more intercessory than mere trace analysis, but yeah, that's RE.
01:26 nyef: Heh. As if docs are ever complete and accurate. (-:
01:42 imirkin: nyef: hm, "define:intercessory" isn't coming up with anything useful. care to edify?
01:49 nyef: How about "intercede"?
01:51 nyef: Intercessory isn't quite the right word, because it's "on behalf of another", but... Basically, poking about to see how the hardware responds to input not necessarily in the trace file.
02:20 Lyude: this is turning out to be a challenging rectangle
02:58 nyef: Okay, the changes for the v2 patch series are now in my working tree, compiled, and partly tested. Progress!
02:59 nyef: Still have to finish testing and get everything squared away as a proper (revised) commit series, and possibly make another attempt to figure out the frame-packing thing, but still, progress. (-:
03:29 Horizon_Brave: So... poking around my /lib/firmware directory... I noticed some manufacturers' firmware is in .fw format (like b43) and others are in .bin format like Radeon, RTL etc.. what's the difference? are both used as firmware but just written in different languages?
03:30 nyef: Horizon_Brave: They're just different file extensions.
03:30 nyef: The actual format is dependent on the driver and the hardware.
03:31 Horizon_Brave: hmm I see, so it's pretty much arbitrary, but more so about what it's being used by..
03:31 nyef: ... And I have a lead on the frame-packing stuff! But now is not the time to chase it down. /-:
03:54 Horizon_Brave: Okay...here's one... I notice that on my Debian system, the nouveau driver is a .ko..kernel object...but from what I can tell, there are no firmware files for it. Most of the files are in /lib/modules/kernel/<version>/etc.. So this means that these are rolled up in the kernel... but the other .fw and .bin firmware bits are in /lib/firmware, the difference is that these are put into the debian package but not in the kernel right
03:54 Horizon_Brave: these individual firmware files?
03:55 Horizon_Brave: and the 2nd part is.... why does nouveau get the special treatment of being directly implemented into the kernel? and not AMD? (not that I'm complaining around these parts.. just curious lol)
03:59 nyef: There are *two* AMD kernel drivers.
03:59 nyef: Possibly more.
04:00 nyef: And most (all?) of the AMD cards require firmware.
04:01 Horizon_Brave: right... well in the case of this specific instance, from what i can see here... there are no nouveau firmware files that I can see...
04:01 Horizon_Brave: just the nouveau .ko files
04:01 Horizon_Brave: unless they're named as something more obscure
04:03 dboyan_: Horizon_Brave: Nearly every gpu driver is divided into 2 parts, the kernel driver part and the userspace parts
04:03 dboyan_: They handle different tasks
04:03 nyef: And there are typically at least two userspace parts, aren't there?
04:03 dboyan_: which two?
04:04 nyef: The mesa driver and the X11 driver.
04:05 dboyan_: yep, but they are different things
04:05 nyef: Sure. One does the 3D bits, the other does some 2D bits and mediates with the X server.
04:06 dboyan_: Horizon_Brave: For gm200+, the firmware lives in the nvidia/ directory of the linux-firmware package.
04:06 dboyan_: https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/nvidia
04:08 Horizon_Brave: dboyan_: ehh...what am I looking at here..?
04:08 dboyan_: Horizon_Brave: And for previous generations, the firmware is in the kernel tree, with the assembly in files with a .fuc suffix
04:09 dboyan_: Horizon_Brave: The firmware released by nvidia (gm200+) has .bin suffix
04:12 Horizon_Brave: what you're talking about, and what you linked to...these are the official nvidia drivers? the non-free ones? or the nouveau ones developed here?
04:14 dboyan_: Horizon_Brave: They are for the nouveau driver. gm200+ need firmware from nvidia because of signing
04:15 dboyan_: Horizon_Brave: If you want to see what firmware look like in previous generation (the free and open ones). You can take a look at e.g. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/nouveau/nvkm/engine/gr/fuc directory.
04:19 Horizon_Brave: oh wow, so that means that nvidia actually did provide free and open firmware in the past?
04:20 dboyan_: No, those free ones are reverse engineered
04:20 nyef: No, it means that some people are really, REALLY good at reverse-engineering.
04:20 Horizon_Brave: lol oh
04:21 Riastradh: ...and then NVIDIA started verifying signatures on firmware...
04:21 Horizon_Brave: silly me, thinking nvidia would do good...
04:22 Riastradh: So even the really good reverse engineers can't make their own firmware for newer hardware models any more.
04:22 Horizon_Brave: ohhh so that's why they sign their firmware... to make sure that the firmware being pushed onto the card/device is authentic and not reverse engineered?
04:22 nyef: Is authentic and not malware.
04:23 dboyan_: at least what nvidia says "not malware"
04:23 Horizon_Brave: >.>
04:23 Riastradh: There's a security rationalization about malware. Doesn't justify preventing users from installing their own verification keys, though...
04:23 nyef: Well, yes. And there's also the possibility that someone manages to get their paws on the signing keys.
04:24 Riastradh: Classic defective-by-design digital restrictions management here.
04:24 Riastradh: Could happen. So far I haven't heard of any indication that they've pulled a Sony and misused the signature scheme in a way that reveals the private signing key, but then I haven't been looking closely either.
04:30 Horizon_Brave: hmm alright, thanks guys, makes more sense. Thanks Riastradh, dboyan_ and nyef
04:31 * nyef suddenly wants a NuBus nVidia card.
04:31 Riastradh: Heh.
04:31 Riastradh: Do such things exist?
04:32 nyef: Almost certainly not.
04:32 Riastradh: Are you going to put one in your Frankenstein's monster of a mac68k-lispm?
04:32 nyef: Hey! I don't have the lispm bits for it yet. d-:
04:33 nyef: (The MacIvory 2 is supposed to arrive tomorrow.)
04:33 Riastradh: Awww. I still have an Alpha sitting around that I haven't tried running OpenGenera on.
04:33 Riastradh: Heh.
04:33 nyef: Of course, the board that I really want for it is a microExplorer.
04:36 nyef: I also need to decide if I'm going to try putting the board in my Q800, or if I'm going to use my IIfx.
10:31 kattana: all ppl wasted saturday morning :/
12:52 dboyan: imirkin: I've pushed my shader cache branch at https://github.com/dboyan/mesa/tree/nouveau-cache
12:53 dboyan: imirkin: And I also uploaded a trimmed apitrace https://people.freedesktop.org/~dboyan/apitrace/portal2.trace.xz
13:05 dboyan: imirkin: Difference can be seen, for example, at call 651538, where it is drawing things behind the opposite portal. They don't appear in correct rendering, but show up (winning depth test) after shader cache is enabled.
14:17 nyef: Okay, rough plan for the stereo patches for the kernel: Get my working tree and branch sorted out into a coherent set of (revised) commits. Test against the gt215 and gk104 DISPs. Defer posting the patch series until skeggsb is known to have returned. In the mean time, possibly try to get frame-packing to work.
14:59 imirkin: nyef: sounds like a plan. weren't you going to try to get gf119 too? or did that not pan out?
14:59 imirkin: (or "one thing at a time")
15:00 nyef: G84 and GF119 are tested and working.
15:01 imirkin: oh.
15:03 nyef: Even made sure that HDMI audio didn't break while in 3D mode on GF119.
15:03 nyef: Can't do much for G84 HDMI audio at this point, though. /-:
15:04 imirkin: assuming you don't have the spdif hookup thing?
15:04 imirkin: should just be a cable from your mobo's audio chipset to the gpu
15:04 imirkin: it's like a 3-pin header somewhere
15:04 imirkin: ideally marked S/PDIF :)
15:04 imirkin: but sometimes marked C75
15:05 nyef: Haven't tracked down the pinout information to know which end is ground on each card, and I'm trying to clear the worst of my queue before my new toy arrives today.
15:05 imirkin: ;)
15:06 imirkin: oh, speaking of toys, the fx3700 is mine. ideally i'll have a G92 showing up, might play with the VP2 stuff on it.
15:06 nyef: Nice.
15:11 * imirkin hopes it'll fit in my case
15:13 nyef: Always a concern, right?
15:14 imirkin: esp for me... this is a large case, but a lot of that goes towards HDD housing
15:14 imirkin: (used to have a 4x 2TB RAID5... now 2x 8TB RAID1 :) )
15:28 dboyan: imirkin: Any idea how I can debug my problem?
15:28 dboyan: I've now found one of the problematic calls, but I just wonder why the correct rendering is correct
15:28 nyef: Okay, everything synced up except for one commit message that needs to be rewritten and one commit which needs to be dropped. Good enough for a build-and-test cycle.
15:28 imirkin: dboyan: did you look at the shaders being used for that call (by nouveau)
15:28 imirkin: dboyan: and then compare & contrast?
15:30 dboyan: I just wonder what should I compare it with
15:30 imirkin: the code being uploaded to the gpu
15:32 dboyan: I haven't done that, and I'm a little bit concerned if I can single out one shader from hundreds of them
15:35 nyef: ... smoke test passed, gf119 still works.
15:37 imirkin: dboyan: add a thing that prints the shaders CURRENTLY being used
15:37 imirkin: rather than as they're being uploaded
15:41 dboyan: imirkin: yeah, I'll try to do that
16:25 nyef: ... Broke GT215 HDMI audio. Lovely. /-:
16:49 user51: I'm sorry for the 'search it yourself' question, but I can't find ANYTHING on overclocking my GPU. I'm using an nvs140m.
16:51 MichaelLong: user51, why do you have the impression that overclocking is a topic at all for nouveau? be lucky if you own a card that can reach its standard clocks to begin with.
16:53 user51: MichaelLong: Just trying to squeeze some more performance from my old thinkpad.
16:55 user51: In all fairness, I was a little worried when I had a nvidia chip on it, but it works surprisingly well.
16:56 MichaelLong: user51, yeah my nvidia-chip was working quite ok but only using the lowest (initial) clock-configuration. but the basics worked
16:57 dboyan: user51: Maybe you are interested in reclocking
16:58 user51: dboyan: I'm not aware of any program that dynamically reclocks my GPU as needed.
16:58 user51: But if you know, a recommendation would be helpful.
16:58 dboyan: user51: You have to manually reclock with nouveau for now
16:59 nyef: ... and the test gk104 system isn't on speaking terms with the HDMI. /-:
16:59 user51: dboyan: man nouveau doesn't have anything about it, perhaps I missed something?
17:02 dboyan: user51: If you have a 4.5+ kernel, you can reclock using /sys/kernel/debug/dri/<id>/pstate
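A minimal sketch of driving that pstate file from a script, in Python. The helper names `parse_pstate` and `set_pstate` are hypothetical, and the line layout it parses ("<id>: <domain> <n> MHz ...") is an assumption about what the file prints; the exact text can differ between kernels and boards. Writing the file needs root and a mounted debugfs, and changing clocks on untested hardware may hang the GPU.

```python
import re
from pathlib import Path

# Assumed layout of one line, e.g. "0f: core 810 MHz memory 1890 MHz *".
# This pattern is a guess at the format, not a documented interface.
LEVEL_RE = re.compile(r"^\s*([0-9a-fA-F]{2}):\s*(.*)$")
CLOCK_RE = re.compile(r"(\w+)\s+(\d+)(?:-(\d+))?\s*MHz")


def parse_pstate(text):
    """Return {level_id: {domain: max_mhz}} for every listed level."""
    levels = {}
    for line in text.splitlines():
        m = LEVEL_RE.match(line)
        if not m:
            continue
        clocks = {}
        for domain, low, high in CLOCK_RE.findall(m.group(2)):
            # A range such as "270-1241" keeps its upper bound.
            clocks[domain] = int(high or low)
        levels[m.group(1).lower()] = clocks
    return levels


def set_pstate(level_id, card=0):
    """Write a level id to /sys/kernel/debug/dri/<card>/pstate (root only)."""
    path = Path(f"/sys/kernel/debug/dri/{card}/pstate")
    path.write_text(level_id)
```

Typical use would be to read the file, pick one of the ids returned by `parse_pstate`, and pass it to `set_pstate`.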
18:34 Rumple: Does nouveau support NVML?
18:35 imirkin: well that's all highly unfortunate. looks like this NV44A never worked, or at least not with the kernels i had lying around =/
18:40 imirkin: mwk: i think i need your help on this one :) let me know if you have any advice for getting a NV44A/PCI going. The FIFO doesn't seem to be able to access the commands properly.
18:40 imirkin: mwk: perhaps i can use your hwtests to experiment with it? i assume you're submitting fifo commands and whatnot?
18:41 Horizon_Brave: afternoon folks, hey imirkin
18:54 mwk: imirkin: hwtests don't submit fifo at all atm :(
18:54 mwk: except some nv1 tests, but that's not applicable to NV44A
18:55 mwk: all the pgraph tests just stuff the commands through FIFO bypass regs on pgraph
18:55 imirkin: mwk: doh :( any opinion on what could be going wrong, assuming that NV44A + AGP works ok with nouveau?
18:55 mwk: well... anything, really
18:55 imirkin: ;)
18:55 imirkin: it's saying INVALID_CMD
18:55 imirkin: and the get address is 0
18:55 imirkin: which seems ... like something's misconfigured.
18:56 imirkin: (the put address is like 0x100, so it's at least consistent)
18:56 mwk: hmm
18:56 imirkin: [ 8.822107] nouveau 0000:09:00.0: fifo: DMA_PUSHER - ch 0 [DRM] get 00000000 put 00000178 state 80000000 (err: INVALID_CMD) push 00000000
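For comparing many such DMA_PUSHER lines, a small parser can help. This is a sketch in Python; the function name `parse_dma_pusher` is hypothetical and the regular expression is derived only from the line pasted above, so other kernels may print fields it does not expect.

```python
import re

# Field order taken from the pasted dmesg line. Any extra fields between
# "put" and "state" are skipped by the non-greedy ".*?".
PUSHER_RE = re.compile(
    r"fifo: DMA_PUSHER - ch (?P<ch>\d+) \[(?P<name>.*)\] "
    r"get (?P<get>[0-9a-f]+) put (?P<put>[0-9a-f]+) .*?"
    r"state (?P<state>[0-9a-f]+) \(err: (?P<err>\w+)\) "
    r"push (?P<push>[0-9a-f]+)"
)


def parse_dma_pusher(line):
    """Return the fields of a DMA_PUSHER error line, or None if no match."""
    m = PUSHER_RE.search(line)
    if m is None:
        return None
    fields = {"ch": int(m["ch"]), "name": m["name"], "err": m["err"]}
    for key in ("get", "put", "state", "push"):
        fields[key] = int(m[key], 16)  # these values are printed in hex
    return fields
```

With the line above, `get` comes out as 0 and `put` as 0x178, which matches the observation that the pusher never advanced past the start of the buffer.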
18:56 mwk: and what's the memory type for the pushbuf dma object?
18:56 imirkin: oh right - it's GART. forcing it to vram allegedly makes it work
18:56 mwk: what GART
18:56 mwk: 2 or 3?
18:57 imirkin: erm
18:57 imirkin: "above my paygrade"
18:57 mwk: there's a possibility something's off about GART handling on NV44A
18:57 mwk: since it's sorta AGP, but really a h4xed NV44 PCIE core
18:57 imirkin: /* allocate memory for dma push buffer */
18:57 imirkin: target = TTM_PL_FLAG_TT | TTM_PL_FLAG_UNCACHED;
18:57 imirkin: if (nouveau_vram_pushbuf)
18:57 imirkin: target = TTM_PL_FLAG_VRAM;
18:58 imirkin: (and nouveau_vram_pushbuf is false as you might imagine)
18:58 mwk: anyhow
18:58 mwk: it may be worth a try to disable AGP GART and use PCI sg instead
18:58 mwk: ie. DMA object mem type 2 instead of 3
18:58 imirkin: this is a PCI board
18:59 mwk: that's an even better reason to make sure it's not using AGP accesses
18:59 imirkin: ;)
18:59 imirkin: well, i'm pretty sure that's disabled
19:00 imirkin: pci_is_pcie(pci_dev) ? NVKM_DEVICE_PCIE :
19:00 imirkin: pci_find_capability(pci_dev, PCI_CAP_ID_AGP) ?
19:00 imirkin: NVKM_DEVICE_AGP : NVKM_DEVICE_PCI,
19:00 imirkin: so it should be ending up with NVKM_DEVICE_PCI
19:00 imirkin: i guess i didn't check that explicitly
19:01 imirkin: i see no mention of agp - https://hastebin.com/tuyezamehe.cs
19:09 jamm: hi imirkin: i saw a task posted on trello "Update maxwell shaders with proper delays" I was wondering how one would go about updating them? I see a lot of assembly code at /xf86-video-nouveau/tree/src/shader but being new to graphics drivers in general, especially DDX, I'm kinda lost :/
19:10 imirkin: jamm: it says (st 0x0) now. it should have proper delay flags in there for maor fps
19:10 imirkin: jamm: the fact that this is the ddx is largely irrelevant
19:11 imirkin: just ... need to update the shader assembly with proper scheduling info
19:11 imirkin: (only the gm107 ones - the rest are fine...ish)
19:11 imirkin: mwk: so ... what is this gart 2 or 3 thing you were talking about? is it this?
19:11 imirkin: case NV_MEM_TARGET_PCI:
19:11 imirkin: dmaobj->flags0 |= 0x00023000;
19:12 imirkin: case NV_MEM_TARGET_PCI_NOSNOOP:
19:12 imirkin: dmaobj->flags0 |= 0x00033000;
19:12 imirkin: (and AGP -> PCI_NOSNOOP)
19:12 mwk: imirkin: yes
19:12 mwk: where is that?
19:13 imirkin: mwk: usernv04.c
19:13 imirkin: nvkm/engine/dma/usernv04.c
19:13 mwk: 2 is PCI, so it's fine... but it also seems to be setting the linear bit, which is worrying
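The "2 or 3" memory type can be pulled out of the two flags0 constants quoted above. The sketch below is Python and assumes the target occupies bits 16-17, which is consistent with 0x00023000 and 0x00033000 but is not stated anywhere in this log. The name `decode_flags0` is hypothetical, and the remaining bits are returned undecoded because the chat does not say which of them is the linear bit.

```python
TARGET_SHIFT = 16  # assumption: the target field starts at bit 16
TARGET_MASK = 0x3  # assumption: the field is two bits wide

# Names taken from the two case labels quoted in the chat.
TARGET_NAMES = {2: "PCI", 3: "PCI_NOSNOOP"}


def decode_flags0(flags0):
    """Split flags0 into (target name, all remaining bits)."""
    target = (flags0 >> TARGET_SHIFT) & TARGET_MASK
    rest = flags0 & ~(TARGET_MASK << TARGET_SHIFT)
    return TARGET_NAMES.get(target, f"unknown({target})"), rest
```

Both constants leave the same 0x3000 in the remaining bits, so under this assumption the two cases differ only in the target field.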
19:13 imirkin: for pushbufs, you wouldn't want the linear bit?
19:13 mwk: not on PCI cards
19:14 imirkin: ok, well it definitely does work on *some* PCI cards
19:14 mwk: VRAM objects are not paged, so linear is good; AGP GART objects are paged via, well, the GART
19:14 mwk: but for PCI objects, you need the card to do the paging
19:15 imirkin: this is an area of the GPUs i know nothing about
19:15 imirkin: is there something in the hwdocs about this already?
19:15 mwk: probably not
19:16 mwk: yeah nothing
19:16 mwk: hm.
19:16 imirkin: ok, but does the get address of "0" make sense? since it's relative to the start of the dmaobj?
19:17 mwk: yeah, that sounds fine
19:18 mwk: ah
19:18 imirkin: ok. i guess i need to double-check whether it's really coming up as PCI and not secretly as AGP
19:18 mwk: so it's not going through this code at all... well, shouldn't be
19:19 imirkin: really? see nouveau_chan.c::nouveau_channel_prep
19:19 mwk: yeah, but if base.target == NV_MEM_TARGET_VM, it sets the clone flag
19:19 imirkin: right, which it is
19:19 mwk: which will result in using the dmaobj set up by mmu, not making a new one
19:19 imirkin: ah
19:20 imirkin: and that's set up by nvkm/subdev/mmu/nv44.c i believe
19:20 mwk: maybe...
19:20 mwk: but that'd be real bad
19:20 mwk: as nv44 GART doesn't involve dma objects
19:20 imirkin: heheh
19:21 imirkin: .mmu = nv44_mmu_new,
19:21 mwk: what if you change it to nv04?
19:21 imirkin: that would take time. i can't reboot just now.
19:21 imirkin: but i'll definitely try that
19:21 imirkin: if (device->type == NVKM_DEVICE_AGP ||
19:21 imirkin: !nvkm_boolopt(device->cfgopt, "NvPCIE", true))
19:21 imirkin: i suspect that should instead be
19:21 imirkin: if device->type != PCIE || ...
19:22 mwk: ah :)
19:22 mwk: seems to be our culprit
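The difference between the current check and the suggested one can be modelled in a few lines. This is a Python model of the C conditions pasted above, not kernel code. `DeviceType`, `classify`, `fallback_current` and `fallback_proposed` are hypothetical names, and it assumes that the branch guarded by the condition is the one that falls back to the nv04-style MMU.

```python
from enum import Enum


class DeviceType(Enum):
    PCI = "pci"
    AGP = "agp"
    PCIE = "pcie"


def classify(is_pcie, has_agp_cap):
    """Mirror of the pci_is_pcie / PCI_CAP_ID_AGP ternary from the log."""
    if is_pcie:
        return DeviceType.PCIE
    return DeviceType.AGP if has_agp_cap else DeviceType.PCI


def fallback_current(dev_type, nvpcie_opt=True):
    """Current check: only AGP (or NvPCIE=0) takes the fallback path."""
    return dev_type == DeviceType.AGP or not nvpcie_opt


def fallback_proposed(dev_type, nvpcie_opt=True):
    """Suggested check: anything that is not PCIE takes the fallback path."""
    return dev_type != DeviceType.PCIE or not nvpcie_opt
```

A plain PCI board (not PCIE, no AGP capability) is the only case where the two checks disagree: the current one keeps it on the nv44 path, the suggested one sends it to the fallback.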
19:27 imirkin: ok, i'll def give that a shot
19:27 imirkin: presumably same deal for nv41
19:27 mwk: uh?
19:28 mwk: hmm
19:28 mwk: there shouldn't be any AGP/PCI device using nv41 mmu in the first place
19:28 imirkin: or are those more likely to be PCIE -> PCI chips
19:28 mwk: hmm.
19:28 mwk: yes.
19:28 imirkin: yes to what
19:29 mwk: that could be triggered by PCIE -> PCI chips
19:29 mwk: so yeah, same for nv41
19:29 imirkin: and so then you'd want ... the PCI logic or the PCIE logic?
19:29 mwk: nv04_mmu
19:29 imirkin: ok
19:29 mwk: I think
19:29 mwk: but I'm not really sure
19:29 imirkin: well, i'm sure it's quite rare
19:30 mwk: if the MMU is on the GPU, there's really no reason why it shouldn't cooperate with a bridge chip
19:30 mwk: I mean, MMU or not, the card does a PCIE read/write cycle either way
19:31 imirkin: oh, but with a bridge chip
19:31 imirkin: it'd come up as a PCIE device, not AGP/PCI
19:31 mwk: I guess NV44A could just be special, since it's really a different chip than NV44
19:31 mwk: no, the other way
19:31 imirkin: oh?
19:31 mwk: without bridge, NV41 is PCIE; with a bridge, it comes up as AGP/PCI
19:31 imirkin: transparent bridge... ugh
19:32 mwk: so the bridge will convert the PCIE transaction to a PCI/AGP transaction
19:32 mwk: why would it care whether it came through MMU on the card or not
19:32 imirkin: ok. so ... probably nv4a is just weird?
19:32 mwk: that's likely
19:32 imirkin: ok
19:32 imirkin: i'll buy that
19:36 jamm: imirkin: ah, i see sched 0x0's in the *nv110.fp assemblies. I'm new to GPU assembly, but have worked on x64 assembly on nasm before. Any references you could point me to understand GPU assembly better? I remember AMD having some of their references hosted on their site, not sure about nvidia though..
19:37 jamm: Also, this might be a silly question, but do I need a maxwell GPU to be able to test/run this? I have a pascal GPU with me ATM
19:43 imirkin: a pascal GPU will do just fine, but will require you to bring up pascal in the DDX. should be easy enough.
19:43 imirkin: the "correct" sched codes will be based on an understanding of instruction latencies and whatnot
19:43 imirkin: some of this knowledge is encoded in nv50_ir_emit_gm107.cpp in mesa
19:44 imirkin: the rest of it is inside hakzsam's head, although i doubt he'll let you open it up for deeper analysis
19:47 nyef: "It is easier to obtain forgiveness than permission." d-:
19:57 jamm: imirkin: well, i'm ready to take up this quest of knowledge ^_^
19:58 jamm: i'll look into nv50_ir_emit_gm107.cpp. I'll take a nap now since it's like 5am in japan and i probably have to force myself to sleep, no idea why XD
20:03 pmoreau: jamm: Here is some reading regarding scheduling, for when you wake up: https://github.com/NervanaSystems/maxas/wiki/Control-Codes
20:08 nyef: At least it's not a correctness issue, right?
20:08 nyef: Or is it?
20:10 imirkin: nyef: (st 0x0) makes it delay 15 cycles
20:10 imirkin: that might not be enough for texture fetches, but ... hopefully ...
20:10 imirkin: bbl. testing out the nv44a theory.
20:14 imirkin: mwk: w00t! it worked!
20:15 mwk: :)
20:28 imirkin: so why did i want this in the first place? i forget :(
20:29 nyef: Because you were out of PCIe slots, but had an open PCI slot?
20:30 imirkin: no... that's a means to an end... i wanted a nv4x.
20:30 imirkin: [but not badly enough to sacrifice a pcie slot]
20:44 imirkin: if i have a DB9 male <-> DB9 female adapter, that's *most likely* a null modem rx/tx line switcher thing, right?
20:46 Riastradh: Hmm. Aren't those usually male<->male?
20:47 imirkin: so why does this thing exist then?
20:48 imirkin: under what circumstances would you want a male <-> female adapter
20:48 imirkin: it's not a cable, it's like a hard thing that's about 2" long
20:49 Riastradh: No, never mind, your guess is probably right.
20:49 Riastradh: My DB-25 null modem is a male<->female.
20:55 Riastradh: https://mumble.net/~campbell/tmp/20170318/assembly.jpg https://mumble.net/~campbell/tmp/20170318/disassembly.jpg
20:55 imirkin: hehehe
20:55 imirkin: mine's thinner.
20:56 Riastradh: That's my magic nouveau debugging serial console wand, since I couldn't find a DE-9 null modem when I last went looking, but I did happen to have two DB-25/DE-9 adapters handy.
20:56 imirkin: anyways, this is excellent. i've been missing a null modem thingie.
21:04 john_cephalopoda: Hello!
21:04 Riastradh: `Null modem? Gee, I dunno, I think all we have are Comcast ones.'
21:20 imirkin: lol
21:21 imirkin: i'm going through a box-o-stuff, where i found it, also found a 56k pci modem. good times.
21:21 imirkin: (probably a softmodem, although equally likely that it was one of the few that worked on linux)
21:35 imirkin: mwk: you got any nv41+ PCI boards (that aren't nv4a)?
21:36 mwk: imirkin: GF119, curiously enough
21:36 mwk: other than that, no
21:37 imirkin: mwk: hm ok. a little surprising about the GF119. i meant nv4x though ;)
21:42 imirkin: i guess skeggsb might have one lying around... dunno.
21:50 imirkin: airlied: this look familiar? looks like my gpu went into runpm and isn't coming back. this is not a system where runpm is an option... https://hastebin.com/inozoyasod.go
21:51 imirkin: airlied: i'm running a bit of a frankenkernel - 4.10.4 + all of ben's patches for 4.11.
21:51 imirkin: i think it's stuck on the NV4A gpu, at least based on the xorg log
21:55 imirkin: <rebooting>
22:21 airlied: imirkin: strange might be a bad pm hook
22:21 imirkin: airlied: for now i've booted with nouveau.runpm=0
22:41 imirkin: airlied: i can test patches if you have any concrete ideas though
22:42 imirkin: airlied: a little odd that anything of the sort would trigger on my system, since ... no ACPI, no nothing (related to these devices)
22:42 john_cephalopoda: Is it possible to apitrace 32 bit applications?
22:43 imirkin: sure
22:47 john_cephalopoda: Hmm, can't really get it to work.
22:47 john_cephalopoda: It really bugs me, that I get small glitches in some situations.
22:48 imirkin: ok... i assume you're using a 32-bit apitrace to trace 32-bit applications?
22:50 john_cephalopoda: imirkin: Haven't compiled the 32 bit version yet. It's a bit tricky for me because I have no experience with cross-compiling.
22:50 Calinou: easy way: compile from a chroot/systemd container
22:50 john_cephalopoda: I was talking about general graphics glitches in some situations in 32 and 64 bit applications.
22:52 john_cephalopoda: It's hard to trace them down to something.
22:53 imirkin: john_cephalopoda: well, the thing ain't magic, it preloads a lib that hooks into the gl stuff. so it better match the ABI of the target application.
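One quick way to check for that mismatch is to compare the ELF class byte of the application and of the preloaded wrapper library. A minimal Python sketch follows; `elf_bits` and `same_word_size` are hypothetical helper names. It only tells 32-bit from 64-bit ELF files and does not check the CPU architecture or any other part of the ABI.

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS = 4  # index of the class byte inside e_ident


def elf_bits(header):
    """Return 32 or 64 for an ELF header, or None if it is not ELF."""
    if len(header) <= EI_CLASS or header[:4] != ELF_MAGIC:
        return None
    # ELFCLASS32 is 1 and ELFCLASS64 is 2; anything else is invalid.
    return {1: 32, 2: 64}.get(header[EI_CLASS])


def same_word_size(app_path, wrapper_path):
    """True if both files are ELF and share the same class."""
    with open(app_path, "rb") as app, open(wrapper_path, "rb") as wrapper:
        app_bits = elf_bits(app.read(EI_CLASS + 1))
        wrapper_bits = elf_bits(wrapper.read(EI_CLASS + 1))
    return app_bits is not None and app_bits == wrapper_bits
```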
22:54 john_cephalopoda: imirkin: Yeah, I know.
22:55 pmoreau: imirkin: I have got that issue as well on my MBP (Kepler card), and there shouldn’t be any runpm going on either. It was with "vanilla" 4.10.1.
22:56 pmoreau: It hasn’t happened recently though, not sure why.
22:56 imirkin: are you using the other gpu? :)
22:57 pmoreau: Well yes, but the error occurred as I was using the IGD (IIRC)
22:58 imirkin: pmoreau: btw, if it's not difficult, can you give me a 'glxinfo -l -s' for the kepler against mesa 17.0? (which i assume you have installed)
22:58 pmoreau: I do, 17.0.1 to be precised
22:59 pmoreau: https://hastebin.com/lucumiboqi.py
22:59 pmoreau: s/precised/precise
23:20 imirkin: awesome thanks
23:21 Megaf: Hi all, on recent kernels the nouveau driver began to often go nuts
23:21 Megaf: [17391.808048] nouveau 0000:04:00.0: gr: 00000010 [ILLEGAL_MTHD] ch 6 [000f6b8000 systemd-logind[527]] subc 2 class 502d mthd 0320 data 0000001c
23:21 Megaf: this happens when watching youtube videos, playing media or games
23:22 Megaf: only way to stop the errors is a reboot after killing the X server
23:23 imirkin: Megaf: anything else in dmesg?
23:23 Megaf: just thousands of those lines
23:23 imirkin: and which GPU?
23:24 * imirkin doesn't see a method 320 for the 502d class... so i agree with the gpu...
23:24 Megaf: 04:00.0 VGA compatible controller: NVIDIA Corporation MCP89 [GeForce 320M] (rev a2)
23:25 imirkin: it does, however, exist in the copy class, which your gpu should have... interesting
23:26 nyef: ... MCP89, huh?
23:26 nyef: Mac Mini?
23:26 Megaf: MacBook Pro Mid 2010 13"
23:26 Megaf: aka MacBook Pro 7.1
23:26 nyef: Ah, okay.
23:26 nyef: Still, Mac.
23:26 Megaf: pretty much so
23:26 imirkin: apple was the only one to get that gpu
23:27 Megaf: I believed it worked fine in some older version of 3.2.x
23:27 Megaf: Linux aluminium 4.9.0-2-amd64 #1 SMP Debian 4.9.13-1 (2017-02-27) x86_64 GNU/Linux
23:27 imirkin: well, my guess is you've also had some small little tiny changes in userspace?
23:28 Megaf: OpenGL version string: 3.0 Mesa 13.0.5
23:28 Megaf: imirkin: Fresh Debian install
23:28 Megaf: Gallium 0.4 on NVAF
23:28 Megaf: I think that's all relevant information
23:31 imirkin: my main point was that blaming this on the kernel may be misguided
23:34 imirkin: either way, nothing jumps out at me as an obvious cause. sorry.
23:42 Megaf: imirkin: http://paste.debian.net/plain/922604
23:42 imirkin: Megaf: oooooh
23:42 imirkin: Mar 18 23:10:18 aluminium kernel: [16775.432355] nouveau 0000:04:00.0: fifo: DMA_PUSHER - ch 6 [systemd-logind[527]] get 00200302f0 put 0020031454 ib_get 00000144 ib_put 00000155 state 8000ef05 (err: INVALID_CMD) push 00400040
23:43 imirkin: so ... this is a fairly common error on tesla-era gpu's. unfortunately the cause is unknown.
23:44 Megaf: anything to do with..
23:44 Megaf: Mar 18 17:50:08 aluminium kernel: [ 4.662115] nouveau 0000:04:00.0: bus: MMIO write of 0000807f FAULT at 100c18
23:44 Megaf: Mar 18 17:50:08 aluminium kernel: [ 4.745154] nouveau 0000:04:00.0: bus: MMIO write of 0000807e FAULT at 100c1c
23:44 Megaf: ?
23:44 imirkin: [hmmm... i wonder if this is a lack of synchronization between switching fifo context and switching mmu context]
23:44 imirkin: no, i think those are unrelated.
23:44 imirkin: nyef: do you see those same MMIO write faults on your MCP89?
23:45 imirkin: oh. huh. yeah. those come from mcp77_ram_init which has a workaround for MCP77/79. i guess it doesn't apply to MCP89?
23:46 nyef: imirkin: I... don't do much with my MCP89 these days?
23:46 imirkin: i think at the time we didn't test it on MCP89 since no one had one handy
23:46 imirkin: and just assumed it'd be all good
23:46 nyef: Well, other than building kernels and running intel-gpu-tools testdisplay.
23:47 imirkin: nyef: does it show that on load?
23:47 imirkin: [of the nouveau kernel module]
23:48 Megaf: so indeed it was working and you broke it xP
23:48 pmoreau: imirkin: According to NVIDIA (IIRC), the fix should not apply for MCP89
23:48 imirkin: Megaf: not really broke, more like "added some errors that are printed on boot"
23:48 imirkin: pmoreau: oh, so we should probably fix that, huh
23:48 pmoreau: Let me have a look at the email
23:49 nyef: Let me fire it up.
23:49 karolherbst: imirkin: for hitman: total instructions in shared programs : 517553 -> 481872 (-6.89%), total gprs used in shared programs : 35576 -> 34235 (-3.77%)
23:49 imirkin: of course nvidia tends to lie about things like this, but if in addition someone's seeing mmio errors, that's a pretty good indicator.
23:49 imirkin: karolherbst: =]
23:49 karolherbst: guess what the biggest change is
23:49 imirkin: karolherbst: any improvement to perf?
23:50 karolherbst: .... little
23:50 karolherbst: avg fps: 6.09 -> 6.12
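The percentages quoted above are relative changes measured against the old value. A minimal Python sketch of that arithmetic, using only the numbers from the messages; the helper name `pct_change` is made up for illustration and is not shader-db's own code:

```python
def pct_change(old, new):
    """Relative change from old to new, in percent of old."""
    return (new - old) / old * 100.0


# Numbers copied from the messages above.
stats = {
    "total instructions": (517553, 481872),
    "total gprs": (35576, 34235),
    "avg fps": (6.09, 6.12),
}

for name, (old, new) in stats.items():
    print(f"{name}: {old} -> {new} ({pct_change(old, new):+.2f}%)")
```

By the same measure the fps gain is roughly half a percent, which matches the "little" above.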
23:50 nyef: ... Nothing. Only thing it bitches about at startup is hwmon_device_register().
23:50 imirkin: hrm. ok. so *some* nvaf devices have those regs others don't? annoying.
23:51 karolherbst: imirkin: this change improved instruction count by around 5.5%: https://github.com/karolherbst/mesa/commit/e64f0e546ed71860c7202840a67aad2fa64a3fc0
23:51 karolherbst: ....
23:51 imirkin: hahahha
23:51 pmoreau: imirkin: "The pollers exist on MCP77/78 and MCP79/7A", from https://lists.freedesktop.org/archives/nouveau/2014-November/019266.html
23:51 nyef: This on 4.10.0 + local patches.
23:51 imirkin: karolherbst: ship it :)
23:52 karolherbst: imirkin: I already wondered: neg $r2 $r2; fma $r2 $r2 $r3 $r4... and I was like: mhhh, something is definitely wrong
23:52 imirkin: nyef: yeah, that workaround went into kernel like 4.0 or so
23:52 imirkin: if not earlier
23:52 karolherbst: imirkin: tomorrow or so. Sadly I have no internet access at my new home right now. Just tethering currently
23:52 imirkin: karolherbst: it used to be that nothing generated OP_FMA. now it's reachable though.
23:52 karolherbst: yeah
23:53 nyef: There's something about "using M2MF for buffer copies". Is that relevant?
23:53 karolherbst: I also had to modify my postraloadpropagation pass
23:53 imirkin: karolherbst: mail the patch, i'll apply it
23:53 imirkin: nyef: i mean ... it's just informational.
23:53 karolherbst: imirkin: https://github.com/karolherbst/mesa/commit/e64f0e546ed71860c7202840a67aad2fa64a3fc0.patch
23:53 karolherbst: I could even add a signoff-by or so :D
23:53 imirkin: karolherbst: can you add a s-o-b line?
23:53 pmoreau: it went in 3.19 IIRC
23:53 Megaf: imirkin: nyef: So, what do you suggest I do? Go to the proprietary driver?
23:53 karolherbst: imirkin: I will do a full shader-db run as well and post some fancy stats
23:54 imirkin: Megaf: that's definitely the way to get the most out of your gpu.
23:54 imirkin: karolherbst: ok. go for it. let me know when that's ready.
23:54 Megaf: My two biggest problems with the proprietary driver are: 1- it's proprietary, 2- it doesn't work in EFI mode; I have to remove grub-efi, install grub-pc, and migrate stuff to BIOS mode, running it emulated.
23:54 karolherbst: imirkin: will do
23:55 Megaf: then the console resolution will be wrong, and brightness keys will no longer work
23:55 Megaf: requiring me to edit configuration files to work around
23:55 Megaf: should just run macOS...
23:55 imirkin: Megaf: the situation with nouveau is that there's a very finite quantity of people working on it, with a very finite quantity of documentation, with a fairly infinite breadth of hardware to support
23:55 Megaf: I understand
23:55 karolherbst: imirkin: but something else is wrong with nouveau here.. nothing I do really changes the perf, though it should. Maybe it is mostly related to scheduling, but sadly I can't try out my dual_issue pass, cause it is broken and hitman hits the broken parts...
23:55 Megaf: but as a user, that doesn't help, does it?
23:55 imirkin: Megaf: not in the least
23:56 imirkin: Megaf: as a user, your only option is to stick to intel/amd gpu's
23:56 Megaf: remember when nvidia used to be hardware of choice for Linux?
23:56 imirkin: i do.
23:56 imirkin: that time is long gone.
23:57 Megaf: at the time I had a Radeon 9200.
23:57 karolherbst: at that time, a lot of stuff was better
23:57 imirkin: i had a radeon 7000 iirc... the vivo kind. it was awesome.
23:57 Megaf: it was
23:57 imirkin: and then i wanted to watch over-the-air ATSC broadcasts, which were the cool new thing
23:57 imirkin: and realized my cpu was *nowhere* close to being able to decode them
23:57 Megaf: and that Radeon was the reason I was using the distro I was using at the time. That I used until not very long ago.
23:58 imirkin: and then i got a NV34 or so
23:58 imirkin: with xvmc support. it was great.
23:58 Megaf: heh
23:58 imirkin: in the past few years, upstream support for recent-model amd hw has really come together.