00:05 pq: karolherbst, quite right
00:06 karolherbst: pq: yeah well I don't know, it seems fine now, I still get "UNKNOWN" entries, but I think the issue is trivial
00:07 karolherbst: pq: especially because it looks like this: https://gist.github.com/karolherbst/3a75e2540218b3a2d369
00:07 karolherbst: starting with 0xf0001088
00:07 pq: karolherbst, you have to know what the page size really is, you want to arm all pages and not just the first one :-)
00:07 karolherbst: and ending with 0xf0096aff
00:07 karolherbst: ohhhhh
00:08 karolherbst: pq: so I have to fix this: https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/arch/x86/mm/kmmio.c#n417
00:08 karolherbst: the size += PAGE_SIZE part?
00:08 karolherbst: but I thought this was only for unaligned mappings
00:12 pq: karolherbst, no. It has to *handle* unaligned mappings, but it really needs to arm all pages in the whole probe area
00:12 pq: so all pages inside the area, plus possibly extra pages at both ends in case the area is unaligned
00:13 pq: so instead of hardcoding += PAGE_SIZE, you'd have to know what page size is actually being used
00:13 pq: I certainly hope it is not possible to get mixed page sizes in a single mapping...
00:14 pq: though I wouldn't be surprised
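A minimal sketch of the point pq makes above: every page overlapping the probe area has to be armed, and the stride must match the page size the mapping really uses. The name `pages_to_arm` and the example addresses are illustrative assumptions, not kernel API.

```python
KB4 = 4 * 1024
MB2 = 2 * 1024 * 1024
MB16 = 16 * 1024 * 1024

def pages_to_arm(addr, size, page_size):
    """Return the base address of every page overlapping [addr, addr + size)."""
    if size <= 0:
        return []
    first = addr & ~(page_size - 1)              # round the start down to a page boundary
    last = (addr + size - 1) & ~(page_size - 1)  # page that holds the final byte
    return list(range(first, last + page_size, page_size))

# An unaligned 16-byte area straddling a 4 KiB boundary touches two pages.
print([hex(p) for p in pages_to_arm(0xf0000ff8, 16, KB4)])

# The same 16 MiB area is 4096 small pages or 8 large ones.
print(len(pages_to_arm(0xf0000000, MB16, KB4)), len(pages_to_arm(0xf0000000, MB16, MB2)))
```

If the mapping is backed by 2 MiB pages, stepping through it with a hardcoded 4 KiB stride no longer corresponds to the real pages, which is presumably what went wrong here.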
00:15 pq: karolherbst, so, register_kmmio_probe() iterates through all pages that overlap with the probe area
00:15 pq: add_kmmio_fault_page() sets up our tracking for each such page
00:16 pq: and arm_kmmio_fault_page() marks the page to cause faults on access
00:16 pq: there is also the opposite version of these functions
00:17 pq: when a fault happens, our tracking gets looked up, disarm_kmmio_fault_page() makes the page available, execution is resumed with single-stepping...
00:18 pq: ..once the instruction is executed again for real, we get a debug trap since we were single-stepping, call arm_kmmio_fault_page() again, and resume normal execution.
00:19 pq: kmmio_handler() is the one called on the page fault before single-stepping, and post_kmmio_handler() is the one called after.
00:24 pq: and those two then have the hooks to call the actual instrumentation to record the data
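The cycle pq walks through above can be modelled roughly as follows. This is a toy model, not the kernel code: the class and method names are made up, and the "pre" and "post" log entries only loosely stand in for what kmmio_handler() and post_kmmio_handler() would record.

```python
class KmmioModel:
    """Toy model of the arm -> fault -> disarm -> single-step -> re-arm cycle."""

    def __init__(self):
        self.armed = set()    # pages that currently fault on access
        self.stepping = None  # page left disarmed while one instruction single-steps
        self.log = []

    def arm(self, page):
        self.armed.add(page)

    def access(self, page, what):
        if page not in self.armed:
            return False  # untracked page: nothing is recorded
        # "pre" handler: runs on the page fault, records, then disarms the page
        self.log.append(("pre", what))
        self.armed.discard(page)
        self.stepping = page
        # ... here the instruction would execute for real under single-stepping ...
        # "post" handler: runs on the debug trap, records, then re-arms the page
        self.log.append(("post", what))
        self.armed.add(self.stepping)
        self.stepping = None
        return True

m = KmmioModel()
m.arm(0xf0001000)
m.access(0xf0001000, "read")
print(m.log)
```

The page is accessible only for the duration of one instruction, which is why an instruction that touches many locations (or another CPU touching the page in the meantime) is a problem for this scheme.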
00:56 karolherbst: pq: k
00:56 karolherbst: pq: no, there aren't different page sizes within a mapping
00:57 karolherbst: bigger sizes are only used when the total size is aligned to a bigger page size
00:59 karolherbst: well it kind of works already, I get a really big trace with all kind of stuff, so I think I only have to get those details right now
00:59 karolherbst: well useful stuff
01:24 karolherbst: pq: well I think the unknown stuff is something else though
01:25 pq: karolherbst, you get trace entries of type "UNKNOWN"? That would be an instruction that mmiotrace has no code to decode.
01:26 karolherbst: pq: yes
01:26 pq: you can look at some CPU reference manual to see if the bytes make sense as an instruction, or if something else screwed up.
01:27 karolherbst: I am sure something else screwed up
01:27 karolherbst: or maybe nvidia did something to screw with mmiotrace :D
01:28 pq: I believe the set of understood instructions in mmiotrace is far from exhaustive, since you don't usually use funny stuff to access mmio
01:28 karolherbst: pq: no, mmiotrace doesn't interpret those instructions at all
01:28 karolherbst: it does some kernel calls for this
01:29 pq: really?
01:30 karolherbst: yeah
01:30 pq: what does it call?
01:30 karolherbst: wait a sec
01:30 karolherbst: get_ins_type(instruction_pointer(regs))
01:31 karolherbst: or is get_ins_type stuff implemented by the mmiotrace stuff?
01:31 pq: that's mmiotrace's own code alright
01:31 pq: see arch/x86/mm/pf_in.c
01:35 karolherbst: ahh k
01:36 karolherbst: well it is the 361 module
01:36 karolherbst: maybe they added some fancy instructions
01:36 karolherbst: but
01:36 karolherbst: this happens in big blocks
01:37 pq: I'm almost sure the kernel should have a framework for really emulating instructions and a lot more than mmiotrace can decode, but no-one has the time to see how it might be hooked up.
01:37 pq: emulating instead of single-stepping would also help a lot with making SMP work
01:37 karolherbst: painful is, that mmiotrace doesn't tell which instruction can't be decoded :/
01:37 pq: IIRC
01:38 pq: karolherbst, it does, it prints the bytes
01:38 pq: check the mmiotrace doc in the kernel tree :-)
01:38 karolherbst: ohh the instructions are only decoded in one char?
01:38 karolherbst: that's rather small
01:38 karolherbst: okay
01:39 pq: 4
01:39 karolherbst: well the code does this: unsigned char *ip = (unsigned char *)instptr;
01:39 karolherbst: (*ip) << 16 | *(ip + 1) << 8 | *(ip + 2)
01:39 pq: decoding happens byte by byte
01:39 pq: since instruction length varies
01:39 karolherbst: ohh right
01:40 karolherbst: so with 0xf00f82e3 ip = 0xf, ip+1 = 0xf, ip+2 = 0x82e3
01:41 karolherbst: mhh
01:41 karolherbst: no
01:41 karolherbst: ip = 0x0 and ip+1 = 0xf00f
01:41 karolherbst: wait... what's wrong with me today
01:42 pq: remember little-endian ;-)
01:43 karolherbst: yeah I know and also 0xff mask
01:43 karolherbst: which means the data is screwed anyway
01:43 karolherbst: because there should be no way it gets bigger than 0xffffff
01:43 karolherbst: but it is
01:43 pq: hm?
01:43 karolherbst: well there are three 0xff values shifted into that data
01:43 karolherbst: 0xf00f82e3
01:44 karolherbst: but that makes no sense in that case
01:44 pq: search for UNKNOWN in kernel/trace/trace_mmiotrace.c to see how it prints it
01:44 karolherbst: my_trace->value = (*ip) << 16 | *(ip + 1) << 8 | *(ip + 2);
01:45 karolherbst: 0xe3 and 0x82 seems reasonable though
01:45 karolherbst: so maybe there is somewhere an offset +1 issue
01:46 pq: it prints three bytes, but you get to figure out yourself how to interpret that as an instruction.
01:46 pq: I've long forgot how to do that
01:49 karolherbst: 0xf 0x82 is a jump
01:49 karolherbst: but in that case it is still messed up data
01:51 karolherbst: also that "|" annoys me, because if those values are bigger than chars, it all gets unusable anyway
02:01 karolherbst: seems like the unknown thing is "f3 a4 c3 f6"
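A small sketch of the packing expression quoted above, applied to the bytes karolherbst just found. Because `ip` is an `unsigned char *`, each dereference is in the range 0..0xff, so the packed result cannot exceed 0xffffff; a number such as 0xf00f82e3 presumably comes from a different field of the trace line. The helper names are illustrative only.

```python
def pack_three(ip_bytes):
    """Mirror (*ip) << 16 | *(ip + 1) << 8 | *(ip + 2) for the first three bytes."""
    b0, b1, b2 = ip_bytes[0], ip_bytes[1], ip_bytes[2]
    return (b0 << 16) | (b1 << 8) | b2

def unpack_three(value):
    """Split a packed value back into its three bytes, first byte first."""
    return [(value >> 16) & 0xff, (value >> 8) & 0xff, value & 0xff]

code = bytes([0xf3, 0xa4, 0xc3, 0xf6])
value = pack_three(code)
print(hex(value))
print([hex(b) for b in unpack_three(value)])

# Largest value three bytes can produce.
print(hex(pack_three(b"\xff\xff\xff")))
```

The first byte of the instruction ends up in the most significant position, so the packed value reads in instruction order regardless of the CPU's endianness.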
02:07 pmoreau: karolherbst: I downloaded and applied your patch. Let's see if I can mmiotrace… :-)
02:07 karolherbst: pmoreau: :)
02:08 karolherbst: pmoreau: it seems like nvidia uses a new instruction now so you might see some unknown entries or parts are just unknown
02:08 pmoreau: We'll see. I'm still on the 340xx version since I need Tesla support, so might have different results.
02:08 karolherbst: right
02:08 karolherbst: I use 361
02:12 pmoreau: Can't find some symbols… Mmm, let's try with a newer version of the kernel then.
02:13 karolherbst: or just rebuild nvidia
02:13 karolherbst: usually undefined symbols in nvidia means you just have to rebuild it
02:14 karolherbst: the prebuilt binaries have no reference to any linux symbols
02:16 karolherbst: have to reboot anyway, because nvidia just got a null dereference in __down+0x3c/0xa0 ...
02:24 karolherbst: pq: okay, seems to be the same instruction all over again: f3 a4 c3 with either f or f6 as ip+3
02:37 pmoreau: Meh, can't manage to pass the custom `O=folder` down to the make…
02:37 pmoreau: Will try again tonight
02:37 karolherbst: :D
02:37 karolherbst: ohh nvidia builds against the wrong kernel?
02:38 karolherbst: ohhh
02:38 karolherbst: pq: the unknown instruction comes from linux itself
02:38 karolherbst: "rep movsb"
02:38 karolherbst: arch/x86/lib/memcpy_64.S:50
02:39 pq: I think skeggsb may have found that a while back
02:39 karolherbst: ohh k
02:39 pq: rep instructions are fun, because single-stepping might step the whole instruction, not just one cycle, too.
02:39 karolherbst: rep movsb %ds:(%rsi),%es:(%rdi) is the generated assembly
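For reference, the bytes decode as 0xf3 (REP prefix), 0xa4 (MOVSB) and 0xc3 (a near RET, i.e. the following instruction). A rough sketch of recognising that pattern is below; it is not how pf_in.c is structured, the names are hypothetical, and it ignores any other prefixes that may sit between 0xf3 and the opcode.

```python
REP_PREFIX = 0xf3
STRING_MOVES = {0xa4: "movsb", 0xa5: "movs (word/dword/qword)"}

def describe(code):
    """Name the leading instruction if it is a REP-prefixed string move."""
    if len(code) >= 2 and code[0] == REP_PREFIX and code[1] in STRING_MOVES:
        return "rep " + STRING_MOVES[code[1]]
    return "unknown"

print(describe(bytes([0xf3, 0xa4, 0xc3])))
print(describe(bytes([0x0f, 0x82])))
```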
02:40 karolherbst: pq: well we get the address though, so is this a problem generally?
02:40 pq: and that depends on the CPU hw
02:40 pq: what do you mean by "get the address"?
02:40 karolherbst: the mmio address
02:40 pq: mmiotrace cannot re-read things on its own, if that is what you are proposing
02:41 karolherbst: no, I mean where is the issue with the rep instructions?
02:41 pq: sometimes even reads to iomem trigger something, and we shouldn't trigger things
02:42 karolherbst: ohhh okay
02:42 pq: the problem with rep is that it repeats, but single-stepping might step the whole loop rather than one cycle
02:42 karolherbst: ohhhhh
02:42 pq: and what single-stepping a rep inst does depends on the particular CPU
02:43 karolherbst: k that could be indeed important
02:43 pq: so if it steps the whole loop, you miss almost all the data
02:43 pq: the only sure solution to that is to use emulation instead of single-stepping
02:44 pq: but of course you could decode the rep and hope that it single-steps each cycle
02:44 karolherbst: mhhh
02:44 karolherbst: can we emulate reads on mmio?
02:44 karolherbst: I am thinking about mmio address which change their values after a read
02:44 pq: you'd be emulating CPU instructions that target iomem
02:44 pq: right, which is why we must not read more than the original instruction
02:45 pq: if we never execute the original instruction but instead emulate it, we get all we want
02:45 pq: the missing detail is finding and hooking up an emulation infrastructure...
02:45 karolherbst: but do we get the data from mmio then if we just emulate?
02:46 pq: yes, because it will do the exact same operations to the iomem
02:46 pq: that's the point of emulating it properly
02:46 pq: only software would be in control of how the instruction executes, not the CPU directly
02:46 karolherbst: but I thought if you emulate an instruction call you don't change the hardware at all
02:47 pq: I don't know where you got that from
02:47 karolherbst: ohh
02:47 karolherbst: I think I know what you mean now
02:47 karolherbst: so instead of executing the instruction at all, we just change the registers and mmio stuff the way the instruction would
02:48 pq: yes
02:48 pq: that would be a quite big change
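A toy illustration of the emulation idea: software performs the same loads and stores the instruction would have done, one iteration at a time, and can log each of them. This is only a model, with a dict standing in for memory and invented register values; it is not the kernel's emulation infrastructure.

```python
def emulate_rep_movsb(regs, mem, trace):
    """Copy regs['rcx'] bytes from [rsi] to [rdi], logging every access."""
    while regs["rcx"] != 0:
        byte = mem[regs["rsi"]]                 # exactly one read per source byte
        trace.append(("R", regs["rsi"], byte))
        mem[regs["rdi"]] = byte
        trace.append(("W", regs["rdi"], byte))
        regs["rsi"] += 1                        # assumes the direction flag is clear
        regs["rdi"] += 1
        regs["rcx"] -= 1

regs = {"rsi": 0x1000, "rdi": 0xf0001088, "rcx": 3}
mem = {0x1000: 0x11, 0x1001: 0x22, 0x1002: 0x33}
trace = []
emulate_rep_movsb(regs, mem, trace)
print(len(trace), {k: hex(v) for k, v in regs.items()})
```

Each location is touched exactly as often as the real instruction would touch it, which matters for registers whose value changes on read.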
02:48 pq: so, the immediate thing you could do to fix this is to help pf_in.c properly decode the rep instruction and just hope that single-stepping works the way we want.
02:49 pq: I suspect with any moderately recent CPUs, single-stepping a rep should work fine. I hope.
02:49 karolherbst: I could try
02:49 pq: but when you read traces, you have to keep in mind that it might not work and things could be silently missing
02:50 karolherbst: we could add a special case for mmiotrace for something like that
02:50 pq: what would it do?
02:50 karolherbst: don't have any good idea for now
02:50 pq: I suppose you could add markers to say "rep instruction, read with care", but that's all I can think of.
02:51 karolherbst: but if it would execute the whole instruction there could be a "REPEATED %i times"
02:51 karolherbst: or something like that
02:51 pq: but you don't really know if it did, do you?
02:51 karolherbst: no
02:51 karolherbst: ohh wait
02:51 pq: maybe you could know, if you fully decode the instruction and then check the values to see how much it iterated...
02:51 karolherbst: it looks like it though
02:52 karolherbst: the mmio address increases by one each time I get that instruction
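A sketch of the check pq suggests: `rep movsb` decrements RCX once per iteration and, with the direction flag clear, advances RSI and RDI by one byte each, so comparing register snapshots taken before and after the single-step shows how many iterations actually ran. The snapshot values below are invented for illustration.

```python
def rep_movsb_progress(before, after):
    """Infer how many iterations ran between two register snapshots."""
    iterations = before["rcx"] - after["rcx"]
    consistent = (after["rsi"] - before["rsi"] == iterations and
                  after["rdi"] - before["rdi"] == iterations)
    return iterations, consistent

before = {"rsi": 0x1000, "rdi": 0xf0001088, "rcx": 64}
one_step = {"rsi": 0x1001, "rdi": 0xf0001089, "rcx": 63}
whole_loop = {"rsi": 0x1040, "rdi": 0xf00010c8, "rcx": 0}

print(rep_movsb_progress(before, one_step))
print(rep_movsb_progress(before, whole_loop))
```

An iteration count of one per trap matches what karolherbst observes (the address advancing by one each time); a count larger than one would mean the trace is missing accesses.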
02:52 pq: but yeah, this is the thing, and I need lunch now :-)
02:52 karolherbst: :D
04:42 drathir: mornin/evenin...
04:43 drathir: is gpu acceleration possible at 05:00.0 VGA compatible controller: NVIDIA Corporation G84 [GeForce 8600 GTS] (rev a1)
04:44 RSpliet: drathir: yes
04:44 drathir: RSpliet: thanks that good will go diggin than ^^
05:06 pmoreau: drathir: Having any troubles with that card? It should work fine, as long as you do not use it as a secondary GPU (i.e., not using it to drive a screen).
05:08 karlmag: pmoreau: care to elaborate a bit on that? I just got a bit curious. (And not 100% I quite got what it all entailed)
05:09 drathir: pmoreau: clean install not detecting vainfo and vdpauinfo hw acceleration, i guess need search more and its pc standalone gpu with no other cards...
05:09 pmoreau: Regarding the G84? See bug https://bugs.freedesktop.org/show_bug.cgi?id=82714
05:09 drathir: checkin arch wiki if there somethin about that model...
05:10 karlmag: pmoreau: aaah, right.. I thought it might have been a more general issue. Oki, thanks :-)
05:11 pmoreau: drathir: There is an arch package containing the needed firmwares IIRC
05:11 pmoreau: karlmag: Well, maybe some more chipsets are affected by that but nobody tested a similar configuration?
05:12 karlmag: pmoreau: well.. it was the "secondary gpu" and "no acceleration" thing I wondered about.
05:12 pmoreau: drathir: I guess you followed https://wiki.archlinux.org/index.php/VDPAU ?
05:13 pmoreau: karlmag: Yeah, it does sound intriguing. :-)
05:17 karlmag: pmoreau: well, ideally I would want all (no matter how many) GPUs I put into a machine to get initialized and work as they should. (d'oh, captain obvious). I know it's a bit trickier in reality than one would want.
05:17 drathir: [ 186.098422] nouveau 0000:05:00.0: Direct firmware load for nouveau/nv84_xuc00f failed with error -2
05:18 drathir: [ 186.098481] nouveau 0000:05:00.0: Direct firmware load for nouveau/nv84_xuc103 failed with error -2
05:18 drathir: need worry about?
05:20 drathir: still reading btw...
05:23 pmoreau: drathir: Did you install the nouveau-fw package from AUR as suggested by the wiki page?
05:23 karolherbst: ohhh
05:23 karolherbst: steam supports nvenc encoding
05:25 karolherbst: mhh I should really dig into this at some point
05:26 mlankhorst: yeah it will be fun :p
05:30 drathir: pmoreau: sorry scrollback ;/ yea installation pending...
05:37 drathir: karlmag: the best blacklist/off one of them i guess to check if standalone no problem...
05:38 drathir: pmoreau: yea looks like listed now ^^
05:47 drathir: btw is that correct ? ln -s vdpau_drv_video.so nouveau_drv_video.so
05:48 pmoreau: Ehm… Probably not?
05:49 drathir: pmoreau: for vainfo
05:49 pmoreau: drathir: Did you install mesa-vdpau?
05:50 drathir: pmoreau: vdpau workin correctly now vainfo setting up...
05:50 pmoreau: https://wiki.archlinux.org/index.php/VA-API
05:50 pmoreau: You need libva-vdpau-driver
05:50 mlankhorst: pmoreau: yes but still need the symlink I think
05:51 drathir: yea symlink not detected and guessing need symlink with vdpau instead of nvidia, bc second is for closed ones?
05:53 drathir: k looks like vainfo now detect too...
05:58 pmoreau: Interesting, the libva-vdpau-driver comes with a vdpau_drv_video.so, but you need to overwrite it…
05:59 drathir: any other reccomended tweaks? its my first configured nvidia card ;p exclude that gt740m but that one was not used bc was in notebook dual gpu with intel... mean other one tweaks than dri 3 and glamor...
06:00 drathir: pmoreau: only link it becouse in clean nod modified vainfo search for nouveau_drv_video.so
06:02 drathir: pmoreau: have similar in mine amd gpu too need link to galium...
06:02 pmoreau: I guess "The driver used by VA-API is autodetected, but sometimes it may be necessary to configure it manually by setting the environment variable LIBVA_DRIVER_NAME, for example: export LIBVA_DRIVER_NAME=vdpau" could work as well?
06:03 pmoreau: or "LIBVA_DRIVER_NAME=gallium"?
06:04 pmoreau: But I never tried to setup video acceleration on any of my cards, so…
06:04 drathir: pmoreau: i can check that give me a sec... in amd needed stil link, bc searchin r600 if good remember in name and cant find location of .so lib...
06:05 mlankhorst: pmoreau: it detects the driver name same way as vdpau does, so to me it seems vdpau_drv link should be autogenerated for gallium drivers
06:06 pmoreau: I can't really think of any other tweaks you could need. Reclocking would certainly be one, but G84 isn't supported. (though you could hack the Nouveau code to pretend it is supported and see if it works or crashes :-))
06:09 pmoreau: mlankhorst: So you should not need to relink it, it should pick up the correct backend library automatically?
06:12 mlankhorst: pmoreau: well vdpau va drv needs to be patched to make those links
06:16 xexaxo: speaking of vaapi: has anyone compared the libva-vdpau-driver vs the gallium nouveau vaapi driver ?
06:18 RSpliet: curious... I'd expect that on a kepler GPU with 2 GPCs, I can find the regs for the second GPC at 0x508000, but that doesn't seem to exist
06:21 karolherbst: RSpliet: well it is badf1300 for me, but I don't have the driver loaded
06:21 karolherbst: ohh it is loaded
06:23 karolherbst: RSpliet: and I have 3 GPCs
06:23 karolherbst: or does "PUNITS.DESIGN_GPC_COUNT => 0x3" actually mean 4?
06:24 drathir: at clean throw https://gist.github.com/4faf761b1cda36bb2904
06:24 karolherbst: drathir: check dmesg, maybe the firmware doesn't get loaded
06:25 karolherbst: ohh wait, vdpau works for you
06:25 karolherbst: mhh
06:26 karolherbst: drathir: LD_LIBS=debug vainfo
06:26 karolherbst: ..
06:26 karolherbst: LD_DEBUG=libs vainfo
06:31 drathir: and linked version https://gist.github.com/b2c0fb75b5bbc4de3bb7
06:34 RSpliet: karolherbst: what GPU is that again? GF117?
06:38 karolherbst: RSpliet: gk106
06:39 RSpliet: kepler? that should have 4SMX
06:39 karolherbst: mine has 5 actually
06:39 karolherbst: but gpc count doesn't equal smx count
06:39 karolherbst: there are 4 GPC per SMX I think
06:40 drathir: karolherbst: https://gist.github.com/2023702eda6f59ba23b1
06:40 karolherbst: drathir: I meant with the original setup
06:40 drathir: k one sec....
06:41 karolherbst: debugging a working setup is kind of pointless
06:41 karolherbst: RSpliet: but maybe I didn't get the issue you had
06:44 karolherbst: RSpliet: by the way there are GK106 with 3-5 SMx where 4 is the usual count
06:45 drathir: karolherbst: the clean one not linked https://gist.github.com/02bdd375d4f7e7e3fde9
06:45 karolherbst: drathir: seems like that /usr/lib/dri/nouveau_drv_video.so simply doesn't exist?
06:46 drathir: karolherbst: yes correct...
06:46 RSpliet: karolherbst: sorry, just confused by terminology
06:46 karolherbst: drathir: so this is a packaging bug that the mesa package doesn't symlink the vaapi stuff for nouveau?
06:46 karolherbst: RSpliet: k
06:46 karolherbst: maybe I only have 3 GPCs per SMX indeed
06:47 karolherbst: tegra X1 seems to have only 1gpc per SMX
06:47 RSpliet: the other way around - quoting "the internet" a GPC can contain up to 4 SMX per GPC
06:47 karolherbst: ohhh
06:47 RSpliet: which makes sense, I'd have 1GPC with two SMX in it
06:47 karolherbst: ahhhh
06:47 karolherbst: okay
06:47 karolherbst: yeah
06:47 karolherbst: that makes more sense indeed
06:48 karolherbst: then I have 3 or 4 gpcs with each 5 SMX?
06:48 karolherbst: seems like 3
06:48 RSpliet: 3, but one SMX is 192 shader cores
06:48 karolherbst: mhh I should have 960 in total
06:48 drathir: karolherbst: the amd also need ln -s /usr/lib/dri/gallium_drv_video.so /usr/lib/dri/r600_drv_video.so
06:49 karolherbst: 192 * 5 = 960
06:49 karolherbst: and how does the GPC count affect this?
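The arithmetic above as a tiny sketch, using the 192 cores per SMX quoted by RSpliet and the 3 to 5 SMX range karolherbst mentions for GK106. Only the SMX count enters the product; in this simple model the GPC count just describes how the SMXes are grouped.

```python
CORES_PER_SMX = 192  # Kepler figure quoted in the discussion

def shader_cores(smx_count):
    """Total shader cores for a given number of SMX units."""
    return smx_count * CORES_PER_SMX

for smx in (3, 4, 5):
    print(smx, shader_cores(smx))
```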
06:50 karolherbst: drathir: yeah, packaging bug then
06:50 karolherbst: drathir: I think they still rely on the vaapi to vdpau mapper thing
06:50 karolherbst: or there is another package
06:50 karolherbst: maybe mesa-vaapi or something stupid like that
06:50 karolherbst: or none at all
06:52 drathir: in theory it using fallback not native mode takin on mind vainfo: Driver version: Splitted-Desktop Systems VDPAU backend for VA-API - 0.7.4
06:53 drathir: but always like for prevention setup vainfo and vdpauinfo if any app directly linked to vainfo to workin too...
06:54 RSpliet: karolherbst: what does nvapeek 0x508bb8 give you?
06:55 karolherbst: RSpliet: preference which driver is loaded?
06:55 RSpliet: nope
06:55 drathir: i know the nvidia more should vdpauinfo bc its more like native mode for it like vainfo is more for amd if good remember and im not wrong...
06:55 karolherbst: nouveau: 0x500
06:56 karolherbst: nvidia: 0x501
06:56 xexaxo: drathir: up-to recently mesa-vaapi worked with radeonsi alone so there was no need for compat {sym,hard}links
06:57 RSpliet: karolherbst: ehh... with nouveau you have a ROP count of 0, with NVIDIA it's 1? that's... odd
06:57 karolherbst: yeah I thought the same now
06:57 xexaxo: there's an alternative solution in vaapi which attempts to use gallium_drv_video, whenever amd/nvidia gpu is detected.
06:57 xexaxo: not sure if there's a libva-api release with that one yet.
06:59 karolherbst: RSpliet: maybe the ROP is off with nouveau :O
06:59 drathir: the most "funny" thing is that open drives workin better than closed ones takin of mind not full available spec for den and backward enginering or blind/guess mode often used...
07:00 karolherbst: I should not poke 501 into that reg... :(
07:00 drathir: the performance of open ones could be even better than closed if full spec available for devs im pretty sure...
07:01 RSpliet: karolherbst: don't be silly, surely the ROP must be functional! :-P
07:01 karolherbst: mhh
07:01 karolherbst: maybe
07:01 karolherbst: it isn't the ROP this reg tells us about
07:01 karolherbst: rop is the rasterizer stuff, right?
07:01 RSpliet: should be
07:01 karolherbst: ohh wait
07:01 RSpliet: pixel copy, anti-aliasing, z-buffer
07:01 karolherbst: ROP is after the fragment shader
07:02 karolherbst: rasterizer is between fragment and geometry/tess
07:08 karolherbst: RSpliet: rop is render output unit
07:08 RSpliet: yeah I googled that to double check already
07:09 karolherbst: but mhhh
07:09 karolherbst: when I poke 501 into that reg the gpu hangs totally
07:09 karolherbst: mhhh that keeps me thinking somehow
07:11 RSpliet: I'm doubting the docs are all right; for one the "GPC count" should more likely be "SMX count"
07:12 karolherbst: yeah
07:12 RSpliet: and guessing 0x22434 returns 2...?
07:12 karolherbst: yes
07:13 RSpliet: so... sorry, I don't have hw as beefy as you
07:14 RSpliet: that's the TPC count per GPC; mind fetching 0x500b08 ?
07:14 karolherbst: PGRAPH.GPC[0].TPBUS.TPC_GPCID[0] => { 0 = 0x1 | 1 = 0x2 | 2 = 0 | 3 = 0x1 | 4 = 0x2 | 5 = 0x7 }
07:15 karolherbst: weird
07:15 karolherbst: nouveau: PGRAPH.GPC[0].TPBUS.TPC_GPCID[0] => { 0 = 0 | 1 = 0x1 | 2 = 0x2 | 3 = 0x1 | 4 = 0x2 | 5 = 0x7 }
07:16 RSpliet: kind of makes sense
07:16 RSpliet: assuming GPC0 only has one SMX, whereas the others have two :-P
07:16 karolherbst: I meant again that nvidia and nouveau are different or can you configure it as you wish?
07:16 RSpliet: I guess you can
07:16 karolherbst: would be interesting to know if there is any performance impact on this
09:25 mupuf: karolherbst: Hey, did you test the mmiotrace patch on reator?
09:25 karolherbst: no
09:26 karolherbst: I won't touch the kernel there, that's your job :p
09:26 karolherbst: don't want to mess it up
09:26 mupuf: :P Ok, will do then!
09:26 karolherbst: well wait I have a new version kind of, but the current one on the list should still work fine
09:26 karolherbst: doesn't matter if there are some minor issues
09:26 karolherbst: as long as it kind of works
09:29 karolherbst: mupuf: newest version: https://gist.github.com/karolherbst/903bf75486134dd9505d
09:29 karolherbst: but I think the changes only matters for performance when a new mapping is created
09:30 mwk: messing kernel up is mupuf's job? :p
09:31 karolherbst: mupuf: well with a newer kernel there are some rep instructions used that mmiotrace doesn't handle yet, so there are some UNKNOWN entries
09:31 karolherbst: I think it is basically copies of big memory areas
09:32 mupuf: karolherbst: thing is that mmiotrace at least used to work on reator
09:32 karolherbst: I know
09:32 karolherbst: it also used to work for me
09:32 mupuf: so, I need to make one before and after your patch
09:32 mupuf: oh, cool
09:32 mupuf: clear regression then
09:32 karolherbst: yeah
09:32 mupuf: mwk: yop, got a problem with that dude?
09:32 karolherbst: you know I created those traces I uploaded for my gpu :p
09:33 karolherbst: mupuf: well the issue is, that the kernel started?!? or something else changed like PAT/mtrr stuff, or maybe nvidia does something conditionally and now ioremaps are page aligned
09:34 karolherbst: usually they are always aligned to 4k pages, but the kernel uses 2M or 1G pages when the remap size is aligned to that
09:34 karolherbst: and nvidia tried to remap a 16MB area
09:34 karolherbst: so the kernel created 2M pages for that instead of the usual 4k ones and mmiotrace couldn't handle that
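A simplified model of the rule karolherbst describes: the largest page size to which the mapping is aligned gets used. The real kernel decision may depend on more conditions than this; the function name and the 0xf0000000 base address are assumptions for illustration.

```python
PAGE_SIZES = [1 << 30, 2 << 20, 4 << 10]  # 1 GiB, 2 MiB, 4 KiB, largest first

def largest_page_size(phys_addr, size):
    """Pick the largest page size to which both address and size are aligned."""
    for ps in PAGE_SIZES:
        if phys_addr % ps == 0 and size % ps == 0:
            return ps
    return None  # not even 4 KiB aligned

ps = largest_page_size(0xf0000000, 16 << 20)
print(ps, (16 << 20) // ps)
```

Under this model a 16 MiB remap at a 2 MiB aligned address is covered by eight 2 MiB pages rather than 4096 pages of 4 KiB, so code that assumes 4 KiB pages would arm the wrong granularity.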
09:35 karolherbst: it seems to be a thing now anyway, because /proc/meminfo tells me: DirectMap2M: 13733888 kB and DirectMap1G: 2097152 kB
09:37 karolherbst: no idea if that's related though
09:42 mupuf: karolherbst: I see, sorry, I did not follow your discussion with pq
11:26 karolherbst: wow openarena benefits from the pcie stuff? crazy
11:27 karolherbst: ohh seems like a stupid loading issue :/
11:27 karolherbst: seems like the maximum frame time got lower in general
11:38 karolherbst: mupuf: okay, back to the fakebios stuff. basically I need to allocate a big enough contiguous memory region, get the physical address for that and put that into the nvidia gpu reg and load the binary driver?
12:23 karolherbst: yay, I managed to not crash my kernel with nvafakebios, but then nvagetbios crashed it :/
12:25 karolherbst: mupuf: I think I will try to figure this out on reator then, only your maxwells are pwm based right? I would like to have a gm1xx one if that would be possible
12:39 hakzsam: mupuf, in case you plug a new card, please keep the kepler because I'm working on it for shared+atomics :)
15:44 mupuf: hakzsam: ok, I will keep it and plug the GTX 750
16:14 imirkin: mupuf: oh, someone should test if ssbo/atomic works at all on maxwell...
20:45 ashmew2: Good afternoon #nouveau :)
22:06 ashmew2: I summarized my adventures with the nvidia-drivers here: https://wiki.gentoo.org/wiki/Hybrid_graphics and here: www.ashmew2.me
22:06 ashmew2: Thanks for all the help :D
22:06 ashmew2: Nouveau + DRI3 fixed me right up.
23:33 mupuf: imirkin: I guess a piglit run would be enough?
23:34 mupuf: ashmew2: nice story :)
23:35 mupuf: quite ballsy to have a blog in ascii! But it really is not problematic aside from missing a comment section
23:48 hakzsam: imirkin, I'll test ssbo/atomics on maxwell :-)
23:51 ashmew2: mupuf: thanks for the feedback, I'll get around to adding an ASCII comment section later ;)
23:52 ashmew2: Once I can publish using emacs and not ssh + nano XD