04:37 mwk: ugh, the Falcon addressing modes suck so much
04:49 mupuf: mwk: sleepless night?
10:24 Soukyuu: hello, is the pstate param not available (anymore?) on 1.0.12 version of nouveau? I wanted to try it with my 260GTX
10:24 Soukyuu: running kernel 4.5.4 / mesa 11.2.2
10:38 Tom^: Soukyuu: after 4.5 its now in debugfs
10:39 Tom^: /sys/kernel/debug/dri/0/pstate
10:40 Soukyuu: ah, i see - thanks
11:07 Soukyuu: hah, there seems to have been a lot of progress since I last tried nouveau last september - I can now play videos without stuttering using mpv opengl-hq:interpolation
11:08 Soukyuu: not a single frame drop so far <3
11:08 Tom^: last september i couldnt run cs:go without gpu hangs. today i run unigine-heaven at 1920x1080 with 4x msaa, and normal tesselation with an average 40fps
11:08 orbea: Soukyuu: watch out for mpv + hwdec + nouveau
11:08 Tom^: so yes. :P
11:08 orbea: very bad
11:09 orbea: check dmesg for the spam
11:09 Soukyuu: not using hwdec, orbea
11:09 orbea: okay, probably not a problem then :)
11:10 Soukyuu: seems i have forgotten to blacklist nvidia, though - dmesg is full of stack traces of nvidia attempting to load and failing
11:10 orbea: heh
11:11 Soukyuu: i dont know what happened, but i was having major issues with nvidia and pcsx2 in hardware mode - 1fps at most.
11:11 Soukyuu: switched to nouveau and it runs smoothly
11:11 Soukyuu: how the world has changed
11:11 orbea: Soukyuu: few games run poorly with nouveau, (Valkyrie Profile 2 for one), but yea, nouveau works great with pcsx2 for the most part
11:12 Soukyuu: if it continues running this smooth, I will probably stick with nouveau completely, big thanks to all the devs
11:37 karolherbst: Soukyuu: yeah, happens
11:37 karolherbst: Soukyuu: like when cuda or opencl stuff is accessed the nvidia library usually loads nvidia
11:38 karolherbst: Soukyuu: what GPU do you have?
11:38 Soukyuu: karolherbst: 260GTX
11:38 karolherbst: ahh gt200
11:38 Soukyuu: yup
11:38 Soukyuu: it was awful back in september
11:39 karolherbst: yeah tesla reclocking got some love recently (like beginning of this year)
11:40 Soukyuu: that's good to hear, this card is still enough for my uses
11:40 Soukyuu: unless it dies (which it already attempted to) i don't think I will upgrade
12:00 Soukyuu: huh, what the... now it suddenly started dropping frames on the same video that was fine before
12:02 karolherbst: Soukyuu: does restarting the application help?
12:02 karolherbst: Soukyuu: and are you still on the highest clocks?
12:06 Soukyuu: karolherbst: yes i was and restarting didn't help
12:07 Soukyuu: somehow setting my clocks to lowest killed the USB controller, i could see the screen update and you writing but all usb devices were down
12:07 Soukyuu: maybe a coincidence though
12:07 karolherbst: odd
12:07 Soukyuu: interestingly, after restarting and clocking up the framedrops are gone
12:08 karolherbst: Soukyuu: okay, so it might be reproducible
12:09 karolherbst: Soukyuu: could you try to find out what causes the increased framedrops?
12:09 Soukyuu: I will be experimenting with it
12:10 karolherbst: Soukyuu: a look in dmesg might be good too
12:14 Soukyuu: hmm, i'm still getting messages from nvidia.ko failing to load even though i've added "blacklist nvidia" to /usr/modprobe.d/nouveau.conf
12:14 Soukyuu: and there was a stack trace mentioning nouveau 17 seconds into boot, but no dropped frames so far
12:16 Soukyuu: karolherbst: https://dl.dropboxusercontent.com/u/19330332/dmesg_160514.txt
12:18 karolherbst: Soukyuu: I meant dmesg when it starts dropping frames again
12:18 Tom^: change that blacklist to install nvidia /bin/false
12:18 Tom^: and it shouldnt be loading at all
12:20 Soukyuu: karolherbst: are those nouveau stack traces nothing to worry about?
12:20 Soukyuu: Tom^: will try
12:20 karolherbst: "nouveau 0000:01:00.0: fb: Could not calculate MR" is a bit odd
12:23 karolherbst: Soukyuu: no idea about the error, maybe somebody else knows more about it
12:24 karolherbst: when something like that happened to me, that was usually because I messed up, but the GPU wasn't happy anymore anyway, so it just stopped doing stuff
12:26 Soukyuu: karolherbst: well i did change the clocks to the lower settings and it seemed to work, maybe it comes from that time
12:27 karolherbst: could be
12:27 karolherbst: mc: memory controller
12:28 karolherbst: maybe RSpliet has any ideas
12:29 Soukyuu: all i know is that this particular card is always running on max clocks on windows and on linux+blob
12:30 Soukyuu: it's an MSI TwinFrozr factory OC, if that matters
12:32 karolherbst: what display?
12:32 karolherbst: Soukyuu: maybe this is just due to normal upclocking because nvidia thinks the lower perf states are too slow for the display
12:33 Soukyuu: LG W2442PA (1080p) + an older Hyundai (1280x1024)
12:33 karolherbst: ahh two displays
12:34 karolherbst: yeah, that might explain it: 1920x1080 + 1280x1024
12:34 Soukyuu: yes
12:34 night199uk: hey all - anyone have any detail or can point me to some code for MMIO reg 0x00000000 (PMC)?
12:34 karolherbst: I would expect nvidia to clock to lower states with only one of them connected
12:35 karolherbst: night199uk: this is like for the device id
12:35 night199uk: especially bits 24-32… these seem to be chipset identifier in later cards?
12:35 karolherbst: night199uk: check rnndb
12:35 night199uk: yeah, thats the one
12:35 karolherbst: night199uk: https://github.com/karolherbst/envytools/tree/master/rnndb
12:35 karolherbst: bus/pmc I think
12:35 night199uk: is yours more up to date than the main rnndb?
12:36 karolherbst: ohh, no
12:36 karolherbst: doesn't matter
12:36 night199uk: that main rnndb *seems* to be out-of-date for this reg on more modern cards (looking at gtx980 here)
12:36 karolherbst: usually mine might be outdated
12:36 karolherbst: night199uk: 24-27: <bitfield low="24" high="27" name="ALWAYS0_1"/>
12:36 karolherbst: 28-31: <bitfield low="28" high="31" name="FOUNDRY">
12:36 night199uk: i was checking pmc.xml
12:36 karolherbst: well
12:36 karolherbst: it is all REed
12:36 night199uk: yeah, this is on older cards :-(
12:37 karolherbst: so for newer cards stuff might change
12:37 karolherbst: but usually until kepler it should have a lot stuff already
12:37 Soukyuu: karolherbst: so after a 25min episode it dropped 16 frames - that's fine by me, i was jumping around and leaving/entering fullscreen. Back when it started dropping frames it was like 1 frame per 2 seconds
12:37 karolherbst: if you find something missing, feel free to RE that and create pull requests
12:37 night199uk: the efi driver reads bits 24 & 0x1f - i.e. 24 - 28
12:38 night199uk: haha, yeah, okay, will see what i can do :-)
12:38 night199uk: was hoping there was a stock answer
12:38 night199uk: only have 2 kepler cards i can pull that reg from
12:39 night199uk: would anyone here have any samples of that reg from fermi/kepler/maxwell cards?
12:40 karolherbst: on my gk106: PMC.ID => { STEPPING = 0xa1 | DEVICE_ID = 0x20 | CHIPSET = GK106 | FOUNDRY = TSMC } // 0e6200a1
12:40 night199uk: perfect, thanks
12:40 karolherbst: night199uk: I could try to get you a _big_ list of a lot of cards
12:40 night199uk: yeah, that would be awesome
12:41 karolherbst: and with big I mean like 20+ different cards
12:41 karolherbst: since fermi
12:41 night199uk: just for that reg?
12:41 night199uk: well, doesn’t matter, i can do that filtering
12:42 night199uk: hrm, your rnndb gives you CHIPSET = GK106 already
12:42 night199uk: sec
12:42 karolherbst: 89 traces I have here for fermi+
12:42 karolherbst: ahh right
12:43 night199uk: awesome
12:43 night199uk: can i ping you my email?
12:46 night199uk: hrm
12:46 night199uk: okay, i do not understand how rnndb is getting GK106 from that
12:47 night199uk: the latest variant it has is NV10- right - this means NV10 and up?
12:47 night199uk: <bitfield low="20" high="28" name="CHIPSET" type="chipset"/>
12:47 karolherbst: night199uk: below
12:47 karolherbst: right
12:47 karolherbst: yeah
12:47 karolherbst: night199uk: <import file="nvchipsets.xml" />
12:48 night199uk: but if it’s nv10 and below?
12:48 night199uk: so it’s assuming nv10?
12:48 karolherbst: no
12:48 karolherbst: NV10- means nv10 and newer
12:49 night199uk: so if the driver is simply looking at 24-28 anyway this makes sense
12:49 night199uk: it’s looking at the high nibble of the chipset
12:49 night199uk: 0xe = kepler
12:49 night199uk: 0xc = fermi
12:49 night199uk: i guess
12:49 karolherbst: yeah
12:49 karolherbst: check nvchipsets.xml
12:49 night199uk: yup
12:50 night199uk: these chipset IDs in nvchipsets.xml
12:50 night199uk: these are the NV.. codes?
12:50 night199uk: e.g. NVC0 = GF100
12:50 night199uk: ?
12:51 karolherbst: yeah
12:51 night199uk: perfect
12:51 night199uk: thanks
12:51 karolherbst: "<value value="0xc0" name="GF100"/>"
12:51 karolherbst: :D
12:51 night199uk: yup.. i just wanted to that that 0xc0 = NVC0 iyswim
12:51 night199uk: check
12:52 night199uk: thanks, i know what the driver is doing now
12:52 night199uk: perfect :-)
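
The register decoding discussed above can be sketched in a few lines. This is only an illustration, not envytools code: the CHIPSET (bits 20-28) and FOUNDRY (bits 28-31) ranges are the ones quoted from rnndb in the log, the `(>> 24) & 0x1f` read is what night199uk reports the EFI driver doing, and treating STEPPING as the low byte is an assumption that merely happens to match the value karolherbst pasted. The helper name `bits` is made up for this sketch.

```python
# Sketch: decode the PMC.ID value pasted for the gk106 (0x0e6200a1).
PMC_ID = 0x0E6200A1

def bits(value, low, high):
    """Return the inclusive bit range [low, high] of value."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)

stepping = bits(PMC_ID, 0, 7)     # assumed layout: low byte
chipset = bits(PMC_ID, 20, 28)    # <bitfield low="20" high="28" name="CHIPSET">
family = (PMC_ID >> 24) & 0x1F    # what the EFI driver reads, per night199uk
foundry = bits(PMC_ID, 28, 31)    # <bitfield low="28" high="31" name="FOUNDRY">

print(hex(stepping), hex(chipset), hex(family), hex(foundry))
```

For this value the chipset field comes out as 0xe6 and the driver's 5-bit read as 0x0e, which is the "high nibble of the chipset" night199uk describes.
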
13:02 Soukyuu: my 260gtx card sure doesn't like downclocking
13:04 Soukyuu: going from mid -> high didn't fail once, but going from mid -> low or high -> low usually just freezes without anything in journal after reboot
13:06 pmoreau: Possibly low is not enough to power both screens, but… it shouldn’t freeze the computer. Probably be extremely slow, maybe stop updating the display, but you should be at least able to power down normally.
13:07 Soukyuu: pmoreau: i could a few times, but last two times it froze completely
13:07 Soukyuu: i have kde set up to shutdown without prompts when I press the power button
13:07 pmoreau: :-/
13:08 pmoreau: If you have another computer, you could try SSH'ing in or set up netconsole to get the logs.
13:08 Soukyuu: I actually tried that, it's completely dead
13:09 Soukyuu: it could be unrelated to nouveau though, I had a couple of such freezes randomly
13:09 Soukyuu: maybe downclocking just triggers it by accident
13:11 pmoreau: I guess you tried SSH, but not netconsole?
13:12 pmoreau: With netconsole, it will stream the logs to the other computer as they are emitted, so maybe it will get some additional information right before it freezes
13:12 Soukyuu: I'll give that a try then
13:23 imirkin: Soukyuu: there's a patch you might be interested in... hold on
13:23 imirkin: Soukyuu: https://bugs.freedesktop.org/show_bug.cgi?id=95044
13:25 karolherbst: ohh right, forgot about this issue
13:41 Soukyuu: imirkin: does that solve the issue or just disables high clock?
13:43 imirkin: it solves the issue
13:43 imirkin: well
13:43 imirkin: it solves *an* issue
13:43 imirkin: whether it's your issue... who knows
13:44 Soukyuu: it sure sounds like my issue - the reporter also has nearly the same hardware
13:45 tobijk: imirkin: you know if airlied decided to work on a new iteration of the lowering for clip/cull? the current state is really undesirable :)
13:46 imirkin: tobijk: well, he had to roll his change back
13:46 imirkin: because it broke ... something
13:46 tobijk: thats the state i mentioned :)
13:46 imirkin: he said he was going to look at it again on monday
13:46 tobijk: ah mhk
13:46 imirkin: i'm trying to get it going on nv50 right now
13:46 imirkin: with some minor changes, i got it going on nvc0
13:47 imirkin: have to explicitly enable those cull distances
13:47 imirkin: since they're not in the rasterizer clip distance enable thing
13:47 tobijk: and on nv50 i presume you have to fix the inputs first?
13:47 imirkin: hm?
13:48 imirkin: what inputs?
13:49 tobijk: uhm never mind
13:52 imirkin: aha, figured out why nv50 is messed up. ugh
13:52 imirkin: stupid varying linking =/
13:52 tobijk: ah thats what i meant
13:52 tobijk: you can use https://git.thm.de/tjkl80/mesa/commit/47082173d9d6fb7073161bed56dce5c307953c3f.diff for now, should be sufficient for testing
14:05 tobijk: imirkin: you have the nvc0 patch of yours available somewhere?
14:05 imirkin: tobijk: not yet
14:08 imirkin: tobijk: https://github.com/imirkin/mesa/commit/18671b9ee1ab20b1f21071b1a97a101038051a71
14:09 tobijk: imirkin: thx
14:09 imirkin: tobijk: note that i'm in the middle of making nv50 work
14:09 imirkin: so... nv50 doesn't work :)
14:10 tobijk: imirkin: for now im happy with nvc0 only :)
14:10 imirkin: it should be easy. just need to read a bunch of code first
14:13 Soukyuu: sounds like i will be rebuilding the kernel more often now
14:13 Tom^: or just stick to the same version longer. :p
14:13 Tom^: or build nouveau out of tree
14:14 Soukyuu: i think it's easier to just rebuild and add a patch with arch linux's pkgbuild system
14:15 Soukyuu: not faster, but easier
14:15 Soukyuu: need more distcc slaves
14:17 imirkin: yay, that worked
14:21 tobijk: imirkin: r-b on the disablement patch you just sent
14:22 tobijk: (sitting here and waiting for a compiled, up-to-date mesa :/)
14:26 tobijk: imirkin: hrmpf, all clip test pass for me with airlieds lowering
14:27 tobijk: with nvc0 and intel
14:28 imirkin: tobijk: well, it was some tess stuff that broke
14:29 tobijk: ah
14:46 tobijk: imirkin: i still have some compiler speedup sitting in my tree: https://git.thm.de/tjkl80/mesa/commit/21fef3cb9940a99e9684e3afa6ced0f98d6638af , why didnt we ever put this into mainline? (works fine on my build :) )
14:46 tobijk: raw: https://git.thm.de/tjkl80/mesa/commit/21fef3cb9940a99e9684e3afa6ced0f98d6638af.diff
14:54 imirkin: tobijk: because i was scared of the gathering darkness
14:54 imirkin: tobijk: btw, i ended up doing the thing where i sorted spills :)
14:54 imirkin: tobijk: i needed it for determinism more than performance
14:57 Tom^: imirkin: line 1037 in nvc0_screen.c what is this allocating? ret = nouveau_bo_new(dev, NV_VRAM_DOMAIN(&screen->base), 1 << 17, 1 << 17, NULL, &screen->txc);
14:57 imirkin: Tom^: the txc... whatever that is ;)
14:58 imirkin: i think it's the TIC or TSC table
14:58 imirkin: or both
14:58 Tom^: ok
14:58 imirkin: yeah. it's the TIC and TSC tables
14:58 imirkin: 64k a piece
15:01 imirkin: (i.e. texture and sampler descriptors)
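
A quick arithmetic check of the sizes mentioned above, as a sketch only: the `1 << 17` comes from the quoted `nouveau_bo_new` call and the 64k per table from imirkin's answer; the variable names are made up.

```python
# Sketch: does a 1 << 17 byte buffer hold two 64k tables (TIC + TSC)?
TXC_SIZE = 1 << 17        # bytes, from the quoted nvc0_screen.c line
TABLE_SIZE = 64 * 1024    # "64k a piece", per imirkin

table_count = TXC_SIZE // TABLE_SIZE
print(TXC_SIZE, table_count)
```
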
15:16 tobijk: imirkin: is the spill thing already in master? i'd like to see how to do that right :)
15:29 imirkin: tobijk: yeah... like 3-6 months ago
15:29 imirkin: tobijk: 99581ca393037e10d17aab1f4c90ff2bdb1ec557
15:30 Lekensteyn: does anyone know where airlied got the 0x1B function (NOUVEAU_DSM_OPTIMUS_FLAGS) from in 5addcf0a5f0fadceba6bd562d0616a1c5d4c1a4d ("nouveau: add runtime PM support (v0.9)")?
15:31 Lekensteyn: some check is missing, leading to the situation in https://bugzilla.kernel.org/show_bug.cgi?id=104791. Probably one of the capabilities has to be checked before calling it
15:31 imirkin: probably looked through acpi tables?
15:32 imirkin: feel free to redirect the poster to file a bug on bugs.freedesktop.org so that graphics people can have a look
15:35 Lekensteyn: oh finally I see the link
15:35 Lekensteyn: the value returned for function 0 is a bitmap of supported functions
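
The check Lekensteyn describes could look roughly like the sketch below. It assumes the function-0 result has already been turned into an integer and that bit N set means function N is supported; `dsm_supports` and the example bitmap are made up for illustration and are not taken from the nouveau source.

```python
# Sketch: test the supported-functions bitmap before calling a _DSM function.
OPTIMUS_FLAGS_FUNC = 0x1B  # NOUVEAU_DSM_OPTIMUS_FLAGS, as named above

def dsm_supports(supported_bitmap: int, func: int) -> bool:
    """True if bit `func` is set in the bitmap returned by function 0."""
    return bool(supported_bitmap & (1 << func))

# Hypothetical bitmap: functions 0 and 0x1A supported, 0x1B not.
example_bitmap = (1 << 0) | (1 << 0x1A)

if dsm_supports(example_bitmap, OPTIMUS_FLAGS_FUNC):
    print("safe to call function 0x1B")
else:
    print("function 0x1B not advertised, skip the call")
```
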
16:06 tobijk: airlied: what ever was wrong with my lowering of cull, fwiw it does not break the things yours did ;-)
16:07 Tom^: imirkin: hm just noticed my dmesg is filled with this https://gist.github.com/anonymous/07f84fed48962fab80ece9d1c5fe36e6 :o
16:09 tobijk: imirkin: maybe you know what was wrong with the pass, i want to sort this out this weekend :D
16:10 Tom^: imirkin: seems to be spamming that in my dmesg when im playing victor vran
16:10 Tom^: imirkin: yet the game seems to be fine :P
16:15 karolherbst: Tom^: could you create an apitrace?
16:15 Tom^: sure was just about to compile mesa with debug and not strip symbols too
16:15 karolherbst: Tom^: well, maybe mesa prints something but most likely not
16:16 karolherbst: no idea what "MISALIGNED_GPR" stands for, but I assume it has something to do with how registers are laid out in the generated binaries
16:17 Tom^: probably should restore the increased shader allocation before i apitrace too
16:19 tobijk: Tom^: my suspicion GPR = general purpose register, so something is wrong with some in or outputs maybe :)
16:19 tobijk: suspicion = guess
16:21 karolherbst: tobijk: well the hardware is a bitch here, because it simply continues doing stuff. You have to tell the GPU how many GPRs should be allocated for the shader you upload, and for example if you allocate fewer than it actually needs, the GPU complains, but still renders stuff
16:24 tobijk: who does not want partially complete stuff :)
16:28 Tom^: karolherbst: the trace is 615mb :p
16:28 karolherbst: Tom^: xz compress it
16:38 Tom^: karolherbst: https://www.dropbox.com/s/lxrt0oneqaz4mvb/victorvran.tar.xz?dl=1
16:39 Tom^: karolherbst: suspect its gonna be near the end of the trace because it didnt spam in dmesg until i actually loaded and entered the world.
16:39 Tom^: and then i aborted the game rather quickly so the trace didnt grow to much :P
16:53 Soukyuu: randomly got "nouveau 0000:01:00.0: fifo: DMA_PUSHER - ch 2 [Xorg[625]] get 002002a3a0 put 002002a430 ib_get 00000039 ib_put 0000003d state 80000000 (err: INVALID_CMD) push 00400040"
16:53 Soukyuu: is this something to worry about? didn't notice the exact timing when that popped up in dmesg
16:54 Soukyuu: oh and the patch to fix up/downscaling seems to have worked
17:01 karolherbst: Tom^: don't use tar for single files
17:01 karolherbst: Tom^: it makes the archive bigger
17:01 Tom^: oh i see
17:01 karolherbst: by around 4k though only
17:02 karolherbst: 655718400 (tar) vs 655713937 (file)
17:02 karolherbst: Tom^: tar is used to put multiple files into one file
17:02 karolherbst: not to compress
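
The two sizes karolherbst quotes can be reproduced from the tar layout. The sketch below assumes a single 512-byte header (short file name, no extended headers), 512-byte data blocks, two zero blocks as end-of-archive marker, and GNU tar's default blocking factor of 20; the function names are made up.

```python
# Sketch: size of a tar archive that holds exactly one file.
BLOCK = 512           # tar block size in bytes
RECORD = 20 * BLOCK   # assumed default blocking factor of 20 (10240 bytes)

def round_up(n, multiple):
    """Round n up to the next multiple of `multiple`."""
    return -(-n // multiple) * multiple

def tar_size_single_file(file_size):
    header = BLOCK                      # one header block
    data = round_up(file_size, BLOCK)   # file data padded to whole blocks
    end_marker = 2 * BLOCK              # two zero blocks end the archive
    return round_up(header + data + end_marker, RECORD)

size = tar_size_single_file(655713937)
print(size, size - 655713937)  # prints "655718400 4463"
```

So the roughly 4k of overhead is the header, the padding and the end marker, rounded up to a full record.
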
17:02 karolherbst: yeah lol
17:03 karolherbst: apitrace is printing stuff like crazy
17:03 karolherbst: Tom^: does replaying the trace causes those messages to appear in dmesg?
17:03 Tom^: maybe i can slow it down if i downclock to 07 xD
17:04 karolherbst: `HG_GLOBALS_CB__' uses reserved `__' string
17:04 karolherbst: what a bs error
17:04 karolherbst: and no, I don't get your error
17:05 Tom^: yea it spams in dmesg when replaying
17:05 karolherbst: nice
17:05 karolherbst: gk110+ error
17:05 Tom^: thats not nice :(
17:05 karolherbst: imirkin: wanna replay the trace from Tom^ on your gk20x and see if you also get the dmesg spamming?
17:12 Soukyuu: quick question: do i have to enable vsync explicitly? I seem to get tearing in chromium and pcsx2 but none in mpv fullscreen
17:13 Tom^: Soukyuu: is your desktop or rather window manager composited?
17:14 Soukyuu: I'm running kwin_x11, yes
17:14 karolherbst: Tom^: thing is, we can't know which shader is causing those messages to appear, so one thing you can do:
17:14 karolherbst: Tom^: open the trace in qapitrace and find the glDraw* call after which the first message appears
17:15 karolherbst: Soukyuu: try "force fullscene repaints" as tearing prevention
17:15 Tom^: karolherbst: ok
17:15 karolherbst: *full scene repaints
17:15 karolherbst: automatic is kind of stupid, because it disables tearing prevention on low perf
17:15 karolherbst: also happens on my intel GPU wit DRI2
17:17 Soukyuu: karolherbst: yes, that works. performance dropped but no more tearing so far
17:17 Soukyuu: thank you
17:17 karolherbst: Soukyuu: maybe DRI3 also works better
17:18 Soukyuu: karolherbst: how do I check/switch to that?
17:18 karolherbst: Soukyuu: xorg settings, but I have no idea about the nouveau DRI3 state
17:18 karolherbst: Soukyuu: otherwise upclocking usually helps with bad tearing prevention perf
17:18 Soukyuu: well at worst I'll land in a blackscreen, right?
17:20 Soukyuu: I'm feeling lucky, brb
17:21 karolherbst: :D
17:25 Soukyuu: karolherbst: well, apart from different font rendering I see no difference in performance after setting DRI3
17:46 Tom^: karolherbst: either im doing things wrong or this is gonna take a while
17:51 karolherbst: Tom^: this is gonna take a while :D
17:52 karolherbst: Tom^: well on the first level try to find the first frame with the dmesg print
17:52 karolherbst: Tom^: and you wanna take a bisecting approach
17:52 karolherbst: + being smart about that
17:52 Tom^: yea but its like 63883 calls in the frame
17:52 Tom^: xD
17:52 karolherbst: yeah
17:52 karolherbst: 16 checks
17:53 karolherbst: just check the middle first and see what it gives you
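
The "16 checks" figure follows from bisecting the 63883 calls of the frame. A minimal sketch, assuming a predicate that answers "has the dmesg message appeared after replaying up to call i?" and that at least one call triggers it; `first_bad` and the culprit index are made up for illustration.

```python
import math

def first_bad(n_calls, is_bad_up_to):
    """Binary search for the first call index where is_bad_up_to(i) is True.

    Returns (index, number_of_checks).
    """
    lo, hi = 0, n_calls - 1
    checks = 0
    while lo < hi:
        mid = (lo + hi) // 2
        checks += 1
        if is_bad_up_to(mid):
            hi = mid        # message already present, culprit is at or before mid
        else:
            lo = mid + 1    # no message yet, culprit is after mid
    return lo, checks

CALLS_IN_FRAME = 63883
culprit = 41234  # hypothetical index of the offending glDraw* call
index, checks = first_bad(CALLS_IN_FRAME, lambda i: i >= culprit)
print(index, checks, math.ceil(math.log2(CALLS_IN_FRAME)))
```

The number of checks never exceeds ceil(log2(63883)), which is 16.
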
18:16 hakzsam: mupuf, I'm running a full deqp gles3 on reator, please do not shutdown until it's done
18:16 hakzsam: it really takes a while
18:17 hakzsam: btw, it would be good to silence dmesg infos like [ 5676.299283] perf interrupt took too long (2503 > 2495), lowering kernel.perf_event_max_sample_rate to 50100
18:17 hakzsam: because this gives me one dmesg-warn
18:34 karolherbst: Tom^: and how far are you?
18:37 Tom^: karolherbst: ive opened a beer and just came back from a hot sauna.
18:38 Tom^: :D
18:40 Tom^: ok know which frame it is now just to find the draw call
18:44 karolherbst: mupuf: I think in the end we will have two options for OC: change max_voltage and change voltage for clock (or clock for voltage)
18:44 karolherbst: mhh
18:44 karolherbst: now that I think about it
18:45 karolherbst: maybe it makes more sense to change the clocks instead of manipulating the voltage and hope something good comes out of that
18:48 Tom^: karolherbst: ok ive found the glDrawElementsInstancedBaseVertex that does it
18:48 karolherbst: Tom^: okay, and are you sure the call before that doesn't result in a new print in dmesg?
18:49 Tom^: karolherbst: well ive filtered the calls so i only see glDraw calls, so idk
18:49 Tom^: karolherbst: its atleast the first draw call that does it
18:50 karolherbst: Tom^: in that frame I assume?
18:50 Tom^: yea
18:50 karolherbst: Tom^: k, then get the shaders of that draw call via qapitrace
18:51 Tom^: GL_FRAGMENT_SHADER and GL_VERTEX_SHADER
18:51 karolherbst: yeah
18:55 Tom^: or did you want the code from that code window in those shaders?
18:55 karolherbst: yeah
18:55 Tom^: oh i see
18:56 Tom^: https://gist.github.com/gulafaran/2dfa0998770e3b35468013df07216e71 GL_FRAGMENT_SHADER , and GL_VERTEX_SHADER https://gist.github.com/gulafaran/cb51eae8ba9bdb79ed59ca8988266e39
18:57 karolherbst: uhhhh
19:01 Tom^: karolherbst: oki im 100% sure now, http://i.imgur.com/EmbsYXc.png glBindBuffer doesnt generate anything, that glDraw does.
19:03 karolherbst: okay
19:10 karolherbst: lol, payday2 and borderlands presequel have a common shader
19:11 karolherbst: ohh
19:11 karolherbst: yeah well
19:11 karolherbst: like 20 lines
19:12 Calinou: Nouveau will get overclocking support?
19:13 Calinou: if so, that's very nice, it means you guys grew up over the radeon people :>
19:13 Calinou: who will refuse to add anything overclocking-related IIRC
19:15 karolherbst: Calinou: well weren't there patches?
19:15 Calinou: I don't know, but they seem very hostile to the idea
19:15 karolherbst: https://lists.freedesktop.org/archives/dri-devel/2016-May/107503.html
19:23 karolherbst: stupid bash completion
19:26 karolherbst: odd
19:26 karolherbst: Tom^: well nothing seems wrong, maybe the emitted binary has something odd
19:26 karolherbst: Tom^: it is the victor_vran/816.shader_test file by the way when somebody asks
19:28 karolherbst: Tom^: but I assume other than this, your GPU runs stable with my branch?
19:28 Tom^: yep
19:28 Tom^: it runs skyrim stable
19:29 Tom^: thats a good enough test ;)
19:29 karolherbst: awesome :)
19:29 karolherbst: seems like I did something right then
19:36 Tom^: karolherbst: so what do i do?
19:44 Tom^: besides shed a tear and hope imirkin comes in and saves the day
19:48 karolherbst: mh no idea
20:00 Tom^: karolherbst: hm i wonder how hard it would be to piece together some code that reproduces this
20:02 karolherbst: Tom^: well, you have the shaders
20:29 Soukyuu: just learned something: running min clocks and watching a video leads to fifo cache errors
20:29 karolherbst: Tom^: there is also a project to turn an apitrace into C code
20:30 Tom^: karolherbst: hm still suspect thats gonna be just as easy to just piece together something myself
20:30 Tom^: 600mb converted into C xD
20:31 Tom^: and then figure out where that bad glDraw is etc. ugh
20:31 imirkin: Tom^: i'm not sure what your issue is, but you guys talked for a while and i've lost track. please make a concise thing, e.g. file a bug, etc.
20:31 Tom^: imirkin: sure thing
20:31 imirkin: Soukyuu: the cache errors are an undiagnosed thing. we have no idea why they happen, but they appear to happen on all tesla's with varying frequency
20:32 Soukyuu: imirkin: so it's not just low clocks causing an underrun?
20:32 imirkin: we have no idea :)
20:33 Tom^: imirkin: to sum it up victor vran spams https://gist.github.com/anonymous/5074ed17c1f4e1b803d8e2b5f493fc6e this when playing. and ive apitraced it and found the bad glDrawElementsInstancedBaseVertex in qapitrace
20:33 Tom^: and it doesnt occur on karolherbst card when replaying the trace so it looks like a gk110+ issue
20:33 imirkin: Tom^: ooh, neat. i guess i messed up somewhere. if you can get the shader in question
20:34 imirkin: Tom^: then stick it into a shader_test and compile it to look at the output. hopefully the misaligned gpr thing will become apparent
20:34 Tom^: imirkin: there are two shaders in qapitrace , GL_FRAGMENT_SHADER https://gist.github.com/gulafaran/2dfa0998770e3b35468013df07216e71 and GL_VERTEX_SHADER https://gist.github.com/gulafaran/cb51eae8ba9bdb79ed59ca8988266e39
20:34 imirkin: Tom^: alternatively file a bug and i'll have a look
20:35 imirkin: please include those two shaders in there, and if it's not TOO huge, a reference to the apitrace
20:35 imirkin: i gtg
20:35 imirkin: but i'll have a look tonight
20:35 Tom^: oki
20:41 Tom^: karolherbst: what gpu did you have?
20:54 Tom^: imirkin: https://bugs.freedesktop.org/show_bug.cgi?id=95403
20:55 airlied: tobijk: the previous lowering pass was still lowering to a culldist semantic in some places
20:56 mupuf: karolherbst: please, OC is not on the list of features for now
20:57 mupuf:is busy since friday afternoon, being a cameraman for livestreaming a cycling competition
20:57 airlied: I'm also not sure how it all works with tess enabled
20:57 mupuf: I will talk to you more when I am done. Bed time for me
20:58 karolherbst: mupuf: :D yeah, it was just a thought
20:58 karolherbst: Tom^: 770m
20:58 Tom^: karolherbst: to late ive already written the report. :P
20:58 karolherbst: :D
20:59 karolherbst: Tom^: I would be happy if all my traces would be like 300MB :D
20:59 Tom^: haha
20:59 karolherbst: Tom^: I have a 6.6GB here for SR3
21:06 Tom^: oh well bed time.
21:08 Yoshimo: hard to find something usefull in such a big file
21:26 Soukyuu: "nouveau 0000:01:00.0: fifo: CACHE_ERROR - ch 6 [kwin_x11[13044]] subc 0 mthd 0060 data beef0201" <- it's always the same data, looks like a codeword
21:37 tobijk: airlied: it was just an observation, as i ran a full piglit run before releasing my last series to the ml
21:37 tobijk: maybe we should just try to adapt that one instead of reinventing what is mostly there?
21:38 airlied: tobijk: the problem was the tests you ran piglit against didn't break when culling didn't work
21:38 airlied: so you didn't break clipping, however culling wasn't fully functional
21:38 tobijk: mh ok
21:53 pmoreau: imirkin: I’m finally taking care of the comments on my 64<->32bit CVT patch. But I’m hitting some issues while testing without it: https://phabricator.pmoreau.org/P97
21:55 pmoreau: imirkin: For some reason the def(0) of the CVT insn has an id of -1, marking it as a nop op (isNop will return true for it), however the store op later on keeps (the now defunct) result of CVT as its value to store.
21:56 pmoreau: imirkin: Any idea where the ids are set to -1 / why it would be set to -1?
22:16 imirkin: pmoreau: -1 == no RA
22:16 pmoreau: :-/
22:16 imirkin: pmoreau: or rather, before RA
22:17 pmoreau: Right, I updated the print function to also display the id, and they all change to >=0 after the RA pass, except for the result of the CVT op.
22:17 pmoreau: I’m looking at the code to try to understand why
22:28 imirkin: airlied: when tess is enabled, the one from TES is the one that "really matters"
22:29 imirkin: airlied: however the others should still work as usual... they just don't have the effect of clipping/culling anything, since that comes after the GS stage
22:35 pmoreau: So, it didn’t receive a color… why
22:36 imirkin: Tom^: ok.... i have a handful of ideas
22:37 imirkin: Tom^: definitely the shader results look fine, under the rules i'm aware of
22:37 imirkin: Tom^: however there could EASILY be rules i'm not aware of :)
22:37 imirkin: mwk: what do you think about this on gk110: 00000330: 011c1012 760000be texgrad p all $r4:$r5:$r6:$r7 t2d c[0x0] xygg $r4:$r5:$r6:$r7 gg__ $r2:$r3
22:38 imirkin: mwk: can you see that causing a MISALIGNED_GPR exception?
22:38 pmoreau: Oh! `size >> units[f]` nv50_ir_ra.cpp:80, so, with a size of 1, `units[f] == 2` for GPR, that is surely going to end up as 0…
22:40 pmoreau: I probably need to report a full 32-bit reg to Nouveau, even if it is a char?
22:41 pmoreau: Since if you want to pack 4 chars in a reg, you have to do it manually in the OpenCL code
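
pmoreau's observation about `size >> units[f]` can be checked with plain integer arithmetic. This sketch only mirrors the shift he quotes from nv50_ir_ra.cpp; it assumes sizes in bytes and a GPR shift of 2, and `reg_units` is a made-up name.

```python
# Sketch: how many register units a value of a given byte size maps to.
GPR_UNIT_SHIFT = 2  # units[f] == 2 for GPR, per pmoreau

def reg_units(size_bytes, unit_shift=GPR_UNIT_SHIFT):
    """Number of allocation units for a value of size_bytes."""
    return size_bytes >> unit_shift

for size in (1, 2, 4, 8):
    print(size, reg_units(size))
```

A 1-byte or 2-byte value ends up with 0 units, which would explain why the 8-bit CVT result never received a register, while 4 and 8 bytes map to 1 and 2 units.
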
22:41 imirkin: mwk: and separately what do you think about 7f9c0809 6001008e tex p lauto all dfp $r2:$r3:#:# t2d c[0x8] xy__ $r2:$r3 0x0 for the same error
22:41 Tom^: imirkin: can you reproduce it on your gk20x when replaying the trace?
22:41 imirkin: Tom^: difficult to say. my gk208 is on the shelf :)
22:43 pmoreau: imirkin: So moral of the story: I can’t use <32-bit size values, right?
22:45 imirkin: pmoreau: not sure what story you're talking about, tbh
22:45 imirkin: pmoreau: do you have an LValue that's 8-bit in size? that's probably not a great idea.
22:45 pmoreau: Still talking about https://phabricator.pmoreau.org/P97, and why the result of the CVT is not allocated and the whole CVT insn is eventually removed
22:47 pmoreau: I was testing the following kernel: `void cast(char* out, long in) { *out = convert_char_sat(in); }`
22:47 imirkin: Tom^: among other things, unless i'm severely mistaken, the glsl -> tgsi (or tgsi -> nv50) is wrong
22:48 Tom^: imirkin: ok, atleast it cant be a major issue because i didnt notice it until i was looking for something else in dmesg :P
22:50 imirkin: Tom^: tbh, i've seen it before
22:50 imirkin: Tom^: i think at one point i had the theory that it was due to going over the maxgpr on some tex instruction or something
22:50 imirkin: but i don't think that's what's happening here
22:51 Tom^: i still have the trace open in qapitrace if theres anything else you would want
22:51 imirkin: oh wait, actually that's not a bug in the conversion
22:52 imirkin: i forgot that ddy is multiplied (optionally) by -1 for winsys idiocy :)
22:53 imirkin: Tom^: if i give you a patch to test, can you do so easily?
22:54 imirkin: i just have some random ideas
22:54 imirkin: basically it's not like the gpr alignment rules are well-spelled-out, esp for oddball ops like TEX
22:55 imirkin: and so i just have to look for things that could be a little fishy, and hope i get it right
22:55 imirkin: and the problem is that even if i get it wrong, it might shift GPR alloc around enough to fix the real problem :)
22:56 Tom^: imirkin: interestingly everything froze when i replayed the trace this time with this in dmesg https://gist.github.com/gulafaran/3f966523b369115c0bcbfe12938a2e44
22:56 Tom^: if its of any value
22:56 Tom^: idk why because i replayed the trace easily 150 times to find the faulty glDraw call :p
22:57 karolherbst: Tom^: well random gpu hang I guess
22:57 imirkin: that seems like an unrelated issue
23:00 imirkin: aha:
23:00 imirkin: // If TEX requires more than 4 sources, the 2nd register tuple must be
23:00 imirkin: // aligned to 4, even if it consists of just a single 4-byte register.
23:00 imirkin: and that texgrad thing fails
23:00 imirkin: now time to figure out why that is
23:00 imirkin: since we clearly TRY to handle it
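
The rule imirkin quotes can be written down as a small check. This is a sketch under assumptions, not the nv50_ir code: it counts each source register as one source and reads "aligned to 4" as "the first register index of the second tuple is a multiple of 4"; `second_tuple_ok` is a made-up name.

```python
# Sketch: alignment rule for the second source tuple of a TEX-like op.
def second_tuple_ok(num_sources, second_tuple_first_reg):
    """True if the second register tuple satisfies the quoted rule."""
    if num_sources <= 4:
        return True  # the rule only applies with more than 4 sources
    return second_tuple_first_reg % 4 == 0

# texgrad from the log: sources $r4:$r5:$r6:$r7 plus $r2:$r3 (6 registers),
# so the second tuple starts at $r2.
print(second_tuple_ok(6, 2))  # False: $r2 is not 4-aligned
print(second_tuple_ok(6, 8))  # True: a tuple starting at $r8 would be fine
```
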
23:01 imirkin: Tom^: expect a patch in 10-20 mins :)
23:01 Tom^: im waiting patiently.
23:01 Tom^: =D
23:09 imirkin: Tom^: http://hastebin.com/ipawebepac.md
23:09 Tom^: applying.
23:11 karolherbst: imirkin: any reasons why I didn't get the message on my kepler?
23:11 imirkin: karolherbst: you have fewer registers
23:11 imirkin: karolherbst: aka just happened to get lucky
23:11 karolherbst: ohh
23:11 imirkin: or perhaps TXD isn't as picky
23:11 karolherbst: you mean in total
23:12 imirkin: OR perhaps it just doesn't raise the error, and silently fails at life
23:12 karolherbst: mhh maybe
23:12 karolherbst: what is the expected issue caused by this?
23:12 karolherbst: just less perf or something serious too?
23:12 imirkin: wrong LOD selected
23:12 imirkin: when texturing
23:12 imirkin: aka... probably not even visible to you
23:12 karolherbst: ahh
23:12 karolherbst: so just bad texturing
23:13 imirkin: not _that_ bad
23:13 karolherbst: right
23:13 karolherbst: so more like less quality to _some_ degree
23:14 imirkin: or too much
23:14 karolherbst: :D
23:14 imirkin: (and thus slower)
23:14 karolherbst: I see
23:14 pmoreau: It’s working better with a 32-bit reg than a 8-bit one :-) Now, Nouveau thinks it’s emitting `0: cvt u8 $r2 u64 c0[0x8] (8)` whereas envydis says `00000008: 21809c04 1c004000 cvt u8 $r2 ??? $r8 [unknown: 01800000 00004000] [unknown operand]` (and the GPU agrees with unknown operand)
23:15 imirkin: pmoreau: i think you just want to use the right u32
23:15 karolherbst: pmoreau: a week ago I also hit an issue where the post RA pass output looked good, but the emitted binary was garbage :)
23:15 pmoreau: And do the "conversion" manually to u8? Ok
23:16 imirkin: pmoreau: well
23:16 imirkin: pmoreau: just pick the right "half" of the u64 (aka the lower one)
23:17 pmoreau: Oh, well that is what my patch is doing (the one I submitted). I’m trying to test in which case it should apply, as you suggested
23:19 pmoreau: (This patch https://lists.freedesktop.org/archives/mesa-dev/2016-March/110423.html)
23:21 Tom^: imirkin: cool the trace didnt spam dmesg any more but i got this again https://gist.github.com/gulafaran/715a2b46e1348f1cc1b1c8112de84e74
23:21 imirkin: dunno, sounds like ttm is unhappy
23:22 Tom^: is that in mesa or the kernel
23:22 imirkin: kernel
23:22 Tom^: i probably could recompile and keep debug symbols
23:39 imirkin: hakzsam: any luck with default_attribute?
23:50 pmoreau: Fun: `p isSignedIntType(TYPE_S64)` -> `$ false`
23:51 pmoreau: Oh, I see why :-D
23:53 pmoreau: Missing a few TYPE_U/S64 cases in nv50_ir_inlines.h
23:53 Tom^: lol ok that didnt work. kernel is 1.2gb with debug symbols. my efi partition isnt that big
23:54 pmoreau: Wow! Yeah, who would create such a large EFI partition!
23:55 Tom^: i guess its time to create a suited kernel .config then
23:56 imirkin: Tom^: that's a LOT of debug symbols...
23:56 Tom^: =D
23:56 imirkin: Tom^: did my change seem to work though?
23:56 Tom^: yea it aint complaining in dmesg and the trace runs fine
23:57 imirkin: ok cool
23:57 Tom^: i guess i could test some games too incase other things broke
23:57 imirkin: were you looking into this because rendering was broken?
23:57 imirkin: or did you just happen to notice it for no reason?
23:57 Tom^: nah i was looking up a segfault in dmesg and noticed it for no reason
23:57 imirkin: ah ok
23:58 imirkin: well there's a VERY small chance it could affect actual rendering
23:58 imirkin: but most likely you'd never notice the difference