00:00karolherbst: imirkin: https://github.com/karolherbst/mesa/commit/cdcf3eaf20192062ac9653af688a97ad643f2340
00:02karolherbst: moving the quadpop/quadon doesn't seem to matter, so: https://github.com/karolherbst/mesa/commit/c975f1afbd8d2eaa84c58e7a77cfa8d39017e3e7
00:03karolherbst: okay, now the last 96 tests
00:05karolherbst: mhh, still 96
00:42karolherbst: 48 fails
00:43karolherbst: something is still wrong, but I can't see it yet: https://github.com/karolherbst/mesa/commit/6fcb43f74badcafd73fff267173df7e944b14140
00:44karolherbst: imirkin: would be nice if you could take a look at it
00:46karolherbst: I am sure only some shader type fails
00:46karolherbst: like compute
00:50karolherbst: oh no, I broke textureGather?
00:51karolherbst: but that means I fixed textureGrad :)
00:51karolherbst: textureGather was broken before already
01:25imirkin: karolherbst: you have to do the flip based on ... the stupid thing
01:25imirkin: karolherbst: look at how DDY is handled
01:25imirkin: it has to be done at the st/mesa level
01:25imirkin: er wait, actually, nevermind - that's totally bogus, ignore that
02:27imirkin: karolherbst: OH SHIT! that's a great find -- i totally forgot to broadcast the array coordinate to all lanes!
02:27imirkin: karolherbst: dunno if your other changes are important
03:21imirkin: mwk: how would i broadcast a value from 1 lane to all lanes? preferably something that works on both fermi and kepler+
03:22imirkin: looks like nvidia uses quadop f32 add with 0 - that seems dangerous
03:25imirkin: i guess SHFL.IDX would be the thing to do on kepler+...
03:25imirkin: (does that even work in frag shaders though?)
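The two broadcast options mentioned above (an f32 add with 0, or a shuffle-style copy) can be modelled on the CPU. This is only a sketch: `Quad`, `broadcast_by_add_zero`, `broadcast_by_copy` and `lane0_is_negative_zero` are hypothetical names, not nv50_ir or hardware APIs, and the model assumes a quad is four lanes that can each read one chosen lane's value. The log does not say why the add was considered dangerous; the sketch only shows one well-known difference, namely that an add is arithmetic on the value while a copy is not.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// A quad modelled as four lanes, each holding one float (assumption).
using Quad = std::array<float, 4>;

// Model of "add with 0": every lane takes the value of lane `lane`
// and adds zero to it, so all four lanes end up with the same result.
inline Quad broadcast_by_add_zero(const Quad &src, unsigned lane)
{
    assert(lane < 4);
    Quad out{};
    for (unsigned i = 0; i < 4; ++i)
        out[i] = src[lane] + 0.0f;
    return out;
}

// Model of a shuffle-style broadcast: a plain copy, no arithmetic.
inline Quad broadcast_by_copy(const Quad &src, unsigned lane)
{
    assert(lane < 4);
    Quad out{};
    out.fill(src[lane]);
    return out;
}

// True if lane 0 holds a zero with the sign bit set (-0.0f).
inline bool lane0_is_negative_zero(const Quad &q)
{
    return q[0] == 0.0f && std::signbit(q[0]);
}
```

With ordinary values both helpers give the same result. They differ for inputs where float addition is not an identity: under IEEE 754 round-to-nearest, `-0.0f + 0.0f` is `+0.0f`, so the add-based version drops the sign of a negative zero while the copy keeps it.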
04:01imirkin: karolherbst: the remainder of the 48 issues have to do with textureGather + depth/stencil. i think that generically sounds mildly familiar, but ... i don't really remember
04:04imirkin: mwk: any clue why things work when doing the tex from lane 0's perspective (and then always copying the l0 result)? even in frag shaders?
04:25imirkin: Test case 'KHR-GL45.texture_cube_map_array.sampling'..
04:25imirkin: Pass (Pass)
04:38imirkin: karolherbst -- see my updated 'cts' branch
05:07mangix: skeggsb: any chance of backporting i2c/gf119-: add support for address-only transactions to 4.12?
05:07mangix: Fedora's latest kernel does not include it
05:08mangix: and hence loading GNOME fails
06:45karolherbst: imirkin: well, only with all changes I got all subtests to pass
06:45karolherbst: except the quadpop/quadon move, I think this might be totally unimportant
06:49karolherbst: imirkin: and my changes are simply closest to what nvidia does
06:52karolherbst: will have a second look this afternoon
08:34karolherbst: imirkin: I could imagine that those quadon/quadpops are minor perf optimizations
09:34tesnv: How does nvClkMode parse letters other than f?
09:43tesnv: Like say a for a lower clock set on boot
09:46karolherbst: tesnv: as hex
09:48tesnv: Oh ok. The parameter guide on freedesktop shows it as not hex. Was a little confusing.
09:54karolherbst: uhm or not, meh, my mistake
09:54karolherbst: meant something totally different
12:02imirkin: karolherbst: yeah, i don't disagree that those changes made a difference. i just don't understand *why* they made a difference. and i hate that.
12:03karolherbst: I didn't even try to understand it
12:04imirkin: well - hopefully my comments shed light on at least what they're doing?
12:04karolherbst: yeah, okay sure, I get what each operation does on its own
12:04karolherbst: but I didn't know what the entire process is doing
12:05imirkin: anyways, with my other fixes that whole test passes now which is nice
12:05karolherbst: and I think we are better off doing what nvidia does before trying to understand what has to be done on our own (for now)
12:05karolherbst: I left a comment on your commit though
12:05imirkin: i disagree with nvidia there :)
12:05karolherbst: maybe they optimized it away?
12:06imirkin: IMO it's actually more efficient this way.
12:06imirkin: since only one texture will be getting hit instead of potentially 4 different textures
12:06imirkin: in 99.9999999999% of cases it won't matter
12:06karolherbst: maybe that's why they left it out
12:06karolherbst: probably they did some benchmarking and concluded it isn't worth it
12:07karolherbst: but we could try to figure out how much of an impact those quadon/quadpop pairs have
12:07karolherbst: that would be interesting
12:07imirkin: i guess given that observation, in practice it shouldn't matter
12:07imirkin: so *not* having the extra mov's would, on average, be faster
12:08imirkin: however in the 0.0000001% case where it does matter, i suspect my way will be faster.
12:08karolherbst: yeah, it boils down to what is faster in the avg case
12:08imirkin: anyways - given that thought - i'll change it, with a comment
12:08karolherbst: your version looks much cleaner anyway
12:08karolherbst: than mine
12:08imirkin: you were randomly pressing buttons until the output code was what you wanted :p
12:09imirkin: it helps to actually understand what the thing is trying to do
12:09karolherbst: I didn't bother cleaning it up
12:09imirkin: and like i said - it took me a *long* time to figure that out
12:09karolherbst: makes sense
12:10karolherbst: I guess everything involving quadops isn't trivial
12:11karolherbst: I guess another big item on the CTS is "KHR-GL45.shader_image_load_store.non-layered_binding" :/
12:11imirkin: does that one fail coz of 3d?
12:12karolherbst: something not (properly) supported
12:12karolherbst: something image related ;)
12:12karolherbst: can't remember the exact wording
12:13imirkin: image3d? :)
12:13karolherbst: really, no idea
12:13karolherbst: something image related we didn't implement yet
12:16karolherbst: looking at the code, "3D images are not really supported" sounds like the error
12:16karolherbst: so yeah
12:36imirkin: karolherbst: updated my change a bit. still passes.
12:37karolherbst: did you push it?
12:38imirkin: to my github tree, yes
12:38karolherbst: ahh okay, it looked the same to me, but now not anymore...
12:46karolherbst: imirkin: I think you don't want to do this: "bld.mkQuadop(0x00, tex->getDef(c), 0, tex->getDef(c), zero);"
12:46imirkin: i do.
12:46karolherbst: doesn't that lead to warning printed about values being not uniquely defined?
12:47karolherbst: ohh, I see
12:47imirkin: pre-ssa = the best! :)
12:47karolherbst: I call it cheating
12:47karolherbst: okay, but it looks good to me
12:48karolherbst: I guess your other two commits fix those other subtests?
12:53imirkin: deceivingly simple, right? :)
12:54imirkin: had nothing to do with texturing
12:54imirkin: all to do with input processing
12:59imirkin: that gm107 manual txd logic seems ... silly.
13:00imirkin: i wonder why the SHFL is needed
13:00imirkin: is it coz FSWZADD works differently than QUADOP? probably...
13:07karolherbst: imirkin: planning to get the textureGrad fix applied on stable? If not I would try to see what games/applications are affected to figure if it's actually worth it or not
13:14imirkin: karolherbst: eh... it fixes a CTS test, i think that's good enough
13:14karolherbst: :) okay
13:14imirkin: the question is whether i'm going to try to figure out *why* it's needed before applying it or not
13:15karolherbst: mhh, tough question. But I expect that even if you figure it out, you wouldn't change the patch significantly I guess. I highly doubt that nvidia would generate such a code if there is a "better" way of doing this
13:19imirkin: i definitely *do* want to get this tested on fermi though
13:19karolherbst: makes sense
13:19imirkin: since all this texture stuff is subtle and quick to anger
13:20imirkin: and i really should probably fix it on maxwell first too
13:20imirkin: which would mean i'd need a volunteer to test, as i don't have one plugged in
13:20karolherbst: ask mupuf
13:20karolherbst: mupuf: plug a fermi pls
13:20imirkin: he's running on blob afaik
13:21imirkin: oh, for reator
13:21karolherbst: nah, I meant reator. There we could at least run the CTS stuff
13:21mupuf: right now, there is only one pascal
13:21mupuf: I took out the GM206 yesterday ... which happened to work better on the blob
13:21karolherbst: well, we would like a fermi card
13:22mupuf: Will do more testing tonight related to that, for ahuillet
13:22mupuf: which one?
13:22mupuf: just give me two GPUs you want and you'll get them
13:22karolherbst: imirkin: any preferences?
13:22karolherbst: but I guess any fermi might do
13:22imirkin: all the same to me
13:27karolherbst: mupuf: I guess a gk110 and a gfxx is nice
13:28karolherbst: this way I could also test things on the gk110 if I want to
13:28karolherbst: or maybe gk20x? instead?
13:28karolherbst: not quite sure how big the differences here are
13:28imirkin: i have a GK208 plugged in
13:28imirkin: GMxxx + GFxxx would be ideal for my purposes.
13:29karolherbst: okay, sounds good to me
13:29imirkin: but i won't have time to hack on the maxwell thing ... until tonight at the earliest, and weekend at the latest.
13:37mupuf: no worries, reator is fully idle nowadays
13:47karolherbst: :( sorry
13:47karolherbst: I should do more kernel stuff again
13:47mupuf: why sorry :D
13:47mupuf: I am not doing anything on it either :D
13:48imirkin: none of us are, and no one's come to replace us =/
13:48karolherbst: it was towards reator, I am sure it feels very bored
13:51karolherbst: fosdem is usually in February, right?
13:52mupuf: closest weekend to the 1st of February
13:52karolherbst: I think it would be nice to get official 4.5 support by then for kepler+maxwell :) I think this is completely doable. I just don't know how long Khronos needs to approve a submitted test result
13:52mupuf: 30 days
13:53karolherbst: ohh, okay
13:53karolherbst: so end of year we should hand it in
13:53karolherbst: sounds reasonable
13:53mupuf: but X.Org still needs to sign the agreement first... which we finally seem to have on the way!
13:53mupuf: the lawyers took their time
13:54karolherbst: I would be more shocked if they didn't
13:57mupuf: the next step will involve freedesktop.org: We need to start giving away x.org addresses to people who want to interact with khronos
13:57mupuf: I will likely take on that role
13:57mupuf: since I have been pushing everyone to get this contract signed
14:04karolherbst: mupuf: sounds good
14:04karolherbst: mupuf: and I'd expect to be the one taking care of the CTS pain regarding nouveau
14:04karolherbst: unless imirkin wants to do this ;)
14:26imirkin: karolherbst: full 4.5 conformance by end of year is aggressive given the current pace of progress
14:30imirkin: and also what is it that i want to do?
14:34imirkin: actually fixing 3d images on kepler should be relatively doable
14:35imirkin: nvc0_mt_zslice_offset has the calculation
14:35imirkin: although it'd be nice to see how the blob does it
14:36imirkin: since that calculation takes x/y into account and computes the actual address of that tile, while we're going to need to figure out a way to play nice with the built-in tiling. somehow.
14:36imirkin: (or maybe not?)
14:37karolherbst: imirkin: most failures are API things imho
14:38karolherbst: or compiler things, which all didn't look too complicated
14:38karolherbst: but yeah, the pace might be a bit aggressive
14:38karolherbst: but maybe also doable
14:38karolherbst: and maybe we end up only asking for 4.4 for now
14:39imirkin: are you running the whole mustpass list, or cherry-picking what you like?
14:39karolherbst: I did several full runs
14:39karolherbst: although some tests just don't "finish"
14:39karolherbst: so I need to skip those
14:40imirkin: right, like some of the AoA tests
14:40imirkin: i think they hit some O(n^2) thing
14:40karolherbst: those "skipped due to crashes/hangs:" in the trello card
14:40karolherbst: most likely
14:41imirkin: sure, but e.g. where's the pipeline statistics stuff?
14:41karolherbst: KHR-GL45.arrays_of_arrays_gl.InteractionFunctionCalls1 doesn't take _too_ long, but still a lot of time
14:41imirkin: is that not in mustpass?
14:41imirkin: i mean i don't see it on the card
14:41karolherbst: I removed all 4.5+ things for now
14:41karolherbst: just made sense to me to get the 4.4 list running first
14:42karolherbst: then we can still decide to either work on 4.5 before sending it in or to send in the 4.4 run
14:42karolherbst: I also removed optional extensions and the like
14:42imirkin: if it's in mustpass, it must pass...
14:42karolherbst: there are also a handful of tests which need an OpenGLES context
14:42karolherbst: imirkin: well, we could just stop exposing the extensions
14:43imirkin: yeah, i mean we could also stop tiling 3d textures
14:43imirkin: that'd fix 3d images right up
14:43imirkin: at the cost of perf
14:43karolherbst: there is also "NotSupported" as the result of a test
14:43karolherbst: and those refer to optional things usually
14:43imirkin: i'd rather not cause driver worsening in the name of passing some CTS tests
14:44karolherbst: it's just a matter of priorities right now, I have a complete list somewhere of failing things
14:45karolherbst: and in total we have around 100 tests which do not pass in the 4.5 mustpass list, and half of those don't pass due to "NotSupported"
14:52imirkin: karolherbst: you can try to solicit help from the wider community on fixing some of the API issues
14:52imirkin: (that aren't nouveau-specific)
14:53karolherbst: yeah, I am aware
14:53karolherbst: maybe I should just go through all of those and create bug reports for each one :)
14:54imirkin: unfortunately people are often overworked, and there can be a feeling of "well, we aren't failing any conformance test that we have to pass, so leave me the fuck alone"
14:54karolherbst: intel fails some of those as well
14:54karolherbst: "just saying (tm)"
14:54karolherbst: I know
14:54imirkin: what's relevant is that they didn't fail it at some specific version that they ran it on.
14:54imirkin: [specific version of the test, that is]
14:54karolherbst: it gets interesting for 4.6 again
14:55imirkin: that's a ways away though
14:55karolherbst: so I think in the end we'll fix those things
14:55imirkin: although perhaps in time for FOSDEM, i965 and radeonsi will have it - dunno
14:55karolherbst: which means that a bunch of those API bugs might get also fixed, dunno as well
14:56karolherbst: I will clean up my load_image patches today and send them as RFC out
14:58imirkin: karolherbst: fp64 too
14:59karolherbst: but I think you wanted to review the groundwork as well?
15:01imirkin: less active voice, more passive voice
15:03imirkin: but yeah - the most recent versions are actually pretty good
15:03imirkin: i'm probably going to add a handful of comments
15:03imirkin: but no code changes
15:47puttext: hello to everyone
15:49puttext: someone can help me to activate svideo mode for tvout on an nvidia 5900?
15:58karolherbst: imirkin: sounds good :) How are the perf impacts by the way here anyway? I could imagine that those builtin functions draw a lot of perf compared to what we do now, but I also have no clue what actually depends on such precise results?
15:59karolherbst: but I assume if somebody uses that extension, they do it for a reason, so it's fine
16:12imirkin_: puttext: should Just Work (tm) - what seems to be the problem?
16:13imirkin_: karolherbst: if you're using doubles, you want precise results. nvidia has similar built-in functions to get accurate results.
16:13imirkin_: puttext: grep . /sys/class/drm/card*-*/status
16:23imirkin_: puttext: if you're not in a PAL country, you may have to set the tv_norm module option to the appropriate thing. see https://nouveau.freedesktop.org/wiki/KernelModuleParameters/
16:23imirkin_: [or rather, if your output device doesn't support PAL]
17:43karolherbst: imirkin: uhh, I found the part in the spec regarding f11
17:44karolherbst: "2^-14 * (M / 64), if E == 0 and M != 0," that's the range I implemented with my patch
17:46karolherbst: but implementations are also allowed to do this: "0.0, if E == 0 and M != 0"
17:46karolherbst: which is the current situation
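The two permitted behaviours for small f11 values can be sketched as a decoder. This is a minimal sketch, not the nouveau patch: `decode_f11` and its `flush_denorms` flag are hypothetical names, and the layout (5 exponent bits E, 6 mantissa bits M, no sign bit) plus the normal, Inf and NaN cases are taken from the OpenGL definition of unsigned 11-bit floats rather than from the quotes above, which only cover `E == 0 and M != 0`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Decode an unsigned 11-bit float: bits [10:6] are E, bits [5:0] are M.
// flush_denorms picks between the two behaviours quoted above for
// E == 0 and M != 0: return 0.0, or return 2^-14 * (M / 64).
inline float decode_f11(std::uint16_t bits, bool flush_denorms)
{
    assert(bits < (1u << 11));
    const unsigned E = (bits >> 6) & 0x1f;
    const unsigned M = bits & 0x3f;

    if (E == 0) {
        if (M == 0 || flush_denorms)
            return 0.0f;
        return std::ldexp(static_cast<float>(M) / 64.0f, -14);
    }
    if (E == 31)
        return M == 0 ? std::numeric_limits<float>::infinity()
                      : std::numeric_limits<float>::quiet_NaN();
    // Normal numbers: 2^(E - 15) * (1 + M / 64)
    return std::ldexp(1.0f + static_cast<float>(M) / 64.0f,
                      static_cast<int>(E) - 15);
}
```

For example, the bit pattern `0x001` (E = 0, M = 1) decodes to 2^-20 with `flush_denorms == false` and to 0.0 with `flush_denorms == true`; both results are allowed by the wording quoted above.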
18:13Roy: tried installing nouveau drivers using Nvidia Geforce 710 (NVE0 Kepler) running Ubuntu 16.04 on aarch64 ARM system, but No devices, No screens found as reported in /var/log/Xorg.0.log, any ideas for X server to run?
18:19pmoreau: Roy: Could you please link the output of `dmesg` and /var/log/Xorg.0.log?
18:28karolherbst: uhm nice
18:28karolherbst: 'KHR-GL45.arrays_of_arrays_gl.InteractionFunctionCalls2' just used 15GB of RAM for me
18:31Roy: root@localhost:/var/log# startxfce4 /usr/bin/startxfce4: Starting X server X.Org X Server 1.18.4 Release Date: 2016-07-19 X Protocol Version 11, Revision 0 Build Operating System: Linux 4.4.0-83-generic aarch64 Ubuntu Current Operating System: Linux localhost 4.4.65 #3 SMP PREEMPT Tue Aug 15 11:44:00 CDT 2017 aarch64 Kernel command line: console=ttyS0,115200 root=/dev/sda4 rootwait earlycon=uart8250,mmio,0x21c0500 Build Date: 17 July
18:31tobijk: Roy: complete dmesg and Xorg log
18:31tobijk: pastebin it please
18:32Roy: will do
18:32tobijk: karolherbst: then we miss a terminal condition somewhere :D
18:32tobijk: karolherbst: btw are those tests publicly accessible? i wanted to fix the conditional_render thingy
18:33karolherbst: yes, they are
18:33karolherbst: tobijk: https://github.com/KhronosGroup/VK-GL-CTS
18:33tobijk: thx :)
18:34Roy: root@localhost:/var/log# cat Xorg.0.log [ 309.604] X.Org X Server 1.18.4 Release Date: 2016-07-19 [ 309.604] X Protocol Version 11, Revision 0 [ 309.604] Build Operating System: Linux 4.4.0-83-generic aarch64 Ubuntu [ 309.604] Current Operating System: Linux localhost 4.4.65 #3 SMP PREEMPT Tue Aug 15 11:44:00 CDT 2017 aarch64 [ 309.604] Kernel command line: console=ttyS0,115200 root=/dev/sda4 rootwait earlycon=uart8250,mmio,0x
18:34Roy: not able to paste Xorg.0.log correctly
18:35tobijk: just copy it to pastebin/hastebin, whatever is your service of choice
18:37tobijk: Roy: but actually dmesg is more important for now
18:39tobijk: Roy: show me dmesg please
18:39tobijk: dmesg > file.txt and then pastebin that
18:41Roy: sorry if I'm a bit slow getting the info
18:43tobijk: Roy: np
18:44karolherbst: TGSI of death: https://gist.github.com/karolherbst/5c3676415e79bd15e3b7681c80f9bd11
18:44Roy: lmk if you see any issues in dmesg or the log, other than no screens or no devices found
18:44Roy: lspci shows the nvidia card
18:44tobijk: Roy: hmm it looks like everything should be in place
18:45tobijk: except this is an AARCH64
18:45tobijk: imirkin: -^ (any hints)
18:45Roy: yes, running on A72 CPU cores
18:46Roy: that's why I'm trying nouveau drivers for the nvidia card as nouveau supports ARM
18:46karolherbst: well, AARCH64 is pretty much untested
18:46karolherbst: except for tegra a little?
18:46Roy: I think that's the issue, not tested
18:46Roy: was hoping to figure out a way to get it working
18:47karolherbst: Roy: do you have any kind of xorg.conf file or so?
18:47Roy: yes, I've tried several xorg.conf files entries with no change
18:47karolherbst: remove those
18:48Roy: I tried removing them, again there was no change in behavior
18:50tobijk: it is silly, how the xserver finds the device "xfree86: Adding drm device (/dev/dri/card0)" and later on it fails
18:50Roy: yes, it seems to load and unload nouveau many times too
18:53karolherbst: of course that shader compiles for f1 and not e6 :/
18:53Roy: not sure on the shader comment?
18:53karolherbst: Roy: this is a tegra, right?
18:54tobijk: Roy: not for you, ignore it
18:54karolherbst: or... wait
18:54Roy: yes GK208B
18:54karolherbst: uhh, k
18:55karolherbst: everything seems to be alright on the kernel side
18:56Roy: I've tried gdm3, lightdm, but if I can't get X to recognize a screen I don't think I can get very far
18:56karolherbst: I am wondering
18:57karolherbst: maybe the DDX doesn't support kepler
18:57tobijk: karolherbst: it does, well at least the common e6/e7
18:58tobijk: Roy: try running without the xf86-video-nouveau
18:58Roy: nve0 kepler is listed
18:58karolherbst: well, your DDX is quite old
18:59karolherbst: but I think this is fine though
18:59Roy: where does the ddx driver get loaded from?
18:59tobijk: Roy: just use modesetting and see if that brings you anywhere (remove the nouveau x driver)
18:59karolherbst: yeah, should be
19:00Roy: if I remove the nouveau driver, then don't I eliminate nvidia card support?
19:00tobijk: Roy: remove the package with whatever packagemanager you use (apt-get ?!)
19:02Roy: I'm not sure which pkg to remove
19:02tobijk: Roy: no you get the modesetting driver which is included with the xserver
19:02pmoreau: Roy: Nouveau has multiple components: a kernel module, the OpenGL driver in Mesa, and a driver for X. If you just remove xf86-video-nouveau, X will use modesetting as well, and all the other components from Nouveau are still there.
19:02tobijk: its good enough for first tests
19:03tobijk: Roy: apt-cache search nouveau and pick the one with xorg or xf86 in it and delete :D
19:03Roy: I get this msg E: Unable to locate package xf86-video-nouveau
19:03Roy: I'm pretty confident I never installed xf86-video-nouveau
19:04pmoreau: It might have a different name on your distrib
19:04tobijk: Roy: maybe its called a bit different within ubuntu
19:04tobijk: use apt-cache to find it
19:05tobijk: or google: xserver-xorg-video-nouveau
19:05tobijk: damn those different names ~_~
19:05Roy: what to remove?
19:06tobijk: all of them
19:07Roy: ok I remove xserver-xorg-video-nouveau*
19:07tobijk: yep and then try to rerun X
19:09tobijk: karolherbst: where is that TGSI shader of death coming from?
19:09imirkin_: Roy: did you get your stuff figured out, or do i still need to look at your logs?
19:10Roy: no change, (EE) no screens found(EE)
19:10imirkin_: tobijk: pretty sure xf86-video-nouveau 1.0.0 supported kepler :p
19:10Roy: is still the error
19:10tobijk: imirkin_: kernel well he had two different ones, modesetting is good enough for a first test
19:10Roy: the monitor only shows a text console login
19:10imirkin_: Roy: ok, and to save me some time, pastebin links to xorg and dmesg?
19:11tobijk: imirkin_: up there :D
19:11imirkin_: that doesn't save me time :p
19:11tobijk: not that far :D
19:11imirkin_: ok; i guess i'll look later when i have time to read all that.
19:12tobijk: https://pastebin.com/buQG8wNs https://pastebin.com/j0W4cLid
19:12tobijk: imirkin_: just for you O.o
19:12tobijk: Roy: care to share the new Xorg log
19:13imirkin_: Roy: do you have any funny business in your userspace? e.g. no permissions to access /dev/dri/card0?
19:13Roy: dmesg https://pastebin.com/3KBZhdjv
19:13Roy: right now running as root
19:13Roy: same error when running as user
19:14imirkin_: what (exactly) are you running as root?
19:14Roy: my login is root right now
19:14imirkin_: i understand. but what command are you running?
19:16tobijk: Roy: does the new xorg log state the same? you pasted a new dmesg :/
19:16Roy: xlog https://pastebin.com/eayJQLwv
19:16imirkin_: [ 64.021] (II) modeset(G0): using drv /dev/dri/card0
19:16Roy: i run startx
19:17tobijk: [ 64.021] (II) modeset(G0): using drv /dev/dri/card0
19:17tobijk: [ 64.021] (EE) No devices detected.
19:17karolherbst: tobijk: CTS
19:17imirkin_: why is it added as a GPU screen
19:17imirkin_: instead of a regular screen
19:17karolherbst: tobijk: It loops in GCRA::cleanup or so
19:17imirkin_: Roy: grep . /sys/class/drm/card*-*/status
19:19imirkin_: i wonder if the missing vgaarb is confusing things
19:19imirkin_: bleh. it's not even missing ... it's there.
19:19tobijk: karolherbst: *$x not uniquely defined* messages ?
19:19imirkin_: i have no clue how xorg decides whether something's a gpu or not
19:20karolherbst: tobijk: nothing
19:20Roy: nothing from grep /sys/class/drm/card*-*/status
19:20tobijk: mhh, not the easy way out then :/
19:21karolherbst: tobijk: look at the shader... you will know whats going on
19:21imirkin_: Roy: errr ... what
19:21imirkin_: grep . /sys/class/drm/card*-*/status
19:21imirkin_: close enough :)
19:22Roy: that's it
19:22imirkin_: yeah, that looks right
19:22imirkin_: (or at least not wrong)
19:23imirkin_: presumably you have the HDMI plugged in...
19:23Roy: yes, hdmi
19:23Roy: tried vga too, no difference
19:23imirkin_: ok, so the issue is that for some reason Xorg is deciding that it should add a GPU screen rather than a regular screen.
19:23imirkin_: i believe you can solve this by adding a trivial device section
19:24imirkin_: Device "foo"
19:24imirkin_: Driver "nouveau"
19:24tobijk: karolherbst: actually that looks like the big alu shader test from piglit: did you let it run a while? may take a few minutes
19:24imirkin_: or something
19:24Roy: i tried a trivial device section, no change
19:24imirkin_: er, make that
19:24karolherbst: tobijk: I would if I would have more RAM
19:24imirkin_: Section "Device"
19:24imirkin_: Driver "nouveau"
19:25Roy: tried that
19:25karolherbst: tobijk: I got to 15.4G/15.6G
19:25karolherbst: well, I could let it still run though
19:25karolherbst: because swapping is "freeish" on my system
19:25imirkin_: Roy: well, ultimately i don't think you're having nouveau issues, you're having xorg issues. perhaps #xorg-devel or #xorg-users would have more knowledgeable people there. i'm not entirely sure how to debug this remotely.
19:25karolherbst: but at some point it won't matter much
19:26Roy: well thanks for the help
19:26tobijk: karolherbst: yeah we clearly should not use that much memory at all
19:26karolherbst: I call memory corruption
19:27karolherbst: or not?
19:27tobijk: i call array_of_array_of_array....we don't terminate
19:28karolherbst: not even memory leaked
19:28karolherbst: okay, the loop is inside nv50_ir::GCRA::cleanup
19:29karolherbst: the for (Value::DefIterator d = lval->defs.begin(); d != lval->defs.end(); ++d) one
19:29karolherbst: lval->defs is like insanely huge
19:29karolherbst: around 1k
19:29karolherbst: and func->allLValues is 6200 big
19:30tobijk: karolherbst: the defs is a std::list?
19:30tobijk: if i remember right?
19:30imirkin_: there was a patch to change defs to a set
19:30imirkin_: i was too chicken to apply it iirc
19:30tobijk: imirkin_: yeah i still have it, you coward ;-)
19:30karolherbst: why should a set be any better?
19:31karolherbst: ohhh, I see
19:31tobijk: not in the memory case
19:31tobijk: but runtime wise
19:31karolherbst: I could imagine that duplicates get wiped out as well
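The list-versus-set trade-off discussed above can be sketched with standard containers. This is an illustration under assumptions, not the patch itself: `Def`, `list_contains`, `set_contains`, `Sizes` and `add_twice` are hypothetical names, and the real nv50_ir def tracking is more involved. It shows the two properties mentioned in the discussion: lookups are faster in a set, and duplicates collapse to one entry.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_set>

// Hypothetical stand-in for a definition reference; not the nv50_ir type.
struct Def { int id; };

// std::list: finding one def means walking the list, O(n) per lookup.
inline bool list_contains(const std::list<Def *> &defs, Def *d)
{
    return std::find(defs.begin(), defs.end(), d) != defs.end();
}

// std::unordered_set: hashed lookup, O(1) on average.
inline bool set_contains(const std::unordered_set<Def *> &defs, Def *d)
{
    return defs.count(d) != 0;
}

struct Sizes { std::size_t list_size; std::size_t set_size; };

// Add the same def twice to each container and report the resulting sizes.
inline Sizes add_twice(Def *d)
{
    assert(d != nullptr);
    std::list<Def *> as_list;
    std::unordered_set<Def *> as_set;
    for (int i = 0; i < 2; ++i) {
        as_list.push_back(d);   // keeps both copies
        as_set.insert(d);       // second insert is a no-op
    }
    return { as_list.size(), as_set.size() };
}
```

One design consequence worth keeping in mind: a `std::list` preserves insertion order, while the iteration order of a `std::unordered_set` is unspecified, so any code that relies on which element comes "first" can behave differently after such a switch.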
19:32tobijk: karolherbst: runtime wise enough for a try? https://build.opensuse.org/package/view_file/home:tobijk:X11:XOrg:Unstable/Mesa/u_0001-nv50-ir-use-unordered_set-instead-of-list-to-keep-tr.patch?expand=1
19:32karolherbst: please don't use paste sites with no "raw" link
19:33karolherbst: ohh wait
19:33karolherbst: build server
19:33karolherbst: "anonymous_user(Anonymous user is not allowed here - please login):"
19:34karolherbst: git link pls
19:35tobijk: not in git anymore
19:35tobijk: and hastebin is broken, what the hell
19:35tobijk: i gave up bugging imirkin_ about that patch :D
19:37karolherbst: oh well
19:37karolherbst: I already copy/pasted it
19:37karolherbst: it works
19:38karolherbst: I guess if this patch survives CTS, it's fine
19:39tobijk: heh, it does shorten runtimes, not terminating the array cleanup, so you may see an error in a more timely fashion :D
19:39karolherbst: oh well
19:40karolherbst: it's still super slow
19:40tobijk: but it does work? :O
19:40karolherbst: but at least it doesn't get stuck
19:40karolherbst: holy crap
19:40karolherbst: what the heck are they doing
19:41karolherbst: "22522: exit - # (0)"
19:41karolherbst: and this isn't even the highest I saw
19:42karolherbst: "35294: exit - # (0)"
19:42karolherbst: "53758: exit - # (0)" :)
19:42karolherbst: I should grep for exit
19:42karolherbst: funny story though: the shader always ends up being like 2k insns big
19:43karolherbst: at most
19:43karolherbst: wondering if there will be a 1m insn big shader
19:46karolherbst: Pass :3
19:46tobijk: ~_~ yay
19:47karolherbst: tobijk: so, you know what to do now :p
19:47tobijk: karolherbst: i gave it up, try bugging him about it (did it for around a year)
19:47tobijk: the patch is from 2013 :D
19:47karolherbst: imirkin_: we need this patch
19:48karolherbst: I guess when this patch doesn't regress anything with such a crazy CTS test, I guess it's fine enough (tm)
19:48imirkin_: it's all coz of the stupid merged defs thing =/
19:48karolherbst: well, if it's faster overall, it's fine to have such a patch anyway
19:49imirkin_: yes ... proof by induction ... if it works for n = 1, it must work for all n.
19:49karolherbst: doing a complete CTS run now :p
19:49imirkin_: mwk: did you happen to see my questions about the textureGrad stuff btw?
19:49tobijk: karolherbst: sadly imirkin_ is right, with it we have +-1 insn for a complete shader_db run, so something is off
19:50tobijk: yet getting this right would be the way to go
19:50karolherbst: in case of fail/compiles?
19:50tobijk: i cant remember if its + or -1 insn in total
19:50karolherbst: let me guess, one crazy shader compiles and another crazy shader fails?
19:50tobijk: nah instructions
19:51tobijk: fails are equal
19:51karolherbst: a std::set might have a different order
19:51karolherbst: so some changes are kind of expected
19:51karolherbst: set is also sorted by default
19:52karolherbst: and list... is just stupid
19:52tobijk: its unordered_set though
19:52karolherbst: okay well
19:52karolherbst: still every item is there just once, right?
19:52karolherbst: so, order might be different ;)
19:53karolherbst: "KHR-GL45.texture_swizzle.smoke" another test which just needs quite a lot of time
19:53karolherbst: tobijk: as long as we don't have any regressions in terms of compileability of shaders, it's fine :)
19:54tobijk: karolherbst: i have never witnessed any (using it in my daily use since i initially wrote that patch)
19:55tobijk: as i know myself, cts will prove me wrong within the next few seconds :D
19:56karolherbst: CTS will prove you wrong on so many different levels :p
19:56karolherbst: we have a few super odd crashes though
19:56tobijk: so there are differences?!
19:56karolherbst: without the patch as well
19:58tobijk: karolherbst: which target do you use btw? x11_glx
19:58karolherbst: or so
19:58karolherbst: some tests use egl for whatever reason
20:01karolherbst: "KHR-GL45.enhanced_layouts.varying_array_locations" and "KHR-GL45.enhanced_layouts.xfb_global_buffer"
20:03karolherbst: we crash in those tests
20:03tobijk: well i have no overview, those are new then?
20:09karolherbst: what do you mean by "new"?
20:10tobijk: with the patch
20:10tobijk: typing while running cts is really hard :O
20:12karolherbst: they crash also without the patch
20:12karolherbst: don't run it on your main GPU then
20:12karolherbst: tobijk: do you run the mustpass list?
20:13karolherbst: ./glcts --deqp-caselist-file=../../../../external/openglcts/data/mustpass/gl/khronos_mustpass/4.5.5.x/gl45-master.txt inside build_cts/external/openglcts/modules
20:14tobijk: nah i ran cts_runner plain for starters, which creates _many_ windows
20:14tobijk: stealing the focus from my irc client :D
20:34tobijk: karolherbst: btw the KHR-GL45.texture_swizzle.smoke does take long on intel as well
20:48tobijk: karolherbst: do you overwrite gl version or something? with nouveau the tests do not run for me (FATAL ERROR: Requested GLX configuration not found or unusable at tcuLnxX11GlxPlatform.cpp:635)
20:49imirkin_: yes, you have to force it to 4.5
20:59karolherbst: tobijk: use my cts branch
20:59karolherbst: tobijk: https://github.com/karolherbst/mesa/commits/cts
20:59karolherbst: there are a few fixes already
21:00karolherbst: I think we could even drop the BGRA4 patch now
21:01tobijk: https://github.com/karolherbst/mesa/commit/1f76f75a3363b8c0a259292f45a025017690ee12 <-- this one may work, yet it kills performance with queries
21:02tobijk: but, overall, nice :)
21:05karolherbst: RA fail in KHR-GL45.arrays_of_arrays_gl.SubroutineFunctionCalls1
21:06tobijk: karolherbst: well i have that second chance ra thing, that may help
21:08karolherbst: those aoa tests :(
21:08karolherbst: I am sure we spend 95% of all time inside those
21:09tobijk: karolherbst: https://paste.fedoraproject.org/paste/Iu5ot3UEXaXeCJ7rNRICjQ
21:09tobijk: i really should add an open git repo again
21:12karolherbst: the heck
21:12karolherbst: okay uhm nice
21:12karolherbst: or not
21:12karolherbst: memory corruption I figure
21:12tobijk: well *not production ready* :D
21:13karolherbst: without your patch
21:14tobijk: switching branches while building mesa is a nice idea, silly me
21:20karolherbst: tobijk: okay, a lot of ssbo tests regress
21:20karolherbst: nv50_ir::Value::getUniqueInsn is messed up
21:20karolherbst: nv50_ir::Instruction* nv50_ir::Value::getUniqueInsn() const: Assertion `def->get() == this' failed
21:22tobijk: karolherbst: yeah thats the culprit (works with normal shaders, but not with test shaders) :/
21:22karolherbst: why did you add "Value::getUniqueInsnMerged" anyway?
21:23tobijk: it splits an old method if i remember right
21:23tobijk: into two new ones, where getUniqueInsn is one of them
21:23karolherbst: yeah okay, but getUniqueInsn is wrong now
21:23tobijk: no doubt
21:23karolherbst: kind of
21:26karolherbst: well okay, but why the split anyway?
21:26karolherbst: I don't get the point
21:27tobijk: karolherbst: i only remember that it was needed, was a long time ago
21:27karolherbst: I remove it
21:30karolherbst: tobijk: why unordered_set though?
21:30tobijk: karolherbst: fast as hell :D
21:31tobijk: but after looking at it, it looks like we are failing to update the Values properly, triggering the assert()
21:31tobijk: with the patch
21:31karolherbst: mhh, right, true
21:31karolherbst: uhm no
21:31karolherbst: getUniqueInsn is just not right
21:33tobijk: karolherbst: worth a try imho: https://paste.fedoraproject.org/paste/UZKUu3T9r2-kFpuV9YkCxw
21:33karolherbst: the thing is, we just assume that the first def is equal to the insn
21:34tobijk: instead of the assert
21:34karolherbst: but we have no order now
21:34karolherbst: but we don't need the split really
21:34karolherbst: we have to code inside the join != this case anyway
21:37tobijk: ah the reason splitting this was actually imirkin, just go blame him *kidding*
21:38karolherbst: k, fixed
21:39karolherbst: tobijk: https://github.com/karolherbst/mesa/commit/74e1d8bbf56399b4fb6f9f8e5444ea999db12052
21:41tobijk: karolherbst: you only remerged getUniqueInsn()? then it's fine :)
21:41karolherbst: well no
21:41karolherbst: I just reset the entire commit
21:41karolherbst: and changed getUniqueInsn from scratch
21:42karolherbst: this ordering stuff is bad though
21:42karolherbst: I have an idea
21:42karolherbst: custom hash function
21:42tobijk: yay, ok go for it
21:42karolherbst: too lazy
21:43karolherbst: will be painful
21:43tobijk: i will gladly review in two months
21:43karolherbst: you can't read by hash anyway
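The ordering problem discussed above can be sketched in a few lines. This is only an illustration under stated assumptions: `Insn`, `ValueDef`, `firstDefInsn` and `uniqueInsn` are hypothetical stand-ins, not the real nv50_ir classes or the fix in the linked commit. It shows why "the first def is the insn" is well defined for a std::list but not for a std::unordered_set, where the answer has to come from inspecting every def.

```cpp
#include <list>
#include <unordered_set>

// Hypothetical stand-ins for illustration only; not the real nv50_ir types.
struct Insn { int id; };
struct ValueDef { Insn *insn; };

// Ordered container: "first def" is well defined, it is the def added first.
inline Insn *firstDefInsn(const std::list<ValueDef *> &defs)
{
    return defs.empty() ? nullptr : defs.front()->insn;
}

// Unordered container: *begin() is an arbitrary element, so do not rely on it.
// Instead, walk all defs and report an insn only if every def agrees on it.
inline Insn *uniqueInsn(const std::unordered_set<ValueDef *> &defs)
{
    Insn *found = nullptr;
    for (const ValueDef *d : defs) {
        if (!d->insn)
            continue;               // def without a defining instruction
        if (found && found != d->insn)
            return nullptr;         // more than one defining instruction
        found = d->insn;
    }
    return found;
}
```

The second function costs a full pass over the defs, which is the price of dropping the ordering; a custom hash does not bring the order back, since iteration order of an unordered_set is unspecified regardless of the hash used.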
21:45tobijk: somehow i really detest cts and like piglit, it is more reliable
21:45airlied: run cts inside piglit :-P
21:46karolherbst: well, CTS runs fine for me
21:46karolherbst: just those crashes are troublesome
21:46tobijk: karolherbst: yeah that bugs me, why it does not continue on crashes
21:46airlied: that's why you run it in piglit, so you get a complete run
21:46airlied: to avoid the crashes
21:46karolherbst: tobijk: are you even using my branch?
21:47airlied: though not having parallel it is slower
21:47tobijk: karolherbst: not yet, wanted to fix the conditional_render first
21:47airlied: since it has to do the cts init a lot
21:47karolherbst: tobijk: use it
21:47tobijk: but i will switch now
21:48tobijk: airlied: is the cts already included or is it actual work (haven't had a look lately)
21:48tobijk: in piglit
21:48karolherbst: it's glue
21:49airlied:hasn't used it for gl cts in a long while
21:49tobijk: airlied: k, will have a look anyway, thx
21:49karolherbst: there is a README
21:49karolherbst: and I think you need to follow the deqp and CTS instructions
21:49karolherbst: or maybe not?
21:49karolherbst: not quite sure
21:50airlied: I know it works for vulkan and gles stuff, not 100% sure if anyone has used it for the open GL CTS
21:50tobijk: karolherbst: well as i said: didn't have a look at it for a while, so there may be some nice changes in piglit :D
21:56karolherbst: yeah dunno, somehow it doesn't really work with piglit :/
22:06tobijk: karolherbst: you actually make a full run already with your branch?
22:12karolherbst: here and there a few tests crash, will make a list
22:13karolherbst: but it's far less now I hope
22:38rhyskidd_: tobijk/karolherbst: A little selfishly, I've got some patches on the piglit mailing list to improve its README for this very purpose
22:43tobijk: rhyskidd_: i have selfishly ignored/deleted all piglit emails, pointers to the patches (patchwork) and i'll have a look
22:44karolherbst: only two crashes
22:45karolherbst: KHR-GL45.arrays_of_arrays_gl.SubroutineFunctionCalls1 and KHR-GL45.shader_ballot_tests.ShaderBallotAvailability
22:45karolherbst: now a full run without both tests
22:46tobijk: arrays_of_arrays is likely resolvable with some lines, the other one is a concurrent test, which may be more problematic?!
22:46karolherbst: I think both are kind of trivial to solve
22:46imirkin_: why does ShaderBallotAvailability die?
22:47karolherbst: virtual ir_constant* ir_swizzle::constant_expression_value(void*, hash_table*): Assertion `!"Should not get here."' failed.
22:48tobijk: oh that is far up
22:49karolherbst: I think for 75% of the fails we can apply the "not my department" strategy, fix the nouveau issues and then we are good to go :p
23:07rhyskidd_: tobijk: See https://patchwork.freedesktop.org/series/27964/ Only patch 1/3 of the series is missing a R-b and I don't have commit rights to push the series
23:17tobijk: rhyskidd_: lgtm, yet i don't have commit access either
23:17tobijk: you may add my r-b anyway :)
23:28rhyskidd_: tobijk: Do you mind replying that R-b to the mailing-list? Not much point in me spinning another series if I don't have commit access
23:32tobijk: rhyskidd_: actually i did, for some reason it did not show up yet
23:33karolherbst: oh wow , a complete CTS run needs quite a lot of time :(
23:33airlied: even after you fix RA to not loop in those tests :)
23:33imirkin_: it still loops
23:33imirkin_: just faster.
23:33karolherbst: and needs less memory
23:34imirkin_: removed one n from O(n^100)
23:34tobijk: airlied: we never looped, we just took a long time finding the right insn with std::list :D
23:34imirkin_: ... in a loop
23:34imirkin_: as it were
23:34tobijk: as in endless loop :)
23:35karolherbst: maybe I should have built mesa not with -O0 now that I think about it
23:36karolherbst: nice, got past all those aoa tests
23:46karolherbst: noooo :( crash in KHR-GL45.enhanced_layouts.varying_array_locations
23:47karolherbst: glcts: ../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5823: ureg_src src_register(st_translate*, const st_src_reg*): Assertion `t->inputs[t->inputMapping[index]].File != TGSI_FILE_NULL' failed.
23:47imirkin_: yay - not my fault!
23:51karolherbst: only 4 crashes or not runnable tests left, nice