03:08 imirkin: karolherbst: ok, so getImmediate() already looks at the sType. it's just wrong for TXF :)
03:09 imirkin: also worth noting that textureLod() does take a float lod, so in that case 0x00000002 would be rightly interpreted as a 0.
08:23 karolherbst: imirkin: interesting about that textureLod being float though, because the test passes if the immediate is indeed interpreted as an int. Maybe we just miss a cvt, because TGSI lod is int?
08:34 karolherbst: imirkin: okay, and my first two messages got lost: "imirkin: ohh, I thought you knew that already" and "that's why I had the idea about overriding the type"
08:35 karolherbst: and your fix looks okay, will test it later today
08:36 karolherbst: and now it all makes sense
11:15 mwk: https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/
11:15 mwk: this. this should be linked somewhere.
11:15 karolherbst: mwk: wiki
11:16 karolherbst: https://nouveau.freedesktop.org/wiki/Development/
11:19 mwk: seriously
11:19 karolherbst: I am serious
11:19 mwk: this gives you a "I know kung-fu" moment about rasterization
11:20 mwk: greatest thing ever
11:23 mwk: and linked
11:39 karolherbst: mwk: nice thanks
11:53 mupuf: mwk: it has been linked for a long time to the wiki
11:53 mwk: uh?
11:53 mwk: where?
11:53 mupuf: hmm, not on this page it would seem
11:53 mupuf: pmoreau also added it recently
11:53 mupuf: and I added it years ago :D
11:54 mupuf: oh, no, not the same one
11:54 pmoreau: mupuf: :-p
11:54 mupuf: sorry, I meant the presentation from gnurou
11:54 mwk: which one do you mean?
11:54 mupuf: and also the life of a pixel by nvidia
11:54 pmoreau: https://developer.nvidia.com/content/life-triangle-nvidias-logical-pipeline and http://on-demand.gputechconf.com/gtc/2016/video/S6138.html is what I linked
11:55 karolherbst: good stuff, maybe I find time to look at any of those at some point :/
11:55 pmoreau: One day, it might be a good idea to consolidate the IntroductoryCourse and the Development pages.
11:56 karolherbst: maybe I should have a public todo list and I will work items from that list one day a week or so
11:56 mupuf: yep
11:56 karolherbst: *on items
11:57 pmoreau: karolherbst: I guess you could create a Trello board in the group, or a page on the wiki
11:57 karolherbst: good stuff
11:58 karolherbst: done, list on trello
11:58 karolherbst: add stuff
11:58 pmoreau: On Trello *might* be better, as that way most todo lists related to Nouveau can be found in a single place, but it doesn't matter that much.
11:58 pmoreau: You could also have it on GitHub, on your Nouveau repo.
12:00 karolherbst: I've enabled issues on my repository
12:01 karolherbst: on the mesa one as well
12:02 karolherbst: you can put things on that list and I will see how much time I can spend on those things
12:03 karolherbst: hopefully one day a week, maybe it will be only half a day, who knows
12:04 pmoreau: Ok :-)
12:30 karolherbst: imirkin: I think I will indeed post the patch on the mailing list to limit the clocks to 1GHz for now and maybe add an option to disable this "feature"
12:30 karolherbst: or add a new NvBoost option
12:32 imirkin: ok
12:32 imirkin: you might want to just make it a NvMaxCoreClk
12:32 karolherbst: or that
12:32 karolherbst: and then users could play with values or so
12:32 imirkin: as for texelFetch vs textureLod -- texelFetch takes an int lod, while textureLod takes a float lod
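A minimal sketch of why the immediate's type matters, assuming the raw 32-bit pattern 0x00000002 mentioned earlier: read as an integer it is LOD 2, while the same bits read as an IEEE-754 float are a denormal that is effectively 0. The helper names are hypothetical and are not part of nouveau's codegen.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers, not nouveau code: reinterpret the same 32-bit
// immediate the way an int-LOD op (texelFetch) or a float-LOD op
// (textureLod) would read it.
inline int32_t immAsInt(uint32_t bits)
{
    int32_t v;
    std::memcpy(&v, &bits, sizeof v); // bit-for-bit copy, no numeric conversion
    return v;
}

inline float immAsFloat(uint32_t bits)
{
    static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");
    float v;
    std::memcpy(&v, &bits, sizeof v); // same bits, now viewed as a float
    return v;
}
```

With these, immAsInt(0x00000002) gives 2, while immAsFloat(0x00000002) gives a tiny positive value that truncates to level 0.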
12:33 karolherbst: ahhh, okay
12:33 imirkin: http://docs.gl is nice :)
12:35 karolherbst: nice
12:38 karolherbst: imirkin: I am currently wondering how many "real applications" had issues because of that texelFetch bug
12:38 karolherbst: funny that we only found it through a robustness test
12:40 imirkin: karolherbst: i think it's rare to texelFetch from anything but level 0
12:41 karolherbst: okay
12:42 karolherbst: mhh
12:42 karolherbst: could it have big impact on performance?
12:42 imirkin: anyways, i'm going to push that patch out
12:42 karolherbst: okay
12:42 imirkin: minimal.
12:42 karolherbst: okay
12:42 imirkin: mostly it saves a reg
12:42 karolherbst: ohh, I was more thinking about using level 1 or higher ones instead of 0
12:43 imirkin: oh
12:43 imirkin: well, when you're doing texelFetch, it's more of a lookup
12:43 karolherbst: aren't the higher level textures usually smaller?
12:43 imirkin: i don't think multi-lod really makes sense in those scenarios
12:43 karolherbst: k
12:44 karolherbst: so visual bugs are more likely to be caused by this bug than bad performance
12:44 imirkin: yes
12:44 karolherbst: one day we will figure out why some games run super terrible on nouveau...
12:45 imirkin: resource management i'm sure
12:45 karolherbst: mhhh
12:45 karolherbst: maybe
12:45 karolherbst: I am talking about those saints row games which run at like 10% speed
12:45 pmoreau: Would that bug have caused a fetch from a lower LoD than it should have, like lvl 0 instead of 1?
12:46 karolherbst: pmoreau: yes
12:46 karolherbst: exactly this
12:46 karolherbst: the robustness test was filling nothing inside level 0, but a texture inside level 1
12:46 karolherbst: and this bug triggered that level 0 was being used instead of 1
12:47 pmoreau: So it could have a small impact on performance then, as the texture is smaller, and a tile would be used by more pixels so you have better cache coherency.
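As a rough illustration of the size difference being discussed, assuming the usual mipmap convention where each level halves the previous one (clamped to 1 texel): level 1 of a 256x256 texture has a quarter of the texels of level 0. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Extent of one dimension at a given mip level: halve per level, never below 1.
// The level >= 32 guard avoids an out-of-range shift on a 32-bit value.
constexpr uint32_t mipExtent(uint32_t base, uint32_t level)
{
    return level >= 32 ? 1u : std::max<uint32_t>(1u, base >> level);
}

// Total texel count of a 2D texture at a given mip level.
constexpr uint64_t mipTexels(uint32_t width, uint32_t height, uint32_t level)
{
    return static_cast<uint64_t>(mipExtent(width, level)) *
           mipExtent(height, level);
}
```

For a 256x256 texture, mipTexels(256, 256, 0) is 65536 and mipTexels(256, 256, 1) is 16384, so reading level 0 instead of level 1 touches a four times larger image.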
12:47 karolherbst: okay well
12:48 karolherbst: but I doubt this would get some games from 10% to 50%
12:48 pmoreau: But if the games were not limited by the texture fetch to start with, then it doesn't matter.
12:48 karolherbst: right
12:48 karolherbst: I really want to know why saints row runs like shit
12:48 pmoreau: Probably not such a boost.
12:48 karolherbst: it's the only game so far
12:49 karolherbst: everything else runs good enough compared to nvidia
12:49 pmoreau: Have you tried comparing perf counters between a run on Nouveau and on NVIDIA?
12:49 karolherbst: not yet
15:35 orbea: nouveau crashed while not doing much of anything (Just xscreensaver), here is a Call Trace from dmesg. http://dpaste.com/2HVKKN0 And a more complete log - http://dpaste.com/3J32ZEC Any ideas?
15:37 imirkin_: well, some error messed up pgraph state
15:37 imirkin_: and nouveau wasn't able to bring it back to life
15:40 orbea: is there anything I can do to help debug it in case it happens again?
15:41 orbea: seems a rare occurrence at least
15:41 imirkin_: no
15:42 imirkin_: my entirely unsubstantiated guess is that something went wrong on a context switch
15:42 imirkin_: which then in turn messed things up bigtime
15:42 imirkin_: which caused the gpu to hang itself
15:42 imirkin_: and nouveau can't undo that.
15:42 orbea: ah, okay, thanks for the guess :)
15:42 imirkin_: basically OOR_REG means that either we can't count the number of registers used in a shader
15:43 imirkin_: or it means that the setting is out of touch with the actual shader contents
15:43 imirkin_: i guess this could also happen if something were to randomly overwrite shader instructions
19:42 Lyude: oh no, upstream mesa no longer compiles: https://paste.fedoraproject.org/paste/cwdNc2EwUsMZeUaLfM3OKQ
19:42 karolherbst: imirkin: with your patch 'KHR-GL44.robust_buffer_access_behavior.texel_fetch' passes here as well :)
19:43 imirkin_: Lyude: gcc bug?
19:43 karolherbst: Lyude: it does for me
19:43 imirkin_: (or clang?)
19:43 Lyude: imirkin_: that will be a first for me if so, but this is a pretty recent version of gcc: https://paste.fedoraproject.org/paste/lEMUAB195z7-FWpidtiKRA/
19:43 imirkin_: or ... wtf is the util/bitscan.h reference
19:44 imirkin_: i->src(t).mod should be a Modifier
19:44 imirkin_: and Modifier(bla) is a modifier
19:44 imirkin_: assert(i->src(t).mod == Modifier(NV50_IR_MOD_NOT));
19:44 imirkin_: /home/lyudess/Projects/mesa/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp:1253:13: error: ambiguous overload for ‘operator==’ (operand types are ‘nv50_ir::Modifier’ and ‘int’)
19:44 imirkin_: (sorry, those are inverted)
19:45 imirkin_: fundamentally, that is a false statement
19:45 imirkin_: both sides should be nv50_ir::Modifier.
19:45 karolherbst: imirkin_: okay, so in the end we have only 4-5 errors in total for CTS: 1. API issues in several places 2. pipeline_statistics 3. 3d images 4. fp64 5. input components
19:45 imirkin_: oh, the bitscan.h is in reference to the inclusion site of assert.h
19:46 imirkin_: Lyude: anyways, j'accuse gcc.
19:46 tobijk: "make clean"
19:46 karolherbst: and then we are done with 4.4 CTS
19:46 imirkin_: input components?
19:46 imirkin_: oh
19:46 imirkin_: 124 vs 128
19:46 Lyude: tobijk: don't think that could be it but I'll double check, most of my stuff (mesa included) is setup to do out of tree roots
19:46 Lyude: *buildroots
19:46 karolherbst: imirkin_: yes
19:47 karolherbst: 12 tests are varyings related
19:47 karolherbst: ohh and an issue with KHR-GL44.arrays_of_arrays_gl.SubroutineFunctionCalls1
19:48 airlied: is that the eats all my cpu for ages issue?
19:48 karolherbst: no, that thing is fixed now
19:48 karolherbst: I fixed tobijks patch
19:48 karolherbst: imirkin_: https://github.com/karolherbst/mesa/commit/8558417d9ad1838176a5921b40de71ef88898b34
19:48 karolherbst: no regressions with that
19:49 karolherbst: ohh airlied asked
19:49 karolherbst: you have the same colors as imirkin_
19:49 karolherbst: airlied: the test runs for 11 seconds now for me
19:50 tobijk: btw the little preparation patch is missing, karolherbst: i guess it's better to keep that in there then
19:50 karolherbst: which little preparation patch?
19:51 karolherbst: airlied: most of the CTS test fails are due to bugs within mesa glsl and API validation
19:51 tobijk: the unrelated spill little part(change d-- to ++d)
19:51 karolherbst: tobijk: ohh true
19:51 karolherbst: I just wait until you post it on the ML and it gets eliminated through rebase
19:52 tobijk: karolherbst: i did, imirkin_ told me he would push it
19:52 airlied: karolherbst: that's pretty much the standard problem with CTS
19:52 airlied: esp for new tests
19:52 imirkin_: i say lots of things...
19:52 karolherbst: airlied: yeah, but my point was, nouveau is looking super good now for 4.4
19:52 imirkin_: i remember so few of them =/
19:52 tobijk: imirkin_: yeah go review my patch :P
19:53 airlied: karolherbst: I should retest the actual 4.5 CTS suite at some point
19:53 karolherbst: :)
19:53 airlied: once you've landed everything
19:53 karolherbst: you can already use my branch though
19:53 airlied: I'm lazy
19:53 karolherbst: I see :D
19:54 airlied: running nouveau cts is a trivial git update, rerun what I did last time, anything more will require brain power
19:56 karolherbst: those array of arrays shaders
19:58 karolherbst: if somebody wants to check it out, here is an issue somewhere: https://gist.githubusercontent.com/karolherbst/b5c26bbbfb16d170f1b4c86593ef9e26/raw/420340513a9fd0786adf7f3899e12ee43d8bbf86/gistfile1.txt
19:59 tobijk: line 9511, yay ~_~
19:59 karolherbst: just to be complete, here is the glsl: https://gist.githubusercontent.com/karolherbst/bd7d8bad4e1a617a3c0d59bb5220b197/raw/a4c30ac32090bdad035d12453605024d856b8f0d/gistfile1.txt
20:00 karolherbst: so as you might guess, this shaders result is 0.0
20:01 karolherbst: and not 1.0
20:01 karolherbst: or whatever big shader nouveau generates out of this
20:02 karolherbst: or at least I think
20:02 karolherbst: quite sure though
20:04 karolherbst: or is it 1.0 indeed?
20:04 tobijk: 1.0 is what should come out
20:04 tobijk: we likely end with 0.0
20:05 karolherbst: would be interesting what would happen with aggressive loop unrolling
20:05 karolherbst: ohh wait
20:05 karolherbst: that's what we do
20:07 tobijk: printing the iterator state at the end would be nice to know :)
20:08 karolherbst: I think we should be able to optimize most of those sets away
20:10 karolherbst: mhh
20:10 karolherbst: those nops are odd
20:11 Lyude: imirkin_: definitely not a compiler bug, just tried with clang and it does the same thing. I'll take a look and see if I can figure out what's up
20:12 * imirkin_ likes to blame stupid compilers
20:12 imirkin_: coz the one i maintain has so many issues :)
20:12 Lyude: hehehe
20:12 imirkin_: anyways, i don't see how it decides that one of those sides is an int.
20:12 imirkin_: perhaps someone added a #define Modifier() somewhere
20:12 imirkin_: which would be most unfortunate
20:15 imirkin_: perhaps something in c++14 breaks that and they changed what they default to? dunno.
20:15 tobijk: Lyude: which compiler and version is that btw?
20:15 imirkin_: gcc 7.1.1
20:16 tobijk: saw it, yep
20:16 tobijk: ok lets see, i have the mostly same version
20:17 imirkin_: Lyude: tail end of the build log with make V=1?
20:19 tobijk: mhh incremental build works here with 7.1.1
20:23 tobijk: CC nv50/nv50_miptree.lo
20:23 tobijk: CC nv50/nv50_program.lo
20:23 tobijk: CC nv50/nv50_push.lo
20:24 tobijk: nothing here, 7.1.1 is fine for me
20:24 Lyude: tobijk: gcc 7.1.1, clang 4.0.0
20:24 tobijk: gcc7.1.1 clang 4.0.1
20:25 karolherbst: Lyude: could you get us the file generated by cpp?
20:25 Lyude: karolherbst: that's what i'm about to do actually
20:25 Lyude: trying to remember the file prefix you give to make for that
20:28 tobijk: Lyude: if you remember and tell me, we can see if files are comparable :>
20:28 karolherbst: wasn't the flag -E ?
20:28 karolherbst: ohh wait
20:28 karolherbst: different meaning
20:28 karolherbst: oh no
20:28 karolherbst: -E should be fine
20:28 Lyude: yeah, I thought there was a automake target autotools added automatically for that
20:28 imirkin_: .P or something? i forget
20:29 Lyude: bleh, i'll just copy paste the command from V=1 and do it the old boring way
20:29 imirkin_: =]
20:29 imirkin_: aha, .S
20:30 imirkin_: errr, no
20:30 imirkin_: sorry
20:30 karolherbst: -E does the trick already ;)
20:30 imirkin_: or maybe yeah
20:42 Lyude: https://paste.fedoraproject.org/paste/mAaAPt52FxaebLo7278j2A there we go, cpp output
20:47 karolherbst: i->src(t).mod = Modifier(0);
20:48 karolherbst: mhh
20:48 karolherbst: that's the thing after it
20:50 karolherbst: mhh
20:51 karolherbst: Lyude: could you try this: change the "Modifier operator==(const Modifier m) const { return m.bits == bits; }" to "Modifier operator==(const Modifier& m) const { return m.bits == bits; }" in src/gallium/drivers/nouveau/codegen/nv50_ir.h
20:52 karolherbst: the former isn't technically right anyway
20:52 Lyude: that is bizarre, if I ask clang what the type of i->src(t).mod and Modifier(0) on the line above it says "nv50_ir::Modifier" "nv50_ir::Modifier"
20:52 Lyude: and yeah sure
20:52 karolherbst: well
20:52 karolherbst: I am sure there is some super hidden internal c++ magic going on
20:52 karolherbst: and I kind of know what kind of
20:52 imirkin_: it'd be very sad if the cref made any difference
20:53 karolherbst: operator==(const Modifier m) means you have to use the copy constructor once
20:53 karolherbst: and now you have the choice
20:53 karolherbst: convert both to int, or use the copy constructor
20:54 karolherbst: but I don't see why it shall convert it to int to begin with
20:54 Lyude: https://paste.fedoraproject.org/paste/E8SEnB-OPDs76-7a2lZ-fQ
20:54 Lyude: double compiler bug? ;P
20:54 imirkin_: i think it's deciding that Modifier(...) is an int
20:54 imirkin_: which is obviously wrong
20:54 karolherbst: duh...
20:54 karolherbst: Lyude: do you have something like -Ox passed?
20:54 karolherbst: try with -O0
20:55 karolherbst: just to make sure
20:55 Lyude: karolherbst: it's set to my default of -O0
20:55 Lyude: yeah
20:55 karolherbst: imirkin_: I don't see why it should do that
20:55 imirkin_: karolherbst: like i said ... bug ;)
20:56 Lyude: tobijk: can you grab the compiler output for your mesa and post it here so I can compare?
20:56 karolherbst: still
20:56 karolherbst: operator== deals with references anyhow
20:56 karolherbst: normally
20:56 imirkin_: doesn't matter.
20:56 imirkin_: can be ref, can be copy
20:56 Lyude: oh, hm. i wonder what clang's ast says
20:57 imirkin_: only operator++ has the weird semantics
20:57 imirkin_: to distinguish pre and post
20:57 karolherbst: okay, right
20:57 karolherbst: c++ is allowed to opt this constructor away anyhow
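A small sketch of the point that operator== may take its argument either by value or by const reference, assuming a simplified stand-in for nv50_ir::Modifier (the two types below are hypothetical and only carry a bits field). Both forms compile and compare the same way.

```cpp
// Hypothetical stand-ins for nv50_ir::Modifier, reduced to the bits field.
struct ModByValue {
    unsigned bits;
    explicit ModByValue(unsigned b) : bits(b) {}
    // Argument is copied on each call; the comparison itself is unchanged.
    bool operator==(const ModByValue m) const { return m.bits == bits; }
};

struct ModByRef {
    unsigned bits;
    explicit ModByRef(unsigned b) : bits(b) {}
    // Argument is bound by const reference; no copy is needed.
    bool operator==(const ModByRef &m) const { return m.bits == bits; }
};
```

The constructors are marked explicit here only so that a plain int is never silently converted in a comparison; that is a choice of this sketch, not a description of the Mesa source.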
21:01 Lyude: ugh, no clang just filters it out of the ast output
21:01 Lyude: is this building for you guys?
21:03 tobijk: Lyude: if you remind me how to get to it :)
21:04 Lyude: make V=1, use the command it uses for compiling that file with "-E" added to it and the output args set to "-o -"
21:04 Lyude: also wait, I wonder...
21:06 Lyude: that might explain why it doesn't happen for you guys at least, --disable-debug == assert() turns into a no-op
21:06 karolherbst: I have debug enabled
21:06 tobijk: me either
21:07 Lyude: wat, blah
21:11 tobijk: Lyude: which one did you put out the .lo compile or the .o compile? :o
21:12 karolherbst: Lyude: could you give us your compiler invocation?
21:12 Lyude: karolherbst: yep, once this finishes compiling
21:12 Lyude: or finishes trying to
21:15 tobijk: pff, hastebin, why is that file too large for you
21:16 tobijk: Lyude: https://paste.fedoraproject.org/paste/Fm9mZVucqzJ7ZzV3ULSWWQ
21:17 tobijk: whoops, peephole
21:18 karolherbst: ohhh
21:18 karolherbst: I found the bug
21:18 karolherbst: 2873: $p0 mov u32 $r62 0x00000000
21:18 karolherbst: it's always 0x0
21:18 imirkin_: bug in what?
21:18 karolherbst: the CTS test
21:19 imirkin_: what's wrong with 0x0?
21:19 karolherbst: well
21:20 karolherbst: mhh
21:20 karolherbst: no you are right
21:20 karolherbst: 0x0 is fine
21:20 Lyude: karolherbst: https://paste.fedoraproject.org/paste/TZrHaeFscWJ4LnGP0ln4Yg
21:21 tobijk: Lyude: the right one: https://paste.fedoraproject.org/paste/pt4g20IA9LPo-1iFRlZ5bw
21:23 karolherbst: Lyude: okay, the call works on my system
21:23 karolherbst: Lyude: want the preprocessor output?
21:23 Lyude: i should have known it was going to be one of THOSE bugs
21:23 Lyude: karolherbst: sure
21:25 Lyude: huh, TIL you can open raw URLs in vim
21:27 Lyude: tobijk: and was that with debugging on or off?
21:29 tobijk: Lyude: https://paste.fedoraproject.org/paste/lzekkyDWvzRha2DttdakOQ
21:29 karolherbst: Lyude: https://filebin.ca/3Y1bQ9edp39M/nv50_ir_peephole.o
21:31 Lyude: yeah wtf, the cpp output on here is way different. yours doesn't have the weird sizeof() expression
21:32 karolherbst: gcc 6.4.0
21:32 tobijk: hmm let me try with g3 and O0
21:32 karolherbst: tobijk: why not use Lyude's compiler call?
21:34 karolherbst: Lyude: when did you update your system last?
21:34 karolherbst: Lyude: https://bugs.gentoo.org/show_bug.cgi?id=618070 https://bugs.gentoo.org/show_bug.cgi?id=618068
21:35 karolherbst: funny though that both times it was the projects fault
21:36 tobijk: brrrr: Fatal error: can't open a bfd on stdout -
21:36 Lyude: lemme update right now and try
21:37 karolherbst: abs(unsigned int) doesn't seem to work with gcc-7.1 anymore
21:37 karolherbst: fun....
21:44 Lyude: YES, i guess it was a glibc bug
21:45 Lyude: i successfully compiled the peephole
21:47 karolherbst: mhh this looks wrong
21:48 karolherbst: ld u32 $r0 l[0x708]; st u32 # l[x <= 0x700 & 0x8] $r0 ; ld u64 $r0d l[x]; set u8 $p0 neu f64 $r0d 246.000000
21:48 karolherbst: and the last source of set changes with x
21:48 karolherbst: but
21:48 karolherbst: the set always compares the same
21:48 karolherbst: because the value is always loaded from l[0x708]
21:49 karolherbst: and stored into l[x] and loaded again into $r0
21:55 tobijk: karolherbst: your output looks different from the one you posted above? there it was st u32 # l[0x6b8] $r0
21:56 tobijk: i grepped for the 246.0 :>
22:00 karolherbst: tobijk: it's a range
22:00 karolherbst: the block repeats
22:00 karolherbst: but it doesn't make any sense
22:01 tobijk: karolherbst: yup saw it, the one posted is from 255.0 :)
22:07 karolherbst: okay, RA has a bug, big news
22:09 imirkin_: why are we doing abs(uint)?
22:10 karolherbst: imirkin_: we don't, it was just an example of things changed with 7.1
22:10 imirkin_: oh
22:12 karolherbst: imirkin_: did you try to reproduce the Tesla bug by the way?
22:12 imirkin_: not sure how
22:12 imirkin_: but allegedly the issue happens due to new compiler + mesa
22:13 tobijk_: oh where did that go: not %p8735 bra BB:518
22:14 karolherbst: imirkin_: yeah
22:14 karolherbst: imirkin_: it's odd though
22:14 karolherbst: imirkin_: but maybe indeed some follow up issue on Tesla, which didn't get triggered before
22:14 imirkin_: sure, yea
22:20 tobijk_: karolherbst: you fund the problem, i cant see where we set the 1.0 for the actual pass of the test
22:20 tobijk_: *found
22:21 karolherbst: well "found"
22:21 karolherbst: no idea what's wrong with RA, never actually done anything important there
22:21 karolherbst: maybe it's not even a RA bug
22:22 karolherbst: but I think it is
22:22 tobijk_: ah you wrote the line in question while i timed out
22:22 karolherbst: and I wouldn't be surprised if this test passes on gk110
22:39 imirkin_: which test is this?
22:39 imirkin_: mind making a card on trello with your detailed analysis?
22:41 tobijk_: imirkin_: its the array of arrays test
22:56 imirkin_: there's like 10 of those
22:56 imirkin_: anyways, wtvr - you guys can figure it out.
22:56 imirkin_: if you want my help, write it up somewhere
22:58 tobijk_: imirkin_: its karolherbst work, so he has the honor :D, anyway test is KHR-GL44.arrays_of_arrays_gl.SubroutineFunctionCalls1
22:58 tobijk_: fyi
22:58 imirkin_: k