01:23 mew007: hello all
01:23 mew007: May I talk to Dr. Martin Peres?
01:24 mew007: skeggsb_: hello
02:14 mew007: are you alive? or you all hangovered to smithereens?
02:15 mew007: ducking alcoholics...
02:52 mew007: ok. I'll be back when you are sober
04:15 RSpliet: imirkin: do you happen to know of a way to generate 4-byte shader instructions on nv50 easily?
04:15 RSpliet: (calim: ^)
06:07 pmoreau: imirkin: For which Tesla cards do you want piglit results?
09:43 imirkin: RSpliet: nv50_ir_emit_nv50.cpp should auto-emit them... not sure what your question is...
10:17 RSpliet: imirkin: I want to hand craft them for testing purposes
10:17 imirkin: use envyas?
10:17 RSpliet: and then upload them?
10:18 imirkin: no clue -- what do you mean by "hand craft"? and what sort of testing purposes?
10:18 RSpliet: thing is, I found the sat modifier on the 8-byte mul opcode for NV50, and wanted to see if I can find it for the 4-byte opcodes as well
10:18 imirkin: ok... so find a piglit with a shader that would trigger that optimization
10:18 RSpliet: the only way to test, is to have a shader that generates a 4-byte opcode
10:18 imirkin: and then modify the nv50_ir_emit_nv50.cpp code to generate 4-byte opcodes for that situation
10:19 RSpliet: yeah, it sounded easier to create a shader rather than look for it
10:19 imirkin: CodeEmitterNV50::getMinEncodingSize
10:19 RSpliet: mmhmm, but somehow if the scheduler decides that there's no dual issue possible, it'll generate 8-byte opcodes anyway
10:19 RSpliet: or so it seems
10:20 imirkin: dual issue?
10:20 imirkin: that's not a thing on nv50 iirc
10:21 imirkin: can you pastebin the tgsi?
10:22 imirkin: note that if one of the args is an immediate, that forces encsize to 8
10:22 imirkin: or if it has an exit modifier
10:23 RSpliet: I don't have TGSI right now, but my attempts probably failed because I used immediates
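The two conditions imirkin lists above can be sketched as a toy check. This is a minimal sketch only: `Insn` and `min_encoding_size` are hypothetical names, not Mesa types, and the real CodeEmitterNV50::getMinEncodingSize presumably checks more than the two cases mentioned in the chat.

```python
from dataclasses import dataclass, field


@dataclass
class Insn:
    """Hypothetical stand-in for an IR instruction (not a Mesa type)."""
    op: str
    src_files: list = field(default_factory=list)  # e.g. ["gpr", "imm"]
    has_exit: bool = False  # instruction carries an exit modifier


def min_encoding_size(insn: Insn) -> int:
    """Return 8 when either condition from the chat applies, else 4.

    Assumption: only these two conditions are modelled; the real
    function may force the long encoding in other cases as well.
    """
    if "imm" in insn.src_files:
        return 8  # an immediate argument forces the 8-byte encoding
    if insn.has_exit:
        return 8  # so does an exit modifier
    return 4


print(min_encoding_size(Insn("mul", ["gpr", "gpr"])))
print(min_encoding_size(Insn("mul", ["gpr", "imm"])))
```

Under this model, a test shader meant to produce a 4-byte mul would need register-only sources and must not be the instruction carrying the exit.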
12:35 pmoreau: imirkin: max-texture-size caused a TRAP_M2MF (trapped read: reason PAGE_NOT_PRESENT) on my G80 it seems.
12:35 pmoreau: I'll restart the computer and try it again to confirm
12:37 imirkin: yeah, max-texture-size can cause some issues sometimes too
12:39 pmoreau: Apparently it's GL_TEXTURE_2D, internal format = GL_RGBA32F
12:39 pmoreau: Ok, I'll remove that test from the run
13:32 pmoreau: imirkin: Results sent (G80 + G86)
13:33 imirkin: pmoreau: received, thanks!
13:33 imirkin: hopefully these don't have the texelfetch errors
13:34 pmoreau: Let's hope it doesn't
13:35 imirkin: still does
13:35 imirkin: i dunno wtf you're doing
13:36 imirkin: but you consistently get texelFetch to fail
13:36 imirkin: also
13:36 imirkin: [ 5560.852179] systemd-journald[171]: /dev/kmsg buffer overrun, some messages lost.
13:36 imirkin: welcome to systemd
13:36 imirkin: if you can somehow destroy it, that'd be great
13:36 pmoreau: :D
13:36 imirkin: /dev/kmsg buffer overrun, so let me just log something to kmsg real quick...
13:36 pmoreau: Meh… Is there some special configuration needed for texelFetch?
13:36 imirkin: "Keyboard Error. Press F1 to continue."
13:37 imirkin: i have no idea how you're achieving the texelFetch fails
13:37 pmoreau: Eh eh! I didn't believe the keyboard error message was real, until I experienced it
13:37 imirkin: and this is presumably not on your macbook
13:37 imirkin: yeah, it's easy to get if you mash the keys during POST
13:37 pmoreau: Right, it is on my desktop computer
13:38 tibbs: Just boot with log_buf_len cranked way up.
13:38 imirkin: all the dmesg-warn tests are just full of that junk
13:38 pmoreau: I probably forgot to modify the config file on the desktop; I have it at max (21) on my laptop
13:38 imirkin: like 1000x that buffer overrun message
13:39 pmoreau: I should probably remove Nouveau debug messages too :D
13:39 imirkin: the doubly-odd thing is that it only happens with texelFetchOffset
13:39 imirkin: nah, those get logged at debug
13:39 imirkin: so piglit doesn't pick them up
13:39 imirkin: but if you could somehow destroy systemd-journald, that'd be great
13:39 RSpliet: pmoreau: but you can just hot-plug a PS/2 keyboard and often it works just fine
13:40 pmoreau: RSpliet: I don't have any of those :/
13:40 RSpliet: maybe USB too
13:40 RSpliet: funny
13:40 RSpliet: I only have PS/2 and Bluetooth keyboards
13:40 RSpliet: anyway, off-topic, sry
13:40 pmoreau: :)
13:41 imirkin: RSpliet: btw, if fmul short encoding has a sat modifier, most likely that fmul long encoding does too
13:41 imirkin: RSpliet: you can play with it with nvdisasm... sort of. it's really pretty shitty for SM10
13:45 RSpliet: imirkin: I'm certain the long encoding has it
13:46 RSpliet: in fact, I would've been surprised if it didn't, because most likely fmul, fadd and fmad all use the same multiply-addition hardware unit
13:47 imirkin: RSpliet: agreed... but the emit code doesn't handle it. double-check envydis/nv50.c -- the modifiers may already be there
13:47 RSpliet: imirkin: I already added the emit bit for the long encoding :)
13:47 imirkin: RSpliet: it was recently pointed out that the neg modifiers were missing on min/max
13:47 RSpliet: well, in mesa that is
13:47 RSpliet: it's sitting in my git-tree
13:47 RSpliet: but as long as I don't find the encoding for the IMM format, can't push that patch
13:48 RSpliet: not sure how to force an IMM though, the nv50 compiler seems to prefer using a ld from c[] (if memory serves me right)
13:49 imirkin: uhh
13:50 imirkin: can you provide examples? i'm not sure what you're talking about
13:50 imirkin: IMM is immediate... c[] is constbuf...
13:50 RSpliet: I know
13:50 imirkin: they are diff register files and highly non-interchangeable
13:50 imirkin: the nv50 compiler starts by putting all immediates into their own registers
13:51 imirkin: and then copy-prop will move them directly into instructions if the instructions allow it
13:51 RSpliet: highly non-interchangeable on the ISA level, but functionally I guess it's a "decision"
13:51 imirkin: if you run with NV50_PROG_OPTIMIZE=0 that should disable all that
13:51 imirkin: eh. not a decision that nouveau/codegen makes
13:51 imirkin: it has no ability to "modify" the consts
13:52 imirkin: a theoretical compiler might, but not this one
13:52 RSpliet: okay
13:52 * RSpliet takes mental notes
13:52 RSpliet: my head is in different material at the moment, so sorry for sounding a bit incoherent every now and then
13:53 RSpliet: (although there seems to be understanding :))
13:54 imirkin: no worries. this stuff took me a while to work out. i think i have a decent handle on it now though.
13:55 RSpliet: sure, I have to give credit to calim for writing this thing... it's mostly well readable once you add "peephole" to the dictionary with a description of "that place where optimisations are done" :D
13:56 imirkin: yeah, well peephole opts are a special type of optimizations
13:56 imirkin: however it seems to have become the repository of all the opts
13:56 * RSpliet wonders where that name came from
13:57 imirkin: a peephole is a hole in a door so you can see who's ringing the bell
13:57 imirkin: (in regular english)
13:57 tibbs: Back when I took compiler theory, I was told the name arose because you look through a very small set of instructions to find sequences to optimize.
13:57 imirkin: i think the idea is that it's optimizations that are done based on very little information in a greedy manner
13:57 imirkin: or something like that
13:58 imirkin: http://en.wikipedia.org/wiki/Peephole_optimization
13:59 RSpliet: right right
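As a toy illustration of the "very small set of instructions" idea tibbs describes, here is a minimal sketch of a two-instruction peephole pass. The tuple-based instruction format and the `peephole` function are made up for this example and have nothing to do with the nv50 codegen data structures.

```python
def peephole(insns):
    """Drop a `mov a, b` that directly follows `mov b, a`.

    Instructions are hypothetical tuples: (op, dst, src...). The pass
    only ever looks at the current instruction and the previous one,
    which is the "small window" that gives peephole opts their name.
    """
    out = []
    for insn in insns:
        if out and insn[0] == "mov" and out[-1][0] == "mov":
            _, dst, src = insn
            _, prev_dst, prev_src = out[-1]
            if dst == prev_src and src == prev_dst:
                # Copying the value straight back is a no-op.
                continue
        out.append(insn)
    return out


program = [
    ("mov", "r1", "r0"),
    ("mov", "r0", "r1"),  # redundant copy-back
    ("mul", "r2", "r0", "r1"),
]
for insn in peephole(program):
    print(insn)
```

The pass never looks at the program as a whole, which matches imirkin's description of optimizations done on very little information in a greedy manner.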
14:00 imirkin: pmoreau: ok... so what insanity are you *consistently* doing to get texelFetchOffset to fail
14:00 imirkin: pmoreau: something that's both on your desktop and laptop
14:00 pmoreau: imirkin: Launching piglit I gues
14:01 pmoreau: *guess
14:01 imirkin: but no one else's
14:01 imirkin: funny compositor?
14:01 imirkin: do you secretly go in and modify the code to ignore offsets?
14:01 pmoreau: I don't think I have a compositor running, just using Awesome
14:01 imirkin: aha
14:02 imirkin: and awesome is a tiling wm?
14:02 pmoreau: Right
14:02 imirkin: which will futz with window sizes
14:02 imirkin: can you run
14:02 imirkin: bin/texelFetch offset 140 fs sampler2DRect -auto -fbo
14:02 pmoreau: Sure
14:02 imirkin: oh, but it's fbo, and auto... so no window...
14:03 pmoreau: Then which command?
14:03 imirkin: no, still run that
14:03 imirkin: let me know if it fails
14:03 imirkin: if it doesn't then i'll be _really_ surprised
14:06 pmoreau: It fails due to some syntax errors
14:14 pmoreau: imirkin: Ok, it does fail. :)
14:14 imirkin: pmoreau: can you run it without -auto -fbo
14:14 imirkin: and take a screenshot
14:14 pmoreau: Probably :D
14:19 pmoreau: imirkin: http://i.imgur.com/LxxR4Cv.jpg
14:20 imirkin: uhhh
14:20 imirkin: oh, that's ~what it's supposed to look like
14:21 imirkin: could you take an actual screenshot though? e.g. use xwd?
14:21 pmoreau: Yep
14:21 imirkin: wait, WTF!
14:21 imirkin: it fails for me too now
14:21 imirkin: it works with my build
14:22 imirkin: but not with my system 10.3.5 build
14:22 pmoreau: ;)
14:22 pmoreau: So, what are you running that others don't? :p
14:23 imirkin: well i'm also on nvc0
14:23 pmoreau: http://i.imgur.com/JVbGNww.png
14:25 imirkin: let me try to futz with build options
14:27 pmoreau: Ok
14:27 pmoreau: (I used the one from the "Install Nouveau" page from the Wiki for Mesa)
14:28 pmoreau: Btw, there are some options that are listed twice, is it something necessary?
14:29 imirkin: no
14:34 pmoreau: Fixed
14:35 imirkin: well THATS annoying
14:35 imirkin: --enable-debug appears to be the flag that fixes it
14:35 imirkin: that's very annoying for... you know... debugging
14:36 pmoreau: :D
14:37 pmoreau: I happened to fix some segfault errors in some library by adding the -g flag to the compile line
14:38 pmoreau: Or sometimes the program starts working when you launch it with gdb --'
14:39 pmoreau: skeggsb_: ping http://lists.freedesktop.org/archives/nouveau/2014-December/019380.html
14:43 RSpliet: imirkin: interesting!
14:43 RSpliet: 0: MUL TEMP[0].x, IMM[0].xxxx, CONST[0].xxxx
14:43 RSpliet: turns into
14:44 RSpliet: EMIT: mov u32 $r0 0x40400000 (8)
14:44 RSpliet: EMIT: mul f32 $r0 $r0 c0[0x0] (8)
14:44 imirkin: why is that surprising?
14:44 imirkin: i don't think you can have a const reference on the first src
14:44 imirkin: (check the target though...)
14:45 RSpliet: I'm thinking semi-out-loud right now
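For reference, the constant in the emitted `mov u32 $r0 0x40400000` is the raw bit pattern of a 32-bit float. A minimal sketch for decoding such constants, using a hypothetical helper name `u32_to_f32`:

```python
import struct


def u32_to_f32(bits: int) -> float:
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


print(u32_to_f32(0x40400000))  # prints 3.0
```

So the immediate moved into $r0 is 3.0, which suggests that was the value of IMM[0].x in the TGSI above.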
14:46 imirkin: ok. WTF
14:46 imirkin: in the working case:
14:46 imirkin: 00000018: 1c00dde2 1800001e mov b32 $r3 0x787
14:46 imirkin: in the failing case
14:46 imirkin: 00000010: 0000dde2 18000000 mov b32 $r3 0x0
14:46 imirkin: i must have messed something up BIGTIME when i added ARB_gs5 support...
14:47 tobijk: ouch :/
15:06 imirkin: found it.
15:06 imirkin: that was annoying.
15:08 tobijk: that was fast :>
15:15 imirkin: pmoreau: http://lists.freedesktop.org/archives/nouveau/2015-January/019636.html
15:22 RSpliet: imirkin: thanks for the hints, got all those test cases figured out :)
15:23 imirkin: RSpliet: note that it will also try to swap the order of mul (and add/etc) arguments to make it possible to put a constbuf/immediate into the second src
15:24 RSpliet: imirkin: I'd expected it to yes :)
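The operand swap imirkin mentions can be pictured roughly as follows. This is a minimal sketch under stated assumptions: the `(file, value)` operand format, the `COMMUTATIVE` set, and `canonicalize` are all hypothetical, and the actual codegen pass may decide differently.

```python
# Assumption: an illustrative set of commutative ops; the chat only
# names mul and "add/etc".
COMMUTATIVE = {"mul", "add"}


def canonicalize(op, src0, src1):
    """Move a constbuf/immediate operand into the second source slot.

    Operands are hypothetical (file, value) pairs with file one of
    "gpr", "const", "imm". The swap only happens when it is safe
    (commutative op) and useful (src1 is a plain register).
    """
    if op in COMMUTATIVE and src0[0] != "gpr" and src1[0] == "gpr":
        return op, src1, src0
    return op, src0, src1


print(canonicalize("mul", ("const", 0x0), ("gpr", 0)))
print(canonicalize("mul", ("imm", 0x40400000), ("const", 0x0)))
```

In the second call neither operand is a register, so swapping cannot help. That corresponds to the MUL IMM, CONST case earlier in the log, where the immediate was first moved into $r0.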
15:25 imirkin: RSpliet: can you add the relevant changes to envydis/g80.c ?
15:25 imirkin: that file is ~impossible to understand first time 'round, so let me know if you have questions
15:25 RSpliet: I'll erm... take a look to see if I understand the format :)
15:25 imirkin: s/if/when/
15:25 imirkin: lots of macro magic
15:25 imirkin: lots of cleverness
15:26 imirkin: makes a lot more sense once you "get" it
15:26 RSpliet: oh, I did only test this on NVA8
15:27 RSpliet: which, admittedly, is bad practice in a lot of cases
15:27 imirkin: it's unlikely something like that would have been added at that late stage
15:27 imirkin: but perhaps get pmoreau to check it out on the G80
15:27 RSpliet: I reckoned, but still :)
15:29 imirkin: how sure are you about the 20 for the encsize == 8?
15:29 RSpliet: really sure
15:29 imirkin: k
15:32 imirkin: RSpliet: as long as you're fixing things, the modifiers on OP_ADD should prolly be the same as on OP_SUB...
15:34 RSpliet: well that sounds somewhat sensible... I'll keep that in the back of my head
15:35 imirkin: just missing the sat enable
15:36 RSpliet: will it emit an ADD with negative SRC1?
15:37 RSpliet: it seems to :)
15:38 imirkin: yeah... it flips the neg modifier on src1 at emit time
15:39 imirkin: RSpliet, pmoreau: can one of you test out my patch http://lists.freedesktop.org/archives/nouveau/2015-January/019636.html to make sure it fixes things on nv50 (e.g. bin/texelFetch offset 140 fs sampler2DRect -fbo -auto) for release builds
15:41 imirkin: RSpliet: btw, if you want a non-compiler related thing to fix on that nva8, there's some transform feedback stuff (like drawing from a saved tfb, probably others)
15:42 RSpliet: imirkin: but the compiler is that one thing I _do_ understand sort of :p
15:43 imirkin: RSpliet: hehe, well... that was one of the areas i started in too
16:04 RSpliet: imirkin: you can add yours truly as Tested-by ;-)
16:07 imirkin: RSpliet: for the offsets thing? thanks
16:07 RSpliet: npo
16:10 imirkin: npo being "np" or "nope"?
16:12 RSpliet: heh, No PrOblem
16:14 imirkin: :)
16:14 RSpliet: apparently also means Natalie Portman Obsessives...
22:14 imirkin: RSpliet and pmoreau: if either of you get a chance, give this a shot on nva0+ (nvac/nva8 both qualify) -- https://github.com/imirkin/mesa/commit/c01bb891221e3cd3e030fc33042fd32a80a19cc4 -- and run bin/arb_transform_feedback2-* that'd be cool.
22:16 imirkin: let me know if there's any change compared to baseline
23:29 pmoreau: imirkin: I'll give them all a try this evening. :)