00:37 nyef: Hrm. Well, the AIGLX thing has *some* effect.
00:38 nyef: If I move the nouveau_dri.so file that it uses out of the way before starting X, the difference is very obvious in the logs.
00:38 nyef: And this also implies that I have a tesla-specific lockup scenario.
01:36 imirkin: nyef: moving it out of the way means you don't get GL at all
01:36 imirkin: well, hw-accelerated GL
01:36 imirkin: it's unfortunately needed for both direct and indirect operation
01:36 imirkin: largely due to various historical silliness
01:51 nyef: Problem is, it's the *wrong version*.
01:51 nyef: It's the system copy, not the development copy.
01:53 nyef: Hrm. Do I still get issues if gxine can't find that .so either?
01:57 nyef: ... Yeah, I still get issues under such circumstances.
01:59 nyef: I can drag a Mate Terminal window around with no problems, but if I try to drag gxine around, I get a lockup within seconds.
02:01 nyef: Does mean that it's not gallium at fault, though, and that it's an nv50 issue, not an nvc0 issue.
02:10 nyef: Hrm. Going to have to try and find a Fermi card and a system to wrap around it. Joy.
03:45 rhyskidd: karolherbst: sure that nv140 is from nouveau? it still has the errors with BIT table 'P' i'd seen before
03:46 karolherbst: rhyskidd: you are right, it isn't
03:46 rhyskidd: oob pointers
03:47 rhyskidd: have seen something similar with the vbios of a Tesla V100 off techpowerup, which is almost certainly a "windows"-format dumped vbios
03:47 rhyskidd: (i know it's not windows format per se, but has some other headers around it)
03:47 rhyskidd: not nouveau derived
05:48 orbea: oh cool, one of the recent xorg-server commits seems to have fixed the DRI3 + modesetting issue for me
05:48 orbea: well, at least it doesn't immediately blow up anymore
06:07 karolherbst: orbea: what issue?
06:07 orbea: karolherbst: would hang with startx, I think it had to do with compton, but I never got around to bisecting it
06:08 orbea: actually, it was worse at first, used to hang glxgears (black window, no gears) too even without compton
06:09 orbea: but that was fixed maybe a few weeks ago?
06:10 orbea: started with xserver 1.20.0
15:06 pendingchaos: imirkin: have you tested the phi patch with the traces you mentioned?
15:53 imirkin: pendingchaos: no, but i will right now
15:56 imirkin: pendingchaos: is the cycle estimate purely informational (and potentially to see how various opts do)?
15:57 pendingchaos: yes
15:58 pendingchaos: you have another use in mind?
16:03 nyef: Use them to drive optimization selection?
16:04 imirkin: pendingchaos: well, looks like hearthstone works ok with your patch
16:04 imirkin: and yeah, latency estimates are often used to drive instruction ordering within a BB
16:05 nyef: Is xf86-video-nouveau multi-threaded or single-threaded?
16:05 imirkin: (or rather, works as ok as without - there's still some fail, never tracked it down)
16:05 imirkin: nyef: single
16:06 nyef: Okay, good. I don't have to try to track down thread-interaction issues, at least. Thank you.
16:06 imirkin: everything's serialized by the X server afaik
16:06 imirkin: that code is as close to bug-free as it gets...
16:06 imirkin: there are some known issues with acceleration of certain X primitives that no one ever uses, but that's a separate issue
16:07 imirkin: (trapezoids and whatnot)
16:07 nyef:points out that he's trying to figure out an X lockup triggered by moving around a gxine window that's not actually playing anything.
16:07 imirkin: what makes you think that X locks up?
16:08 imirkin: see if LIBGL_ALWAYS_SOFTWARE=1 gxine still locks things up
16:08 nyef: It still happens even if I rename away all of the nouveau_dri.so files.
16:08 imirkin: also, the vdpau stuff has seriously gone downhill lately
16:08 imirkin: i think there's a kernel bug which makes it just not work
16:08 imirkin: which gpu is this on?
16:09 nyef: Happens on tesla, doesn't happen on kepler.
16:09 imirkin: and on tesla, it's the dma_pusher thing?
16:09 nyef: Not always. Managed to have it happen without any kernel messages once.
16:10 imirkin: ok
16:10 imirkin: well the dma pusher thing is a kernel-level issue
16:10 imirkin: we're not switching channels correctly ... or something
16:10 imirkin: (if we knew what, it'd be fixed already)
16:10 nyef: It's always pointed at the X server, never any other process.
16:11 imirkin: you don't really know that
16:11 imirkin: nouveau reports the process that opened the fd, not the process to which that fd was passed over a domain socket
16:11 nyef: Ahh.
16:12 imirkin: not sure if there's a way of retrieving that
16:12 nyef: I'm trying to update a test machine so that I can try with fermi, but that's likely to take all day.
16:52 karolherbst: imirkin: do you think this has to be ">=" instead of ==? https://github.com/mesa3d/mesa/blob/master/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c#L948
17:09 nyef: That'd depend on if it's a "something new was introduced with this revision" thing or a "there's something quirky about this revision in particular" thing, surely?
17:12 Armada: nyef, that subchannel is used here: https://github.com/mesa3d/mesa/blob/master/src/gallium/drivers/nouveau/nvc0/nvc0_transfer.c#L612
17:13 Armada: It's likely a new feature in that revision and not something quirky. Without changing the first check, current behaviour results in a subchannel being used without an engine
17:14 Armada: bound to the subchannel
17:14 rhyskidd: karolherbst: perhaps skeggsb has a nouveau-derived vbios from the volta (GV100) he's been using for initial bringup?
17:15 karolherbst: rhyskidd: maybe
17:16 nyef: Armada: Thank you. That has just somewhat reduced my level of compound ignorance.
17:24 karolherbst: I have a super silly fix for the last CTS fail and I kind of fear I don't get any fails now...
17:43 karolherbst: ahhhh :(
17:45 karolherbst: imirkin: this patch fixes the packed_depth_stencil.blit.depth32f_stencil8 test: https://github.com/karolherbst/mesa/commit/44222f90597089cf16b9fa3001b4a3445d3bada2
17:45 karolherbst: does it make sense?
17:45 karolherbst: it seems to not help any of those piglit fails
17:45 karolherbst: doing a piglit run now to check for regressions
17:46 karolherbst: but uhm, we kind of pass all CTS tests
17:46 karolherbst: except those which sometimes fail
17:47 karolherbst: Armada: with that changed I get "fifo: PBDMA0: 00000010 [HCE_ILLEGAL_CLASS] ch 3 00000000 0000a0b5"
17:48 karolherbst: on pascal
17:55 Armada: hmm, a0b5 doesn't exist on the switch either, but b0b5 does
17:57 Armada: in that case, the nvc0_transfer check should be changed until more recent copy engines are supported
18:00 Armada: karolherbst, could you try adding another check for greater or equal to NVF0_P2MF_CLASS that sets 0xb0b5 to the subchannel?
18:11 karolherbst: Armada: after the piglit run, yes
18:41 nyef: MCP89, no nouveau_dri.so files to be found. LIBGL_ALWAYS_SOFTWARE=1 gxine... lockup.
18:42 nyef: fifo: DMA_PUSHER - ch 3 [X[3719]] get 0000034b54 put 0000034b78 ib_get 00000273 ib_put 0000029a state 80000040 (err: INVALID_CMD) push 00400040
18:46 nyef: What next, I start gxine again, then use lsof to see if it has any suspicious-looking fds open?
18:47 nyef: Or re-enable nouveau_dri.so and try to replicate the issue using modesetting instead of xf86-video-nouveau?
18:49 karolherbst: imirkin: hum, no regressions/fixes inside piglit with that path
18:49 karolherbst: *patch
18:52 pendingchaos: karolherbst, imirkin: is this a reasonable assumption to make (even with CROSS edges): https://hastebin.com/dogecudabo.diff (in CFGIterator::search)?
18:52 karolherbst: Armada: also illegal class on pascal
18:52 Armada: :/
18:53 Armada: I guess for now just change nvc0_transfer so it doesn't use that subchannel
18:53 imirkin: karolherbst: didn't test piglit, just a handful of traces
18:53 Armada: only use nve4 m2mf transfers on nve4
18:54 karolherbst: imirkin: what are you referring to? I meant the patch I have for the CTS fail
18:55 imirkin: karolherbst: oh. nevermind.
18:55 karolherbst: I am very suspicious regarding that patch
18:55 karolherbst: it looks too trivial
18:55 imirkin: nyef: if you get errors without 3d accel, only xorg is performing accel
18:55 imirkin: nyef: and fbcon, theoretically, but only if you switch to a console
18:56 imirkin: karolherbst: that MIGHT make sense
18:56 karolherbst: yeah, I know
18:57 imirkin: depending on how rast is done
18:57 imirkin: problem is, it'll only affect very large rects
18:57 nyef: imirkin: So, does that mean "not a context switch issue"?
18:57 imirkin: so you have to test it with maximally-sized surfaces
18:57 nyef: I suppose it could mean "context setup isn't quite right"...
18:57 karolherbst: imirkin: mhh
18:57 imirkin: nyef: wellll ... unlikely to be a context switch issue. but the issue is the same as the other "random tesla fail" issues
18:58 imirkin: i.e. DMA_PUSHER gets upset, things go downhill from there
18:58 karolherbst: imirkin: well it affects the CTS tests which do 256x256 surfaces
18:58 imirkin: karolherbst: right
18:58 imirkin: but the reason those numbers are so large
18:58 imirkin: is to deal with very large surfaces
18:58 karolherbst: what is more suspicious is, that it doesn't affect any fail/pass inside piglit
18:59 karolherbst: so I am wondering
18:59 karolherbst: ahh
19:00 karolherbst: imirkin: ohhh I was wrong about piglit,
19:00 karolherbst: the tests which fail do _look_ better
19:00 karolherbst: the test still fails
19:00 karolherbst: but
19:00 karolherbst: at least the window content looks better
19:00 imirkin: :)
19:01 karolherbst: instead of those weirdly scaled textures, they are sized correctly
19:01 imirkin: have a look at the commit logs for that code
19:01 imirkin: it's iterated a few times
19:01 imirkin: perhaps i left some comments a few of those times
19:02 karolherbst: anyway, it seems like with this it should be easier to fix the piglit tests as well, because seems like one bug less inside the path. Let me check the commits
19:03 karolherbst: imirkin: "nvc0: fix blit triangle size to fully cover FB's > 8192x8192" https://github.com/karolherbst/mesa/commit/a651bc027d5ed4150bb5240fc9f46a6ca569f665
19:04 karolherbst: ohh wait
19:04 karolherbst: that doesn't add the shift
19:05 imirkin: yeah. like i said - there have been a few fixes in that area
19:05 imirkin: and i definitely remember leaving it off in a state where some of those ext_framebuffer_multisample tests looked wrong
19:05 imirkin: and differently wrong on nv50 and nvc0
19:05 imirkin: which i wasn't too happy about
19:05 karolherbst: uhh, that code is old
19:07 karolherbst: ohhh I see
19:08 imirkin: https://hastebin.com/raw/osivehizew
19:08 imirkin: the idea of that code is that you draw a single triangle
19:08 imirkin: which goes way outside the bounds of the fb
19:08 imirkin: but the whole fb gets covered, in a single draw. rather than 2 for a quad.
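The single-triangle idea described above can be sketched as follows. This is a minimal illustration and not nouveau's blit code: the helper names `covering_triangle` and `inside` are hypothetical, and the sketch assumes a plain 2D rectangle with its origin at (0, 0), ignoring multisampling.

```python
def covering_triangle(width, height):
    """Three vertices of one triangle that covers the rect [0, width] x [0, height]."""
    # Hypothetical helper: the legs are twice the rect size, so the
    # hypotenuse passes exactly through the far corner (width, height).
    return [(0, 0), (2 * width, 0), (0, 2 * height)]


def inside(tri, px, py):
    """True if (px, py) is inside or on the edge of the triangle."""
    signs = []
    for i in range(3):
        ax, ay = tri[i]
        bx, by = tri[(i + 1) % 3]
        # Edge function: its sign says which side of edge a->b the point is on.
        signs.append((bx - ax) * (py - ay) - (by - ay) * (px - ax))
    return all(s >= 0 for s in signs) or all(s <= 0 for s in signs)


tri = covering_triangle(256, 256)
corners = [(0, 0), (256, 0), (0, 256), (256, 256)]
print(all(inside(tri, x, y) for x, y in corners))  # prints True
```

Two of the three vertices lie well outside the rectangle, which is the "goes way outside the bounds of the fb" part; clipping or scissoring discards the excess, so one draw covers what a quad would need two triangles for.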
19:09 karolherbst: right
19:09 karolherbst: the thing is, do we have to upscale the triangle if the destination is a ms surface?
19:10 karolherbst: or well
19:10 karolherbst: we kind of do
19:11 karolherbst: but we do that for the coordinates
19:11 karolherbst: so x0/1 and y0/1 are increases
19:11 karolherbst: *increased
19:12 karolherbst: but do we also have to increase the vertex
19:15 karolherbst: mhh..
19:17 imirkin: problem is whether the rast is multisampled or not
19:17 imirkin: and how it deals with a MS fb
19:17 imirkin: etc
20:01 pendingchaos: ping on the CFGIterator::search() question?
20:05 imirkin: pendingchaos: sorry, not sure offhand, i'd have to read through a bunch more code
20:06 imirkin: unfortunately the CFG is not quite the way it's supposed to be in practice
20:06 imirkin: i've been too chicken about fixing it
20:06 imirkin: the idea is that edges are categorized based on the MST + extra edges
20:06 imirkin: but ... that's not how they're done in practice
20:06 pendingchaos: "MST"?
20:10 pendingchaos: https://en.wikipedia.org/wiki/Minimum_spanning_tree?
20:13 pendingchaos: not sure how that applies to a CFG though
20:15 HdkR: imirkin: You know how many times I've told people to blit with a single triangle and it has blown their mind that they have never thought of doing that? :P
20:18 imirkin: pendingchaos: all those edge types aren't just randomly named
20:18 imirkin: i think i have a comment about it
20:18 imirkin: but basically tree = part of the MST of the CFG
20:18 imirkin: forward = jump to a descendent in the MST
20:18 imirkin: back = jump to a parent
20:18 imirkin: cross = other
20:19 imirkin: this info is mostly used for layout of the code
20:20 imirkin: i.e. you have a bunch of bb's, which all jump to one another -- how to lay them out in actual code space to avoid unnecessary jumps all over the place
20:20 imirkin: but it's also used for some other matters, like critical edge detection
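For reference, the tree/forward/back/cross naming matches the textbook depth-first classification of graph edges. The sketch below shows that textbook scheme under the assumption that the spanning tree is the one built by the depth-first walk itself; it is not nouveau's codegen, and `classify_edges` and the block names are hypothetical.

```python
def classify_edges(graph, root):
    """Classify every reachable edge as tree, forward, back or cross via DFS."""
    disc = {}      # node -> discovery order
    done = set()   # nodes whose successors have all been visited
    kinds = {}     # (src, dst) -> edge kind

    def visit(u):
        disc[u] = len(disc)
        for v in graph.get(u, []):
            if v not in disc:
                kinds[(u, v)] = "tree"      # first time we reach v
                visit(v)
            elif v not in done:
                kinds[(u, v)] = "back"      # v is an ancestor still on the stack
            elif disc[u] < disc[v]:
                kinds[(u, v)] = "forward"   # v is an already finished descendant
            else:
                kinds[(u, v)] = "cross"     # anything else
        done.add(u)

    visit(root)
    return kinds


# Hypothetical CFG: an if/else diamond inside a loop, plus a shortcut edge.
cfg = {
    "entry": ["then", "else", "join"],
    "then": ["join"],
    "else": ["join"],
    "join": ["entry"],
}
for edge, kind in classify_edges(cfg, "entry").items():
    print(edge, kind)
```

The result depends on the order in which successors are visited: here `then -> join` is a tree edge and `else -> join` is a cross edge, and swapping the two successors of `entry` swaps those labels.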
20:27 pqatsi: Hello folks! I'm still fighting with the F28 environment on my Inspiron 7000. The issue now is that I can't get past (in the best case) the login screen, which freezes permanently. With a live cd, I got the result of journalctl --boot=-1 with a chroot: https://pastebin.com/212PhD7F
20:27 pqatsi: What can I do to have at least a usable system?
20:30 nyef: Downgrade your video card?
20:32 imirkin: pqatsi: boot with nouveau.modeset=0
20:55 karolherbst: imirkin: inside codegen the edges are rather categorized by the type of "jump" creating that edge though.
20:57 karolherbst: if-then/if-else: Edge::TREE, endif: Edge::FORWARD, loops: Edge::TREE, loop-end: Edge::BACK, break: Edge::CROSS, continue: Edge::BACK. There is also that special case for RET, but... that doesn't really matter
20:58 karolherbst: normally you categorize the edges based on the iteration path you choose through the CFG, but we don't do that
20:58 karolherbst: so sometimes edges could end up as a different one, depending on which path you take
21:00 karolherbst: I am sure that by accident all the tree edges could be equal to the MST, but I highly doubt that this is always the case
21:00 karolherbst: also, we create circles
21:00 karolherbst: which by definition can't be a MST
21:04 karolherbst: uhm, maybe we don't create circles, but at least we could end up with non-connected tree edges
21:12 ReinUsesLisp: hello, what does post-fermi SEL instruction do?
21:18 karolherbst: ReinUsesLisp: select value based on the result of the compare
21:18 karolherbst: ReinUsesLisp: src0 compareOp 0 ? src1 : src2
21:19 karolherbst: uhm
21:19 karolherbst: src2 compareOp 0 ? src0 : src1 actually
21:19 imirkin: karolherbst: it's *supposed* to be based on the MST though
21:19 karolherbst: so a SEL.LT a b c d writes either b or c into a depending on whether d is less than 0 or not
21:20 imirkin: karolherbst: that's not a thing
21:20 imirkin: there's a SELP
21:20 imirkin: and a FCMP/ICMP/etc
21:20 karolherbst: imirkin: sure, but the if-then and the else-then block connect through a Tree::Forward edge to the successor
21:20 imirkin: which is wrong.
21:20 karolherbst: imirkin: CMP is SET, no?
21:20 imirkin: which is what i was trying to explain. it should be based on the MST, but isn't.
21:20 karolherbst: right
21:21 ReinUsesLisp: so `SEL R18, RZ, c[0x1][0x0], !P0` would translate to `R18 = !P0 ? RZ : c[0][1]`
21:21 imirkin: OP_SET -> *CMP. i assumed the question was about the SEL name in nvdisasm
21:21 karolherbst: ReinUsesLisp: mhh, that is actually a SELP
21:21 imirkin: ReinUsesLisp: c1[0], but yeah
21:21 karolherbst: imirkin: ohh right, I might have got it wrong with SLCT vs SELP and their names in nvdisasm
21:21 ReinUsesLisp: oh, yeah, mistyped
21:22 karolherbst: but yeah, slct and selp are basically the same (in our nouveau terms)
21:22 karolherbst: just selp already gets a boolean input
21:22 imirkin: karolherbst: errrr, OP_SET -> *SET. OP_SLCT -> *CMP
21:22 karolherbst: imirkin: k
21:22 imirkin: SELP uses a predicate
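The predicate-select semantics confirmed above (`R18 = !P0 ? RZ : c1[0]`) can be modelled in a few lines. This is only an illustrative sketch: the function name `selp` mirrors the nouveau term, while the constant buffer contents and the value of `P0` are made up for the example.

```python
RZ = 0  # the zero register always reads as 0


def selp(src0, src1, pred):
    """Return src0 when the predicate is true, otherwise src1."""
    return src0 if pred else src1


c1 = [42]   # hypothetical contents of constant buffer 1
P0 = True   # hypothetical predicate value

# SEL R18, RZ, c[0x1][0x0], !P0
R18 = selp(RZ, c1[0], not P0)
print(R18)  # prints 42
```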
21:22 karolherbst: naming is always confusing
21:22 imirkin: such is life.
21:23 ReinUsesLisp: about SSY and SYNC calls, those should be handled by GLSL compiler, right?
21:23 imirkin: they're added in by the compiler, yes
21:23 imirkin: not by the glsl compiler, but by layers further down.
21:23 imirkin: (glsl compiler's job is to parse the glsl and produce an IR to be consumed further on)
21:24 ReinUsesLisp: yea, I was thinking from IR to GLSL terms
21:24 ReinUsesLisp: ok, thanks!
21:25 imirkin: may i ask why you're asking?
21:25 ReinUsesLisp: Nintendo Switch emulation
21:25 imirkin: ah
21:25 HdkR: imirkin: I hear it's the only reason why people care about Maxwell these days
21:25 ReinUsesLisp: >.>
21:25 imirkin: yeah, you have to keep track of the most recent SSY, and convert the SYNC into a jump
21:26 imirkin: this will happen with if/else for the most part
21:26 imirkin: internally it allows the hw to know when all lanes are "SIMD" again
21:26 ReinUsesLisp: oh, so it can be ignored
21:26 ReinUsesLisp: can't*
21:27 imirkin: SYNC is a jump, so you can't exactly skip it
21:27 HdkR: Don't forget loops that have unstructured control flow inside of it :P
21:27 imirkin: if/else -> SSY; @P0 BRA else; if-code; SYNC; else-code; SYNC
21:28 imirkin: the SSY has a pointer to after the else code
21:28 imirkin: we call it "JOINAT" and "JOIN"
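The "keep track of the most recent SSY, and convert the SYNC into a jump" advice can be sketched for the simple if/else shape shown above. This is an assumption-laden illustration, not an emulator's or nouveau's actual pass: the tuple-based instruction format and the name `lower_sync` are hypothetical, and a linear scan like this does not handle nested or unstructured control flow.

```python
def lower_sync(program):
    """Rewrite SYNC into a plain branch to the most recent SSY target.

    program is a list of (opcode, argument) tuples in program order.
    """
    out = []
    join_target = None
    for op, arg in program:
        if op == "SSY":
            # Remember where the lanes reconverge; keep a NOP so that
            # instruction indices stay unchanged.
            join_target = arg
            out.append(("NOP", None))
        elif op == "SYNC":
            if join_target is None:
                raise ValueError("SYNC without a preceding SSY")
            out.append(("BRA", join_target))
        else:
            out.append((op, arg))
    return out


# if/else -> SSY; @P0 BRA else; if-code; SYNC; else-code; SYNC
shader = [
    ("SSY", "endif"),
    ("BRA_IF_P0", "else"),
    ("IF_CODE", None),
    ("SYNC", None),
    ("ELSE_CODE", None),
    ("SYNC", None),
]
for insn in lower_sync(shader):
    print(insn)
```

Both SYNCs become a branch to the same label, which matches the statement that the SSY points to just after the else code.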
21:29 imirkin: and right. there are various structures with a for loop as well
21:29 imirkin: i just wanted to make a simple example :)
21:29 imirkin: but usually one would use a PBRK for that
21:29 HdkR: Or a combination of the two
21:29 imirkin: i.e. PBRK + BRK when exiting the loop
21:30 imirkin: and if there's if/else inside the loop, then SSY + SYNC
21:30 HdkR: brainmelting
21:30 imirkin: but someone who knew what they were doing wouldn't necessarily be bound by such simplicity :)
21:30 skeggsb: and fortunately gone in volta :P
21:30 imirkin: moral of the story - don't let HdkR write games
21:30 HdkR: er uh
21:30 HdkR:hides stockpile of games
21:31 HdkR: skeggsb: You get new fun toys in Volta though ;)
21:31 imirkin: skeggsb: yeah, that's definitely nice.
21:32 HdkR: <3 Volta's threading model
21:35 karolherbst: oh no :( more CTS fails
21:36 karolherbst: some robustness stuff
21:43 karolherbst: imirkin: apparently we fail some robust_buffer_access_behavior tests as well
21:57 karolherbst: but maybe that's just prime related..
22:24 karolherbst: mhh, can somebody check if GLX_ARB_create_context_robustness is exposed as a GLX extension inside glxinfo? (not server/client)
22:25 karolherbst: but I am sure it doesn't get reported due to prime
22:26 imirkin: for me it's reported on client but not server
22:26 imirkin: dunno if we need to do something or not
22:26 imirkin: might need newer xorg
22:26 karolherbst: I guess so
22:26 karolherbst: well, my X runs on intel
22:27 karolherbst: let me check with a dedicated nouveau X
22:29 karolherbst: ....
22:29 karolherbst: right
22:30 karolherbst: X doesn't start on nouveau here, because the GPU doesn't report any displays
22:30 karolherbst: well at least the modesetting ddx isn't happy
22:30 karolherbst: nouveau ddx just crashes
22:34 karolherbst: also my X is like 1.20 or so
22:34 karolherbst: uhh 1.19.6 actually
22:34 karolherbst: imirkin: well, it works perfectly with intel, just prime offloaded nouveau not
22:35 imirkin: ok
22:36 imirkin: i have not investigated.