09:01karolherbst: imirkin: fun fact: the 4.6 and 4.5 mustpass files are equal
09:02karolherbst: which basically means, that we are quite close to 4.6 conformance as well (just the spirv sutff has to be finished)
09:17HdkR: karolherbst: How many spir-v tests are there?
09:59karolherbst: HdkR: 12 or something
10:01HdkR: That seems...lacking
10:09karolherbst: well, write more tests :p
10:10HdkR: I currently am, just not for Khronos conformance :P
12:26karolherbst: imirkin: do you think it would be a problem to disable rgba4 until we fix gallium or nvc0 to handle the situation correctly?
12:27karolherbst: the rgba4 KHR-GL45.packed_pixels.* tests also fail
12:28karolherbst: "FBO is not complete but expected complete with sizedFormat: GL_RGBA4"
14:44imirkin: karolherbst: let's revisit when that's the last thing
15:30Wolf480pl: Hello. I've been using DRI_PRIME with nouveau and Intel, and after last Xorg update, it stopped working.
15:30Wolf480pl: I thought maybe it's xf86-video-intel's fault, so after downgrading Xorg and confirming it works as it used to, I tried uninstalling xf86-video-intel and using the modesetting driver instead.
15:30karolherbst: imirkin: it is
15:31karolherbst: imirkin: (ignoring the random fails)
15:31karolherbst: " Failed: 0/7774 (0.0%)" :)
15:31karolherbst: well, that is with disabled rgba4
15:32Wolf480pl: If I startx without any configuration, it'll use modesetting for the intel card, and nouveau for the nvidia card, but if I set any options for modesetting (AccelMethod glamor, etc), then suddenly it uses modesetting for the nvidia card too. Can I do something to force Xorg to use nouveau?
15:32karolherbst: Wolf480pl: weird
15:32imirkin: karolherbst: seriously? for 4.5?
15:32karolherbst: imirkin: yes
15:32imirkin: karolherbst: wasn't there ... other stuff? or is this on top of a bunch of your patches?
15:32karolherbst: this is with my patches
15:32karolherbst: well sure, we can focus on upstreaming those first
15:33karolherbst: and then think about rgba4
15:33imirkin: Wolf480pl: as a secondary gpu, it really doesn't matter what DDX it runs with
15:33karolherbst: Wolf480pl: use dri3, then you don't need a ddx for nouveau
15:33karolherbst: and it seems to work better in avg anyway
15:33karolherbst: no idea if intel or modesetting actually enable dri3 by default now
15:33imirkin: and ... it doesn't matter at all, since no accel happens on the secondary gpu as far as the X server is concerned
15:34karolherbst: well, with dri2 offloading you also need to setup that xrandr stuff
15:34karolherbst: desktops seems to take care of that these days
15:34imirkin: sure, but ... doesn't matter what ddx you use
15:35karolherbst: imirkin: I plan to send out a series with all the patches anyway, then we can discuss via email
15:35imirkin: either modesetting or nouveau will work equally well in that case
15:35imirkin: karolherbst: sure
15:38karolherbst: imirkin: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c#n773
15:39imirkin: i see.
15:39karolherbst: and that array is accessed with the returned value from nv50_blit_select_mode
15:39imirkin: you are right.
15:39imirkin: your suggestion's good then.
15:39imirkin: much cleaner.
15:40imirkin: R-b me
15:43karolherbst: over the next days I plan to do a CTS run on kepler1/kepler2 hw as well and see what remaining issues we have there
15:43karolherbst: allthough I am quite sure that the 3d image fix will fix 3d images there as well
15:44karolherbst: because I remember running into the same issues as I did on pascal
15:49imirkin: what was your fix? disable tiling?
16:04pendingchaos_: imirkin: any thoughts on this: https://hastebin.com/amobobudiy.diff? (I think it has to be before validation of user constbufs btw)
16:06pendingchaos_: the blob seems to put stuff like that when updating the constant buffer bindings btw but I can't figure out under what conditions
16:08imirkin: ok, so ... this code used to be different back in the day
16:08imirkin: if you look now, nvc0_cb_bo_push does all of that, except for the BIND
16:10imirkin: now, when you're binding a new ubo, it'll issue a MEM_BARRIER to flush things out, because who knows what (valid) data is in that buffer
16:10imirkin: however here we always upload immediately, so any stale data is effectively invalidated
16:12imirkin: so what does SERIALIZE really do ... it waits for all previously-dispatched work to complete
16:13imirkin: now, cb data updates are normally processed in such a way as not to affect current draws
16:14imirkin: and presumably bindings as well, since we don't serialize when switching ubo's (oh, but we do stick in a MEM_BARRIER)
16:15imirkin: pendingchaos_: what happens if you just mark nvc0->cb_dirty = 1 there?
16:15pendingchaos_: fixes most artifacts, but some still remain (at least with Valley)
16:21imirkin: well, i don't have a particularly insightful answer
16:21imirkin: if you find it's necessary there, then it must be necessary in other situations as well (just that might not be as trivially visually obvious)
16:21imirkin: i do wonder if it's a maxwell+ issue, since those artifacts only appear there
16:22imirkin: i'd recommend sticking in cb_dirty=1, and then for maxwell+, sticking in a SERIALIZE in the nvc0_vbo logic that deals with cb_dirty=1
16:23imirkin: that means other gens gain a MEM_BARRIER when the uniform size changes, but ... i'm not overly concerned about that
16:25pendingchaos_: cb_dirty=1 when "nvc0->state.uniform_buffer_bound[s] < size" or when nvc0_constbufs_validate() is called?
16:25imirkin: cb_dirty when the size changes
16:26imirkin: otherwise we'd be in *serious* trouble
16:39karolherbst: imirkin, pendingchaos_: could this be a bit related to that? https://trello.com/c/Kgv3gKBf/26-khr-gl45vertexattribbindingadvanced-iterations-fails-in-full-run
16:39karolherbst: I don't really know how those gl things map to whatever we do on the hardware
16:45karolherbst: imirkin: ohh btw, I have a GPU which causes me "DRM: EVO timeout"s
16:45imirkin: that's something for skeggsb -- must be a combination of gpu + display
16:46pendingchaos_: is the trace included in the log complete? it doesn't look like it uses uniforms at all
16:46karolherbst: imirkin: ahh
16:46karolherbst: imirkin: the cable is a bit broken, but it usually works quite good with nvidia
16:47karolherbst: anyway, it works when I don't have the display connected on boot
16:48karolherbst: imirkin: and sometimes this happens: https://fpdl.vimeocdn.com/vimeo-prod-skyfire-std-us/01/996/11/279980046/1049345786.mp4?token=1531687693-0x8c3cca2aef58f6ec5552bdce5bff40d8c5fe9d2a
16:49imirkin: some kind of sync timing is off
16:49karolherbst: I think this is due to the broken cable as I experienced this with nvidia as well
16:49karolherbst: just super rare
16:50karolherbst: and more "stable"
16:50karolherbst: I don't think the EVO issue is caused by the cable though
16:50pendingchaos_: though perhaps transform feedback uses something in the driver constant buffer
16:54pendingchaos_: imirkin: why do cb_dirty=1 on pre-maxwell? simplicity?
16:56imirkin: and to avoid sticking in more serializes than necessary
16:56imirkin: iirc there's another thing that can stick in serializes, it's somehow kept track of somehow
16:58pendingchaos_: ah wait, the serialize has to be done before the user constant buffer validation
16:59pendingchaos_: otherwise it only makes things a bit better, not fixed
17:02imirkin: very interesting.
17:03imirkin: ok - fine, do it the way you had it, but only on maxwell+
17:04pendingchaos_: only user constant buffers?
17:04imirkin: no, do it always
17:04imirkin: i think the only way you end up in that function is if there's a change
17:04imirkin: so you can just do it unconditionally at the top
17:04imirkin: (for maxwell+)
17:05imirkin: er hmmm
17:05imirkin: no, that's not quite right
17:05imirkin: since you'll get there on every draw, but you don't actually need to do it on every draw
17:05imirkin: new recommendation:
17:05imirkin: always set the size to 64K
17:05imirkin: but be careful to only copy in the appropriate amount of data
17:05imirkin: that should avoid the need to serialize at all
17:06imirkin: that *could* run afoul of some robustness issues, but ... meh
17:07imirkin: if no CTS test calls us out on it, i don't really care
17:07pendingchaos_: if KHR_robustness is an issue, I guess unused memory could just be zero-initialized or something
17:08karolherbst: yeah.. I just wanted to mention the CTS :D
17:08karolherbst: I think there are some robustness tests...
17:08imirkin: there are.
17:08karolherbst: pendingchaos_: if in doubt, just run the CTS
17:08karolherbst: it's faster than piglit
17:08imirkin: but this would have to be an indirect access on a default uniform array
17:09imirkin: it's solvable anyways, as pendingchaos_ points out
17:09karolherbst: I really don't want to risk breaking some CTS tests right now, so we should be a bit more careful here
17:09imirkin: hm, i wonder if there's a practical downside to always setting to 64k though
17:09pendingchaos_: imirkin: am I correct in thinking that you now recommend that the serialize should only be done when updating the user buffer bindings?
17:10imirkin: pendingchaos_: i recommend the serialize be done on maxwell+ any time that the (address, size) data changes
17:10karolherbst: imirkin: btw, my mod fix is in an accepted state, though not merged. (I guess due to the open review on the PR)
17:10pendingchaos_: for the user constant buffer or all?
17:11imirkin: for all
17:11imirkin: however the reality is that
17:11imirkin: if you're in that function, if it's not a user constbuf, it's definitely been updated
17:12imirkin: if it _is_ a user constbuf, then the (bo) address never changes, so you just have to check the size
17:16pendingchaos_: so if I'm following along well enough, this is how it should behave: https://hastebin.com/gozirukego.diff?
17:18pendingchaos_: (and maybe with zero-initialization stuff if it ends up being a problem)
17:19imirkin: you could init serialize to class_3d < GM107
17:19imirkin: then you don't have to keep checking
17:20pendingchaos_: (the "if (i == 0) ..." thing seems to break stuff...)
17:20karolherbst: imirkin: so uhm.. it seems like we really fail some GLX stuff still
17:20pendingchaos_: well a better name for it would be "has_serialized"
17:20imirkin: pendingchaos_: yeah, because it gets turned off
17:20imirkin: and nothing turns it back on again
17:21imirkin: i wonder if that's it actually
17:21imirkin: that you need to serialize before turning *on* a CB
17:21imirkin: i.e. if it was previously off, and now on
17:21imirkin: what if only do this if the old size == 0?
17:21imirkin: without bumping the size to 64k every time
17:25pendingchaos_: imirkin: no change in Valley with https://hastebin.com/ehuxayutoc.diff
17:25imirkin: no change meaning you still see corruption?
17:27imirkin: well, can't argue with results =/
17:27imirkin: no matter how much one would like to
17:28pendingchaos_: I've got to go. I should be back in a short while
17:31imirkin: ok. so if we don't hear about issues with UBO's, we can theorize that the problem only happens when changing a size without changing an address. in that case, your original patch makes sense. you don't have to add the serialize boolean check, since that condition can only happen once (there can only be one user bo)
17:32imirkin: (but do add the generation check)
17:33pendingchaos_: there is one user buffer for each shader stage though?
17:34pendingchaos_: and I think it's possible for a UBO binding to change size without changing address
17:34pendingchaos_:will probably disappear again soon for an actual short while
17:36imirkin: oh right, there is. good point.
17:36imirkin: yes. a ubo binding can change size without address. very good point.
17:37imirkin: with e.g. glBindBufferRange
17:44imirkin: karolherbst: i think you were going to send an updated version of that image load fix?
17:44karolherbst: imirkin: yes, running a piglit on all patches
17:44imirkin: ah ok
17:44karolherbst: and after that is done, I send it out
17:44karolherbst: currentl -12 fails and -4 crashes :)
17:45karolherbst: but still a bit to go
17:45karolherbst: xcb_glx_create_context_attribs_arb_checked fails
17:45karolherbst: which can mean whatever
17:46karolherbst: ahh right, we don't expose GLX_ARB_create_context_robustness
17:46karolherbst: for whatever reasons
17:48imirkin: that needs some tracing
17:48imirkin: probably something silly
17:48karolherbst: yeah like some flag not set in the DDX
17:49imirkin: it's a client-only ext
17:50karolherbst: ohh, right
17:51karolherbst: it appears under "server glx extensions" for intel
17:51imirkin: ok, so it's under dri_robust_screen_extensions
17:51imirkin: which only gets used when PIPE_CAP_DEVICE_RESET_STATUS_QUERY is set
17:51imirkin: which we don't, apparently
17:52imirkin: so that should cover it. setting that to 1 will expose that ext, but i'm guessing we don't support something
17:52imirkin: i haven't looked at the requirements, just following the code.
17:52karolherbst: oh well, cts fails will tell me
17:52imirkin: lol, it's a bit more than that :p
17:53pmoreau: pendingchaos_: I’m hearing some talking about serializing stuff on Maxwell+ and artefacts in Valley: I’m wondering if that could fix https://bugs.freedesktop.org/show_bug.cgi?id=100177
17:53imirkin: that's a dangerous approach to development
17:53imirkin: pmoreau: i'm sure it will
17:54karolherbst: imirkin: right.. I am just wondering if we really have to implement that for the CTS though
17:54karolherbst: didn't know that it cares that much about GLX extensions
17:54imirkin: presumably part of the whole robustness requirements thing
17:54imirkin: they can require whatever they want
17:56karolherbst: anyway, the fails are fixed by just enabling the cap...
17:56karolherbst: but wee should probably check what else we have to implement for that
17:56karolherbst: I am sure those are simply API tests or something
17:58karolherbst: mhh, that cap tells gallium that pipe_context::get_device_reset_status is implemented
18:02karolherbst: imirkin: amdgpu doesn't implement this and simply returns 0
18:02karolherbst: they return 0 in the winsys
18:02karolherbst: radeon seems to have something implemented
18:04karolherbst: okay, basically it is a counter telling how often the GPU was reset by the driver
18:04karolherbst: sounds fairly trivial to implement
18:06karolherbst: mhh, but the gallium bits are more about what kind of reset happen
18:17karolherbst: imirkin: sent out
18:17karolherbst: no piglit regressions with all the patches
18:26RSpliet: karolherbst: perhaps pedantic, but wouldn't it be more in line with C(++) convention to have an nv50_blit_mode enum? That way the individual entries don't have to be numbered, and as long as the "sentinel" remains the final entry the counter remains valid regardless of copy-pasta.
18:27karolherbst: feel free to send cleanup patches :p
18:27karolherbst: but that wasn't the issue to begin with
18:28RSpliet: I know, it's just a random thought
18:28karolherbst: more like if we also add bit flag semantics, then arrays break anyway or waste a lot of memory
18:29RSpliet: "bit flag semantics"?
18:30RSpliet: you mean like, being able to select multiple "modes" by ORing them together rather than having an entry for every valid permutation? I know too little about the matter to judge whether that makes sense :-)
18:31karolherbst: kind of
18:33imirkin: i think what karol has was reasonable
18:44karolherbst: imirkin: we might also have to fix some memory leaks we might have as the CTS does like ~30 runs in the same process
18:44karolherbst: I think my last cts run aborted after 8 rounds due to something like that
18:44karolherbst: or some weird issue
19:13nyef: Hrm... 0x1540 is units-enabled (plausibly not written-to while the system is in operation) and 0x4008fc is... units-something (needing service? context dirty?) and varies as the system runs?
20:03pendingchaos_: there seems to be a function almost identical to nvc0_constbufs_validate in nvc0_compute.c... "nvc0_compute_validate_constbufs"
20:05pendingchaos_: imirkin: I guess I should probably update that too?
20:13imirkin: pendingchaos_: maybe? i wouldn't object.
20:26pendingchaos: do you know how the compute bindings are aliased with the 3d ones?
20:38pendingchaos: I think that all might actually be Fermi only
20:52nyef: ... and not Tesla?
20:55pendingchaos: nyef: are talking to me?
20:59nyef: This time, yes.
21:00pendingchaos: were you talking to me that time you said "... and not Tesla?"
21:01nyef: I wasn't when I was commenting on 0x1540 vs. 0x4008fc, though.
21:02pendingchaos: well I think tesla has a sort-of-separate driver? I was talking about the code in nvc0_compute.c, not nv50_compute.c?
21:02nyef: Ah, okay.
21:04nyef: Ah, I see now. There's a gap covering the 0xb0 & ~0x0f range. That'd be partly why I was confused.
21:12pendingchaos: pmoreau: here's the current state of the fix if you want to try it out: https://hastebin.com/raw/uconihoxug
21:57imirkin: pendingchaos_: not very precisely. in general, things are aliased though.
21:57imirkin: iirc we re-bind everything when switching between compute and graphics
23:04reinuseslisp: do shader headers (those 0x50 bytes) include primitive input type? (e.g. points, triangles)
23:05reinuseslisp: in geometry shaders
23:15HdkR: reinuseslisp: https://download.nvidia.com/open-gpu-doc/Shader-Program-Header/1/Shader-Program-Header.html
23:16pendingchaos_: also in nvc0_program.c
23:17reinuseslisp: yea, I used that to get output, max_vertices and instances count, but I couldn't find input type
23:17pendingchaos: it seems nouveau just ignores TGSI_PROPERTY_GS_INPUT_PRIM
23:17pendingchaos: so I guess it's not set anywhere
23:33imirkin: reinuseslisp: also https://github.com/envytools/envytools/blob/master/rnndb/graph/gf100_shaders.xml
23:34imirkin: input prim is what it is
23:34imirkin: so no need for that to be advertised
23:34imirkin: while output prim is configured by the shader header
23:37reinuseslisp: ok, I'll be back in ~20 mins