04:16 inolen: anholt: just confirmed, i was managing to release the buffers in the wrong order
05:27 imirkin: i spent a bit more time on this ... a js runner for edid-decode: https://people.freedesktop.org/~imirkin/edid-decode/
05:27 imirkin: still not sure there's a really good use for it, besides me playing with emscripten.
08:05 mgot: I'm writing a usb driver that uses midi to communicate but has reach set of sysex commands - meaning it's more of a char device than audio streaming or whatever. Should I manually create a procfs entry or does usb code do this somehow automatically?
08:06 mgot: s/reach/rich/
08:07 mgot: I was grepping linux/drivers to see if anyone does cdev_add, but it seems like usb core is the only one
08:10 mgot: s/usb code/usb core/
08:23 pq: mgot, does that have something to do with graphics drivers? If not, it is unlikely to see interest in this channel.
08:32 mgot: it's a novation mk2 launchpad
08:32 mgot: I wanted to learn how to write linux device drivers and that's the first usb device i had near me
08:33 mgot: I guess, for now i'll register this manually myself
08:43 mgot: Ok i grepped for register_chrdev_region, now this resulted in many hits
08:43 mgot: I guess it's the standard way
08:50 mripard: Lyude: mlankhorst_ is the one responsible for it this release cycle, but if he's not around by noon (CET time), I'll do it
08:59 mlankhorst_: missed a comment?
09:01 pq: mgot, so an 8x8 pixel display, which is actually a lot more about input than display. It probably does not classify as graphics. Might be more fruitful to seek help from people working on the input driver side.
09:02 mripard: mlankhorst_: possibly, it was at your underscore-stripped nick :)
09:03 mripard: mlankhorst_: Lyude wanted to have drm-misc-fixes merged into drm-misc-next-fixes
09:04 mlankhorst: ah
09:04 mlankhorst: I'll merge and send a pull req
09:06 mlankhorst: hm maybe a pull req first then backmerge
09:43 mlankhorst: mripard: I've sent 2 pull requests, can sync with drm-next after its pulled
09:47 mripard: ack, thanks :)
10:24 MrCooper: anholt robclark: there have still been 2 spurious arm64_a630_gles3 failures out of six pipelines on https://gitlab.freedesktop.org/mesa/mesa/pipelines since https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3662 was merged
10:24 gitbot: Mesa issue (Merge request) 3662 in mesa "ci: Bump the GLES CTS version to 3.2.6.1." [Ci, Merged]
12:15 shadeslayer: tomeu: re Mesa CI: I'm thinking of splitting out arm_build into arm64_build and armhf_build
12:15 shadeslayer: arm_build just takes insanely long and can be parallelized imo
12:16 tomeu: shadeslayer: what is insanely long?
12:16 shadeslayer: tomeu: the entire thing? it takes a ~hour to complete
12:17 shadeslayer: https://gitlab.freedesktop.org/shadeslayer/mesa/-/jobs/1577408
12:17 shadeslayer: if we split it out, we should be able to at least halve the time
12:18 tomeu: shadeslayer: guess there's a balance to strike between total resource usage and response time
12:18 tomeu: response time isn't that important in this case as the containers are rarely rebuilt
12:18 tomeu: how much would total resource usage increase? eg. how much work would be duplicated if they were split?
12:21 iive: they are completely separate architectures. it is like x86 and x86_64 .
12:27 shadeslayer: tomeu: would the overall resource usage really go up if instead of 1 long running job, we run 2 shorter jobs?
12:27 karolherbst: mhhh
12:28 karolherbst: "(('umod', a, '#b(is_pos_power_of_two)'), ('iand', a, ('isub', b, 1)))," causes some tests to fail for me
12:28 tomeu: well, that's what I was asking you :p
12:28 shadeslayer: tomeu: ah, I don't think so
12:28 shadeslayer: I'm playing around with the idea in my head, I'll mull it over
12:28 tomeu: fine with me then, as long as it doesn't make the scripts more complicated
12:28 shadeslayer: but I don't see why resource usage would go up
12:29 shadeslayer: tomeu: if anything, we'd be dropping some code :)
12:30 karolherbst: but that opt sounds alright.. weird
12:30 karolherbst: maybe something further down the pipe
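The rule karolherbst quotes rewrites an unsigned modulo by a power of two into a bitwise AND with `b - 1`. A minimal sketch of why that identity holds, assuming `a` is a non-negative integer and `b` is a constant positive power of two (which is how the `#b(is_pos_power_of_two)` condition reads); the function names below are illustrative and not taken from Mesa:

```python
def is_pos_power_of_two(b):
    """True if b is a positive integer with exactly one bit set."""
    return b > 0 and (b & (b - 1)) == 0


def umod_pow2(a, b):
    """Unsigned a % b for power-of-two b, computed as a & (b - 1).

    b - 1 has all bits below the single set bit of b turned on, so the
    AND keeps exactly the bits that form the remainder.
    """
    if not is_pos_power_of_two(b):
        raise ValueError("b must be a positive power of two")
    return a & (b - 1)
```

The identity itself is sound, which matches the remark that the optimization "sounds alright" and that the failures may come from something later in the pipeline.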
12:45 daniels: shadeslayer: if it parallelises well then splitting it is definitely a good idea, yeah
12:45 shadeslayer: daniels: afaict we just need a separate job to run each of these https://gitlab.freedesktop.org/mesa/mesa/blob/master/.gitlab-ci/container/arm_build.sh#L69-70
12:47 daniels: hmm, looks like we don't build libdrm for armhf?
13:17 danvet: tzimmermann, sravn "[PATCH 0/5] disable drm_global_mutex for most drivers, take 2" <- testing on a simple driver that fulfills the check for the new locking would be really great
13:17 danvet: I think CI is liking the code now :-)
14:45 karolherbst: ohh right... shift is broken in nouveau with nir
14:56 imirkin: karolherbst: something nir-specific, i hope? or more general?
14:57 karolherbst: imirkin: super trivial, we just have to set NV50_IR_SUBOP_SHIFT_WRAP on all shifts
14:57 imirkin: iirc nir's shifts don't align with hardware, or at least not with the flags we use. maybe there's a way to make hw shifts work like nir shifts.
14:57 imirkin: right.
14:57 karolherbst: which.. doesn't matter as those shifts are already optimized with nir anyway
14:57 imirkin: that'll break some lowering though, you can't do it across-the-board.
14:57 karolherbst: yeah.. I am a bit worried about that
14:58 imirkin: in some lowering i explicitly use the fact that the shifts don't wrap
14:58 imirkin: for 64-bit shifts
14:58 imirkin: on SM30 and/or SM20
14:59 imirkin: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp#n219
15:00 karolherbst: I could do the shift lowering in nir...
15:01 imirkin: or you could set the subop...
15:01 imirkin: my point was you couldn't set it on _all_ shifts everywhere.
15:02 karolherbst: I think the best way to fix it is to teach nir about both shift types I guess
15:03 imirkin: that's a big lift, and i think nvidia hw is unique in that regard
15:03 karolherbst: I don't think so
15:04 karolherbst: mhh, there was this spirv extension
15:05 karolherbst: http://htmlpreview.github.io/?https://github.com/KhronosGroup/SPIRV-Registry/blob/master/extensions/KHR/SPV_KHR_no_integer_wrap_decoration.html
15:05 karolherbst: which got pushed by intel a lot
15:05 karolherbst: and we already get it set on some shifts in nir
15:06 karolherbst: we have no_signed_wrap and no_unsigned_wrap on the instruction
15:06 karolherbst: so I guess it's all there already
15:06 karolherbst: it's probably just never used in glsl
15:06 imirkin: ah
15:07 imirkin: those are about declaring that no wrapping shall occur
15:07 imirkin: for things like add/etc
15:07 karolherbst: and shifts
15:07 imirkin: i.e. you won't do 0xffffffff + 1
15:07 imirkin: same.
15:07 imirkin: it's not about the op being different
15:07 karolherbst: mhhh
15:07 imirkin: it's about guaranteeing that certain values will be "cool"
15:07 karolherbst: yeah.. maybe it doesn't affect the shift operand :/
15:08 karolherbst: imirkin: but why do we assume in TGSI that shifts do not wrap?
15:08 imirkin: it's unspecified.
15:08 imirkin: whereas in nir, it's specified.
15:09 imirkin: and some algebraic things in nir rely on it
15:09 karolherbst: yeah, hence the issue
15:09 imirkin: so stick the subop on there and move on
15:09 imirkin: might have to slightly audit the nouveau algebraic opts to be sensitive to this
15:09 imirkin: i don't think there's much effect there
15:11 karolherbst: imirkin: ohh, wait, with my initial statement I meant setting that subop on all nir shifts at the translation
15:11 karolherbst: not on all shifts in all codegen
15:12 karolherbst: thought you meant to tell me that the lowering for 64 bit shifts might need adjustments, but I guess it's safe for both variants
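The two shift behaviours discussed above can be sketched for 32-bit values as follows. This assumes that the wrapping variant (what NIR specifies, and what `NV50_IR_SUBOP_SHIFT_WRAP` selects) uses only the low 5 bits of the shift count, and that the non-wrapping variant clamps the count so that any count of 32 or more yields 0. The clamping detail is an assumption about the hardware, and the function names are illustrative only:

```python
MASK32 = 0xFFFFFFFF


def shl_wrap(value, count):
    """Left shift that uses only the low 5 bits of count (count mod 32)."""
    return (value << (count & 31)) & MASK32


def shl_clamp(value, count):
    """Left shift with the count clamped to 32, so counts >= 32 give 0."""
    return (value << min(count, 32)) & MASK32
```

For counts in 0..31 the two agree; they differ only for larger counts, e.g. a count of 33 shifts by 1 in the wrapping form and produces 0 in the clamping form. That difference is what the 64-bit shift lowering mentioned above relies on, and why the subop can be set on shifts translated from NIR but not on every shift in codegen.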
15:35 karolherbst: imirkin: btw, ever looked into nv_shader_atomic_float or intel_shader_atomic_float_minmax?
15:39 robclark: MrCooper: any chance those were pipelines that hadn't rebased yet?
15:48 imirkin_: karolherbst: shader_atomic_float is currently exposed by nouveau
15:48 imirkin_: karolherbst: the intel one requires intel hw
15:48 imirkin_: funny enough, the shader_atomic_float support is largely courtesy of idr, who foolishly thought that intel hw could support it
15:48 imirkin_: but instead ended up making that INTEL_* ext instead
15:52 karolherbst: imirkin_: ohh, right, we don't implement GL_NV_shader_atomic_float on maxwell
15:53 imirkin_: i have a patch
15:53 imirkin_: but i felt icky about it for some reason
15:55 imirkin_: karolherbst: not sure if this is it, or if that's an old version that doesn't work: https://github.com/imirkin/mesa/commit/071c0fd6089c1fc21a33623c03430f4b0a070002
15:56 idr: imirkin_: I was hoping i965 could do both, but alas.
15:56 karolherbst: imirkin_: ohh, I remember that one
15:56 imirkin_: karolherbst: the sticking point is that there's no native support for that op on shared memory
15:57 imirkin_: so it uses a CAS-style loop
15:57 imirkin_: but CF being what it is in nv50 ir ... fails abound.
15:57 karolherbst: right
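The "CAS-style loop" mentioned here is the usual way to build an atomic float add out of a 32-bit compare-and-swap. A minimal sketch of the idea, not the code from the linked patch: the `Word32` class is a hypothetical stand-in for a shared-memory word, and its compare-and-swap is emulated with a lock purely for illustration.

```python
import struct
import threading


def float_to_bits(x):
    """Reinterpret a float as its 32-bit IEEE 754 bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b):
    """Reinterpret a 32-bit pattern as a float."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


class Word32:
    """Hypothetical 32-bit memory word with an emulated atomic CAS."""

    def __init__(self, bits=0):
        self._bits = bits
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._bits

    def compare_and_swap(self, expected, new):
        """Store new if the word equals expected; always return the old value."""
        with self._lock:
            old = self._bits
            if old == expected:
                self._bits = new
            return old


def atomic_add_float(word, value):
    """Add value to the float stored in word; return the previous float."""
    while True:
        old_bits = word.load()
        new_bits = float_to_bits(bits_to_float(old_bits) + value)
        # Retry if another thread changed the word between load and CAS.
        if word.compare_and_swap(old_bits, new_bits) == old_bits:
            return bits_to_float(old_bits)
```

The comparison is done on the raw bit patterns rather than on float values, so a NaN in memory cannot make the loop spin forever. The retry loop is the control flow that has to be inserted during lowering, which is where the trouble described in the following lines comes from.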
15:59 karolherbst: I'd kind of like to be able to do such lowering in nir, so we don't have to mess with codegen's CF stuff so much :/ especially because I am sure that it would be useful for others as well
15:59 karolherbst: there isn't anything nv special in this lowering, is there?
15:59 imirkin_: in this one? no.
15:59 karolherbst: probably already exists in nir
15:59 imirkin_: but in e.g. handleManualTXD or whatever it's called - yes.
16:00 karolherbst: ahh sure
16:00 imirkin_: it does explicitly stick predicates in there
16:00 imirkin_: which iirc nir has questionable support for
16:00 karolherbst: yeah, that's fine
16:00 imirkin_: but it's probably possible to make it all work out anyways
16:00 imirkin_: iirc the sticking point is whether you branch on success or failure, and the impl is sensitive to it
16:01 imirkin_: due to warp issues
16:01 imirkin_: warp stack
16:01 imirkin_: branch stack? whatever. you know what i'm talking about.
16:01 imirkin_: "the thing that overflows when you branch wrong"
16:02 karolherbst: right
16:02 imirkin_: so the trick was making sure that it went the "right" way
16:02 karolherbst: I know of games overflowing it :)
16:02 imirkin_: which is actually more of a generic concern, obviously
16:03 imirkin_: but we LARGELY get away with not caring
16:03 karolherbst: right
16:03 imirkin_: but this one spot triggered it badly, iirc
16:03 karolherbst: and on compute we don't have this issue to begin with
16:03 imirkin_: until i randomly rearranged the order of things until it didn't
16:03 imirkin_: and all was well.
16:03 karolherbst: yeah...
16:03 karolherbst: I think I now have a better understanding of how that all works
16:03 karolherbst: especially as I debugged the shaders causing it as well
16:04 imirkin_: can't be worse than mine...
16:04 imirkin_: "things go boom" is about the level i'm at =/
16:04 karolherbst: we need a loop merging opt
16:04 karolherbst: or enable the off chip stack
16:04 karolherbst: there is nothing else we can do
16:04 imirkin_: we can stick our heads in the sand...
16:04 karolherbst: right...
16:04 imirkin_: worked so far
16:04 karolherbst: well
16:04 karolherbst: again, there are games triggering it already
16:04 imirkin_: and when in doubt, blame it on multithreading issues =]
16:05 karolherbst: ;)
16:05 karolherbst: anyway, I don't want to implement loop merging with the current version of codegen, I'd rather rework all of codegen to make it easier, then implement it ;)
16:05 karolherbst: but anyway
16:06 karolherbst: we need to be able to predict if we might run out of c/r stack
16:06 karolherbst: and then enable the off chip
16:06 karolherbst: stack
16:06 karolherbst: maybe we should do it whenever we have nested loops...
16:06 karolherbst: won't matter in practice
16:08 karolherbst: imirkin_: ohh, you don't check if the loop depth needs to be increased in that patch either
16:08 karolherbst: which RA relies on :(
16:09 karolherbst: maybe it doesn't matter here that much though
16:09 imirkin_: karolherbst: maybe that's why i had trouble?
16:09 karolherbst: maybe
16:09 karolherbst: were registers wrongly allocated?
16:09 imirkin_: maybe.
16:09 imirkin_: it's really hard to tell
16:09 imirkin_: coz the order of the args for these ops is weird in the first place
16:09 imirkin_: and you keep flipping stuff around
16:09 karolherbst: anyway, inserting loops or doing other CFG manipulations in codegen sucks
16:10 imirkin_: it does.
16:10 karolherbst: maybe it would be easier to make it a builtin function...
16:12 karolherbst: mhh, could we even do that as we also need to save predicates?
16:21 Lyude: mripard, mlankhorst: yeah - I pushed some mst fixes that should have been pushed into drm-misc-next-fixes/drm-misc-next instead of drm-misc-fixes
16:24 * vsyrjala happy he isn't the only one confused by drm-misc :)
16:45 MrCooper: robclark: none; that page only shows pipelines from the main project, all six of them were from the master branch after that MR was merged
16:55 MrCooper: daniels: hmm, fdo-packet-2 is special once again: it seems to consistently hit the 60s timeout for the glcpp tests in the meson-s390x job (e.g. https://gitlab.freedesktop.org/mesa/mesa/-/jobs/1578780), which only take ~10-20s on other runners; OTOH it's faster than other runners for the llvmpipe tests in there
16:55 imirkin_: more, slower cpu's?
16:56 imirkin_: [actually, not sure that'd even explain it]
16:56 MrCooper: the CI jobs always use up to 4 cores
16:57 MrCooper: that runner had a similar issue before due to mis-aligned memory access, maybe qemu hits something like that
16:59 imirkin_: urgh
16:59 imirkin_: most arch's kernels do the fallback RMW for stuff like that
16:59 imirkin_: which is ... not fast.
17:00 MrCooper: this is x86, though a relatively weak CPU I gather
17:20 MrCooper: tpalli xexaxo1: does src/compiler/glsl/tests/cache_test.c:cache_exists() intentionally use uninitialized stack memory for dummy_key? valgrind complains about it, and I wonder if it might explain https://gitlab.freedesktop.org/mesa/mesa/-/jobs/1581118
17:36 daniels: imirkin: same config as packet-3 and packet-4
17:36 imirkin_: i misunderstood, probably - i thought it was running inside of qemu, so the arch would matter
17:36 daniels: MrCooper: please file an issue for it; I'm on my way to the airport so won't be able to look at it for a bit
17:40 MrCooper: daniels: well, for all I know it could be a Mesa issue, like last time; might be tricky to track down though with qemu in the mix
18:07 karolherbst: ehhh ufff
18:08 karolherbst: jekstrand: we need to be able to mark if nodes as not optimizeable to bcsel
18:08 imirkin_: karolherbst: what's the issue?
18:08 karolherbst: think about OOB global memory reads we want to guard against
18:08 imirkin_: ah right.
18:08 imirkin_: just use the nv50_ir lowering passes :p
18:09 karolherbst: I know :p
18:09 karolherbst: but it's for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2753
18:09 gitbot: Mesa issue (Merge request) 2753 in mesa "nir: Add load_ssbo_address/size intrinsics" [Nir, Opencl, Opened]
18:11 karolherbst: ahh.. nir_selection_control.nir_selection_control_dont_flatten
18:11 karolherbst: okay, so we already have that
18:12 pendingchaos: karolherbst: nir_opt_peephole_select() should already avoid optimizing IFs with ssbo loads
18:12 karolherbst: why would it matter for ssbos?
18:12 karolherbst: ssbos are bounds checked by definition, aren't they
18:13 pendingchaos: *global loads
18:13 karolherbst: pendingchaos: it checks for the dont_flatten control
18:13 karolherbst: pendingchaos: doesn't seem to
18:13 karolherbst: or not explicitly at least
18:14 karolherbst: it prevents it for indirect loads though
18:14 karolherbst: which by definition are all global loads
18:16 pendingchaos: should also prevent it for direct loads except for nir_var_shader_in/nir_var_uniform
18:17 pendingchaos: nir_intrinsic_load_global should go to the default case in block_check_for_allowed_instrs(), which returns false
18:17 pendingchaos: (default case in "switch (intrin->intrinsic)")
18:19 karolherbst: ahh, right, missed that
18:20 karolherbst: ehh.. now I trigger an infinite loop with opt_cse
18:29 karolherbst: heh mhhh, now what to do with merged ssbo loads :/
20:51 agd5f: is the general expectation that drivers without load and unload should not initialize their display hw until after calling drm_dev_register?
20:52 agd5f: I keep running into horrible ordering issues
20:53 agd5f: but re-arranging our entire driver init sequence is not really appetizing
20:59 vsyrjala: i think we do pretty much everything except fbdev_initial_config() before registering
21:07 agd5f: I think I got it. ordering issue with cec