04:16inolen: anholt: just confirmed, i was managing to release the buffers in the wrong order
05:27imirkin: i spent a bit more time on this ... a js runner for edid-decode: https://people.freedesktop.org/~imirkin/edid-decode/
05:27imirkin: still not sure there's a really good use for it, besides me playing with emscripten.
08:05mgot: I'm writing a usb driver for a device that uses midi to communicate but has a rich set of sysex commands, meaning it's more of a char device than audio streaming or whatever. Should I manually create the procfs entry, or does usb code do this somehow automatically?
08:07mgot: I was grepping linux/drivers to see if anyone does cdev_add, but it seems like usb core is the only one
08:10mgot: s/usb code/usb core/
08:23pq: mgot, does that have something to do with graphics drivers? If not, it is unlikely to see interest in this channel.
08:32mgot: it's a novation mk2 launchpad
08:32mgot: I wanted to learn how to write linux device drivers and that's the first usb device i had near me
08:33mgot: I guess, for now I'll register this manually myself
08:43mgot: Ok i grepped for register_chrdev_region, and this resulted in many hits
08:43mgot: I guess it's the standard way
08:50mripard: Lyude: mlankhorst_ is the one responsible for it this release cycle, but if he's not around by noon (CET), I'll do it
08:59mlankhorst_: missed a comment?
09:01pq: mgot, so an 8x8 pixel display, which is actually a lot more about input than display. It probably does not classify as graphics. Might be more fruitful to seek help from people working on the input driver side.
09:02mripard: mlankhorst_: possibly, it was at your underscore-stripped nick :)
09:03mripard: mlankhorst_: Lyude wanted to have drm-misc-fixes merged into drm-misc-next-fixes
09:04mlankhorst: I'll merge and send a pull req
09:06mlankhorst: hm maybe a pull req first then backmerge
09:43mlankhorst: mripard: I've sent 2 pull requests, can sync with drm-next after it's pulled
09:47mripard: ack, thanks :)
10:24MrCooper: anholt robclark: there have still been 2 spurious arm64_a630_gles3 failures out of six pipelines on https://gitlab.freedesktop.org/mesa/mesa/pipelines since https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3662 was merged
10:24gitbot: Mesa issue (Merge request) 3662 in mesa "ci: Bump the GLES CTS version to 126.96.36.199." [Ci, Merged]
12:15shadeslayer: tomeu: re Mesa CI: I'm thinking of splitting out arm_build into arm64_build and armhf_build
12:15shadeslayer: arm_build just takes insanely long and can be parallelized imo
12:16tomeu: shadeslayer: what is insanely long?
12:16shadeslayer: tomeu: the entire thing? it takes a ~hour to complete
12:17shadeslayer: if we split it out, we should be able to at least halve the time
12:18tomeu: shadeslayer: guess there's a balance to strike between total resource usage and response time
12:18tomeu: response time isn't that important in this case as the containers are rarely rebuilt
12:18tomeu: how much would total resource usage increase? eg. how much work would be duplicated if they were split?
12:21iive: they are completely separate architectures. it is like x86 and x86_64 .
12:27shadeslayer: tomeu: would the overall resource usage really go up if instead of 1 long running job, we run 2 shorter jobs?
12:28karolherbst: "(('umod', a, '#b(is_pos_power_of_two)'), ('iand', a, ('isub', b, 1)))," causes some tests to fail for me
12:28tomeu: well, that's what I was asking you :p
12:28shadeslayer: tomeu: ah, I don't think so
12:28shadeslayer: I'm playing around with the idea in my head, I'll mull it over
12:28tomeu: fine with me then, as long as it doesn't make the scripts more complicated
12:28shadeslayer: but I don't see why resource usage would go up
12:29shadeslayer: tomeu: if anything, we'd be dropping some code :)
12:30karolherbst: but that opt sounds alright.. weird
12:30karolherbst: maybe something further down the pipe
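The rule quoted at 12:28 rewrites an unsigned modulo by a positive power of two into a bitwise AND with `b - 1`. A minimal sketch of why the rewrite itself is sound for unsigned 32-bit values, which is consistent with the guess that the failures come from further down the pipe. The helper names here are made up for illustration and are not Mesa code:

```python
def is_pos_power_of_two(b: int) -> bool:
    """True when b is a positive integer with exactly one bit set."""
    return b > 0 and (b & (b - 1)) == 0


def umod_as_iand(a: int, b: int, bits: int = 32) -> int:
    """Compute a % b for unsigned a and power-of-two b using only an AND."""
    if not is_pos_power_of_two(b):
        raise ValueError("rewrite only applies to positive powers of two")
    mask = (1 << bits) - 1
    # b - 1 has all bits below the single set bit of b turned on,
    # so the AND keeps exactly the remainder bits of a.
    return (a & mask) & ((b - 1) & mask)


# Compare against the plain modulo for a handful of unsigned 32-bit values.
for a in (0, 1, 7, 8, 255, 0xDEADBEEF, 0xFFFFFFFF):
    for shift in range(32):
        b = 1 << shift
        assert umod_as_iand(a, b) == a % b
```

The check only covers unsigned inputs; it says nothing about how a backend treats the resulting `iand`/`isub` pair.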
12:45daniels: shadeslayer: if it parallelises well then splitting it is definitely a good idea, yeah
12:45shadeslayer: daniels: afaict we just need a separate job to run each of these https://gitlab.freedesktop.org/mesa/mesa/blob/master/.gitlab-ci/container/arm_build.sh#L69-70
12:47daniels: hmm, looks like we don't build libdrm for armhf?
13:17danvet: tzimmermann, sravn "[PATCH 0/5] disable drm_global_mutex for most drivers, take 2" <- testing on a simple driver that fulfills the check for the new locking would be really great
13:17danvet: I think CI is liking the code now :-)
14:45karolherbst: ohh right... shift is broken in nouveau with nir
14:56imirkin: karolherbst: something nir-specific, i hope? or more general?
14:57karolherbst: imirkin: super trivial, we just have to set NV50_IR_SUBOP_SHIFT_WRAP on all shifts
14:57imirkin: iirc nir's shifts don't align with hardware, or at least not with the flags we use. maybe there's a way to make hw shifts work like nir shifts.
14:57karolherbst: which.. doesn't matter as those shifts are already optimized with nir anyway
14:57imirkin: that'll break some lowering though, you can't do it across-the-board.
14:57karolherbst: yeah.. I am a bit worried about that
14:58imirkin: in some lowering i explicitly use the fact that the shifts don't wrap
14:58imirkin: for 64-bit shifts
14:58imirkin: on SM30 and/or SM20
15:00karolherbst: I could do the shift lowering in nir...
15:01imirkin: or you could set the subop...
15:01imirkin: my point was you couldn't set it on _all_ shifts everywhere.
15:02karolherbst: I think the best way to fix it is to teach nir about both shift types I guess
15:03imirkin: that's a big lift, and i think nvidia hw is unique in that regard
15:03karolherbst: I don't think so
15:04karolherbst: mhh, there was this spirv extension
15:05karolherbst: which got pushed by intel a lot
15:05karolherbst: and we already get it set on some shifts in nir
15:06karolherbst: we have no_signed_wrap and no_unsigned_wrap on the instruction
15:06karolherbst: so I guess it's all there already
15:06karolherbst: it's probably just never used in glsl
15:07imirkin: those are about declaring that no wrapping shall occur
15:07imirkin: for things like add/etc
15:07karolherbst: and shifts
15:07imirkin: i.e. you won't do 0xffffffff + 1
15:07imirkin: it's not about the op being different
15:07imirkin: it's about guaranteeing that certain values will be "cool"
15:07karolherbst: yeah.. maybe it doesn't affect the shift operand :/
15:08karolherbst: imirkin: but why do we assume in TGSI that shifts do not wrap?
15:08imirkin: it's unspecified.
15:08imirkin: whereas in nir, it's specified.
15:09imirkin: and some algebraic things in nir rely on it
15:09karolherbst: yeah, hence the issue
15:09imirkin: so stick the subop on there and move on
15:09imirkin: might have to slightly audit the nouveau algebraic opts to be sensitive to this
15:09imirkin: i don't think there's much effect there
15:11karolherbst: imirkin: ohh, wait, with my initial statement I meant setting that subop on all nir shifts at the translation
15:11karolherbst: not on all shifts in all codegen
15:12karolherbst: thought you meant to tell me that the lowering for 64 bit shifts might need adjustments, but I guess it's safe for both variants
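The two shift flavours discussed above can be modelled in a few lines. This sketch assumes that "wrap" means only the low 5 bits of a 32-bit shift count are used, and that the non-wrapping flavour clamps, so counts of 32 or more yield 0. The 64-bit helper is a hypothetical illustration of why a lowering might depend on the non-wrapping behaviour; it is not nouveau's actual lowering code.

```python
MASK32 = 0xFFFFFFFF


def shl_wrap(value: int, count: int) -> int:
    """32-bit left shift where the count wraps (only the low 5 bits are used)."""
    return (value << (count & 31)) & MASK32


def shl_clamp(value: int, count: int) -> int:
    """32-bit left shift where counts >= 32 shift everything out."""
    return 0 if count >= 32 else (value << count) & MASK32


def shr_clamp(value: int, count: int) -> int:
    """32-bit logical right shift where counts >= 32 shift everything out."""
    return 0 if count >= 32 else (value & MASK32) >> count


def shl64_from_clamped_32(lo: int, hi: int, n: int) -> tuple:
    """Branch-free 64-bit left shift (0 <= n < 64) built from clamping shifts.

    Counts are treated as unsigned 32-bit, so 32 - n and n - 32 become huge
    when they would be negative, and the clamping shifts turn them into 0.
    """
    lo_out = shl_clamp(lo, n)
    hi_out = (shl_clamp(hi, n)
              | shr_clamp(lo, (32 - n) & MASK32)
              | shl_clamp(lo, (n - 32) & MASK32))
    return lo_out, hi_out


# The two flavours agree for in-range counts and differ beyond that.
assert shl_wrap(1, 33) == 2
assert shl_clamp(1, 33) == 0

# The 64-bit helper matches a plain 64-bit shift for every valid count.
x = 0x0123456789ABCDEF
for n in range(64):
    lo, hi = shl64_from_clamped_32(x & MASK32, x >> 32, n)
    assert (hi << 32) | lo == (x << n) & 0xFFFFFFFFFFFFFFFF
```

Swapping `shl_clamp`/`shr_clamp` for wrapping shifts inside `shl64_from_clamped_32` gives wrong results for `n >= 32`, which is the kind of breakage the "not on all shifts everywhere" caveat is about.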
15:35karolherbst: imirkin: btw, ever looked into nv_shader_atomic_float or intel_shader_atomic_float_minmax?
15:39robclark: MrCooper: any chance those were pipelines that hadn't rebased yet?
15:48imirkin_: karolherbst: shader_atomic_float is currently exposed by nouveau
15:48imirkin_: karolherbst: the intel one requires intel hw
15:48imirkin_: funny enough, the shader_atomic_float support is largely courtesy of idr, who foolishly thought that intel hw could support it
15:48imirkin_: but instead ended up making that INTEL_* ext instead
15:52karolherbst: imirkin_: ohh, right, we don't implement GL_NV_shader_atomic_float on maxwell
15:53imirkin_: i have a patch
15:53imirkin_: but i felt icky about it for some reason
15:55imirkin_: karolherbst: not sure if this is it, or if that's an old version that doesn't work: https://github.com/imirkin/mesa/commit/071c0fd6089c1fc21a33623c03430f4b0a070002
15:56idr: imirkin_: I was hoping i965 could do both, but alas.
15:56karolherbst: imirkin_: ohh, I remember that one
15:56imirkin_: karolherbst: the sticking point is that there's no native support for that op on shared memory
15:57imirkin_: so it uses a CAS-style loop
15:57imirkin_: but CF being what it is in nv50 ir ... fails abound.
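A CAS-style loop for a float atomic add, as mentioned above, can be sketched as follows. This is a single-threaded model only: the `Word` class and its `compare_and_swap` method are hypothetical stand-ins for a 32-bit shared-memory cell with a hardware compare-and-swap, and nothing here is taken from the linked patch.

```python
import struct


def f32_to_bits(x: float) -> int:
    """Reinterpret a float as its 32-bit IEEE-754 bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(b: int) -> float:
    """Reinterpret a 32-bit pattern as a float."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


class Word:
    """Hypothetical 32-bit memory cell offering only integer compare-and-swap."""

    def __init__(self, bits: int = 0):
        self.bits = bits

    def compare_and_swap(self, expected: int, new: int) -> int:
        """Store new if the cell still holds expected; always return the old value."""
        old = self.bits
        if old == expected:
            self.bits = new
        return old


def atomic_fadd(word: Word, operand: float) -> float:
    """Emulate a float atomic add with a CAS retry loop; returns the old value."""
    while True:
        old_bits = word.bits
        new_bits = f32_to_bits(bits_to_f32(old_bits) + operand)
        # Compare bit patterns, not float values, so NaN payloads and
        # signed zeros cannot make the loop spin or succeed spuriously.
        if word.compare_and_swap(old_bits, new_bits) == old_bits:
            return bits_to_f32(old_bits)
        # Another writer got in first: reload and retry.


cell = Word(f32_to_bits(1.5))
previous = atomic_fadd(cell, 2.25)
assert previous == 1.5
assert bits_to_f32(cell.bits) == 3.75
```

The retry branch is the control flow under discussion: on a GPU, whether the loop branches on success or on failure decides which way the threads of a warp diverge.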
15:59karolherbst: I'd kind of like to be able to do such lowering in nir, so we don't have to mess with codegen's CF stuff so much :/ especially because I am sure that it would be useful for others as well
15:59karolherbst: there isn't anything nv special in this lowering, is there?
15:59imirkin_: in this one? no.
15:59karolherbst: probably already exists in nir
15:59imirkin_: but in e.g. handleManualTXD or whatever it's called - yes.
16:00karolherbst: ahh sure
16:00imirkin_: it does explicitly stick predicates in there
16:00imirkin_: which iirc nir has questionable support for
16:00karolherbst: yeah, that's fine
16:00imirkin_: but it's probably possible to make it all work out anyways
16:00imirkin_: iirc the sticking point is whether you branch on success or failure, and the impl is sensitive to it
16:01imirkin_: due to warp issues
16:01imirkin_: warp stack
16:01imirkin_: branch stack? whatever. you know what i'm talking about.
16:01imirkin_: "the thing that overflows when you branch wrong"
16:02imirkin_: so the trick was making sure that it went the "right" way
16:02karolherbst: I know of games overflowing it :)
16:02imirkin_: which is actually more of a generic concern, obviously
16:03imirkin_: but we LARGELY get away with not caring
16:03imirkin_: but this one spot triggered it badly, iirc
16:03karolherbst: and on compute we don't have this issue to begin with
16:03imirkin_: until i randomly rearranged the order of things until it didn't
16:03imirkin_: and all was well.
16:03karolherbst: I think I now have a better understanding of how that all works
16:03karolherbst: especially as I debugged the shaders causing it as well
16:04imirkin_: can't be worse than mine...
16:04imirkin_: "things go boom" is about the level i'm at =/
16:04karolherbst: we need a loop merging opt
16:04karolherbst: or enable the off chip stack
16:04karolherbst: there is nothing else we can do
16:04imirkin_: we can stick our heads in the sand...
16:04imirkin_: worked so far
16:04karolherbst: again, there are games triggering it already
16:04imirkin_: and when in doubt, blame it on multithreading issues =]
16:05karolherbst: anyway, I don't want to implement loop merging with the current version of codegen, I'd rather rework all of codegen to make it easier, then implement it ;)
16:05karolherbst: but anyway
16:06karolherbst: we need to be able to predict if we might run out of c/r stack
16:06karolherbst: and then enable the off chip
16:06karolherbst: maybe we should do it whenever we have nested loops...
16:06karolherbst: won't matter in practise
16:08karolherbst: imirkin_: ohh, you don't check if the loop depth needs to be increased in that patch either
16:08karolherbst: which RA relies on :(
16:09karolherbst: maybe it doesn't matter here that much though
16:09imirkin_: karolherbst: maybe that's why i had trouble?
16:09karolherbst: were registers wrongly allocated?
16:09imirkin_: it's really hard to tell
16:09imirkin_: coz the order of the args for these ops is weird in the first place
16:09imirkin_: and you keep flipping stuff around
16:09karolherbst: anyway, inserting loops or doing other CFG manipulations in codegen sucks
16:10imirkin_: it does.
16:10karolherbst: maybe it would be easier to make it a builtin function...
16:12karolherbst: mhh, could we even do that as we also need to save predicates?
16:21Lyude: mripard, mlankhorst: yeah - I pushed some mst fixes that should have been pushed into drm-misc-next-fixes/drm-misc-next instead of drm-misc-fixes
16:24vsyrjala:happy he isn't the only one confused by drm-misc :)
16:45MrCooper: robclark: none; that page only shows pipelines from the main project, all six of them were from the master branch after that MR was merged
16:55MrCooper: daniels: hmm, fdo-packet-2 is special once again: it seems to consistently hit the 60s timeout for the glcpp tests in the meson-s390x job (e.g. https://gitlab.freedesktop.org/mesa/mesa/-/jobs/1578780), which only take ~10-20s on other runners; OTOH it's faster than other runners for the llvmpipe tests in there
16:55imirkin_: more, slower cpu's?
16:56imirkin_: [actually, not sure that'd even explain it]
16:56MrCooper: the CI jobs always use up to 4 cores
16:57MrCooper: that runner had a similar issue before due to mis-aligned memory access, maybe qemu hits something like that
16:59imirkin_: most arch's kernels do the fallback RMW for stuff like that
16:59imirkin_: which is ... not fast.
17:00MrCooper: this is x86, though a relatively weak CPU I gather
17:20MrCooper: tpalli xexaxo1: does src/compiler/glsl/tests/cache_test.c:cache_exists() intentionally use uninitialized stack memory for dummy_key? valgrind complains about it, and I wonder if it might explain https://gitlab.freedesktop.org/mesa/mesa/-/jobs/1581118
17:36daniels: imirkin: same config as packet-3 and packet-4
17:36imirkin_: i misunderstood, probably - i thought it was running inside of qemu, so the arch would matter
17:36daniels: MrCooper: please file an issue for it; I'm on my way to the airport so won't be able to look at it for a bit
17:40MrCooper: daniels: well, for all I know it could be a Mesa issue, like last time; might be tricky to track down though with qemu in the mix
18:07karolherbst: ehhh ufff
18:08karolherbst: jekstrand: we need to be able to mark if nodes as not optimizable to bcsel
18:08imirkin_: karolherbst: what's the issue?
18:08karolherbst: think about OOB global memory reads we want to guard against
18:08imirkin_: ah right.
18:08imirkin_: just use the nv50_ir lowering passes :p
18:09karolherbst: I know :p
18:09karolherbst: but it's for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2753
18:09gitbot: Mesa issue (Merge request) 2753 in mesa "nir: Add load_ssbo_address/size intrinsics" [Nir, Opencl, Opened]
18:11karolherbst: ahh.. nir_selection_control.nir_selection_control_dont_flatten
18:11karolherbst: okay, so we already have that
18:12pendingchaos: karolherbst: nir_opt_peephole_select() should already avoid optimizing IFs with ssbo loads
18:12karolherbst: why would it matter for ssbos?
18:12karolherbst: ssbos are bounds checked by definition, aren't they
18:13pendingchaos: *global loads
18:13karolherbst: pendingchaos: it checks for the dont_flatten control
18:13karolherbst: pendingchaos: doesn't seem to
18:13karolherbst: or not explicitly at least
18:14karolherbst: it prevents it for indirect loads though
18:14karolherbst: which by definition are all global loads
18:16pendingchaos: should also prevent it for direct loads except for nir_var_shader_in/nir_var_uniform
18:17pendingchaos: nir_intrinsic_load_global should go to the default case in block_check_for_allowed_instrs(), which returns false
18:17pendingchaos: (default case in "switch (intrin->intrinsic)")
18:19karolherbst: ahh, right, missed that
18:20karolherbst: ehh.. now I trigger an infinite loop with opt_cse
18:29karolherbst: heh mhhh, now what to do with merged ssbo loads :/
20:51agd5f: is the general expectation that drivers without load and unload should not initialize their display hw until after calling drm_dev_register?
20:52agd5f: I keep running into horrible ordering issues
20:53agd5f: but re-arranging our entire driver init sequence is not really appetizing
20:59vsyrjala: i think we do pretty much everything except fbdev_initial_config() before registering
21:07agd5f: I think I got it. ordering issue with cec