00:05karolherbst: mhh.. I think I don't handle PIPE_RESOURCE_FLAG_DONT_MAP_DIRECTLY correctly :)
00:06karolherbst: though ... weird
00:18karolherbst: I wonder what kind of super optimization radeonsi is doing which causes those issues... mhh
01:18alyssa: jenatali: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests?scope=all&state=opened&author_username=alyssa
01:18alyssa: feels good
01:20jenatali: :O
01:20alyssa: the trick is getting suckers at microsoft to ack my patches
01:20alyssa: uh
01:20alyssa: i mean
01:20alyssa: reviewing lots of other people's patches so they'll review yours in turn
01:20alyssa:sweats
01:20alyssa: :-P
01:21karolherbst: *scratches head*
01:21karolherbst: I wish I knew why this stuff is broken
01:22jenatali: alyssa: Is that an r-b for that util patch then? :P
01:23karolherbst: mhhhhh https://github.com/darktable-org/darktable/issues/8000
01:26kisak: oh no, the moment you hop off that MR treadmill you start to get rusty. Who knows, you might even go for a walk.
01:26karolherbst: I'm already curious why I get some reviews from alyssa on my CL MRs 🤔🤔🤔
01:28kisak: The rust is already showing itself.
01:32alyssa: jenatali: I thought I did r-b?
01:33alyssa: well, lol'd-by
01:33alyssa: close enough
01:33jenatali: Oh is that what that means?
01:33alyssa: :D
01:33alyssa: karolherbst: Maybe Boston Dynamics has a vested interest in OpenCL ;-)
01:33karolherbst: sounds plausible
01:34alyssa: actually the new rumour was that Boston Dynamics is a cover for me working at John Deere on tractors
01:34alyssa: I think that was also courtesy of zmike
01:34karolherbst: I think I've heard something about that
01:35zmike: no that's old news, I corrected the record already
01:35zmike: you can't fool me
01:35karolherbst: though I'm still catching up and am kinda stuck at "you'll be going to apple"
01:35karolherbst: yes yes I know
01:35karolherbst: I'm bad at catching up with rumours
01:35zmike: she's going to dyson to work on vacuum tech
01:35karolherbst: work on a dyson sphere? gotcha
01:36jenatali: There's a joke about sucking to be made there but I'm not going to make it
01:36karolherbst: you have to explain to me what you mean, I'm literally clueless
01:37alyssa: zmike: are you saying my code sucks
01:37alyssa: or will suck
01:37alyssa: rather
01:37zmike: sounds like you're saying it
01:37zmike: you should have more confidence
01:37alyssa: jenatali: too late i made it
01:38HdkR: We already know John Deere is a red (green?) herring since you would never work at a company that is so anti-consumer :)
01:39alyssa: HdkR: hey, quit it with your facts and logic here, this is a serious shitposting event
01:39alyssa: :p
01:39zmike: smh bringing facts to #dri-devel
01:40HdkR: :D
01:40karolherbst: what? serious facts in my chatroom?
01:41HdkR: I just keep missing my opportunity to nab people for FEX. :P
01:41alyssa: HdkR: Why would Dyson pay me to work on FEX?
01:42alyssa: This is not skepticism, I'm asking you to come up with a reason, c'mon now =D
01:42HdkR: For their new Gaming while you vacuum initiative. Gets people to clean more frequently
01:42zmike: because FEX su...wait a minute
01:42alyssa: zmike: ouch
01:42HdkR: lol
01:43karolherbst: there is one thing I'm curious about tho, why dyson?
01:43alyssa: IDK ask zmike why should I know
01:44zmike: need a custom mali-based graphics driver for the touchscreens on new vacuum line
01:44alyssa: i literally just announced im done with mali
01:44HdkR: Explains the OpenCL and AI accelerated cleaning
01:44alyssa: as cover
01:44alyssa: for my dyson job
01:44alyssa: right?
01:45psykose: dyson is getting acquired by apple and there's a new apple gpu for the touchscreens
01:45alyssa: ooh now you're thinking
01:45HdkR: A lifestyle company acquiring another lifestyle company
01:45HdkR: Makes sense
01:45alyssa: something something my alternative lifestyle
01:46zmike: accurately describes working on apple hw
01:47kisak: Too bad that iRobot went the way of OpenCV and camera mapping your house for Amazon instead of rendering on a ball. They're calling it the Dyson Sphere Project.
01:48karolherbst: I've arrived at a conclusion: It's impossible that rusticl is wrong, hence radeonsi is broken, case closed
01:48alyssa: no lies detected
01:48alyssa: rusticl is written in rust and radeonsi is written in c
01:48alyssa: pretty obvious who must be at fault
01:48karolherbst: wait a second, I have conclusive evidence
01:49karolherbst: radeonsi still uses LLVM for compilation
01:49alyssa: that wasn't conclusive?
01:49karolherbst: let's say we have both conclusive evidence
01:49alyssa: speaking of conclusive evidence
01:49alyssa:is playing with anholt's compare-perf.sh
01:49alyssa: results are illuminating
01:50alyssa: damn I sure waste a lot of time on optimizations that haven't actually helped.
01:50alyssa: i mean i already knew that but
01:50karolherbst: that reminds me, I have some "benchmarks" which are very good, where rusticl is 128x as fast as intel. Facts
01:50alyssa: now i know that with numbers n=30
01:51karolherbst: I wonder when I'll start doing actual performance optimizing
01:52karolherbst: but seriously, this bug just keeps bugging me and I have no idea what's wrong
01:52alyssa: nod
01:52karolherbst: I wonder if it's something really stupid
01:52alyssa: anholt: btw, the "$i -gt 1" in compare-perf.sh should be "$1 -gt 2" since the 0th run didn't count, otherwise the first ministat invocation will complain
01:52psykose: maybe the vacuum is making too much noise to focus? i hear dyson can help with that, we can get you a sponsorship..
01:53karolherbst: mhhh
01:53HdkR: karolherbst: If something is 100x faster, then something bad is happening on one end
01:53karolherbst: HdkR: you don't want to know, because it's so stupid
01:53karolherbst: all I can say is: rusticl isn't wrong
01:54karolherbst: probably
01:54karolherbst: turns out, there are some silly benchmarks trying to "hide" the fact, they are writing 128 times to the same memory by doing weirdo loops and shit
01:54HdkR: I follow
01:54karolherbst: apparently nir can look behind it
01:54karolherbst: llvm can't
01:54karolherbst: sooo....
01:54HdkR: lol
01:54karolherbst: fun fact
01:55karolherbst: they report memory bandwidth and hope it's not getting optimized away
01:55karolherbst: so apparently I get like TB/s of memory bandwidth on my laptop
01:55HdkR: LLVM isn't a GPU compiler, reaffirmed
01:55HdkR: TB/s of memory BW in the L1 cache maybe
01:56karolherbst: I mean..
01:56karolherbst: if you divide it by 128...
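A rough sketch of the arithmetic behind the inflated figure, assuming a benchmark that issues 128 stores to the same buffer and divides the bytes it believes it wrote by the elapsed time. The function names, buffer size, and runtime are made up for illustration; only the factor of 128 comes from the discussion above.

```python
def reported_bandwidth(buffer_bytes, claimed_writes, seconds):
    """Bandwidth the benchmark prints: it assumes every store reached memory."""
    return buffer_bytes * claimed_writes / seconds


def actual_bandwidth(buffer_bytes, surviving_writes, seconds):
    """Bandwidth achieved once the compiler has dropped the redundant stores."""
    return buffer_bytes * surviving_writes / seconds


# Hypothetical numbers: a 1 GiB buffer, 128 claimed passes, 0.05 s runtime.
BUFFER = 1 << 30
reported = reported_bandwidth(BUFFER, 128, 0.05)
actual = actual_bandwidth(BUFFER, 1, 0.05)

# How much the printed figure overstates what really happened.
inflation = reported / actual
print(f"reported {reported / 1e12:.2f} TB/s, actual {actual / 1e9:.2f} GB/s")
```

If only one of the 128 stores survives optimization, the printed number is too high by exactly the number of claimed writes, which is why dividing by 128 gives a plausible value again.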
01:59karolherbst: I think I need to find somebody at AMD to look into this for me
02:02kisak: It's nice how radeonsi has been quietly (from the news side) preparing for not LLVM as an option.
02:04karolherbst: yeah uhm...
02:04karolherbst: so here is this thing
02:04karolherbst: I need proper function calls
02:05karolherbst: *runs*
02:05HdkR: OpenGL subroutines call to you
02:05karolherbst: not the same thing
02:05HdkR: Basically the same thing if the UBO is just a raw GPU pointer :P
02:06HdkR: er, Uniform variable
02:06karolherbst: kinda sad you don't have a driver in mesa, otherwise I could say things like "wanna throw a bug about not being able to compile a shader with 2.000.000 ssa values with your driver at you?" :P
02:06karolherbst: *wanna me
02:06alyssa: how does aco do with shaders that big OOI?
02:06HdkR: It's an advantage that I have :)
02:07karolherbst: probably the same as LLVM
02:07karolherbst: OOM
02:07alyssa: OOM.. right.
02:07alyssa: we should fix that
02:07alyssa: 2 million isn't that big, there are gigabytes of RAM
02:07karolherbst: so apparently some of those kernels compile on some amount of RAM, like 64GB for instance
02:07alyssa: karolherbst: OOM in the backend or in NIR?
02:07karolherbst: oh, nir is fine
02:08karolherbst: it's RA which goes OOM
02:08karolherbst: well
02:08alyssa: oh
02:08karolherbst: the nir shader is still huge
02:08karolherbst: like 20GB or something
02:08alyssa: we should definitely fix that, aco's RA should be able to cope with just a few million values
02:08karolherbst: obviously
02:08karolherbst: or make nir better in optimizing that stuff
02:09karolherbst: I mean.. some global CSE won't hurt with a shader that big, will it?
02:10karolherbst: but yeah anyway.. turns out inlining functions with thousands of instructions a couple of times does blow up some shaders
02:10karolherbst: I'm still trying to find the person doing the work on the intel backend compiler for this
02:14alyssa: i will not wire up images on asahi for opencl over holiday.
02:14alyssa: i will not wire up images on asahi for opencl over holiday.
02:14alyssa: i will not wire up images on asahi for opencl over holiday.
02:15alyssa: *bart simpson*
02:15karolherbst: well.. you could do it for gl instead :P
02:15alyssa: or vulkan
02:15alyssa: maybe dyson will pay for agxv
02:15HdkR: You could do it for Vulkan and let Zink handle OpenGL for you :P
02:15alyssa: for vacuum dxvk
02:15karolherbst: but yeah...
02:15karolherbst: I should boot up my m2 again...
02:16karolherbst: for uhm...
02:16karolherbst: reasons
02:16alyssa: I'll probably do images in May
02:16karolherbst: how is the compute shader work going?
02:16karolherbst: noice
02:16alyssa: Maybe I'll have funding from the vacuum company by then
02:16karolherbst: passing the CTS in may should be feasible
02:17karolherbst: last time I ran the CTS on agx it was like 50% pass rate
02:17karolherbst: that was in december
02:17karolherbst: though a lot of stuff just passed because it doesn't report "not supported" and if images are disabled...
02:18karolherbst: I think I'll do radeonsi first, because it's like 25 tests to fix left
02:24alyssa: karolherbst: please don't work on rusticl+asahi until I have gles/vk level compute finished off / give the green light, it will end up being stressful for me otherwise i expect
02:24alyssa: in the mean time, delete clover? :fire:
02:24karolherbst: :D
02:24karolherbst: well..
02:25karolherbst: svm is assigned to marge
02:25karolherbst: I wonder if I have enough to make HIP/SYCL run now as well...
03:22robclark: gfxstrand: ir3 can really indirect into register file (or const file)
03:24robclark: large arrays are useful to lower to scratch (we do) since register pressure is a thing.. but small arrays are useful to keep in gpr since we can fold indirect into alu instructions
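The trade-off described above could be sketched as a simple size check. This is only an illustration: the function name and the threshold of 16 elements are hypothetical and are not taken from ir3.

```python
def array_placement(num_elements, max_gpr_elements=16):
    """Pick where an indirectly indexed array should live.

    Small arrays stay in registers so the indirect access can be folded
    into ALU instructions; large arrays go to scratch memory to keep
    register pressure down. The default threshold is a made-up value.
    """
    if num_elements <= max_gpr_elements:
        return "gpr"
    return "scratch"


small = array_placement(4)
large = array_placement(256)
```

A real backend would likely weigh more than the element count (element size, overall register pressure, how often the array is accessed), so treat the single threshold as a stand-in for that heuristic.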
03:25robclark: ahh, I guess emma covered that
03:25robclark:catching up on scrollback
03:32HdkR: RF indexing is still such a cool feature. I wish more things supported that
04:28mdnavare: danvet, daniels, robclark: I am running through my dim set up and when it tries to pull the drm-tip, i see this issue: ./dim setup
04:28mdnavare: Setting up drm-rerere ...
04:28mdnavare: navaremanasi@git.freedesktop.org: Permission denied (publickey).
04:28mdnavare: fatal: Could not read from remote repository.
04:28mdnavare: Please make sure you have the correct access rights
04:28mdnavare: and the repository exists.
04:28mdnavare: dim: Failed to fetch drm-tip
04:39dolphin: mdnavare: does ssh work to people.freedesktop.org?
04:40ishitatsuyuki: HdkR, do you know some examples of code that can be lowered to RF indexing in practice? AMD ISA also has some support for it, so maybe we should have it in ACO too
04:44HdkR: ishitatsuyuki: I used to have a good example like half a decade ago but I forget what it was now
04:44ishitatsuyuki: oof
04:47mupuf: anholt: yes, we do automatically reboot before trying again. This is useful for the navi10, but maybe we can drop that on navi21
04:48mupuf: anholt: set B2C_SESSION_REBOOT_REGEX to an empty string if you want to disable the behaviour
04:51HdkR: ishitatsuyuki: I think it was something in an uber shader that generated a bunch of state in to RF and then later stages would be more optimal because it could then just index the various bits that were necessary?
04:52HdkR: But that's so vague at this point that the idea of storing 16 vec4s in registers or whatever doesn't really help
04:52ishitatsuyuki: it does feel a bit vague indeed
04:52ishitatsuyuki: but let me keep some of that in mind
04:53HdkR: If you find any shaders generating some large thread local array, gimme a link so I can refresh my mind on it at some point :P
05:18mdnavare: dolphin: Should i try to ssh to prople.freedesktop.org and check?
05:18mdnavare: people.freedesktop.org
05:18dolphin: yeah, the basic SSH setup needs to work first, seems like it doesn't
05:20mdnavare: Nope it doesnt I get the same denied error: Permanently added 'people.freedesktop.org' (ED25519) to the list of known hosts.
05:20mdnavare: navaremanasi@people.freedesktop.org: Permission denied (publickey).
05:20mdnavare: navaremanasi@navaremanasi:~$
05:21dolphin: Host *.freedesktop.org\n User dolphin\n IdentityFile ~/.ssh/id_freedesktop
05:21mdnavare: so basically i had the set up working and also commit right with my keys i had on my intel linux laptop, now that i moved to Google I am trying to set up the dim tools and it shows denied
05:21dolphin: does your SSH config have something like that? of course with your freedesktop user login :)
05:22dolphin: I think you may have to supply a new ssh public key to FDO admins to update to your account
05:23mdnavare: Hmm I dont have that entry in my ssh config either
05:24mdnavare: Let me try requesting FDO admins to update my account first, how do i do that? I dont remember top of my head
05:26mdnavare: dolphin: Should i just file a bug at fdo and ask the admins to update my account with new ssh public key?
05:26dolphin: https://www.freedesktop.org/wiki/AccountMaintenance/
05:26dolphin: Basically yes, there used to be more detailed instructions but see the above page
05:44mdnavare: dolphin: Created a ticket, danvet : Assigned to you
05:44mdnavare: Thank you so much
06:10dolphin: mdnavare: you probably meant to assign to daniels instead?
06:11danvet: mdnavare, you need to assign to site wranglers, I don't do fdo admin :-)
08:41dj-death: I guess turnip also burns the embedded sampler properties in the shader right?
08:42dj-death: since it's kind of derived from the same hw architecture
09:37emersion: is it possible to have optional sub-passes in vulkan?
09:37emersion: ie, i have a render pass with two sub-passes, and i want to skip the first (it's a no-op)
09:39emersion: let me rephrase: sometimes i need two sub-passes, sometimes only one, do i always need to create two command buffers and submit both?
10:30emersion: radv not happy about VkSubmitInfo.commandBufferCount = 0
11:09ishitatsuyuki: hm, that should be fine
11:17siddh: Hello, can someone review my logging macro patches? I sent them a while ago. Thanks.
11:17siddh: https://lore.kernel.org/dri-devel/1875fe8015d.134ced49297190.1370743243402238612@siddh.me/
11:47zmike: eric_engestrom: how is the branching and is it yet time to start backporting to 23.1
11:50eric_engestrom: zmike: the branchpoint happened yesterday morning, but then I had issues merging !22260 that needed to go in before the first -rc, which got merged this morning
11:50eric_engestrom: I'm a bit busy so I won't post the -rc1 and email until after work today, but the staging/23.1 branch is now at the state that will be -rc1, so you can start doing things with it already (testing, backport MRs, etc.)
11:51zmike: oh cool
11:51eric_engestrom: for backport MRs I won't merge them until I've posted -rc1 though
11:51zmike: hm I have a lot of patches that I was just going to backport directly
11:51zmike: or is that not allowed anymore
11:52eric_engestrom: no, please make an MR so that the CI verifies we don't backport too many bugs :)
11:53zmike: gasp
11:53zmike: how dare you
11:53zmike: I ONLY backport bugs
11:53eric_engestrom: but also, if you `cc: mesa-stable` or `fixes: abcdef123` it will automatically be handled
11:53zmike: yea, I have a bunch of patches without that
11:53eric_engestrom: I didn't wanna say it :P
11:53zmike: :P
12:48emersion: ishitatsuyuki: thanks, it turned out to be an issue in my code
12:48ishitatsuyuki: good to hear it's resolved
12:48emersion: a bit sad that the validation layers didn't complain about my timeline sema value being 0
12:48gawin: DavidHeidelberg[m]: fixes is better than cc:stable? I was always slapping second one.
12:49DavidHeidelberg[m]: gawin: it should be, someone told me that Fixes automatically pick for backports
12:49emersion: fixes is better, it allows the fix to be propagated only to the trees that need it
12:50kisak: Fixes: <commitid> gets read as, if release branch has <commitid>, then pick this up too.
12:51kisak: particularly useful near branchpoints
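The rule kisak describes can be sketched as a small predicate. This is a toy model and not the actual Mesa release tooling: the function name and the commit hashes are hypothetical, and a real script would ask git whether the commit is an ancestor of the branch instead of scanning a list.

```python
def should_pick(fixes_sha, branch_commits):
    """Return True if the release branch contains the commit being fixed.

    fixes_sha may be an abbreviated hash, so it is compared as a prefix.
    """
    return any(sha.startswith(fixes_sha) for sha in branch_commits)


# Hypothetical commit hashes, for illustration only.
release_branch = ["abcdef1234567890", "0123456789abcdef"]

pick_a = should_pick("abcdef123", release_branch)  # broken commit is on the branch
pick_b = should_pick("deadbeef", release_branch)   # broken commit never landed there
```

This is why the tag is handy near a branchpoint: a fix for a commit that landed after the branch was cut is skipped automatically, while a plain stable tag would nominate it regardless.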
12:55jenatali: My rule of thumb is if it was regressed sometime in the last year, use fixes, otherwise don't bother trying to search for the right commit
12:59zmike: smh not having every commit memorized by the release it shipped in so you know whether to use fixes
13:02emersion: there are only a few hundred thousand commits after all
13:08haasn: in 2023, is there any way to get actual, working VRR in GL/Vulkan/whatever apps on Linux?
13:08haasn: ideally not requiring VkDisplayKHR?
13:08haasn: on mesa drivers, I mean
13:10emersion: that's up to the compositor
13:10emersion: many compositors do support it
13:10haasn: sway?
13:10haasn: it works in sway just moving the mouse cursor but doesn't work inside GL apps
13:11emersion: if the GL app is never late to submit a frame, it'll just use the full framerate
13:12haasn: I'm testing with https://github.com/Nixola/VRRTest
13:12haasn: maybe somebody with working VRR can verify that this app works as it should?
13:12haasn: because on my end I get stuttering motion and duplicated frames
13:13haasn: even though monitor says FreeSync Premium is enabled and I can see the fps jump around when I just move cursor (from 48 -> 175)
13:13haasn: but maybe it's a problem of this specific test
13:13haasn: in which case, is there a better test I could run?
13:14emersion: you can get the raw KMS timings with https://github.com/emersion/drm_monitor
13:14psykose: seems to work for me https://img.ayaya.dev/SaMv9FldXJuC
13:14psykose: no stutters or jumps or anything
13:15emersion: iirc, there was a recent AMD bug where VRR wouldn't work with direct scanout and some modifiers
13:15emersion: maybe try disabling direct scanout to see if you're hitting that?
13:15haasn: emersion: where can I set that?
13:16emersion: https://github.com/swaywm/sway/wiki/Debugging-flags
13:16emersion: -D noscanout
13:17zamundaaa[m]: FYI it works in KWin, with direct scanout, modifiers and all
13:21haasn: huh, how odd
13:22haasn: it works in *windowed* mode, but not in fullscreen
13:22haasn: in fullscreen the display always reports 175 Hz even if I set rendering to 150 Hz
13:22haasn: in windowed mode, at 150 Hz the display also reports a number wildly fluctuating between 147 and 153 Hz or so
13:22haasn: that's.. the exact opposite of what I'd expect
13:23haasn: but it's still visually stuttering, and the second scene test (+ long exposure camera) confirms that it's duplicating frames
13:23haasn: so somehow even though it's supposed to be VRR the compositor is sending out duplicate frames sometimes/somehow
13:25haasn: but visually it's very unsmooth in both windowed and fullscreen
13:26haasn: emersion: oh, yeah, -D noscanout makes fullscreen behave the same as windowed
13:27emersion: aha
13:27haasn: i.e. the monitor reports the display fps responding to the render speed, but I still get visual stutter and duplicated frames
13:55haasn: Seems to all work fine under KWin/Plasma also
13:57haasn: I think it works better in actual games?
13:57haasn: Or at least it’s harder to spot duplicated frames there
14:10haasn: Oh nice, after switching from Plasma Wayland back to X11 I get flickering garbage at the bottom
14:13haasn: Seems X11 doesn't want to do freesync at all anymore though
15:44tleydxdy: follow-up on the drm-file pid issue
15:44tleydxdy: if the game allocates vram, would it only be added to the game's vm list? or would X also have access to all
15:46MrCooper: karolherbst: looks like rusticl's fill_image/create_box is setting pipe_box::depth = 0 for 2D regions, should be = 1
15:47karolherbst: ohh really...
15:47karolherbst: yeah, that's not good :D
15:48karolherbst: mhhh
15:48karolherbst: yeah.. I think I need to wrap creation of pipe_box and require a target for those fix ups
15:50karolherbst: yeah.. there are some other places like that where it could be a potential problem
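A minimal sketch of the kind of wrapper being discussed, assuming the convention MrCooper points at: dimensions a target does not use get an origin of 0 and an extent of 1, never 0. The `Box` class, the target names, and `make_box` are hypothetical stand-ins and not rusticl's or gallium's real API; array targets are left out.

```python
from dataclasses import dataclass

# Number of meaningful dimensions per (hypothetical) target name.
DIMS = {"1d": 1, "2d": 2, "3d": 3}


@dataclass
class Box:
    x: int
    y: int
    z: int
    width: int
    height: int
    depth: int


def make_box(target, origin, region):
    """Build a box, forcing unused dimensions to origin 0 and extent 1."""
    dims = DIMS[target]
    # Keep only the dimensions the target uses, then pad the rest.
    ox, oy, oz = (list(origin[:dims]) + [0, 0, 0])[:3]
    w, h, d = (list(region[:dims]) + [1, 1, 1])[:3]
    return Box(ox, oy, oz, w, h, d)


# A 2D region passed in with depth 0 comes out with depth 1.
box_2d = make_box("2d", (0, 0, 0), (64, 32, 0))
```

Requiring the target at construction time means every caller goes through the same fix-up, instead of each call site having to remember it.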
15:54jenatali: zmike: That failure is from st_TexImage for a multisampled texture. Is that supposed to work?
15:54zmike: yes
15:55zmike: it seems to on every other driver
15:55jenatali: 🤔
15:55zmike: indeed
15:56jenatali: That's... not something I can easily make work
15:57zmike: then how is the test not failing at present?
15:58jenatali: 'cause it's not trying to CPU upload to an MSAA texture. The null texture that was created was non-MSAA
15:59zmike: I see
16:01jenatali: I haven't looked to see why that worked in the first place, it might've been working by pure accident
16:01jenatali: But this CPU upload of an MSAA texture is not going to work, we'd need to blit it. At which point we'd really prefer to just get a clear rather than a texture upload
16:02zmike: tbf I haven't looked at the test either
16:02zmike: I just saw that you were the only one failing
16:03jenatali: I think it's a coincidence, the test binds a program and does a clear and a readback, and it's just that during the readback Mesa sets up the state for the program and creates this null texture
16:03zmike: ah ok
16:03jenatali: And while trying to init the null texture we hit an error condition
16:04jenatali: Like... I could skip the copy and the test would probably work
16:04zmike: hm so it might be the case that I need a special condition for the texsubdata there with ms
16:04zmike: I'll look into it
16:04jenatali: Yeah, I think so
16:04zmike: thx for your expert debugging
16:04jenatali: Heh I wouldn't go that far. Run test under debugger, see debug message, report debug message to you :P
16:05zmike: E X P E R T
16:37dcz_: what hardware does rusticl support?
16:38dcz_: is it all models covered by radeonsi?
16:44mdnavare: daniels:
16:45mdnavare: daniels: Could you help me update the ssh public keys for my fdo account so I can re setup my dim
17:20daniels: mdnavare: yep I did see that and I’ll change it later
18:02mdnavare: Okay thank you so much daniels
18:41ChaosPrincess: hi all. I am writing a display driver, and my hardware and linux have differing ideas on which order the pixels should be in memory, with mine scanning out from the bottom left corner and then going up, and then right at the end of the row. as a result, the image ends up being turned 90 degrees anticlockwise (kinda). Is there a way to make drm rotate everything?
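To make the mismatch concrete, here is a small sketch of the reordering as read from the description above: the hardware is assumed to read its buffer column by column, each column from the bottom pixel to the top, with columns going left to right. The function name is hypothetical, and this only illustrates the index mapping, not how a DRM driver would implement it.

```python
def to_scanout_order(rows):
    """Reorder a row-major image into the assumed hardware scanout order.

    rows: list of rows from top to bottom, each a list of pixels from
    left to right. Returns a flat list in the order the hardware reads.
    """
    height = len(rows)
    width = len(rows[0])
    out = []
    for x in range(width):                    # columns, left to right
        for y in range(height - 1, -1, -1):   # each column, bottom to top
            out.append(rows[y][x])
    return out


# A 3x2 image: top row 1 2 3, bottom row 4 5 6.
image = [[1, 2, 3],
         [4, 5, 6]]
scanout = to_scanout_order(image)
```

Read back as rows of `height` pixels, the result is the original image rotated 90 degrees clockwise, which matches the observation that an unmodified row-major buffer shows up turned anticlockwise.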
19:33alyssa: jenatali: look into u_transfer_helper's msaa_map?
19:34jenatali: alyssa: I don't think D3D allows copies between MSAA resources and buffers, which is how upload/download is done
19:35jenatali: Any kind of workaround would involve a second hop through a single-sampled resource, either a blit or a resolve
19:35jenatali: Which, if we have to do it because GL says so, then fine, but considering we're not doing it yet, uploading a single black texel doesn't seem like the reason to implement that kind of thing
22:02karolherbst: MrCooper: actually... it doesn't because the API validation makes sure it's 1 not 0, or at least should be
22:03karolherbst: but mh.. maybe fill_image/fill_buffer doesn't care? let me see...
23:16eric_engestrom: ChaosPrincess: with the caveat that I don't know what I'm talking about, I think you're looking for https://www.kernel.org/doc/html/latest/gpu/drm-kms.html#c.drm_connector_set_panel_orientation
23:37ChaosPrincess: Already tried, didnt help.