09:45Quibus: hi
14:53rasterman: o/
14:55rasterman: so... 890m. white screen/overlay plane of death ... known problem with any known patches?
15:08Remco: If it's a known problem, it's probably on gitlab
15:08rasterman: i assumed my googling would pick it up on gitlab or elsewhere
15:11Remco: Does it happen when you do anything specific?
15:11rasterman: it was kind of random but... just now i've made a reliable repro
15:12rasterman: kmscube... let kmscube exit
15:12rasterman: boom. white screen :)
15:12Remco: Anything in system logs?
15:13rasterman: (or well specifically it's some overlay plane that is almost all white with perhaps the top ~10-20 pixels clear so can see the planes below - e.g. the vt console text and i can hit enter to keep it scrolling/changing below)
15:13rasterman: no logs :/
15:13rasterman: sometimes i get
15:13rasterman: [ 1237.600096] [drm] DMUB notification skipped due to no handler: type=NO_DATA
15:13rasterman: but its not "every time i see the white overlay"
15:13rasterman: i can get rid of the white overlay by starting xorg ...
15:13rasterman: weston doesnt clear it off though.
15:14rasterman: i am guessing it's some overlay plane control regs being left in or set to some garbage state
15:14rasterman: but it seems at least processes from userspace that should fix it ...don't
15:14rasterman: and on the odd occasion running x hasn't cleared it up in the past
15:15rasterman: i also am having fun with complete lockups of rendering in compositing and then entire kernel lockups too
15:15rasterman: so i'm trying to knock off the problems now one at a time.
15:17Remco: Which mesa/amdgpu versions?
15:17rasterman: https://gitlab.freedesktop.org/drm/amd/-/issues/3800 <- is kind of some other stuff i'm seeing ... kind of.
15:17rasterman: strix point APU. 890m..
15:17rasterman: 6.12
15:18rasterman: mesa 1.24.2.5
15:18rasterman: tho tbh not sure its mesa ... it smells lower down
15:19rasterman: methinks i need to add some more debugging
15:20soreau: maybe try an olderish kernel
15:20rasterman: i had this on 6.11 too
15:21soreau: could libdrm be an issue?
15:21rasterman: as the machine is quite new havent been on anything older :)
15:21rasterman: hmm i doubt its libdrm
15:21rasterman: if kmscube is exiting then fbcon should handle any plane resets itself...
15:22rasterman: should be™
16:22rasterman: and some logs finally of some worth
16:22rasterman: https://gitlab.freedesktop.org/drm/amd/-/issues/3839
16:22rasterman: https://gitlab.freedesktop.org/drm/amd/-/issues/3840
16:40fililip: yeah strix seems to have lots of different kinds of display issues :(
16:40fililip: every other component works perfectly fine though
16:41MrCooper: rasterman: I've got some bad news for you about fbcon
16:41MrCooper: it doesn't reset anything other than what it needs itself
16:41rasterman: MrCooper: i thought it's the job of whoever owns the vt next to set everything to what they need?
16:42rasterman: eg x, or your wl compositor etc.
16:42rasterman: thus ... extend that to fbcon vt...
16:42MrCooper: that's not the current reality of KMS
16:43MrCooper: I don't disagree though
16:43rasterman: so if whatever was owning it e.g. segv's - you cant get to something useful without like 67 overlay planes still on :)
16:44rasterman: actually i wonder if the problem with kmscube is... it by luck uses an overlay plane?
16:44rasterman: and when it exits that's left on?
16:44MrCooper: possible in theory, not sure what it would use an overlay plane for though
16:45MrCooper: if that's it, it should be visible in drm_info output
16:45rasterman: rendering egl surface target :)
16:46rasterman: tho i see this overlay plane sometimes appear on starting x
16:46MrCooper: primary plane should suffice for a single fullscreen surface
16:46rasterman: thus my ~/.xinitrc has a chvt 1; sleep 1; chvt 7
16:46rasterman: yeah - primary should...
16:47rasterman: the problem more is that this is generally random-ish EXCEPT with my kmscube example
16:47rasterman: which makes me lean to some garbage internal state?
16:48MrCooper: offhand I rather doubt it's explicit overlay plane usage, though e.g. amdgpu now internally uses overlay planes for the cursor plane in some cases
16:48rasterman: yeah - i think its something internal to the kernel driver state and maybe a plane...
16:49rasterman: but it could also be the primary plane being somehow cut short after N amount of data
16:49rasterman: and then the output just displaying all 0xff's for everything when it cant read any data?
16:52rasterman: geez
16:53rasterman: 10 planes
16:53rasterman: ooh and now the white plane is back by just stopping x
16:56rasterman: ok. x was using plane 6 for the mouse... pretty obvious. it's 256x256
16:57rasterman: but then in fbcon it's not used - width, height and fb_id are 0
16:58rasterman: and plane 3 is the primary it seems
16:59rasterman: we switched from swizzled tiles to linear but it otherwise looks sane. fb_id. 2880x1800. pitch is correct for 32bit.
16:59rasterman: all other planes seem to be off
17:00rasterman: (fb_id 0 and 0x0 geoms etc.)
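
The "pitch is correct for 32bit" check above can be reproduced by hand. This is a minimal sketch, assuming a linear framebuffer whose rows are padded up to some byte alignment; the helper name `expected_pitch` and the 256-byte alignment used in the second call are illustrative assumptions, not values reported in the log.

```python
def expected_pitch(width_px, bits_per_pixel, align_bytes=1):
    """Bytes per scanline of a linear framebuffer, rounded up to align_bytes."""
    # Round the row size up to a whole number of bytes.
    bytes_per_row = (width_px * bits_per_pixel + 7) // 8
    # Round up to the next multiple of the (assumed) row alignment.
    return -(-bytes_per_row // align_bytes) * align_bytes


# The 2880x1800 panel from the log at 32 bits per pixel.
print(expected_pitch(2880, 32))
# Same width with a hypothetical 256-byte row alignment.
print(expected_pitch(2880, 32, align_bytes=256))
```

A pitch smaller than width times bytes per pixel would mean the scanout reads past each row, so this is a cheap sanity check before suspecting the framebuffer itself.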
17:02rasterman: and i have drm_info dumps from a working fbcon to broken white
17:03rasterman: the same except mode id blob...
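
Comparing two saved dumps like this can be done mechanically. The sketch below uses Python's `difflib` to list only the lines that differ between a "working" and a "broken" capture. The sample text is made up for illustration and does not reproduce real drm_info output; `changed_lines` is a hypothetical helper name.

```python
import difflib


def changed_lines(before, after):
    """Return only the removed ('-') and added ('+') lines between two text dumps."""
    diff = list(difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0))
    # The first two entries are the '---' / '+++' file headers; skip them,
    # then drop the '@@' hunk markers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Made-up sample dumps: only the mode blob id differs.
working = "Plane 3\n  FB_ID: 101\n  MODE_ID: 90\n"
broken = "Plane 3\n  FB_ID: 101\n  MODE_ID: 95\n"

for line in changed_lines(working, broken):
    print(line)
```

If the only difference is a blob id, as reported above, the userspace-visible KMS state is effectively identical and the fault is more likely below that level.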
17:08MrCooper: it sounds like it's most likely just some amdgpu DC bug
17:09MrCooper: if it's an internal panel, you could try disabling PSR / Panel Replay
17:09rasterman: yeah- but... where? :)
17:09rasterman: it is an oled internal panel indeed
17:10rasterman: that's self refresh?
17:10MrCooper: yep
17:10rasterman: oh it's actually using it?
17:10rasterman: tho i have to say some drm debug is making me unhappy
17:10MrCooper: not sure, many modern laptops support some variant of it though
17:11rasterman: the debug is saying its getting update rects for the entire screen
17:11rasterman: when i know compositor is swapping buffers with damage sub-rects
17:12rasterman: in theory with the oled panel you can do partial updates with that correct info
17:12fililip: it would need to support PSR-SU for that, and my lenovo laptop for instance does not support it even though it has an oled panel
17:13rasterman: oh wait... this is an xorg problem it looks like
17:13rasterman: update dirties are correct with weston
17:15rasterman: hmm but wrong with e in wl mode.. ugh.. i swore we swapbufferswithdamage...
17:21MrCooper: that doesn't translate to KMS properties with the GBM platform, the compositor needs to use the KMS properties directly
17:21rasterman: yeah... we have the logic to even use swapbufferswithdamage.. never noticed - it doesnt work. :)
17:22rasterman: it was all there duplicated from everywhere else assuming it all worked
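
For context on "use the KMS properties directly": the property in question is presumably the plane's `FB_DAMAGE_CLIPS` (an assumption, the log does not name it), which takes a blob containing an array of `struct drm_mode_rect` (four signed 32-bit values x1, y1, x2, y2, with x2/y2 exclusive). A minimal sketch of building that blob payload from compositor damage rectangles follows; `pack_damage_clips` is a hypothetical helper, and creating the blob and attaching it in an atomic commit is left out.

```python
import struct


def pack_damage_clips(rects):
    """Pack (x, y, w, h) damage rects into an array of drm_mode_rect structs.

    Each drm_mode_rect is four native-endian signed 32-bit ints:
    x1, y1 (inclusive) and x2, y2 (exclusive).
    """
    blob = bytearray()
    for x, y, w, h in rects:
        if w <= 0 or h <= 0:
            continue  # skip empty rectangles
        blob += struct.pack("=4i", x, y, x + w, y + h)
    return bytes(blob)


# One damaged 30x40 region at (10, 20); the second rect is empty and dropped.
payload = pack_damage_clips([(10, 20, 30, 40), (5, 5, 0, 3)])
print(len(payload), struct.unpack("=4i", payload))
```

The resulting bytes are what a compositor would hand to a property blob before setting it on the plane; without that step the kernel has no per-frame damage and treats the whole plane as updated.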
17:31rasterman: ok. found the psr debug mask bit
17:31rasterman: let's see
17:33rasterman: same as before. psr fully off - still same problem