00:00 imirkin: 00:00:00.032 [wlr] [backend/x11/backend.c:418] X11 does not support required DRI3 version (has 1.0, want 1.2)
00:00 emersion: yeah, it just crashes on startup for me
00:00 imirkin: sigh
00:00 emersion: eh
00:00 imirkin: what's 1.2 got that 1.0 doesn't?
00:00 emersion: i was testing with the modesetting driver
00:00 imirkin: i'll add it to xf86-video-nouveau
00:00 imirkin: modesetting isn't so great with nouveau
00:00 imirkin: i wouldn't recommend it for "production"
00:01 emersion: lists of modifiers, and… buffers with multiple DMA-BUF planes i think?
00:01 imirkin: (in spite of this, major distros appear to patch the X server to prefer modesetting over nouveau. o well.)
00:01 imirkin: hrmph
00:01 emersion: i should just patch wlroots to accept 1.0
00:01 emersion: it seems like nothing except modesetting implements 1.2
00:01 emersion: i can do that tomorrow
00:02 imirkin: i'll check into it, although multiple planes are definitely not supported
00:02 imirkin: and modifiers aren't going to work with xf86-video-nouveau
00:02 imirkin: at least ... i don't think they will. would have to re-read
00:02 emersion: the downside is that i'll just assume what formats are supported, and error out on multi-planar
00:02 imirkin: well, you should assume linear + no multi-planar
00:03 imirkin: coz that's what i'd say if i made xf86-video-nouveau "support" 1.2
00:03 emersion: hm, i'll re-read and see
00:03 emersion: in the meantime, you should be able to just Ctrl+Alt+F2, start sway, let it crash, and investigate, if you wanted to
00:04 imirkin: yeah, but i want gdb/etc
00:04 emersion: yeah, makes sense
00:04 emersion: you could try the headless backend
00:04 imirkin: sure, let's try that
00:04 emersion: export WLR_BACKENDS=headless
00:04 emersion: then start it
00:04 imirkin: segfault
00:04 imirkin: yay
00:04 emersion: segfault or abort?
00:04 imirkin: backend_get_drm_fd (backend=0x5555555fd040) at ../subprojects/wlroots/backend/backend.c:76
00:04 imirkin: 76 if (!backend->impl->get_drm_fd) {
00:04 emersion: bleh
00:05 imirkin: (gdb) p backend->impl
00:05 imirkin: $2 = (const struct wlr_backend_impl *) 0x0
00:05 emersion: sounds like it's my fault
00:05 imirkin: maybe i'm not building something it needs? dunno. i should have egl/gbm/etc though
00:08 emersion: ok, it's due to a recent commit
00:08 emersion: i have a fix
00:09 imirkin: ah ok
00:09 imirkin: well, if you push it out, lmk - i have a few things to take care of in the meanwhile
00:10 emersion: just pushed -- you should be able to pull, build, run
00:17 imirkin: sway or wlroots? (or both)
00:18 imirkin: ok, now it 'works', but it just exits
00:18 imirkin: 00:00:00.117 [wlr] [render/gles2/renderer.c:897] GL renderer: llvmpipe (LLVM 11.0.0, 128 bits)
00:18 imirkin: ah right
00:19 imirkin: hmmm
00:19 imirkin: i should have access to render nodes though
00:19 imirkin: (as my user)
00:21 imirkin: aha. looks like it picks my TNT2 board, which doesn't have GLES
00:21 imirkin: how do i tell it to use a particular render node?
00:22 imirkin: emersion: --^
00:39 imirkin: i tried WLR_DRM_DEVICES=/dev/dri/renderD128, but doesn't seem to do the trick
09:49 emersion: imirkin: sorry about the bumpy ride. here are some extra patches:
09:49 emersion: DRI3 1.0: https://github.com/swaywm/wlroots/pull/2655
09:50 emersion: select DRM node used by headless backend: https://github.com/swaywm/wlroots/pull/2656
09:51 emersion: i thought i'd need to do more weird workarounds for DRI3 1.0, but it's not that bad at all
11:00 pmoreau: Do we already have a driver CB when doing compute on Tesla? I would like to get `get_work_dim()` (to get the dimension of the grid) to work there, and it does not seem to be exposed by the hardware.
11:34 pmoreau: Ahhh, all the fun stuff with Tesla: if you have over 0x100 bytes of arguments to a kernel, the first 0x100 bytes are stored in shared memory and the rest gets put in a constant buffer apparently.
11:34 pmoreau: I wonder if Tesla might actually have 16,656 bytes of shared memory to accommodate those extra 0x110 bytes used by the driver/card, rather than the exposed 16,384 bytes.
15:48 sautax: Hello, i’m trying to use a refresh rate greater than 60Hz on my screen (it supports up to 240Hz) but when i try to set a high frequency only a part of the screen is shown. I saw in the troubleshooting guide i need to raise the performance of the card, is it worth the risk ?
15:50 sautax: i’m running Arch linux with the nouveau driver, gnome on wayland. My card is a RTX2070 super
16:00 imirkin: sautax: can you pastebin your edid?
16:00 imirkin: or even better, upload it to https://people.freedesktop.org/~imirkin/edid-decode/ and pastebin that
16:00 sautax: uhhh, do you have a guide to get  it ?
16:00 imirkin: you can get it from /sys/class/drm/card0-conn/edid
16:01 imirkin: only a part of the screen being shown implies that it's a multi-tile screen
16:01 imirkin: and the second (or first) tile didn't get modeset properly
16:02 sautax: when i change the frequency the displayed part ratio changes
16:03 imirkin: that's quite odd
16:03 imirkin: if it's not multi-tile, then we're screwing something up pretty royally
16:03 sautax: like when i’m at 200Hz only 1/10 of the screen is showing whereas at 120Hz 2/3 of the screen is showing
16:03 imirkin: since the support for these is new and relatively untested (relative to the older boards), that's also not impossible
16:06 sautax: here is the edid : https://pastebin.com/MSTyAJLM
16:07 imirkin: sautax: ok, looks like a pretty vanilla monitor, other than the high frequencies.
16:09 sautax: should i try to disable scaling ?
16:11 imirkin: emersion: thanks for dealing with my esoteric setup ;) i suspect not a lot of wlroots users are trying to host it inside X, and even fewer inside non-modesetting X :)
16:11 imirkin: sautax: scaling?
16:12 sautax: in the troubleshooting page with the title "My custom video mode is not effective"
16:13 sautax: it is suggested to run an xrandr command
16:14 sautax: but i don’t know if it is the same problem described
16:20 imirkin: sautax: that troubleshooting page is worthless
16:20 imirkin: it *might* have been useful 10y ago
16:20 sautax: oh
16:20 imirkin: emersion: ok, so now it starts and exits ... how do i get it to actually run something?
16:21 sautax: i regret buying a nvidia card ...
16:22 imirkin: good.
16:22 imirkin: hopefully your next purchase will send money to a company that cares about linux
16:23 sautax: i’m still in the return window of my card
16:23 sautax: but the graphics card marked is very strange right now
16:23 sautax: market *
16:27 imirkin: well, i don't know what your issue is, but it's *likely* that we can work out the problem and fix it
16:27 imirkin: however that won't address the fact that you bought an expensive space heater
16:28 sautax: yes x)
16:29 imirkin: (which isn't even good at space-heating, with all this power management stuff)
16:31 sautax: when i’m gaming on it (on windows...) it can heat up my little bedroom x)
16:32 imirkin: so if on linux you're really only looking for modesetting, chances are reasonable that we can fix it
16:32 imirkin: esp if you're willing to test patches, and maybe make some traces of blob
16:32 sautax: i like testing things
16:33 imirkin: like kernel patches
16:33 sautax: i’m ok with it, i have a backup distro on another drive
16:34 imirkin: well, it's more about "are you comfortable building your own kernel"
16:34 imirkin: different people are at different technical levels, not sure what yours is
16:34 sautax: i tried to install gentoo 2 times so i know a little about it
16:34 sautax: and i’m not afraid of it
16:34 imirkin: ok cool
16:34 imirkin:has been using gentoo since ~2005
16:35 imirkin: actually earlier i guess. that was when i switched my main desktop
16:35 imirkin: after the atlas 10k3 died. much sad.
16:36 sautax: i stopped when i encountered a modesetting problem :/
16:36 imirkin: those were simpler days. single output, user mode setting, no acceleration
16:36 imirkin: (beyond Xv)
16:37 sautax: but i wasn’t born yet x)
16:37 imirkin: that ... definitely doesn't make me feel old... sigh
16:37 sautax: sorry '=D
16:37 imirkin: it'll happen to you too
16:38 sautax: yeah that’s life
16:38 imirkin: anyhoo
16:38 sautax: what do i download ?
16:38 imirkin: does the monitor have a HDMI connector?
16:38 sautax: yes
16:38 imirkin: coz i bet this will all work better over HDMI
16:39 sautax: but if i use the hdmi for this screen i will lose my dual screen setup
16:39 imirkin: oh.
16:39 imirkin: coz the other screen doesn't have DP and you don't have 2 hdmi connectors on the board?
16:40 sautax: no
16:40 sautax: i mean, you’re right
16:40 imirkin: and i don't suppose you have a DP -> HDMI round hole/square peg adapter?
16:40 imirkin: (all DP ports these days are DP++)
16:41 sautax: i have a miniDP to hdmi...
16:41 sautax: but no DP to hdmi
16:41 imirkin: but no DP -> miniDP? :)
16:41 sautax: no
16:41 imirkin: much sad.
16:41 imirkin: well - it'd be something to try
16:41 imirkin: if it works, that's a good data point
16:41 emersion: imirkin: isn't it SIGABRT'ing?
16:41 imirkin: i suspect you can live without the second screen for a few minutes
16:42 sautax: yes
16:42 imirkin: emersion: i applied your DRI3 patch
16:42 sautax: i was about to say that
16:42 imirkin: emersion: now it starts normally and just exits
16:42 imirkin: emersion: http://paste.debian.net/plainh/328225f6
16:43 emersion: hm, that's unexpected. it should open a window and not exit
16:43 imirkin: i never see a window even flash
16:44 imirkin: (but i suppose it could be SUPER fast, e.g. within one vrefresh period)
16:44 emersion: hm, can you try starting with "-d"?
16:44 emersion: it should enable debug logs
16:44 imirkin: fwiw this is what i was getting with headless too
16:45 imirkin: even when it picked llvmpipe
16:45 imirkin: emersion: http://paste.debian.net/plainh/3944373b
16:45 imirkin: (type type = bla stuff is just debug reporting of shader compilation stats)
16:46 imirkin: do i need a config of some sort?
16:46 imirkin: (do i maybe have a config which is screwing things up? hm)
16:46 emersion: oh, it's dumb. it's missing the config file. maybe try -c builddir/config
16:47 emersion: the error message could definitely be clearer
16:47 imirkin: 00:00:00.143 [sway/commands/output/background.c:122] Unable to access background file '/usr/local/share/backgrounds/sway/Sway_Wallpaper_Blue_1920x1080.png': No such file or directory
16:47 imirkin: /bin/sh: swaynag: command not found
16:47 emersion: this shouldn't be fatal, or is it?
16:48 imirkin: it's not
16:48 imirkin: now i just have a window with contents that don't refresh to anything
16:48 imirkin: i.e. it has stale old contents
16:48 imirkin: how do i do something?
16:48 imirkin: is present on nouveau broken? :)
16:49 emersion: you can start something in it by doing `WAYLAND_DISPLAY=wayland-1 <command>`
16:49 emersion: e.g. a gtk3 program
16:50 imirkin: mmmmmm
16:50 imirkin: i probably have like -wayland set
16:50 emersion: i'll try later today to reproduce with xf86-video-nouveau, i haven't tried
16:50 imirkin: is there some simple standalone thing i can get/build?
16:50 imirkin: i have weston - that has some terminal app i think?
16:50 emersion: yes, try weston-simple-egl
16:50 imirkin: do you know what it's called?
16:51 emersion: or weston-terminal
16:51 imirkin: it starts, but the window is stuck
16:52 emersion: ok. not sure what's going on then, i'll try to reproduce
16:52 imirkin: i don't see any Xorg logs, so at least the errors aren't obvious
16:52 emersion: are you running with mesa master?
16:52 imirkin: yes
16:52 emersion: hm
16:52 imirkin: (maybe a couple commits behind)
16:53 sautax: imirkin: i have connected my monitor with hdmi, what should i download ?
16:53 imirkin: 00:00:00.382 [wlr] [backend/x11/backend.c:672] Unhandled X11 event: 21
16:53 imirkin: 00:00:00.383 [wlr] [backend/x11/backend.c:672] Unhandled X11 event: 19
16:53 imirkin: emersion: --^
16:53 imirkin: sautax: nothing - does it work with higher frequencies?
16:54 sautax: i tried and it doesn’t work
16:54 emersion: hm, yeah, hard to say whether these are important
16:54 imirkin: sautax: same problem?
16:54 sautax: yup
16:54 imirkin: sautax: thanks, that's useful to know
16:55 imirkin: emersion: the full -V -d log: http://paste.debian.net/plainh/f2538e4e
16:55 imirkin: in case you see something unexpected
16:56 imirkin: emersion: hm, the fact that it has a modifier is slightly worrying, no?
16:56 imirkin: 00:00:00.382 [wlr] [render/gbm_allocator.c:50] Allocated 1024x768 GBM buffer (format 0x34325241, modifier 0x300000000000014)
16:56 emersion: yes, i was wondering about this as well
16:56 imirkin: but this isn't pageflipping, the DDX should be blitting this surface onto the root
16:56 emersion: do you have this patch? https://github.com/swaywm/wlroots/commit/bf86110fc5615bfd1af46e95cf81ac720ecac307
16:56 imirkin: so worst case, you get weird blocks
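The format and modifier values in the gbm_allocator log line can be decoded by hand. The Python sketch below does that, assuming the layout used by the kernel's drm_fourcc.h (a FourCC is four ASCII bytes packed little-endian, and a modifier carries its vendor ID in the top 8 bits). The helper names `fourcc_to_str` and `modifier_vendor` are made up for illustration and are not wlroots or libdrm functions.

```python
# Constants mirroring drm_fourcc.h; only the two needed here are listed.
DRM_FORMAT_MOD_LINEAR = 0
DRM_FORMAT_MOD_VENDOR_NVIDIA = 0x03


def fourcc_to_str(code: int) -> str:
    """Unpack a 32-bit FourCC into its four ASCII characters."""
    return code.to_bytes(4, "little").decode("ascii")


def modifier_vendor(modifier: int) -> int:
    """Return the vendor ID stored in the top 8 bits of a 64-bit modifier."""
    return (modifier >> 56) & 0xFF


fmt = 0x34325241          # format value from the log line
mod = 0x300000000000014   # modifier value from the log line

print(fourcc_to_str(fmt))                                   # FourCC characters
print(modifier_vendor(mod) == DRM_FORMAT_MOD_VENDOR_NVIDIA)  # vendor check
print(mod == DRM_FORMAT_MOD_LINEAR)                          # linear layout?
```

Read this way, the buffer is an ARGB8888 ("AR24") allocation with an NVIDIA vendor modifier rather than a linear one, which matches the concern that a client assuming linear buffers would see a tiled layout.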
16:57 imirkin: emersion: probably not. let me rebase
16:57 emersion: you may want to `git fetch && git reset --hard origin/master`
16:57 imirkin: yeah, just did
16:57 emersion: (the DRI3 1.0 patch has been merged)
16:57 imirkin: success!
16:57 emersion: yay!
16:57 imirkin: well, at least i get a grey background
16:58 emersion: yeah, that's expected
16:58 imirkin: that's a substantial improvement over before
16:58 imirkin: hmmmm
16:58 imirkin: running wayland terminal gets me nothing though
16:58 imirkin: also should there be some bar of some sort somewhere?
16:58 emersion: yeah but it doesn't find it since it's not in $PATH
16:59 emersion: no error for weston-terminal?
16:59 imirkin: no, but doesn't render anything either
16:59 imirkin: i mean, it complains about not loading cursor 'dnd-move' and a couple others
17:00 emersion: maybe try weston-simple-egl in case it's doing something dumb
17:00 imirkin: that one just aborts
17:00 imirkin: but it's old, perhaps needs rebuilding
17:00 imirkin: i also don't see a cursor
17:01 imirkin: when i'm over the window
17:01 emersion: hm, it should be rendered with OpenGL
17:01 emersion: maybe it still doesn't update the window
17:15 imirkin: emersion: oh, lol, forgot what i was trying to do
17:15 imirkin: debug the latest mesa
17:15 imirkin: now it crashes
17:15 imirkin: yay
17:15 imirkin: sway: ../src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c:212: nvc0_user_vbuf_range: Assertion `nvc0->vb_elt_limit != ~0' failed.
17:15 imirkin: which iirc is what you were getting too
17:15 KungFuJesus: imirkin: have those kernel patches ready for me to try?
17:15 KungFuJesus: today would probably be perfect to do it
17:16 imirkin: KungFuJesus: sadly no - turned out harder than i anticipated
17:16 imirkin: next weekend i guess
17:16 imirkin: KungFuJesus: i tried it last night, but some of the BE stuff wasn't quite as figured out as i had anticipated
17:16 KungFuJesus: hah, it broke for the LE cases?
17:17 imirkin: KungFuJesus: no, i didn't even write it, coz some bits were missing, or i wasn't sure how to do them
17:17 imirkin: i started writing it :)
17:17 imirkin: that's almost the same
17:20 KungFuJesus: I'm half tempted to try newer generations of nvidia GPUs on this G5, but I'd have to leave an OF compatible one in one slot, I think. Also there's the whole limited PCI express power situation
17:21 KungFuJesus: I'm pretty sure I have the generation right after it, I've got an 8800 Ultra somewhere, but that would require some jerry rigging of power connectors to get to work
17:22 KungFuJesus: Does NV42 differ from the NV80s?
17:23 imirkin: yes, nv50 is a totally different generation
17:23 imirkin: 8800 Ultra is either a G80 or G92 probably
17:23 imirkin: there's been next to no testing of "newer" GPUs on BE
17:23 KungFuJesus: G80
17:25 emersion: imirkin: yes!
17:25 imirkin: KungFuJesus: do you actively seek out problem GPUs?
17:25 imirkin: or do they just fall into your lap naturally?
17:26 KungFuJesus: hah, well, for PPC64 BE I have a pretty short list that are actually compatible with OF, I'm sure
17:26 KungFuJesus: It came with a 6600 LE or something
17:26 imirkin: yeah
17:27 imirkin: the G80 is a funny board
17:27 imirkin: it was the first one of its generation
17:27 imirkin: and that generation was the very first DX10 GPU
17:27 KungFuJesus: ATI would have been _probably_ a safer bet, I'm guessing. But then who would debug all these BE nouveau issues?
17:27 imirkin: so it actually has some extra-special problems
17:27 imirkin: although it should basically work
17:28 KungFuJesus: The funny thing is that I have 2 of them. Bought them second hand from a co-worker while I was co-oping for college quite a while ago. Ran them in SLI, but they'd periodically drop off the bus
17:28 imirkin: emersion: btw - if you're open to TOTALLY esoteric requests -- make an option for the X11 backend to not hide the cursor
17:29 imirkin: emersion: otherwise when gdb catches the abort, the cursor's gone.
17:30 KungFuJesus: The lack of 2 pci express power connectors is probably motivation enough. That, and the inconvenience of starting X on a different output than what the OF framebuffer starts up on
17:30 KungFuJesus: sorry, demotivation
17:32 KungFuJesus: I have the late 2005ish quad model, so it does have the beefier power supply, but I'd still be nervous about pegging the GPU and CPUs at the same time
17:33 KungFuJesus: I happen to have stumbled onto a spare parts machine while getting sent failed liquid cooling systems and returning them back (they sent me a whole damn G5), but still, I'm kind of sorta using it to test some actual code for work
17:46 emersion: imirkin: oh, so if an X11 client crashes, the cursor gets stuck to its current image? is there something i can do to prevent that other than not hide the cursor?
17:47 imirkin: the cursor is disabled
17:47 imirkin: i believe when the window has focus, you disable the native X11 cursor
17:47 imirkin: and then draw your own
17:48 emersion: yeah
17:48 imirkin: but if it aborts, the X application is still running
17:48 emersion: oh, abort pauses
17:48 imirkin: yes
17:48 emersion: i see
17:48 imirkin: good thing i have sloppy focus ;)
17:49 emersion: ahah
17:49 imirkin: i did once attach gdb to an X server, from an xterm attached to that X server
17:49 imirkin: the microsecond you run that, you realize the massive fail...
17:50 imirkin: but it's too late
17:50 emersion: lol
17:51 imirkin: btw, ok if i put you in the reported-by for that crash-on-draw on nvc0?
17:51 emersion: don't cut the branch you're sitting on
17:51 imirkin: pretty much yea
17:51 emersion: oh yeah sure, feel free
17:57 imirkin: emersion: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8546
17:57 emersion: \o/
17:57 emersion: will test
17:57 emersion: thanks a lot
17:57 imirkin: thanks for reporting
17:58 imirkin: ... and for making wlroots deal with my weird setup :)
17:59 emersion: :P
20:14 imirkin: emersion: want me to file a bug about that?
20:16 imirkin: emersion: btw, can i recommend mentioning the config thing in https://github.com/swaywm/sway/wiki/Development-Setup ? (i.e. for the non-system-install case)
20:17 imirkin: emersion: also, with the fix, no more crash, but we're back to "i just have a gray background, nothing rendered" problem
20:24 imirkin: emersion: btw, X events 20: MapRequest, 19: MapNotify, 18: UnmapNotify, 17: DestroyNotify -- these aren't handled, but not sure they need to be
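For reading "Unhandled X11 event: N" lines like the ones earlier in the log, a small lookup table helps. This Python sketch lists the core X11 event codes in the structure-notification range; the table name and `describe_event` helper are illustrative only and are not part of wlroots or XCB. Masking with 0x7F follows the usual XCB convention that the top bit of `response_type` marks events sent via SendEvent; whether the logged value is already masked is an assumption not confirmed by the chat.

```python
# Core X11 event codes (as defined in X.h), limited to the range seen here.
X11_EVENT_NAMES = {
    16: "CreateNotify",
    17: "DestroyNotify",
    18: "UnmapNotify",
    19: "MapNotify",
    20: "MapRequest",
    21: "ReparentNotify",
    22: "ConfigureNotify",
}


def describe_event(code: int) -> str:
    """Map a raw event code to its name, ignoring the SendEvent bit."""
    return X11_EVENT_NAMES.get(code & 0x7F, f"unknown ({code})")


for code in (21, 19):  # the two codes from the unhandled-event log lines
    print(code, describe_event(code))
```

So the two unhandled events in the earlier log are ReparentNotify and MapNotify, both of which a window receives when a window manager reparents and maps it.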
20:25 pmoreau: imirkin: Is there a difference on Tesla between doing a `st u32 # s[$r0+0x0] $r6` and a `st u32 # s[$a0+0x0] $r6` (for example)? And do the addresses also work for global memory or just shared?
20:26 imirkin: pmoreau: i suspect the difference is that the former isn't allowed
20:26 imirkin: $a0-$a7 are address registers
20:26 imirkin: these are 16-bit registers, with a 17th sticky bit
20:26 imirkin: so if you have like $a0 = 0xffff
20:26 imirkin: and then you add 1 to it
20:26 imirkin: it will become 0x10000
20:27 imirkin: and then any operations you perform on it will leave at 0x10000 (unless you set it to a value directly, obviously)
20:27 imirkin: whereas $r0 is a full 32-bit reg
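The sticky-overflow behaviour of the $a registers can be illustrated with a toy model. The Python sketch below is built only from the description in the chat (16-bit value, a 17th bit that sticks at 0x10000 once set, cleared only by a direct write). The class and method names are invented, and treating any out-of-range result (including a negative one) as an overflow is an assumption, not documented hardware behaviour.

```python
class AddressReg:
    """Toy model of a Tesla $a register as described in the chat."""

    OVERFLOW = 0x10000  # 17th bit set, low 16 bits clear

    def __init__(self, value: int = 0):
        self.set(value)

    def set(self, value: int) -> None:
        # A direct write replaces the value and clears the sticky state.
        self.value = value & 0xFFFF

    def add(self, delta: int) -> None:
        if self.value == self.OVERFLOW:
            return  # sticky: arithmetic no longer changes the register
        result = self.value + delta
        # Assumption: any result outside 0..0xFFFF trips the sticky bit.
        self.value = result if 0 <= result <= 0xFFFF else self.OVERFLOW


a0 = AddressReg(0xFFFF)
a0.add(1)
print(hex(a0.value))  # overflowed
a0.add(5)
print(hex(a0.value))  # still stuck
a0.set(4)
a0.add(1)
print(hex(a0.value))  # usable again after a direct write
```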
20:27 imirkin: i suspect that s[] takes $a offsets
20:27 imirkin: whereas g[] takes $r offsets
20:27 imirkin: or maybe both
20:28 pmoreau: s[] can take $r, at least for loads
20:28 imirkin: nvdisasm agrees?
20:28 pmoreau: Hardware agrees
20:28 imirkin: hardware don't lie
20:28 pmoreau: And I think nvdisasm too, though it did show a different offset than what envydis says
20:29 imirkin: https://envytools.readthedocs.io/en/latest/hw/graph/tesla/cuda/isa.html#shared-memory-access
20:29 imirkin: so yeah, looks like it actually doesn't take $a sources, only regular register sources
20:31 imirkin: i'd generally trust envydis about g80 stuff
20:31 imirkin: mwk did a very thorough job with it
20:31 imirkin: iirc he has hwtests for stuff too
20:32 imirkin: might be ALU only though
20:32 pmoreau: nvdisasm shows the blob always using $a sources for shared, but could be a bug.
20:32 pmoreau: I tried using $a for global but that didn’t work out so well: it ended up emitting using some $r instead. :-D
20:32 imirkin: well, tbh i would have assumed s[] would use $a
20:32 imirkin: coz i don't think it goes over 64k
20:33 pmoreau: 16K only on Tesla
20:33 mwk: imirkin: she*
20:33 mwk: and yeah, the tests are basically ALU-only
20:34 imirkin: mwk: i see there are g80_atom tests
20:34 imirkin: mwk: ok, will update my notes
20:34 mwk: anyway, s[] are always addressed by $a only, g[] by $r only
20:35 imirkin: right, that makes sense.
20:36 pmoreau: Okay, thanks mwk. I’ll need to update a few things most likely.
20:37 mwk: envydis can be slightly wrong about which shared mem modes are allowed for which instruction
20:37 mwk: as in, it disassembles modes that are actually illegal insns in hardware
20:37 mwk: I never got around to fixing that
20:37 pmoreau: I haven’t looked at indirect local mem accesses in a while, and today I saw the compiler output `add s32 $r6 s[$r63+0x10] $r16` but actually running that instruction via envydis gave `add b32 $r6 b32 s[0x10] $r16` without $r63.
20:38 mwk: what
20:38 imirkin: $r63 is kinda zero
20:38 mwk: $r63 would be the always-0 pseudoregister anyway, wouldn't it?
20:38 pmoreau: Right
20:38 mwk: well, kinda-0, yes
20:39 mwk: sounds like a strange way to write that instruction
20:39 pmoreau: But using the register notation (even if it is 0 in the end) led me to believe $r could be used for s[].
20:39 mwk: that's a definite no
20:39 pmoreau: Ok
20:39 mwk: the only thing you can use $r addressing for is g[]
20:39 mwk: (and, in turn, that's the only thing you cannot use $a for)
20:40 pmoreau: Can c[] also be indirectly addressed, or only directly?
20:40 imirkin: sure, it can be indirect
20:40 imirkin: with $a
20:40 pmoreau: Ah, ok
20:40 mwk: but also there are restrictions
20:40 mwk: you cannot address it indirectly if you also use s[] in the same insn
20:41 mwk: because the insn has only one field to encode address and it's already in use
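The addressing rules mwk lays out can be written down as a small checker. This Python sketch encodes only what is said in the chat: s[] is indexed by $a, g[] by $r, c[] may be indexed by $a unless the same instruction also uses s[]. The operand representation (a pair of memory space and index register kind, with `None` for a direct address) and the function name are hypothetical, and memory spaces not discussed in the chat are simply rejected for indirect use.

```python
from typing import List, Optional, Tuple

# Register file allowed as the indirect index for each memory space.
ALLOWED_INDEX = {
    "s": {"a"},  # shared: $a only
    "g": {"r"},  # global: $r only
    "c": {"a"},  # constant: $a, subject to the restriction below
}

Operand = Tuple[str, Optional[str]]  # (space, index register kind or None)


def insn_ok(operands: List[Operand]) -> bool:
    """Check a list of memory operands against the rules from the chat."""
    for space, index in operands:
        if index is not None and index not in ALLOWED_INDEX.get(space, set()):
            return False
    uses_shared = any(space == "s" for space, _ in operands)
    indirect_const = any(
        space == "c" and index is not None for space, index in operands
    )
    # Only one address field per instruction: no indirect c[] next to s[].
    return not (uses_shared and indirect_const)


print(insn_ok([("s", "a")]))               # shared indexed by $a
print(insn_ok([("s", "r")]))               # shared indexed by $r
print(insn_ok([("c", "a"), ("s", None)]))  # indirect c[] together with s[]
```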
20:41 imirkin: too much indirection
20:41 pmoreau: I see
20:43 pmoreau: Thanks for the information! 👍️ It’s been way too long since I last looked at that stuff.
20:45 imirkin: pmoreau: btw, i have a G84 plugged in
20:45 imirkin: i made a bunch of fixes to your branch
20:46 pmoreau: Oh, awesome! Where can I find those?
20:46 imirkin: working on it
20:46 imirkin: i definitely did it
20:46 pmoreau: :-)
20:46 imirkin: but then got distracted by other stuff
20:46 imirkin: so ... where is it
20:46 imirkin: heh
20:46 pmoreau: As we all do, no worries
20:47 imirkin: is there a way to see the top commit of each branch?
20:47 imirkin: aha, there's a --format
20:49 imirkin: cleverly named the branch 'compute'. who could have guessed
20:49 pmoreau: `git branch -vv` should show you that as well.
20:49 pmoreau: Damn, too obvious it was just hiding in plain sight
20:49 imirkin: yeah, but i had a compute and compute2
20:50 imirkin: i only looked at compute2
20:50 imirkin: assuming that compute was super-old
20:50 pmoreau: I would have assumed as much as well
20:50 imirkin: git branch --format '%(refname) %(subject)'
20:50 imirkin: in case this comes up in the future
20:51 imirkin: pmoreau: https://gitlab.freedesktop.org/imirkin/mesa/-/commit/9728de9f33edeb6b922c136084fd3798f283bb32
20:52 imirkin: pmoreau: btw, the reason that BIND_TIC/TSC aren't arrayed for compute is that there's only one compute stage. with 3d, each stage has its own list of bound tic/tsc's
20:52 pmoreau: `git branch -v` will give you that same info + the commit hash and some extra stuff + some alignment of the output.
20:53 imirkin: aha, will do that next time
20:53 pmoreau: Thank you for the link!
20:53 imirkin: pmoreau: my goal was to fake ES 3.1 and run some of deqp
20:53 imirkin: but then i realized you hadn't hooked up any of the regular pipeline for this
20:53 imirkin: and then i realized other things were totally broken
20:53 imirkin: so i went off fixing those
20:53 imirkin: and haven't gotten back to this
20:54 pmoreau: Add an extra v, `git branch -vv`, and it will also show you whether that branch is tracking a remote, and if so which. ;-)
20:55 pmoreau: The sampler thing definitely needed some more love.
20:55 imirkin: yeah
20:55 imirkin: i realized there was a lot to do
20:55 imirkin: i thought it'd be as simple as just force-enabling ES 3.1
20:55 imirkin: anyways, no worries, hopefully my fixes explain wtf was going on with the need to flip s values around
20:55 pmoreau: I was thinking of just dropping the patches I had and just add stubs/make sure the code does not assert for compute, and move on for now with the other stuff.
20:56 imirkin: what you have isn't necessarily wrong
20:56 imirkin: it's just incomplete
20:56 imirkin: you need like nv50_validate_compute_textures
20:56 imirkin: and compute_samplers
20:56 imirkin: and compute_buffers
20:56 imirkin: and compute_constbufs
20:57 pmoreau: IIRC in those patches I hadn’t changed in the code emission side of things, and/or hardware expected something specific and I fed it something else.
20:57 imirkin: i know.
20:57 imirkin: i fixed it up ;)
20:57 pmoreau: \o/
20:57 imirkin: check the change to nv50_context.h
20:57 imirkin: it's not random.
20:57 pmoreau: Ah right
20:57 imirkin: i don't go around renumbering lists for the sheer joy of it, fun though it is
20:58 pmoreau: :-D
20:58 imirkin: it's not hard to add the remainder
20:58 imirkin: i've just gotten diverted on a bunch of other things
20:58 imirkin: like "basic draws crash driver"
20:58 imirkin: due to various changes in mesa core
20:59 imirkin: seemed more important
20:59 imirkin: pmoreau: btw, which board do you have?
20:59 imirkin: nvac iirc?
21:00 pmoreau: NVAC + G96 in my laptop.
21:00 imirkin: right yeah
21:00 pmoreau: A bunch of other discrete GPUs.
21:00 imirkin: well, i just mean what's easy for you to get to
21:00 imirkin: basically i've only tested my fixes with the G84
21:00 imirkin: there could be some post-nva0 issues that need addressing too
21:01 imirkin: i have a nva3, but it's the GDDR5 one. not sure i have any others
21:01 pmoreau: The ones in the laptop are definitely easiest :-) Though I do have one plugged in a different computer, with SSH and Nouveau set up on it. I used it to run the OpenGL CTS daily with the latest mesa on it, to check for regressions.
21:01 imirkin: like a nva8 or something, not sure. i think i tried to get one at one point, but it turned out to be a G84
21:01 pmoreau: I think I have a GT200, no idea what it has for VRAM.
21:02 imirkin: well, the nvac should be fine
21:02 imirkin: (hm, how did i get vp4 to finally work then? i forget... maybe someone let me ssh to their box. or maybe i was able to repro the problems with the nv98)
21:03 imirkin: (or maybe i actually do have a board)
21:03 pmoreau: Speaking of other issues, I need to investigate those DRM timeouts and “trapped read at 00000007c0 on channel 3 [0faf5000 Xorg[28135]] engine 00 [PGRAPH] client 03 [DISPATCH] subclient 00 [GRCTX] reason 0000000f [DMAOBJ_LIMIT]” I now get every single time since 5.10.
21:03 imirkin: pmoreau: i'd appreciate it if you could run VK-GL-CTS KHR-GL33.* and KHR-GLES3.* (well, not .*, but the master list)
21:04 imirkin: pmoreau: as well as dEQP-GLES2 and dEQP-GLES3 from the aosp deqp
21:04 pmoreau: On the NVAC?
21:04 imirkin: anything that's nva0+
21:04 imirkin: so yeah, nvac included
21:04 imirkin: should be a mostly clean run -- you even get seamless cube maps there, so should let GLES3 cube texturing tests pass
21:05 pmoreau: Okay, let me double check what I have connected in the other computer, as I have everything set up there.
21:08 pmoreau: mwk: BTW, do you know how much shared memory the Tesla boards actually have? Looking at the NVIDIA compiler, it still lets me allocate all 16KB of shared mem for use by the compute shader, even though 0x0–0x10 was reserved for various info like block size, and 0x10–0x110 for the input arguments to the kernel, so I wonder whether they only have 16KB, or 16KB+0x110 bytes. All user-controlled shared mem accesses were offset by
21:08 pmoreau: 0x110.
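The shared-memory layout pmoreau describes comes down to a bit of arithmetic. The Python sketch below uses only the numbers quoted in the chat (0x10 bytes of reserved info, 0x100 bytes of kernel arguments, 16KB exposed to the user). All constant and function names are invented for illustration, and the combined total is the hypothetical size being asked about, not a confirmed hardware figure.

```python
SHARED_EXPOSED = 16 * 1024   # 16,384 bytes advertised to the kernel author
RESERVED_INFO = 0x10         # 0x0-0x10: block size and similar info
ARGS_IN_SHARED = 0x100       # 0x10-0x110: first 0x100 bytes of kernel args
USER_BASE = RESERVED_INFO + ARGS_IN_SHARED  # where user allocations start


def split_kernel_args(arg_bytes: int) -> tuple:
    """Return (bytes placed in shared memory, bytes spilled elsewhere)."""
    in_shared = min(arg_bytes, ARGS_IN_SHARED)
    return in_shared, arg_bytes - in_shared


def user_shared_address(offset: int) -> int:
    """Translate a user-visible shared offset into the real address."""
    return USER_BASE + offset


print(hex(USER_BASE))                  # offset applied to user accesses
print(SHARED_EXPOSED + USER_BASE)      # hypothetical total in bytes
print(split_kernel_args(0x180))        # args larger than 0x100 bytes
```

With these numbers the hypothetical total is 16,384 + 272 = 16,656 bytes, the figure mentioned earlier in the log.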
21:10 imirkin: pmoreau: i'd very much like to land some nv50 compute support, even if it's fairly partial, so let's coordinate efforts so we can make some good progress on it. i know you've been working on it for ages, with various detours, and i think it's gotten quite close
21:11 pmoreau: I was planning on spinning some patches off in a separate MR, especially now that cl_khr_il_program has landed so I only carry half the patches around I used to.
21:11 imirkin: ok. my immediate target is ES 3.1, not CL
21:12 imirkin: since i don't want to figure out the llvm stuff i need to build :)
21:12 pmoreau: :-)
21:12 imirkin: i think ultimately ES3.1 is achievable
21:12 imirkin: iirc only a handful of fallbacks for the full thing will be needed
21:12 imirkin: like indirect draws
21:12 imirkin: maybe a couple other things
21:13 emersion: imirkin: file an issue about which bug?
21:13 imirkin: emersion: cursor
21:13 pmoreau: All the LLVM pieces are now shipped on Arch, so I don’t even rely on a local LLVM build. Could be the same on Gentoo, who knows. Okay, except for libclc since the latest release still hasn’t the SPIR-V stuff. But that was relatively easy to get, IIRC.
21:14 emersion: hm, you can, but i'm not sure about the best way to fix it. i don't really want to go out of my way to fix it, but maybe yet another env variable would be acceptable
21:15 imirkin: emersion: yeah, it's an awkward issue
21:15 imirkin: emersion: but i posit that the X11 backend isn't really meant for "production" in the first place
21:16 emersion: well… we'll probably use it for gamescope at some point. also, cage uses it for "production"
21:16 imirkin: emersion: btw, if there's a task in sway/wlroots that you think might be appropriate for me - feel free to point in my direction. i like to help those who help me.
21:16 emersion: i guess the real fix would be to do like the Wayland backend, and set the "real" cursor instead of hiding it
21:17 emersion: ah, thanks a bunch. well, you've already helped me with this bug fix :P
21:18 emersion: i'll have a look
21:19 imirkin: maybe even this cursor thing, if i could get some pointers
21:19 emersion: if you're up to dealing with X11 stuff, sure!
21:20 emersion: i don't remember how cursor images work on X11, i should look that up again
21:20 imirkin: i haven't the faintest idea
21:20 imirkin: i know it works though
21:20 imirkin: i'll make a wild guess: you give a pixmap :)
21:20 emersion: sounds likely!
21:21 imirkin: anyways, in the meanwhile: https://github.com/swaywm/wlroots/issues/2659
21:21 imirkin: if you can point at how the wayland backend does this, i could give it a stab for x11
21:22 mwk: pmoreau: the max is 16kiB, and if compiler lets you access more it sounds like a bug
21:22 imirkin: pmoreau: esp nv50 era stuff wasn't super-conformant
21:23 imirkin: they were missing some stuff in GLSL too (that admittedly their hw couldn't do, but it's still in the spec...)
21:23 imirkin: (cube shadow maps + bias)
21:24 mwk: in theory you could access all 16kiB if you first move the fixed inputs out of the way to, say, registers
21:28 imirkin: emersion: aha, output_set_cursor / output_move_cursor in backend/wayland/output.c
21:28 imirkin: i'll see what i can do.
21:29 emersion: oh, you found it, cool
21:29 imirkin: basically i should be able to copy/pasta most of that, i presume
21:29 emersion: yeah. basically need to add a swapchain for the cursor, render to it in set_cursor
21:30 emersion: yeah, should probably work. i just hope X11 accepts DRI3 buffers for the cursor
21:30 imirkin: definitely not
21:30 emersion: eh
21:30 imirkin: will have to read it out
21:30 emersion: then we'll need to read the pixels or something
21:30 imirkin: yea
21:30 emersion: we have wlr_renderer_read_pixels
21:31 emersion: https://github.com/swaywm/wlroots/blob/f17b0f975d271a2c001627fe47af9a5b8800c774/include/wlr/render/wlr_renderer.h#L113
21:31 imirkin: cool, thanks
21:31 emersion: should be possible to call this guy before wlr_renderer_end
21:32 emersion: hm, i hope the pixel format will work fine -- GLES2 is pretty restrictive
21:33 emersion: basically just supports GL_RGBA and GL_BGRA_EXT
21:34 imirkin: that's fine
21:34 imirkin: the cursors have to be ARGB8 i think
21:34 imirkin: or something along those lines
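The pixel-format concern above can be illustrated with a small swizzle. This is only a sketch, assuming the pixels come back as R,G,B,A bytes (as a GL_RGBA readback would give) and that the cursor wants packed 32-bit ARGB words, which is the guess made in the chat rather than something confirmed here. The helper name rgba_bytes_to_argb32 is hypothetical and is not part of wlroots or any X11 library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert pixels stored as R,G,B,A bytes into packed
 * 0xAARRGGBB words. No alpha premultiplication is attempted here. */
void rgba_bytes_to_argb32(const uint8_t *src, uint32_t *dst, size_t npixels)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < npixels; i++) {
        uint32_t r = src[4 * i + 0];
        uint32_t g = src[4 * i + 1];
        uint32_t b = src[4 * i + 2];
        uint32_t a = src[4 * i + 3];
        /* Pack alpha into the top byte, then red, green, blue. */
        dst[i] = (a << 24) | (r << 16) | (g << 8) | b;
    }
}
```

If the readback can be done as GL_BGRA_EXT instead, the bytes B,G,R,A already form a 0xAARRGGBB word on a little-endian host, so no swizzle would be needed in that case.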
21:34 pmoreau: mwk: Arf, I’ll leave it on the side for now then. I don’t think the CTS tests that one can indeed access 16KB so that could still be fine until someone really needs those 16KB.
21:35 pmoreau: imirkin: Yeah, like you can only do 2-d grids on it. 🙃
21:36 pmoreau: So, it’s a G94 I currently have plugged in my other computer, so that won’t do it for you. I could probably plug in a GT200 instead, but I first need to solve the computer not getting an IP address.
21:37 imirkin: pmoreau: yeah, the IP address is useful
21:37 pmoreau: It used to be able to get one, as I would always SSH in it, but today it refuses to.
21:51 emersion: imirkin: hm, now i see XDefineCursor, and it sets a cursor for a given window
21:51 emersion: right now wlroots uses xcb_xfixes_hide_cursor when the cursor enters the window
21:51 imirkin: see my thousand questions in #xorg-devel
21:51 emersion: but i figure we could just set an invisible cursor with XDefineCursor?
21:51 imirkin: yes, that's also possible
21:52 imirkin: that's the cheap fix to this problem :)
21:52 emersion: ah, lemme read
21:53 imirkin: so yeah. v1 is "just make an empty-looking cursor"
21:53 imirkin: v2 is "don't do soft-cursor"
21:58 pmoreau: You know what helps getting an IP address? Making sure the Ethernet cable is properly plugged in. 😅
22:00 ccr: !
22:30 Lyude: btw emersion just curious, do you have plans to work on nouveau display stuff at any point? mostly asking since you were asking about those patches I need to send to the ml (sorry it slipped my mind again, few more things came up at work <<)
23:07 KungFuJesus: dear lord xz is slow on ppc64
23:07 KungFuJesus: I have half a mind to profile it just so that emerge firefox doesn't take like 15 minutes to actually start
23:15 KungFuJesus: https://github.com/xz-mirror/xz/blob/869b9d1b4edd6df07f819d360d306251f8147353/src/liblzma/check/crc64_fast.c "fast"
23:17 imirkin: shoulda seen the slow one
23:18 KungFuJesus: hah
23:20 KungFuJesus: I imagine it's hitting a misaligned read cost for some of this
23:20 imirkin: oh yeah - ppc doesn't have unaligned memory access
23:20 imirkin: so kernel traps + emulates
23:20 imirkin: that doesn't come cheap
23:20 KungFuJesus: though I'm not entirely sure what this macro does: aligned_read32ne
23:21 KungFuJesus: that table permutation though I imagine could actually translate well to altivec
23:21 imirkin: what are you waiting for :)
23:21 RSpliet: read 32-bits no-endianness (e.g. don't try to byte-shuffle)?
23:22 RSpliet: https://fossies.org/dox/xz-5.2.5/tuklib__integer_8h_source.html#l00460
23:22 RSpliet: Looks like it could be very expensive
23:23 imirkin: cheaper than the kernel handling the SIGILL
23:24 KungFuJesus: yeah looks like there's probably a memcpy cost instead of guaranteeing the buffer is aligned to 64 bit words
23:24 KungFuJesus: there are reasonable portable ways of doing that, though... unless somehow they need byte level indexing apart from these functions
23:24 imirkin: they probably do.
23:25 imirkin: but you'd be better off reading the stuff in byte by byte
23:25 imirkin: and returning the assembled integer
23:25 imirkin: than copying it around
23:25 KungFuJesus: I could write my own personal altivec patches - strangely the x86 assembly for this stuff doesn't leverage any simd even though it looks entirely possible
23:26 KungFuJesus: just shifts, adds, and xors: https://github.com/xz-mirror/xz/blob/869b9d1b4edd6df07f819d360d306251f8147353/src/liblzma/check/crc64_x86.S
23:26 HdkR: If you use SIMD then you lose using the CRC GPR ops on x86
23:27 HdkR: Oh, they don't use those ops at all. womp womp
23:31 KungFuJesus: yeah, though this seems to be a 64 bit crc, but I imagine you could leverage them all the same
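For reference, the 64-bit CRC under discussion boils down to the bit-at-a-time loop below. This is a minimal sketch, assuming the CRC-64/XZ parameters (reflected ECMA-182 polynomial 0xC96C5795D7870F42, initial value and final XOR of all ones). The function name crc64_bitwise is hypothetical. The linked crc64_fast.c uses lookup tables rather than this loop, so this shows what is being computed, not how xz computes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference implementation: CRC-64/XZ, one bit at a time.
 * Slow, but it only needs shifts and XORs. */
uint64_t crc64_bitwise(const uint8_t *buf, size_t len)
{
    const uint64_t poly = UINT64_C(0xC96C5795D7870F42); /* reflected polynomial */
    uint64_t crc = ~UINT64_C(0);                        /* initial value: all ones */

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right by one; fold in the polynomial if a 1 bit fell off. */
            crc = (crc >> 1) ^ ((crc & 1) ? poly : 0);
        }
    }
    return ~crc; /* final XOR: all ones */
}
```

With these parameters the standard check input "123456789" yields 0x995DC9BBDF1939FA, which makes a handy sanity test for any faster table-driven or SIMD variant.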
23:33 KungFuJesus: I'm surprised that xz is as fast as it is on x86. I am doing this stuff over NFS, though. It could just be that the network driver for this broadcom chip sucks or the NFS code is just not friendly to ppc64BE
23:33 KungFuJesus: didn't see a whole lot of time waiting in kernel functions with perf, though