00:05Lyude: nope, looks like that didn't work. imirkin, know anything I can turn on to get nouveau to spit out alloc related info?
00:05imirkin: i hear printk is all the rage...
00:05imirkin: not sure what you're looking for
00:06Lyude: heh, figured :P. imirkin - I'm not sure what i'm looking for either, maybe something related to where it decides the bo should live
00:07Lyude: it seems like in nouveau though we're calling nouveau_gem_new() with basically identical args,
00:10imirkin: what can i say ... modetest >> igt :p
00:10imirkin: you could print the literal vram address in each case
00:10imirkin: (well, the VA)
00:10imirkin: as well as its PTE settings
00:11imirkin: although it should all just be linear/etc
00:11imirkin: maybe the dumb buffer is big enough to get 64k pages
00:11imirkin: but the nouveau_gem_new settings somehow prevent the big pages?
00:11imirkin: seems unlikely since i think that decision is made deep down
00:12imirkin: but i don't remember exactly how it all works
00:14Lyude: yeah-that's the kind of stuff I'm looking for :). also, the size of the buffer is 262144 so 256k. also-the other thing I'm suspicious of (but less so-I don't feel like I remember running into any issues with this before I taught igt how to use nouveau libdrm instead of using dumb bos) is if it might have something to do with how we're mapping it, since in igt we avoid doing mmap() at all and just
00:14Lyude: blit back and forth with the ce
00:15imirkin: works for 128x128
00:15imirkin: and works on pascal
00:15imirkin: just sayin'
00:16imirkin: 128x128*4 = 64k, so big enough for a 64k page
00:16imirkin: i forget if the "large" pages are 64 or 128k
00:16imirkin: (iirc it's configurable, but i forget which way we configure it)
00:18imirkin: er what
00:18imirkin: i guess it's happy with both? how does that work?
00:19imirkin: Lyude: have a look at the VMM_DEBUG/TRACE macros btw
00:19imirkin: they should dump A LOT of info
00:19Lyude: a-ha, that was the thing I was thinking of
00:20imirkin: looks like nouveau.debug=mmu=trace
00:20imirkin: should do it
00:21imirkin: super. mmu->subdev.device->fb->page controls whether we do 64k or 128k pages
00:21imirkin: anyways, i'm guessing that it's set for 128k pages
00:21imirkin: and so the issue only hits with 256x256
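The size arithmetic behind that guess can be checked with a short sketch. It assumes 4 bytes per pixel (as in the 128x128*4 figure above) and a deliberately simplified, hypothetical rule that a buffer becomes eligible for large pages once it is at least one large page in size; nouveau's real selection logic is more involved, and the helper names here are made up for illustration.

```python
BYTES_PER_PIXEL = 4  # 32-bit cursor pixels, matching the 128x128*4 arithmetic


def cursor_bytes(side):
    """Bytes needed for a square cursor image with the given side length."""
    return side * side * BYTES_PER_PIXEL


def eligible_for_large_pages(size, large_page_size):
    """Hypothetical rule: eligible once the buffer spans a whole large page."""
    return size >= large_page_size


if __name__ == "__main__":
    for large_page in (64 * 1024, 128 * 1024):
        for side in (64, 128, 256):
            size = cursor_bytes(side)
            verdict = eligible_for_large_pages(size, large_page)
            print(f"{side}x{side} = {size // 1024}k, "
                  f"large page {large_page // 1024}k, eligible: {verdict}")
```

Under this simplified rule, with 128k large pages only the 256x256 cursor qualifies, which would match the observation that 128x128 works and 256x256 does not.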
00:24Lyude: oooh, nice catch
00:26Lyude: imirkin: btw - where did you find that?
00:26imirkin: and related
00:28imirkin: tbh i don't see where the logic lives for which page size to pick
00:29Lyude: yeah this is uh, some very dense code
00:30imirkin: found it
00:31Lyude: ??, *looks at that function again*
00:31imirkin: has a loop which tries to determine it
00:31imirkin: and at the end it sets nvbo->page = ...shift
00:31imirkin: which is what drives everything else down the line
00:32Lyude: oooh-didn't even notice fixup_align somehow
00:33imirkin: anyways ... not sure how you tell from that whether it's a dumb buffer
00:33imirkin: could abuse tile flags or whatever
00:34imirkin: for now you could also just force it to 4k pages to see if that fixes everything :)
00:34imirkin: or limit to 128x128 on kepler
00:34Lyude: i'm determined now
00:35imirkin: i think large pages are more useful than 256x256 cursors
00:35Lyude: imirkin: no I mean I'm determined to find the proper fix for this
00:35imirkin: going back in time and never touching nouveau?
00:36Lyude: also - i don't think that code actually knows the difference between a dumb buffer or not
00:36imirkin: it does not.
00:36skeggsb: what hw is it busting on?
00:36imirkin: that's what i was saying
00:36Lyude: imirkin: ah whoops, misunderstood
00:36skeggsb: i thought hw error-checked this, but:
00:36imirkin: 256x256 cursors are 256kb (ah, the irony) - so they get large pages
00:36skeggsb: args.gf119.page = GF119_DMA_V0_PAGE_LP;
00:36skeggsb: you need that set on the cursor ctxdma if the buffer is large pages
00:37skeggsb: i suspect if you *do* that though, the hw might throw an exception instead of silently failing
00:37skeggsb: pretty sure you need 4k pages for cursor
00:38Lyude: skeggsb: that's just setting the alignment to 4096 right?
00:38imirkin: Lyude: no, the nvbo->page
00:38imirkin: to 12
00:39imirkin: (which will also set the align to 4096, but there's a lot more)
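The link between a page shift of 12 and an alignment of 4096 is just a power of two. A minimal sketch, using hypothetical helper names (these are not nouveau functions); shifts 16 and 17 correspond to the 64k and 128k large-page sizes discussed earlier.

```python
def page_size(shift):
    """Page size in bytes for a given page shift (size = 2**shift)."""
    return 1 << shift


def align_up(value, shift):
    """Round value up to the next multiple of the page size for this shift."""
    mask = page_size(shift) - 1
    return (value + mask) & ~mask


if __name__ == "__main__":
    for shift in (12, 16, 17):
        print(f"shift {shift}: {page_size(shift)} bytes")
    # A 256k cursor buffer is already a multiple of 4k, so 4k pages cost
    # nothing in padding here.
    print(align_up(262144, 12))
```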
01:11Lyude: imirkin: yeah me and skeggsb just figured it out, it's definitely the page type being used, and I'm pretty sure I know how to come up with a fix for this
01:19damo22: im trying to compile deqp and i keep getting glslang not found on fedora 33 even though its installed
01:19damo22: i have glslang-devel and glslang
01:20damo22: Package glslang-devel-11.0.0-1.20200803.git5743eed.fc33.x86_64 is already installed
01:24airlied: damo22: I expect it wants its own copy of glslang
01:25damo22: ah theres a bug in the package, the .pc file has spaces at the front
01:25airlied: damo22: oh weird
01:26airlied: I should fix that
01:26damo22: but that still didnt fix it
01:26airlied: deqp normally does some weird cmake or just checks out its own copy
01:27airlied:hasn't used deqp much though, normally just uses VK-GL-CTS instead now
01:27Lyude: imirkin: anyway-it's getting late now, I'll write up a fix asap tomorrow now that we've found the issue
01:36imirkin: Lyude: sounds good
01:36imirkin: damo22: there's a fetch_sources.py script
01:37imirkin: external/fetch_sources.py iirc
01:37imirkin: and then yeah, you have to use cmake
02:08damo22: hmm i ran the python script and it cloned some repos but cmake still cant find glslang
02:09imirkin: hmmm ... i haven't run into that
02:10damo22: which commit of deqp are you on
02:10imirkin: from Dec last year
02:10imirkin: coz i update often ;)
02:11damo22: YAY that works
02:17imirkin: damo22: btw, make sure you update my branch. i pushed some potential fixes to the locking logic a few hours ago
02:18imirkin: (branch is "nv50_compute")
02:52damo22: ah right no worries
02:52damo22: i saw some maybe uninitialised warnings when i compiled last
02:53damo22: * d24ae47f789 (HEAD, imirkin/nv50_compute) nv50/ir: don't load value at same time as attempting to lock
02:54imirkin: damo22: yeah, i haven't cleaned up yet
02:54imirkin: and yes, that's the right commit
02:54damo22: as soon as this deqp compiles i will run the tests
02:54damo22: im on a quite old machine
02:54imirkin: damo22: only run like dEQP-GLES31.functional.compute.basic.shared_atomic_op_single_*
02:55imirkin: probably newer than mine =]
02:55damo22: Asus P5Q Pro
02:55imirkin: i dunno what that means
02:55damo22: (with coreboot)
02:55imirkin: i have a i7-920 cpu
02:55damo22: model name : Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz
02:56imirkin: ah yeah, you win
02:56imirkin: on the shitty computer game :)
02:56damo22: its quite good though because theres no management engine on board
02:56imirkin: the ME on this one is weak, so that's nice
02:57damo22: also this board has no onboard gfx
02:58imirkin: nor mine
02:58imirkin: intel started adding onboard gfx on the next gen after mine
03:00imirkin: deqp takes a while to compile. linking the final binaries is very slow
03:06damo22: hmm cannot find -lGL
03:06damo22: mesa-libGL-devel is installed
03:06imirkin: that's not good.
03:06imirkin: how did you run cmake?
03:07damo22: build]$ cmake -DCMAKE_INSTALL_PREFIX=/zam/git/deqp-bin ..
03:07imirkin: i think you have to give it a deqp target of some sort
03:07imirkin: ah, i never install it
03:07imirkin: not sure what it does by default
03:07imirkin: i think it's like cmake -DDEQP_TARGET=x11_glx or something like that
03:07imirkin: maybe with an egl thrown in
03:08imirkin: how do i find out the settings i used with cmake?
03:08damo22: no idea
03:08imirkin: hold on
03:08imirkin: and i used -GNinja iirc
03:08imirkin: not that it really matters
03:11damo22: i have more cpu bugs than features
03:12damo22: bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
03:12imirkin: same as my list
03:13damo22: one day i will port nouveau driver to GNU/Hurd userspace
03:14imirkin: should be easy
03:14imirkin: at least the core nouveau part
03:14damo22: does netbsd have it?
03:14imirkin: it already runs in userspace if you want it to
03:14imirkin: yeah, netbsd has a copy of linux drm afaik
03:14damo22: ok cool
03:15damo22: netbsd has a userspace driver framework out of the box called rump
03:15imirkin: that's not what netbsd has though for nouveau
03:15imirkin: pretty sure it's a regular kernel driver there
03:15damo22: im jumping all over the place, sorry, i meant hurd is using that
03:16damo22: rump uses the standard netbsd kernel drivers but compiles them for userspace with a wrapper
03:17damo22: like a unikernel
03:17damo22: one day i will port the drm ones to hurd
03:18damo22: we already have pci layer in userspace
03:19damo22: could be interesting to have a user open the video card device
03:19damo22: instead of root owning all the devices
03:20imirkin: yeah, i mean the main thing is that you need some mechanism to do mmio writes
03:20imirkin: and then provide some API for the GPU
03:20imirkin: nouveau is split into 2 parts
03:20imirkin: the core
03:20imirkin: and the drm glue
03:20imirkin: the drm glue is ... drm-specific
03:20imirkin: but the core has very few dependencies
03:20imirkin: and it enables you to write a driver which presents any api you want
03:21damo22: yeah, i plan to reuse the netbsd core driver from the bsd kernel, compiled as a userspace unikernel and then port drm to userspace lib and link it to the driver
03:22damo22: then gfx "driver" will be a userspace process
03:25damo22: just got disk to work that way
03:28damo22: is mmio writes possible if you have access to /dev/mem ?
03:28damo22: or a mapped region from pci
03:29imirkin: sure, that's all it is
03:29imirkin: the PCI config space will expose some number of BAR's
03:29damo22: ok good, we are using libpciaccess
03:29imirkin: and you just need a mapping of each of those BARs
03:30damo22: and implemented a userspace arbitration server
03:30imirkin: but it has to be fast -- which means that you need to delegate permissions to that process in order to have such a mapping
03:30imirkin: the accesses can't be proxied
03:30damo22: right, so when the driver writes to the mapping it needs to have a mmaped region
03:31imirkin: a few regions, but yeah
03:31imirkin: one for each bar
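For comparison, on Linux the BARs of a PCI device can be listed from its sysfs `resource` file, where each line holds a start address, an end address and a flags word in hex. The sketch below parses that text format; it is a Linux-side illustration only (the Hurd setup described here goes through libpciaccess and its own arbitration server), `parse_bars` is a hypothetical helper, and the sample addresses are invented.

```python
IORESOURCE_IO = 0x100   # flag bit: I/O port region
IORESOURCE_MEM = 0x200  # flag bit: memory-mapped region


def parse_bars(resource_text, max_bars=6):
    """Return (index, start, size, kind) for each populated BAR slot."""
    bars = []
    for index, line in enumerate(resource_text.splitlines()[:max_bars]):
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not "start end flags"
        start, end, flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue  # unused slot (or upper half of a 64-bit BAR)
        if flags & IORESOURCE_MEM:
            kind = "mem"
        elif flags & IORESOURCE_IO:
            kind = "io"
        else:
            kind = "other"
        bars.append((index, start, end - start + 1, kind))
    return bars


# Invented sample in the sysfs "resource" layout, for illustration only.
SAMPLE = """\
0x00000000f6000000 0x00000000f6ffffff 0x0000000000040200
0x00000000e0000000 0x00000000efffffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000f0000000 0x00000000f1ffffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x000000000000e000 0x000000000000e07f 0x0000000000040101
"""

if __name__ == "__main__":
    for index, start, size, kind in parse_bars(SAMPLE):
        print(f"BAR{index}: {kind} at {start:#x}, {size} bytes")
```

Each memory BAR found this way is a region the driver process would need its own direct mapping of, which is the "one for each bar" point above.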
03:31damo22: ok no worries
03:31damo22: i think we have that already
03:32damo22: they are files exposed by the userspace pci arbitration but when you run an RPC on that file you get direct access to the mmaped region
03:33damo22: and we can do access control using file permission system
03:34imirkin: so yeah. should be as simple as taking nouveau and replacing a couple of ioremap() calls
03:34airlied: and irqs :-P
03:34imirkin: who needs those
03:34imirkin: just poll!
03:34damo22: ah yes we have an IRQ server
03:39damo22: im learning a lot about operating systems by hacking on these rump drivers
03:39damo22: even though i havent written a line of code for actual driver implementation, i can see how it fits together with hurd
03:40damo22: cant wait to have disk finished so i can work on gfx
03:40imirkin: graphics is hard
03:40imirkin: it's not like a self-contained thing
03:41imirkin: you need many pieces to play nice
03:41damo22: yeah there are lots of libs and JIT compiled GL stuff?
03:41imirkin: i mean like
03:41imirkin: so you made a graphics driver
03:41imirkin: now what?
03:41imirkin: what can you do?
03:41damo22: yeah you need an API to use it
03:42imirkin: and software which uses that API :)
03:42damo22: Audio stack is completely broken on linux
03:42damo22: there is no uniform api for it
03:42damo22: yeah but it sucks
03:42imirkin: does it?
03:42imirkin: the hardware sucks
03:42damo22: yes, not as good as video
03:42airlied: yeah like if you need sw for GL or Vulkan you need mesa
03:43imirkin: the userspace alsa stack tries to make up for a lot of that suckage
03:43airlied: and then you are fighting an upstream battle trying to get your replacement for ioctl api in
03:43damo22: let me post a link to a bunch of reasons why ALSA needs work
03:43imirkin: i'm not suggesting alsa is perfect
03:43imirkin: and also linux sound APIs aren't great
03:44imirkin: i used to have a cron job which would just run 'kill -9 esd' back in the good ol' days
03:44imirkin: coz it would just get started up for no reason as a result of calling some API
03:44imirkin: (and i had a sound card which allowed concurrent audio streams, so no need for software mixing)
03:44damo22: it encourages a push model at the lowest level
03:44imirkin: emu10k1 ... the good ol' days
03:45airlied: most of the fixes for that are called pipewire now
03:45airlied: alsa isn't the layer for that really
03:45damo22: yes but pipewire is a stack one level up isnt it?
03:45damo22: runs on top of alsa?
03:45imirkin: but the sad reality is that audio hardware just doesn't have a lot of this functionality that userspace wants
03:45airlied: not really
03:45airlied: that document speaks about flaws in pulseaudio as well, that I think pipewire fixes in some ways
03:45imirkin: so you do end up needing to do things in userspace
03:46imirkin: airlied: does pipewire get me bluetooth audio?
03:46imirkin: the bt stack had audio, and then they removed it, "let userspace deal with it" =/
03:46airlied: imirkin: should be the same as pulse in that regard
03:46imirkin: iirc there's a special pulse plugin for it
03:47imirkin: is there a special pipewire plugin for it?
03:47damo22: sorry for hijacking the conversation -> audio
03:47imirkin: ah, looks like it does
03:47imirkin: at least there are faq's about low audio quality on bluetooth
03:47airlied: imirkin: it does bluetooth stuff, not sure how it works though, it's magic to me :-)
03:47imirkin: so we know it works ;)
03:48airlied: damo22: but yeah back to what happens once you port the driver, that's like step 0.0001 of having graphics
03:48airlied: there's a reason most ports end up porting all of the drm stuff
03:48damo22: yes i need drm
03:48damo22: so i can drop in userspace applications that already work
03:49damo22: i am not going to rewrite a video stack
03:49imirkin: those want to do ioctl's on device nodes
03:49airlied: and that's the tricky bit when you have userspace port
03:49damo22: ok so drm will be tricky
03:49imirkin: admittedly it's only a couple of key libraries
03:49imirkin: mostly mesa
03:51damo22: but rump exposes ioctl as a normal kernel api iiuc
03:51imirkin: "PipeWire handles Bluetooth audio devices if the pipewire-pulse package is installed"
03:51imirkin: oh well.
03:51imirkin:has avoided pulse
03:51damo22: i use jack
03:51imirkin: i use alsa :)
03:51damo22: and pulse dummy -> jack
03:52damo22: i have a 14in/12out mixer USB2 audio device that works flawlessly with alsa
03:53imirkin: i have onboard sound =]
03:53imirkin: i think i still have my sb live! somewhere
03:53imirkin: couldn't bring myself to throw it out
03:54imirkin: along with a 29160
03:56damo22: not sure what it is
03:56imirkin: aka "things that were cool in the year 2000"
03:56damo22: i had a AWE32
03:56imirkin: adapter 29160 -- 2x U160 scsi board
03:57imirkin: i've thrown out my scsi drives, so why i keep that board around, i couldn't tell you
03:57damo22: sandybridge/ivybridge laptops are good because they have coreboot support for the ram and ME can be partially wiped
03:58damo22: they cold boot in 1second
04:02damo22: i recently bought a W520 thinkpad, it has an NV card i need to reboot and turn it on at some point
04:02imirkin: i have a T420s
04:03damo22: the W has 4 dimm slots :D
04:04imirkin: and 4 extra pounds
04:05damo22: haha yes
04:05damo22: its a pretty cheap desktop though
04:05imirkin: my metric for a laptop is that i have to be able to comfortably hold it by the edge with one hand
04:05imirkin: if i can't, it's too heavy
04:05imirkin: if i work out more, i could get a heavier laptop :)
04:06damo22: flex those hands
04:08damo22: well youve been doing a lot of that recently typing the code for the nv50, maybe you can get a heavier one soon
04:18damo22: what is libGL.so supposed to point to on fc?
04:19damo22: lrwxrwxrwx. 1 root root 14 Jul 29 2020 libGL.so -> libGL.so.1.7.0 but that is missing
04:22imirkin: errr ... it's supposed to be the GL library
04:22imirkin: perhaps something got messed up with glvnd?
04:22damo22: i installed nvidia crap i think i need to clean up
04:27damo22: i removed all my gl stuff and reinstalled mesa-libGL
04:27damo22: but now theres no libGL.so in /usr/lib64
04:27imirkin: is there a libOpenGL?
04:28damo22: ah yes
04:28imirkin: you need to install glvnd
04:28imirkin: iirc it's supposed to supply the libGL now
04:28imirkin: or if this is gentoo
04:28imirkin: you need to ... do something
04:29imirkin: i forget
04:29imirkin: point it at the xorg libgl
04:29damo22: its fedoa
04:29damo22: its fedora
04:33damo22: ok i removed the nvidia shit and sudo dnf reinstall libglvnd* mesa-*
04:50damo22: imirkin: i dont have Xorg installed, do i need it? it cant open display
04:50imirkin: you did the x11_egl_glx target right?
04:51imirkin: i think X is required for that, sorry. you could have picked a diff target
04:51imirkin: but i've had various troubles with other ones, so i always use this one
04:51damo22: where do i get a list of targets?
04:52damo22: i wouldnt mind trying a plain one that doesnt need X
04:52imirkin: ls targets
04:52imirkin: (it's a good system, right)
04:52imirkin: there's a surfaceless one which ought to work. probably.
04:53imirkin: but iirc i had unspecified troubles with it in the past years
04:53imirkin: since i have X, i just don't need to play those games :)
04:53imirkin: i have enough problems. i don't need more.
04:53damo22: yeah no worries
04:53damo22: but i prefer not to install X on my server, ill try again with surfaceless
05:06damo22: heh it needs EGL already inited
05:06damo22: so it probably needs a gbm thingy which is missing
05:07damo22: FATAL ERROR: Got EGL_NOT_INITIALIZED: initialize(m_eglDisplay, &eglMajorVersion, &eglMinorVersion) at tcuSurfacelessPlatform.cpp:277
05:07imirkin: did you build mesa with gbm?
05:08damo22: i didnt point deqp at your mesa
05:08damo22: only at runtime
05:08damo22: do i need to build deqp with a param that points to your built mesa?
05:10damo22: ok will check if i have gbm devel lib
05:11damo22: ah was missing that
05:11imirkin: it's in the mesa build params
05:11imirkin: -Dplatforms=something probably
05:18damo22: ok it autodetected it now GBM: yes
05:18damo22: EGL/Vulkan/VL platforms: x11 wayland surfaceless drm
05:32damo22: hmm you might be right, it does not seem to work
05:32damo22: it cannot init EGL
05:33damo22: i think i might be missing GLES31
05:33imirkin: are you running with the env vars i told you about?
05:33imirkin: one of them should force the driver to expose ES 3.1
05:33damo22: MESA_EXTENSION_OVERRIDE=GL_OES_texture_buffer MESA_GLES_VERSION_OVERRIDE=3.1 LD_LIBRARY_PATH=/home/damien/git/mesa-bin/lib64 ./modules/gles31/deqp-gles31 --deqp-visibility=hidden -n dEQP-GLES31.functional.compute.basic.shared_atomic_op_*
05:33imirkin: probably doesn't matter
05:34imirkin: but i only ever run it from the "gles31" dir
05:34imirkin: as the cwd
05:34imirkin: what exact error do you get?
05:35damo22: target implementation = 'Surfaceless'
05:35damo22: FATAL ERROR: Got EGL_NOT_INITIALIZED: initialize(m_eglDisplay, &eglMajorVersion, &eglMinorVersion) at tcuSurfacelessPlatform.cpp:277
05:35damo22: ./test-run: line 1: 69746 Killed MESA_EXTENSION_OVERRIDE=GL_OES_texture_buffer MESA_GLES_VERSION_OVERRIDE=3.1 LD_LIBRARY_PATH=/home/damien/git/mesa-bin/lib64 ./modules/gles31/deqp-gles31 --deqp-visibility=hidden -n dEQP-GLES31.functional.compute.basic.shared_atomic_op_*
05:35imirkin: can you run eglinfo?
05:36damo22: GBM platform:
05:36damo22: libEGL warning: failed to open /dev/dri/card0: Permission denied
05:36imirkin: that's probably not ideal
05:40damo22: i put myself in the video group and now eglinfo looks good
05:40damo22: gles31]$ LD_LIBRARY_PATH=/home/damien/git/mesa-bin/lib64 ./deqp-gles31 still fails
05:40damo22: eglinfo only works with the override for lib path
05:42imirkin: you're doing the overrides for deqp-gles31 right?
05:42imirkin: (sorry i keep asking, but it's just a very simple explanation to things not working)
05:42imirkin: so yeah. i dunno. egl is not my wheelhouse
05:43imirkin: one would have to figure wtf is going on
05:43imirkin: you could use gdb to see what's going wrong with eglInitialize
05:43imirkin: i.e. just add a break on _eglInitialize
05:43imirkin: and see wtf is going on
05:43damo22: yep no worries
05:43imirkin: sorry :(
05:48damo22: you did warn me about surfaceless, but i pushed on
05:53imirkin: it should work.
05:53imirkin: iirc it doesn't because of some platform enum mismatch
05:53imirkin: but i dunno who's wrong
06:38damo22: does the wayland target work?
06:38imirkin: no clue, never tried it
06:38damo22: or will it work if i install a desktop environment on fedora and use x11_egl_glx
06:39imirkin: you don't need a desktop env
06:39imirkin: just X
06:49damo22: FATAL ERROR: GLX protocol not supported by X server at tcuLnxX11GlxPlatform.cpp:247
06:50imirkin: not your day, huh
06:51imirkin: pastebin xorg log?
06:57imirkin: makes it sound like X doesn't have acceleration working at all
06:58damo22: missing /usr/lib64/dri/nouveau_dri.so
06:58imirkin: ah yeah, you need that
06:59imirkin: doesn't have to be the one you just compiled
06:59imirkin: but it can be
06:59imirkin: should be able to just symlink it directly there
06:59imirkin: (or copy)
07:04imirkin: i'm off
07:04imirkin: good luck
07:05damo22: TestResults.qpa worked, but test failed
07:09damo22: 01:00.0 VGA compatible controller : NVIDIA Corporation GT200 [GeForce GTX 280] [10de:05e1] (rev a1)
07:11damo22: DISPLAY=:0 MESA_EXTENSION_OVERRIDE=GL_OES_texture_buffer MESA_GLES_VERSION_OVERRIDE=3.1 LD_LIBRARY_PATH=/home/damien/git/mesa-bin/lib64 ./deqp-gles31 --deqp-visibility=hidden -n dEQP-GLES31.functional.compute.basic.shared_atomic_op_*
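The invocation above can also be assembled programmatically, which keeps the environment overrides in one place and avoids the shell trying to glob-expand the trailing `*` in the test pattern. A minimal sketch: `build_deqp_invocation` is a hypothetical helper, and the variable names, binary name and flags are copied from the command line above rather than from any deqp documentation.

```python
def build_deqp_invocation(mesa_lib_dir, display=None):
    """Return (env_overrides, argv) mirroring the command line quoted above."""
    env = {
        "MESA_EXTENSION_OVERRIDE": "GL_OES_texture_buffer",
        "MESA_GLES_VERSION_OVERRIDE": "3.1",
        "LD_LIBRARY_PATH": mesa_lib_dir,
    }
    if display is not None:
        env["DISPLAY"] = display  # only needed for the X11-based target
    argv = [
        "./deqp-gles31",
        "--deqp-visibility=hidden",
        "-n",
        # Passed as a single argv entry, so no shell globbing is involved.
        "dEQP-GLES31.functional.compute.basic.shared_atomic_op_*",
    ]
    return env, argv


if __name__ == "__main__":
    env, argv = build_deqp_invocation("/home/damien/git/mesa-bin/lib64",
                                      display=":0")
    print(" ".join(f"{key}={value}" for key, value in env.items()),
          " ".join(argv))
```

To run it, the overrides would be merged into the current environment and handed to `subprocess.run` with the gles31 directory as the working directory.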
07:12damo22: * d24ae47f789 (HEAD, imirkin/nv50_compute) nv50/ir: don't load value at same time as attempting to lock
07:15damo22: http://paste.debian.net/plain/1187778/ <-- xorg.log
10:04pmoreau: imirkin: I can confirm damo22’s results, going from 2 passed 2 failed to 4 failed; DEBUG=1 output for TGSI can be found here: https://gitlab.freedesktop.org/pmoreau/mesa/-/snippets/1668.
12:31karolherbst: tagr: ehhh.. I think your modifier patches actually broke mutter with kms modifiers enabled :/
12:32karolherbst: still bisecting, but doesn't look too good
12:40karolherbst: could also be that my setup is bonkers.. but let's see
12:57karolherbst: 129d83cac2accc4a66eae50c19ac245b864dc98c broke it
13:35imirkin: pmoreau: sad :(
13:41imirkin: pmoreau: very odd
13:41imirkin: the code looks identical-enough
13:42karolherbst: I'd like to blame scheduling, but... :D
13:50imirkin: pmoreau: actually i think i see it
13:50karolherbst: imirkin: maybe some caching is bonkers?
13:50imirkin: just a vanilla fail
13:50imirkin: i failed at one of the compiler stages
13:50imirkin: will figure it out.
13:50imirkin: the code is wrong.
13:51imirkin: i think i hooked something up wrong, as i was messing with defs/etc
13:51imirkin: and it ends up using an undef
13:51imirkin: i.e. the most vanilla of all vanilla fails
13:53imirkin: not obvious what i did wrong, but the code coming out is def wrong
13:54imirkin: will work it out.
13:54imirkin: but not now.
13:54imirkin: damo22: pmoreau: thanks for testing!
13:54imirkin: i should have an update tonight
14:26imirkin: yeah, i messed up the BB setup somewhere
14:26imirkin: will need to go over it a bit more carefully
14:26imirkin: this causes RA to, essentially, fail
15:13imirkin: yeah, i tried to skimp on some of the looping i think
15:21imirkin: moral of the story: it was failing for me even in the single thread/etc case - i figured it was due to something unrelated. nope - highly related.
19:41karolherbst: ahhh... more tegra bugs
19:44pmoreau: Fun times! 🙂
19:45RSpliet: karolherbst: There's still no easy installation of Fedora on the Jetson TX1, is there?
19:45karolherbst: there is
19:45karolherbst: you just need to flash the SPI once, then it supports UEFI based booting
19:45RSpliet: :-O since when?
19:45karolherbst: since there is UEFI support
19:45RSpliet: Oh! Right... is there some tutorial I can follow easily?
19:46karolherbst: just download newest images
19:46karolherbst: newest l4t and newest base image for the OS
19:46karolherbst: the SPI flashing is annoying
19:46karolherbst: but you only have to do it once
19:46RSpliet: That's the nano. Don't suppose that also works for the TX1?
19:46karolherbst: then you can use the arm-image-installer
19:46karolherbst: uhm... it should?
19:47karolherbst: the model numbers are just different
19:47karolherbst: essentially you just need to use fedoras uboot and flash the newest SPI firmware
19:47karolherbst: to enable uefi
19:47karolherbst: but yeah
19:47karolherbst: I only tried it on the nano
19:47karolherbst: I can ask if you want to :D
19:48RSpliet: arm-image-installer doesn't mention support for tegra
19:48karolherbst: RSpliet: doesn't matter
19:48karolherbst: you just flash grub2 on the sd card and uefi picks it up ;p
19:48karolherbst: just like a desktop
19:48RSpliet: Ah ok that sounds doable
19:48RSpliet: I might free up an SSD some day for it
19:48karolherbst: hence the firmware flashing
19:48RSpliet: ... can it boot from SSD?
19:49karolherbst: I think not... let me read through emails
19:49karolherbst: I am involved in this work :D
19:49karolherbst: we were "pestering" nvidia about uefi support, now they added it
19:49karolherbst: RSpliet: what board do you have?
19:50karolherbst: just the normal TX1?
19:51karolherbst: RSpliet: yeah.. the TX1 also seems to have uefi support
19:51karolherbst: just need to figure out what you have to flash :D
19:51karolherbst: and how you get it into FRC mode
19:52RSpliet: Yeah just the normal TX1
19:52karolherbst: RSpliet: one thing though, after flashing, the next boot needs a valid OS on the sd card, otherwise it might fail
19:52karolherbst: or however the TX1 one boots
19:52RSpliet: Tbf, I actually got a Ryzen-based mini-PC recently because I dreaded the ARM bring-up dance
19:52karolherbst: the fedora provided uboot also supports TFTP booting
19:52karolherbst: just saying
19:53karolherbst: yep... it was painful in the past
19:53karolherbst: had a huge script to automate flashing and everything
19:53karolherbst: now I only needs this medai arm-image-installer which removes like all the pain :D
19:56RSpliet: Yeah, it does sound like it helps a bit
19:57karolherbst: "a bit"
19:57RSpliet: Well, I think for the... what's it called
19:57RSpliet: the Snowball board
19:57karolherbst: I am sure you can also just flash the iso files and bring up the installer
19:57karolherbst: but having the installed system already there kind of makes it faster :D
19:57RSpliet: There were scripts to create a filesystem. The pain was in the "first n megabytes needed to contain 6 layers of bootloader data"
19:58RSpliet: once you had that prepared, it was flash & go
19:58karolherbst: ahh yeah
19:58RSpliet: It's not my first rodeo :-P
19:58karolherbst: you won't need any bs like this anymore :p
19:58karolherbst: those "hey, let's put all the data into device partitions" insanity always bothered me
19:59imirkin: karolherbst: what are the additional issues?
19:59karolherbst: the only annoying thing about the flashed UEFI support is, you can't replace the boot logo anymore :( *much sad*
19:59karolherbst: imirkin: some other mem corruptions
19:59karolherbst: less obvious though
19:59karolherbst: so I will bisect it
19:59imirkin: did you decide to run deqp or something?
19:59karolherbst: nope, gnome-shell
19:59imirkin: even worse :)
20:00karolherbst: well, that's what users care about :p
20:04RSpliet: Versatile Express, Snowball, Sabre Lite, this Qualcomm inforce thing, Olimex A10/A20... yeah, I've had my share of that pain :-P
20:23karolherbst: mhhh "fd = -1"
20:25karolherbst: this is annoying
20:25karolherbst: imirkin: so the segfault is inside cli_push_get
20:25karolherbst: pushbuf_kref called with a NULL bo
20:25karolherbst: ohh wait...
20:29imirkin: that means you made some poor life choices
20:29karolherbst: I guess
20:29karolherbst: but it used to work...
20:30imirkin: with null? i don't think so
20:30imirkin: i mean, i added an assert at some point
20:30imirkin: to make it easier to detect problems
20:30karolherbst: I mean.. running gnome-shell used to work
20:32karolherbst: I bet 20.3 works and something since then broke it
20:33imirkin: i would not bet against that :)
20:33imirkin: 21.0 has been breakage left and right
20:33karolherbst: that terrible?
20:33imirkin: i mean ... you saw it
20:34imirkin: all classic drivers got broken too
20:34imirkin: i'm not saying things are STILL totally broken
20:34imirkin: but there were multiple things that got broken throughout
20:34imirkin: i didn't get any issues with CTS or dEQP
20:34imirkin: back when i was fixing stuff
20:35karolherbst: I really need to find some time to finally wire up a machine for CI
20:35imirkin: there was another round of breakage after that though
20:35imirkin: so i haven't fully retested
20:35imirkin: nv50 got busted for example: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4380
20:35karolherbst: I mean.. I have two laptops I don't use, and the jetson, and a "server desktop" + switch and everything fancy
20:35karolherbst: huh? nir to tgsi landed?
20:36imirkin: people have been removing perfectly working tgsi shaders from st/mesa
20:36imirkin: and making them be nir instead
20:36imirkin: they're tiny trivial shaders, so i don't much care
20:36imirkin: but the conversion of those trivial shaders gained a problem
20:36imirkin: which in turn broke pbo
20:48imirkin: sigh. copying code works so much better if you use copy/paste instead of look + type
20:48imirkin: apparently failLockBB != tryLockBB
20:50imirkin: getting the cfg totally wrong is one great way to mess up RA...
20:50pmoreau: But you can’t fail if you don’t try. 😉
20:52imirkin: pmoreau: going to have an updated branch for you in a sec
20:53pmoreau: Not sure if I’ll get to reviewing any patches today either… working on solving an unrelated bug first but getting very close now.
20:53pmoreau: I will definitely have some time during the weekend, and I need to review the patches quickly otherwise next time I will look, there will be 40 additional patches to look at. 😆
20:53pmoreau: 👍️That I can test tonight
20:53imirkin: pmoreau: well, i haven't made any changes to that MR
20:53imirkin: maybe i added one or two tiny patches
20:54imirkin: pmoreau: ok, check nv50_compute branch now
20:54imirkin: same instructions as before
20:54imirkin: but this time it might actually work =]
20:54pmoreau: When you said “in a sec”, you really meant it! :o
20:54imirkin: pmoreau: well, more like "in a min", but i was finalizing testing at that point already
20:56pmoreau: Passed: 4/4 (100.0%)
20:56imirkin: ok, now can you peel back one by one
20:56imirkin: to figure out what was necessary
20:56imirkin: and what was not
20:56imirkin: just like checkout HEAD^
20:57pmoreau: I’ll do that a bit later or tomorrow.
20:57imirkin: *ideally* only that first one is needed
20:57imirkin: i.e. flip EQ -> LT
20:58imirkin: in the meanwhile i'll try to figure out how to make the global memory barrier thing happen
20:58imirkin: i think i'll have to hand it a "scratch" buffer
20:59imirkin: and/or at a fixed address
21:06karolherbst: okay yeah.. so 20.3 at least doesn't crash
21:06imirkin: ship it!
21:06karolherbst: but not sure if my 4k screen is too much for it
21:06imirkin: that'd be sad
21:06imirkin: tegra is supposed to have dynamic clocks i thought
21:06karolherbst: ehhh.. no
21:06imirkin: or adjustable?
21:06karolherbst: but the output is just black :D
21:07karolherbst: but it's broken
21:07karolherbst: at least on my board
21:07imirkin: but it renders REALLY fast ;)
21:07karolherbst: it works to some degree, but...
21:07karolherbst: well.. not for too long
21:07imirkin: all the frames blend together, that's how fast it's rendering
21:07imirkin: black on black :)
21:08imirkin: any errors in dmesg?
21:08karolherbst: but my display is also very dumb sometimes
21:08karolherbst: I mean.. it displays the console just fine
21:09karolherbst: the board turned off...
21:10imirkin: sounds like my TK1
21:11karolherbst: I actually have it powered over PoE ... :/
21:11imirkin: my TK1 dies due to, what it feels like, network traffic
21:11imirkin: not great for nfsroot
21:12karolherbst: well, you have three ways to power the nano
21:12karolherbst: USB with 2A
21:12karolherbst: PoE with 3A and jack with 4A
21:12karolherbst: I am using a poe splitter via jack, because... it only does 5V and a poe header to downconvert was too painful for me
21:12karolherbst: so I have a splitter which downconverts the voltage
21:13karolherbst: no idea what happened though
21:13karolherbst: the poe light on my switch was still on
21:14karolherbst: so, let's see
21:14karolherbst: okay, getty runs on the display
21:15karolherbst: and it's all 4k
21:15karolherbst: now it works
21:15karolherbst: login prompt is there
21:16karolherbst: the known syncing issue as well, but everything works.. nice
21:16karolherbst: so another round of git bisect
21:19imirkin: as a software engineer, some ridiculous fraction of your time is spent watching progress bars =/
21:21karolherbst: buy faster machines...
21:22karolherbst: actually.. I could, but then I would have to cross compile and scp and shit :/
21:22imirkin: aka more progress bars ;)
21:22karolherbst: at some point I think the best idea is to buy a beefy 24 core machine and set up a gcc server
21:22imirkin: i did do everything on my desktop for arm
21:22imirkin: and just nfsroot
21:22karolherbst: too lazy to set it up
21:22imirkin: took a little bit to get it set up initially, but once it's going it's fine
21:23imirkin: well, you could just make one dir nfs-mounted
21:23imirkin: where you have the "updated mesa" or whatever
21:23karolherbst: it's fine for running processes, but gdm is this thing you only can launch as a service on a tty
21:23karolherbst: and then it just sucks
21:23karolherbst: so I just install mesa over the system
21:23imirkin: ah yea
21:23karolherbst: really. I like the idea of a gcc server
21:23imirkin: systemd makes everything simpler.
21:24imirkin: the "disk" is too slow
21:24Lyude: anyone know if it's possible for us to move a bo in memory to a new page table with nouveau, potentially while it's mapped by userspace?
21:24karolherbst: Lyude: uhh...
21:24imirkin: BO's can get migrated from VRAM to GART (and back again), no problem
21:24karolherbst: imirkin: you can add new LD paths that get picked up, but I am even too lazy for that :D
21:24imirkin: userspace is none the wiser afaik
21:25karolherbst: mDNS enabled gcc server...
21:25imirkin: Lyude: actually hmmm
21:25imirkin: not sure.
21:25imirkin: now that i think a bit more about it
21:25imirkin: ttm has these managers
21:25imirkin: and a buffer lives in one spot at a time
21:25imirkin: and can be migrated around
21:25Lyude: oh awesome - I'm about to write up a fix for the dumb bo stuff and I'm thinking that's probably a solution that would be quite nice to use. although I do need to double check that I understand precisely why we're getting a different page table in this situation (should only require me to trace some stuff with printk)
21:26imirkin: i assume that ttm is also fixing up mappings in the "background"
21:26karolherbst: imirkin: ohh btw, we have a fix for this vmm_del race :D
21:26imirkin: since nouveau itself doesn't have any code to actually insert PTE's/etc into the CPU VM
21:26imirkin: Lyude: what situation?
21:27imirkin: is the solution to move to GART?
21:27imirkin: coz then the answer is simple - gpu can't have large pages looking at GART :)
21:30Lyude: imirkin: no-the cursor situation yesterday ended up actually being a problem with using 128K page tables for the cursor plane, which means the easy fix is just going to be forcing smaller pages for small dumb bos. Note though I still need to trace why we end up with 128K pages in the dumb bo path and not the regular allocation path with the nouveau gem ioctls (which I'm going to do now), but I'm
21:30Lyude: mostly just curious if I go forward with forcing the use of specific page tables whether it has to be done on bo creation or if it can be done once it's been assigned to the actual cursor plane
21:30imirkin: hm, careful how you do the cutoff there. i guess if you do the cutoff at 256K that should be fine
21:31imirkin: basically i wouldn't want any legit scanout buffer to end up with small pages
21:31Lyude: imirkin: yeah-that's why I'm thinking to limit it to the cursor max size - and only limit it to the largest page we know works for such small buffers (which I -think- should be 64k pages), and only in the dumb bo path
21:31imirkin: Lyude: yeah, i guess you could add a nvif to split large pages
21:32imirkin: should be completely transparent to everything
21:32Lyude: cool :)
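
The cutoff idea discussed above can be sketched roughly as follows. This is only an illustration, not nouveau code: the function name `pick_page_shift`, the constants, and the assumption that the largest cursor is 256x256 at 4 bytes per pixel (262144 bytes, the buffer size mentioned earlier in the log) are all hypothetical. The choice of 4 KiB as the small page size and an inclusive cutoff are likewise assumptions.

```python
# Hypothetical sketch; none of these names exist in nouveau.
SMALL_PAGE_SHIFT = 12   # 4 KiB pages (assumed small page size)
LARGE_PAGE_SHIFT = 17   # 128 KiB pages (the large page size seen in this log)

# Assumed largest cursor: 256x256 pixels at 4 bytes per pixel.
CURSOR_MAX_BYTES = 256 * 256 * 4  # 262144 bytes, i.e. 256 KiB


def pick_page_shift(size_bytes: int, is_dumb_bo: bool) -> int:
    """Return the page shift to request for a newly created buffer.

    Only dumb buffers small enough to be used as a cursor are forced to
    small pages; everything else (including real scanout buffers) keeps
    large pages.
    """
    if is_dumb_bo and size_bytes <= CURSOR_MAX_BYTES:
        return SMALL_PAGE_SHIFT
    return LARGE_PAGE_SHIFT
```

With these assumptions a 262144-byte dumb buffer gets small pages, while a full 4k scanout buffer or any buffer created outside the dumb bo path keeps large pages.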
22:57Lyude: skeggsb: whenever you're on: what are the first two numbers I'm seeing here after "user:"? [ 3463.709391] nouveau 0000:1f:00.0: mmu: user: 00000:00101: flush: 1
23:00skeggsb: PDE and PTE
23:00skeggsb: lucky you're not on pascal :P
23:00imirkin: Lyude: it's from nvkm_vmm_trace
23:00imirkin: it iterates over the pte/pde levels
23:02Lyude: imirkin: I know that much-I'm mostly just wondering because I think the solution we found last night might be working around the actual issue. mostly because I realized in the igt case where we're just blitting back and forth to vram, we're technically also scanning out from the same size page table as modetest
23:02skeggsb: none of that stuff is involved in disp btw, it bypasses the mmu
23:02skeggsb: uses ctxdmas instead
23:03Lyude: skeggsb: what do you mean exactly?
23:03skeggsb: no page tables, an old-school nv3-era thing called a "context dma object" is used instead
23:03Lyude: ok-I didn't realize those bypassed the mmu entirely, but I'm starting to remember that now
23:03imirkin: it's basically just tuple of (memory type, start, limit)
23:04imirkin: made a lot more sense before mmu's were a thing
23:04skeggsb: they don't have to, you can point a ctxdma through the usual mmu too, though i dunno that disp supports it
23:04skeggsb: we don't anyway, we don't setup an instance block for it anywhere
23:04skeggsb: i've never seen nvidia either
23:05Lyude: imirkin: yeah - I had to use them with CRCs if you recall :p
23:05imirkin: i don't, but i can believe it :)
23:05imirkin: damo22: btw, looks like i fixed the tests, feel free to pull the updated branch
23:06imirkin: (but pmoreau already confirmed they worked)
23:07skeggsb: oh, nah, nvidia have NV_DMA documented in dev_disp.ref, no mention of going through mmu
23:07Lyude: skeggsb: i'm starting to wonder if we might just be messing up the mmap then, if we know we're able to scanout from 128k pages in the cursor size then why does changing the page size to 4K fix the artifacts for modetest
23:08skeggsb: we don't set NV_DMA_PAGE_SIZE == BIG for the cursor ctxdma
23:08skeggsb: we use a "whole vram" ctxdma for that
23:08skeggsb: it bypasses the usual plane prepare() stuff that creates one from a fb because of the original nv50 being a complete pain
23:09skeggsb: and also because it was easier than propagating that state to a head to do that in a core channel update
23:11Lyude: skeggsb: well yeah - that just kinda makes it more confusing then if the page size we set has no effect on how it's scanned out, but we're still getting different results on the scanout when we're forcing the page size for the bo that ends up getting used for the cursor scanout to 4k
23:12skeggsb: yes, because everything else that touches the cursor object has a mapping through the mmu using large pages
23:12skeggsb: disp doesn't
23:12skeggsb: and the gpu plays funny tricks inside large pages
23:13Lyude: skeggsb: ignoring scanout for a second - does the page size change how the data is physically arranged in the vram?
23:13Lyude: like - other than the obvious 'where it's allocated'
23:13skeggsb: yes, that's what i mean by "plays funny tricks"
23:13skeggsb: some GPUs rearrange the data in there for whatever (presumably performance) reason
23:14Lyude: ahhhhh, ok, I -think- the issue is in how we're mmaping dumb buffers then, not how we're allocating them :)
23:14Lyude: for the application writing the data to the dumb bo I mean
23:14skeggsb: the issue is just disp needs to make the right ctxdma
23:15skeggsb: (or force small pages for any buffer small enough to end up as a cursor)
23:16skeggsb: i don't see a problem with either option, the latter is easier though
23:16skeggsb: i don't see any real gain in large pages for a small, pitch surface
23:16imirkin: yeah, 256k seems like a reasonable cutoff
23:17imirkin: for like a 4k scanout image -- that's a lot fewer PTE entries
23:17imirkin: (4k as in 3840xwhatever)
23:18pmoreau: x2160, i.e. 2x1080
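
To put rough numbers on the "a lot fewer PTE entries" remark, here is a small worked calculation. It assumes a 3840x2160 scanout buffer at 4 bytes per pixel with a tightly packed pitch (no alignment padding), and the helper name `pte_count` is made up for this example.

```python
def pte_count(width: int, height: int, bytes_per_pixel: int, page_size: int) -> int:
    """Number of page table entries needed to map a linear buffer."""
    size = width * height * bytes_per_pixel
    # Round up: a partially used page still needs its own entry.
    return (size + page_size - 1) // page_size


small = pte_count(3840, 2160, 4, 4 * 1024)     # 4 KiB pages
large = pte_count(3840, 2160, 4, 128 * 1024)   # 128 KiB pages
print(small, large)  # 8100 254
```

Under these assumptions the buffer is 33177600 bytes, so large pages cut the mapping from 8100 entries to 254, whereas a 256 KiB cursor-sized buffer only goes from 64 entries to 2.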
23:18Lyude: yeah... I think I'm just going to do that then. I guess I was just surprised that even though the cursor is being scanned out from the same ctxdma in both cases, that the ce would arrange the pixel data correctly in the igt case such that disp has no issues but just writing with cpu + mmap doesn't
23:19skeggsb: yeah, i'm not 100% sure how to explain that either tbh
23:19Lyude: ahhh-ok, then I'm not misunderstanding that lol
23:20skeggsb: it could be interesting to dig into, but don't have time rn
23:21skeggsb: oh right
23:21skeggsb: hmm, no.
23:24Lyude: maybe i'll figure it out someday, for now though I'm just going to go with the fix we had before. that being said though, looking into this more definitely taught me a bit about nvidia mm stuff I didn't know, so that's nice :)