00:04 Lyude: Hm. Is anyone aware of any scratch registers that can be accessed through mmio?
00:59 gnarface: don't look at me, i don't know
00:59 imirkin_: Lyude: define 'scratch'
01:03 RSpliet: Every register is a scratch register if you're brave enough
01:03 imirkin_: RSpliet: i wouldn't start with 0x200 though :)
01:10 gnarface: hehe
05:30 imirkin: hrmph. fbcon=rotate:1 didn't work for me. very odd. (and yes, i have CONFIG_FRAMEBUFFER_CONSOLE_ROTATION compiled in...)
06:53 imirkin: weird. for some reason i'm having trouble getting a draw to go to 16k. it goes to 8k just fine... but nothing over 8k.
06:54 HdkR: Nobody needs more than 8k vertices anyway
06:56 imirkin: 8k pixels
06:56 HdkR: ah
06:56 imirkin: with a single tristrip
06:57 imirkin: too much rast?
06:57 HdkR: haha
06:57 imirkin: or something silently clipping at 8k that i'm not seeing
10:00 mupuf: imirkin: it is not uncommon to have a maximum rendering stride
10:00 mupuf: intel has 16k on ivb+
10:01 mupuf: what does nvidia advertise as a RT limit?
10:04 HdkR: Depends on generation :P
10:08 HdkR: It's 8k to 32k depending on gen
10:12 HdkR: Tesla 8k, fermi 16k, pascal 32k
10:27 mupuf: HdkR: thanks, sounds about right
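The per-generation limits HdkR quotes can be written down as a small lookup. This is only a sketch built from the three figures in the chat (Tesla 8k, Fermi 16k, Pascal 32k). The table name, the helper, and reading "8k" as 8 * 1024 pixels are assumptions made for illustration, and generations nobody mentioned are left out rather than guessed.

```python
# Render-target size limits as quoted in the chat, in pixels per dimension.
# Assumption: "8k" means 8 * 1024. Generations not named in the chat are omitted.
QUOTED_RT_LIMIT = {
    "tesla": 8 * 1024,
    "fermi": 16 * 1024,
    "pascal": 32 * 1024,
}


def fits_render_target(generation, width, height):
    """Return True if both dimensions are within the quoted limit."""
    limit = QUOTED_RT_LIMIT[generation.lower()]
    return width <= limit and height <= limit
```

For example, `fits_render_target("tesla", 16384, 16)` is false under these figures, while the same size passes for `"fermi"`.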
13:22 karolherbst: imirkin: you didn't have a patch ready for the blitter MS issue, right?
13:35 karolherbst: imirkin: mhh, "KHR-GL45.packed_pixels.varied_rectangle.rgba8" failed randomly
13:49 pmoreau: karolherbst: \o/ I updated my older SPIR-V -> NVIR branch to the latest master + the new clover patches, and it still works (at least on a simple example). So it should be easy now to figure out what is missing for NV50 support.
13:49 karolherbst: :) cool
13:51 karolherbst: pmoreau: btw, you should probably switch over to my nouveau_nir_spirv_opencl_hmm_v4 branch
13:51 pmoreau: I think I’m already running on it, though could be an older version.
13:52 karolherbst: yeah
14:38 seaword: I'm running linux on a mid-2010 macbook pro. The Apple website says the card in this machine is a GeForce GT 300M which supports OpenGL 3.3. When I run glxinfo it states the OpenGL version is 2.1.
14:38 seaword: I'm curious to know if it's possible to run 3.3 on this machine?
14:39 seaword: Apologies if this is the wrong place to ask, I'm not too sure where to begin to find this out.
14:42 HdkR: `glxinfo | grep "renderer string:"` what does that say?
14:43 seaword: Mesa DRI Intel(R) Ironlake Mobile
14:43 HdkR: It's because there are two GPUs in that system and you're rocking the old Intel one that is stuck on GL 2.1
14:44 seaword: Ah right, so I need to try and switch to using the nvidia card right?
14:44 HdkR: Yes
14:44 seaword: Ok cool, thanks for the help!
15:18 pmoreau: seaword: What if you try `DRI_PRIME=1 glxinfo -B`?
15:19 seaword: I got the program running with DRI_PRIME=1 :)
15:20 pmoreau: Nice! :-)
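The `DRI_PRIME=1` trick above only changes the environment of the one program being launched. A minimal sketch of doing the same from a script, where `prime_env` is a hypothetical helper name and not part of Mesa or any library:

```python
import os


def prime_env(base_env=None):
    """Return a copy of the environment with DRI_PRIME=1 set.

    The caller's environment is left untouched, so only the child
    process started with this copy is asked to use the second GPU.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DRI_PRIME"] = "1"
    return env
```

Passing the result to something like `subprocess.run(["glxinfo", "-B"], env=prime_env())` would be the scripted equivalent of pmoreau's command, assuming `glxinfo` is installed.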
15:20 seaword: It will be interesting to see how this laptop goes using the nvidia card. Back in 2010/11/12 when I was using it every day it used to crash (this was running OS X). The whole machine would lockup and you'd have to reboot. There was no telling when it would happen, 5 minutes into using it or a couple of hours.
15:21 seaword: Apple acknowledged a problem with the mid-2010 but I was outside the repair window when I found out about it.
15:22 seaword: I always wondered what the issue was and I'm wondering if I'll run into it using the graphics card.
15:22 seaword: I should have mentioned the problem was to do with the graphics hardware.
15:22 seaword: So I'm now curious to see how this goes :)
15:22 karolherbst: seaword: dedfine "crash"
15:22 karolherbst: *define
15:23 seaword: The entire machine locked up (grey screen - maybe it was black, I can't remember).
15:23 karolherbst: ahh, well that basically means a bug inside their driver though
15:23 seaword: You had to turn the machine off and on again.
15:23 karolherbst: it might be that your GPU behaved in a way the driver didn't expect it though
15:24 seaword: I wondered about that but apparently there was no software fix. A guy released a utility to choose only the onboard graphics and that was a workaround for many people.
15:24 karolherbst: and it might be even outside of what nvidia expected... but normally it's hard to tell
15:24 seaword: Ah right
15:25 seaword: Hopefully it's all good on linux. It's an 8 year old machine but still very usable and does everything I need.
15:25 karolherbst: can be that nouveau handles it just fine, maybe it does not, maybe it would work/crash with nvidia, nobody knows before trying it out ;)
15:26 seaword: I'll report in if it happens :)
15:27 seaword: I only need DRI_PRIME=1 for blender. A change in a recent build of 2.8 means OpenGL 3.3 is now required.
15:28 seaword: I think because now they are only using the eevee renderer, as opposed to supporting cycles as well.
15:31 karolherbst: I see
15:32 karolherbst: seaword: is it the 320M?
15:32 seaword: I think it's the GT 300M but I will double check.
15:34 seaword: Yeh, NVIDIA Geforce GT 330M
15:36 karolherbst: mhh, sad. I doubt we are able to reclock that GPU because of GDDR3
15:50 pmoreau: \o/ Success!
15:53 pmoreau: Only needed to change `ld xx $yd a[0xzz]` to `ld xx $yd s[0x10 + 0xzz]`, and replace g0 by g15 in load/stores.
16:10 imirkin: mupuf: GL4 requires 16k
16:21 HdkR: TIL
16:22 pmoreau: karolherbst: The following (https://hastebin.com/hudemanafa.php) on top of your hmm_v4 branch is enough to get some simple OpenCL programs to run on Tesla.
16:22 karolherbst: \o/
16:22 karolherbst: ohhh
16:23 karolherbst: that "info->prop.cp.inputOffset" should be part of the lowering for nv50 though
16:23 karolherbst: I think... mhh
16:23 karolherbst: info->io.auxCBSlot as well I guess
16:23 pmoreau: I think I have a slightly less hacky way for the sampler in some other patch.
16:24 pmoreau: cp.inputOffset and io.auxCBSlot are done as part of nv50_ir_from_tgsi.cpp
16:24 karolherbst: right, but all those changes look like things we want to do inside lowering
16:24 imirkin: ok. so the test i wrote works on intel. but the drawing fails MISERABLY on nouveau.
16:24 karolherbst: nvc0 lowers input to c0[]
16:24 karolherbst: nv50 should lower that to s[] whatever
16:24 karolherbst: and g0[] -> gx[] whatever
16:24 imirkin: karolherbst: lowering is the wrong term. it's basically "linkage" information to the outside world
16:24 imirkin: the converter should do the appropriate thing
16:25 karolherbst: right, but currently that's done while lowering for nvc0
16:25 imirkin: i don't know specifically what you guys are talking about
16:25 imirkin: just explaining what all that "io" stuff is :)
16:26 karolherbst: it's a patch affecting the nir -> nvir conver
16:26 karolherbst: *converter
16:26 imirkin: k
16:26 imirkin: the converter has to be somewhat sensitive to its target
16:26 karolherbst: but I think we should just use INPUT for kernel inputs and g[] for global memory access
16:26 karolherbst: and let the lowering stage figure things out
16:26 imirkin: at least that's how we've done it in the past
16:27 pmoreau: karolherbst: I haven’t changed your code to not use INPUT for kernel inputs; the INPUT -> SHARED is done as part of nv50_ir_lowering_nv50.cpp
16:27 karolherbst: pmoreau: right, but you still add an offset
16:27 karolherbst: it's something we would end up with in a TGSI path as well
16:27 pmoreau: I add whatever offset is defined for that target
16:27 karolherbst: or any other IR essentially
16:28 karolherbst: so, what is sharedOffset?
16:28 pmoreau: Yes, and it is done in the TGSI path.
16:28 imirkin: if there's a reasonable way to move things into lowering phases, it's fine to move them out of the converters
16:28 imirkin: but it's not an absolutely necessary goal
16:29 karolherbst: pmoreau: it's not, that's disabled codde
16:29 karolherbst: *code
16:29 karolherbst: pmoreau: sharedOffset means "reserved space in s[]" but now you adjust an offset into INPUT memory
16:30 karolherbst: this might be correct for nv50, because INPUT -> SHARED
16:30 pmoreau: sharedOffset should be inputOffset + sizeof(kernel args) on Tesla, and 0 on Fermi+
16:30 karolherbst: but for nvc0 it's not
16:30 pmoreau: https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp#L2605-2634
16:30 karolherbst: pmoreau: I'm not arguing about whether that causes issues at runtime, I am more arguing about what would be the cleaner or saner thing to do
16:31 karolherbst: pmoreau: again, that's disabled code
16:32 pmoreau: What do you mean by “disabled code”? It doesn’t look to me like unused code nor ifdefed out code.
16:32 karolherbst: comment
16:33 karolherbst: line 2580
16:33 karolherbst: end 2677
16:33 pmoreau: Hum
16:34 karolherbst: it's a nice thing having an editor which actually tells you this through colors or something ;)
16:35 karolherbst: not saying I am not open to changing what I have in mind, but I think it would be cleaner to just let the target specific lowering code figure all that out, but using "info->prop.cp.inputOffset" and "info->prop.cp.sharedOffset" for s[]/a[] might make sense nonetheless
16:35 karolherbst: have to check up on that
16:37 pmoreau: Having it in the lowering code would mean that you don’t need to do it in every single some IR -> NVIR translator.
16:37 karolherbst: exactly
16:37 karolherbst: (also makes it easier to change target specific stuff or add stuff for new targets)
16:38 karolherbst: the IR translators really shouldn't be bothered with that
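The offset rule pmoreau states (sharedOffset is inputOffset + sizeof(kernel args) on Tesla, 0 on Fermi+) and his `a[0xzz]` to `s[0x10 + 0xzz]` change can be sketched as plain arithmetic. This is not the nouveau code: the function names are hypothetical, and the default input offset of 0x10 is taken only from the instruction rewrite quoted earlier in the log.

```python
def shared_offset(is_tesla, input_offset, kernel_args_size):
    """Start of user-visible shared memory, per the rule quoted in the chat.

    On Tesla the kernel inputs live in s[], so user shared memory starts
    after them. On Fermi+ nothing is reserved and the offset is 0.
    """
    if is_tesla:
        return input_offset + kernel_args_size
    return 0


def lower_input_nv50(offset, input_offset=0x10):
    """Map a kernel-input offset to (memory file, address) for nv50.

    Assumption: input_offset defaults to 0x10, as in the
    `ld ... s[0x10 + 0xzz]` rewrite mentioned in the log.
    """
    return ("s", input_offset + offset)
```

Keeping this in one target-specific place is the point karolherbst argues for: each IR -> NVIR translator would emit plain INPUT accesses and never see these constants.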
16:58 mupuf: imirkin: thanks for the info :)
17:17 pmoreau: Still some progress to be made: https://hastebin.com/naqakadaho.coffeescript
17:28 pmoreau: 😓 when providing the SPIR-V directly to clover, no one is checking that it uses the proper memory model, so passing a Physical64 on Tesla goes all the way down to the emitter which asserts in `nv50_ir::CodeEmitterNV50::emitLoadStoreSizeCS()`.
18:59 karolherbst: pmoreau: yeah.. I guess we should check that somehow
19:52 pmoreau: karolherbst: Implemented, gonna push that to the series.
19:56 karolherbst: pmoreau: I guess you add the check where we preparse the spir-v for the parameters, right?
20:00 pmoreau: I could, but I actually did it when validating the SPIR-V before returning out of clCreateProgramWithIL
20:00 karolherbst: ahh, sounds fair
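The kind of check discussed here can be sketched by reading the addressing model out of the module's `OpMemoryModel` instruction and comparing it to the device's address width. This is not pmoreau's clover patch, just an illustration under assumptions: the function names are made up, only little-endian modules are handled, and the device width is passed in by the caller (the Tesla case above implies 32 bits).

```python
import struct

SPIRV_MAGIC = 0x07230203
OP_MEMORY_MODEL = 14            # opcode of OpMemoryModel
ADDRESS_BITS = {1: 32, 2: 64}   # Physical32, Physical64 (0 is Logical)


def addressing_bits(binary):
    """Return 32 or 64 for a physical addressing model, None otherwise."""
    if len(binary) < 20 or len(binary) % 4:
        raise ValueError("not a SPIR-V module")
    words = struct.unpack("<%dI" % (len(binary) // 4), binary)
    if words[0] != SPIRV_MAGIC:
        raise ValueError("bad magic (big-endian modules are not handled here)")
    i = 5  # skip the five-word header
    while i < len(words):
        word_count = words[i] >> 16
        opcode = words[i] & 0xFFFF
        if word_count == 0:
            raise ValueError("malformed instruction")
        if opcode == OP_MEMORY_MODEL:
            if word_count < 3 or i + 1 >= len(words):
                raise ValueError("truncated OpMemoryModel")
            return ADDRESS_BITS.get(words[i + 1])
        i += word_count
    raise ValueError("no OpMemoryModel found")


def module_matches_device(binary, device_address_bits):
    """True if the module's physical addressing width equals the device's."""
    return addressing_bits(binary) == device_address_bits
```

With a check like this run at validation time, a Physical64 module offered to a 32-bit device is rejected up front instead of reaching the emitter.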
20:03 pmoreau: karolherbst: What do you get for https://hastebin.com/itusuwacok.pl ? I’m getting https://hastebin.com/ejajihulel.coffeescript which seems weird
20:05 karolherbst: pmoreau: mhh, I think there is still pointer support missing for bitcast
20:06 karolherbst: I have a patch somewhere to fix it up, but jekstrand also changed bitcast around a little
20:07 pmoreau: But AFAICT, id 19 is only used in the OpString. Are those used somehow by the spirv_to_nir pass or later by NIR?
20:07 karolherbst: the ids don't match
20:07 karolherbst: ohh wait
20:07 karolherbst: anyway, those ids don't match
20:07 karolherbst: usually you have to subtract 2
20:07 karolherbst: or 3 or something
20:07 karolherbst: no idea why
20:08 karolherbst: anyway, it will be the source of the bitcast where the current code expects a ssa value
20:08 karolherbst: but gets a pointer
20:14 pmoreau: Ah, I hadn’t thought spirv_to_nir would be using different IDs.
20:14 karolherbst: yeah.. no idea why that happens though
20:14 karolherbst: maybe something happens to the spirv after we write it to a file? dunno
20:15 pmoreau: Could be the linker now that I think of it
20:15 pmoreau: Though since it doesn’t get linked with anything, they should remain the same I think.
20:16 karolherbst: mind verifying this?
20:17 karolherbst: might be worth to dump the spirv later then or something
20:17 pmoreau: Yep, going to add that
20:18 karolherbst: I don't think we need to write out both versions, or maybe? dunno
20:18 karolherbst: guess it depends on what you want to debug
20:32 pmoreau: karolherbst: It’s the linking that is doing some renaming in a linear fashion: https://hastebin.com/upiledifoc.pl
20:32 pmoreau: s/linking/linker
20:32 karolherbst: ahh
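One way to picture why the IDs in the dump stop matching the original module: if a pass renumbers IDs densely in order of first appearance, every ID after a gap shifts down. The sketch below shows only that idea. Whether the linker used here renumbers in exactly this order is an assumption, and `compact_ids` is a hypothetical name.

```python
def compact_ids(ids_in_order):
    """Map each old ID to a new dense ID, numbered by first appearance.

    Assumption: numbering starts at 1 and follows first-use order; the
    real linker's traversal order may differ.
    """
    remap = {}
    for old in ids_in_order:
        if old not in remap:
            remap[old] = len(remap) + 1
    return remap
```

For instance, `compact_ids([4, 19, 4, 7])` maps 4 to 1, 19 to 2 and 7 to 3, so an error message naming a post-link ID has to be translated back before it can be matched against the original disassembly.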
20:54 imirkin: anyone running blob on fermi+ who can do an mmt for me?
20:58 pmoreau: Not me, sorry. I don’t have a Fermi lying nearby (only got my 2xTesla laptop with me).
20:59 imirkin: i don't understand what i'm doing wrong
20:59 imirkin: i can't get it to render beyond 8k of a RT
20:59 imirkin: something guardband-related perhaps
21:05 pmoreau: karolherbst: I think I’ll move the address bits check to the compilation part, where I check that the device supports the capabilities and extensions.
21:06 pmoreau: I’ll rework the series tomorrow if I have time, but I’ll start with going back to Connor’s patch.
21:54 karolherbst: imirkin: saw my comment regarding the blitter fix? I was wondering if you had a fix ready by now
22:44 imirkin: karolherbst: i'm trying to write a test
22:44 imirkin: and that test reveals a major issue
22:44 imirkin: unrelated to blitting
22:44 imirkin: so i've been focused on that
22:44 karolherbst: ahh, I see
22:44 imirkin: seems like fb's are broken past 8k
22:44 imirkin: however i can't believe we would not have seen that before
22:44 imirkin: so something else weird is going on maybe
22:45 karolherbst: it's not like that's something that happens all that often in general
22:45 karolherbst: it's
22:46 imirkin: no, but i can't believe there aren't tests
22:47 imirkin: my basic check passes on intel, so i know my test isn't just totally wrong
22:48 karolherbst: mhh
22:48 karolherbst: interesting
22:48 karolherbst: maybe there aren't?
22:49 imirkin: i wonder if there's some dumb "global" limit we're running into
22:50 imirkin: that gets set by the blob driver
22:59 imirkin: karolherbst: got blob running anywhere? i could use an mmt to check what's up...
22:59 karolherbst: imirkin: is it on a gp107 okay?
22:59 karolherbst: although not quite sure if all that still works here :/
22:59 karolherbst: long time ago when I was running the blob and used mmt
23:00 imirkin: well ... probably good to check with nouveau on another gpu too
23:00 imirkin: hold on, let me undo some changes
23:02 imirkin: karolherbst: http://paste.debian.net/hidden/1c39c46e/
23:03 karolherbst: hehe, no link to the raw file?
23:04 imirkin: http://paste.debian.net/downloadh/1c39c46e
23:04 imirkin: or: http://paste.debian.net/plainh/1c39c46e
23:04 karolherbst: ahh
23:05 karolherbst: mhh, that passes here with maxwell
23:05 karolherbst: uhm.. pascal
23:06 karolherbst: or is the value printed actually important and it always passes?
23:06 imirkin: it always passes ;)
23:07 imirkin: do you get a yellow screen?
23:07 imirkin: (without -fbo -auto)
23:07 karolherbst: not with nouveau :/
23:07 imirkin: and i bet the first print has 0.2's in it?
23:07 karolherbst: yeah
23:07 imirkin: yeah. that's bad.
23:08 imirkin: well, at least it's not just this GK208... yay
23:10 karolherbst: okay, seems like bumblebee still works here
23:10 karolherbst: 254.500000 32666.500000 0.500000 1.000000
23:10 karolherbst: with nvidia
23:11 imirkin: that's correct
23:11 imirkin: oh, right -- 32k on pascal+ according to HdkR
23:11 imirkin: nice.
23:11 imirkin: well, we only expose 16k
23:11 imirkin: but either way, it doesn't seem to work past 8k
23:12 imirkin: mmt plz ;)
23:12 karolherbst: yeah.. searching my script to actually do the mmt..
23:12 imirkin: valgrind --tool=mmt --mmt-trace-nvidia-ioctls --mmt-trace-nouveau-ioctls --log-file=foo.mmt the-program
23:19 karolherbst: sent it via email
23:21 imirkin: thanks =]
23:21 imirkin: now where's that magic "make it work" bit...
23:23 imirkin: crap, demmt doesn't like it
23:24 karolherbst: oh
23:24 karolherbst: why not?
23:25 imirkin: aha, local changes is why :)
23:25 karolherbst: :)
23:38 imirkin: crap, i don't see anything obvious =/
23:41 vincenttc: I have a question regarding drmModeSetCursor. I'm rendering to a gbm surface and using the result as the cursor bo. However, unless I add glFinish before drmModeSetCursor, the cursor isn't always visible. My guess is that this happens because the render operations haven't finished yet and this isn't checked/waited for when the cursor is set. Would this be considered a nouveau bug or is this something
23:41 vincenttc: the application should deal with?
23:43 imirkin: i think with bo sharing, the "far" side can't know when rendering is done
23:43 imirkin: there are syncfd's which can help, but i don't think nouveau implements that
23:45 vincenttc: imirkin: in this case it's on the same gpu btw, since I recognise you from #dri-devel ;)
23:45 imirkin: if you didn't, it'd be a different gpu? :p
23:46 vincenttc: haha, fair point :P
23:46 imirkin: anyways ... not sure what the situation is tbh
23:46 imirkin: i think even on same gpu, some sync is needed
23:47 imirkin: i forget if we take the cursor bo as-is, or if we copy it somewhere
23:49 vincenttc: if it's the second case that might explain the problem I'm observing
23:49 HdkR: imirkin: Pascal also gets stupid large max X dimension on textures :P
23:49 vincenttc: but currently no sync support is implemented then?
23:50 imirkin: vincenttc: check the "dispnv50" directory, search for "curs"
23:50 imirkin: [in the kernel]
23:52 vincenttc: imirkin: I'll have a look, if I get stuck I'll ask here again
23:53 imirkin: drivers/gpu/drm/nouveau/dispnv50
23:54 imirkin: hrmph. even i can't find it properly, so unfair to expect you to. hold on.
23:56 vincenttc: ok
23:56 imirkin: urgh. this stuff is so spread out...
23:57 imirkin: vincenttc: btw, which nvidia gpu do you have?
23:58 vincenttc: I tested it on a GT 445M and GTX 1080
23:59 imirkin: right ok. so it looks like we should have a live view of the image
23:59 imirkin: are you doing a glFlush() though?
23:59 imirkin: if you don't flush, then the draw may never even be sent to the gpu