00:36karolherbst: imirkin: I am not proud of my solution... but it works...
00:36HdkR: What is your shame? :)
00:37karolherbst: imirkin: "Copy with imageLoad and imageStore failed for texture type: Cube map array. Source and destination textures are different" ... okay, so more issues. nice
00:38karolherbst: this is just a brutal hack: https://github.com/karolherbst/mesa/commit/22930a364525f1433dda2b920eea16d287321d88
00:39karolherbst: I guess I broke cubes with that
00:39karolherbst: ahh, okay, that's the issue
00:41karolherbst: so now the piglit and the CTS test are actually passing
00:41karolherbst: imirkin: what do you say? ... I think we may really want a 2D based solution for that
00:41karolherbst: this way the shader gets a bit enormous
01:34imirkin: Subv: one additional fun little tidbit is that shader access to memory has 16MB windows which are mapped to local and shared memory. the location of the window is configurable, but it's not possible to disable them.
01:35imirkin: this makes it so that any access to that window would actually go to shared/local memory
01:35imirkin: instead of out to the VM
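A minimal sketch of the windowing behaviour imirkin describes, assuming each window is a plain 16MB address range. The base addresses and the function name are hypothetical, chosen only for illustration; the real window locations are configured by the driver.

```python
# Toy model of the 16MB local/shared windows described above.
# The bases passed in are hypothetical; this is not driver code.
WINDOW_SIZE = 16 * 1024 * 1024  # 16MB per window, as stated in the chat


def classify_access(addr, local_base, shared_base):
    """Return which memory a generic load at `addr` would end up hitting."""
    if local_base <= addr < local_base + WINDOW_SIZE:
        return "local"
    if shared_base <= addr < shared_base + WINDOW_SIZE:
        return "shared"
    # Anything outside both windows goes out to the VM.
    return "global"
```

In this model a buffer that happens to be mapped inside one of the windows can never be reached with a generic load, which is why the window placement matters.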
01:35skeggsb: imirkin: i think "disabling" it involves using LDG instead of LD
01:35imirkin: skeggsb: oh neat
01:35imirkin: that's maxwell+ though
01:36imirkin: and there's no ATOMG
01:36Subv: this just keeps getting better and better
01:36skeggsb: imirkin: there is on volta, nfi about maxwell
01:36imirkin: skeggsb: ah ok
01:37nyef: Hello skeggsb.
01:38skeggsb: imirkin: yeah, not on maxwell, according to cuda-binary-utilities docs
01:46karolherbst: imirkin: do you know how we could do that 3d layer thing with 2d operations? I kind of never got the shader to actually read from the other layers properly (sure I can mess up texture type in the tic and all goes weird, but still)
01:47imirkin: karolherbst: not sure what the question is
01:48karolherbst: imirkin: well, with my hacky version I just put both versions into the shader, a 2D and a 3D one, and select depending on the z value (0 executes the 2d thing, !0 the 3d one)
01:48karolherbst: but that kind of leads to ugly shader code and the lowering isn't that nice as well
01:49karolherbst: so I was wondering if there is a solution where we can do all that with inside the 2D surface operations
01:51karolherbst: well the nvidia driver clearly does it, but fails in the piglit test, which looks kind of as if it should pass actually
02:11imirkin: it also doesn't work, since if it's a 3d texture, and layer == 0 ...
02:11imirkin: well, you could manually de-tile and figure out the proper coordinates and treat it as a 2d array
02:11karolherbst: imirkin: no, that works
02:12imirkin: karolherbst: through luck maybe
02:12imirkin: if there's z tiling, it won't
02:12karolherbst: mhh, the layout is saved in the tic though
02:12imirkin: or at least, is unlikely to
02:12karolherbst: I played around with that a little
02:12imirkin: it is.
02:12karolherbst: and when I write 2D as the layout the entire thing doesn't work
02:12karolherbst: with 2d operations
02:13karolherbst: which is a bit weird
02:13karolherbst: on kepler I know that the tiling was basically messing up everything, even layer 0
02:13karolherbst: but not here ...
02:14karolherbst: imirkin: but if I treat it as a 2d array, then I end up with a 2d array operation? or would it then be lowered to 2D? But I guess using 2d images inside 2d array operations is less painful than 2d images inside 3d operations
02:16karolherbst: okay, then I check out how "easy" it would be using 2d array operations and if that doesn't break the other cases where we use 2d, we could actually do that then
02:23imirkin: karolherbst: tbh i haven't really been following your reasoning. but hopefully you have a solid enough handle on the situation
02:24karolherbst: better than yesterday
02:27imirkin: but worse than tomorrow ;)
02:29karolherbst: but mhh
02:29karolherbst: with my hacky patch, only "KHR-GL45.packed_depth_stencil.blit.depth32f_stencil8" left... in a CTS run
02:29karolherbst: and those tests which sometimes fail... like "KHR-GL45.pipeline_statistics_query_tests_ARB.functional_compute_shader_invocations"
02:30nyef: ... I think that I'm starting to get some idea of how the interface between gallium and a driver works. Maybe.
02:30nyef: Still no idea what I'm looking for, though. /-:
02:31imirkin: what are you looking for?
02:32imirkin: wandering aimlessly in the wasteland that is gallium api?
02:32karolherbst: nyef: what do you want to do, fix those multithreading bugs?
02:32nyef: karolherbst: Yes! Or at least to understand what needs doing.
02:33imirkin: well, have a look at nvc0_resource_validate and nvc0_resource_fence
02:34imirkin: try to gain an understanding of what those fences and whatnot do
02:34imirkin: then look for push_kick - this is a callback that happens at random points in time
02:34imirkin: this is the big problem, as i see it
02:34imirkin: which means ... dump libdrm_nouveau
02:34imirkin: and use an approach which doesn't allow for this level of randomness.
02:37nyef: ... I see no push_kick here. Is it in libdrm_nouveau?
02:37airlied: start with mkdir src/gallium/drivers/newnouveau :-P
02:38imirkin: i might have gotten the name wrong
02:38nyef: airlied: Implies a bit more of a project than I particularly wish to scope at this point, but also implies possibly using a non-LLVM-based compiler, which I would absolutely consider a win.
02:39karolherbst: imirkin: mhh, do I have to do some magic with 2d arrays? putting in the layer as the z coordinate doesn't do anything
02:39nyef: imirkin: pushbuf_kick ?
02:39imirkin: that's a function
02:39airlied: nyef: the compiler isn't llvm now :-P
02:39imirkin: nyef: kick_notify
02:40nyef: ... Okay, it's not llvm. Something about it (the C++-ness, probably) trips a number of alarms for me, basically screaming "don't go here."
02:40imirkin: karolherbst: for a 3d thing? the TIC gets an adjusted base address for 2d arrays. that obviously won't work with 3d.
02:40imirkin: nyef: the compiler is entirely unrelated to this effort.
02:40imirkin: compiler just converts some bits into other bits. it doesn't interact with hw.
02:40karolherbst: imirkin: ahh I see
02:41nyef: Ah, there's a PUSH_KICK. What I get for not throwing -i on my grep.
02:41imirkin: yeah, but that just calls pushbuf_kick
02:41imirkin: nyef: kick_notify is the callback
02:41imirkin: that gets called by libdrm_nouveau internals
02:41nyef: Ah, okay.
02:41imirkin: "every so often"
02:41nyef: And the unpredictability there means that you can't schedule context switches within a single DRM connection?
02:42imirkin: means it's VERY difficult to get locking right
02:42imirkin: since the pushbuf_kick has to write stuff into the pushbuf.
02:42imirkin: er, kick_notify
02:42imirkin: and add fences
02:42imirkin: and all kinds of other things
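A toy sketch of why a kick callback complicates locking, assuming a push buffer that flushes itself whenever it fills up. The class and method names here are hypothetical and this is not the libdrm_nouveau API; it only models the "callback fires at a point the caller did not choose" behaviour discussed above.

```python
import threading


class PushBuf:
    """Toy push buffer; names are illustrative, not libdrm_nouveau's."""

    def __init__(self, capacity, kick_notify):
        self.capacity = capacity
        self.kick_notify = kick_notify
        self.words = []
        self.submitted = []
        # Re-entrant, because the callback writes into the buffer while
        # push() already holds the lock.
        self.lock = threading.RLock()

    def push(self, word):
        with self.lock:
            if len(self.words) >= self.capacity:
                self.kick()  # the caller did not ask for this flush
            self.words.append(word)

    def push_reserved(self, word):
        """Write into space assumed to be reserved for the callback."""
        with self.lock:
            self.words.append(word)

    def kick(self):
        with self.lock:
            self.kick_notify(self)  # callback runs mid-sequence
            self.submitted.append(list(self.words))
            self.words.clear()


def emit_fence(pushbuf):
    # Stand-in for the fence emission done from the callback.
    pushbuf.push_reserved("FENCE")


pb = PushBuf(capacity=3, kick_notify=emit_fence)
for word in ["STATE_A", "STATE_B", "DRAW_0", "DRAW_1"]:
    pb.push(word)
```

Here the fence lands between DRAW_0 and DRAW_1 even though the caller never requested a flush, so any lock guarding the command sequence must also tolerate the callback running inside it.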
02:45nyef: Fun and games.
02:48imirkin: and we like to attach work to fence completion
02:48imirkin: and that work can also be in a precarious locking context
02:49nyef: What actually needs locking? Or is this a case of "if we weren't being so bloody clever with this callback thing, nothing WOULD need locking"?
02:52imirkin: well, we want a bunch of GL contexts to share a single hw context
02:52imirkin: so ... you gotta be a little careful
02:52imirkin: since you can't exactly interleave commands randomly
02:52imirkin: state emission relies on knowing what the hw state is
02:52nyef: ... If I end up doing a new backend for this, I'll use the latin "de novo". d-:
02:52imirkin: just don't use 'renouveau' ;)
02:53nyef: Yes, yes. I know that that one is already in use.
02:54nyef: Do we have multiple hw contexts to work with, or is it one hw context to the DRM client connection?
02:54karolherbst: nyef: even then, what do you do if you run out of hw contexts?
02:55imirkin: nyef: nnnnot sure. but you really don't want multiple hw contexts.
02:56imirkin: those things are enormous, and slow to switch
02:56nyef: Okay, so noted.
02:57nyef: Can I test this mess against a running X server "just" by setting LD_PRELOAD before running the test program?
02:58nyef: (And will it at least mostly not screw up the rest of the system?)
02:58imirkin: you just get an fd, you can do whatever you want with it
02:58imirkin: each GL client gets its own channel
02:59imirkin: i think blob may try to be even cleverer and have multiple applications share a hw context via dispatch through the X server or something, but we don't need to dabble in that madness.
02:59nyef: The number of times I've locked my mcp89 solid by dragging things around in xine while trying to set up to watch a video...
03:00imirkin: that's how i got (re)interested in nouveau - dragging mplayer windows around would (seemingly) randomly crash X
03:00nyef: Is that threading, or something else nasty?
03:01imirkin: eventually tracked it down to this friendly fellow... https://cgit.freedesktop.org/nouveau/xf86-video-nouveau/commit/?id=2fa3397e348161a3394e2b456f065921272a056a
03:03nyef: ... I'm probably already running that on my system, aren't I?
03:03imirkin: seems likely.
03:03imirkin: it's from 2013
03:03nyef: So I probably have a different bug. Games and fun.
03:04imirkin: mine was very Xv-specific
03:04imirkin: it would occur when the x coordinate of the xv overlay was less than 0
03:04imirkin: which would happen as i would drag the mplayer window from desktop to desktop in windowmaker
03:04nyef: (Okay, technically not locked solid, the mouse cursor still moved, but it seemed as though something had a keyboard and mouse grab, and had died and left a good-looking corpse.)
03:05imirkin: this would cause bytes to be read out of bounds, and sometimes those bytes were in the next page, thus unmapped, thus SIGSEGV
03:05imirkin: highly deterministic once i figured out the underlying cause
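A minimal sketch of the class of bug described here, assuming a copy of one row of pixels into a destination of fixed width. The function is hypothetical and is not the fix from the linked commit; it only shows the clipping that keeps a negative x coordinate from turning into an out-of-bounds read.

```python
def clip_span(x, width, dst_width):
    """Clip a horizontal span that starts at x (possibly negative).

    Returns (src_offset, dst_x, count): how many source pixels to skip,
    where to start writing, and how many pixels to copy.
    """
    src_offset = 0
    if x < 0:
        # Skip the part hanging off the left edge instead of reading
        # from before the start of the buffer.
        src_offset = -x
        width += x
        x = 0
    # Also clamp against the right edge of the destination.
    count = max(0, min(width, dst_width - x))
    return src_offset, x, count
```

For example, a 20 pixel wide overlay placed at x = -5 on a 100 pixel wide destination copies 15 pixels starting from source pixel 5.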
03:05nyef: Umm... Accompanied by some sort of error (PGRAPH?) in the kernel log. I haven't tried recently, so I don't remember the details.
03:08imirkin: are you using nouveau or modesetting?
03:08imirkin:likes to blame everything on the modesetting driver
03:08nyef: Oh, probably modesetting.
03:08nyef: Right now the system in question isn't even plugged in.
03:09nyef: For various (thunderstorm, borrowing the cables to mess with HDMI input, it's too bloody hot to run the thing anyway) reasons.
03:09imirkin: modesetting uses the GL driver to do rendering
03:09imirkin: in nouveau's case, that's unwise.
03:09nyef: ... because the GL driver is less than completely stable?
03:10nyef: ... because modesetting ends up being multi-threaded or multi-contexted?
03:10imirkin: the former
03:10imirkin: also modesetting often does things in a way that the driver doesn't totally handle properly
03:11orbea: modesetting worked really well with nouveau before xorg 1.20.0 here
03:11imirkin: which leads to rendering corruption and whatnot
03:11imirkin: orbea: kepler?
03:11imirkin: should be mostly ok there
03:11orbea: the nouveau ddx is pretty useless with DRI3 here on the other hand
03:11imirkin: solution: don't use DRI3
03:12orbea: but DRI3 also gives me a noticeable perf boost in things like pcsx2
03:12nyef: Okay, so I have some things to check on my mcp89 once I have the chance, and a pile of digging and figuring out to do with the whole pushbuf thing.
03:14nyef: ... log extract taken, so that I don't lose the information by taking an unexpected shutdown (thank you *ever* so much, almost-dead laptop battery).
03:15imirkin: but hey - at least it's past its warranty!
03:19nyef: Yup. And I can do a goodly bit of maintenance myself, although fixing a blown trackpad without replacing the entire topcase that it's plastic-welded to is beyond me.
03:20nyef: I'd consider trying to get a wireless display module for this thing, except that I have no confidence in my ability to get a receiver for it.
04:58karolherbst: imirkin: uhh, that tiling is super weird...
04:58karolherbst: there are... holes
06:47karolherbst: imirkin: https://github.com/karolherbst/mesa/commit/704ffc766779a3076768a7d9cdf29690cf12528d
06:48karolherbst: still needs some height/width handling and so on though
06:50karolherbst: or... maybe not?
08:25karolherbst: imirkin: okay... I know what nvidia is doing now
08:25karolherbst: they offset the address
08:26karolherbst: and just use a tiling mode where that works
08:51karolherbst: imirkin: https://github.com/karolherbst/mesa/commit/31b7ea55ffdb205923aa584868696e8588344785 .....
08:57karolherbst: I guess we could use an untiled layout for 3d images...
08:57karolherbst: because, that's what nvidia does
09:07HdkR: Ow, no wonder 3D sucks so hard
09:07HdkR: and 2D Array is good stuff? :D
09:10karolherbst: HdkR: you mean it sucks like for real?
09:10karolherbst: like bad perf and everything?
09:11karolherbst: HdkR: well 2d array is an array, so each element is always directly accessible
09:12karolherbst: but 3d is its own primitive
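A small sketch of what "directly accessible" means for a 2D array, assuming every layer is stored as one contiguous block of the same size. The names and the stride value are hypothetical; this mirrors the earlier remark that the TIC gets an adjusted base address for 2d arrays.

```python
def layer_address(base, layer, layer_stride):
    """Start address of one layer in a 2D array texture.

    Assumes layers are laid out back to back, each `layer_stride` bytes.
    """
    return base + layer * layer_stride
```

With tiling along the depth axis, the slices of a 3D image are interleaved inside each block, so a single base-plus-stride computation like this does not locate a slice. That is the difficulty the 3D discussion above keeps running into.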
09:13karolherbst: soo... "KHR-GL45.packed_depth_stencil.blit.depth32f_stencil8" left to fix
09:13karolherbst: like literally the only test left
09:15HdkR: karolherbst: I mean, if it is being used as linear rather than tiled in any way then it makes sense
09:15HdkR: I tested 3D images versus 2D Array like...once and found 3D images to be terrible. Not sure how long ago I tested this though
09:15karolherbst: 3d images and 2d arrays are not fit for the same problems
09:17HdkR: Right, I think I was abusing it a bit :D
09:17karolherbst: volumetric rendering is what they are usually used for
09:18HdkR: Aye. nice use for the volumetric lights and fog
09:21karolherbst: ohh and GenerateMipmap is different on 3d vs 2darray
09:23karolherbst: I really should learn about OpenGL at some point...
09:23karolherbst: or maybe it is too late for that
09:23HdkR: I'd rather play in IR land
09:23karolherbst: I mean, it would kind of help me writing tests...
09:28HdkR: Tests, don't remember me. I need to write a few dozen tomorrow :P
09:28HdkR: don't remind me*
09:42crmlt: Anyone able to fix this? Can I somehow help? Nouveau is broken on mcp89 from linux kernel 4.15 https://bugs.freedesktop.org/show_bug.cgi?id=106512
09:52karolherbst: crmlt: willing to compile a kernel yourself?
09:55karolherbst: crmlt: is this a mac mini by any chance?
09:56karolherbst: or macbook or something
09:56karolherbst: anyway, I can check on my machine later if I have the same issue
10:56pendingchaos: shouldn't begin_query/end_query only include commands from the context the begin_query/end_query was called with?
10:56pendingchaos: I'm not sure if nvc0 does that
11:22RSpliet: crmlt: The first question to ask is always whether it's still broken on a 4.17 kernel. No point in chasing bugs that have already been fixed ;-)
11:22RSpliet: ... gone
11:30imirkin: karolherbst: you mean just not tile along the depth axis?
11:35crmlt: RSpliet: it is
11:36crmlt: RSpliet: 4.15 and higher are affected even 4.18
11:36crmlt: latest quite stable is 4.14
11:37crmlt: karolherbst: it is MacBook Mid 2010... and yes these chipsets are in Minis too
11:38crmlt: karolherbst: I could do it..
12:59kubast2: I just got llvm to pass their test for regressions compiled into packages installed ,but src/gallium/targets/opencl/meson.build:36:0: ERROR: C++ library 'clangCodeGen' not found uhm how do I get this library?
13:06kubast2: well gonna try with the repos llvm instead of the one that mesa-git requests
13:06kubast2: by arch at least
13:07kubast2: yeh now I played myself
13:07kubast2: can't install repos llvm because I installed the svn libs can't uninstall those because half of the system including repos lib32-mesa etc. depends on it
13:26nyef: G'morning all.
13:27nyef: Ah. Someone else with an mcp89? I don't think I've done a kernel update on mine in a while... Or maybe I did, but it wasn't to anything that recent.
13:37nyef: Oh, FFS. Swiped the power strip that my mcp89 was using.
13:37imirkin: gotta watch out for thieves
13:41nyef: ... Gotta stock up on more power strips. Clearly I've used up all of my spares and then some. /-:
13:41imirkin: time to head to microcenter
13:42nyef: Yeah. Might have this sorted by August.
13:43nyef: So, did something major happen in 4.15? Power-management related, maybe?
13:43imirkin: there was a ton of churn around there
13:43imirkin: pm has been a moving target since the dawn of power
13:45nyef: Mmm. Okay, I found power and a video device. Of course, said video device is the HDMI input on my main system, but "mendicants can't be choosicants".
13:47nyef: 4.16.2, WFM.
13:50nyef: ... x11 video driver appears to be nouveau, not modesetting.
13:51nyef: Is there an easy way to wring a version out of this thing, or should I just check to see what portage says it installed?
13:51imirkin: xorg log
13:51imirkin: gets printed when the module is loaded by X, iirc
13:52imirkin: (not later, when it probes)
13:52nyef: "module version = 1.0.15"?
13:52nyef:pops a DVD in.
13:53nyef: I've got a crash to try and reproduce. (-:
13:55nyef: Some graphical errors, then once I dragged the (playing) video area such that the bottom went offscreen, *boom*.
13:56nyef: fifo: DMA_PUSHER - ch 3 [X] get 000001f860 put 0000021e94 ib_get 000002e7 ib_put 0000031c state 8000f228 (err: INVALID_CMD) push 00400040
13:57nyef: nouveau 0000:05:00.0: gr: DATA_ERROR 00000004 [INVALID_VALUE]
13:57nyef: nouveau 0000:05:00.0: gr: 00100000  ch 3 [000fa2a000 X] subc 2 class 502d mthd 08b4 data 00044110
13:59nyef: Then it goes into a bit about TRAP_PROP DST2D_FAULT, a trapped write because VRAM_LIMIT, goes generally nuts, does not recover.
14:00nyef: And it's the X server process, because why wouldn't it be something that I can't just kill to recover the system?
14:02nyef: Nothing obvious in the Xorg log.
14:04imirkin: right, so the DMA_PUSHER thing is pretty much the kiss of death
14:07nyef: Because I think it corresponds to the early graphics corruption, not to the total lockup.
14:07imirkin: yeah, but one is a precursor to the other
14:08nyef: If I had quit the media player at that point and forced a desktop redraw, it'd probably have been fine.
14:08imirkin: perhaps there's a different way we can handle those DMA_PUSHER errors
14:08imirkin: but they're asynchronous
14:08imirkin: an interrupt gets raised when the error condition is detected
14:08nyef: So are X11 protocol errors.
14:09nyef: Going to see if I can reproduce the video corruption thing easily.
14:09imirkin: error recovery is a tough topic.
14:09imirkin: you can ask ben about it
14:09nyef: Easiest way to recover from errors is not to have errors in the first place, right? (-:
14:10nyef: Ugh. Nearly typed my password on the wrong keyboard.
14:10imirkin: as long as it's hunter2, no one will ever know.
14:11imirkin: [and bash.org is down. that's just great.]
14:12nyef: Well, that wasn't it... Time to ssh in and see how this crash differs.
14:13nyef: Five commands only. Caused by dragging a window while video is playing.
14:15nyef: DMA_PUSHER INVALID_CMD, DATA_ERROR INVALID_VALUE again, along with the follow-on line but with different method and data, and two more DMA_PUSHER.
14:20nyef: ... Should I try to reproduce this with Kepler?
14:20nyef: (What am I saying? Of course I should! This is an important use-case.)
14:35nyef: New data point: I can get the same crash merely with xine at idle, no video need be playing.
14:35nyef: Next test: Can I make it happen with glxgears?
14:40nyef: Can't make it happen with glxgears: glxgears just segfaults after opening the window, rather than actually drawing anything.
15:07nyef: ... rebuilding glxgears didn't help, now rebuilding mesa (17.3.9, apparently).
15:12karolherbst: imirkin: no, on all axes. there is this field in the TIC to specify the block size for each axis, and nvidia specifies a 1x1x1 block size. Maybe I haven't really understood the stuff yet. (GM107_TIC2_3_GOBS_PER_BLOCK_WIDTH/HEIGHT/DEPTH)
15:13karolherbst: I don't know if they always do this or only if a 3d image is used in such a way
15:13imirkin: can you check how the image is actually set up for real?
15:13karolherbst: but the TIC shows me this
15:13imirkin: e.g. try rendering to it
15:14imirkin: i dunno - maybe they decided that tiling on 3d textures wasn't worth the hassle :)
15:14imirkin: no one uses them anyways
15:14imirkin: and tons of brain cells go towards dealing with this crap
15:14karolherbst: anyway, for us that seems like a viable solution for now
15:14karolherbst: we can always optimize later if there is the need for that
15:14imirkin: could even just turn off tiling in the z direction
15:15imirkin: that would be enough
15:15karolherbst: mhh, true
15:17karolherbst: anyway, demmt doesn't show the TICs/TSCs as those, just 0x00000000, so I got a bit annoyed and never actually looked for the data...
15:20imirkin: for rendering? it's set in the RT data
15:23karolherbst: imirkin: no, I meant the TIC/TSC entries itself: https://gist.githubusercontent.com/karolherbst/def61a2f0812c3201e4cf8469fd2d67c/raw/4a8d97c2f7c27f68859d91f0288d2e3a820215ae/gistfile1.txt
15:25karolherbst: well this is a TIC one
15:25karolherbst: would be nice if demmt could show a parsed output of those
15:25imirkin: right ... demmt decodes those though
15:25karolherbst: the only parsed ones are full of 0
15:26imirkin: well it doesn't always know when to do it
15:26imirkin: you can check what the logic is
15:26imirkin: but there is a logic ;)
15:35karolherbst: imirkin: yeah, setting the depth tile mode to 0 helps already
15:40nyef: Hrm. STILL no luck with glxgears. /-:
15:41karolherbst: imirkin: https://github.com/karolherbst/mesa/commit/dead8470e5405fb51b882100ff64c3114f530494
15:42karolherbst: I _think_ this should just work on kepler as well, right?
16:03nyef: Hunh. Okay, might have found the magic to get usable backtraces.
16:23nyef: Urgh. Another possible complication with the HDMI-input-and-LVDS case that just occurred to me: There may actually be an integrated GPU available at that point, just to confuse things.
16:27nyef: Does an LVDS panel i2c channel have its own power? That is, will the panel respond to an i2c transaction even if the main panel power is off?
16:45nyef: ... And no wonder my gk104 system is surprisingly stable in X: modesetting driver and llvmpipe.
16:50nyef: glxgears segfault... from st_update_framebuffer_state (nouveau_dri), in st_validate_state (nouveau_dri), in st_Clear (nouveau_dri), in draw (glxgears), in main (glxgears).
16:58nyef: Might be a variant on / instance of https://bugs.freedesktop.org/show_bug.cgi?id=101000 ?
17:04karolherbst: imirkin: so, any ideas on how to fix KHR-GL45.packed_depth_stencil.blit.depth32f_stencil8?
18:21kubast2: so I just do a manual compilation without makepkg
18:49nyef: Hrm. Either the mesa options on https://nouveau.freedesktop.org/wiki/InstallNouveau/ involve a different set of dependencies than the system build, or the system build is doing behind-the-scenes stuff to deal with mako without requiring it as an installed package in the main system.
18:53nyef: Ah. Dev vs. release version thing, probably.
18:53nyef: Okay, that makes a certain amount of sense.
18:55kubast2: nyef, I just gave up on using aur (xf86-video-nouveau-git is incompatible with mesa-git) or a mix of aur+pkgbuilds from archnouveau
18:56kubast2: I do the instructions from freedesktop wiki rn
18:56kubast2: just that I put a small script so I don't have to search through my bash history etc
18:56kubast2: to compile it all :shrug:
18:58kubast2: I will just want to setup kdeconnect on xfce4(perhaps requires startxfce4 restart ,but I can't pair rn)
18:58kubast2: and send notifications once it will require a sudo for mkinitcpio and copying the resulting vmlinuz file
18:59nyef: kubast2: I don't have any issue with portage as far as nouveau and mesa go, but I'm trying to set up a development system which means not using them anyway. My problem is that one of the (python-based) dependencies isn't installed and doesn't want to install.
19:00nyef: It's looking like something changed the defaults for what python versions to use / have around, so there's a weird circular dependency involved in sorting it out for one damned package.
19:14nyef: Must have done an emerge --sync and then not done world updates or something.
19:33Subv: it seems mesa sets the viewport scale to a negative value when the GL clip origin is set to GL_UPPER_LEFT, if i'm not mistaken this should also flip the triangle front face right?
20:06pendingchaos: Subv: setting the clip origin to GL_UPPER_LEFT does flip the front face: https://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/state_tracker/st_atom_rasterizer.c#n80
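A short sketch of why a negative viewport scale flips the front face, assuming winding is decided by the sign of the triangle's signed area in window space. The helper names are hypothetical and this is not the Mesa state tracker code linked above.

```python
def signed_area(tri):
    """Signed area of a 2D triangle; positive means counter-clockwise."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def apply_viewport_scale(tri, scale_y):
    """Apply only the y part of a viewport transform, for illustration."""
    return [(x, y * scale_y) for (x, y) in tri]


ccw_triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
flipped = apply_viewport_scale(ccw_triangle, -1.0)
```

Negating y mirrors the triangle, so its signed area changes sign and a counter-clockwise triangle rasterizes as clockwise. The driver therefore has to swap the front-face setting to keep the application's notion of front and back.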
20:10karolherbst: pendingchaos: looking at your "compute invocation counter" patch, that was basically with what I came up with except the indirect part
20:10karolherbst: I am not sure if we really have to do all the fancy multiplication there
20:10karolherbst: I think the spec is a bit... vague here
20:11karolherbst: the CTS only tests against ++
20:11karolherbst: or well, any value except 0 really
20:13karolherbst: pendingchaos: the thing here is, that doing that multiplication could let the value overflow quite fast in the end, so we might want to read the spec and come up with a sane enough approach
20:13karolherbst: I think only increasing by the block size might be already sufficient
20:13karolherbst: but... dunno
20:16kubast2_: First time compiling. What should I do after I compile and boot the drm_next kernel? As I understand it, besides the kernel, everything installs into the $NVD directory by default when I compile with source nouveau_env.sh and ./configure sh --prefix=$NVD
20:17nyef: Looking at this glxgears segfault thing again, I'm looking at st_update_framebuffer_state(), and suspecting that something isn't initialized properly, probably something back-endish, and I have no idea what.
20:18nyef: kubast2_: There's some instruction about spiking your X11 configuration, and then you probably just startx from a shell that has your nouveau environment in play... And you might need to re-source your nouveau environment from within whatever X shell you end up with as well.
20:18nyef: kubast2_: I haven't gotten that far yet myself, for various reasons.
20:19nyef: (Between "most of the critical things for me up to now have been kernel bugs" and "need to update my Gentoo environment AGAIN"...)
20:21nyef: "USHRT_MAX"? "Undershirt max"? "Microshort max"? "United States Holistic Rapid Transit max"? Clearly, "max" is short for "maxilla" or jawbone...
20:33kubast2_: I see I'm still compiling tbh nyef
20:33kubast2_: Kernel should compile just fine
20:34kubast2_: Not sure about mesa
20:34kubast2_: Libdrm should compile just fine too
20:39pendingchaos: karolherbst: both the piglit and CTS tests allow both behaviours but increasing by just block size seems unlikely and not very useful
20:39pendingchaos: swr also seems to increase by block_size * grid_size
20:41pendingchaos: I don't think overflow is much of a problem? at least with 64-bit integers
20:43karolherbst: 65535^3 * 1024*1024*64 > (1<<64)
20:43karolherbst: actually 1d is 0x7fffffff on kepler and newer
20:43karolherbst: so if you really want to, I guess you can overflow with one compute shader invocation already
20:45karolherbst: in the end it depends on what do you want to know
20:45karolherbst: if you launch a 64x64x4 block, does it make a difference if the counter increases by 16384 or 1?
20:45karolherbst: you basically get the same amount of information, one value just wastes a lot of bits
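The arithmetic in the last few messages can be checked directly. This is only a worked calculation using the numbers quoted in the chat (a 65535 limit per grid axis and the 1024*1024*64 factor); the variable names are made up for illustration.

```python
# Numbers taken from the messages above; names are illustrative only.
MAX_GRID_PER_AXIS = 65535
GRID_MAX = MAX_GRID_PER_AXIS ** 3       # largest 3D grid in this estimate
BLOCK_FACTOR = 1024 * 1024 * 64         # per-block factor quoted in the chat
COUNTER_LIMIT = 1 << 64                 # first value a 64-bit counter cannot hold

worst_case = GRID_MAX * BLOCK_FACTOR
overflows = worst_case > COUNTER_LIMIT  # the claim made in the chat

# The 64x64x4 block from the example: invocations counted per block.
threads_per_block = 64 * 64 * 4

print(worst_case.bit_length(), overflows, threads_per_block)
```

The worst case needs 74 bits, so under these numbers a single dispatch could in principle wrap a 64-bit counter if it is bumped by block_size * grid_size.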
20:46karolherbst: pendingchaos: do you know by what nvidia increases the value? I kind of think they do a ++, but I don't really remember right now
20:49karolherbst: pendingchaos: I think, because there is GL_ARB_compute_variable_group_size it makes sense to at least bump it by the group size, because that's not a fixed value really
20:50pendingchaos: 2^64 invocations is quite a lot though
20:50pendingchaos: at 1 nanosecond per thread, running 2^64 (serially) would take centuries to finish
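A quick check of the "centuries" estimate, assuming exactly one nanosecond per invocation and a 365.25 day year. The names are illustrative only.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed average year length

invocations = 1 << 64
seconds = invocations * 1e-9           # one nanosecond per invocation
years = seconds / SECONDS_PER_YEAR

print(round(years))
```

That works out to a bit under six centuries of serial execution.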
20:50karolherbst: but some GPUs have a _lot_ of threads
20:52karolherbst: but yeah
20:53karolherbst: I guess it won't overflow any time soon
20:53kubast2_: --enable-vdpau flag/something/option not found 🤔 🤔
20:56pendingchaos: I think I'll see what nvidia does
21:02pendingchaos: nvidia seems to increase the counter by block_size * grid_size
21:02airlied: seems the most logical
21:05pendingchaos: I think I'll end up creating a patch that uses a compute shader to increase the counter
21:05pendingchaos: since the macro approach is unreliable and reading from the cpu defeats the purpose of indirect dispatches
21:12kubast2_: I might be doing something wrong, encountered a freeze on drm next, but pretty sure it's probably related to something I have done along the way, nyef
21:13kubast2_: will install openssh to safely reboot the machine next time
21:21karolherbst: pendingchaos: piglit/bin/arb_shader_image_load_store-max-size --quick -auto -fbo fails for "image2DMS max size test/8x16384x8x1"
21:22karolherbst: image2DMSArray max size test/8x16384x8x8 as well in a non quick run
21:22karolherbst: I hope that isn't caused by my 3d image fix, but kind of sounds unlikely
21:22karolherbst: yeah, it shouldn't be that
21:23kubast2_: So I figured I'm running 18.1.3 mesa instead of 18.2 nyef
21:23karolherbst: airlied: for a CTS submission, we only have to run the cts-runner thing with the appropriate gl versions, or do we still need some private tests?
21:25airlied: karolherbst: its messy
21:26airlied: the answer doesnt start from the git tree
21:26airlied: a 4.4 submission is messier than a 4.5
21:26karolherbst: ohh I'll go directly for 4.5 anyway
21:27karolherbst: I am just curious if I can use any of the open source releases or still have to run those non public tests
21:27kubast2_: Yeh forgot to make install everything
21:28airlied: you have to get the adopters cts package, backport a ton of fixes
21:28karolherbst: airlied: ... this sounds like a lot of pain
21:28airlied: karolherbst: yup
21:28karolherbst: why even bother making it public then...
21:28airlied: the 4.6 submission would be easier
21:29karolherbst: I thought I can just pick _any_ release version for a submission
21:29airlied: karolherbst: the cts at spec release is a fixed point in time
21:30airlied: so to conform to that release everyone has to pass the exact same set of unchanging tests
21:30airlied: so as to not disadvantage any future submission with new hoops to jump through
21:30karolherbst: airlied: I am quite sure that when I read the documentation it stated that it doesnt matter which release you pick, just pick one
21:31airlied: maybe they relaxed it
21:31karolherbst: otherwise I just pick that version and diff against master....
21:31airlied: you have to fetch the closed tests as well
21:34karolherbst: airlied: ohh, it seems like we don't need those private tests anymore
21:34karolherbst: "For OpenGL CTS releases, and OpenGL ES CTS releases prior to opengl-es-cts-126.96.36.199 download Khronos Confidential Conformance Test Suite by running"
21:35kubast2_: nyef: added it in .xinitrc with the nouveau-env.sh environment variables. I guess I would have had to install it all to rootfs; it's my test unit anyway
21:36karolherbst: airlied: I think I just go by the newest documentation and if somebody complains, then I can just point to that... hopefully that works?
21:36airlied: worth a try!
21:37airlied: karolherbst: do you have x.org adopter access?
21:39karolherbst: anyway, there are still a handful of issues, but we are quite close (randomly between 1-3 fails, some tests just fail randomly, and I hope that won't be too hard to fix those issues :( )
21:41kubast2_: Yeah I have done something wrong somewhere
21:44kubast2_: I see it now
21:45kubast2_: It's already late tho I will have to check tomorrow
21:45airlied: karolherbst: i'll do some digging inside khronos later and see if i can find anything
21:47karolherbst: airlied: oh well, I could do the same, but I was hoping you would know that right away
21:47airlied: yeah i only remember the old horrible ways
21:47airlied: hopefully the new ways are in place to use releases
21:53karolherbst: airlied: well it is less work for anybody checking the result, because in best case you only have patches already applied on master, so you can kind of skip reviewing all those patches
21:53pendingchaos: karolherbst: probably because a texture can only be at most 32768 pixels in a dimension
21:53pendingchaos: a MS8 image with a width of 16384 would be interpreted as an image with a width of 65536
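A sketch of the size computation behind this, assuming multisampled images are stored as a larger single-sampled surface with the samples spread out in x and y. The sample layout table is an assumption inferred from the 16384 to 65536 figure above (8 samples as 4 wide by 2 tall), and the 32768 limit is the one quoted in the chat; neither is taken from driver code.

```python
# Assumed (samples_x, samples_y) per sample count; the 8x entry is
# inferred from "width 16384 is interpreted as width 65536".
ASSUMED_SAMPLE_LAYOUT = {1: (1, 1), 2: (2, 1), 4: (2, 2), 8: (4, 2)}
TEXTURE_DIM_LIMIT = 32768  # per-dimension limit mentioned above


def storage_size(width, height, samples):
    """Size of the underlying surface for a multisampled image."""
    sx, sy = ASSUMED_SAMPLE_LAYOUT[samples]
    return width * sx, height * sy


def fits(width, height, samples):
    """True if the underlying surface stays within the assumed limit."""
    w, h = storage_size(width, height, samples)
    return w <= TEXTURE_DIM_LIMIT and h <= TEXTURE_DIM_LIMIT
```

Under these assumptions the failing 8x16384x8 case maps to a 65536 pixel wide surface, which exceeds the limit, while an 8192 pixel wide MS8 image would just fit.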
21:53karolherbst: pendingchaos: yeah, something like that
21:53karolherbst: pendingchaos: wondering if we should just limit the max size for MS images?
21:54pendingchaos: I think nvidia limits texture size to 16384x16384
21:54pendingchaos: I don't think there is a separate cap for MS images vs non-MS images
21:55karolherbst: yeah, probably there is none
21:56pendingchaos: seems with the blob, Maxwell is 16384x16384 and Pascal is 32768x32768
21:57pendingchaos: I think I might see what they do with the piglit test tomorrow
21:57pendingchaos: perhaps you can set the width in the TIC to 65536 somehow
21:58karolherbst: pendingchaos: well I guess your approach wouldn't really work for textures that big then :(
21:58karolherbst: let me check the tic
21:58pendingchaos: it should be the same approach nvidia uses
21:58karolherbst: height/width go up to 65535
21:59karolherbst: depth up to 16383
21:59karolherbst: pendingchaos: ohh, I see
21:59pendingchaos: you looked at what the blob does or is it in envytools?
21:59karolherbst: mesa header file
21:59karolherbst: all values +1
21:59karolherbst: so maybe that works?
22:00karolherbst: if you want you can look into that tomorrow then
22:01pendingchaos: ohh the TIC allows up to 65536
22:02pendingchaos: I wonder why the size is limited to 32768 in nouveau then
22:04pendingchaos: actually I think it's limited to 16384 in nouveau
22:14pendingchaos: perhaps because MAX_TEXTURE_LEVELS is 15 in config.h
22:21pendingchaos: mesa seems to be made with the assumption that dimensions are never > 16384
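The two limits being compared can be written out as follows. This sketch assumes that a mipmap chain with N levels tops out at a base size of 2^(N-1), and that the TIC width/height fields store size minus one, as the "all values +1" remark suggests. The helper names are hypothetical.

```python
MAX_TEXTURE_LEVELS = 15   # value quoted from config.h in the chat
TIC_FIELD_MAX = 65535     # largest raw width/height value quoted above


def max_dimension(levels):
    """Largest base size whose full mip chain fits in `levels` levels."""
    return 1 << (levels - 1)


def decoded_tic_size(raw):
    """Size described by a raw TIC field, assuming it stores size - 1."""
    return raw + 1
```

So 15 levels correspond to 16384, while the TIC fields themselves could describe sizes up to 65536; the tighter cap comes from the API side rather than from the descriptor format.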
22:34HdkR: One thing I've been wondering and haven't checked. Why does CUDA show some stupid high limits on texture resolution, while GL is significantly lower? :)
22:35HdkR: 131072x65536 for 2D images
22:36HdkR: For 3D it stays at 16k x 16k x 16k
22:36kubast2_: I think I will just modify the original mesa PKGBUILDS and use that to build mesa-git
22:37karolherbst: HdkR: maybe there are different limits with compute, or maybe they do that in software, dunno
22:38HdkR: Yea, it's curious, I've just never checked what happens with stupid sized textures there :D