00:08 imirkin_: tagr: can you provide a patch to expose hw etc2 / astc on tegra x1? have a look at https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c#n77
00:08 imirkin_: all you'd have to do is whitelist the 3d class, and make sure that everything else works (it should)
00:19 imirkin_: if it doesn't Just Work (tm) feel free to drop it
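A minimal sketch of what "whitelist the 3d class" could look like, assuming the driver gates hardware ETC2/ASTC support on the screen's 3D class. Every name and value here (`kClassTegraK1`, `kClassTegraX1`, `kClassDesktop`, `hw_etc2_astc_allowed`) is hypothetical and chosen for illustration; the real check lives in the nvc0_screen.c file linked above and may be structured differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical class identifiers, for illustration only; these are NOT the
// real hardware class values used by the driver.
enum : std::uint32_t {
    kClassTegraK1 = 1,
    kClassTegraX1 = 2,
    kClassDesktop = 3,
};

// Returns true when the (hypothetical) 3D class is on the allowlist for
// hardware ETC2/ASTC support.
bool hw_etc2_astc_allowed(std::uint32_t class_3d) {
    switch (class_3d) {
    case kClassTegraK1:  // assumed to be allowed already
    case kClassTegraX1:  // the entry the suggested patch would add
        return true;
    default:             // everything else keeps the formats hidden
        return false;
    }
}
```

With an allowlist like this, exposing the formats on one more chip is a one-line change, which is why the request comes with the caveat that everything else still has to be tested.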
01:52 imirkin: karolherbst: please confirm that KHR-GL45.direct_state_access.queries_functional is still broken for you, with the patch that's upstream. you said as much this morning, but ... hopefully it's not a huge effort for you to re-check
02:55 imirkin: huh. TXQ for 2d array is just busted??
02:55 imirkin: that makes zero sense
03:46 imirkin: so ... both a 2d array and a 2d array bound as an unlayered 2d image fail with the new imageSize handling. it's binding a non-zero level, perhaps related.
03:47 imirkin: however with all the other things, including 3d image and cube array (which gets recast as a 2d array!) -- it works
03:47 imirkin: so there's something fishy
03:47 imirkin: which i'm sure has nothing to do with any of this stuff
03:50 imirkin: if those are the last CTS tests that fail, i'll revert my patches. but i'm pretty sure the issue is not directly related.
03:54 imirkin: and i guess KHR-GL45.arrays_of_arrays_gl.SubroutineFunctionCalls1 hits some kind of perf retardo in nvir =/
03:54 imirkin: ... and of course glcts segfaults when i try to run perf on it
03:55 imirkin: or perf segfaults
04:01 imirkin: fk. RA being slow.
04:03 imirkin: curious. mostly coalesceValues sucks... in lists?
04:03 imirkin: surprising.
04:03 imirkin: i wonder if i can just change list -> vector
04:04 cwabbott: well, coalesceValues is O(n) so if it gets called repeatedly it can wind up being O(n^2)
04:05 cwabbott: nir has some fancy datastructures that makes the equivalent O(log n) iirc
04:05 cwabbott: or maybe i'm remembering it wrong
04:06 cwabbott: but it definitely is the naive approach
04:06 imirkin: well, it appears to be spending a lot of time in appending things to lists
04:06 imirkin: which are std::list, i.e. double-linked-list
04:06 imirkin: i think making defs a vector will work out better. going to try it.
04:07 imirkin: not exactly a "smart datastructure", but in practice, len(defs) == 1 for non-weird cases
04:08 imirkin: which i realize isn't the case by the time it's hitting coalesceValues()...
04:12 imirkin: and of course std::list has a remove, but std::vector does not...
04:12 imirkin: (for obvious reasons)
04:14 imirkin: but that's only used in 2 places. perhaps i can hack around it somehow... repack in cleanup, and just null out in ValueDef::set...
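A minimal sketch of the workaround floated above: keep the defs in a `std::vector`, null out a slot instead of removing it, and repack once during cleanup. `Def` and `DefList` are hypothetical stand-ins, not the nouveau codegen types, and this is not the patch that was actually written.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a value definition.
struct Def { int id; };

struct DefList {
    std::vector<Def*> defs;

    // Appending to a vector is amortized O(1) and needs no per-node allocation.
    void add(Def* d) { defs.push_back(d); }

    // "Remove" by nulling the slot, so the other entries keep their positions.
    void unlink(Def* d) {
        auto it = std::find(defs.begin(), defs.end(), d);
        if (it != defs.end())
            *it = nullptr;
    }

    // Repack once: drop every nulled slot with the erase-remove idiom.
    void cleanup() {
        defs.erase(std::remove(defs.begin(), defs.end(),
                               static_cast<Def*>(nullptr)),
                   defs.end());
    }
};
```

Callers that walk `defs` between `unlink` and `cleanup` would have to skip null entries, which is the cost of deferring the repack.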
08:23 tagr: imirkin: can do, what would be the best way to test those?
08:23 imirkin: there are CTS / dEQP tests, if you know how to run those
08:23 imirkin: there are piglit tests if you don't
08:23 imirkin: but the piglit tests are funky and aren't actually reliable
08:24 tagr: hmm... my testing so far has been limited mostly to kmscube
08:24 tagr: *ducks*
08:25 imirkin: it'll need to be a bit fancier than that i'm afraid
08:25 imirkin: but not much
08:25 imirkin: with gbm it should work ok
08:25 tagr: imirkin: dEQP probably doesn't work quite yet because I don't have AOSP running on this device
08:25 tagr: with CTS you mean the OpenGL CTS or the Android CTS?
08:25 karolherbst: you don't need the AOSP for deqp
08:26 tagr: the latter is called VTC now, isn't it?
08:26 karolherbst: ohh wait, this is for actual android here, right?
08:26 imirkin: ES-CTS, which i think is part of VK-GL-CTS
08:26 karolherbst: nvm then
08:26 imirkin: dEQP definitely has them too
08:26 imirkin: more of a pain to get working with gbm though
08:26 imirkin: for that you just want to use piglit
08:27 tagr: karolherbst: I don't run any kind of Android on Tegra, this is all just regular Linux
08:27 karolherbst: tagr: ahh, then you probably don't need the AOSP
08:27 tagr: though I do plan to go back some day and attempt to reproduce what others have done with the open source graphics stack
08:27 karolherbst: you can run the deqp tests on a regular linux
08:27 tagr: oh, nice
08:28 karolherbst: imirkin used it to fix OpenGLES issues I think :p
08:28 tagr: the first hit on Google directed me to some Android website, so I just assumed
08:28 imirkin: trust me - just grab piglit.
08:28 karolherbst: right
08:28 imirkin: don't waste your time with the dEQP / CTS stuff. you won't get it to work.
08:28 tagr: I should probably get into the habit of making piglit part of my normal test routine, anyway
08:28 karolherbst: the deqp things aren't really reliable, I think it would be better with your own crash handler or something
08:28 imirkin: with piglit, once you build it, you can run
08:29 tagr: okay, I'll take a look
08:29 tagr: I must admit I've never actually run piglit before
08:30 karolherbst: piglit is quite fast if you don't chicken and disable parallel tests
08:30 karolherbst: :p
08:31 imirkin: tagr: PIGLIT_SOURCE_DIR=. PIGLIT_PLATFORM=gbm bin/khr_compressed_astc-basic_gles2 -auto
08:31 imirkin: and the -miptree_gles2 one
08:32 imirkin: tagr: don't run the whole thing, that'll be disaster.
08:32 imirkin: just those handful of tests.
08:32 tagr: imirkin: okay, thanks!
08:37 karolherbst: pmoreau: mhh the shader_info thing is indeed never filled :( I guess we need to do that in clover as well
08:40 pmoreau: Well, I’m not 100% sure about that.
08:40 karolherbst: mhh
08:41 karolherbst: in spirv_to_nir a SpvBuiltInWorkgroupSize is expected
08:41 pmoreau: All the info can be found in `struct pipe_grid_info`, so I wonder if spirv_to_nir should not read that one when processing OpenCL stuff.
08:41 karolherbst: or SpvExecutionModeLocalSize
08:41 karolherbst: uhm... no?
08:41 karolherbst: no deps to gallium
08:42 pmoreau: Ahem
08:43 pmoreau: Where is that shader_info coming from?
08:44 karolherbst: it is filled
08:44 karolherbst: there is nir_gather_info, but it just parses the nir and fills in some things
08:44 karolherbst: not everything
08:44 karolherbst: the other stuff is set somewhere
08:45 karolherbst: the glsl linker is setting the local_size things
08:46 karolherbst: link_cs_input_layout_qualifiers
08:50 imirkin: tagr: if you want to run actual conformance tests, then the way to do it that i know would require running Xorg
08:50 imirkin: there might be other ways too
08:56 karolherbst: pmoreau: it is passed from gl_linked_shader.Program.info into nir_shader_create, but there is more stuff filled in later
08:56 karolherbst: and as I said the local_size is filled in at glsl linking time
08:56 karolherbst: I don't see how we could use any of this inside clover
09:04 karolherbst: pmoreau: how do I get the pipe_grid_info in kernel::launch?
09:04 karolherbst: uhm
09:04 karolherbst: kernel::exec_context::bind
09:04 karolherbst: ohh q->pipe...
09:05 pmoreau: It is created in kernel::launch
09:05 karolherbst: ahh crap
09:05 karolherbst: too late
09:06 pmoreau: That’s the only place where you receive that information.
09:06 karolherbst: guess I have to pass in grid_size
09:07 pmoreau: And block_size, and block_offset
09:07 karolherbst: or block size?
09:07 karolherbst: mhh
09:07 karolherbst: what is local_size now...
09:07 pmoreau: *grid_offset, not block_offset
09:07 pmoreau: block_size
09:07 karolherbst: okay
09:08 pmoreau: And reduced_grid_size is how many blocks you launch, per dimension.
09:08 karolherbst: okay, but this isn't important for what I try to solve right now
09:12 karolherbst: ...
09:21 karolherbst: pmoreau: in one test I get OOR_ADDR :(
09:21 karolherbst: it is the barrier test
09:22 karolherbst: pmoreau: grid dimension is the grid_size?
09:22 karolherbst: ...
09:22 karolherbst: obviously
09:23 pmoreau: Depends what grid dimension is
09:23 karolherbst: compute header
09:23 pmoreau: Is it the number of blocks per dimension, or the number of threads per dimension
09:23 karolherbst: grid dimensions = 1x1x1
09:23 karolherbst: block dimensions = 1024x1x1
09:23 karolherbst: I think I fail to set the grid dimensions to proper values
09:23 karolherbst: should be 16k or so
09:24 karolherbst: I think
09:24 pmoreau: Okay, so the number of blocks. In which case, it’s reduced_grid_size.
09:24 karolherbst: ahh
09:27 tagr: imirkin: basic_gles2 seems to pass, miptree_gles2 segfaults
09:31 karolherbst: skeggsb: ping on the backlight patch?
09:31 karolherbst: pmoreau: ohh, you have such a laptop right?
09:31 karolherbst: main GPU being nvidia + backlight controls?
09:32 karolherbst: pmoreau: would be awesome if you could verify there is a backlight regression and that my patch fixes that
09:41 karolherbst: pmoreau: ... that clover thing is an overuse of template stuff...
09:43 karolherbst: grid is 16x1x1.. makes sense
09:45 karolherbst: pmoreau: so grid dimension has to become 16x1x1?
09:58 karolherbst: grid dimensions = 16x1x1 :)
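The arithmetic behind the 16x1x1 result can be sketched as follows, assuming the compute header's "grid dimensions" field counts blocks per dimension, as pmoreau describes for reduced_grid_size. `Dim3` and `blocks_per_dim` are hypothetical names, and the global size of 16384 threads is an assumption based on the "16k or so" remark; only the 1024x1x1 block size and the 16x1x1 result come from the log.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Dim3 = std::array<std::size_t, 3>;

// Number of blocks launched per dimension: total threads divided by threads
// per block. Assumes every block size is non-zero and evenly divides the
// corresponding grid size.
Dim3 blocks_per_dim(const Dim3& grid_size, const Dim3& block_size) {
    Dim3 blocks{};
    for (std::size_t i = 0; i < 3; ++i) {
        assert(block_size[i] != 0 && grid_size[i] % block_size[i] == 0);
        blocks[i] = grid_size[i] / block_size[i];
    }
    return blocks;
}
```

For example, `blocks_per_dim({16384, 1, 1}, {1024, 1, 1})` yields 16x1x1, whereas leaving the header at 1x1x1 would launch only a single block of 1024 threads.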
10:10 tagr: imirkin: basic_gles2 seems to pass, miptree_gles2 segfaults
13:54 karolherbst: wow, my mmiotrace patch was applied to 4.14, 4.9, 4.4 and 3.18
13:54 karolherbst: I didn't even ask for that
14:12 mupuf: karolherbst: :)
14:34 matadores: hi
15:51 karolherbst: imirkin: :( I can't just throw BBs away, because that makes the maxwell sched calculator unhappy
17:41 juri_: is there a page documenting what GPUs work with OpenCL on nouveau?
17:41 imirkin_: juri_: just pick any empty page :)
17:42 imirkin_: [i.e. none]
17:43 juri_: um. it's all over the internet as working. there's even coriander, to run cuda code against opencl...
17:43 imirkin_: ok
17:43 imirkin_: feel free to update the internet.
17:44 juri_: what happened? :P
17:44 imirkin_: nothing
17:44 imirkin_: it never worked
17:44 imirkin_: still doesn't
17:44 imirkin_: people were working on it
17:44 imirkin_: people still are.
17:44 imirkin_: if all you need is to add some integers in opencl, we got you covered.
17:45 juri_: nope. need to do a lot of double precision multiplication.
17:45 pmoreau: Might work
17:45 pmoreau: I haven’t tried it explicitly, but at least some of the OpenCL tests using doubles pass.
17:46 imirkin_: juri_: in case it's not clear, this work is not end-user ready
17:46 juri_: I can drop back to single if need be.
17:46 imirkin_: double vs single isn't the issue
17:46 imirkin_: various intrinsics are
17:46 juri_: i'm not exactly an end user, so that's fine. :)
17:47 imirkin_: not sure what the control flow situation is
17:47 imirkin_: esp in light of unstructurized code
17:48 pmoreau: I haven’t tried that complicated control flow. Just simple if/else, switch and for loops.
17:49 juri_: I'll be compiling to cuda from haskell, so using cuda-- is fine. i just need to test suite the stuff out.
17:49 imirkin_: pmoreau: can you put a quick wiki page together to provide developer-type people instructions on the sequence of repositories to get, build, etc
17:49 imirkin_: juri_: definitely ain't no cuda...
17:49 juri_: that would be a great help. :)
17:50 juri_: imirkin_: yeah. i'll be using coriander as a shim.
17:50 imirkin_: may i ask if you _really_ need opencl?
17:50 imirkin_: e.g. what does opencl have that opengl compute shaders don't?
17:50 imirkin_: [that you need]
17:51 imirkin_: coz opengl compute shaders work fine on all fermi+ hw.
17:51 juri_: it's targetable (in theory) by the meta-par and acceleration languages in haskell. I've got some really big 3d renders (think: raytracing--) that i'd like to throw GPU at, in addition to CPU.
17:51 imirkin_: (with some rare exceptions)
17:51 imirkin_: right, i get that
17:51 imirkin_: but what do opengl compute shaders not do that you need opencl for?
17:51 pmoreau: imirkin_: Already have: https://github.com/pierremoreau/mesa/wiki/OpenCL-support-for-Nouveau
17:52 imirkin_: i know that opencl has more grown up support for various floating point things than glsl's "welp, it's all just undefined" approach
17:52 imirkin_: but does that matter in practice?
17:52 juri_: functionally? no idea. code-wise? my haskell code only spits out cuda, and coriander only "converts" to opencl. I'm trying to assemble a toolchain, and make it work, before fixing links in the chain.
17:52 imirkin_: oh, i see.
17:53 imirkin_: so round hole, square peg
17:53 juri_: yep.
17:53 pmoreau: I would never think of writing a BVH builder in GLSL, though that might be possible. Especially with the pointer extension.
17:54 juri_: in an ideal world, it'd be nice if my haskell code spat out opencl, to start with.
17:56 juri_: heck, the only reason i'm running X11 is as an opencl loader...
17:56 juri_: (servers are headless)
18:00 imirkin_: juri_: see pmoreau's link above, in case you missed it, for getting his opencl work going
18:01 imirkin_: if there are simple things missing, feel free to point them out (or even better, send patches)
18:04 juri_: thanks. will do. :)
19:07 captainchris: hi everybody
19:07 captainchris: I need to reinstall my OS and I want to select nouveau instead of NVIDIA
19:08 captainchris: Can I use blender and godot engine with the nouveau driver?
19:08 karolherbst: should work I guess
19:09 imirkin_: captainchris: i'm not actively aware of any bugs reported against those + nouveau (for recent versions)
19:09 imirkin_: captainchris: what GPU do you have?
19:10 captainchris: NVIDIA Corporation GT215 [GeForce GT 240]
19:10 imirkin_: is it the DDR3 or DDR5 one?
19:10 captainchris: DDR3 i think
19:10 imirkin_: cool. then you should even be able to have reclocking on it. (manual)
19:12 captainchris: Can I install it easily?
19:15 imirkin_: well, a few things to be aware of...
19:15 imirkin_: (a) nouveau is nowhere near the level of stability of the nvidia blob driver
19:15 imirkin_: (b) the nv50 series of gpu's (of which yours is one) has an unidentified issue that causes things to die at random on occasion.
19:15 imirkin_: but in general, it should work
19:16 imirkin_: i've been using it on various gpu's for the better part of a decade with few issues
19:16 imirkin_: otoh other people run into all kinds of issues. depends what you do.
19:16 imirkin_: i do very little :)