00:08imirkin_: tagr: can you provide a patch to expose hw etc2 / astc on tegra x1? have a look at https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c#n77
00:08imirkin_: all you'd have to do is whitelist the 3d class, and make sure that everything else works (it should)
00:19imirkin_: if it doesn't Just Work (tm) feel free to drop it
01:52imirkin: karolherbst: please confirm that KHR-GL45.direct_state_access.queries_functional is still broken for you, with the patch that's upstream. you said as much this morning, but ... hopefully it's not a huge effort for you to re-check
02:55imirkin: huh. TXQ for 2d array is just busted??
02:55imirkin: that makes zero sense
03:46imirkin: so ... both a 2d array and a 2d array bound as an unlayered 2d image fail with the new imageSize handling. it's binding a non-zero level, perhaps related.
03:47imirkin: however with all the other things, including 3d image and cube array (which gets recast as a 2d array!) -- it works
03:47imirkin: so there's something fishy
03:47imirkin: which i'm sure has nothing to do with any of this stuff
03:50imirkin: if those are the last CTS tests that fail, i'll revert my patches. but i'm pretty sure the issue is not directly related.
03:54imirkin: and i guess KHR-GL45.arrays_of_arrays_gl.SubroutineFunctionCalls1 hits some kind of perf retardo in nvir =/
03:54imirkin: ... and of course glcts segfaults when i try to run perf on it
03:55imirkin: or perf segfaults
04:01imirkin: fk. RA being slow.
04:03imirkin: curious. mostly coalesceValues sucks... in lists?
04:03imirkin: i wonder if i can just change list -> vector
04:04cwabbott: well, coalesceValues is O(n) so if it gets called repeatedly it can wind up being O(n^2)
04:05cwabbott: nir has some fancy datastructures that makes the equivalent O(log n) iirc
04:05cwabbott: or maybe i'm remembering it wrong
04:06cwabbott: but it definitely is the naive approach
04:06imirkin: well, it appears to be spending a lot of time in appending things to lists
04:06imirkin: which are std::list, i.e. double-linked-list
04:06imirkin: i think making defs a vector will work out better. going to try it.
04:07imirkin: not exactly a "smart datastructure", but in practice, len(defs) == 1 for non-weird cases
04:08imirkin: which i realize isn't the case by the time it's hitting coalesceValues()...
04:12imirkin: and of course std::list has a remove, but std::vector does not...
04:12imirkin: (for obvious reasons)
04:14imirkin: but that's only used in 2 places. perhaps i can hack around it somehow... repack in cleanup, and just null out in ValueDef::set...
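
[Editor's sketch] The idea floated above (swap the std::list of defs for a std::vector, null out an entry where it would have been removed, and repack once during cleanup) could look roughly like the following. This is a minimal illustration only: `Def`, `DefList`, `clear` and `repack` are hypothetical names and do not correspond to the real nv50_ir classes or to whatever patch was eventually written.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a value definition; not the real nv50_ir type.
struct Def {
    int id;
};

// Hypothetical container showing "null out now, repack later".
class DefList {
public:
    // Appending to a vector is amortized O(1) and needs no per-node allocation.
    void add(Def *d) {
        assert(d != nullptr);
        defs_.push_back(d);
    }

    // Replacement for std::list::remove: the slot is cleared in place,
    // so no elements are shifted. Returns false if d was not present.
    bool clear(const Def *d) {
        auto it = std::find(defs_.begin(), defs_.end(), d);
        if (it == defs_.end())
            return false;
        *it = nullptr;
        return true;
    }

    // Erase-remove idiom: drops every cleared slot in a single pass.
    void repack() {
        defs_.erase(std::remove(defs_.begin(), defs_.end(),
                                static_cast<Def *>(nullptr)),
                    defs_.end());
    }

    // Number of slots, including cleared ones.
    std::size_t slots() const { return defs_.size(); }

    // Number of entries that have not been cleared.
    std::size_t live() const {
        return defs_.size() -
               static_cast<std::size_t>(std::count(
                   defs_.begin(), defs_.end(), static_cast<Def *>(nullptr)));
    }

private:
    std::vector<Def *> defs_;
};
```

Any code iterating over such a vector between `clear` and `repack` has to skip null entries, which is the cost of deferring the compaction.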
08:23tagr: imirkin: can do, what would be the best way to test those?
08:23imirkin: there are CTS / dEQP tests, if you know how to run those
08:23imirkin: there are piglit tests if you don't
08:23imirkin: but the piglit tests are funky and aren't actually reliable
08:24tagr: hmm... my testing so far has been limited mostly to kmscube
08:25imirkin: it'll need to be a bit fancier than that i'm afraid
08:25imirkin: but not much
08:25imirkin: with gbm it should work ok
08:25tagr: imirkin: dEQP probably doesn't work quite yet because I don't have AOSP running on this device
08:25tagr: with CTS you mean the OpenGL CTS or the Android CTS?
08:25karolherbst: you don't need the AOSP for deqp
08:26tagr: the latter is called VTC now, isn't it?
08:26karolherbst: ohh wait, this is for actual android here, right?
08:26imirkin: ES-CTS, which i think is part of VK-GL-CTS
08:26karolherbst: nvm then
08:26imirkin: dEQP definitely has them too
08:26imirkin: more of a pain to get working with gbm though
08:26imirkin: for that you just want to use piglit
08:27tagr: karolherbst: I don't run any kind of Android on Tegra, this is all just regular Linux
08:27karolherbst: tagr: ahh, then you probably don't need the AOSP
08:27tagr: though I do plan to go back some day and attempt to reproduce what others have done with the open source graphics stack
08:27karolherbst: you can run the deqp tests on a regular linux
08:27tagr: oh, nice
08:28karolherbst: imirkin used it to fix OpenGLES issues I think :p
08:28tagr: the first hit on Google directed me to some Android website, so I just assumed
08:28imirkin: trust me - just grab piglit.
08:28imirkin: don't waste your time with the dEQP / CTS stuff. you won't get it to work.
08:28tagr: I should probably get into the habit of making piglit part of my normal test routine, anyway
08:28karolherbst: the deqp things aren't really reliable, I think it would be better with your own crash handler or something
08:28imirkin: with piglit, once you build it, you can run
08:29tagr: okay, I'll take a look
08:29tagr: I must admit I've never actually run piglit before
08:30karolherbst: piglit is quite fast if you don't chicken and disable parallel tests
08:31imirkin: tagr: PIGLIT_SOURCE_DIR=. PIGLIT_PLATFORM=gbm bin/khr_compressed_astc-basic_gles2 -auto
08:31imirkin: and the -miptree_gles2 one
08:32imirkin: tagr: don't run the whole thing, that'll be disaster.
08:32imirkin: just those handful of tests.
08:32tagr: imirkin: okay, thanks!
08:37karolherbst: pmoreau: mhh the shader_info thing is indeed never filled :( I guess we need to do that in clover as well
08:40pmoreau: Well, I’m not 100% sure about that.
08:41karolherbst: in spirv_to_nir a SpvBuiltInWorkgroupSize is expected
08:41pmoreau: All the info can be found in `struct pipe_grid_info`, so I wonder if spirv_to_nir should not read that one when processing OpenCL stuff.
08:41karolherbst: or SpvExecutionModeLocalSize
08:41karolherbst: uhm... no?
08:41karolherbst: no deps to gallium
08:43pmoreau: Where is that shader_info coming from?
08:44karolherbst: it is filled
08:44karolherbst: there is nir_gather_info, but it just parses the nir and fills in some things
08:44karolherbst: not everything
08:44karolherbst: the other stuff is set somewhere
08:45karolherbst: the glsl linker is setting the local_size things
08:50imirkin: tagr: if you want to run actual conformance tests, then the way to do it that i know would require running Xorg
08:50imirkin: there might be other ways too
08:56karolherbst: pmoreau: it is passed from gl_linked_shader.Program.info into nir_shader_create, but there is more stuff filled in later
08:56karolherbst: and as I said the local_size is filled in at glsl linking time
08:56karolherbst: I don't see how we could use any of this inside clover
09:04karolherbst: pmoreau: how do I get the pipe_grid_info in kernel::launch?
09:04karolherbst: ohh q->pipe...
09:05pmoreau: It is created in kernel::launch
09:05karolherbst: ahh crap
09:05karolherbst: too late
09:06pmoreau: That’s the only place where you receive that information.
09:06karolherbst: guess I have to pass in grid_size
09:07pmoreau: And block_size, and block_offset
09:07karolherbst: or block size?
09:07karolherbst: what is local_size now...
09:07pmoreau: *grid_offset, not block_offset
09:08pmoreau: And reduced_grid_size is how many blocks you launch, per dimension.
09:08karolherbst: okay, but this isn't important for what I try to solve right now
09:21karolherbst: pmoreau: in one test I get OOR_ADDR :(
09:21karolherbst: it is the barrier test
09:22karolherbst: pmoreau: grid dimension is the grid_size?
09:23pmoreau: Depends what grid dimension is
09:23karolherbst: compute header
09:23pmoreau: Is it the number of blocks per dimension, or the number of threads per dimension
09:23karolherbst: grid dimensions = 1x1x1
09:23karolherbst: block dimensions = 1024x1x1
09:23karolherbst: I think I fail to set the grid dimensions to proper values
09:23karolherbst: should be 16k or so
09:24karolherbst: I think
09:24pmoreau: Okay, so the number of blocks. In which case, it’s reduced_grid_size.
09:27tagr: imirkin: basic_gles2 seems to pass, miptree_gles2 segfaults
09:31karolherbst: skeggsb: ping on the backlight patch?
09:31karolherbst: pmoreau: ohh, you have such a laptop right?
09:31karolherbst: main GPU being nvidia + backlight controls?
09:32karolherbst: pmoreau: would be awesome if you could verify there is a backlight regression and that my patch fixes that
09:41karolherbst: pmoreau: ... that clover thing is an overuse of template stuff...
09:43karolherbst: grid is 16x1x1.. makes sense
09:45karolherbst: pmoreau: so grid dimension has to become 16x1x1?
09:58karolherbst: grid dimensions = 16x1x1 :)
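
[Editor's sketch] The exchange above distinguishes the total number of threads per dimension from the number of blocks per dimension (what pmoreau calls reduced_grid_size, and what the compute header reports as grid dimensions). A minimal illustration of that relationship follows. `Dim3` and `blocks_per_dimension` are hypothetical names, not clover's API. The rounding-up division is an illustrative choice, and the 16384-thread figure is an assumption read from the "16k or so" remark.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 3-component size type; not the type clover uses.
using Dim3 = std::array<std::uint32_t, 3>;

// Number of blocks to launch per dimension, given the total number of
// threads (grid_size) and the threads per block (block_size).
// Rounds up so that a partial block still gets launched.
Dim3 blocks_per_dimension(const Dim3 &grid_size, const Dim3 &block_size) {
    Dim3 blocks{};
    for (std::size_t i = 0; i < blocks.size(); ++i) {
        assert(block_size[i] > 0);
        blocks[i] = (grid_size[i] + block_size[i] - 1) / block_size[i];
    }
    return blocks;
}
```

With 16384x1x1 threads and 1024x1x1 threads per block, this gives 16x1x1 blocks, matching the grid dimensions reported at the end of the exchange.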
10:10tagr: imirkin: basic_gles2 seems to pass, miptree_gles2 segfaults
13:54karolherbst: wow, my mmiotrace patch was applied to 4.14, 4.9, 4.4 and 3.18
13:54karolherbst: I didn't even ask for that
14:12mupuf: karolherbst: :)
15:51karolherbst: imirkin: :( I can't just throw BBs away, because that makes the maxwell sched calculator unhappy
17:41juri_: is there a page documenting what GPUs work with OpenCL on nouveau?
17:41imirkin_: juri_: just pick any empty page :)
17:42imirkin_: [i.e. none]
17:43juri_: um. it's all over the internet as working. there's even coriander, to run cuda code against opencl...
17:43imirkin_: feel free to update the internet.
17:44juri_: what happened? :P
17:44imirkin_: it never worked
17:44imirkin_: still doesn't
17:44imirkin_: people were working on it
17:44imirkin_: people still are.
17:44imirkin_: if all you need is to add some integers in opencl, we got you covered.
17:45juri_: nope. need to do a lot of double precision multiplication.
17:45pmoreau: Might work
17:45pmoreau: I haven’t tried it explicitly, but at least some of the OpenCL tests using doubles pass.
17:46imirkin_: juri_: in case it's not clear, this work is not end-user ready
17:46juri_: I can drop back to single if need be.
17:46imirkin_: double vs single isn't the issue
17:46imirkin_: various intrinsics are
17:46juri_: i'm not exactly an end user, so that's fine. :)
17:47imirkin_: not sure what the control flow situation is
17:47imirkin_: esp in light of unstructurized code
17:48pmoreau: I haven’t tried that complicated control flow. Just simple if/else, switch and for loops.
17:49juri_: I'll be compiling to cuda from haskell, so using cuda-- is fine. i just need to test suite the stuff out.
17:49imirkin_: pmoreau: can you put a quick wiki page together to provide developer-type people instructions on the sequence of repositories to get, build, etc
17:49imirkin_: juri_: definitely ain't no cuda...
17:49juri_: that would be a great help. :)
17:50juri_: imirkin_: yeah. i'll be using coriander as a shim.
17:50imirkin_: may i ask if you _really_ need opencl?
17:50imirkin_: e.g. what does opencl have that opengl compute shaders don't?
17:50imirkin_: [that you need]
17:51imirkin_: coz opengl compute shaders work fine on all fermi+ hw.
17:51juri_: it's targetable (in theory) by the meta-par and acceleration languages in haskell. I've got some really big 3d renders (think: raytracing--) that i'd like to throw GPU at, in addition to CPU.
17:51imirkin_: (with some rare exceptions)
17:51imirkin_: right, i get that
17:51imirkin_: but what do opengl compute shaders not do that you need opencl for?
17:51pmoreau: imirkin_: Already have: https://github.com/pierremoreau/mesa/wiki/OpenCL-support-for-Nouveau
17:52imirkin_: i know that opencl has more grown up support for various floating point things than glsl's "welp, it's all just undefined" approach
17:52imirkin_: but does that matter in practice?
17:52juri_: functionally? no idea. code-wise? my haskell code only spits out cuda, and coriander only "converts" to opencl. I'm trying to assemble a toolchain, and make it work, before fixing links in the chain.
17:52imirkin_: oh, i see.
17:53imirkin_: so round hole, square peg
17:53pmoreau: I would never think of writing a BVH builder in GLSL, though that might be possible. Especially with the pointer extension.
17:54juri_: in an ideal world, it'd be nice if my haskell code spat out opencl, to start with.
17:56juri_: heck, the only reason i'm running X11 is as an opencl loader...
17:56juri_: (servers are headless)
18:00imirkin_: juri_: see pmoreau's link above, in case you missed it, for getting his opencl work going
18:01imirkin_: if there are simple things missing, feel free to point them out (or even better, send patches)
18:04juri_: thanks. will do. :)
19:07captainchris: hi everybody
19:07captainchris: I need to reinstall my OS and i want to select nouveau instead of NVIDIA
19:08captainchris: Can is use blender and godot engine with nouveau driver ?
19:08captainchris: Can I use ^ sorry
19:08karolherbst: should work I guess
19:09imirkin_: captainchris: i'm not actively aware of any bugs reported against those + nouveau (for recent versions)
19:09imirkin_: captainchris: what GPU do you have?
19:10captainchris: NVIDIA Corporation GT215 [GeForce GT 240]
19:10imirkin_: is it the DDR3 or DDR5 one?
19:10captainchris: DDR3 i think
19:10imirkin_: cool. then you should even be able to have reclocking on it. (manual)
19:12captainchris: I can install it easy ?
19:15imirkin_: well, a few things to be aware of...
19:15imirkin_: (a) nouveau is nowhere near the level of stability of the nvidia blob driver
19:15imirkin_: (b) the nv50 series of gpu's (of which yours is one) has an unidentified issue that causes things to die at random on occasion.
19:15imirkin_: but in general, it should work
19:16imirkin_: i've been using it on various gpu's for the better part of a decade with few issues
19:16imirkin_: otoh other people run into all kinds of issues. depends what you do.
19:16imirkin_: i do very little :)