01:28 j2lapoin: So the best 3D card with open-source support is the GTX 780?
01:28 HdkR: Or you can go the AMD route and the options improve. :P
01:29 j2lapoin: For AMD, what would be a good open-source one?
01:30 HdkR: Depends on if you are teh sort of person to care about libre firmware blobs or not I guess
01:30 HdkR: s/teh/the
01:30 j2lapoin: I'm running Parabola, I care
01:31 joepublic: parabola does not support 3d acceleration with amd cards
01:31 HdkR: It's a sad day then
01:31 j2lapoin: Joepublic: you told me to go for nouveau
01:32 joepublic: parabola supports nouveau, yes; radeon/amdgpu, no. And with nouveau you have to have a card that works without nonfree firmware (such as kepler)
01:33 HdkR: So yea, GTX 780ti
01:34 j2lapoin: I must be sure, because the only way I can find one is used, and there is no return.
01:34 HdkR: Sure. they are five year old cards :)
01:35 j2lapoin: Is it silly to use an i5 3.10 to run this card?
01:35 HdkR: Should be fine
01:37 j2lapoin: OK, then I'll probably need a new case and a new power supply, because mine is for low-profile cards only
01:39 j2lapoin: I mean, I should be fine after having spent $200
02:59 j2lapoin: just the whole kit (780ti/pwsupply/case) for 400$ mostly newegg
03:00 j2lapoin: joepublic, i get my kit, i just have few more days before receiving it
16:39 pmoreau: karolherbst: I made some changes to your branch to get things working on my MCP79: there was a tiny bit of plumbing missing for Tesla.
16:50 karolherbst: pmoreau: cool
16:50 karolherbst: did you push it?
16:50 pmoreau: Nope, need to do that
16:50 karolherbst: k
16:50 karolherbst: but otherwise things are working out quite well I suppose?
16:52 pmoreau: I have a remaining issue when trying to run simple examples, memory related as if it was not reading the pointer to write to from the correct location, or getting the wrong data.
16:54 pmoreau: karolherbst: https://hastebin.com/jefujivave.rb
16:54 karolherbst: pmoreau: mhh, with HMM or without it?
16:54 pmoreau: Without HMM
16:55 karolherbst: no idea what GLOBAL_LIMIT_WRITE means
16:55 karolherbst: probably something brutally stupid
16:55 pmoreau: It’s the *memory_accesses/g_arg_w_int32* test from my SPIR-V testing repo, nothing using HMM there.
16:56 karolherbst: mhhh
16:56 karolherbst: why does it read from s[] though?
16:56 karolherbst: I moved over to use a[] instead of direct uniform access, maybe there is something wrong with that?
16:56 karolherbst: pmoreau: mind giving the full debug output?
16:56 pmoreau: Ah right, I tried some modifications there.
16:57 pmoreau: Let me run again with the a[] accesses.
16:57 karolherbst: normally those should get converted to c0[]
16:57 karolherbst: pmoreau: ADDRESS_BITS is 32 with tesla, right?
16:57 pmoreau: It should
16:58 karolherbst: yeah... looks like it
16:58 karolherbst: that " shared: 32768" is odd as well
16:59 karolherbst: but maybe something caused by your modifications
16:59 pmoreau: Yes, 32
17:00 pmoreau: Here is what I get with the a[] accesses: https://hastebin.com/alotewemab.rb
17:00 karolherbst: huh
17:00 pmoreau: Whole dmesg output: https://hastebin.com/qobabicole.coffeescript
17:00 karolherbst: something is terribly wrong
17:01 karolherbst: it shouldn't emit a[]
17:01 karolherbst: pmoreau: NVC0LoweringPass::handleLDST
17:01 pmoreau: Let me push my modifications, but you can find the diff here: https://hastebin.com/ihexobapuq.php
17:02 karolherbst: if (i->src(0).getFile() == FILE_SHADER_INPUT) {
17:02 karolherbst: if (prog->getType() == Program::TYPE_COMPUTE) {
17:02 karolherbst: i->getSrc(0)->reg.file = FILE_MEMORY_CONST;
17:02 karolherbst: i->getSrc(0)->reg.fileIndex = 0;
17:02 karolherbst: }
17:02 pmoreau: `NVC0LoweringPass` definitely not something run on NV50, right? :-)
17:02 karolherbst: nv50 needs this as well
17:02 karolherbst: right
17:02 karolherbst: I don't know how compute shaders are handled on nv50 though
17:02 karolherbst: if those are supposed to work at all
17:03 karolherbst: but yeah, you need that a[] -> c0[] lowering I think
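The a[] -> c0[] lowering discussed above can be sketched outside of Mesa as follows. This is a minimal toy model, not the actual `NVC0LoweringPass` code: `File`, `ProgramType`, `Operand`, `lowerInputToConst` and `exampleUsage` are hypothetical names invented for illustration, and only the file/fileIndex rewrite mirrors the snippet pasted in the log.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the compiler's register files (not Mesa's types).
enum class File { ShaderInput, MemoryConst, MemoryGlobal };
enum class ProgramType { Vertex, Fragment, Compute };

struct Operand {
   File file;
   int fileIndex;    // which buffer inside the file, e.g. the 0 in c0[]
   uint32_t offset;  // byte offset inside that buffer
};

// For compute programs, turn an input access a[offset] into c0[offset].
// Returns true if the operand was rewritten, false if it was left alone.
bool lowerInputToConst(Operand &src, ProgramType type)
{
   if (src.file != File::ShaderInput || type != ProgramType::Compute)
      return false;
   src.file = File::MemoryConst;
   src.fileIndex = 0;
   return true;
}

// Small usage example: a kernel argument at offset 0x10 ends up in c0[0x10].
void exampleUsage()
{
   Operand arg{File::ShaderInput, 0, 0x10};
   bool changed = lowerInputToConst(arg, ProgramType::Compute);
   assert(changed);
   assert(arg.file == File::MemoryConst && arg.fileIndex == 0);
   assert(arg.offset == 0x10);  // the offset itself is untouched
}
```

Note that the sketch leaves non-compute programs and non-input operands unchanged, matching the two nested conditions in the pasted snippet.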
17:04 karolherbst: g[] access on tesla is a bit weirdo as well. I think we can have 32 32 bit windows into VRAM
17:05 karolherbst: uhm
17:05 karolherbst: 16
17:05 karolherbst: g0 - g15
17:06 karolherbst: pmoreau: mind adding support for printing fileIndex for g[] as well for nv50 targets?
17:06 karolherbst: I think we want that as well
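Printing the fileIndex for g[] would make it visible which of the g0 to g15 windows an instruction touches. A rough sketch of such a formatter is below. It assumes a simplified operand description; `File`, `formatOperand` and `exampleUsage` are hypothetical names, and this is not how the nv50 IR printer is actually written.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical register files; only the ones mentioned in the discussion.
enum class File { ShaderInput, MemoryConst, MemoryGlobal };

// Format an operand as e.g. "a[0x10]", "c0[0x10]" or "g3[0x10]".
// The buffer index is printed for c[] and g[], but not for a[].
std::string formatOperand(File file, int fileIndex, unsigned offset)
{
   char buf[32];
   switch (file) {
   case File::MemoryConst:
      std::snprintf(buf, sizeof(buf), "c%d[0x%x]", fileIndex, offset);
      break;
   case File::MemoryGlobal:
      std::snprintf(buf, sizeof(buf), "g%d[0x%x]", fileIndex, offset);
      break;
   default:
      std::snprintf(buf, sizeof(buf), "a[0x%x]", offset);
      break;
   }
   return buf;
}

// Small usage example.
void exampleUsage()
{
   assert(formatOperand(File::MemoryGlobal, 3, 0x10) == "g3[0x10]");
   assert(formatOperand(File::MemoryConst, 0, 0x10) == "c0[0x10]");
}
```

With the index in the output, a garbage fileIndex would show up directly in the debug dump instead of hiding behind a plain g[].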
17:06 pmoreau: Okay, I’ll do that. Let me push my modifications somewhere first.
17:06 karolherbst: I have access to some G86 GPUs or something, but only after I am back from my PTO :p
17:07 karolherbst: anyway, I expect that with that a[] lowering it should fix your issues
17:07 karolherbst: except the value inside fileIndex is garbage
17:10 karolherbst: pmoreau: no idea how we could be able to expose more than 4GB of VRAM though :/ as being only able to address 32 bits makes it kind of hard to have allocations > 4GB
17:10 karolherbst: but I think CL actually supports that, no?
17:10 karolherbst: we could have address_bits being 64 then, where the high bits are the gX index and one allocation can be 4GB top
17:12 HdkR: allocates 16 4GB regions
17:12 pmoreau: I am not sure that allocations greater than 4 GB are allowed for devices where `CL_DEVICE_ADDRESS_BITS` is 32.
17:14 karolherbst: pmoreau: the idea would be to expose 64
17:15 karolherbst: no idea if the gX index can be indirect though
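The 64-bit addressing idea above (high bits select the gX window, low 32 bits are the offset inside it) can be sketched like this. It is only an illustration of the proposed encoding: the bit layout, `kNumWindows`, `WindowAddress`, `packAddress` and `unpackAddress` are assumptions made for the example, not anything nouveau implements, and whether the hardware allows an indirect gX index is left open, as in the discussion.

```cpp
#include <cassert>
#include <cstdint>

// Assumed number of global windows, g0..g15, as mentioned in the discussion.
constexpr unsigned kNumWindows = 16;

struct WindowAddress {
   unsigned window;  // which gX window
   uint32_t offset;  // 32-bit offset inside the window (at most 4 GiB each)
};

// Pack a (window, offset) pair into one 64-bit pointer value.
uint64_t packAddress(unsigned window, uint32_t offset)
{
   assert(window < kNumWindows);
   return (static_cast<uint64_t>(window) << 32) | offset;
}

// Split a 64-bit pointer value back into its window and offset.
WindowAddress unpackAddress(uint64_t address)
{
   WindowAddress result;
   result.window = static_cast<unsigned>(address >> 32);
   result.offset = static_cast<uint32_t>(address & 0xffffffffu);
   assert(result.window < kNumWindows);
   return result;
}
```

Under this layout, 16 windows of 4 GiB each would cover 64 GiB in total, while any single allocation would still be limited to one window.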
17:15 karolherbst: HdkR: try that on a Tesla GPU
17:15 karolherbst: how much VRAM had the biggest Tesla? 2GB? :p
17:15 pmoreau: 2 GB sounds like a lot!
17:16 karolherbst: just makes it pointless to have 16 g[] buffers :/
17:16 HdkR: Nobody needs more than 2GB in their compute tasks anyway
17:16 pmoreau: Let’s see what Wikipedia says. I wouldn’t be surprised if it didn’t even reach 1 GB.
17:16 karolherbst: I think there was something wrong with it though
17:16 karolherbst: can't remember
17:18 HdkR: Looks like the GTX 285 had a 2GB variant
17:18 pmoreau: Yeah
17:18 pmoreau: Crazy times!
17:19 HdkR: Rando 2GB GT 330 OEM card as well
17:25 pmoreau: karolherbst: How come writing to global address `0x00000000` gives an error! :-D https://hastebin.com/awemilaqoc.rb
17:31 j2lapoin: I ordered a 400 W power supply, will it be OK with a GTX 780?
17:37 pmoreau: karolherbst: Pushed everything to https://github.com/pierremoreau/mesa/tree/nouveau_nir_spirv_opencl_hmm_v2
19:23 karolherbst: pmoreau: no clue :p