13:25 vita_cell: guys, maybe it's worth mentioning:
13:25 vita_cell: https://github.com/ValveSoftware/steam-for-linux/issues/5833
13:34 karolherbst: the heck?
13:35 karolherbst: sounds like server joining is broken with either driver?
13:37 karolherbst: vita_cell: so you are also having issues with the nvidia driver? is that correct?
14:06 vita_cell: karolherbst, before they did a big update, I could join servers fine, but rejoining would fail and break the connection (needing a Steam restart). After the big buGdate, joining is completely broken if you use the Nouveau driver. No, I don't have problems with the Nvidia driver; I am using it now, and the CSGO server joining problems are gone. I really cannot understand that problem
14:07 karolherbst: me neither
14:07 vita_cell: never thought that could be a GPU driver problem
14:07 karolherbst: it shouldn't be
14:07 karolherbst: I am sure it is some cheat detection stuff
14:07 vita_cell: hmmmm maybe
14:08 karolherbst: are you able to join non protected servers?
14:08 vita_cell: never tried it
14:08 karolherbst: normally there are servers where you are literally allowed to cheat
14:08 vita_cell: but still, I never had problems joining official servers
14:09 vita_cell: official servers are anticheat protected too
14:10 karolherbst: I see
14:10 karolherbst: weird
14:10 karolherbst: maybe some really nasty out of bound read/write? ... mhh
14:11 karolherbst: I have no idea why the driver should matter here at all
14:11 vita_cell: I just have no idea, I just never thought that this problem could be caused by the GPU driver
14:12 vita_cell: but I just thought "let's try changing the driver, hope this improves something", and now server joining is working
14:13 karolherbst: I am still sure it isn't caused by the GPU driver, but something the game does
14:13 vita_cell: Valve should not exclude Nouveau users; they assume that everybody uses Radeon, Nvidia, Intel drivers
14:13 karolherbst: no, it may be something more subtle
14:14 * orbea wonders if it would be possible to lie to the game and say you are using nvidia while using nouveau
14:14 vita_cell: you are right, it is something from the game, because I could almost always join any server I wanted, until they updated the game with new stuff
14:14 karolherbst: I could try out myself here
14:15 vita_cell: my hardware: i5-2500k, 8gb ram ddr3, GTX770 4gb (using reclock for nouveau)
14:16 vita_cell: also the refresh button breaks the connection (community server list): when you push refresh (after the big update), it will fail to find any server and the connection will break
14:16 vita_cell: *when using nouveau driver
14:16 karolherbst: mhhh
14:16 orbea: also, did you try again with nouveau after finding nvidia working? Just in case it fixed itself through other means while you switched drivers...
14:16 karolherbst: could be some memory corruption though
14:17 vita_cell: I just tried before (today), then just installed the Nvidia 390 driver, and everything worked
14:18 vita_cell: and yeah, that bug, as I said, has existed for some years, but after their crappy big update, it almost never works with the nouveau driver
14:45 karolherbst: vita_cell: so you can't join any community server, right?
15:19 vita_cell: karolherbst now, with the nouveau driver, it will fail most of the time
15:20 vita_cell: and when it fails, I must restart Steam to try again, or even just to join an official server
15:20 vita_cell: workshop maps always fail when on nouveau
15:29 vita_cell: and in any GNU/Linux distro, I cannot join servers with addons (while the server is loading you see something downloading, it gets stuck at 0%), but that is a different problem/bug
15:29 vita_cell: *has nothing to do with GPU drivers
15:35 karolherbst: I am sure those are all game bugs, not driver bugs
15:39 vita_cell: Yeah, but now, csgo nouveau support is much worse
15:40 vita_cell: and for sure, they don't give a **** about nouveau users
18:49 airlied: win 24
19:49 karolherbst: vita_cell: mhh.. I am able to create a community map game with nouveau
20:00 vita_cell: karolherbst, hmmm, after their big update, I cannot enter any workshop map when I use the nouveau driver
20:00 vita_cell: before the big update, it never failed
20:06 karolherbst: yeah... anyway, this really sounds like an internal game bug. It could be... whatever
21:17 vita_cell: instead of improving the game's code quality, they waste time messing with chat, GUIs... And then... new bugs welcome
21:20 pendingchaos: karolherbst: some initial numbers on a (messy) implementation of that immediates-in-constant-buffers optimization you mentioned that blob does in #dri-devel: https://hastebin.com/ogilaqodel.txt
21:33 RSpliet: pendingchaos: any idea on performance implications?
21:33 pendingchaos: I haven't run any benchmarks
21:34 RSpliet: I noticed the blob isn't scared of re-using consts a million times, so I wouldn't be surprised if perf is hardly affected. Happen to know which GPU this is for btw?
21:34 pendingchaos: seems to get the gpr usage under 33 for a bunch of shaders
21:34 pendingchaos: Maxwell/Pascal
21:37 RSpliet: Yep, GPR under 33 is the dream :-) Best evaluated on Kepler I bet, given the reclocking situation...
21:39 pendingchaos: I'm not 100% sure about re-uploading the immediates upon program binding, but it's probably much simpler to implement
21:39 pendingchaos: it seems especially silly when a program only has a couple of immediates to upload
21:40 RSpliet: Seems redundant to me... immediates aren't patched up, right?
21:41 pendingchaos: "patched up"? like modifying the program after it's been compiled?
21:41 pendingchaos: (fix-ups in nouveau)
21:41 RSpliet: That
21:41 pendingchaos: my current implementation doesn't do that
21:42 pendingchaos: a statically allocated space is reserved in the driver constant buffer
21:43 RSpliet: I thought you could bind up to like 8 constbuffers (on kepler)
21:43 RSpliet: Think NVIDIA used one for pointers with OpenCL, and a different one for immediates...? It's a tad hazy, I've only come across it when debugging/optimising OpenCL programs, and found a weird constbuf with a single 64-bit immediate
21:44 imirkin: there was an original idea in codegen to have an "immediates" constbuf
21:44 imirkin: it was never really implemented, and i may have nuked it completely, since we were out of constbuf space
21:45 pendingchaos: up to 16 constant buffers for each stage or something
21:47 imirkin: right
21:47 imirkin: and 14 were necessary for GL
21:47 imirkin: +1 for non-ubo uniforms
21:47 imirkin: +1 for driver constbuf
21:47 imirkin: and we were out :)
21:48 RSpliet: imirkin: is there a potential need to re-upload non-ubo uniforms / driver consts on each shader execution?
21:49 imirkin: you mean each draw?
21:49 imirkin: then yes.
21:49 RSpliet: Ehh... I'll bluff that that's exactly what I meant
21:50 RSpliet: <- terrible at poker. And GL
21:50 pendingchaos: imirkin: why wasn't it put in the driver constbuf? because of the problem of uploading the immediates?
21:52 RSpliet: pendingchaos: Is this what the blob does? I recall these uploads appear in mmt traces...
21:52 pendingchaos: I don't know what the blob does
21:52 pendingchaos: just that it apparently would put immediates in a constant buffer so they could be used without a mov instruction
21:55 imirkin: pendingchaos: should work ok. just not the way it was done.
22:15 karolherbst: pendingchaos: we could stick those in the driver constbuf or something
22:16 karolherbst: RSpliet: 16 on kepler, but only 8 for compute
22:16 imirkin: yeah, shouldn't be any trouble. obviously limited in quantity of immediates, but that was always the case
22:16 karolherbst: on maxwell+ we can just use const buf 17 or 18 for that though
22:17 karolherbst: pendingchaos: maybe mind doing just that for maxwell+ for now?
22:17 karolherbst: then we don't have to mess up the driver const buf
22:17 pendingchaos: there's space for 7764 32-bit immediates if the driver constant buffer is used
22:17 pendingchaos: you confirmed that there are 18 constant buffers?
22:17 karolherbst: should be enough I guess
22:18 karolherbst: pendingchaos: not confirmed, but I know that from a source that makes it highly believable
22:18 karolherbst: switch emulators seem to say there are 18 in total
22:18 karolherbst: also the ISA has a 5 bit field
22:20 pendingchaos: I think I would prefer putting it in its own constant buffer
22:20 karolherbst: yeah, we just can't do it on earlier gens
22:20 karolherbst: afaik nvidia reserves 3 cbs for own usages in GL and 2 in vk
22:21 karolherbst: on maxwell+
22:53 pendingchaos: I think putting it in its own constant buffer for compute might be problematic with its limit of 8 constant buffers
22:53 pendingchaos: currently 6 UBOs are available free of emulation using global memory
22:54 pendingchaos: putting it in its own would lower that number to 5
22:55 karolherbst: pendingchaos: ohh, right :/
22:55 karolherbst: pendingchaos: maybe just do it if we have a buffer available?
22:55 karolherbst: then applications with 6 ubos are screwed
22:55 karolherbst: for compute
22:55 karolherbst: and 14 for graphics on kepler
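The slot arithmetic in this exchange can be sketched in Python. The totals (16 constant buffers per graphics stage, 8 for compute on Kepler) and the two reserved slots (non-UBO uniforms and the driver constbuf) come from the log. Applying the same two reservations to compute is an assumption inferred from the 6 UBOs pendingchaos mentions, and the function name `ubo_slots` is hypothetical.

```python
# Slots the driver keeps for itself: non-UBO uniforms + driver constbuf.
# Assumption: the same two reservations apply to compute as to graphics.
RESERVED = 2


def ubo_slots(total_slots, dedicated_immediates=False):
    """Return how many hardware-backed UBO slots remain for the application."""
    used = RESERVED + (1 if dedicated_immediates else 0)
    return total_slots - used


for name, total in (("graphics", 16), ("compute", 8)):
    # Print the UBO count without and with a dedicated immediates buffer.
    print(name, ubo_slots(total), ubo_slots(total, dedicated_immediates=True))
```

Under these assumptions a dedicated immediates buffer costs exactly one UBO slot, which is why an application that already uses 6 UBOs in compute (or 14 in graphics) would lose out.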
22:56 karolherbst: or maybe we just expose 16 ubos on maxwell and always use the last one if available
22:56 karolherbst: pendingchaos, imirkin: we always know at compile time how many ubos are used, right?
22:57 karolherbst: uhm... 5 ubos on kepler actually
22:57 karolherbst: ... no, 6
22:57 pendingchaos: that information should be available
22:57 pendingchaos: I think also the bindings?
22:58 karolherbst: yeah...
22:58 karolherbst: I mean, we have to know which index to use
22:58 karolherbst: or from where to load the data
22:58 karolherbst: so yeah
22:58 pendingchaos: yeah
22:58 karolherbst: so a compilation would produce one shader and one buffer with immediates
22:58 karolherbst: and if there is no buffer, that means we either don't have immediates to load or there is no free constant buffer
22:59 karolherbst: pendingchaos: maybe even return an index?
22:59 karolherbst: or do we pack ubos?
22:59 karolherbst: like if 0,2,3 are used, 1 is free/unused?
22:59 karolherbst: don't know exactly how that stuff works out in the end
23:01 pendingchaos: I think 1 could be used in that situation?
23:01 pendingchaos: I think it would be easy to do this approach on pre-Maxwell
23:01 karolherbst: yeah.. just wondering if we have to deal with that
23:01 karolherbst: pendingchaos: well, that would work on all gens, that's why I am thinking about it
23:01 karolherbst: and we could just expose 2 more ubos on maxwell+
23:01 karolherbst: for no reason
23:01 karolherbst: but
23:02 karolherbst: that would just mean the compiler has to tell us which ubo slot to use for the immediate buffer
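A rough Python sketch of the idea discussed above, where the compiler reports which UBO slot the immediates buffer should use. The function name, its parameters, and the "lowest free index" policy are all hypothetical; the log only establishes that the set of used UBO bindings is known at compile time and that no buffer is produced when there are no immediates or no free slot.

```python
def pick_immediates_slot(used_ubo_slots, num_slots, num_immediates):
    """Return a free UBO slot index for the immediates buffer, or None.

    None means either the shader has no immediates to load or every
    slot is already taken by an application UBO.
    """
    if num_immediates == 0:
        return None
    used = set(used_ubo_slots)
    # Hypothetical policy: take the lowest-numbered slot that is unused.
    for slot in range(num_slots):
        if slot not in used:
            return slot
    return None


# The example from the log: bindings 0, 2 and 3 are used, so 1 is free.
print(pick_immediates_slot({0, 2, 3}, num_slots=6, num_immediates=4))
```

When the function returns None, the caller would fall back to ordinary immediates in the instruction stream, so the approach degrades gracefully on every generation.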