13:25vita_cell: guys, maybe it's worth mentioning:
13:34karolherbst: the heck?
13:35karolherbst: sounds like server joining is broken with either driver?
13:37karolherbst: vita_cell: so you are also having issues with the nvidia driver? is that correct?
14:06vita_cell: karolherbst, before they did a big update, I could join servers fine, but rejoining would fail and break the connection (needing a Steam restart). After the big buGdate, joining is completely broken if you use the Nouveau driver. No, I don't have problems with the Nvidia driver; I am using it, and the CSGO server joining problems are gone. I really can't understand that problem
14:07karolherbst: me neither
14:07vita_cell: never thought that could be a GPU driver problem
14:07karolherbst: it shouldn't be
14:07karolherbst: I am sure it is some cheat detection stuff
14:07vita_cell: hmmmm maybe
14:08karolherbst: are you able to join non protected servers?
14:08vita_cell: never tried it
14:08karolherbst: normally there are servers where you are literally allowed to cheat
14:08vita_cell: but still, I never had problems joining official servers
14:09vita_cell: official servers are anticheat protected too
14:10karolherbst: I see
14:10karolherbst: maybe some really nasty out of bounds read/write? ... mhh
14:11karolherbst: I have no idea why the driver should matter here at all
14:11vita_cell: I just have no idea, I just never thought that this problem could be caused by the GPU driver
14:12vita_cell: but I just did "let's try changing the driver, hope this improves something", and now server joining is working
14:13karolherbst: I am still sure it isn't caused by the GPU driver, but something the game does
14:13vita_cell: Valve should not exclude Nouveau users, they assume that everybody uses the Radeon, Nvidia, or Intel drivers
14:13karolherbst: no, it may be something more subtle
14:14orbea:wonders if it would be possible to lie to the game and say you are using nvidia while using nouveau
14:14vita_cell: you are right, it is something in the game, because I could almost always join any server I wanted, until they updated the game with new stuff
14:14karolherbst: I could try out myself here
14:15vita_cell: my hardware: i5-2500k, 8gb ram ddr3, GTX770 4gb (using reclock for nouveau)
14:16vita_cell: also the refresh button breaks the connection (community server list): when you push refresh (after the big update) it will fail to find any server and the connection will break
14:16vita_cell: *when using nouveau driver
14:16orbea: also, did you try again with nouveau after finding nvidia working? Just in case it fixed itself through other means while you switched drivers...
14:16karolherbst: could be some memory corruption though
14:17vita_cell: I just tried before (today), then just installed the Nvidia 390 driver, and everything worked
14:18vita_cell: and yeah, that bug, as I said, has existed for some years, but after their crappy big update, it almost doesn't work with the nouveau driver
14:45karolherbst: vita_cell: so you can't join any community server, right?
15:19vita_cell: karolherbst now, with the nouveau driver, it will fail most of the time
15:20vita_cell: and when it fails, I must restart Steam to try again or to just join an official server
15:20vita_cell: workshop maps always fail when on nouveau
15:29vita_cell: and on any GNU/Linux distro, I cannot join servers with addons (while the server is loading you see something downloading, it gets stuck at 0%), but that is a different problem/bug
15:29vita_cell: *has nothing to do with GPU drivers
15:35karolherbst: I am sure those are all game bugs, not driver bugs
15:39vita_cell: Yeah, but now, csgo nouveau support is much worse
15:40vita_cell: and for sure, they don't give a **** about nouveau users
18:49airlied: win 24
19:49karolherbst: vita_cell: mhh.. I am able to create a community map game with nouveau
20:00vita_cell: karolherbst, hmmm, after their big update, I cannot enter any workshop map when I use the nouveau driver
20:00vita_cell: before the big update, I never failed
20:06karolherbst: yeah... anyway, this really sounds like an internal game bug. It could be... whatever
21:17vita_cell: instead of improving the game's code quality, they waste time messing with chat, GUIs... And then... new bugs welcome
21:20pendingchaos: karolherbst: some initial numbers on a (messy) implementation of that immediates-in-constant-buffers optimization you mentioned that blob does in #dri-devel: https://hastebin.com/ogilaqodel.txt
21:33RSpliet: pendingchaos: any idea on performance implications?
21:33pendingchaos: I haven't run any benchmarks
21:34RSpliet: I noticed the blob isn't scared of re-using consts a million times, so I wouldn't be surprised if perf is hardly affected. Happen to know which GPU this is for btw?
21:34pendingchaos: seems to get the gpr usage under 33 for a bunch of shaders
21:37RSpliet: Yep, GPR under 33 is the dream :-) Best evaluated on Kepler I bet, given the reclocking situation...
21:39pendingchaos: I'm not 100% sure about re-uploading the immediates upon program binding, but it's probably much simpler to implement
21:39pendingchaos: it seems especially silly when a program only has a couple of immediates to upload
21:40RSpliet: Seems redundant to me... immediates aren't patched up right?
21:41pendingchaos: "patched up"? like modifying the program after it's been compiled?
21:41pendingchaos: (fix-ups in nouveau)
21:41pendingchaos: my current implementation doesn't do that
21:42pendingchaos: a statically allocated space is reserved in the driver constant buffer
21:43RSpliet: I thought you could bind up to like 8 constbuffers (on kepler)
21:43RSpliet: Think NVIDIA used one for pointers with OpenCL, and a different one for immediates...? It's a tad hazy, I've only come across it when debugging/optimising OpenCL programs, and found a weird constbuf with a single 64-bit immediate
21:44imirkin: there was an original idea in codegen to have an "immediates" constbuf
21:44imirkin: it was never really implemented, and i may have nuked it completely, since we were out of constbuf space
21:45pendingchaos: up to 16 constant buffers for each stage or something
21:47imirkin: and 14 were necessary for GL
21:47imirkin: +1 for non-ubo uniforms
21:47imirkin: +1 for driver constbuf
21:47imirkin: and we were out :)
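[Editor's note] The slot accounting in the last few messages can be written out as a short sketch. The numbers (16 per stage, 14 for GL, one for non-UBO uniforms, one for the driver constbuf) are the ones quoted in the chat; the dictionary keys and the helper name `free_slots` are invented for this illustration and do not come from the nouveau source.

```python
# Constant buffer slot budget for a graphics stage, using the figures
# quoted in the chat. Key names are labels made up for this sketch.
TOTAL_SLOTS_PER_STAGE = 16  # "up to 16 constant buffers for each stage"

reserved = {
    "gl_ubos": 14,           # "14 were necessary for GL"
    "non_ubo_uniforms": 1,   # "+1 for non-ubo uniforms"
    "driver_constbuf": 1,    # "+1 for driver constbuf"
}


def free_slots(total, reserved):
    """Return how many slots remain after all reserved uses are subtracted."""
    return total - sum(reserved.values())


# Nothing is left over for a dedicated "immediates" constbuf.
print(free_slots(TOTAL_SLOTS_PER_STAGE, reserved))
```

With these figures the remainder is zero, which is the "and we were out" situation described above.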
21:48RSpliet: imirkin: is there a potential need to re-upload non-ubo uniforms / driver consts on each shader execution?
21:49imirkin: you mean each draw?
21:49imirkin: then yes.
21:49RSpliet: Ehh... I'll bluff that that's exactly what I meant
21:50RSpliet: <- terrible at poker. And GL
21:50pendingchaos: imirkin: why wasn't it put in the driver constbuf? because of the problem of uploading the immediates?
21:52RSpliet: pendingchaos: Is this what the blob does? I recall these uploads appear in mmt traces...
21:52pendingchaos: I don't know what the blob does
21:52pendingchaos: just that it apparently would put immediates in a constant buffer so they could be used without a mov instruction
21:55imirkin: pendingchaos: should work ok. just not the way it was done.
22:15karolherbst: pendingchaos: we could stick those in the driver constbuf or something
22:16karolherbst: RSpliet: 16 on kepler, but only 8 for compute
22:16imirkin: yeah, shouldn't be any trouble. obviously limited in quantity of immediates, but that was always the case
22:16karolherbst: on maxwell+ we can just use const buf 17 or 18 for that though
22:17karolherbst: pendingchaos: maybe mind doing just that for maxwell+ for now?
22:17karolherbst: then we don't have to mess up the driver const buf
22:17pendingchaos: there's space for 7764 32-bit immediates if the driver constant buffer is used
22:17pendingchaos: you confirmed that there are 18 constant buffers?
22:17karolherbst: should be enough I guess
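[Editor's note] A minimal sketch of what a fixed-capacity pool of 32-bit immediates could look like, assuming the 7764-entry figure quoted above. The class `ImmediatePool`, the deduplication of repeated values, the densely packed little-endian layout, and the helper `float_bits` are all assumptions made for illustration; the chat does not say how pendingchaos's implementation lays out or deduplicates immediates.

```python
import struct

POOL_CAPACITY = 7764  # 32-bit entries, the figure quoted in the chat


class ImmediatePool:
    """Hypothetical pool mapping 32-bit immediates to byte offsets."""

    def __init__(self, capacity=POOL_CAPACITY):
        self.capacity = capacity
        self.offsets = {}  # 32-bit pattern -> byte offset within the pool

    def add(self, value):
        """Return the byte offset for value, or None if the pool is full."""
        value &= 0xFFFFFFFF
        if value in self.offsets:
            return self.offsets[value]  # reuse the existing entry
        if len(self.offsets) >= self.capacity:
            return None  # caller would fall back to a mov of the immediate
        offset = 4 * len(self.offsets)
        self.offsets[value] = offset
        return offset

    def to_bytes(self):
        """Serialize the pool; dicts keep insertion order, so this is offset order."""
        return b"".join(struct.pack("<I", v) for v in self.offsets)


def float_bits(x):
    """Bit pattern of a 32-bit float, as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


pool = ImmediatePool()
a = pool.add(float_bits(1.0))
b = pool.add(float_bits(1.0))  # same constant, same offset
c = pool.add(42)
print(a, b, c, len(pool.to_bytes()))
```

When `add` returns `None`, a compiler following this scheme would have to keep the immediate in the instruction stream, which matches imirkin's remark that the quantity of immediates is limited.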
22:18karolherbst: pendingchaos: not confirmed, but I know that from a source making it highly believable
22:18karolherbst: switch emulators seem to say there are 18 in total
22:18karolherbst: also the ISA has a 5 bit field
22:20pendingchaos: I think I would prefer putting it in its own constant buffer
22:20karolherbst: yeah, we just can't do it on earlier gens
22:20karolherbst: afaik nvidia reserves 3 cbs for own usages in GL and 2 in vk
22:21karolherbst: on maxwell+
22:53pendingchaos: I think putting it in its own constant buffer for compute might be problematic with its limit of 8 constant buffers
22:53pendingchaos: currently 6 UBOs are available free of emulation using global memory
22:54pendingchaos: putting it in its own would lower that number to 5
22:55karolherbst: pendingchaos: ohh, right :/
22:55karolherbst: pendingchaos: maybe just do it if we have a buffer available?
22:55karolherbst: then applications with 6 ubos are screwed
22:55karolherbst: for compute
22:55karolherbst: and 14 for graphics on kepler
22:56karolherbst: or maybe we just expose 16 ubos on maxwell and always use the last one if available
22:56karolherbst: pendingchaos, imirkin: we always know at compile time how many ubos are used, right?
22:57karolherbst: uhm... 5 ubos on kepler actually
22:57karolherbst: ... no, 6
22:57pendingchaos: that information should be available
22:57pendingchaos: I think also the bindings?
22:58karolherbst: I mean, we have to know which index to use
22:58karolherbst: or from where to load the data
22:58karolherbst: so yeah
22:58karolherbst: so a compilation would produce one shader and one buffer with immediates
22:58karolherbst: and if there is no buffer, that means we either don't have immediates to load or there is no free constant buffer
22:59karolherbst: pendingchaos: maybe even return an index?
22:59karolherbst: or do we pack ubos?
22:59karolherbst: like if 0,2,3 are used, 1 is free/unused?
22:59karolherbst: don't know exactly how that stuff works out in the end
23:01pendingchaos: I think 1 could be used in that situation?
23:01pendingchaos: I think it would be easy to do this approach on pre-Maxwell
23:01karolherbst: yeah.. just wondering if we have to deal with that
23:01karolherbst: pendingchaos: well, that would work on all gens, that's why I am thinking about it
23:01karolherbst: and we could just expose 2 more ubos on maxwell+
23:01karolherbst: for no reason
23:02karolherbst: that would just mean the compiler has to tell us which ubo slot to use for the immediate buffer
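[Editor's note] The "use a UBO slot only if one is free" idea from the end of this exchange can be sketched as below. The function name `pick_immediate_slot` and its parameters are hypothetical, and the sketch assumes slot indices are plain UBO binding indices that are not packed (the chat leaves open whether bindings are packed).

```python
def pick_immediate_slot(used_ubo_slots, num_ubo_slots):
    """Return the lowest unused UBO slot, or None if every slot is taken.

    used_ubo_slots: binding indices the shader uses, known at compile time.
    num_ubo_slots:  how many UBO slots exist for this stage.
    """
    used = set(used_ubo_slots)
    for slot in range(num_ubo_slots):
        if slot not in used:
            return slot
    return None  # no free buffer: keep immediates in the instructions


# The example from the chat: slots 0, 2, 3 in use, so 1 is free.
print(pick_immediate_slot([0, 2, 3], 6))
# A compute shader already using all 6 UBOs gets no immediates buffer.
print(pick_immediate_slot(range(6), 6))
```

Returning `None` corresponds to the case karolherbst describes where no buffer is produced, either because there are no immediates to load or because no constant buffer is free.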