00:04 Lyude: karolherbst: good point, haven't checked (i'm on PTO today)
00:06 imirkin: karolherbst: G200 isn't the most tested, but i wasn't aware of any obvious misrender
00:06 imirkin: there is one class of misrender on nv50, which has to do with weird texture stuff
00:06 imirkin: like textureLod / texture(bias) type things
00:09 Mangix: reading the logs: what is this one driver thing I keep hearing about?
00:09 Mangix: the kernel has various drivers that work on the same hardware
00:23 karolherbst: imirkin: yeah weird.. I am seeing the same thing on most of my nv50
00:23 karolherbst: even on older mesa releases like 21.0
00:24 karolherbst: not sure if gnome-42 just uses some very new feature or not though
01:29 Wally: Is the numbering system for nvif files(ie. clxxxx) the same as the ogk files nvidia provides?
01:30 Wally: specifically the files in drivers/gpu/drm/nouveau/include/nvif in the kernel
01:31 karolherbst: yeah, it's the same as nvidias
01:31 Wally: karolherbst: what files would be used by turing gpus then?
01:33 karolherbst: uhm.. not sure, it's written inside the files somewhere
01:33 karolherbst: but it's easier to just follow mesas code
01:34 Wally: How would I find the corresponding numbers from mesa then?
01:35 Wally: I don't seem to see anything like that in the mesa tree, just references to libdrm
01:35 karolherbst: huh.. I thought we had more stuff in there by now.. well, worst case, just grep for TURING
01:36 Wally: Booting up linux drive
01:36 karolherbst: but most of that stuff is quite useless outside of the kernel. and I doubt nvidia just publishes stuff which is supposed to be used from userspace
01:37 Wally: karolherbst: They did thankfully...just barely
01:37 karolherbst: I suspect it's the stuff we already got
01:37 Wally: ogk/src/common/sdk/nvidia/inc/ctrl/
01:38 Wally: karolherbst: Mainly, except for maybe ampere? If thats not already implemented
01:38 karolherbst: yeah.. I don't think there is much new in there.. let's see
01:39 Wally: would the ctrl0080 stuff work with Turing...I see that mesa only references cl0080 in the tree...
01:40 karolherbst: that's software implemented I think
01:40 karolherbst: all those starting with 00 that is
01:40 Wally: so thats not gpu specific at all...
01:40 karolherbst: exactly
01:41 Wally: Why are hardware specific things implemented at all then?
01:41 karolherbst: well.. some of that stuff is implemented in hardware, some in the driver or probably in firmware
01:41 karolherbst: the hw ones you can't change obviously
01:42 Wally: karolherbst: no performance hit?
01:42 karolherbst: well.. you could also implement GL and VK in hardware if you really wanted to :P
01:42 karolherbst: the question is, is it feasible
01:42 karolherbst: and does it make sense
01:43 karolherbst: and I suspect for most of them it's just not useful to have it in hardware
01:43 Wally: so ~0%
01:43 karolherbst: well... no idea
01:43 karolherbst: there could be a performance hit, question is, does it matter
01:43 Wally: so someone has to test it....
01:44 karolherbst: you have to test hw even more
01:44 karolherbst: anyway, just because something is implemented in hw doesn't make it faster
01:44 karolherbst: it could even make the whole GPU slower instead
01:45 Wally: so why is all of this implemented in the kernel?
01:45 karolherbst: well.. why do we have a kernel at all
01:45 Wally: To bloat our system?
01:46 karolherbst: ehhh, no?
01:46 Wally: to be a hardware interface?
01:46 karolherbst: nope
01:47 Wally: To make daniel-something have a job
01:47 karolherbst: some of the hw features allow you to access memory by physical addresses
01:47 karolherbst: so you don't want userspace to touch any of that
01:47 karolherbst: at least not directly
01:47 Wally: oh
01:48 karolherbst: well... the GPU is DMA capable
02:05 Wally: *sigh*
02:06 Wally: Security bugs, worst bugs
09:50 karolherbst: yeah... something is very broken with nv50
12:09 karolherbst: I slowly think it's some random memory issue on the kernel side... let's see :D
13:55 karolherbst: maybe I'll figure out this stupid nouveau_bo_move_m2mf bug
14:20 karolherbst: yeah okay.. something with fencing on nv50 is just bonkers...
14:22 karolherbst: then let's check if 4.19 works... but that's soo old
14:57 wuniu: Do nvidia's open gpu kernel modules work fine?
14:57 karolherbst: not really
14:57 karolherbst: well.. for some devices it does apparently
14:58 karolherbst: but they also said it's currently only for data center gpus
14:58 karolherbst: and that doesn't mean GPUs inside data centre, but "data center gpus" ;)
15:01 wuniu: fk u nv
15:01 karolherbst: well, it's not like it will always stay that way, it's just the first release
15:02 karolherbst: not sure what they said about _when_ all GPUs turing+ are supported
15:02 karolherbst: wuniu: in theory most of the code for desktop GPUs is out there
15:02 Sarayan: it's very riscv though, so gpus before the transition are SOL
15:02 karolherbst: but some ran into issues with 2060 GPUs e.g.
15:02 wuniu: will they support pre-turing gpus?
15:02 karolherbst: or random other reasons
15:02 karolherbst: wuniu: I highly doubt that
15:02 karolherbst: they pushed a lot of the driver into firmware
15:03 karolherbst: and only turing GPUs have that risc-v core to run it
15:03 karolherbst: so no matter what happens, nouveau needs to be maintained for everything up to volta regardless of what happens with nvidias driver in the future :)
15:14 wuniu: ahh, i got it. thank u.
15:43 karolherbst: ehh
15:43 karolherbst: it smells like more data races actually :(
15:52 karolherbst: let's see what kcsan says
16:52 karolherbst: fun.. it works with kcsan enabled
17:20 Mangix: so this open source driver by nvidia. Does it require Nvidia's OpenGL/Vulkan stuff?
17:22 TimurTabi: It's just a partially open source version of the normal driver.
17:27 Wally: The userspace(ie. gl, vk, etc) is proprietary
17:31 Mangix: TimurTabi: sure. My thinking is along the lines along RADV vs. amdvlk
17:38 airlied: Mangix: you could try writing a userspace, but the kernel module doesn't expose a stable API to userspace, so it would be an ongoing struggle to keep up
17:42 karolherbst: also.. it would probably be easier to simply port mesa over
17:42 karolherbst: mesa already runs on top of nvidias driver on the switch for the homebrew folks
17:42 karolherbst: (it's just all downstream)
17:43 karolherbst: I highly doubt that one would need to do anything besides replacing libdrm with a variant talking to the nvidia driver
17:44 karolherbst: but it's a bit pointless anyway
17:44 karolherbst: time is better spent on fixing nouveau instead
18:06 Wally: Mangix: I've been writing some stuff for libdrm on that...but it would still be better to fix nouveau
18:06 Wally: (and add pm for tu100)
18:06 karolherbst: yeah.. pm is already being worked on though :)
18:06 Wally: yay!
18:10 karolherbst: ehhh.. this nv50 bug :(
18:12 karolherbst: so apparently it's no data race either :(
18:15 karolherbst: mhh but the bug is something like this: hw accelerated ttm copy fails, falls back to sw which is super slow
18:15 karolherbst: slow as in 30s+ slow
18:17 karolherbst: or maybe we don't wait on the hw long enough
18:29 Wally: karolherbst, do you remember the git repo or project name of the guys who run mesa on the nvidia driver?
18:33 imirkin: some switch homebrew project
18:33 imirkin: blanking on the names...
18:33 Wally: I remember it too, just not the name...
18:36 Wally: switch-mesa lol
18:37 Wally: cant find the repo though...
18:56 karolherbst: but that person kind of stopped working on it after creating a new lib with a completely new API to target nv gpus
18:57 Wally: that API isnt that maintained :(
18:57 karolherbst: no shit
18:57 karolherbst: I already mentioned that it's a terrible idea to do that
18:57 Wally: fincs didn't put licenses in some of his files so I can't use a lot of it
18:57 karolherbst: :D
18:57 karolherbst: oh well
18:58 Wally: karolherbst: *But 3mb with nouveau and 800kb with geck3d!*
18:58 karolherbst: apparently saving 0.1% CPU overhead and 1% binary size is more important than working on something useful
18:58 Wally: yeah
18:58 karolherbst: who doesn't love to use a custom API nobody is able to use anywhere else
18:59 karolherbst: Wally: is anything even using that API?
18:59 Wally: Eh
18:59 Wally: Some apps
18:59 Wally: Most arent using any api afaik
19:00 Wally: if they do they are using a mesa port, be it switch-mesa or another port
19:00 karolherbst: "Why isn't this a Vulkan driver?" oh wow
19:02 karolherbst: oh well...
19:06 karolherbst: I think I am mostly disappointed that they showed 0 interest in cooperating or you know, improving nouveau
19:06 Wally: Why is he using nouveau ioctls on something that doesnt support it!
19:06 Wally: they will return NULL!
19:08 karolherbst: they even have downstream patches for codegen they never bothered to even send to us..
19:08 karolherbst: well
19:08 karolherbst: that's not how open source is supposed to work
19:09 Wally: karolherbst: Pretty sure thats just removing gm107 etc.
19:09 karolherbst: mhh, well some changes are also questionable, like disabling bounds checks :)
19:10 Wally: eh, saved 3 bytes of program space though :)
19:11 Wally: Wally is going to copy the structure of the port, but not any of the ports contents
19:19 karolherbst: awesome
19:19 karolherbst: we are simply too slow
19:20 Wally: Ill also upstream some of the fixes
19:20 karolherbst: cool
19:20 karolherbst: ehh but my comment was to something else :)
19:20 karolherbst: anyway
19:20 karolherbst: we need every help we can get
19:20 karolherbst: and I promise to be nice :D
19:21 Wally: yay!
19:51 Wally: hentai: How much did you pay for that account!
20:02 airlied: karolherbst: did it ever work? like even with access to the nvidia ioctls, putting a mesa driver on top is a lot of work
20:06 airlied: like you need to get userspace cmd submits working
20:08 Wally: airlied: I dont think it did
20:08 Wally: It looks realllllllly like its using nouveau ioctls
20:09 Wally: its a port of nouveau afaik that just barely has some hardware acceleration support
20:09 Wally: (from libdrms generic ioctls)
20:14 Wally: " Also, deko3d has native support for many Kepler/Maxwell performance-enhancing hardware features...such as Zcull, the tiled cache, compressed render targets (decompressed prior to presentation), several optimizations in the shader compiler, and more."
20:14 karolherbst: airlied: yeah, it works
20:14 karolherbst: that's what people use to port GL games to switch
20:15 karolherbst: Wally: it has its own version of libdrm
20:15 Wally: Ah
20:16 Wally: karolherbst: Are they using the nouveau drm driver or the ogk/proprietary one?
20:17 karolherbst: they can't use nouveau
20:18 Wally: Ah! looking through their libdrm_nouveau they are quite clever...
20:19 karolherbst: we are not sooo far off from the prop driver I think
20:19 Wally: Ill just merge their libdrm_nouveau with my libdrm_nvidia then
20:49 Wally: Are these not specific to nouveau? https://github.com/devkitPro/libdrm_nouveau/blob/master/include/nouveau_drm.h
21:02 hentai: Wally, It is mine
21:03 Wally: hentai: When did you register it
21:04 hentai: 10 months ago
21:04 hentai: Also, I have porn, nickperv and ahegao
21:04 Wally: ah
21:04 hentai: And the true gem of my collection: porn
21:05 Wally: eh
21:05 Wally: eww
21:05 hentai: Also, Stallman
21:05 Wally: パンツ なにろ
21:05 Wally: (I think thats how you spell it)
21:06 Wally: hentai: Answer it or give me your account
21:07 hentai: Wally, Not for sale
21:07 Wally: パンツ なにろ
21:07 Wally: ?
21:10 graphitemaster: This is like the days of short ICQ ids
21:11 Wally: lel
21:14 Wally: graphitemaster: I dont think he has any :(
21:24 hentai: Wally, Well, 9 digit is good for me
21:24 Wally: 9 digit?
21:24 Wally: thats a weird color, never heard of it before
21:25 graphitemaster: also name squatting isn't cool
21:25 Wally: rgb value?
21:25 Wally: graphitemaster: it is
21:26 graphitemaster: also re previous discussion about deko3d (which is just nvn btw) not having any users, that's not true xd
21:26 graphitemaster: i wish nv would just release nvn for pc already i fuckin' hate vulkan
21:26 graphitemaster: i'd rather write a renderer for each gpu than use design by committee mess
21:27 Wally: we would have to implement a gallium backend...
21:27 graphitemaster: writing a nintendo switch port of my engine was so easy because nvn is so nice to use, it's less lines of code than my gl 4.6 backend
21:27 Wally: graphitemaster: Do you have docs regarding nvn, or maybe even a header file?
21:28 graphitemaster: nah, i'm bound by nda unfortunately
21:28 Wally: derp
21:28 graphitemaster: just become a nintendo developer if you want access
21:28 Wally: lel
21:28 Wally: sign an NDA if you want access
21:29 graphitemaster: deko3d is pretty much a rip off of the api
21:29 graphitemaster: s/dk/nvn/ for the most part
21:29 Wally: imho libdrm is better than vulkan
21:29 Wally: graphitemaster: does that break your nda?
21:29 graphitemaster: nope
21:29 Wally: k
21:33 graphitemaster: anyways vulkan is slowly becoming usable, the dynamic state extensions and dynamic rendering is getting it closer to being something I wouldn't mind using
21:33 graphitemaster: the only missing piece for me is simplifying synchronization and online shader compilation that doesn't suck (tm)
21:44 Wally: graphitemaster: doesnt suck = suckless ;)
22:11 Lyude: the project?
22:24 i509VCB: if vulkan works you can always use something like wgpu for a lot less hell.
22:25 karolherbst: graphitemaster: vulkan on the switch or?
22:26 karolherbst: ohh vulkan in general
22:26 karolherbst: graphitemaster: well, why would online compilation suck in vulkan?
22:26 karolherbst: people usually underestimate how expensive just parsing text is
22:27 karolherbst: for CL the OpenCL to whatever step usually takes like 90% of the time
22:27 karolherbst: I have tests which have speedups of around 2000% by just using CL builtins instead of the header versions
22:27 karolherbst: with a cold cache
22:37 RSpliet: Yes, lexing (dialects of) C is painful
22:38 RSpliet: Think that's mainly because your lexing has to take into account past definitions to work out whether an ID is a typename or a variable name or... so in practice you can't separate lexing from parsing, there's a feedback loop.
22:38 karolherbst: well... yes, but :D
22:39 RSpliet: Guess you could try and do the "working out what an ID is" strictly in parsing, but... yeah that shifts the problem
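The feedback loop RSpliet describes can be sketched in a few lines. This is a toy illustration only, assuming a tiny C-like subset; `tokenize`, `KEYWORDS` and `TOKEN_RE` are hypothetical names made up for this sketch and are not taken from any real compiler. The tokenizer has to remember earlier `typedef` declarations to decide whether an identifier is a type name, which is exactly why `T * x;` cannot be classified in isolation.

```python
import re

# Toy subset: two keywords, identifiers, integers, and a few punctuators.
KEYWORDS = {"typedef", "int"}
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|\d+|[*;=])")


def tokenize(source):
    """Return (kind, text) pairs; identifier kinds depend on earlier typedefs."""
    typedef_names = set()   # names introduced by typedefs seen so far
    in_typedef = False      # True between 'typedef' and the closing ';'
    last_ident = None       # last identifier seen inside the current typedef
    tokens = []
    pos = 0
    source = source.strip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        text = match.group(1)
        pos = match.end()
        if text in KEYWORDS:
            kind = "KEYWORD"
            if text == "typedef":
                in_typedef = True
                last_ident = None
        elif text[0].isalpha() or text[0] == "_":
            # The feedback loop: classification consults declaration history.
            kind = "TYPE_NAME" if text in typedef_names else "IDENTIFIER"
            last_ident = text
        elif text.isdigit():
            kind = "NUMBER"
        else:
            kind = "PUNCT"
            if text == ";" and in_typedef:
                # Parser-level knowledge leaking into the lexer.
                if last_ident is not None:
                    typedef_names.add(last_ident)
                in_typedef = False
        tokens.append((kind, text))
    return tokens
```

With `typedef int T;` in front, the `T` in `T * x;` comes out as a `TYPE_NAME` (a pointer declaration); without it, the same text yields an `IDENTIFIER` (a multiplication).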
22:39 RSpliet: Anyway
22:44 RSpliet: Still surprised that building the initial parse tree is 90% of the time though, in general parsers should be O(n), perhaps some extra non-standard parsing complexity comes from C, but compiler optimisations would easily push O(n^2).
22:44 karolherbst: I am just very disappointed that those devkit pro people just you know, got a full GL stack for free basically, and then instead of improving things, just came up with some niche solution nobody really needs or wants to use
22:44 karolherbst: RSpliet: complexity stuff is theory, and theory alone
22:44 RSpliet: I don't fully agree with that
22:44 karolherbst: practically everything below O(n log n) is the same
22:45 karolherbst: or other things are way more important
22:45 karolherbst: you parse a string, that's expensive
22:45 RSpliet: If the argument is that parsing is expensive not because of the complexity, but because of fixed overheads like reading from SSD, then I feel like there should be a lot you can win from optimising code
22:45 karolherbst: nah
22:45 karolherbst: it's because it's a string
22:45 karolherbst: why are arrays faster than lists for insertions?
22:46 karolherbst: "but but.. O(n) vs O(1) how can O(n) be faster?" well, because it's faster in the real world
22:47 karolherbst: ehh wait.. I think strictly insertions are O(n/2) for lists and.. ehh
22:47 karolherbst: something crappy for arrays
22:47 RSpliet: I don't know what you mean with lists vs. arrays. Linked-lists?
22:47 karolherbst: yeah
22:47 karolherbst: linked lists are terrible
22:48 karolherbst: the only time they are beneficial is, if you iterate and modify a lot
22:48 karolherbst: well.. modifying _while_ iterating
22:48 karolherbst: but besides that, they are really bad for every op
22:49 karolherbst: and it doesn't change with bigger n
22:49 karolherbst: copying array elements for random insertion is less expensive than inserting elements at random positions within lists
22:49 karolherbst: in the "theory" that's not explainable
22:49 karolherbst: it's just something where the real world beats the theory
22:50 RSpliet: Well, not really
22:50 RSpliet: I feel like linked lists are mainly bad performance-wise because the programmer calls malloc for every element they add
22:50 karolherbst: nope
22:50 karolherbst: that's not the problem
22:50 karolherbst: the difference isn't like a few %
22:51 karolherbst: the difference scales exponentially or something
22:51 RSpliet: malloc and free quickly starts dominating the cost
22:51 karolherbst: it's still not the problem
22:51 karolherbst: it's one, but not the biggest one
22:52 karolherbst: the big issue are caches
22:52 RSpliet: but you can easily pre-allocate an array and build a linked-list inside the pre-allocated buffer for example, if you have fixed size constraints to work with. You still get relatively poor cache locality.
22:52 RSpliet: But that's something that prefetchers can solve to some degree
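RSpliet's idea of building a linked list inside a pre-allocated buffer can be sketched as follows. This is a minimal illustration, not code from any project discussed here; `PooledList` and its methods are hypothetical names. Nodes are slots in two fixed-size arrays and links are slot indices, so no per-element allocation happens after construction. Traversal order still follows the links rather than memory order, which is the cache-locality caveat raised above.

```python
class PooledList:
    """Singly linked list whose nodes live in preallocated arrays."""

    NIL = -1  # sentinel index meaning "no node"

    def __init__(self, capacity):
        self.values = [None] * capacity
        # Initially every slot is free; chain them together through 'next'.
        self.next = [i + 1 for i in range(capacity)]
        if capacity:
            self.next[-1] = self.NIL
        self.free_head = 0 if capacity else self.NIL
        self.head = self.NIL

    def _alloc(self, value):
        """Take one slot off the free list; no heap allocation per node."""
        if self.free_head == self.NIL:
            raise MemoryError("pool exhausted")
        slot = self.free_head
        self.free_head = self.next[slot]
        self.values[slot] = value
        return slot

    def push_front(self, value):
        slot = self._alloc(value)
        self.next[slot] = self.head
        self.head = slot
        return slot

    def insert_after(self, slot, value):
        """O(1) insertion once the predecessor slot is known."""
        new = self._alloc(value)
        self.next[new] = self.next[slot]
        self.next[slot] = new
        return new

    def __iter__(self):
        slot = self.head
        while slot != self.NIL:
            yield self.values[slot]
            slot = self.next[slot]
```

Inserting `"c"` and then `"b"` after the same node yields the logical order a, b, c even though the slots are filled in the order a, c, b, which shows how link order and memory order drift apart.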
22:52 karolherbst: prefetchers suck
22:52 karolherbst: well.. mostly
22:52 karolherbst: that's why arrays are so good
22:53 RSpliet: they work better on arrays and on code :-P in the state of the art they're bad at recognising linked lists
22:53 karolherbst: yeah
22:53 karolherbst: but you don't need a big brain to make them work good for arrays :P
22:53 RSpliet: my colleague did some research into that, "programmable prefetchers". But that won't benefit joe average :-P
22:53 RSpliet: sorry, former colleague @ uni
22:53 karolherbst: anyway... string parsing is expensive
22:54 karolherbst: no matter how you look at it
22:54 RSpliet: it shouldn't be 90% expensive
22:54 karolherbst: well, it's more even
22:54 karolherbst: I've done some benchmarks in CL
22:54 karolherbst: I had some tests going down from 2 seconds to 0.02s
22:54 karolherbst: just because I cached CL C to spir-v
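The kind of caching karolherbst mentions can be sketched as a content-addressed lookup: hash the source text plus build options, and only run the expensive front end on a miss. This is a sketch under stated assumptions; `CompileCache` and `fake_compile` are hypothetical names, and `fake_compile` is a stand-in for the real CL C to SPIR-V step, not an actual compiler or any Mesa API.

```python
import hashlib


class CompileCache:
    """In-memory cache keyed by a hash of (options, source)."""

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn
        self._entries = {}
        self.misses = 0  # number of times the expensive step actually ran

    def get(self, source, options=""):
        # Options are part of the key: the same source built with different
        # flags must not share a cached binary.
        key = hashlib.sha256(
            (options + "\0" + source).encode("utf-8")
        ).hexdigest()
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._compile_fn(source, options)
        return self._entries[key]


def fake_compile(source, options):
    # Hypothetical placeholder for the slow parse-and-lower step.
    return ("binary-for:" + source).encode("utf-8")
```

A second `get` with identical source and options returns the stored result without calling the compile function again, which is where the parsing cost disappears.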
22:54 RSpliet: "should" ;-)
22:55 karolherbst: yeah.. well
22:55 karolherbst: the problem is CL here really
22:55 karolherbst: anyway.. lexing/parsing.. things you don't want to do
22:55 RSpliet: I mean, at this point it's gut feeling, if it's 40% of the time I'd nod and say "yeah, well..."
22:56 karolherbst: what's actually expensive as well is, if the shader becomes complex enough, that nir opt passes are changing a lot of things around :(
22:57 RSpliet: yeah that's where O(n^2) passes start biting you :-P
22:57 karolherbst: although I don't think we do have many of them
22:57 karolherbst: but at some point I plan to look into why stuff is expensive
22:57 RSpliet: Yeah be nice to get a bit of profiling data
22:58 karolherbst: I did profile some tests, but the problem is, that it's just not well optimized passes
22:58 karolherbst: or well.. relying on the fact that it gets called again
22:59 karolherbst: we might get away with just rerunning stuff on the inserted things or something
22:59 RSpliet: Oh and I agree lexing/parsing is a PITA. I taught the basics of like recursive descent and LR(1) and stuff in uni. Stands or falls with the language definition
22:59 RSpliet: And
22:59 RSpliet: well
22:59 RSpliet: it's C
22:59 karolherbst: yeah.. C is just terrible
23:00 karolherbst: I suspect parsing GLSL is less of a pita
23:00 RSpliet: From that perspective. Yeah.
23:00 karolherbst: C is terrible from every perspective :D
23:00 RSpliet: Still prefer writing it over like ML
23:00 karolherbst: I think the only good thing is, that it compiles to machine code
23:01 karolherbst: yeah.. sure, there are worse languages
23:01 karolherbst: ehh "formal proof" okay, that actually makes it worse :)
23:01 RSpliet: Yeah, with C you need to think like a computer architect to get good bytecode out of it. But I happen to think that way :-P
23:02 RSpliet: curro would give me flak for preferring C over ML
23:02 karolherbst: I think everybody relying on "formal proofs" is ignoring reality :D
23:02 karolherbst: it's such a red flag, I always run away from projects advertising themselves as "formally proven"
23:03 karolherbst: ML kind of sounds like this kind of language
23:03 karolherbst: "there is a formal proof that a well-typed ML program does not cause runtime type errors" yeah well.. I can say this about any well-typed program from any language :P don't need a proof for that just to feel better
23:06 RSpliet: Mhh, nuance. Type safety is mostly a good thing. I might have my history wrong, but I feel like ML dialects were the academic testbed for developing type safety techniques. I feel like Rust is praised a lot currently because it got type safety right, among other reasons.
23:07 karolherbst: ohh sure
23:07 RSpliet: They also tried to retrofit type safety onto C++ or something like that, with mixed results and endless frustration from C devs :-P
23:07 karolherbst: nothing against getting things right
23:07 karolherbst: it just always feels like that if somebody points out that something is "formally proven" they are either annoying to talk with or think that's in any way relevant for people out there
23:07 RSpliet: (historians, please educate me!)
23:09 RSpliet: Hahahahaha well, to some degree it is important. But ML is also functional ("recursive") rather than imperative, which I just can't cope with. And with me many devs :')
23:09 karolherbst: I mean.. all of that is probably somewhat important, just most know that's a theoretical concept and shut up about it, and those you don't are very unlikeable people :D
23:09 karolherbst: s/you/who/
23:24 karolherbst: ehh
23:24 karolherbst: ttm is sooo annoying :(
23:33 karolherbst: ehh.. maybe our fencing is broken for real