17:15 butterypancake: *it'd be nice to use their normal kernel, and not the actual normal kernel
17:16 imirkin: yeah, just pointing out options
17:16 imirkin: if you want accel on GM20x+, you need to load firmware from linux-firmware
17:16 imirkin: that firmware is easily decompiled (the ISA is well-known, at least by nouveau folks)
17:16 imirkin: but it can't be changed, since it must be signed
17:17 butterypancake: so there is the nvidia-firmware package (speaking from experience on arch linux) but will that not work with the linux-libre kernel?
17:17 imirkin: not sure what nvidia-firmware might contain
17:18 imirkin: it might contain the video decoding firmware
17:18 butterypancake: video decoding isn't supported on NV126 though...
17:18 imirkin: the nvidia firmware is in linux-firmware
17:18 imirkin: correct
17:18 imirkin: but the package doesn't know that ;)
17:19 butterypancake: so the nvidia-firmware to do video game graphics acceleration is already in the linux kernel?
17:19 imirkin: yeah, the nvidia-firmware package in arch appears to be the bundling of my extract_firmware script, which goes digging for video decoding accel firmware in the blob package
17:20 imirkin: nvidia ships nouveau-friendly firmware as part of linux-firmware
17:20 imirkin: (usually years after release of the GPUs, but that's neither here nor there)
17:20 butterypancake: so I can actually ignore that package, just switch to the normal linux kernel, and enjoy better minecraft?
17:21 imirkin: you'd have to install linux-firmware, but yeah
17:21 imirkin: (or at least the "nvidia" directory of that)
17:21 imirkin: of course you'd still be getting a small fraction of the perf available with your GPU
17:21 imirkin: since the firmware nvidia supplies doesn't enable us to change clocks on those GPUs, and they tend to boot to very low clock speeds
17:22 butterypancake: some youtube video showed my card (GTX 960) being 10x slower on nouveau vs proprietary driver
17:22 imirkin: [although clock-for-clock, we also have worse perf than nvidia blob. but when the clocks are 10% of what they could be, we _really_ don't stand a chance]
17:22 imirkin: yeah, 10x sounds right
17:22 imirkin: i'd believe anything between 10 and 20x
17:23 imirkin: turns out memory speed and core clock speed actually affect performance. who knew.
17:23 imirkin: there are actually some somewhat experimental patches that make reclocking work on GM20x's, but you better have a good cooling solution in place, coz we can't adjust fan speeds ourselves.
17:24 butterypancake: so I currently have two GTX960s in the case with the SLI bridge on them. I get that nouveau wouldn't let me use both, but would having both in the case do anything bad?
17:24 imirkin: nope
17:24 imirkin: we don't support the SLI bridge, so you'd just have 2 GPUs in your system
17:24 butterypancake: I could run two minecrafts!
17:26 butterypancake: I do wish I could just buy modern hardware that would work without firmware...
17:26 karolherbst: butterypancake: if you have only FullHD displays you could use one GPU for desktop and the other via prime offloading
17:26 karolherbst: should come with a small perf impact, but normally you shouldn't really notice
17:26 butterypancake: prime offloading?
17:27 karolherbst: butterypancake: he.. in regards to firmware: modern hardware usually moves towards needing more firmware rather than less
17:27 karolherbst: turns out, it's simpler to just implement things in microcode than to open source stuff
17:27 karolherbst: butterypancake: yeah.. usually on laptops you can use DRI_PRIME env variable to offload work to a different GPU
17:27 karolherbst: same works on desktops
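The DRI_PRIME offloading karolherbst describes can be sketched as follows. This is a minimal illustration, assuming the second GPU is reachable as index 1; the helper names `prime_env` and `run_offloaded` are hypothetical and not part of Mesa.

```python
import os
import subprocess

def prime_env(base_env=None, gpu_index="1"):
    """Return a copy of the environment with DRI_PRIME set (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    env["DRI_PRIME"] = gpu_index
    return env

def run_offloaded(cmd):
    """Run cmd (a list of arguments) with rendering offloaded to the other GPU."""
    return subprocess.run(cmd, env=prime_env(), check=False)

# Example (needs a second GPU and the glxinfo tool installed):
# run_offloaded(["glxinfo", "-B"])
```

From a shell the same thing is usually done by prefixing the command with `DRI_PRIME=1`.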
17:28 butterypancake: karolherbst: not always. modern audio interfaces use a standard usb audio interface so they can work with iPads. There are very specific cases when we move to no firmware
17:28 karolherbst: you could use one for the desktop and the other for gaming to make alt+tab more pleasant or whatever
17:28 imirkin: butterypancake: just call up nvidia and buy a few million units, i'm sure they'd make something happen for you
17:28 karolherbst: butterypancake: well.. for hot-plugged devices you can't use firmware, but that was true even 20 years ago
17:28 karolherbst: (and only printers really required OS drivers)
17:29 butterypancake: karolherbst: I thought udev was for allowing hotplug firmware stuff...
17:29 karolherbst: but requiring stuff running on the devices actually is more and more common, simply also because it becomes really cheap
17:29 karolherbst: butterypancake: still, you require firmware
17:30 karolherbst: but USB devices usually do not as it's already on the devices
17:30 butterypancake: actually, most audio interfaces that don't claim they work with iPad need firmware even though they are USB
17:30 butterypancake: *protip if you ever get into audio :P
17:30 karolherbst: audio is a completely different world of pain anyway :p
17:31 karolherbst: but I doubt it's about firmware
17:31 karolherbst: just the OS not supporting all devices
17:31 karolherbst: you do need drivers, but that's something entirely different to requiring firmware
17:32 butterypancake: oh you're right. Damn. Everything is so complicated :P
17:32 karolherbst: anyway, each device which supports updates has embedded firmware and these days all devices can be updated :p
17:32 butterypancake: wait, what is the difference between firmware and drivers. Now that I think about it I don't actually know
17:33 karolherbst: drivers are modules to the kernel running on the CPU
17:33 karolherbst: firmware is usually a blob running on the device in some form
17:33 butterypancake: so in nouveau, the first thing it does is give the card the firmware? The card doesn't store the firmware persistently?
17:33 karolherbst: (and there are things in between like the vbios which contains CPU code for BIOS/UEFI being able to initialize the GPU, scripts or simply tables for the driver to read out)
17:34 karolherbst: for nvidia GPUs there are two things: vbios which is more about being able to provide device specific information to the OS
17:34 karolherbst: _and_ there are firmwares the driver needs to upload to the GPU and run on several engines to do stuff
17:34 karolherbst: like power management or context switching
17:35 karolherbst: so there are two different kinds of firmware: os owned and device owned
17:35 butterypancake: that makes a lot of sense. So now I know how the GPU manages to do the signature check, because it's the one running the binary
17:36 karolherbst: stuff like signature checks are usually done through embedded firmware the OS can't change (for obvious reasons)
17:37 karolherbst: on nv gpu I don't even think you can update those
17:37 butterypancake: but that embedded firmware is on some NAND chip I can peel off the pcb if I really felt like it.
17:37 karolherbst: you can flash a new vbios though, but that's a separate thing
17:37 karolherbst: butterypancake: sure
17:37 imirkin: i wrote a long post on phoronix about this a while back
17:37 butterypancake: imirkin: I'd love to read that!
17:38 imirkin: https://www.phoronix.com/forums/forum/linux-graphics-x-org-drivers/open-source-nvidia-linux-nouveau/998310-nouveau-persevered-in-2017-for-open-source-nvidia-but-2018-could-be-much-better?p=998427#post998427
17:38 imirkin: it doesn't address your questions exactly
17:38 imirkin: but it does provide a bunch of info
17:38 imirkin: about the whole situation
17:41 imirkin: bbl, errands
17:43 butterypancake: imirkin: I certainly don't understand everything in that post, but it's a good read. Thanks for sharing!
18:32 karolherbst: imirkin: btw, if there are no CTS regressions, I plan to merge the shader cache for nvc0 next week. right now doing some benchmarks with deqp running like all GLES/GL tests https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4264
18:37 karolherbst: we also have an MR for nv50, but the changes in nv50_program.c are bigger and more annoying to review :/ https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6141
18:37 karolherbst: (contains the nvc0 stuff as well for now)
18:38 imirkin: karolherbst: ok, as long as you're pretty sure it doesn't break the "variant" stuff
18:39 karolherbst: imirkin: you mean the fixup stuff?
18:39 imirkin: yes
18:39 karolherbst: I doubt it as the fixups are also serialized
18:39 karolherbst: but yeah.. let's see how the testing goes
18:39 imirkin: the thing is that testing may not really hit those cases
18:39 karolherbst: the CTS does
18:40 imirkin: ok
18:40 karolherbst: well.. at least it hit it enough so I had to fix it for volta :)
18:40 imirkin: well, you had to IMPLEMENT it in volta
18:40 imirkin: but yeah
18:41 karolherbst: anyway, I wanted to merge it right after the branching so we have a lot of time for testing
18:42 imirkin: ok
18:42 imirkin: karolherbst: i'm thinking of dropping ES3 on G80-G98
18:42 imirkin: because they don't support pause/resume xfb
18:42 imirkin: what do you think?
18:43 imirkin: it's a little bit odd, since ES3 doesn't have GS, and without GS it should actually be achievable
18:44 karolherbst: yeah.. makes sense. Maybe there are those applications supporting ES3 in case 4.x is not available?
18:44 karolherbst: at least it would help with webgl
18:45 imirkin: sorta :)
18:45 karolherbst: maybe
18:45 imirkin: at least firefox bails when ARB_tf2 isn't supported
18:45 karolherbst: ahh
18:45 karolherbst: chromium is more complex in this regard I think?
18:45 imirkin: coz that's what brings the pause/resume support in
18:45 karolherbst: but I guess it wouldn't do WebGL2 on top of GL3.3 either
18:45 imirkin: exactly
18:45 imirkin: it needs GL3 + ARB_tf2 + some other stuff
18:46 imirkin: so anyways, i'll think about how i want to handle it. but as-is, we fail lots of ES3 tests on those GPUs due to lack of pause/resume support
18:46 imirkin: nva0+ have pause/resume support, which is nice
18:46 imirkin: (and we expose ARB_tf2 on those)
18:47 karolherbst: we could make it nva0+ for now
18:47 imirkin: right
18:47 imirkin: so this is my plan: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6245
18:47 karolherbst: mhhh
18:48 karolherbst: yeah.. I am just wondering if too many will complain about things breaking all of a sudden
18:48 imirkin: one way to find out :)
18:49 karolherbst: is there a way to split it out in CAPS or something?
18:49 imirkin: i can also look into adding the fake support which doesn't work with GS
18:49 karolherbst: so.. on nv50 we do support it just not for GS, right?
18:49 imirkin: right, so we could add a PAUSE_RESUME_ES3 or whatever thing
18:49 karolherbst: yeah.. maybe that's a better idea
18:49 imirkin: which would ONLY be valid without GS
18:49 karolherbst: or add an _GS variant
18:49 imirkin: but that still requires implementation in nv50
18:50 imirkin: well, it's not just GS - any intermediate stage which produces an unpredictable number of primitives
18:50 karolherbst: right
18:50 imirkin: but those aren't possible in ES3 (without ES 3.1 + extensions)
18:50 karolherbst: maybe we could turn it into a shader cap?
18:50 imirkin: lol no
18:51 imirkin: it has nothing to do with shaders
18:51 imirkin: anyways, i'll think about it.
18:51 karolherbst: I am wondering if one implementation could expose geometry shaders through an extension by still not supporting the pause/resume stuff with them.. mhh
18:52 imirkin: not in ES3
18:52 imirkin: we could make it a condition of the cap
18:52 imirkin: OES_geometry_shader requires ES 3.1
18:52 karolherbst: no.. I mean.. supporting ES3 but not ES3.1, but still exposing geometry shaders through ES
18:52 imirkin: does not currently exist.
18:53 imirkin: and doubtful it ever will
18:53 karolherbst: ahh, I see
18:53 karolherbst: so we couldn't expose OES_geometry_shader on nv50 anyway
18:53 karolherbst: or could we for nva0+?
18:53 imirkin: not legally, at least
18:53 imirkin: no, we'd need to support ES3.1
18:53 karolherbst: okay
18:53 karolherbst: yeah. I guess then that's fine
18:53 imirkin: now there's an argument that could be made that we can actually support ES3.1 on those chips
18:53 imirkin: but we're currently far from that
18:54 karolherbst: right.. but then we need to support pause/resume with GS anyway
18:54 imirkin: right.
18:54 karolherbst: so it would be nva0+ only
18:54 imirkin: i was going to look into how far we are on compute
18:54 imirkin: we might not be that far actually
18:55 karolherbst: yeah.. would be good to get the image stuff tested
19:02 imirkin: i'll be happy if ssbo works :)
19:02 karolherbst: I guess those do work :p
19:02 imirkin: they _should_
19:02 imirkin: but do they? who knows.
19:03 karolherbst: otherwise pmoreau would have complained about busted global memory
19:03 karolherbst: :D
19:03 imirkin: different interfaces, etc
19:03 imirkin: clover vs gl
19:03 karolherbst: but yeah.. maybe something about pushing bounds or something
19:03 karolherbst: yeah..
19:03 imirkin: and there are a lot more deqp ssbo tests, i'd guess
19:03 karolherbst: imirkin: btw, I did an initial implementation of the multithreading fix without reworking the driver at all and I was able to get rid of all races on the pushbuffer, but this was more of a test if I finally understood libdrm now and see where the issues are. But that ends up showing that our fencing is busted as well and I need to have a good plan on how to make that part race free as well
19:03 imirkin: atomics etc
19:04 karolherbst: at least I managed to also run all threaded deqp EGL tests without crashing
19:07 karolherbst: but I already forgot on how it was busted.. but I do remember it was in an annoying way
19:08 imirkin: usually is.
19:10 karolherbst: skeggsb mentioned it might be easier if we would use bufctx for everything and never use pushbuf_refn
19:10 karolherbst: or at least.. a similar concept
19:11 karolherbst: he also mentioned that mesa should have never used pushbuf_refn :)
19:14 imirkin: so i remember one annoying thing was fence_work doing stuff with bo's
19:15 karolherbst: yeah
19:15 karolherbst: something like that
19:15 karolherbst: but I also got into corrupting nouveau_mm
19:15 imirkin: that's nice too :)
19:16 karolherbst: got mm_slab.free to be bigger than mm_slab.count :)
19:16 karolherbst: ehh.. less fails with the cache enabled
19:17 karolherbst: or wait...
19:17 karolherbst: did I count correctly?
19:17 karolherbst: ahh no.. it's identical
19:20 karolherbst: imirkin: anyway, it feels like fixing the fencing/buffer tracking/whatever things first might be a good start and easier to review than a "fix everything at once" thing.. I just don't know much about the details on why pushbuf_refn is bad and bufctx_refn is good and if that actually helps or not...
19:20 karolherbst: but anyway.. would be a start
19:20 imirkin: pushbuf_refn isn't bad.
19:20 imirkin: it just has to be done with some slight care
19:21 imirkin: which we take, i believe
19:21 imirkin: i fixed all those issues a while back
19:21 imirkin: by doing nouveau_pushbuf_space with the appropriate values
19:21 karolherbst: the painful part is that pushbuf_refn can kick the pushbuffer
19:21 karolherbst: which we really don't want
19:21 karolherbst: right...
19:22 karolherbst: I fixed it locally by just kicking at certain points so pushbuf_refn would never end up kicking
19:22 karolherbst: but.. that's still a bit annoying
19:23 karolherbst: if bufctx_refn removes the need for locks, eg as it doesn't touch the pushbuffer, I'd kind of prefer to go with that and I guess skeggsb has a good reason if he says that mesa wasn't supposed to use pushbuf_refn anyway
19:23 karolherbst: and I guess it's a mix of that and other bits
19:28 imirkin: push_refn should NEVER kick
19:28 imirkin: we have pushbuf_space's right in front of it that make sure it doesn't
19:28 imirkin: if that doesn't work, we're in some other sort of trouble
19:40 AndrewR: ..anyone saw nouveau 0000:01:00.0: DRM: core notifier timeout followed by nouveau 0000:01:00.0: DRM: base-0: timeout on recent linux git, on nv92 after I try to change away from X to VT .... (image frozen, but I can return back by ctrl-alt-f7)
19:40 imirkin: AndrewR: there were massive changes to the whole display pipeline
19:40 imirkin: in theory they should have been no-op changes
19:40 imirkin: however in practice, many lines of code were changed :)
19:41 imirkin: you should try bisecting it
19:41 AndrewR: imirkin, most likely not today ... but yes, I saw those little commits ... some fun for future ....
19:47 karolherbst: imirkin: but then space can kick
19:48 imirkin: karolherbst: yes, but it happens before the refn
19:48 imirkin: the refn can never kick
19:48 karolherbst: yeah.. but that doesn't help
19:48 imirkin: (the way we have it)
19:49 karolherbst: the idea was that refing bos shouldn't kick, if it's a pushbuf_space before the pushbuf_refn or the pushbuf_refn itself doesn't really matter
19:50 karolherbst: I'd like to get away from pushbuf_space and all implicit submissions and enforce that we have to kick from mesa, as this would make it all more sane (and removes the need to even care about implicit pushes)
19:50 karolherbst: if bufctx_refn gives us that, then I'd say it's a win
19:50 karolherbst: (and makes it much easier to spot bugs)
19:51 karolherbst: I'd even want any locking to be runtime verifiable as well and assert on the locking code, etc... so we can spot bugs before it ends up at users' machines
19:51 imirkin: it really matters because it gets the bo moved into the place where it should be.
19:51 karolherbst: but that's a different topic
19:51 imirkin: bufctx is a giant pain to use
19:51 imirkin: and is more useful for long-term things
19:51 karolherbst: ahh
19:51 imirkin: the refn stuff is for one-time usages.
19:51 karolherbst: yeah.. makes sense
19:51 karolherbst: but the implicit pushing is still annoying
19:51 imirkin: that's why we have the pushbuf_space
19:51 imirkin: we have tons of PUSH_SPACE all over the place
19:52 karolherbst: what's so annoying about bufctx? maybe we could make it easier to use or rework bits to make it less annoying or something
19:52 imirkin: so stuff kicking in the middle is always a possibility
19:52 karolherbst: yeah.. I'd like to remove all of those :p
19:52 karolherbst: (if possible)
19:52 karolherbst: I know that for fences there is the need to do that eg..
19:52 imirkin: well, as part of removing them, you can use whatever solution for the pushbuf_refn's too
19:52 imirkin: it's the same thing.
19:52 imirkin: another non-obvious thing, potentially
19:52 imirkin: is that BEGIN_* actually includes a PUSH_SPACE in it
19:52 imirkin: HOWEVER
19:53 imirkin: if you have some #define set, which we do in the query files, then it doesn't
19:53 karolherbst: yeah, I saw that
19:54 karolherbst: what I'd like to have is a PUSH_ACQ thing or so, which "allows" accesses to pushbuffers + records all implicit pushes or something and a PUSH_DONE to mark the point where we are done with it, so we know which pushbuffer belongs to the same "thing" be it a draw, or grid launch or whatever
19:54 karolherbst: and at this point a PUSH_SPACE doesn't really make any difference anymore
19:55 imirkin: well, some sequences of commands MUST be in the same pushbuf
19:55 karolherbst: as we have one context/thread writing a buffer and everybody else either waits or writes their own
19:55 karolherbst: right
19:55 karolherbst: and I think for that it's fine to use SPACE
19:55 karolherbst: but it's more of a "this has to belong together"
19:56 imirkin: well, all the BEGIN_* stuff *has* to be together
19:56 karolherbst: in my WIP branch I have one global pushbuffer with global locking, but it also asserts if something doesn't claim ownership, so I am sure nothing adds stuff while another context does _validate
19:56 imirkin: you can't have a command sequence spread over multiple buffers
19:56 imirkin: like if you have BEGIN_NVC0(foo, 10)
19:56 imirkin: then all 11 dwords MUST be in the same pushbuf
19:56 imirkin: or else boom.
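The "all 11 dwords MUST be in the same pushbuf" rule can be illustrated with a toy model. This is only a sketch: `ToyPushbuf` and its methods are hypothetical names, not the libdrm or Mesa API, and the "header" is a tuple rather than a real encoded dword. The point it shows is that space is reserved for header plus payload before anything is written, so a kick can only happen between commands, never inside one.

```python
class ToyPushbuf:
    """Toy model: a fixed-size buffer of dwords that is 'kicked' (submitted) when full."""

    def __init__(self, size):
        self.size = size
        self.cur = []          # dwords queued in the current buffer
        self.submitted = []    # buffers that were already kicked

    def kick(self):
        if self.cur:
            self.submitted.append(self.cur)
            self.cur = []

    def space(self, ndwords):
        """Reserve room up front; kick first if the command would not fit."""
        if ndwords > self.size:
            raise ValueError("command larger than the whole buffer")
        if len(self.cur) + ndwords > self.size:
            self.kick()

    def begin(self, method, count, data):
        """Emit a header dword plus `count` data dwords as one unit."""
        assert len(data) == count
        self.space(count + 1)              # header + payload = count + 1 dwords
        self.cur.append(("hdr", method, count))
        self.cur.extend(data)
```

With a 16-dword buffer, two back-to-back 10-dword commands end up in two separate submissions of 11 dwords each instead of being split 16 + 6.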
19:57 karolherbst: with _validate I meant nvc0_state_validate
19:57 karolherbst: imirkin: right, and that's fine
19:57 karolherbst: that SPACE before _REFN just sounds more of a workaround than a technical requirement
19:57 imirkin: and sometimes even "unrelated" things must be together, but that's very rare
19:57 imirkin: let me find examples, hold on
19:58 imirkin: nve4_p2mf_push_linear
19:58 imirkin: so like
19:58 imirkin: that whole thing inside the while loop must be in the same thing
19:58 imirkin: or else fail
19:58 imirkin: i think
19:58 karolherbst: ahh
19:58 karolherbst: right
19:58 karolherbst: that makes sense
19:59 imirkin: if it's not that one, then there's another one like it
19:59 karolherbst: I wouldn't be surprised if that's a general requirement of p2mf/m2mf
20:02 karolherbst: imirkin: in regards to bufctx.. maybe we should add a _ONETIMEUSE bin and just always clear it?
20:02 imirkin: yeah, we have a TMP one i think
20:02 imirkin: we use it in some places where push_refn would be a pain
20:02 karolherbst: there is no nvc0 one at least
20:03 imirkin: let me check
20:03 karolherbst: maybe it's SCREEN?
20:03 imirkin: NVC0_BIND_3D_VTX_TMP
20:03 karolherbst: ohh
20:03 karolherbst: missed that one
20:04 karolherbst: imirkin: but I was more thinking about a TMP one we could just reset inside kick_notify or so...
20:04 imirkin: bins are cheap.
20:05 karolherbst: right
20:05 karolherbst: but that was more of an idea to make one time buffers easy to use with bufctx as well
20:07 karolherbst: at least now I've got an idea what I could try out ... let's see how that goes
20:10 imirkin: the whole draw flow is pretty complicated
20:10 imirkin: takes a while to get your head wrapped around it
20:10 karolherbst: yeah.. I noticed
20:10 imirkin: like the kick notify callback gets set to something weird while a draw is happening
20:11 karolherbst: nvc0_draw_vbo_kick_notify :)
20:11 imirkin: yeah
20:11 imirkin: anyways, requires some care.
20:11 karolherbst: I am wondering if we could get rid of it somehow
20:15 karolherbst: imirkin: btw assert(mtx_trylock(&lock) == thrd_busy); is a fun way of abusing locking functions to verify correctness :)
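The same trylock trick can be sketched in Python, assuming a plain non-reentrant `threading.Lock`; `assert_locked` and `push_data` are hypothetical helpers. A non-blocking acquire that fails means the lock is held, which is the expected state. Note that this only proves somebody holds the lock, not that the calling thread does, and it would not work with a reentrant lock.

```python
import threading

def assert_locked(lock):
    """Assert that `lock` is currently held by someone.

    A non-blocking acquire on a held, non-reentrant lock fails; if it
    succeeds, nobody held the lock, which is the bug we want to catch.
    """
    if lock.acquire(blocking=False):
        lock.release()  # undo our accidental acquisition before failing
        raise AssertionError("lock was expected to be held")

def push_data(lock, buf, dword):
    assert_locked(lock)   # caller must own the (hypothetical) pushbuf lock
    buf.append(dword)
```

Unlike the bare C one-liner, the helper releases the lock again when the check fails, so a failed assertion does not leave the lock taken.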
20:15 imirkin: heh
20:23 karolherbst: mhh. a 10% speed improvement in deqp with the caching
20:24 airlied: yeah caching is great for deqp
20:24 karolherbst: sadly when having no initial cache it doesn't help at all :/
20:26 airlied: it should have some effect
20:26 airlied: since a lot of gles tests use the same vertex or fragment shaders repeatedly
20:27 karolherbst: airlied: nothing significant
20:27 karolherbst: but yeah.. with intel I think I saw bigger differences...
20:27 airlied: you should check the cache has hits :-P
20:27 karolherbst: so maybe something odd going on
20:27 karolherbst: dunno
20:27 karolherbst: airlied: yeah.. maybe
20:28 airlied: the cache does only write to disk in a background thread I think that is scheduled low pri
20:28 karolherbst: ahhh
20:28 karolherbst: maybe that's why
20:28 airlied: but I assume it gets some cpu time
20:28 airlied: but maybe not if you smash it with the parallel runner
20:28 karolherbst: airlied: do you know if iris also has a in memory cache like radeonsi?
20:29 airlied: nope don't think so, and I think marek was considering dumping the radeonsi one
20:29 karolherbst: ahh
20:29 airlied: and just keeping the live and disk ones
20:30 karolherbst: yeah.. I guess applications usually don't do dumb things :p
20:49 karolherbst: ohhhh
20:51 karolherbst: I thought the blitter wasn't cached :) but it actually is
20:57 imirkin: it used to be super-cached
20:57 imirkin: ben recently redid some stuff, not sure if it's still as-cached as it was
21:19 karolherbst: ufff.. something is weird
21:19 karolherbst: with nir the cache grows up to 1.3G
21:19 karolherbst: with TGSI only 600M
21:20 karolherbst: seems like something is off
21:21 karolherbst: airlied: btw, iris does have an in memory cache
21:22 karolherbst: ohhh....
21:22 karolherbst: nir shaders have names
21:22 karolherbst: *ugh*
21:24 karolherbst: also I guess variable names and shit
21:26 karolherbst: ahh last argument to nir_serialize needs to be true
21:26 karolherbst: not false
21:31 karolherbst: airlied: btw, do you know if we ever identified a bug as being a cache collision?
21:31 karolherbst: I am not sure if I trust sha1 enough on that
21:38 imirkin: skeggsb: did you ever hear back on the gpio thing from nvidia btw?
21:43 karolherbst: huh.. that's a new
21:43 karolherbst: nouveau messes up, wayland compositor crashes?
21:43 karolherbst: (running on intel)
21:58 airlied: karolherbst: it would be pretty amazing to get a sha1 cache collision on structured data that makes sense
22:06 imirkin: airlied: could probably be engineered though
22:06 imirkin: even within the structure of that data
22:08 HdkR: xxhash ftw?
22:08 HdkR: 13 days ago it just released the new faster version as well :)
22:11 airlied: imirkin: the engineered sha1 collisions were all pretty nuts in terms of data being in any way similar
22:12 airlied: not sure what you'd achieve engineering a mesa shader cache collision, other than being correct about something on the internet, rather than anything that matters
22:13 imirkin: airlied: yeah, more like demonstration that it's possible
22:13 imirkin: HdkR: i think some parts of mesa already use xxhash
22:14 HdkR: aye
22:14 HdkR: Probably worth updating it to the new XXH3 hash even :P
22:23 imirkin: karolherbst: do you remember where pmoreau's nv50 tree is?
22:26 imirkin: aha, found it i think
22:27 karolherbst: airlied: well... we already have that for sha1 in general :p
22:27 karolherbst: sha1 is quite busted
22:28 karolherbst: although I am sure it's not as bad as md5 :D
22:28 karolherbst: but it's more about random issues
22:29 karolherbst: like one in a million users having some crash because of a cache collision
22:54 imirkin: pmoreau: if you want to make a patchset with just the resource stuff for nv50, i can review it (i.e. properly handling the binding stuff)
22:54 imirkin: pmoreau: i'd encourage you to try to copy what nvc0 does as much as possible. i think most of the same concepts carry over, including having shared resources and having to rebind stuff
22:55 Lyude: skeggsb: btw, if you're around (otherwise I might figure it out while you're gone) how exactly do the GET/PUT methods for evo/nvd work? Is it something like PUT == start running display methods at this address, and GET == which method the display controller is currently executing?
22:56 Lyude: wondering since it looks like we don't rewind the push buffer correctly when we run out of space
22:58 airlied: karolherbst: there's a big difference between "it's practical to engineer a collision" and "random collisions of structured data exist"
23:05 skeggsb: imirkin: no, not yet unfortunately
23:05 karolherbst: airlied: right... it's just that guarding against those collisions is just super cheap
23:05 skeggsb: Lyude: it'll process commands until GET==PUT, and yeah, there's something fishy there that i haven't tracked down yet either
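skeggsb's description ("process commands until GET==PUT") can be pictured with a toy ring buffer. This is a generic producer/consumer sketch under that one stated rule; `ToyChannel` is a hypothetical name and the real EVO/NVD channel handling in nouveau (including how it rewinds) is not shown in this log and may well differ.

```python
class ToyChannel:
    """Toy ring: the driver advances PUT, the device advances GET."""

    def __init__(self, size):
        self.ring = [None] * size
        self.get = 0   # next slot the device will read
        self.put = 0   # next slot the driver will write
        self.executed = []

    def free_slots(self):
        # One slot stays empty so that GET == PUT always means "idle".
        return (self.get - self.put - 1) % len(self.ring)

    def submit(self, cmd):
        if self.free_slots() == 0:
            raise BufferError("ring full; wait for GET to advance")
        self.ring[self.put] = cmd
        self.put = (self.put + 1) % len(self.ring)   # wrap back to the start

    def device_run(self):
        # The device consumes commands until it catches up with PUT.
        while self.get != self.put:
            self.executed.append(self.ring[self.get])
            self.get = (self.get + 1) % len(self.ring)
```

The wrap in `submit` is the part that corresponds to "rewinding" when the buffer runs out of space; getting that step wrong would leave GET and PUT out of step.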
23:06 skeggsb: not yet sure if it's a dumb mistake on my part, or the weirdness we encountered last time we tried to do this
23:08 karolherbst: people also claim that ccache doesn't hit collisions, but it actually does quite often... true, it uses md4 which is even more busted, but bugs exist and I don't want to annoy users by those random issues nobody has any clue why it happened
23:09 karolherbst: the annoying thing about those bugs are that "trying again" doesn't retrigger the issue so most don't even report those
23:10 airlied: karolherbst: weird, I hadn't heard of real world ccache issues, I've heard of bugs in ccache and times where 0 byte files get considered as valid
23:11 karolherbst: airlied: we even got 3 I remember in #dri-devel :p
23:11 karolherbst: one was even reproducible
23:11 karolherbst: git checkout to a different commit triggered it
23:11 karolherbst: was a bug in ccache and not related to md4 being weak though
23:12 airlied: yes as I said ccache bug
23:12 airlied: not hash collision
23:12 karolherbst: right, but mesa also has bugs
23:12 airlied: and they won't be in the sha1 colliding
23:12 airlied: so using a different hash won't fix them
23:13 Lyude: skeggsb: gotcha-I'm bored so I might as well try taking a look
23:13 airlied: like a hash that isn't cryptographically secure doesn't mean it's not a good hash
23:13 karolherbst: ohh the ccache bug I remember there was a hash collision
23:13 karolherbst: just ccache failed to create the hash properly
23:13 airlied: https://lists.samba.org/archive/ccache/2004q4/000149.html
23:14 airlied: is a good analysis
23:14 airlied: but if ccache ends up summing a 0 byte file because the input values choked, then it doesn't matter what the hash is
23:14 karolherbst: the annoying thing is not the hash being secure or prone to attacks, but random user bugs
23:15 karolherbst: we could mess up creating the hash (even though I kind of doubt it as we create it from the serialized shader)
23:15 karolherbst: airlied: it wasn't like that
23:15 karolherbst: also, if you have a million users, "random" becomes less random
23:15 karolherbst: it's just that it only hits a few users
23:16 karolherbst: the problem is just, it's super annoying to blame ccache on things as 1. there are people claiming "it's not a ccache bug" and just not even think about it
23:16 airlied: karolherbst: you'll see more problems from bits from the sun
23:16 airlied: and bad RAM
23:16 karolherbst: airlied: with sha1 maybe
23:17 karolherbst: the point is, it's quite cheap to check
23:17 karolherbst: and I'd rather have a "cache collision" in the log than a "dunno what happened there"
23:17 karolherbst: even _if_ the fs gets corrupted and messing with the cache
23:17 karolherbst: we at least would know
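The kind of cheap check karolherbst is arguing for could look roughly like this. It is a sketch only, not how Mesa's disk cache works: `CheckedCache` is a hypothetical class that stores the full key next to the cached value and compares it on lookup, logging a "cache collision" instead of silently returning the wrong entry. The optional `digest` argument exists purely so a collision can be simulated.

```python
import hashlib

class CheckedCache:
    """Toy cache keyed by SHA-1 that stores the full key next to the value."""

    def __init__(self, log=print):
        self.entries = {}
        self.log = log

    def put(self, key_bytes, value, digest=None):
        digest = digest or hashlib.sha1(key_bytes).digest()
        self.entries[digest] = (key_bytes, value)

    def get(self, key_bytes, digest=None):
        digest = digest or hashlib.sha1(key_bytes).digest()
        hit = self.entries.get(digest)
        if hit is None:
            return None
        stored_key, value = hit
        if stored_key != key_bytes:
            self.log("cache collision for digest " + digest.hex())
            return None   # treat as a miss and recompile
        return value
```

Storing the whole key costs space; a lighter variant of the same idea would store only a second, independent checksum or the key length.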
23:18 karolherbst: also md4 is just busted
23:18 karolherbst: and the 2004 analysis just plain wrong :p
23:18 karolherbst: or rather is based on wrong assumptions
23:19 karolherbst: if you assume your application is bug free, then yes, you can just say "the hash is safe"
23:19 karolherbst: but 1. it's not and 2. data corruption exists
23:20 airlied: you must get strangely structured data corruption
23:20 karolherbst: maybe
23:20 airlied: you got some weird magneto powers?
23:20 karolherbst: but I am not talking about a single machine
23:21 karolherbst: I am talking about millions of machines (dunno how many users mesa actually has, but maybe it's not that far off)
23:21 airlied: you know valve already rolled out a big system that runs this stuff across lots of steam users
23:21 karolherbst: and if you got millions, those things can become quite likely
23:21 airlied: that's just bad stats
23:22 karolherbst: yeah... maybe it won't ever happen
23:22 karolherbst: but that's why I was wondering if it actually happened
23:22 karolherbst: but maybe nobody ever checked so "we don't know"
23:22 airlied: if there was a decent sha1 corruption that causes a collision, there'd be papers
23:22 airlied: researchers would be all over it just to get a headline
23:23 karolherbst: sha1 is already broken though
23:23 airlied: again you say that like it matters in this use case
23:23 airlied: broken cryptographically yes
23:24 karolherbst: maybe it's just me thinking that collision detection is just something which needs to be done and is actually not required.. dunno
23:24 karolherbst: but again.. checking is also fairly cheap
23:25 airlied: I expect it's just you, and it's not worth the effort
23:25 karolherbst: it's no effort
23:25 karolherbst: that's the point
23:25 airlied: though if you really believe it's a big problem, then you should bring it up for the mesa cache
23:25 airlied: not just nouveau
23:26 karolherbst: the issue is, I have no idea if it's an issue, hence I am wondering if it actually happened at some point
23:26 karolherbst: if it would be something expensive to add, sure, then I'd see we could just ignore it
23:26 airlied: ask valve maybe, they've ran a lot of shaders through the cache
23:26 karolherbst: but it's actually not expensive
23:27 airlied: but I expect the argument is it's not a real world problem
23:27 karolherbst: yeah, probably
23:27 karolherbst: it's more of a "random bug happened" and I want to be sure it's not a cache collision happening for whatever reason
23:28 karolherbst: maybe a system even crashed and things just got stored in weird ways.. who knows
23:28 karolherbst: I am fairly sure that won't happen on my system, but if you got millions of users those probabilities shift quite a lot
23:29 karolherbst: maybe I should do the math and see where it gets me
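A rough version of "the math" is the birthday bound for purely random collisions. The sketch below assumes uniformly distributed 160-bit hashes, and the user and shader counts are made-up illustrative numbers, not measurements. It says nothing about the other failure modes raised in the discussion (bugs in key generation, filesystem or memory corruption).

```python
import math

def collision_probability(n_items, hash_bits=160):
    """Birthday-bound estimate of P(at least one collision) among n_items
    uniformly random hash values of hash_bits bits."""
    pairs = n_items * (n_items - 1) // 2
    # 1 - exp(-pairs / 2**bits); expm1 keeps precision for tiny values.
    return -math.expm1(-pairs / 2**hash_bits)

# Assumed, purely illustrative numbers: a million users with
# 100,000 cached shaders each, all pooled as if in one cache.
p = collision_probability(10**6 * 10**5)
print(f"{p:.3e}")
```

Pooling every user's entries into one cache overstates the risk, since real caches are per machine, and even so the estimate comes out on the order of 1e-27.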
23:51 karolherbst: here are some data for nvc0: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4264#note_593246
23:52 karolherbst: wondering how much it would help "Shadow Warrior", that game took like 5 minutes to load without any caches
23:58 karolherbst: anyway, we can improve that later :)