01:54 airlied: zmike: the reason llvmpipe didn't expose 4.6 is because it never passed conformance
10:14 tarceri: karolherbst: using valgrind can be useful for finding perf issues for small things, for example a single shader compilation. Dumping the output in KCacheGrind you can visually click through call paths, view the code when mesa is built in debug mode etc
10:24 karolherbst: tarceri: call paths as in it literally records which path was taken through the code?
10:38 tarceri: karolherbst: yes, it's the output from the valgrind callgrind tool. You get the number of calls of each function, which other functions called that function and how many times etc. You can quickly order the call list by largest percentage of time spent on the code inside a function. The actual timings might not be as accurate as some other tools but for many instances that doesn't really matter.
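[editor's note] The kind of data described above (exact call counts per function, plus which caller invoked which callee and how often) can be illustrated with a small Python analogue. This is only a sketch of the idea, not how callgrind is implemented: it uses Python's `sys.setprofile` hook, and `record_calls`, `compile_shader`, `parse`, `optimize` and `optimize_pass` are hypothetical names made up for the example.

```python
import sys
from collections import Counter


def record_calls(entry):
    """Run entry() and return exact (per-function call counts, caller->callee edge counts)."""
    calls = Counter()
    edges = Counter()

    def tracer(frame, event, arg):
        # "call" fires once for every Python-level function entry, so the
        # numbers are exact counts rather than statistical estimates.
        if event == "call":
            callee = frame.f_code.co_name
            caller = frame.f_back.f_code.co_name if frame.f_back else "<root>"
            calls[callee] += 1
            edges[(caller, callee)] += 1

    sys.setprofile(tracer)
    try:
        entry()
    finally:
        sys.setprofile(None)
    return calls, edges


# Hypothetical toy workload standing in for a single shader compilation.
def parse():
    pass


def optimize_pass():
    pass


def optimize():
    for _ in range(3):
        optimize_pass()


def compile_shader():
    parse()
    optimize()


if __name__ == "__main__":
    calls, edges = record_calls(compile_shader)
    for (caller, callee), count in sorted(edges.items()):
        print(f"{caller} -> {callee}: {count}")
```

Because every call is intercepted, the instrumented run is much slower than the uninstrumented one, which is the same trade-off discussed in the conversation.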
10:40 karolherbst: tarceri: well but perf can do that as well?
10:41 karolherbst: like my assumption here is that even cachegrind adds enough overhead to change the performance characteristics, and I'm curious what benefits it has over e.g. perf, which doesn't really have any perf impact
10:43 karolherbst: maybe I should just try it out the next time I'm doing perf profiling and see how it goes
10:48 tarceri: karolherbst: The benefit I find is having accurate call path information and numbers. perf is just sampling so you don't always get a full picture of what is happening, but that might not matter depending on what you want
10:50 karolherbst: right... so far the info I got with perf is good enough and I can still see the code with inline percentages of how often a path was hit. Though I guess correct path information gives you a better high level overview
11:07 tarceri: If you are just trying to lower the biggest hot paths perf is probably fine for most uses, but a few years ago when Valve was really focused on getting shader compile times as low as they could be I found the callgrind output really useful for tightening up every little bit in the compiler pipeline.
11:09 tarceri: But I wouldn't want to be running anything too long running as it's sloooooooow
11:21 pq: The problem and the benefit with stochastic sampling is that it's stochastic.
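[editor's note] A toy model of the trade-off mentioned here, assuming a program's execution can be represented as a list of "ticks" labelled with the function running at that moment. This is not how perf or callgrind work internally; `make_timeline`, `exact_profile`, `sampled_profile` and the function labels are hypothetical, and sampling is modelled as an independent coin flip per tick.

```python
import random
from collections import Counter


def make_timeline():
    """Hypothetical execution trace: one label per unit of CPU time."""
    return ["hot_loop"] * 9000 + ["setup"] * 995 + ["rare_path"] * 5


def exact_profile(timeline):
    """Count every tick: slow in real life, but nothing is missed."""
    return Counter(timeline)


def sampled_profile(timeline, rate, seed):
    """Keep each tick with probability `rate`: cheap, but only an estimate."""
    rng = random.Random(seed)
    return Counter(fn for fn in timeline if rng.random() < rate)


if __name__ == "__main__":
    timeline = make_timeline()
    print("exact:  ", dict(exact_profile(timeline)))
    for seed in range(3):
        # Different seeds stand in for different profiling runs.
        print("sampled:", dict(sampled_profile(timeline, rate=0.01, seed=seed)))
```

In this model the dominant function shows up in every sampled run, while a function that accounts for only a handful of ticks can be absent from a given run, which is one way to read "you don't always get a full picture".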
11:22 pq: btw. do you know of any problems in 'perf' code location recording on e.g. ARM A55?
11:23 pq: like, assigning samples to not exactly the right code location
11:25 karolherbst: tarceri: yeah.. my biggest concern is that, since it impacts perf characteristics, it might see something as a hot path that wouldn't be one without it, but it sounds like it's reliable enough in that regard?
11:27 pq: karolherbst, cachegrind is really good for stuff that is CPU-bound in the first place. Executing CPU code much MUCH slower than usual does not distort the results.
11:33 tarceri: karolherbst: I haven't really seen anything too bad in that regard, no.
12:36 ammen99: karolherbst: I used valgrind because I have used it in the past (successfully), but now I looked at perf and especially the hotspot gui, I believe it has the information I need.
12:36 ammen99: in any case indeed, memsetting is not the slowest part :)
12:37 ammen99: it seems the bottlenecks are vkQueueSubmit2KHR (10% of the time spent there, if I understand the gui correctly)
12:37 zmike: airlied: happily, it doesn't look like any of the remaining CTS fails are from 4.6 features, so this hasn't changed anything
14:03 zmike: mareko: is there any practical reason we couldn't add PIPE_BIND_SAMPLER_VIEW to renderbuffer_alloc_storage ?
14:04 zmike: the lack of it effectively guarantees hitting a "missing bind flag" issue when calling u_blitter
16:36 mareko: zmike: no
16:50 cmarcelo: where can I find logs that margebot refers to in: "This branch couldn't be merged: Failed to push rebased changes. Check bot logs for more details." --- I've rebased the branch.
16:51 valentine: yeah, marge is broken atm for all Mesa MRs
16:51 valentine: not sure yet what the issue is
16:51 cmarcelo: ah, ok. tks!
17:06 zmike: mareko: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40444
17:25 mareko: zmike: I guess the vulkan validation layer failed
17:29 mareko: zmike: there is no practical reason for PIPE_BIND_SAMPLER_VIEW to be optional
17:40 zmike: mareko: actually it's for llvmpipe to shut up the error spam there
17:40 zmike: I'm wondering if SAMPLER_VIEW bind should just become implicit
19:31 airlied: zmike: but at one point we didn't advertise the GL level without conformance or printing a warning
19:32 airlied: zmike: 4.5 is conformant to the CTS that was released at the time, just newer tests added started to fail and nobody fixed them
19:32 zmike: I'm not sure what you mean
19:32 zmike: I just ran 4.6 this morning and it's clean
19:32 airlied: oh okay, maybe it all got magically fixed, want to submit a conformance report :-)
19:32 zmike: I do not
19:32 zmike: but I will fix more bugs
19:56 airlied: I've got a vague feeling down in one of the cts-runner corners something falls over
19:56 airlied: but I don't have a spare machine right now to leave running it
19:58 zmike: I can try it out on my multithread runner tomorrow