00:41 rhyskidd: mwk: thanks for the envytools review
01:06 Llmiseyhaa: So looks like my GT730 likes to sit at ~46C idle.
01:08 rhyskidd: Llmiseyhaa: sounds reasonable
01:13 Llmiseyhaa: Yeah, that's what I thought.
09:03 paulk-gagarine: hey there, I see that you have updated the tasks lists for nouveau -- I'd like to suggest working on the GK20A firmwares so that the whole graphics pipeline can run with free software
09:04 paulk-gagarine: (which I care about a lot, as I'm very interested in running the Tegra K1, either on the Jetson or Chromebooks, with free software)
09:15 karolherbst: paulk-gagarine: yeah, you can do that. I am not entirely sure what is wrong with our kepler firmware on the tk1
09:37 paulk-gagarine: karolherbst, yeah I'd really like to figure it out, but never had the time to
09:37 paulk-gagarine: karolherbst, should I submit the idea anywhere in particular?
10:37 karolherbst: paulk-gagarine: well it could be a more general task, because we also use closed firmware for video acceleration on most boards
10:37 karolherbst: I keep that in mind
12:12 NanoSector: so, last revision is building now
12:12 NanoSector: it's given me way more good builds than bad ones though so i'm not convinced this is reliable
12:19 NanoSector: karolherbst, 0be75179df5e20306528800fc7c6a504b12b97db is the first bad commit
12:19 NanoSector: https://github.com/torvalds/linux/commit/0be75179df5e20306528800fc7c6a504b12b97db
12:28 NanoSector:compiles that and tests
12:32 NanoSector: yeah but that commit is fine as well :\
12:34 NanoSector:tries a bisect on drivers/gpu/drm/nouveau between 4.11 stable and 4.12 stable
12:57 karolherbst: NanoSector: :/
12:58 karolherbst: it might always be the case that you made a mistake somewhere
12:58 karolherbst: most likely your last bad is good as well
12:59 NanoSector: i don't think i made a mistake
13:00 NanoSector: but i do builds in 10 minutes now (make localmodconfig helps a lot)
13:00 NanoSector: although
13:00 imirkin: paulk-gagarine: it's quite likely that the nouveau firmware Just Works (tm) on the TK1. just no one's ever tried.
13:00 NanoSector: i did select latest master as bad, which includes 4.13 and everything
13:03 paulk-gagarine: imirkin, IIRC tagr and gnurou were saying that it doesn't really work well
13:03 imirkin: i don't think they ever properly tried it
13:03 imirkin: i think there's a wrinkle in that it needs some of the new GK208 logic but in the v3 isa. or vice-versa.
13:14 paulk-gagarine: ohh
13:14 paulk-gagarine: if it's just that, it looks rather doable
13:23 imirkin: for all i know it won't work
13:23 imirkin: but iirc the initial hurdle was a moderately simple one
14:07 NanoSector: karolherbst, this next bisect points to 2579b8b0ece53248b815042f8662a4531acf120d
14:08 NanoSector: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?id=2579b8b0ece53248b815042f8662a4531acf120d
14:08 NanoSector: which sounds illogical
14:09 NanoSector: or does it
14:09 NanoSector: hmm
14:10 NanoSector: well i'll be damned, now it works on kernel 4.13
14:10 NanoSector:flips a table
14:17 NanoSector: so running dolphin-emu does crash it on 4.13, but glxgears suddenly starts to work
14:17 NanoSector: managed to capture a dmesg: https://hastebin.com/efiyafobim.vbs
14:50 NanoSector: oh the dmesg got cut off, welp
14:51 NanoSector: here's the full thing: https://ptpb.pw/CzBH
14:52 NanoSector: so gk104_fifo_runlist_commit is in the stack traces of that dmesg
14:54 NanoSector:undoes the patch and tests
14:59 NanoSector: yeah it still hangs then. so unless anyone has other ideas i'm not sure i can be of more use here, sorry
15:23 karolherbst: :/
17:02 karolherbst: NanoSector: yeah, sometimes some effects aren't really connected to the issue so the wrong thing is actually bisected :/
17:04 NanoSector: :\
17:04 NanoSector: well, we tried
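The bisect trouble above (more good builds than bad, a "first bad commit" that turns out fine) is what one would expect when a bug does not reproduce on every run. A minimal sketch of the idea, not a replacement for `git bisect`: `first_bad`, `is_bad`, and `repeats` are hypothetical names, and the code assumes the first commit is good, the last is bad, and the bug persists once introduced.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str],
              is_bad: Callable[[str], bool],
              repeats: int = 1) -> str:
    """Binary-search for the first bad commit.

    Assumes commits[0] is known good and commits[-1] is known bad.
    Each candidate is tested up to `repeats` times and counts as bad
    if any single run reproduces the bug.
    """
    def looks_bad(commit: str) -> bool:
        return any(is_bad(commit) for _ in range(repeats))

    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if looks_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]
```

A single false "good" sends the search into the wrong half, so with an intermittent hang it is usually worth testing each step several times before marking it good.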
17:05 karolherbst: is GT200 third gen Tesla as well?
17:05 imirkin_: how do you count gens?
17:06 karolherbst: no clue
17:06 imirkin_: it's kinda in between G96, G98, and MCP77
17:06 imirkin_: but also has fp64 :)
17:06 karolherbst: I want to write down which GPUs have PMUs where we can upload our code to
17:07 karolherbst: in a way a random interested GSoC student would understand it
17:07 imirkin_: ooof....
17:07 karolherbst: ;)
17:07 imirkin_: i honestly don't remember
17:07 imirkin_: does G98 have such a PMU?
17:07 karolherbst: well GT215 is the first
17:07 imirkin_: if not, then there's no chance
17:07 karolherbst: imirkin_: NVIDIA GT21x, MCP77-MCP7A, MCP89, GFxxx, GKxxx or GM1xx chip ;)
17:08 imirkin_: wait, MCP77/79 have it?
17:08 karolherbst: I am sure my nvac has it
17:08 imirkin_: huh
17:08 karolherbst: or not? let me check
17:09 karolherbst: maybe only mcp89
17:09 imirkin_: mcp89 definitely does
17:09 karolherbst: okay, nouveau doesn't enable the PMU for MCP77-7a
17:10 karolherbst: well, then I would write "NVIDIA GT21x, MCP89, GFxxx, GKxxx or GM1xx chip" in the project idea
17:10 karolherbst: should be good enough
17:10 karolherbst: mhh
17:10 imirkin_: https://github.com/skeggsb/nouveau/tree/master/drm/nouveau/nvkm/subdev/pmu/fuc
17:10 karolherbst: tegra doesn't have i... argh
17:11 imirkin_: of course not ;)
17:11 karolherbst: oh well
17:24 karolherbst: imirkin_: do you think it would be a feasible task for a student to implement access to system memory in draw calls for GF-GM GPUs in a GSoC project?
17:24 imirkin_: ?
17:25 imirkin_: that should already work
17:25 karolherbst: really?
17:25 imirkin_: perhaps i misunderstand the question
17:25 karolherbst: I thought for draw calls everything needs to be moved to vram?
17:25 imirkin_: i mean ... the way the code is written yeah, but just flip the vram_domain to gart, and you'll have it in gart.
17:26 skeggsb: it's... complicated... ;)
17:26 imirkin_: [this is done on K1]
17:26 karolherbst: skeggsb: well, yeah, but I thought on fermi and newer this should be quite possible
17:26 karolherbst: imirkin_: okay well yeah, but the task of the student would be to implement a way to do that on demand
17:26 skeggsb: in theory, you can pass both VRAM/GART, and you'll get wherever you manage to get.. but, the way the ttm code works means if you do that, you'll lose large pages
17:27 karolherbst: and maybe even implement that "preferably in vram, but sysmem would be fine as well, but slower" thing
17:27 skeggsb: which means no compression, etc
17:27 imirkin_: skeggsb: i thought with fermi you could get large pages? or you mean hw supports it but not ttm?
17:27 skeggsb: the way it's implemented, mapping stuff into the VM isn't allowed to fail
17:27 imirkin_: [i mean, large pages in sysmem]
17:27 skeggsb: which means you can't migrate between small<->large pages for a buffer
17:28 skeggsb: (because you have to allocate PTs, which can fail)
17:28 skeggsb:is working on it
17:28 karolherbst: well yeah true, but imirkin_ was under the impression it should be easier on fermi?
17:29 imirkin_: skeggsb: is there anything you're not working on?
17:29 imirkin_: world peace?
17:29 skeggsb: karolherbst: it's *possible* on fermi, just not the way our ttm implementation works
17:29 karolherbst: skeggsb: anyhow, the question is rather if it would be a feasible task to do for a student or which part could be done by one
17:29 karolherbst: I see
17:29 imirkin_: skeggsb: wait, but does fermi not support large pages in sysmem?
17:30 skeggsb: you can't map 4K pages into 64/128K GPU pages :P
17:30 imirkin_: that's the CPU side of it
17:30 imirkin_: but from the GPU side it works OK, right?
17:30 skeggsb: if the CPU has a 64K page size, in theory, possibly
17:31 skeggsb: dunno if the GPU supports it or not
17:31 imirkin_: all you need is 64K of contiguous hw memory
17:31 skeggsb: i heard a hint that it doesn't somewhere
17:31 imirkin_: ah. so then that's fail, since you'd have to "resolve" the image to remove compression/etc
17:31 skeggsb: yes well, getting non-PAGE_SIZE blocks of memory gets really hard the longer the system is up... not really a feasible solution even if it does work
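The constraint being discussed (4K system pages can only back a large GPU page if they happen to be physically contiguous) can be illustrated with a small check. This is a sketch under assumptions, not nouveau code: `large_page_mappable` is a hypothetical helper, the 64 KiB default is one of the sizes mentioned in the discussion, and natural alignment of each large page is assumed.

```python
SMALL_PAGE = 4 * 1024  # CPU page size used in the discussion


def large_page_mappable(frames, large_page=64 * 1024):
    """Check whether a buffer could be mapped with large GPU pages.

    `frames` lists the physical addresses of the 4 KiB pages backing the
    buffer, in virtual order. Every group of pages that would form one
    large page must be physically contiguous and (assumed here)
    aligned to the large page size.
    """
    per_large = large_page // SMALL_PAGE
    if len(frames) % per_large:
        return False
    for i in range(0, len(frames), per_large):
        base = frames[i]
        if base % large_page:
            return False
        for j in range(per_large):
            if frames[i + j] != base + j * SMALL_PAGE:
                return False
    return True
```

On a system that has been up for a while, the page allocator rarely hands out such runs, which is the fragmentation problem skeggsb describes.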
17:32 imirkin_: =/
17:32 imirkin_: so then wait... what does fermi do that G84 doesn't as far as this page migration stuff is concerned?
17:32 skeggsb: dual page tables
17:32 skeggsb: (4K and 64K page tables that cover the same virtual address range)
17:32 skeggsb: err.. 128K
17:32 skeggsb: (or 64K..)
17:32 skeggsb: it can be both, 128K by default
17:32 imirkin_: right
17:33 imirkin_: since configurable page sizes is definitely what all hw needs
17:33 skeggsb: pascal has 2MiB pages too!
17:33 skeggsb: ;)
17:33 imirkin_: and if possible, those page sizes shouldn't match what other hw vendors have
17:33 imirkin_: because then interop would be too simple
17:34 skeggsb: aaanyway, i've been having *loads* of fun in the horror of those codepaths lately, so yes, working on it actually
17:34 imirkin_: (x86_64 large pages are 1MB right?)
17:34 skeggsb: not a clue
17:34 imirkin_: or 4MB. but almost definitely not 2MB
17:34 imirkin_: [and there are like 4GB huge pages]
17:34 skeggsb: i don't believe current non-SOC HW can use system memory large pages anyway, but i'm not sure where i heard that, or if it's even true
17:35 karolherbst: imirkin_: 4k, 2M, 1G
17:35 imirkin_: well it definitely doesn't sound like a good idea anyways... like you said, fragmentation becomes too big an issue.
17:35 imirkin_: oh. so then 2MB is a match. neat.
17:35 imirkin_: perhaps they *did* finally compare notes.
17:36 karolherbst: well yeah, what I am actually interested in is whether you think we could make a project for a student out of this or if we should just deal with it ourselves
17:40 skeggsb: i don't think the DRM side of it is a big enough project, it basically amounts to lazily-mapping buffers into GPU address-space directly at submission time
17:41 skeggsb: as opposed to in the ttm move_notify() hook, which can't fail
17:41 skeggsb: making it sane on the userspace side, i can't comment on well enough atm...
17:42 karolherbst: skeggsb: okay, so it's most likely not much of a big deal, but would require a lot of knowledge about proper memory placement, so that we can implement good enough scoring to not waste too much performance
17:42 skeggsb: in theory, userspace should send the kernel a list of "here are all the buffers this submission needs, and the places that are OK for it to be in for it" and the kernel deals with it
17:42 karolherbst: yeah, right
17:43 skeggsb: but, then you end up with hilarious situations like mesa doing system memory -> system memory blits with the cpu...
17:43 karolherbst: I already discussed with imirkin_ that we could have for every buffer a "places where I can be", and "places where I _want_ to be"
17:44 skeggsb: *or* you don't have the kernel randomly place stuff, and somehow tell userspace to migrate shit itself, so it knows where its buffers are at any given moment
17:44 karolherbst: mhhh
17:44 skeggsb: (the kernel will still evict, but will put it back wherever userspace thought it was last)
17:44 NanoSector: karolherbst, i don't even think the bug i have is reliably reproducible, a while back I could open gnome-control-center no problem and now it doesn't work, on the very same kernel
17:44 karolherbst: skeggsb: maybe userspace could give a score for how bad it would be to be put into sys mem
17:45 karolherbst: and 1 means I don't care at all, and 100 means: super totally super bad
17:45 skeggsb: it can't really know that up front, i mean, how well do all our heuristics like that work out? ;)
17:45 karolherbst: but I doubt that's feasible at all
17:45 karolherbst: ;)
17:45 karolherbst: does it really matter if you run out of vram?
17:46 karolherbst: a bad heuristic is still better than messing up due to having no vram
17:46 skeggsb: that *should* already work, the kernel will kick buffers out
17:47 karolherbst: within one draw call?
17:47 skeggsb: yes, it'll go through the list of buffers, and hand out vram-only first, then gart-only, then whatever's left over goes to things marked as either
17:48 skeggsb: if you send a submission that has too many vram-only objects to fit at once, that's a userspace bug
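The ordering skeggsb describes (vram-only buffers first, then gart-only, then whatever accepts either) can be sketched as a three-pass placement. This is an illustrative model only, not the kernel's validation code: `place_buffers` and its tuple layout are hypothetical, sizes are in arbitrary units, and preferring VRAM for flexible buffers in the last pass is an assumption.

```python
VRAM, GART = "vram", "gart"


def place_buffers(buffers, vram_free, gart_free):
    """Place (name, size, allowed_domains) buffers for one submission.

    Returns {name: domain}. Raises ValueError when single-domain buffers
    do not fit, which the discussion treats as a userspace bug.
    """
    free = {VRAM: vram_free, GART: gart_free}
    placement = {}

    def take(name, size, domain):
        free[domain] -= size
        placement[name] = domain

    # Passes 1 and 2: buffers that accept exactly one domain.
    for domain in (VRAM, GART):
        for name, size, allowed in buffers:
            if set(allowed) == {domain}:
                if size > free[domain]:
                    raise ValueError(f"{name}: too many {domain}-only buffers")
                take(name, size, domain)

    # Pass 3: flexible buffers get what is left (VRAM first, assumed).
    for name, size, allowed in buffers:
        if set(allowed) == {VRAM, GART}:
            domain = VRAM if size <= free[VRAM] else GART
            if size > free[domain]:
                raise ValueError(f"{name}: no room in either domain")
            take(name, size, domain)
    return placement
```

With 100 units free in each domain, a 50-unit vram-only framebuffer leaves only 50 units of VRAM, so a 60-unit texture marked as "either" ends up in GART.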
17:48 NanoSector: speaking of gart, is it normal to have a gart of 1TB?
17:48 NanoSector: [ 2.562291] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
17:48 skeggsb: it's the size of the GPU address space
17:49 skeggsb: the message is a bit misleading, it comes from being AGP-only at one point
17:49 NanoSector: ah
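A quick check of the number in that dmesg line shows why it reads as the size of an address space rather than of any real aperture. Only the MiB value comes from the log; the variable names are illustrative.

```python
gart_mib = 1048576                      # value printed in the dmesg line
gart_bytes = gart_mib * 1024 * 1024     # MiB -> bytes
address_bits = gart_bytes.bit_length() - 1

# 1048576 MiB is exactly 2**40 bytes, i.e. 1 TiB or a 40-bit address range.
print(gart_bytes == 2 ** 40, address_bits)
```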
17:50 karolherbst: skeggsb: I see
17:51 karolherbst: skeggsb: so it would make sense to adjust userspace to somehow say: here it matters or not, to not hurt perf too bad and don't mark a buffer as vram only, because of perf concerns
17:57 karolherbst: skeggsb: what do you think about reporting vram usage to userspace?
17:57 karolherbst: never looked into that at all and have no clue how much work that would be for nouveau
18:18 karolherbst: I've drafted some project ideas I came up with for GSoC/EVoC: https://gist.githubusercontent.com/karolherbst/f80890aad3983cd37a502888229d0978/raw/d66a66a99c993146be19fb4f4a1b505d01d2295f/gistfile1.txt
18:18 karolherbst: mupuf: ^
18:18 mupuf: cool!
18:20 karolherbst: I also added the three old ones for completeness
18:21 karolherbst: Please tell me if some of those are good enough to be added to the Xorg wiki, so that I can simply add those there
18:21 imirkin_: skeggsb: my guess is we should look at what amdgpu does wrt vram/sysram placement, and copy that
18:23 karolherbst: and by the way: KHR-GL45.arrays_of_arrays_gl.SubroutineFunctionCalls1 shows a super annoying compiler bug regarding for loops
18:30 karolherbst: imirkin_: by any chance, do you know what is wrong with shaders@glsl-fs-lots-of-tex?
18:30 imirkin_: yeah. remove the test.
18:31 imirkin_: i thought it was gone finally
18:31 karolherbst: well, it passes on intel
18:31 karolherbst: that's why I am asking
18:31 imirkin_: both jekstrand and i sent patches to kill it
18:32 karolherbst: well, but it looks like we actually have a bug regarding that inside codegen or so
18:32 imirkin_: er no. i guess jekstrand sent a patch to adjust it. perhaps it was merged.
18:32 imirkin_: no.
18:32 imirkin_: the issue is a rounding one.
18:32 karolherbst: mhhh
18:32 karolherbst: 0.400000 vs 0.400000?
18:32 karolherbst: .-...
18:33 karolherbst: 0.400000 vs 0.266667
18:33 imirkin_: inside the texture evaluator
18:33 imirkin_: yep.
18:33 karolherbst: ohh, I see
18:33 imirkin_: the error vs what's expected gets magnified due to how the test is written.
18:33 karolherbst: mhh, I see
18:34 imirkin_: https://patchwork.freedesktop.org/patch/17718/
18:35 karolherbst: imirkin_: precise for the win maybe?
18:36 imirkin_: nope.
18:36 imirkin_: read the commit message.
18:36 karolherbst: ohh wait, there are simply adds
18:36 karolherbst: ohhhh
18:36 karolherbst: meh
18:36 karolherbst: k
18:37 imirkin_: jason changed it around a bit
18:37 imirkin_: but i don't think it addressed the source of the failure on nvidia
18:38 karolherbst: any knowledge about "spec@!opengl 1.0@gl-1.0-drawpixels-color-index" ?
18:38 imirkin_: yep
18:38 imirkin_: let that one go.
18:38 karolherbst: k
18:38 imirkin_: it's a st/mesa deficiency
18:39 imirkin_: apparently old-school GL supported indexed-color textures
18:39 karolherbst: we fail those amd_performance_monitor because of "Not enough free MP counter slots !"
18:39 imirkin_: this does not work well in combination with glDrawPixels()
18:39 karolherbst: ahh, I see
18:40 imirkin_: brian was nice enough to write the test, but not enough to fix the fail :)
18:40 karolherbst: the perf counter one?
18:40 imirkin_: the indexed color + drawpixels
18:40 karolherbst: k
18:42 karolherbst: "spec@arb_draw_indirect@arb_draw_indirect-draw-arrays-prim-restart" also kind of sounds unimportant
18:42 imirkin_: GL spec changed
18:43 karolherbst: k
18:43 imirkin_: 1-line fix in nouveau
18:43 imirkin_: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c#n1013
18:43 imirkin_: change that to a "0" :)
18:43 karolherbst: fun
18:44 imirkin_: feel free to send that in if it works...
18:44 imirkin_: i've been neglecting that bit of it
18:44 imirkin_: coz i felt it was lame to change GL specs
18:44 karolherbst: :D
18:44 karolherbst: okay, so 0 would be the correct thing in the end?
18:44 imirkin_: for GL 4.5, iirc, yes
18:44 imirkin_: GL 4.4 is the old way
18:44 karolherbst: I see :/
18:45 imirkin_: like i said, "kinda lame"
18:45 karolherbst: "spec@arb_vertex_type_2_10_10_10_rev@draw-vertices-2101010" got my interest now
18:46 imirkin_: everyone fails that one.
18:47 karolherbst: nice...
18:47 imirkin_: pretty sure blob does too
18:48 skeggsb: so, er, what happens to clients that were written against 4.4 in that case?!
18:48 karolherbst: tough luck I guess
18:48 imirkin_: skeggsb: i thought i mentioned "kinda lame"
18:48 imirkin_: skeggsb: in practice, nobody cares about primitive restart with draw-arrays
18:48 karolherbst: but shouldn't we do it the old way on 4.4 contexts?
18:48 imirkin_: so ... it's just for conformance, and some hw didn't support it, so they just said "fuck it"
18:49 skeggsb: heh
18:49 imirkin_: although why nvidia made it configurable is a little beyond me
18:50 karolherbst: oh wow, like 70% of our piglit fails is due to multisampling
18:50 imirkin_: but it works out well now.
18:50 imirkin_: karolherbst: those ext_multisample_framebuffer tests?
18:50 karolherbst: not only those
18:50 imirkin_: good times...
18:50 karolherbst: spec.!opengl1.1 as well
18:50 imirkin_: i picked away at them over time, used to be worse =]
18:50 imirkin_: yeah, but those are the resolve issue
18:51 karolherbst: arb_texture_rg as well
18:51 imirkin_: ?
18:51 karolherbst: multisample-formats X gl_arb_texture_rg-int
18:51 imirkin_: hm
18:51 imirkin_: i don't remember off the top of my head
18:51 karolherbst: ext_framebuffer_multisample_blit_scaled in addition to ext_framebuffer_multisample
18:51 karolherbst: ext_texture_integer
18:52 karolherbst: for fun reason the latter has tests like "multisample-formats X gl_ext_texture_integer"
18:52 karolherbst: so ext_texture_integer == arb_texture_rg?
18:52 karolherbst: maybe
18:52 imirkin_: certainly rg-int ;)
18:53 karolherbst: as far as I can tell, all of those fails sound like resolve issues on glReadPixels
18:54 karolherbst: maybe not those multisample-format fails
18:54 karolherbst: https://mini.karolherbst.de/nouveau/piglit/nve6-cts/nve6-cts-0393cd92c5e121d5f93ac227dcc7638070f77443/spec@ext_texture_integer@multisample-formats%202%20gl_ext_texture_integer.html
18:55 mupuf: karolherbst: I updated the frameretracer and the dynamic reclocking one: https://pastebin.com/HNf4vVFJ
18:55 mupuf: you will never learn that experimenting in the PMU is a complete waste of time :p
18:56 karolherbst: mupuf: thanks
18:56 karolherbst: nope
18:56 karolherbst: I won't :p
18:57 mupuf: C is already a pain for signal processing, so asking people to do it in a foreign ASM is just suicidal
18:57 karolherbst: ohhh right, we can't do it on fermi either
18:57 karolherbst: mupuf: hey.... I was only thinking about adding the counter readout in asm, which I basically already wrote
18:57 karolherbst: ;)
18:57 mupuf: did I write the code for it already?
18:57 karolherbst: and those can be read out in userspace
18:57 mupuf: oh yeah, you made another one
18:58 mupuf: yes!
18:58 karolherbst: yeah
18:58 mupuf: this is something we need to do!
18:58 karolherbst: yeah
18:58 karolherbst: I already sent patches
18:58 mupuf: I don't care where the code comes from
18:58 karolherbst: it is even configureable from the host
18:58 mupuf: but we need to expose that to the userspace to enable the reclock experiments
18:58 karolherbst: mupuf: https://github.com/karolherbst/nouveau/commits/pmu_counters_v2
18:59 mupuf: shower time first
18:59 mupuf: then I will have a look at the nv4x fan issue that has been reported to me
18:59 mupuf: and then back again to the one on nvc1
18:59 mupuf: after a year without touching this, I have forgotten some aspects of it...
18:59 mupuf: and I will have to re-do experiments
19:00 mupuf: at least, I have a good model, so a couple of sample points will be enough
19:01 mupuf: I had forgotten so much that I did not even recognize my code in nvbios...
19:04 karolherbst: :O
19:08 karolherbst: imirkin_: we also fail some atomic tests
19:11 karolherbst: and that subtest argument is totally broken
19:16 karolherbst: nice! one of those multisample fails shows a difference in apitrace, nice nice
19:34 karolherbst: imirkin_: do you know a fast hack to clear allocated buffers with 0s?
19:34 imirkin_: ->clear_buffer ?
19:35 karolherbst: well, I was more talking about adjusting just one place
19:35 karolherbst: like in the kernel or something
19:35 karolherbst: or did you mean inside libdrm?
19:36 imirkin_: i guess i'm not sure what the question is
19:37 karolherbst: well, I have the issue, that in qapitrace I see random content in surfaces and it would be cool if everything would be 0 by default
19:38 karolherbst: but I already kind of know what is going on
19:38 karolherbst: anyway
19:39 imirkin_: ah.
19:39 imirkin_: should be easy to do in mesa
19:39 karolherbst: intel: https://i.imgur.com/UvJ6LCN.png nouveau: https://i.imgur.com/2cjE8FJ.png
19:40 imirkin_: yeah
19:40 imirkin_: i remember those ;)
19:40 imirkin_: good luck!
19:40 karolherbst: :D
19:40 imirkin_: iirc i spent some time randomly banging on the keyboard
19:40 imirkin_: but if i fixed some, others then broke
19:41 imirkin_: i think i may have a better understanding of the pipeline now though, and perhaps could give it another shot
19:41 karolherbst: I think we just have multisample disabled on the blit
19:41 karolherbst: and that's why it's messed up
19:41 imirkin_: we fake it though
19:41 imirkin_: but the texture descriptor's MS thing isn't handled properly?
19:41 eliteqnvgb: p
19:42 karolherbst: I already had the code for that and threw it away...
19:49 karolherbst: imirkin_: is there a way to read the GL_DRAW_FRAMEBUFFER in qapitrace?
19:50 imirkin_: if it's not in the framebuffers view, then no
19:50 karolherbst: :/
19:50 karolherbst: okay
19:50 karolherbst: there are several glDrawArrays(mode = GL_TRIANGLES, first = 0, count = 3) calls drawing into that framebuffer
19:50 karolherbst: and then it's blit and the result is those pngs
19:52 karolherbst: would be cool to know if the blit goes wrong or if something is odd with the draws already
19:52 karolherbst: but
19:52 karolherbst: the blit is for 512 to 256
19:52 karolherbst: and those buffers are just 256x256
19:53 imirkin_: we do the blit as if the surfaces were not multi-sampled
19:53 karolherbst: okay
19:53 karolherbst: can the blit stretch the image?
19:54 imirkin_: there's a blit_scaled for msaa
19:54 imirkin_: [extension]
19:54 karolherbst: ahh
19:54 karolherbst: but I could just check how the src and dest is setup for the blit
19:56 karolherbst: uhhhhhhhh, buh buh buh
19:59 karolherbst: :/
20:00 karolherbst: imirkin_: you will totally love that
20:01 karolherbst: okay, now it is pixel perfect in apitrace
20:01 karolherbst: but we still fail the test, because glReadPixels still messes up
20:02 karolherbst: imirkin_: https://gist.github.com/karolherbst/546d62a8316aa9c3cdcf3d933685109b
20:02 imirkin_: lol
20:02 karolherbst: ;)
20:02 imirkin_: yeah, that one passes. now try the other ones.
20:04 karolherbst: imirkin_: the msaa 8 test of this also looks fine
20:05 karolherbst: but... the painful part is, the piglit test fails, because glReadPixels :( super annoying
20:07 karolherbst: it used the 3d blitter for this test
20:12 karolherbst: imirkin_: ohhh wait, I think I know what is going on
20:14 karolherbst: or not
20:15 imirkin_: yeah, so part of the trick is that the 2d blitter is sometimes used, other times the 3d blitter is used
20:15 imirkin_: loads of fun.
20:16 imirkin_: 3d blitter does texture lookups
20:16 imirkin_: on MSAA textures, but attempts to do it in a non-MSAA way
20:16 imirkin_: it's a bit of a hack ;)
20:16 imirkin_: should probably be adapted to use TXF instead of TEX
20:16 imirkin_: (or does it use TXF already?)
20:17 imirkin_: although of course that sucks for scaling
20:17 imirkin_: since you don't exactly want to implement filtering with TXF...
20:17 karolherbst: it seems to simply use tex 2D
20:19 imirkin_: you could break down and just use the u_blitter junk
20:19 imirkin_: which handles all these various things
20:19 imirkin_: i've resisted
20:19 karolherbst: mhh
20:19 karolherbst: well
20:20 karolherbst: we still need to fix glReadPixels first anyhow, which seems to be actually more annoying then to fix those blits
20:20 karolherbst: *than
20:20 karolherbst: sadly
20:24 karolherbst: imirkin_: also, what is so bad about the u_blitter stuff?
20:25 imirkin_: it's wasteful
20:26 karolherbst: okay well sure, but if it would work better than what we have now?
20:27 imirkin_: the path to hell is paved with good intentions
20:27 karolherbst: ahh
20:27 imirkin_: i'm a fan of the "leave it broken until it's properly fixed" approach rather than the "do a stupid thing to just fix it now because something better is coming Real Soon" approach
20:28 karolherbst: imirkin_: k, I see
20:28 imirkin_: since putting in crutches removes any impetus to properly resolve the issue
20:28 karolherbst: understandable
20:31 airlied: imirkin_: wasteful how?
20:31 imirkin_: airlied: doesn't use the 2D blitter
20:31 imirkin_: i suppose we could use it on the "3d blit" path only
20:32 karolherbst: mhh, then we end up always using the 3d blit, because it works and the 2d one doesn't
20:32 imirkin_: at which point ... well ... we cheat, whereas it can't cheat (since it's generic)
20:32 imirkin_: we make use of some things which are a bit nvidia-specific
20:32 imirkin_: karolherbst: actually 2d blit is the one that tends to work :)
20:33 karolherbst: ahh I see
20:33 imirkin_: anyways, u_blitter has a proper resolve... which is probably good
20:33 imirkin_: whereas we just do a scaled blit and hope for the best, which tends to work out nicely
20:33 imirkin_: except in some cases, apparently ;)
20:34 airlied: do you need shader based resolve?
20:34 karolherbst: airlied: glReadPixels doesn't return the right values
20:34 imirkin_: maybe?
20:34 imirkin_: karolherbst: that's entirely unrelated.
20:34 karolherbst: right, true
20:34 airlied: imirkin_: so 2d blitter can do resolve?
20:34 imirkin_: airlied: it can do scaled blits. and the samples are laid out in a grid.
20:35 imirkin_: so ... not perfect :)
20:35 imirkin_: for a high-quality resolve, we definitely need a shader.
20:35 airlied: does it do integer properly?
20:35 imirkin_: almost certainly not
20:36 airlied: since integer resolve is pick first sample
20:36 imirkin_: yeah
20:36 imirkin_: i don't remember if 2d blit does integer at all tbh
20:37 airlied: radv uses a compute shader if the hw resolve doesnt work
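The difference airlied points out between integer and non-integer resolves can be written down per pixel. A minimal sketch: `resolve_pixel` is a hypothetical helper, "pick first sample" for integer formats comes from the discussion, and plain averaging for other formats is assumed here as the usual box-filter resolve.

```python
def resolve_pixel(samples, integer_format):
    """Resolve one multisampled pixel to a single value.

    Integer formats take the first sample, since averaging integers is
    not meaningful. Other formats are averaged (assumed box filter).
    """
    if integer_format:
        return samples[0]
    return sum(samples) / len(samples)
```

A scaled blit over samples laid out in a grid approximates the averaging case, which is why it tends to work for normal color formats but cannot be right for integer ones.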
