12:41 linkmauve: I… might have bought my first Nvidia GPU since the Switch, a 1070, alongside a full computer.
12:41 linkmauve: I guess I’ll exchange it for an AMD one.
12:52 imirkin: linkmauve: like you slipped and fell and poof - you had bought an nvidia gpu?
12:52 linkmauve: Yes. :(
12:52 linkmauve: Someone offered me an i7-8700k for 50€, and then added a 1070 for an additional 50€. :(
12:53 imirkin: that's quite the slippery slope
12:53 cosurgi: linkes nouveau drivers. Almost 99.(9)% stable, except for 'echo off > /sys/devices/system/cpu/smt/control' (don't do that)
12:53 imirkin: i can see how one might fall
12:53 cosurgi: *likes :)
12:53 linkmauve: I think it might be worth it if I can exchange it for any AMD GPU to someone, instead of having to pay euros for one.
12:54 cosurgi: ... and steam games - only those with low graphics requirements are working ;)
12:56 imirkin: linkmauve: definitely
13:11 imirkin: linkmauve: if, in the meanwhile, you happen to plug it in and use nouveau, feel free to report any issues
13:12 linkmauve: imirkin, it wouldn’t be of any use for 3D stuff right?
13:12 imirkin: depends on one's definition of 'useful'
13:12 imirkin: if that definition is 'cuda', then no
13:13 imirkin: or 'vulkan'
13:13 linkmauve: Mine would be OpenGL.
13:14 imirkin: should work...
13:14 imirkin: we pass all GL 4.5 CTS tests individually, but still can't make it through a full run for some reason
13:14 linkmauve: But then it’s locked down to non-usable frequencies right?
13:14 imirkin: usable
13:15 imirkin: but the lowest ones, yes
13:15 linkmauve: Unlike the Switch on which there is no firmware issue.
13:15 imirkin: correct.
13:15 linkmauve: And we can use the full performance.
13:15 orbea: is it a matter of becoming unstable over time or do certain tests react badly in combination with others?
13:15 imirkin: as full as you can get by just increasing clocks, yes
13:15 linkmauve: I’d bet the Switch would give me more performance than this 1070.
13:15 imirkin: i wouldn't bet that
13:15 imirkin: orbea: that's a great question. next question?
13:16 orbea: heh
13:17 linkmauve: Oh, interesting.
13:19 linkmauve: I’ll test how it compares to my UHD620 then. :)
13:19 imirkin: should beat it, i'd think
13:20 imirkin: even at the piddly low freq
13:25 linkmauve: Interesting too!
13:47 cosurgi: linkmauve: I used it for OpenGL, nothing fancy. My software draws less than 1e4 triangles. And it works.
13:47 cosurgi: *use it :)
13:47 cosurgi: not textures or anything. Just some simple lighting.
13:48 linkmauve: Most of the software I plan on using, games, uses textures and stuff.
13:48 linkmauve: No idea how many triangles, or how complex shaders.
13:48 cosurgi: nah. I tried games. Forget it ;)
13:48 cosurgi: only 'this war of mine' and 'neo scavenger' work.
13:49 cosurgi: 'cities skylines' - unplayable framerate at smallest window size 800x600
13:49 cosurgi: 04:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
13:49 cosurgi: that's what I have.
13:50 linkmauve: :(
13:50 linkmauve: PyTouhou gets me about 2000 fps on my UHD 620, hopefully it’ll reach 60 fps at least.
13:50 linkmauve: Although, given the CPU, it might reach that with llvmpipe as well.
13:51 cosurgi: But my OpenGL code which draws just a couple of triangles works great ;) List of videos which I recorded using nouveau: https://yade-dem.org/doc/tutorial-more-examples-fast.html
14:01 imirkin: cosurgi: you also have 3x 4k monitors. that's not the most usual setup.
14:02 imirkin: might not be able to push that many pixels of a 3d game with nouveau :)
14:06 cosurgi: heh, yeah ;)
14:07 imirkin: not to mention rotation
14:07 imirkin: which introduces extra copies
14:08 linkmauve: Oh, you do a copy in the display controller in that case?
14:08 imirkin: rotation isn't supported natively
14:09 imirkin: normally you have one big happy X framebuffer
14:09 linkmauve: Interesting.
14:09 imirkin: and that's just sitting on the GPU
14:09 linkmauve: Ah, on Xorg.
14:09 imirkin: and scanout happens from various parts of it
14:09 imirkin: with rotation, that no longer works, so you have to copy stuff out
14:09 imirkin: (unless the GPU supports rotation natively, but it doesn't)
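A minimal sketch of why rotation forces a copy, using a toy framebuffer stored as a flat Python list. Nothing here is nouveau code; the function names and the layout are made up for illustration. An unrotated head only needs a start offset and the pitch of the shared buffer, so scanout can read in place. A head rotated by 90 degrees needs the pixels in a different order, so they have to be written into a second buffer first.

```python
def window_descriptor(fb_width, x, y, w, h):
    """Unrotated head: describe a w x h window inside the shared framebuffer.

    No pixel is moved; scanout just starts at `offset` and steps by `pitch`.
    """
    return {"offset": y * fb_width + x, "pitch": fb_width, "width": w, "height": h}


def rotate_90_copy(fb, fb_width, x, y, w, h):
    """Rotated head: reorder the window's pixels into a new buffer (the extra copy).

    The result is a flat list holding w rows of h pixels each,
    i.e. the window rotated clockwise by 90 degrees.
    """
    out = []
    for c in range(w):                  # each source column becomes an output row
        for r in range(h - 1, -1, -1):  # read the column bottom-up for clockwise rotation
            out.append(fb[(y + r) * fb_width + (x + c)])
    return out
```

With a 4-pixel-wide framebuffer `list(range(12))`, the 2x3 window at the origin holds the rows `[0, 1]`, `[4, 5]`, `[8, 9]`; the rotated copy stores `[8, 4, 0]` followed by `[9, 5, 1]`.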
14:10 linkmauve: That won’t be an issue with Weston right?
14:10 imirkin: yes and no?
14:10 imirkin: with weston, you always copy :)
14:10 imirkin: so rotation isn't special
14:11 linkmauve: Uh, why would you?
14:11 linkmauve: You can’t scanout from a dmabuf buffer?
14:11 imirkin: sure
14:11 imirkin: (some)
14:11 imirkin: but how does it get into that buffer?
14:12 imirkin: you don't have some persistent permanent "root" buffer like you do with X
14:13 linkmauve: Ah, you mean in the compositing case, I meant in the case where you can scanout a dmabuf to an overlay or primary plane.
14:13 imirkin: right, so like never
14:13 linkmauve: You can’t do that?
14:14 linkmauve: s/You/Nouveau/
14:14 imirkin: heh
14:14 imirkin: well think about it
14:14 imirkin: you have 2x rotated screens
14:14 imirkin: a virtual 2400x1920 display or whatever
14:14 imirkin: you have a full-screen GL application
14:14 imirkin: which is creating a 2400x1920 image
14:14 imirkin: in a single dma-buf thing
14:14 imirkin: what's the next step?
14:14 linkmauve: Could you run https://github.com/ascent12/drm_info on some recent-ish Nouveau GPU?
14:15 linkmauve: Why virtual?
14:15 imirkin: i have an old kernel, so it won't have some stuff
14:15 linkmauve: Ok.
14:15 imirkin: well, i just mean logical
14:15 imirkin: the logical screen size is 2400x1920 (2x 1200x1920)
14:15 linkmauve: Ah, in that case you could configure the two primary planes to crop said dmabuf right?
14:15 imirkin: and the GL application is across both screens
14:15 imirkin: but that requires native rotation support
14:15 imirkin: and i'm not sure even that would work
14:16 linkmauve: On i915 I have │ ├───"rotation": bitmask {rotate-0, rotate-90, rotate-180, rotate-270} = (rotate-0 | rotate-90)
14:16 imirkin: i915 has native rotation support
14:16 imirkin: nouveau does not.
14:16 linkmauve: Ok.
14:16 imirkin: (well, nvidia gpu's)
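A small sketch of how a "rotation" value like the one drm_info prints above could be decoded. The bit positions follow the DRM_MODE_ROTATE_* and DRM_MODE_REFLECT_* definitions in the kernel's DRM uAPI headers; the helper names are hypothetical and the lowercase labels simply mirror the drm_info output.

```python
# Bit index -> label, following DRM_MODE_ROTATE_* / DRM_MODE_REFLECT_* in the DRM uAPI.
ROTATION_BITS = {
    0: "rotate-0",
    1: "rotate-90",
    2: "rotate-180",
    3: "rotate-270",
    4: "reflect-x",
    5: "reflect-y",
}


def decode_rotation(mask):
    """Return the labels of all bits set in a rotation bitmask, lowest bit first."""
    return [name for bit, name in ROTATION_BITS.items() if mask & (1 << bit)]


def has_native_rotation(supported_mask):
    """True if the plane advertises any rotation other than rotate-0."""
    return bool(supported_mask & 0b1110)
```

The i915 value quoted above, `(rotate-0 | rotate-90)`, corresponds to the mask `0x3`. A plane that only advertises rotate-0 (mask `0x1`) would count as having no native rotation in this sketch.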
14:18 imirkin: emersion has some nvidia listings btw
14:20 linkmauve: https://drmdb.emersion.fr fancy!
14:20 imirkin: quite
14:20 linkmauve: https://drmdb.emersion.fr/devices/0dbcf7336b50 for instance.
14:21 imirkin: looks right
14:21 imirkin: and it has the 16F formats - nice
14:21 linkmauve: Oh, your primary planes don’t do YUYV, only the overlay planes?
14:21 imirkin: correct
14:21 imirkin: and the overlay planes aren't entirely the most useful things in the world
14:21 imirkin: since they only allow horizontal scaling
14:22 imirkin: (starting with G80)
14:22 linkmauve: Oh. :(
14:22 imirkin: welcome to desktop :)
14:23 linkmauve: i915’s planes are super useful.
14:23 imirkin: i915's embedded :p
14:23 imirkin: when they make a dgpu, let's talk
14:23 linkmauve: imirkin, but on a non-hidpi monitor, your overlay planes would still allow you to avoid a composition altogether.
14:23 imirkin: there's only one overlay plane
14:23 imirkin: and there's no alpha
14:23 linkmauve: Same on i915.
14:24 imirkin: there are colorkeys, but those aren't exposed nicely by drm
14:24 imirkin: volta+ has fancier things
14:24 imirkin: like blending
14:24 linkmauve: Firefox for instance is declared with full opacity, so it can get promoted to an overlay plane here.
14:25 imirkin: yeah, i mean that _should_ work fine
14:25 imirkin: although to be perfectly honest, overlays aren't the most tested of things
14:25 imirkin: since nothing actually uses them
14:25 linkmauve: Which marketing name is Volta?
14:25 imirkin: not a lot of volta's out there
14:25 linkmauve: Well, most Wayland compositors should be using them nowadays.
14:25 imirkin: turing is more likely
14:25 linkmauve: Ok.
14:25 imirkin: e.g. RTX series
14:25 imirkin: as well as the GeForce 16xx series
14:26 imirkin: i forget what the volta code was, but it wasn't really a consumer gpu
14:26 linkmauve: Ok.
14:29 imirkin: also, atomic is supported actually
14:29 imirkin: just not exposed by default
14:29 linkmauve: Any reason for it not to be?
14:29 imirkin: fear of the gathering darkness
14:29 linkmauve: Heh. :)
14:30 imirkin: maybe if you call skeggsb chicken enough times, he'll turn it on, just to prove you wrong
14:30 linkmauve: :D
14:30 linkmauve: I guess I’ll enable it on my Switch first.
14:31 imirkin: switch doesn't use nouveau drm
14:31 imirkin: for two reasons
14:31 imirkin: #1: it uses tegra drm
14:31 linkmauve: Uh, I (think I) do.
14:31 linkmauve: Oh.
14:31 imirkin: #2: if you're using the blobs, you're using whatever the blobs give you
14:31 fincs: We use a fake libdrm_nouveau on Switch :)
14:31 linkmauve: I’m using whatever was shipped by ArchLinux’s kernel.
14:31 imirkin: ah
14:31 imirkin: well the hardware is tegradrm
14:32 imirkin: separate display controller from what's on desktop gpu's
14:32 linkmauve: Ok.
14:32 imirkin: which is more embedded friendly
14:32 imirkin: supports rotation iirc ;)
14:32 linkmauve: \o/
14:32 imirkin: emersion's page doesn't seem to show tegra =/
14:32 imirkin: submit something? :)
14:33 linkmauve: Ok!
14:34 linkmauve: I see it also doesn’t have any Freedreno sample.
14:34 imirkin: i expect he'd be happy to take whatever people send him
14:34 imirkin: and he has a number of other embedded results in there
14:41 emersion: yeah, feel free to submit new devices :)
14:44 linkmauve: As soon as I get ssh back…
14:44 linkmauve: The wifi on these devices is so terrible, as is the lack of Ethernet ports…
15:02 HdkR: imirkin: Volta was Titan V and V100 depending on "Geforce", Tesla, or Quadro lineup
15:03 HdkR: Titan V never had Geforce, but it still lived outside of all three series
15:03 HdkR: "Titan" class
15:07 imirkin: right. i couldn't remember Titan V. I remembered V100, but that's not exactly available at best buy :)
15:07 imirkin: with nvlink and all that
15:07 imirkin: (titan v probably didn't make much of an appearance at best buy either, but at least it could have)
15:08 HdkR: Oh yea, gotta love that overpriced NVLink connector for the Quadro lineup
15:08 imirkin: V100 was the on-motherboard-only thing, no?
15:09 HdkR: Quadro GV100 and Tesla V100 were available
15:09 imirkin: oh ok
15:09 imirkin: what was the power nvlink thing?
15:09 imirkin: am i making things up?
15:10 HdkR: You're probably thinking of the NVSwitch in the DGX?
15:10 imirkin: maybe.
15:10 HdkR: Allows you to link 16 devices through NVLink
15:10 HdkR: Full throughput through the switch
15:18 imirkin: i remember something specifically with POWER
15:21 HdkR: Oh
15:21 HdkR: Right, Power connecting directly to the GPUs with NVLink
15:23 HdkR: I don't recall if that ever continued past the P100 it shipped with
15:24 imirkin: sounded a bit crazy
15:25 HdkR: Looks like Power9 shipped with some NVLink 2.0 things, so it must have
15:26 HdkR: Sadly CXL as an interconnect doesn't quite have the bandwidth to stand up to NVLink
16:37 fincs: Is it known what this register does? https://github.com/mesa3d/mesa/blob/master/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h#L191-L194
16:38 HdkR: It's clearly described, UNK02EC :P
16:39 fincs: Yeah, that's a really good indication as to what it does
16:39 fincs: I've observed bit4 gets set if both conservative raster is enabled and dilate is different from 0.0f
16:40 karolherbst: fincs: it's not used anywhere, right?
16:40 fincs: It is, see above
16:40 karolherbst: ohh...
16:40 imirkin: no clue about any conservative raster stuff
16:40 imirkin: there are some piglit tests though
16:40 imirkin: not sure how exhaustive they are
16:41 fincs: It's really "easy" to implement
16:41 HdkR: Piglit might be a bit conservative with its testing there
16:41 imirkin: but to check whether it does something?
16:41 imirkin: HdkR: nah, it's a lot more raster
16:41 HdkR: :D
16:41 fincs: The only weirdo thing is that you need to write to 0x418800 bit23..24 for the dilate
17:15 imirkin: fincs: yeah, with the firmware ... nouveau does do that, iirc
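A generic read-modify-write sketch for a bitfield such as bit23..24 of 0x418800. This is only bit arithmetic on a 32-bit value: how a dilate setting maps onto the two bits is not stated above, so the field value passed in is a placeholder, and the helper name is hypothetical. Actually reading or writing the register is out of scope here.

```python
def set_field(reg, hi, lo, value):
    """Return `reg` with bits lo..hi (inclusive) replaced by `value`.

    `reg` is treated as a 32-bit register value; all other bits are preserved.
    """
    width = hi - lo + 1
    if value < 0 or value >> width:
        raise ValueError("value does not fit in the field")
    mask = ((1 << width) - 1) << lo
    return ((reg & ~mask) | (value << lo)) & 0xFFFFFFFF


# Placeholder encoding: put 0b11 into bits 23..24 of a register that currently reads 0.
new_value = set_field(0x00000000, 24, 23, 0b11)
```

Here `new_value` is `0x01800000`, and clearing the same field in `0xFFFFFFFF` yields `0xFE7FFFFF`.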
17:17 pendingchaos: I'm not sure if the conservative raster piglit tests ever got merged
17:18 pendingchaos: IIRC they rendered slightly different on the nvidia driver (even when conservative raster was disabled) which caused them to fail
17:19 imirkin: ah =/
17:19 imirkin: did you figure out why? were the sample positions different or something?
17:19 imirkin: perhaps raster is flipped somehow (they use that screen control thing)
17:22 pendingchaos: no, I don't think I did
17:23 imirkin: hm ok. well now i'm on a gp108, i guess i could glance at it. but i don't run the blob too much...
17:23 imirkin: still have to do something useful with passthrough gs ... that kinda stalled
18:05 AndrewR: imirkin, I tried to trace my self-compiled blender 2.79b and I got this file (xz compressed) ... https://yadi.sk/d/HzE2QDqxlRH3YA (~5 mb)
18:39 imirkin: AndrewR: cool. This is G92 right?
18:41 AndrewR: imirkin, yes
18:41 imirkin: and what's the issue?
18:42 imirkin: seems to retrace fine on the G84
18:42 AndrewR: imirkin, some areas in preferences, some renderers, and the file chooser window show a lot of flickering triangles, wrong colors, missing text ... hard to use!
18:43 imirkin: do you see them when retracing the trace?
18:43 AndrewR: imirkin, yes ...
18:43 imirkin: can i see a screenshot of the issue?
18:43 imirkin: of the glretrace of the file you sent
18:43 imirkin: oh whoa
18:43 imirkin: wait
18:44 imirkin: with mesa head, i see it
18:44 imirkin: with mesa 20.0.2 i don't
18:44 imirkin: what version are you using?
18:47 AndrewR: imirkin, head ..
18:47 AndrewR: git-f13049f48a
18:58 imirkin: AndrewR: ah yeah. can you see if it happens for you with mesa 20.0.x?
18:58 imirkin: i think it's something recent that broke it. i'm gonna go with either me enabling the multi draw elements with start indices thing
18:58 AndrewR: imirkin, guess for this I need to check out and recompile this branch (need some time for this)
18:58 imirkin: or mareko for messing with vbo code :)
18:58 imirkin: hold on
18:58 imirkin: let me quickly test one thing
19:00 imirkin: screwed up on nvc0 too btw
19:01 imirkin: nope, not coz of PIPE_CAP_DRAW_INFO_START_WITH_USER_INDICES
19:01 imirkin: a bisect will be needed
19:01 imirkin: if you're not up to it, i can do it later
19:01 AndrewR: imirkin, I'll try, but not sure who will do it faster :}
19:06 AndrewR: imirkin, set to 20.0-branchpoint, recompiling ....
19:11 imirkin: the fail doesn't seem nv50-specific, which is nice
19:12 AndrewR: imirkin, MESA_DEBUG=flush 'fixes' it (at cost of performance, I guess?)
19:15 imirkin: yeah. probably some of marek's changes
19:16 imirkin: not to pick on him too much -- he's coming up with real optimizations - but it's hard to account for everything sometimes
19:29 AndrewR: imirkin, is there no way to get the shared library in meson's directory (to be picked up by libgl via LIBGL_DRIVERS_PATH)?
19:30 imirkin: i dunno, maybe
19:30 imirkin: look for it?
19:31 imirkin: i just install + LD_LIBRARY_PATH
19:31 imirkin: to a temp dir
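A hedged sketch of the "install to a temp dir + LD_LIBRARY_PATH" approach, building the environment for a test run. The directory layout (`<prefix>/lib` and `<prefix>/lib/dri`) is an assumption; depending on the distro and meson options the libdir may instead be `lib64` or a multiarch directory. The function name is hypothetical.

```python
import os


def mesa_test_env(prefix, base_env=None):
    """Return a copy of the environment that points the loader at a private Mesa install.

    Assumes libraries live in <prefix>/lib and DRI drivers in <prefix>/lib/dri.
    """
    env = dict(os.environ if base_env is None else base_env)
    libdir = os.path.join(prefix, "lib")
    old = env.get("LD_LIBRARY_PATH")
    # Prepend so the private libGL wins over the system one.
    env["LD_LIBRARY_PATH"] = libdir if not old else libdir + os.pathsep + old
    env["LIBGL_DRIVERS_PATH"] = os.path.join(libdir, "dri")
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` when replaying a trace with glretrace, so the system-wide Mesa stays untouched.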
19:35 AndrewR: imirkin, installed it globally ... 20.0.0-devel (git-58fd26c433) seems to be good, bisecting
19:35 imirkin: cool
20:04 imirkin: AndrewR: let me know what progress you make whenever you're done
20:04 imirkin: that way i don't have to replicate your effort, should you not have finished
20:05 AndrewR: imirkin, ok ...
20:51 AndrewR: imirkin, # bad: [56e15bf31c0a88d220d5907a533d59ca6341d96a] iris: Use ISL_AUX_USAGE_STC_CCS for stencil CCS and # bad: [ebfa899089b89c5765914dd9775dcc90bc391b7f] gitlab-ci: Skip dEQP-GLES3.functional.shaders.derivate.*
21:13 imirkin: AndrewR: and good?
21:13 imirkin: or it's all bad so far? :)
21:55 AndrewR: imirkin, good: [58fd26c4332aa3f3ffd34d99d996ba62476d5acd] turnip: Fix vkCmdCopyQueryPoolResults with available flag
21:56 AndrewR: # bad: [7e2b4bf256610cc016202893d7b4b4ef60b25b53] radeonsi: don't wait for shader compilation to finish when destroying a context
22:02 imirkin: AndrewR: fwiw it doesn't repro using llvmpipe
22:12 AndrewR: imirkin, # good: [451cf228d53ba8f51beb3dcf04370e126fb7ccb6] svga: Fix banded DMA upload
22:25 AndrewR: imirkin, # bad: [cd7241c4f8082dbd07f0bcd268741c527512c66b] vbo: pass only either uint32_t or uint64_t into ATTR_UNION
22:38 AndrewR: imirkin, # good: [7283c33b981f975361e3bfa62a339c88f2642cbb] Vulkan overlay: use the corresponding image index for each swapchain
22:46 AndrewR: imirkin, # good: [32d3435a78675ff5ebf933d45b9b99fdc4dc7d82] r600/sfn: Add GDS instructions
22:48 imirkin: getting close, i think?
22:59 AndrewR: imirkin, # good: [27dada7ce90315d47184c51879a3f67e99f2bab2] mesa: remove FLUSH_CURRENT calls that have no effect - yeah, just few more steps ...
23:07 imirkin: compiles should also be getting faster, hopefully
23:17 AndrewR: imirkin, # bad: [2b22e33c10f98f2f58101881818f55b4c4b73606] vbo: remove immediate mode code that doesn't do anything and simplify stuff - not sure, at this point even more corruption appears, it seems
23:18 imirkin: could be a later thing fixes something =/
23:18 imirkin: we can work that out later
23:18 imirkin: for now let's just say corruption = bad
23:24 AndrewR: imirkin, # good: [10cf7a5113446c85dd39bbb12544dd4ac30a0200] vbo: create the immediate mode buffer only in vbo_exec_vtx_map
23:27 AndrewR: imirkin, # bad: [3e0d612f5e22fee19aff0e40814db24d63f63103] vbo: don't unmap persistent buffer mappings for glBegin/End
23:30 AndrewR: imirkin, 03ded3d6ce37d3be12776bcc5dcd3c4d91f33248 is the first bad commit - vbo: skip FlushMappedBufferRange for glBegin/End by using a persistent mapping
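The good/bad narrowing above is a binary search over the commit history. A minimal sketch of that search, assuming a linear list of commits ordered oldest to newest, a known-good first entry, a known-bad last entry, and a single good-to-bad transition (the "corruption = bad" rule). The commit labels and the `is_bad` callback are hypothetical stand-ins for "build, retrace, look for corruption".

```python
def first_bad(commits, is_bad):
    """Find the first bad commit by bisection.

    commits[0] must be good and commits[-1] must be bad.
    Returns (first_bad_commit, number_of_commits_tested).
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # the breakage is at mid or earlier
        else:
            good = mid   # the breakage is after mid
        steps += 1
    return commits[bad], steps


# Hypothetical history of 1000 commits where commit number 700 introduced the bug.
history = ["c%d" % i for i in range(1000)]
culprit, tested = first_bad(history, lambda c: int(c[1:]) >= 700)
```

Each test halves the remaining range, which is why about ten builds are enough for roughly a thousand commits.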
23:31 imirkin: sigh
23:32 imirkin: let's see if disabling this fixes everything
23:32 imirkin: can you try doing the revert on top of master?
23:37 AndrewR: imirkin, sorry, no clear automatic revert on this one ...
23:41 imirkin: ok
23:41 imirkin: i'll play with it
23:49 AndrewR: imirkin, sorry, I think I need some sleep ...
23:49 imirkin: yeah, you've already done a bunch of the work here
23:49 imirkin: unfortunately my memory of coherent bo mappings + vbo is ... that there are problems
23:50 imirkin: equally unfortunate is that these issues don't lend themselves to debugging
23:50 imirkin: since they're basically heisenbugs
23:50 imirkin: if you try to look at the state, poof, they're gone