12:41linkmauve: I… might have bought my first Nvidia GPU since the Switch, a 1070, alongside a full computer.
12:41linkmauve: I guess I’ll exchange it for an AMD one.
12:52imirkin: linkmauve: like you slipped and fell and poof - you had bought an nvidia gpu?
12:52linkmauve: Yes. :(
12:52linkmauve: Someone offered me an i7-8700k for 50€, and then added a 1070 for an additional 50€. :(
12:53imirkin: that's quite the slippery slope
12:53cosurgi:linkes nouveau drivers. Almost 99.(9)% stable, except for 'echo off > /sys/devices/system/cpu/smt/control' (don't do that)
12:53imirkin: i can see how one might fall
12:53cosurgi: *likes :)
12:53linkmauve: I think it might be worth it if I can exchange it for any AMD GPU to someone, instead of having to pay euros for one.
12:54cosurgi: ... and steam games - only those with low graphics requirements are working ;)
12:56imirkin: linkmauve: definitely
13:11imirkin: linkmauve: if, in the meanwhile, you happen to plug it in and use nouveau, feel free to report any issues
13:12linkmauve: imirkin, it wouldn’t be any useful for 3D stuff right?
13:12imirkin: depends on one's definition of 'useful'
13:12imirkin: if that definition is 'cuda', then no
13:13imirkin: or 'vulkan'
13:13linkmauve: Mine would be OpenGL.
13:14imirkin: should work...
13:14imirkin: we pass all GL 4.5 CTS tests individually, but still can't make it through a full run for some reason
13:14linkmauve: But then it’s locked down to non-usable frequencies right?
13:14imirkin: usable
13:15imirkin: but the lowest ones, yes
13:15linkmauve: Unlike the Switch on which there is no firmware issue.
13:15imirkin: correct.
13:15linkmauve: And we can use the full performances.
13:15orbea: is it a matter of becoming unstable over time or do certain tests react badly in combination with others?
13:15imirkin: as full as you can get by just increasing clocks, yes
13:15linkmauve: I’d bet the Switch would give me more performances than this 1070.
13:15imirkin: i wouldn't bet that
13:15imirkin: orbea: that's a great question. next question?
13:16orbea: heh
13:17linkmauve: Oh, interesting.
13:19linkmauve: I’ll test how it compares to my UHD620 then. :)
13:19imirkin: should beat it, i'd think
13:20imirkin: even at the piddly low freq
13:25linkmauve: Interesting too!
13:47cosurgi: linkmauve: I used it for OpenGL, nothing fancy. My software draws less than 1e4 triangles. And it works.
13:47cosurgi: *use it :)
13:47cosurgi: not textures or anything. Just some simple lighting.
13:48linkmauve: Most of the software I plan on using, games, uses textures and stuff.
13:48linkmauve: No idea how many triangles, or how complex shaders.
13:48cosurgi: nah. I tried games. Forget it ;)
13:48cosurgi: only 'this war of mine' and 'neo scavenger' work.
13:49cosurgi: 'cities skylines' - unplayable framerate at smallest window size 800x600
13:49cosurgi: 04:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
13:49cosurgi: that's what I have.
13:50linkmauve: :(
13:50linkmauve: PyTouhou gets me about 2000 fps on my UHD 620, hopefully it’ll reach 60 fps at least.
13:50linkmauve: Although, given the CPU, it might reach that with llvmpipe as well.
13:51cosurgi: But my OpenGL code which draws just a couple of triangles works great ;) List of videos which I recorded using nouveau: https://yade-dem.org/doc/tutorial-more-examples-fast.html
14:01imirkin: cosurgi: you also have 3x 4k monitors. that's not the most usual setup.
14:02imirkin: might not be able to push that many pixels of a 3d game with nouveau :)
14:06cosurgi: heh, yeah ;)
14:07imirkin: not to mention rotation
14:07imirkin: which introduces extra copies
14:08linkmauve: Oh, you do a copy in the display controller in that case?
14:08imirkin: rotation isn't supported natively
14:09imirkin: normally you have one big happy X framebuffer
14:09linkmauve: Interesting.
14:09imirkin: and that's just sitting on the GPU
14:09linkmauve: Ah, on Xorg.
14:09imirkin: and scanout happens from various parts of it
14:09imirkin: with rotation, that no longer works, so you have to copy stuff out
14:09imirkin: (unless the GPU supports rotation natively, but it doesn't)
14:10linkmauve: That won’t be an issue with Weston right?
14:10imirkin: yes and no?
14:10imirkin: with weston, you always copy :)
14:10imirkin: so rotation isn't special
14:11linkmauve: Uh, why would you?
14:11linkmauve: You can’t scanout from a dmabuf buffer?
14:11imirkin: sure
14:11imirkin: (some)
14:11imirkin: but how does it get into that buffer?
14:12imirkin: you don't have some persistent permanent "root" buffer like you do with X
14:13linkmauve: Ah, you mean in the compositing case, I meant in the case where you can scanout a dmabuf to an overlay or primary plane.
14:13imirkin: right, so like never
14:13linkmauve: You can’t do that?
14:14linkmauve: s/You/Nouveau/
14:14imirkin: heh
14:14imirkin: well think about it
14:14imirkin: you have 2x rotated screens
14:14imirkin: a virtual 2400x1920 display or whatever
14:14imirkin: you have a full-screen GL application
14:14imirkin: which is creating a 2400x1920 image
14:14imirkin: in a single dma-buf thing
14:14imirkin: what's the next step?
14:14linkmauve: Could you run https://github.com/ascent12/drm_info on some recent-ish Nouveau GPU?
14:15linkmauve: Why virtual?
14:15imirkin: i have an old kernel, so it won't have some stuff
14:15linkmauve: Ok.
14:15imirkin: well, i just mean logical
14:15imirkin: the logical screen size is 2400x1920 (2x 1200x1920)
14:15linkmauve: Ah, in that case you could configure the two primary planes to crop said dmabuf right?
14:15imirkin: and the GL application is across both screens
14:15imirkin: but that requires native rotation support
14:15imirkin: and i'm not sure even that would work
14:16linkmauve: On i915 I have │ ├───"rotation": bitmask {rotate-0, rotate-90, rotate-180, rotate-270} = (rotate-0 | rotate-90)
14:16imirkin: i915 has native rotation support
14:16imirkin: nouveau does not.
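The two-screen example above can be sketched as plain arithmetic. This is a minimal illustration, not anything nouveau or a compositor actually runs: `Rect`, `layout_side_by_side` and `needs_rotation` are hypothetical names, and the 1920x1200 native panel mode is an assumption inferred from the "2x 1200x1920" rotated layout. It shows that cropping one shared buffer per output is easy, and that the remaining problem is the 90-degree rotation each crop would need at scanout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int


def layout_side_by_side(logical_sizes):
    """Place outputs left to right; return (logical screen size, crop rects)."""
    rects = []
    x = 0
    for w, h in logical_sizes:
        rects.append(Rect(x, 0, w, h))
        x += w
    total_h = max(h for _, h in logical_sizes)
    return (x, total_h), rects


def needs_rotation(crop, native_mode):
    """True when the crop is the native mode with width and height swapped."""
    native_w, native_h = native_mode
    return (crop.w, crop.h) != (native_w, native_h) and (crop.w, crop.h) == (native_h, native_w)


# Two rotated outputs, each 1200x1920 in the logical layout.
screen, crops = layout_side_by_side([(1200, 1920), (1200, 1920)])
print(screen)
for crop in crops:
    # Assumed native mode: 1920x1200 (hypothetical, not stated in the log).
    print(crop, needs_rotation(crop, native_mode=(1920, 1200)))
```

Each crop is portrait while the assumed panel mode is landscape, so either the plane rotates while scanning out (what i915 advertises through its "rotation" property) or something has to write a rotated copy first.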
14:16linkmauve: Ok.
14:16imirkin: (well, nvidia gpu's)
14:18imirkin: emersion has some nvidia listings btw
14:20linkmauve: https://drmdb.emersion.fr fancy!
14:20imirkin: quite
14:20linkmauve: https://drmdb.emersion.fr/devices/0dbcf7336b50 for instance.
14:21imirkin: looks right
14:21imirkin: and it has the 16F formats - nice
14:21linkmauve: Oh, your primary planes don’t do YUYV, only the overlay planes?
14:21imirkin: correct
14:21imirkin: and the overlay planes aren't entirely the most useful things in the world
14:21imirkin: since they only allow horizontal scaling
14:22imirkin: (starting with G80)
14:22linkmauve: Oh. :(
14:22imirkin: welcome to desktop :)
14:23linkmauve: i915’s planes are super useful.
14:23imirkin: i915's embedded :p
14:23imirkin: when they make a dgpu, let's talk
14:23linkmauve: imirkin, but on a non-hidpi monitor, your overlay planes would still allow you to avoid a composition altogether.
14:23imirkin: there's only one overlay plane
14:23imirkin: and there's no alpha
14:23linkmauve: Same on i915.
14:24imirkin: there are colorkeys, but those aren't exposed nicely by drm
14:24imirkin: volta+ has fancier things
14:24imirkin: like blending
14:24linkmauve: Firefox for instance is declared with full opacity, so it can get promoted to an overlay plane here.
14:25imirkin: yeah, i mean that _should_ work fine
14:25imirkin: although to be perfectly honest, overlays aren't the most tested of things
14:25imirkin: since nothing actually uses them
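As a rough sketch of the constraints mentioned so far (one overlay plane, no alpha, horizontal scaling only), a compositor-side check could look like the following. `Surface` and `can_use_overlay` are hypothetical names made up for illustration; this is not how Weston or any real compositor decides, and real hardware has further limits (formats, positions, sizes) that are ignored here.

```python
from dataclasses import dataclass


@dataclass
class Surface:
    opaque: bool   # declared fully opaque by the client
    src_w: int
    src_h: int
    dst_w: int
    dst_h: int


def can_use_overlay(surface, overlay_in_use):
    """Apply only the three constraints stated in the discussion."""
    if overlay_in_use:
        return False  # there is only one overlay plane
    if not surface.opaque:
        return False  # no alpha blending on the overlay
    if surface.src_h != surface.dst_h:
        return False  # only horizontal scaling is allowed
    return True


browser = Surface(opaque=True, src_w=1280, src_h=720, dst_w=1920, dst_h=720)
popup = Surface(opaque=False, src_w=300, src_h=200, dst_w=300, dst_h=200)
print(can_use_overlay(browser, overlay_in_use=False))
print(can_use_overlay(popup, overlay_in_use=False))
```

An opaque window that is only stretched horizontally passes; a translucent popup does not and would have to be composited.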
14:25linkmauve: Which marketing name is Volta?
14:25imirkin: not a lot of volta's out there
14:25linkmauve: Well, most Wayland compositors should be using them nowadays.
14:25imirkin: turing is more likely
14:25linkmauve: Ok.
14:25imirkin: e.g. RTX series
14:25imirkin: as well as the GeForce 16xx series
14:26imirkin: i forget what the volta code was, but it wasn't really a consumer gpu
14:26linkmauve: Ok.
14:29imirkin: also, atomic is supported actually
14:29imirkin: just not exposed by default
14:29linkmauve: Any reason for it not to be?
14:29imirkin: fear of the gathering darkness
14:29linkmauve: Heh. :)
14:30imirkin: maybe if you call skeggsb chicken enough times, he'll turn it on, just to prove you wrong
14:30linkmauve: :D
14:30linkmauve: I guess I’ll enable it on my Switch first.
14:31imirkin: switch doesn't use nouveau drm
14:31imirkin: for two reasons
14:31imirkin: #1: it uses tegra drm
14:31linkmauve: Uh, I (think I) do.
14:31linkmauve: Oh.
14:31imirkin: #2: if you're using the blobs, you're using whatever the blobs give you
14:31fincs: We use a fake libdrm_nouveau on Switch :)
14:31linkmauve: I’m using whatever was shipped by ArchLinux’s kernel.
14:31imirkin: ah
14:31imirkin: well the hardware is tegradrm
14:32imirkin: separate display controller from what's on desktop gpu's
14:32linkmauve: Ok.
14:32imirkin: which is more embedded friendly
14:32imirkin: supports rotation iirc ;)
14:32linkmauve: \o/
14:32imirkin: emersion's page doesn't seem to show tegra =/
14:32imirkin: submit something? :)
14:33linkmauve: Ok!
14:34linkmauve: I see it also doesn’t have any Freedreno sample.
14:34imirkin: i expect he'd be happy to take whatever people send him
14:34imirkin: and he has a number of other embedded results in there
14:41emersion: yeah, feel free to submit new devices :)
14:44linkmauve: As soon as I get ssh back…
14:44linkmauve: The wifi on these devices is so terrible, as is the lack of Ethernet ports…
15:02HdkR: imirkin: Volta was Titan V and V100 depending on "Geforce", Tesla, or Quadro lineup
15:03HdkR: Titan V never had Geforce, but it still lived outside of all three series
15:03HdkR: "Titan" class
15:07imirkin: right. i couldn't remember Titan V. I remembered V100, but that's not exactly available at best buy :)
15:07imirkin: with nvlink and all that
15:07imirkin: (titan v probably didn't make much of an appearance at best buy either, but at least it could have)
15:08HdkR: Oh yea, gotta love that overpriced NVLink connector for the Quadro lineup
15:08imirkin: V100 was the on-motherboard-only thing, no?
15:09HdkR: Quadro GV100 and Tesla V100 were available
15:09imirkin: oh ok
15:09imirkin: what was the power nvlink thing?
15:09imirkin: am i making things up?
15:10HdkR: You're probably thinking of the NVSwitch in the DGX?
15:10imirkin: maybe.
15:10HdkR: Allows you to link 16 devices through NVLink
15:10HdkR: Full throughput through the switch
15:18imirkin: i remember something specifically with POWER
15:21HdkR: Oh
15:21HdkR: Right, Power connecting directly to the GPUs with NVLink
15:23HdkR: I don't recall if that ever continued past the P100 it shipped with
15:24imirkin: sounded a bit crazy
15:25HdkR: Looks like Power9 shipped with some NVLink 2.0 things, so it must have
15:26HdkR: Sadly CXL as an interconnect doesn't quite have the bandwidth to stand up to NVLink
16:37fincs: Is it known what this register does? https://github.com/mesa3d/mesa/blob/master/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h#L191-L194
16:38HdkR: It's clearly described, UNK02EC :P
16:39fincs: Yeah, that's a really good indication as to what it does
16:39fincs: I've observed bit4 gets set if both conservative raster is enabled and dilate is different from 0.0f
16:40karolherbst: fincs: it's not used anywhere, right?
16:40fincs: It is, see above
16:40karolherbst: ohh...
16:40imirkin: no clue about any conservative raster stuff
16:40imirkin: there are some piglit tests though
16:40imirkin: not sure how exhaustive they are
16:41fincs: It's really "easy" to implement
16:41HdkR: Piglit might be a bit conservative with its testing there
16:41imirkin: but to check whether it does something?
16:41imirkin: HdkR: nah, it's a lot more raster
16:41HdkR: :D
16:41fincs: The only weirdo thing is that you need to write to 0x418800 bit23..24 for the dilate
17:15imirkin: fincs: yeah, with the firmware ... nouveau does do that, iirc
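For readers less used to "bit4" and "bit23..24" notation, here is a minimal sketch of reading and writing such fields in a 32-bit register value. `get_field` and `set_field` are hypothetical helpers, not Mesa or nouveau functions, and the field value written below is arbitrary: the log does not say how dilate values map onto the two bits.

```python
MASK32 = 0xFFFFFFFF


def get_field(value, lo, hi):
    """Extract bits lo..hi (inclusive) from a 32-bit value."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def set_field(value, lo, hi, field):
    """Return value with bits lo..hi (inclusive) replaced by field."""
    width = hi - lo + 1
    if field >> width:
        raise ValueError("field does not fit in %d bits" % width)
    mask = ((1 << width) - 1) << lo
    return ((value & ~mask) | (field << lo)) & MASK32


reg = 0
reg = set_field(reg, 4, 4, 1)        # the single bit4 flag
reg = set_field(reg, 23, 24, 0b10)   # a two-bit field; 0b10 is an arbitrary example value
print(hex(reg))
print(get_field(reg, 23, 24), get_field(reg, 4, 4))
```

A two-bit field at bits 23..24 can hold four distinct values, which is all that can be said about it from the description above.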
17:17pendingchaos: I'm not sure if the conservative raster piglit tests ever got merged
17:18pendingchaos: IIRC they rendered slightly different on the nvidia driver (even when conservative raster was disabled) which caused them to fail
17:19imirkin: ah =/
17:19imirkin: did you figure out why? were the sample positions different or something?
17:19imirkin: perhaps raster is flipped somehow (they use that screen control thing)
17:22pendingchaos: no, I don't think I did
17:23imirkin: hm ok. well now i'm on a gp108, i guess i could glance at it. but i don't run the blob too much...
17:23imirkin: still have to do something useful with passthrough gs ... that kinda stalled
18:05AndrewR: imirkin, I tried to trace my self-compiled blender 2.79b and I got this file (xz compressed) ... https://yadi.sk/d/HzE2QDqxlRH3YA (~5 mb)
18:39imirkin: AndrewR: cool. This is G92 right?
18:41AndrewR: imirkin, yes
18:41imirkin: and what's the issue?
18:42imirkin: seems to retrace fine on the G84
18:42AndrewR: imirkin, some areas in preferences and then some renderers and file chooser window show a lot of flickering triangles, wrong colors, missing text ..hard to use!
18:43imirkin: do you see them when retracing the trace?
18:43AndrewR: imirkin, yes ...
18:43imirkin: can i see a screenshot of the issue?
18:43imirkin: of the glretrace of the file you sent
18:43imirkin: oh whoa
18:43imirkin: wait
18:44imirkin: with mesa head, i see it
18:44imirkin: with mesa 20.0.2 i don't
18:44imirkin: what version are you using?
18:47AndrewR: imirkin, head ..
18:47AndrewR: git-f13049f48a
18:58imirkin: AndrewR: ah yeah. can you see if it happens for you with mesa 20.0.x?
18:58imirkin: i think it's something recent that broke it. i'm gonna go with either me enabling the multi draw elements with start indices thing
18:58AndrewR: imirkin, guess for this I need to check out and recompile this branch (need some time for this)
18:58imirkin: or mareko for messing with vbo code :)
18:58imirkin: hold on
18:58imirkin: let me quickly test one thing
19:00imirkin: screwed up on nvc0 too btw
19:01imirkin: nope, not coz of PIPE_CAP_DRAW_INFO_START_WITH_USER_INDICES
19:01imirkin: a bisect will be needed
19:01imirkin: if you're not up to it, i can do it later
19:01AndrewR: imirkin, I'll try, but not sure who will do it faster :}
19:06AndrewR: imirkin, set to 20.0-branchpoint, recompiling ....
19:11imirkin: the fail doesn't seem nv50-specific, which is nice
19:12AndrewR: imirkin, MESA_DEBUG=flush 'fixes' it (at cost of performance, I guess?)
19:15imirkin: yeah. probably some of marek's changes
19:16imirkin: not to pick on him too much -- he's coming up with real optimizations - but it's hard to account for everything sometimes
19:29AndrewR: imirkin, there is no way to get shared library in meson's directory (for picking up by libgl via LIBGL_DRIVERS_PATH ? )
19:30imirkin: i dunno, maybe
19:30imirkin: look for it?
19:31imirkin: i just install + LD_LIBRARY_PATH
19:31imirkin: to a temp dir
19:35AndrewR: imirkin, installed it globally ... 20.0.0-devel (git-58fd26c433) seems to be good, bisecting
19:35imirkin: cool
20:04imirkin: AndrewR: let me know what progress you make whenever you're done
20:04imirkin: that way i don't have to replicate your effort, should you not have finished
20:05AndrewR: imirkin, ok ...
20:51AndrewR: imirkin, # bad: [56e15bf31c0a88d220d5907a533d59ca6341d96a] iris: Use ISL_AUX_USAGE_STC_CCS for stencil CCS and # bad: [ebfa899089b89c5765914dd9775dcc90bc391b7f] gitlab-ci: Skip dEQP-GLES3.functional.shaders.derivate.*
21:13imirkin: AndrewR: and good?
21:13imirkin: or it's all bad so far? :)
21:55AndrewR: imirkin, good: [58fd26c4332aa3f3ffd34d99d996ba62476d5acd] turnip: Fix vkCmdCopyQueryPoolResults with available flag
21:56AndrewR: # bad: [7e2b4bf256610cc016202893d7b4b4ef60b25b53] radeonsi: don't wait for shader compilation to finish when destroying a context
22:02imirkin: AndrewR: fwiw it doesn't repro using llvmpipe
22:12AndrewR: imirkin, # good: [451cf228d53ba8f51beb3dcf04370e126fb7ccb6] svga: Fix banded DMA upload
22:25AndrewR: imirkin, # bad: [cd7241c4f8082dbd07f0bcd268741c527512c66b] vbo: pass only either uint32_t or uint64_t into ATTR_UNION
22:38AndrewR: imirkin, # good: [7283c33b981f975361e3bfa62a339c88f2642cbb] Vulkan overlay: use the corresponding image index for each swapchain
22:46AndrewR: imirkin, # good: [32d3435a78675ff5ebf933d45b9b99fdc4dc7d82] r600/sfn: Add GDS instructions
22:48imirkin: getting close, i think?
22:59AndrewR: imirkin, # good: [27dada7ce90315d47184c51879a3f67e99f2bab2] mesa: remove FLUSH_CURRENT calls that have no effect - yeah, just few more steps ...
23:07imirkin: compiles should also be getting faster, hopefully
23:17AndrewR: imirkin, # bad: [2b22e33c10f98f2f58101881818f55b4c4b73606] vbo: remove immediate mode code that doesn't do anything and simplify stuff - not sure, at this point even more corruption appears, it seems
23:18imirkin: could be a later thing fixes something =/
23:18imirkin: we can work that out later
23:18imirkin: for now let's just say corruption = bad
23:24AndrewR: imirkin, # good: [10cf7a5113446c85dd39bbb12544dd4ac30a0200] vbo: create the immediate mode buffer only in vbo_exec_vtx_map
23:27AndrewR: imirkin, # bad: [3e0d612f5e22fee19aff0e40814db24d63f63103] vbo: don't unmap persistent buffer mappings for glBegin/End
23:30AndrewR: imirkin, 03ded3d6ce37d3be12776bcc5dcd3c4d91f33248 is the first bad commit - vbo: skip FlushMappedBufferRange for glBegin/End by using a persistent mapping
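The good/bad sequence above is a binary search over commit history. The sketch below shows the idea on a made-up linear history; `first_bad`, the `c0..c999` commit names and the culprit index are all hypothetical, and real `git bisect` also handles merges and skipped commits. It assumes a single good-to-bad transition, which is exactly why "corruption = bad" was adopted as the working rule when a later commit seemed to change the symptoms.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. Returns (first bad commit, builds tested).
    """
    lo, hi = 0, len(commits) - 1  # lo: newest known good, hi: oldest known bad
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1                # each step stands for one rebuild + retrace
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], steps


# Hypothetical history of 1000 commits where c637 introduced the corruption.
history = ["c%d" % i for i in range(1000)]
culprit, steps = first_bad(history, lambda c: int(c[1:]) >= 637)
print(culprit, steps)
```

Each test halves the remaining range, so about a thousand commits need roughly ten rebuilds, which matches the pace of the results reported here.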
23:31imirkin: sigh
23:32imirkin: let's see if disabling this fixes everything
23:32imirkin: can you try doing the revert on top of master?
23:37AndrewR: imirkin, sorry, no clear automatic revert on this one ...
23:41imirkin: ok
23:41imirkin: i'll play with it
23:49AndrewR: imirkin, sorry, I think I need some sleep ...
23:49imirkin: yeah, you've already done a bunch of the work here
23:49imirkin: unfortunately my memory of coherent bo mappings + vbo is ... that there are problems
23:50imirkin: equally unfortunate is that these issues don't lend themselves to debugging
23:50imirkin: since they're basically heisenbugs
23:50imirkin: if you try to look at the state, poof, they're gone