00:35raket: DrNefario: I sometimes use ULMB on my monitor, my trick is to start linux with the nvidia driver and reboot and it still works with nouveau... :)
00:36DrNefario: wait, is ULMB part of the issue?
00:43raket: No it's not.. i'm just saying ULMB works with nouveau after a reboot after enabling it on the monitor when the nvidia driver is loaded ...
01:36DrNefario: ULMB seems to be forced off no matter what due to the refresh rate being set too low
01:36raket: DrNefario: < DrNefario> wait, is ULMB part of the issue? raket: No it's not.. i'm just saying ULMB works with nouveau after a reboot after enabling it on the monitor when the nvidia driver is loaded ...
01:37DrNefario: my monitor is probably cursed tbh
01:43DrNefario: well I think I'll give up on it for now. if someone starts working on the bug, I'll be happy to provide more debug info in the issue. https://gitlab.freedesktop.org/drm/nouveau/-/issues/106
09:24karolherbst: imirkin: are we running out of VRAM here? https://gitlab.freedesktop.org/drm/nouveau/-/issues/110
13:04imirkin_: karolherbst: running out of vram is usually a submit failure
13:04imirkin_: like pushbuf_validate: -12
13:04imirkin_: or something like that
13:04karolherbst: hence me wondering a little on what's happening here
13:04imirkin_: VRAM_LIMIT is we hit some arbitrary limit on something
13:04imirkin_: i wonder if we're configuring the dma object wrong somehow? i dunno
13:05karolherbst: channel -1 is also a bit weird
13:05imirkin_: MCP89 was always a bit weird
13:05imirkin_: iirc i always suspected that it had some sort of fifo-related problem
13:06imirkin_: like ... more so than the rest of the nv50 series
13:06imirkin_: which also has a separate fifo problem
13:19imirkin_: iirc VRAM_LIMIT is "you tried to go out of bounds on the vram dma object"
13:20imirkin_: which is complete crap, since we set the bounds to be 0xffffffff i think
13:23imirkin_: it's saying that it hits this while trying to read the pushbuf
13:23imirkin_: at least that's how i interpret that error message
13:23imirkin_: note that on MCP89 the vram is stolen, but it's "real vram" as far as the chip is concerned
13:24imirkin_: we had some issues on MCP77/79 that RSpliet fixed iirc relating to the stolen vram setup
13:24karolherbst: mhh yeah
13:24imirkin_: it's wholly possible we screw something up on MCP89 as well
13:25karolherbst: I am just surprised we only see it like.. now
13:25karolherbst: and not like the past 10 years
13:26karolherbst: but I guess firefox just becomes very big and is doing a lot of stuff :)
13:31imirkin_: also people are getting kicked off the ddx
13:31imirkin_: more GL for all!
13:31imirkin_: also people are more likely to use nouveau since blob support esp for older GPUs is going to be more lacking
13:32karolherbst: I mean.. nothing against using GL, we just don't have enough people to fix all the bugs :(
13:33imirkin_: nothing against GL, it just doesn't work
13:33imirkin_: my recommendation would actually be that distros stop shipping nouveau_dri.so
13:33karolherbst: well we can't have vulkan on those older GPUs
13:33imirkin_: (by default)
13:34karolherbst: ever used llvmpipe on 4k?
13:34imirkin_: not sure how that's my problem.
13:34karolherbst: yeah well, it's the distros problem getting "laggy desktop" bugs
13:35imirkin_: nouveau_dri.so is not ready for production use. the fact that llvmpipe isn't a great solution to everyone deciding to use GL isn't my problem.
13:35imirkin_: that doesn't make nouveau_dri.so any more ready for production use.
13:35karolherbst: yeah well.. convince people to not use GL :)
13:35imirkin_: not my problem.
13:36imirkin_: (and i've tried, even people here aren't interested. i'm sure people will be even less interested in the wider community)
13:36karolherbst: yeah, because the solution is not being grumpy and tell everybody how wrong they are but to get people interested in that project
13:37imirkin_: let me try ...
13:37imirkin_: "hey, come work on this project where the manufacturer provides no documentation, and there's no hope of ever getting more than 10% of perf on newer-than-10yr old hw"
13:37imirkin_: the crowds are converging already!
13:39karolherbst: yeah. I am aware of that problem
13:39karolherbst: and all I can say is that "we are working on it"
13:39imirkin_: but that's a lie
13:39karolherbst: it's not
13:40imirkin_: prove me wrong.
13:40karolherbst: I wish I would be allowed to talk more about it, but I can't
13:40imirkin_: i've been hearing that line for 5+ years
13:40karolherbst: yeah, I am aware
13:41imirkin_: when it happens, great, maybe people will be interested, fix things up, etc
13:41karolherbst: it's as frustrating for us as for everybody else
13:41imirkin_: in the meanwhile, nouveau_dri.so is not ready for production use
13:41karolherbst: yeah.. I guess so
13:41imirkin_: and should not be shipped as part of any distro by default
13:42imirkin_: the solution to llvmpipe being slow on 4k is "don't use llvmpipe on 4k"
13:42karolherbst: but RH uses it in production and at least RH funds some of the work, which I think is fair
13:42karolherbst: and they are free to ship it
13:42imirkin_: everyone is free to ship it
13:42karolherbst: but they also pay the price for it
13:42imirkin_: i didn't say it was illegal or they weren't able to
13:42imirkin_: maybe. i feel like i pay the price for it.
13:42imirkin_: in tons of wasted time
13:43karolherbst: right, that's common in community driven projects
13:43karolherbst: and it's frustrating that like close to all distributions demand bug fixes
13:43karolherbst: but rarely fix it themselves
13:43karolherbst: and hopefully that changes somewhat in the future as well
13:44karolherbst: I also heard about some group of people (trying to be as vague as possible) who would be interested in contributing, but let's see how that turns out
13:44orbea: tbf there aren't many package maintainers up to the task of fixing nouveau bugs
13:44imirkin_: so don't ship it.
13:44imirkin_: or don't install it by default
13:45karolherbst: if fedora or RHEL ships it, that's fine, I get poked about bugs the first, and I get paid for working on it anyway :p
13:45karolherbst: or well skeggsb_ or Lyude or somebody
13:45karolherbst: but there is at least a path from a support perspective
13:45karolherbst: for other distributions? yeah well.. open a bug upstream and hope they have time to fix it
13:46karolherbst: I mean.. distributions can ship it, it just annoys us if people come back every other week and ping on bugs because theirs is the most important one anyway
13:46karolherbst: I think there is mainly a lack of clarity about what users can expect; they demand more than we can deliver
13:46imirkin_: i'll write up some wiki pages.
13:47karolherbst: and it's fine if it still gets shipped, it should just be clear that this is "best effort" and "your bug might never get fixed"
13:48karolherbst: uhh wiki
13:48karolherbst: this one needs cleaning up a lot anyway
13:48imirkin_: i'm going to delete 99% of it
13:48imirkin_: the info was stale on it when i did a cleanup of it back in 2014 or so.
13:48imirkin_: by now it's beyond ancient
13:49imirkin_: and is not usefully serving the community
13:50karolherbst: probably for the best
13:51omegatron: if I wanted to contribute in the "near" future, by hunting and fixing bugs in nouveau (because of my own personal interest in performant hardware), I assume I would actually need the respective hardware in the machine to test the software? so, I would be limited for now to my available cards ( gt 240, gtx 650 ti, gtx 760 and gtx 1060 ) or otherwise would have to buy new (or used) cards .. !?
13:51omegatron: or are those cards already so old, noone cares for them?
13:51karolherbst: fixing bugs is already good enough
13:52imirkin_: omegatron: as long as you care about them, who cares what other people care?
13:52omegatron: my only "fear" in this matter would be, to fry my hardware .. that wouldn't be good (because no replacement parts available on the market)
13:53imirkin_: omegatron: that's always a danger. however i'm not aware of any nouveau developer ever frying hardware through software they had written
13:53omegatron: good to know
13:53imirkin_: i believe mupuf fried a gpu or two in his oven, but ... i feel like one can anticipate such risks going into that activity
13:54orbea: why would you stick a GPU into an oven? (just curious)
13:54imirkin_: orbea: i think he was reflow-soldering them
13:54imirkin_: this is the generation of GPUs that had bad solder joints
13:54imirkin_: so they were dying already anyways
13:54karolherbst: he also tried a hair dryer once
13:55karolherbst: but apparently they are not hot enough either
13:55imirkin_: the hair drier was to figure out the temperature sensors i think
13:55imirkin_: so they were plugged in and working while he did that
13:55imirkin_: and didn't fry them doing that...
13:55imirkin_: omegatron: you could be the first though ;)
13:56karolherbst: but yeah...
13:56karolherbst: personally I'd be fine with a "you ship it, you support it" statement
13:56karolherbst: we can't support it anyway
13:56karolherbst: sounds sad, but it's reality
13:56imirkin_: karolherbst: well, i'll write something, if you hate it you can update it or whatever
13:56imirkin_: this isn't the new york times in terms of distribution, so even if there's info you hate for a few days, i doubt too many eyes will be on it ;)
13:57karolherbst: I'd probably just rephrase it to make it sound nicer :p
13:57imirkin_: nicer than me?! inconceivable!
13:57karolherbst: imirkin_: uhm.. I know the stats on the wiki in terms of reader
13:57imirkin_: "mr nice" is my middle name
13:57karolherbst: there are a lot
13:57karolherbst: you'd be surprised
13:57imirkin_: yes, Mr Googlebot seems to like those pages a lot
13:58karolherbst: ehh no
13:58karolherbst: I meant stats from google :)
13:58karolherbst: apparently we have around 5k clicks a month
13:58imirkin_: not surprising
13:58karolherbst: from google
13:58imirkin_: should put up some ads for AMD :)
13:59karolherbst: top query 3rd place "nvidia architecture names" :D
13:59imirkin_: that's all me
13:59karolherbst: maybe :D
14:00karolherbst: but google knows unique users
14:00imirkin_: not anymore, but it def used to be ;)
14:00imirkin_: alright. time to get some work done.
14:00karolherbst: that's what I am thinking about for 5 hours already
14:01imirkin_: i just minimize the hexchat window. (was about to do it, but saw your comment.)
14:15mupuf: orbea, imirkin_: indeed, one board had its GPU drop off the PCB :D
14:17mupuf: But the most important thing is, I actively TRIED to break a GPU and failed :o I was blasting a hair drier on a GPU with its fan disabled, and ran benchmarks. The result was that the HW was limiting the power effectively using what I called the FSRM (a crude downclocking)
14:17mupuf: on maxwell 2+, one cannot modify the registers controlling this... so I would not fear frying a GPU
14:18mupuf: well, that is to say if you trust that the board has been competently designed and the power delivery is not gonna die on you
14:19omegatron: sounds promising
15:21RSpliet: mupuf: meanwhile -> https://www.pcgamer.com/amazon-new-world-killing-rtx-3090-gpus/
15:22mupuf: RSpliet: yep :D That's what I had in mind :D
15:22RSpliet: But yes, in a normal setting it's incredibly hard to fry a GPU. I've had a GeforceFX running for 1.5 years before I found out that the fan fell off.
15:24RSpliet: And that didn't involve nouveau ;-)
15:25karolherbst: I mean I am not surprised that an application is able to fry GPUs
15:26karolherbst: I bet that running furmark on that GPU would fry it as well
15:26RSpliet: I am, the GPU should have plenty of security measures to clock back or shut down before it dies
15:26RSpliet: They've had such measures since forever
15:27RSpliet: And they're crude, requiring zero software involvement
15:28karolherbst: the FSRM can only do so much
15:28RSpliet: Sure, and as voltages got lower it is able to do less too
15:28karolherbst: the thing is.. with furmark the driver actually has to halve the clocks anyhow :D
15:29RSpliet: But there's also just a hard shutdown switch on a temperature trip point AFAIK
15:29karolherbst: but yeah...
15:29karolherbst: there should
15:29karolherbst: that's configured by the VBIOS
15:29karolherbst: or rather devinit
15:29karolherbst: the GPU by itself doesn't come with defaults
15:30karolherbst: and who knows what the PMU is doing nowadays
15:30RSpliet: Yeah, and I wonder whether these days, with their 2m² chips, they may need multiple sensors for different regions of the chip
15:48karolherbst: in pushbuf_kref we ref a bo with refcnt 0 :)
15:48karolherbst: imirkin_: we might want to assert on that?
15:53imirkin_: karolherbst: it can happen "naturally"
15:53karolherbst: it shouldn't :D
15:53imirkin_: two threads
15:53imirkin_: one that is ref'ing
15:53imirkin_: one that is unref'ing
15:53karolherbst: yeah.. not the case here
15:53karolherbst: I am not talking about bo_ref
15:53imirkin_: oh. pushbuf_ref ... yeah, that's not great
15:54karolherbst: need to figure out why that happens though
15:55karolherbst: annoying thing is, it gets unrefed by bo_ref a couple of times until the process just crashes for odd reasons :)
15:55karolherbst: so bo_del gets called a lot
15:55karolherbst: btw, it's the text bo after resizing.. so I guess my MT fixes still have an issue there somewhere
15:57imirkin_: so after resizing
15:57imirkin_: i explicit ref the text bo
15:57imirkin_: but it could be that i don't hold a reference to it
15:57imirkin_: and it gets deleted anyways?
15:57imirkin_: i thought the pushbuf would hold a reference...
15:57karolherbst: no, I think this part is fine
15:58karolherbst: it's probably some weird interaction with other contexts, _but_ ... I don't know how it comes to it :D
15:59karolherbst: let me check something...
16:03karolherbst: why is gdbserver so CPU hungry.. *sigh*
16:13karolherbst: okay.. so it seems to only happen if the application indeed uses multiple contexts
16:13karolherbst: so my mt patches aren't that broken