01:02imirkin: unlord: https://patchwork.freedesktop.org/patch/423259/
01:02imirkin: should speed up xvideo for you a bit
01:04unlord: wow cool, can I add this to xf86-video-nouveau ?
01:08imirkin: yeah, should apply on top of 1.0.17 definitely
01:08imirkin: maybe 1.0.16? but dunno
01:08imirkin: i haven't tested it with weird sizes to make sure my edge conditions are correct
01:09imirkin: but it does seem to work with at least one video
01:15damo22: nice work
01:15unlord: do your buffers allow over read and overwrite?
01:16imirkin: unlord: no
01:17imirkin: so gotta be slightly careful :)
01:17unlord: aren't you doing 4 luma at a time?
01:17imirkin: my very first patch for xf86-video-nouveau was to fix that kind of bug :)
01:17imirkin: but l = w >> 3
01:17imirkin: and the remainder is taken care of at the end
01:17unlord: so you are just not converting the edge
01:17imirkin: in non-sse fashion
01:17unlord: ahh ok
01:18imirkin: not optimal for images of width 3, but ... i don't care :)
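A rough sketch of the loop shape described above: full blocks go through the fast path, and the leftover pixels at the end of the row are handled one at a time so nothing is read or written past the buffer. It assumes each vector iteration consumes 8 pixels (from `l = w >> 3`); `process_row`, `fast_block` and `slow_pixel` are hypothetical names, and this is plain Python standing in for the SSE code, not the patch itself.

```python
BLOCK = 8  # assumed pixels per vector iteration, from "l = w >> 3"


def process_row(src, fast_block, slow_pixel):
    """Convert one row: whole blocks via fast_block, the tail via slow_pixel."""
    w = len(src)
    blocks = w >> 3  # number of complete 8-pixel blocks in this row
    out = []
    for b in range(blocks):
        # The fast path only ever sees exactly BLOCK pixels.
        chunk = src[b * BLOCK:(b + 1) * BLOCK]
        out.extend(fast_block(chunk))
    # Remainder (w % 8 pixels) is converted one pixel at a time.
    for x in range(blocks * BLOCK, w):
        out.append(slow_pixel(src[x]))
    return out
```

With a width of 3, `blocks` is 0 and every pixel takes the slow path, which is the "not optimal" case mentioned above.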
01:19imirkin: this was my first patch in xf86-video-nouveau: https://cgit.freedesktop.org/nouveau/xf86-video-nouveau/commit/?id=2fa3397e348161a3394e2b456f065921272a056a
01:20unlord: then you were the perfect person to fix this :)
01:20imirkin: i think i'm the only nouveau person who still cares about xf86-video-nouveau
01:20imirkin: everyone else is on the GL train
01:21imirkin: whereas i'm on the stability train
01:21imirkin: it was a super-annoying issue where dragging mplayer windows across (WindowMaker) workspaces would sometimes crash X
01:22imirkin: had to cross page boundaries s.t. the next page was not allocated, or something
01:22imirkin: and so you'd get a SEGV
01:22damo22: worst kind of bug
01:23damo22: i guess it could be worse, if it crashes in the kernel
01:25imirkin: damo22: feel like doing a few more tests for me?
01:25imirkin: (with the nva0, deqp)
01:25damo22: sure but i cant right now because im at work
01:25imirkin: k, i'll write up some instructions
01:26imirkin: so you can have a look at it when you're able
02:42imirkin: damo22: grab my nv50_compute branch and check out commit 683dfb25350b0a6117e6e30a73a3436fcd97df52, and run the 4 tests same as before. let me know if that passes. if not, i will have another thing to try.
02:44damo22: i'll probably get to this in a couple of hours
02:45imirkin: ok. i might be asleep by then.
02:45imirkin: but no worries
02:45damo22: im always in the wrong timezone
02:45imirkin: nothing wrong about your tz...
02:46imirkin: and there's no huge rush on this either
02:46damo22: i find that all the people i collab with are asleep when i am having breakthroughs
02:46imirkin: what tz are you in btw? hawaii?
02:47imirkin: ah yeah, you're in the future :)
02:47imirkin: i'm working with a NZ company for something, and my 6pm is their 9am
02:47imirkin: not extremely convenient :)
02:48imirkin: (but they're good, and we don't really need to be around at the same time except rare occasions)
02:48damo22: blackmagic design?
02:48imirkin: no, a pentest company
02:48imirkin: aptly named :)
02:49damo22: i got hurd to boot with no disk driver in kernel
02:49imirkin: i thought hurd was moderately feature-complete
02:49imirkin: as far as like ... the very basics go
02:49damo22: it is except for drivers
02:50imirkin: compared to, say, an OS that i'd write :)
02:50damo22: 70% of debian is ported to hurd, but it only runs in qemu and a bunch of old machines
02:50unlord: oh good, I'm not the only one here who casually writes their own OS when they are bored
02:51imirkin: unlord: i think i have one of yours (at least had)
02:51damo22: i plan to change that
02:51imirkin: from like ... a long time ago
02:51unlord: imirkin: huh
02:51unlord: it would be a long long time ago
02:51imirkin: late 90's
02:51imirkin: we used to be on the same irc chan
02:52imirkin: whose name i no longer remember :)
02:52unlord: we're still there :)
02:52unlord: same nick right?
02:52imirkin: no, i had a diff one
02:52imirkin: don't want to link it publicly though
02:53unlord: I feel like we've had this conversation before
02:53damo22: i think having a kernel as a message bus is a great idea
02:53damo22: everything in hurd is an RPC
02:53unlord: like when I was trying to use nouveau to write a DOS VBE TSR
02:53unlord: which I still need to do sometime
02:54damo22: but it still doesnt have ACPI support
02:55damo22: so i have plenty of work to do
02:57damo22: im not quite sure how to initialise the interrupt controller in userspace though, if its not going to be in kernel, it still needs some kind of kernel irq handling
04:31damo22: imirkin: 4/4 passed, but my machine hung shortly after im not sure why, i think i forgot to modprobe nouveau
04:31imirkin: then you wouldn't get 4/4 passed :)
04:31damo22: error setting MTRR (base = 0x00000000d1000000, size = 0x00e00000, type = 1) Invalid argument (22)
04:31damo22: from X
04:32imirkin: no idea what that is
04:32imirkin: maybe nouveau loaded *after*
04:32imirkin: that'd be extra-bad
04:32imirkin: i.e. X is using e.g. vesa
04:32imirkin: and then nouveau loads
04:32imirkin: big sadness.
04:32imirkin: damo22: confirm you checked out 683dfb25350b0a611 ?
04:33damo22: yep, let me boot my machine, its headless so i dont know why its not turning on
04:34damo22: $ lsmod|grep nouveau
04:35damo22: | * 683dfb25350 (HEAD) nv50/ir: emit store unlock without predication
04:35imirkin: ok cool
04:36imirkin: and you built and everything? sorry, i just want to be super-sure
04:36damo22: i ran ninja from a previous built one
04:36imirkin: and you did the install (to a side dir)?
04:36damo22: then ninja install
04:36imirkin: cool yea
04:36imirkin: that's useful info
04:36damo22: let me run the test again
04:37damo22: after modprobing and then running X
04:37imirkin: so i guess just "st unlock" doesn't respect the predicates' authoritah
04:37damo22: Passed: 4/4 (100.0%)
04:38damo22: target implementation = 'X11 EGL/GLX'
04:38damo22: X operation 152:0 failed: GLXBadFBConfig
04:38damo22: not sure what that is
04:38imirkin: meh wtvr
04:38damo22: i have no monitor attached
08:58karolherbst: imirkin: https://github.com/NVIDIA/open-gpu-doc/commit/97c510e793804d116b6bc110e1b3473565fc1440 mhhhh
08:58karolherbst: wondering if those apply partly to desktop chips as well
14:32imirkin: karolherbst: yeah, i saw that a while back
14:32imirkin: i assume they do for vp6+
14:32imirkin: they also published a driver to drive the parts on tegra
14:32karolherbst: imirkin: ohh, btw, did you ever connect a serial console to a motherboard?
14:32imirkin: uhh ... yes
14:32imirkin: i'm not 13 :p
14:33karolherbst: okay.. right :D
14:33karolherbst: thing is, I can't figure out the correct settings...
14:33karolherbst: something dumb is happening
14:33imirkin: what is it connected to?
14:33karolherbst: I have a USB serial console thingy, one of those 4 pins ones
14:33karolherbst: and I connected it to the JCOM port
14:33imirkin: you need to flip rx/tx
14:33karolherbst: already done that
14:33karolherbst: I get _something_
14:33karolherbst: but that something is garbage
14:34imirkin: the voltages are highly non-standard
14:34karolherbst: maybe I need one of those 10 pin consoles :/
14:34imirkin: and many USB things only work with like 1.8 or 3.3V
14:34karolherbst: the USB one I have has a 5V pin
14:34karolherbst: but yeah.. maybe
14:34imirkin: in recent times, i've had no trouble with an ancient ftdi usb/serial thing i have and various ARM boards
14:35imirkin: but in general, if you're connecting two computers, rather than computer -> device, then you need a null-modem cable
14:35imirkin: which basically just flips rx and tx
14:35karolherbst: yeah well.. I can connect whatever pin I want to whatever
14:35karolherbst: and I am fairly sure I got the pins correct
14:35imirkin: are the "parameters" the same?
14:35karolherbst: as I am receiving stuff when expected and such
14:36imirkin: i.e. is it e.g. 9600 8N1 on both sides?
14:36karolherbst: yeah well.. no clue
14:36karolherbst: I don't know what parameters the console on the mb is configured with
14:36imirkin: if you don't know what's on the remote side
14:36imirkin: then just try stuff
14:36karolherbst: I tried
14:36imirkin: there are only so many options
14:36imirkin: 8N1 is going to be by far the most common thing
14:36karolherbst: done that, still only garbage
14:36imirkin: rates will be 9600 or 115200
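For trying the framing combinations discussed here (8N1, 7N1, 7E1, ...) from a script, a minimal sketch using Python's `termios` module. `with_framing` is a hypothetical helper that only edits the attribute list returned by `termios.tcgetattr`; the device path in the usage note is an assumption.

```python
import termios

# Only the rates mentioned in this discussion; termios defines more.
BAUD = {9600: termios.B9600, 57600: termios.B57600, 115200: termios.B115200}
SIZE = {5: termios.CS5, 7: termios.CS7, 8: termios.CS8}


def with_framing(attrs, baud, data_bits=8, parity="N", stop_bits=1):
    """Return a copy of a tcgetattr()-style list with new speed and framing."""
    new = list(attrs)
    # Index 2 is cflag: clear size, parity and stop-bit settings first.
    cflag = new[2] & ~(termios.CSIZE | termios.PARENB |
                       termios.PARODD | termios.CSTOPB)
    cflag |= SIZE[data_bits]
    if parity in ("E", "O"):
        cflag |= termios.PARENB      # parity enabled (even by default)
    if parity == "O":
        cflag |= termios.PARODD      # switch to odd parity
    if stop_bits == 2:
        cflag |= termios.CSTOPB
    new[2] = cflag
    new[4] = new[5] = BAUD[baud]     # indices 4 and 5 are ispeed and ospeed
    return new
```

On a real port this would be applied with something like `termios.tcsetattr(fd, termios.TCSANOW, with_framing(termios.tcgetattr(fd), 115200))`, where `fd` is an open descriptor for the adapter (for example `/dev/ttyUSB0`, which is a guess at the device name).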
14:37karolherbst: I already tried all of that :p
14:37karolherbst: soo, what the kernel tells me is this: 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
14:38karolherbst: but no clue how much I can trust that
14:38imirkin: it's just the max
14:38imirkin: other uart's have lower max's
14:38karolherbst: sure, but I tried like all of the available baud rates by now
14:39karolherbst: one question though, if I do something like.. "echo 123456789012345678901234567890 > /dev/ttyS0 " should I expect the same on the receiving end or do I have to ... handle that in a magic way?
14:39imirkin: so ...
14:39imirkin: you can just do that
14:39imirkin: if you're happy with the current serial settings
14:40imirkin: there's a tool called kermit which can assist with some serial things
15:38fling: After few kernel updates llvmpipe in .xinitrc does not crash/reboot my box anymore
15:38fling: it was the only thing capable to actually reboot it right away after crashing
15:38fling: also awesome performance with llvmpipe is better
16:53karolherbst: I am receiving more data than I am sending
16:53karolherbst: that I am sure of
16:56karolherbst: 3030 3030 3030 000a repeatedly: 9f9f 9f9f e59f 00eb f6f6 f6f6 eb56 f600 f6f6 56f6 00eb f6f6 f6f6 eb56 f600 f6f6 56f6 00eb f6f6 f6f6 eb56 f600 f6f6 56f6 ... :/
16:57imirkin: that means your rx speed is higher than your tx speed
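A toy model of why a receiver clocked faster than the transmitter reports more bytes than were sent: each transmitted bit spans several receiver bit times, so one 8N1 frame gets split into more than one frame. This is only a sketch under simplifying assumptions (ideal square signal, sampling at the middle of each bit, no framing-error handling); `uart_encode` and `uart_decode` are hypothetical helpers, and the model is not meant to reproduce the exact garbage in the dump above.

```python
def uart_encode(data, samples_per_bit):
    """Line levels (1 = idle) for 8N1 frames, one list entry per tick."""
    line = [1] * samples_per_bit                 # idle before the first frame
    for byte in data:
        # start bit, 8 data bits LSB first, stop bit
        bits = [0] + [(byte >> i) & 1 for i in range(8)] + [1]
        for bit in bits:
            line.extend([bit] * samples_per_bit)
    line.extend([1] * (2 * samples_per_bit))     # idle after the last frame
    return line


def uart_decode(line, samples_per_bit):
    """Naive 8N1 receiver using its own idea of the bit length."""
    out = []
    pos = 0
    n = len(line)
    while pos < n:
        if line[pos] == 1:                       # still idle, keep scanning
            pos += 1
            continue
        # Low level taken as a start bit; sample the middle of each data bit.
        first = pos + samples_per_bit + samples_per_bit // 2
        byte = 0
        for i in range(8):
            idx = first + i * samples_per_bit
            level = line[idx] if idx < n else 1  # past the end counts as idle
            byte |= level << i
        out.append(byte)
        pos = first + 8 * samples_per_bit        # resume inside the stop slot
    return bytes(out)
```

For example, `uart_decode(uart_encode(b"0", 16), 16)` recovers the original byte, while decoding the same line with `samples_per_bit=8` (a receiver running twice as fast) yields more than one byte.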
17:16linkmauve: karolherbst, oh, does that mean video encoding and decoding are now easier to implement for the Switch?
17:16karolherbst: imirkin: I guess so
17:40kherbst: imirkin: mhhh.... something is super fishy
17:42kherbst: 57600 is only getting half the data...
17:42imirkin: so 28800 it is then?
17:42imirkin:remembers having a 14.4 modem
17:43kherbst: no.. 115200 is correct, the data is just corrupted
17:43imirkin: there are other things
17:43imirkin: first off
17:43imirkin: try 7n1
17:43imirkin: and play with some of those other settings
17:43imirkin: perhaps it wants xon/xoff
17:43kherbst: that arrives when I sent 6 times this: 0000000 3030 3030 3030 3030 3030 3030 3030 0a30
17:43imirkin: and you get to have more pins
17:43kherbst: yeah.. probably
17:44kherbst: that's what I am thinking as well
17:44kherbst: just need a proper serial console :p
17:44imirkin: but 8n1 is like ... the 99.999% case
17:44imirkin: maybe it's the other parity? hm
17:44kherbst: 7e1 gives me "3d00 7676 7676 7676 7676 7676 7676 7676"
17:45imirkin: yeah, i've never actually had to "RE" it like that
17:45imirkin: so i dunno what the different indicators are
17:45imirkin: and i've only dealt with fairly simple and standard consoles
17:46kherbst: 5n1 "0000010 1d1e 1616 1616 1616 1616 1616 1616 1616"
17:46imirkin: didn't know that was a thing
17:46imirkin: only know about 8 and 7
17:47kherbst: yeah.. no idea
17:47imirkin: but again, vast, vast, vast majority is 8n1
17:48imirkin: in the modern age
17:48kherbst: I think it's just something dumb
17:50kherbst: question is.. how do I find those 9 pin serial consoles... I only find those 4-6pin ones...
17:50kherbst: although 6 is probably enough
17:51imirkin: The Google knows
17:51imirkin: kherbst: http://www.usconverters.com/index.php?main_page=page&id=61&chapter=0
17:51kherbst: yeah, but no..
17:51kherbst: I mean, it's a normal motherboard with those piny pins
17:52imirkin: should be indicated which pin is pin 1
17:52kherbst: normal JCOM interface
17:52imirkin: on the motherboard itself
17:52imirkin: take a photo of it, should be a little thingie next to the 1 pin
17:52kherbst: I already have the spec of the mb
17:53imirkin: i splurged the $2 to get a cable that goes out to a serial port for that
17:53imirkin: er, to a DB9 thing
17:53imirkin: big investment =]
17:53imirkin: but it actually didn't work for the thing i wanted
17:53imirkin: i think i didn't have the null modem cable
17:53imirkin: or i did and shouldn't have
17:53imirkin: i forget
17:53kherbst: like https://m.media-amazon.com/images/I/41ycj8dD8nL._AC_SS450_.jpg and https://images-na.ssl-images-amazon.com/images/I/419hgZfIRmL._AC_SX450_.jpg
17:53kherbst: is what I would need
17:54imirkin: yeah, that's basically what i have
17:54imirkin: except that it also has a slot thingie on it
17:54imirkin: so i can attach it to an empty slot on the back of the case
17:54kherbst: yeah... I don't have any space for that :D
17:54kherbst: not if I put 3 GPUs in it
17:54imirkin: yeah, i had to take it out too
17:54imirkin: this was from the earlier, more carefree days
17:55kherbst: worst part, the serial console is directly beneath the lowest pcie slot
17:55imirkin: this is on your comp?
17:55imirkin: why don't you just set the parameters?
17:55imirkin: or what's the "other side"?
17:55kherbst: I tried...
17:55imirkin: make sure a getty is running
17:55kherbst: I did
17:55imirkin: and try lower baud rate
17:55kherbst: it's all garbage :p
17:55kherbst: console=ttyS0,115200n8 even
17:56kherbst: I may be a noob if it comes to serial consoles, but not _that_ kind of noob :p
17:56kherbst: there is probably some magic flag I can set
17:56imirkin: n = odd
18:05kherbst: n should be none ;) but the shortcut n means odd in minicom
20:07aldcor: thanks guy. Very supportive ;)
20:52imirkin: kherbst: i don't suppose you have blob, cts, and valgrind-mmt set up all in one place?
20:53kherbst: uhm....... I might have, but I have only my turing system accessible
20:53kherbst: but my nvidia setup is a mess right now
20:53nmschulte: Fermi chips/cards support Vulkan?
20:53kherbst: by accident on specific driver version
20:53imirkin: nmschulte: define 'support'
20:54imirkin: karolherbst: turing is fine
20:54imirkin: i just need some sampler bits
20:54imirkin: unless turing has new TSC format?
20:54karolherbst: imirkin: I think it would be better if I look into it tomorrow, because... with wayland I can't unload nouveau on the fly :)
20:54karolherbst: uhm.. don't think so?
20:54imirkin: karolherbst: no worries
20:54imirkin: karolherbst: if you could trace each of KHR-GL45.texture_filter_minmax_tests.*, that'd be super
20:55imirkin: (there are just a handful of tests)
20:55karolherbst: are they failing?
20:55nmschulte: support: is functional -- I'm probably missing something though. I see NVIDIA "No longer enumerate Fermi based GPUs in vkEnumeratePhysicalDevices" in Feb 2016 w/ closed-source driver.
20:55imirkin: karolherbst: unimplemented. doing it now, but dunno which bits they are. going to guess some, but my guesses may fail.
20:55karolherbst: ohh, right
20:55imirkin: nmschulte: afaik nvidia never published a fermi vk driver. nouveau has no vk support for any chips.
20:56karolherbst: imirkin: they actually did I think?
20:56karolherbst: or was that only on windows?
20:56imirkin: i thought they only did kepler+ for vk
20:56karolherbst: they removed fermi vk support at some point
20:56karolherbst: but they had it enabled
20:56karolherbst: it was just buggy
20:56nmschulte: imirkin: 👍, wish I could buy you a drink
20:57imirkin: nmschulte: for telling you there is nothing that works for your hw?
20:57nmschulte: for sharing all your knowledge, yeah
20:58nmschulte: I found a GF116/GTX 550 Ti that knocks the socks off my G98/8400 GS rev.2
20:59nmschulte: Saw musings about Vulkan, figured I'd ask those that know.
20:59nmschulte: Didn't know nouveau doesn't do Vulkan.
20:59nmschulte: no offense, but I stick w/ AMD/Radeon for things I care about support with
21:00karolherbst: imirkin: mhhh.. might be false, but fermi does have dx12 support at least...
21:00karolherbst: but I was under the impression that something important is missing for vulkan on fermi and there were driver builds with support enabled by accident
21:00imirkin: karolherbst: how did they get dx12??
21:00imirkin: i thought dx12 was all bindless
21:00karolherbst: :) no clue
21:01karolherbst: 12 FL to be precise
21:01karolherbst: it's even on wiki
21:01karolherbst: ohh wait
21:01imirkin: nmschulte: as you should. don't give nvidia your money.
21:01karolherbst: it's 12 FL 11_1
21:01nmschulte: https://developer.nvidia.com/vulkan-driver search Fermi -- hints that they goofed and had it on when they wanted it off all along.
21:01imirkin: nmschulte: note that the G98 might be able to reclock, while the GF116 won't. so with nouveau, perf might not pan out the way you hope
21:01imirkin: although G98 is pretty crap in the first place.
21:02imirkin: so maybe the GF116 will be faster even at low clocks
21:02imirkin: karolherbst: oh right. you can have a dx12 driver without dx12 support
21:02nmschulte: right now G98 is useless; the 8192 macbs restriction is a killer, and even for smaller res videos, there is basically just as much CPU load on decode as w/o hwaccel.
21:03imirkin: nmschulte: uhm ... for h264?
21:03imirkin: there should be very little cpu load
21:03imirkin: (when it does work)
21:03imirkin: it takes the stream pretty much directly
21:03imirkin: make sure you're using vdpau output
21:04imirkin: and not xv or GL or anything else
21:04nmschulte: I'll test things again; I was using full-stack diagnostic: poking the card via zoneminder... maybe zoneminder is doing bad things causing huge mem-copies or such
21:04imirkin: i dunno what that is
21:04nmschulte: I'm not outputting to a screen, outputting to files on disk.
21:04imirkin: ah ok
21:04imirkin: what data?
21:04imirkin: YUV or RGB?
21:04nmschulte: zm might be converting to RGB space, unsure
21:05nmschulte: but I will test directly via ffmpeg to be certain
21:05imirkin: do you know if it's dumping "video" or "output"?
21:05imirkin: anyways, i don't think we ever considered the case of dumping it to cpu-space
21:05imirkin: there might be something dumb there
21:06nmschulte: ultimately, zoneminder is reading an rtsp "live" video feed, and decoding that on the GPU (via ffmpeg; I can control the hwaccel knobs; using vaapi in my test so far), and then writing the frames to disk (among other processings on the frames).
21:06imirkin: ah ok. va-api is a bit different
21:06imirkin: iirc it includes some very nasty fallbacks
21:07imirkin: anyways, feel free to play with the GF116
21:07imirkin: it should work just as well, if not better
21:07imirkin: for video decode. but its clocks will be stuck at whatever it boots to
21:07imirkin: probably a middle-of-the-road freq for those? dunno
21:07imirkin: (you can check in /sys/kernel/debug/dri/0/pstate)
21:07nmschulte: I'll happily try vdpau. I noticed different CPU loads via vaapi hwaccel than no hwaccel, so I assumed it was "working." I think I even told ffmpeg to "trash the output" (/dev/null; no-op), but perhaps that's not enough to avoid the CPU-ram-copy situation-hiccups.
21:08nmschulte: I'll get the GF116 tomorrow
21:09nmschulte: crappy about the inability to re-clock. does that mean if I boot windows w/ nvidia driver (or boot w/ nvidia driver, and then rmmod), cold-reboot w/ nouveau/GNU/Linux, it'll retain "fast clocks"?
21:10imirkin: if you boot with nvidia
21:10imirkin: and then force it to a higher perf
21:10imirkin: and then load nouveau
21:10imirkin: i think it should work
21:10imirkin: coz we won't reset the board
21:11ericonr: the nvidia driver tends to be kinda bad about releasing the GPU, iirc
21:11ericonr: could be wrong
21:12imirkin: oh yeah, for a while it kills display and we don't know how to restore it
21:12imirkin: i dunno if that works now
21:12nmschulte: don't care; headless here.
21:50imirkin: karolherbst: nevermind. first guess, and it seems to work
21:50imirkin: it's the bits next to seamless cubemap
22:17nmschulte: holy cow wow -- -hwaccel_output_format vaapi gives me very low CPU load vs e.g. yuv420p; hwaccel vaapi is _definitely_ faster to decode in HW, but I guess getting the bits back out is real real slow.
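The two invocations being compared can be sketched as argument lists for `subprocess`. `-hwaccel`, `-hwaccel_output_format` and `-f null -` are real ffmpeg options; `ffmpeg_decode_args` and the example source URL are hypothetical, and whether this matches the exact command lines used here is an assumption.

```python
def ffmpeg_decode_args(source, keep_on_gpu=True):
    """Build an ffmpeg argv that decodes with VA-API and discards the output."""
    args = ["ffmpeg", "-hwaccel", "vaapi"]
    if keep_on_gpu:
        # Leave decoded frames as VA-API surfaces instead of copying
        # them back to system memory after decode.
        args += ["-hwaccel_output_format", "vaapi"]
    # Null muxer: decode everything, write nothing.
    args += ["-i", source, "-f", "null", "-"]
    return args
```

Running `subprocess.run(ffmpeg_decode_args("rtsp://camera.example/stream"))` and then the same with `keep_on_gpu=False` would be one way to compare CPU load with and without the download step.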
22:17imirkin: nmschulte: so ... one thing
22:18imirkin: is that the engine outputs interlaced output
22:18imirkin: i.e. we get 2x images of odd and even fields
22:18imirkin: er, that was confusingly phrased.
22:18imirkin: when decoding a video, we always get the odd and even fields in separate images
22:18imirkin: whether it's a progressive or interlaced video - doesn't matter
22:19nmschulte: and so twice the mem-copy load? I could pass to the vaapi_deinterlace filter, and hwdownload filter, and only get the one image?
22:19imirkin: va-api might have some very misguided logic to try to reconstitute those transparently instead of failing some API calls
22:19imirkin: nah, each image is half the size obviously
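To make the layout concrete: if the decoder hands back two half-height images, one per field, a full frame is rebuilt by interleaving their rows. A minimal sketch, assuming the first field holds rows 0, 2, 4, ... of the frame; `weave` is a hypothetical helper working on lists of rows, not the va-api code path.

```python
def weave(top_field, bottom_field):
    """Interleave two half-height fields into one full-height frame."""
    if len(top_field) != len(bottom_field):
        raise ValueError("fields must have the same number of rows")
    frame = []
    for top_row, bottom_row in zip(top_field, bottom_field):
        frame.append(top_row)      # frame rows 0, 2, 4, ...
        frame.append(bottom_row)   # frame rows 1, 3, 5, ...
    return frame
```

The total amount of data is the same as a single full frame, since each field is half the height.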
22:20imirkin: basically some AMD guys are writing all the va-api stuff, and their goal is "make application work" rather than "do things intelligently"
22:20nmschulte: the interesting part about zoneminder is that it does image analysis on these frames, but it must not be using GPU surfaces to do that. I bet the codepaths in zoneminder could be optimized quite a bit in this regard.
22:20imirkin: so like some applications bail when certain API calls fail (which are supposed to fail)
22:20imirkin: so the solution is to add a bunch of fallbacks
22:21nmschulte: to make the user "happy"
22:21imirkin: but the application doesn't know some API call is doing a bad thing
22:21imirkin: like ... it's right to call that API
22:21imirkin: you just have to be able to handle failures
22:21imirkin: and get the data some other way
22:21imirkin: anyways, there may be room for optimization of the data flow
22:22imirkin: is what i'm saying
22:22imirkin: not necessarily in zoneminder
22:22imirkin: but in the va-api backend
22:22imirkin: i was pretty happy when i saw an image on the screen
22:22imirkin: so ... not a lot more optimization went into it :)
22:22imirkin: (and va-api backend didn't exist at the time)
23:32nmschulte: is there a "console" with nouveau to see if I'm placing any sort of load on my GPU?
23:32nmschulte: (nouveau_top ...)?
23:32nmschulte: thanks to NVIDIA, again?
23:33imirkin: i mean, you could make that argument
23:33imirkin: but it'd be a weak one
23:33imirkin: more like ... what would it show?
23:34imirkin: anyways ... we don't expose anything. we could... from PCOUNTER i guess
23:34imirkin: pcie usage, etc
23:34nmschulte: idk, simple dma access counts or something would suffice. _anything_ that shows "theres numbers flying" or "there's numbers crunching". heck, even temperature sensors might work well
23:35nmschulte: "does the video jitter" and "does the big angry fan, or the small angry fan, turn on" are the only means a user has to know "it's working!" lol
23:40nmschulte: oh no, one bad ffmpeg invocation and I get spammed: https://desmas.net/nouveau-timeout.png. does this require a (cold) reset now?