02:38 imirkin: pmoreau: you have logic to lower all 64-bit integer ops to 32-bit based things right?
02:38 imirkin: pmoreau: if so, you might want to send that series out
02:38 imirkin: and/or make it available for testing alongside ARB_gpu_shader_int64
09:16 pmoreau: imirkin: Well, not all 64-bit integer ops, only the MUL/MAD ones, of which I have a new version of the patch. And some stuff for CVT, but I haven’t gone back to that one to really test all the different paths.
09:16 pmoreau: imirkin: And a patch for folding U64/S64 constants.
09:17 pmoreau: I’ll go through them again this week-end and (re-)submit them.
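As a rough illustration of what lowering a 64-bit integer MUL to 32-bit operations involves, here is a minimal Python sketch. It is not pmoreau's patch: the helper names (mul32_lo, mul32_hi, mul64_lowered) are made up for this example, and the only assumption is that the target offers 32-bit multiplies returning the low or the high half of a product.

```python
MASK32 = 0xFFFFFFFF

def mul32_lo(a, b):
    """Low 32 bits of a 32x32-bit unsigned multiply (hypothetical helper)."""
    return (a * b) & MASK32

def mul32_hi(a, b):
    """High 32 bits of a 32x32-bit unsigned multiply (hypothetical helper)."""
    return ((a * b) >> 32) & MASK32

def mul64_lowered(a, b):
    """Low 64 bits of a*b, computed only from 32-bit pieces."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    lo = mul32_lo(a_lo, b_lo)
    # The high word collects the carry out of lo*lo plus both cross terms.
    # a_hi*b_hi only affects bits 64 and above, so it is dropped.
    hi = mul32_hi(a_lo, b_lo)
    hi = (hi + mul32_lo(a_lo, b_hi)) & MASK32
    hi = (hi + mul32_lo(a_hi, b_lo)) & MASK32
    return (hi << 32) | lo
```

Since only the low 64 bits are kept, the same sequence should serve both U64 and S64 operands in two's complement.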
09:17 pmoreau: imirkin: Have fun with the GM107! :-)
09:23 GaivsIvlivs: pmoreau what's going on with the GM107?
09:23 pmoreau: GaivsIvlivs: "imirkin | [01:43:41] alright. this weekend is the weekend of making xf86-video-nouveau work on GM107"
09:24 GaivsIvlivs: I had it working on my system, until something with this latest round of updates made GNOME quit and need a login again
09:25 pmoreau: But using the modesetting DDX instead of the nouveau one, right?
09:26 GaivsIvlivs: nope just nouveau
09:30 pmoreau: You should be getting an "Unknown chipset NV117", because it only supports up to NV10X.
09:31 pmoreau: Oh wait, maybe you have a version prior to its removal
09:31 GaivsIvlivs: It reads as GM107
09:32 pmoreau: Yes, but it is also known as NV117, as that is the ID read from "register" 0 on the hardware: https://nouveau.freedesktop.org/wiki/CodeNames/#NV110
09:33 pmoreau: Anyway, support for it was removed in https://cgit.freedesktop.org/nouveau/xf86-video-nouveau/commit/src?id=3e2e0faa2ee1cce9c1bb5c7ad80d0592460f3edc
09:33 pmoreau: As it wasn’t great, and it looks like Ilia is going to try to fix that.
09:34 GaivsIvlivs: Excellent, I eagerly await further support for my card
11:11 karolherbst: I am updating my branches to 4.8 now
11:12 karolherbst: or at least only that reclocking branch
11:29 karolherbst: I think CodeEmitterGK110::emitFMAD misses a NEG in the short imm case
11:30 karolherbst: or let me think
11:34 karolherbst: no clue, don't want to touch that thing
12:25 karolherbst: mhh
12:25 karolherbst: it seems like nv50 is a little broken
12:26 karolherbst: shaders/warzone2100/1.shader_test crashes
12:26 karolherbst: https://gist.github.com/karolherbst/d82bca142588589027c0301f206bc6c3
12:26 karolherbst: ran as 0xac
12:27 karolherbst: imirkin: do you mind checking if this is a regression? it would take a week on mine :/
12:29 karolherbst: quite a lot of shaders fail
12:30 hakzsam: karolherbst: I would prefer to wait for the features freeze before pushing new things (should be today I guess)
12:30 hakzsam: but I will try to have a look at your stuff over the weekend :)
12:30 karolherbst: I don't have commit access anyway :p
12:30 karolherbst: and second
12:30 karolherbst: it isn't a regression of my stuff
12:30 karolherbst: master fails a lot
12:31 hakzsam: I didn't say that your stuff will introduce regression
12:31 karolherbst: ahh, k
12:31 karolherbst: I just wanted to run my patches through the tesla stuff and noticed this
12:31 hakzsam: nv50 might have been broken since mesa 12
12:31 karolherbst: yeah, might be
12:31 hakzsam: I will run a bunch of piglit after the features freeze anyways :)
12:32 karolherbst: okay
12:33 karolherbst: maybe ac859d68f474694f9cb1de007997c936d735a48c is fine
12:33 karolherbst: that would be good
12:33 karolherbst: but I highly doubt it
12:33 hakzsam: doubtful yes
12:34 karolherbst: it is messy to apply a patch while bisecting :/
12:34 karolherbst: but I could test at least releases
12:36 karolherbst: uhhh, that patch only touches nvc0
12:37 karolherbst: where is the nv50 version of that?
12:37 hakzsam: I didn't implement it for nv50
12:37 karolherbst: mhhhhhhh
12:37 hakzsam: I actually only care about nvc0 :)
12:37 karolherbst: then I wonder why I could use it for 0xac
12:38 hakzsam: well, time to look at F1 2015 again
12:38 karolherbst: soo maybe it crashes cause that thing isn't checked
12:38 hakzsam: that game is a real pain
12:38 karolherbst: seems like it
12:38 karolherbst: a lot of fancy features are just plain painful?
12:39 hakzsam: yeah, and the trace is a monster trace, 1.5M GL calls
12:39 hakzsam: 175K lines of glsl
12:39 hakzsam: fun :)
12:39 karolherbst: well, happens
12:39 karolherbst: I saw worse
12:39 karolherbst: and I am not even joking :D
12:39 hakzsam: you can always see worse, but it's crazy, trust me :)
12:40 karolherbst: I guess the fancy feature part messes it up
12:43 karolherbst: over 1M calls per frame is crazy anyway...
12:44 hakzsam: yup
12:45 karolherbst: no idea how bad the port of civ 5 is, but a 2010 game doing over 500k gl calls is also pretty intense...
12:46 karolherbst: but the os x version was out on release date, so I guess they didn't change much for linux
12:46 karolherbst: or we just hit all the bad paths
12:46 tobijk: karolherbst: no use anyway, its unplayable in late games :D
12:46 karolherbst: tobijk: exactly the issue
12:46 karolherbst: it burns your cpu like nothing and the gpu is bored
12:46 tobijk: <- 3m round change times, and that is not graphics related
12:47 karolherbst: tobijk: every little building in the cities is rendered on its own ;)
12:47 tobijk: "yay"
12:48 karolherbst: would be nice to find a way to reduce the CPU overhead
12:48 karolherbst: I have a trace of that game with many calls
12:51 tobijk: not sure how you'd make it consume less cpu, i guess our paths (besides compiling shaders) are not that bad
12:51 karolherbst: well
12:51 karolherbst: you can always improve things
12:52 tobijk: right, but i doubt that's worth it for now, there are other "low hanging fruits"
12:53 karolherbst: mhh
12:53 karolherbst: yeah, but also for reducing cpu overhead
13:32 karolherbst: uhhh
13:32 karolherbst: it seems like chances are high we get the cts stuff for free
13:33 karolherbst: although we shouldn't get our hopes up until we really get it
13:33 karolherbst: https://www.khronos.org/members/ip-framework might be interesting
13:44 tobijk: karolherbst: would be nice though :)
13:44 karolherbst: yeah
15:44 iterati: hi. I see that the kepler reclock v5 branch from https://github.com/karolherbst/nouveau was removed. Which branch supports kepler reclocking on 4.8?
16:11 pmoreau: iterati: There is a v6 branch, but I don't think it has been rebased on 4.8 yet.
16:23 iterati: pmoreau: thanks. Anything else with working reclocking for gtx 650 on 4.8 ?
16:28 karolherbst: pmoreau: I actually did it today
16:30 karolherbst: uhhh
16:30 karolherbst: "multi-res-shading"
16:30 karolherbst: this sounds... interesting
16:34 pmoreau: iterati: See Karol's message above
16:45 karolherbst: hakzsam, imirkin: if no new issue comes up, I guess my cse patch can be pushed after the branching? I don't really feel like resending the same patch again ;)
16:54 imirkin: karolherbst: what was the issue on nv50? want to check it out before i unplug it
16:54 karolherbst: I am not quite sure, it could be that I messed up. shader-db/shaders/warzone2100/1.shader_test crashed here
16:54 karolherbst: but then hakzsam told me I can't fake the chipset for nv50
16:54 karolherbst: so maybe it is just me
16:54 imirkin: yeah, you can't
16:55 imirkin: i mean you can - it just won't work properly
16:55 karolherbst: I noticed
16:57 imirkin: seems fine - http://hastebin.com/raw/nufetugevo
16:58 karolherbst: okay, then it was just me
16:58 karolherbst: sadly my tesla machine is crappy slow :/
17:08 karolherbst: mad(a, a, imm0) => mul(a, 2*imm0) ?
17:08 karolherbst: or did I miss anything?
17:08 karolherbst: ohh wait, in my shader I did
17:08 imirkin: a*a + imm0
17:08 imirkin: not quite the same
17:08 karolherbst: ohh right
17:08 karolherbst: wrong order
17:09 karolherbst: silly me
17:09 imirkin: math is hard :p
17:09 karolherbst: it sure is
17:09 karolherbst: especially if you have like 3600 equations :D
17:10 karolherbst: mhh but still, I had something like that: mad(a, a, mul(b, imm0))
17:11 karolherbst: *have
17:11 karolherbst: mhh, looks like nothing indeed
17:12 imirkin: the only interesting thing with fmul is that it can have a post-multiplier (or divider)
17:12 karolherbst: right
17:12 karolherbst: so a*b / 2?
17:12 imirkin: right
17:12 imirkin: or * 2
17:12 imirkin: up to 8 i think
17:13 imirkin: i.e. 2, 4, 8
17:13 imirkin: we try to pick up those optimizations
17:13 imirkin: but they're not always possible without algebraic manipulation, so we'd miss them
17:13 imirkin: we also miss out on cases like a * 2 * b * 2
17:14 imirkin: whereby it really ought to become a * b * 4
17:14 karolherbst: mhh true
17:14 karolherbst: but somehow I never saw this
17:15 imirkin: and we never do any tree rebalancing
17:15 karolherbst: mhh, maybe now I have something
17:15 imirkin: i.e. the difference between (a + b) + (c + d) vs (a + (b + (c + d)))
17:15 imirkin: the former case can be dual-dispatched, for example
17:15 imirkin: but it uses more registers. nothing is perfect in this world ;)
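The difference between the two shapes imirkin describes can be sketched with a tiny expression tree. This is only an illustration in Python, not anything from the nouveau compiler: the tuple encoding and the names depth, flatten and balance are invented here. It rebuilds a right-leaning chain of adds as a balanced tree and compares the length of the longest dependency chain.

```python
def depth(expr):
    """Length of the longest dependency chain; leaves are plain strings."""
    if isinstance(expr, str):
        return 0
    _, lhs, rhs = expr
    return 1 + max(depth(lhs), depth(rhs))

def flatten(expr):
    """Collect the operands of a tree of adds, left to right."""
    if isinstance(expr, str):
        return [expr]
    _, lhs, rhs = expr
    return flatten(lhs) + flatten(rhs)

def balance(operands):
    """Rebuild the adds as a balanced binary tree."""
    if len(operands) == 1:
        return operands[0]
    mid = len(operands) // 2
    return ("add", balance(operands[:mid]), balance(operands[mid:]))

# (a + (b + (c + d))): every add waits on the previous one.
chain = ("add", "a", ("add", "b", ("add", "c", "d")))
# (a + b) + (c + d): the two inner adds are independent of each other.
balanced = balance(flatten(chain))
```

Here depth(chain) is 3 and depth(balanced) is 2. The balanced form keeps two partial sums alive at once, which matches the register-pressure caveat above, and floating-point addition is not associative, so a real pass would have to decide when reordering is acceptable.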
17:16 karolherbst: mad(mul(a, 0.5), b, c) => add(mul_2(a, b) , c)
17:16 karolherbst: *mul_/2
17:17 karolherbst: the question is now: what would be the benefit of this...
17:18 karolherbst: or would be a add + mul div 2 cheaper than a mad+mul?
17:18 imirkin: definitely
17:18 karolherbst: well assuming the post divider is pretty much for free
17:18 imirkin: i think that's a reasonable assumption
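A small numeric sketch of the rewrite being discussed, in Python. The fmul model with a post argument is a stand-in for the hardware post-multiplier and is purely hypothetical; the allowed factors follow imirkin's "2, 4, 8" (or the matching dividers), which he himself hedged with "i think". The mad here is an unfused multiply-add, so a fused hardware mad could round differently.

```python
# Assumed set of post-scale factors, based on the "2, 4, 8" remark in the chat.
ALLOWED_POST = {0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0}

def fmul(a, b, post=1.0):
    """Multiply with a free power-of-two post-scale (hypothetical model)."""
    if post not in ALLOWED_POST:
        raise ValueError("unsupported post-multiplier")
    return a * b * post

def fmad(a, b, c):
    """Unfused multiply-add: a*b + c."""
    return a * b + c

def before(a, b, c):
    # mad(mul(a, 0.5), b, c)
    return fmad(fmul(a, 0.5), b, c)

def after(a, b, c):
    # add(mul/2(a, b), c)
    return fmul(a, b, post=0.5) + c

def scaled_twice(a, b):
    # a * 2 * b * 2, written as two separately scaled operands
    return (a * 2.0) * (b * 2.0)

def scaled_once(a, b):
    # a * b * 4, with both factors of two folded into one post-multiplier
    return fmul(a, b, post=4.0)
```

Scaling by a power of two is exact in binary floating point as long as nothing overflows or underflows, which is why both pairs agree for ordinary inputs such as before(3.0, 5.0, 7.0) and after(3.0, 5.0, 7.0).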
17:21 karolherbst: but this could lead to more instructions whenever the previous mul can't be eliminated
17:21 hakzsam: karolherbst: yes, I will take care of that after the branching
17:21 imirkin: karolherbst: life's a bitch
17:22 karolherbst: hakzsam: awesome :) thanks
17:25 karolherbst: imirkin: any other fancy things we don't use yet?
17:26 imirkin: zcull ;)
17:27 karolherbst: I know :D
17:27 imirkin: i know you know
17:28 karolherbst: but I meant more like from the ISA
17:29 karolherbst: I still have that one LiveOnlyTex thing, but it is broken and I don't see where, have to start from scratch there
17:36 karolherbst: mhh
17:36 karolherbst: https://gist.github.com/karolherbst/bd719fa6ef928f831caaa205375e0f50
17:36 karolherbst: pretty special case though
17:39 karolherbst: imirkin: any idea how to do that in a non-ugly way?
17:40 imirkin: so that's the algebraic manipulation stuff i mentioned we were missing
17:40 imirkin: you can look at what intel does... i think their thing is also relatively primitive
17:40 imirkin: this stuff can be tricky.
17:41 karolherbst: mhh right
17:44 karolherbst: that actually happens quite often in this shader here...
17:45 imirkin: the problem is generally known as "rebalancing" i believe
17:45 imirkin: although it's obviously more subtle than plain tree rebalancing
17:45 karolherbst: yeah, I can see why
17:58 karolherbst: imirkin: I have an idea for an experiment: a pass which marks sources with abs or neg+abs if the sign is already known, and see what benefits we could get from this
17:59 imirkin: not sure i understand
17:59 imirkin: (or rather, i'm sure i don't understand...)
17:59 karolherbst: k, like this
17:59 imirkin: oh, like a value range propagation pass?
17:59 karolherbst: if you have max %r1 %r2 1.0, you know that %r1 is positive
17:59 imirkin: yes
17:59 imirkin: value range propagation (aka vrp) is quite useful
18:00 karolherbst: then you could turn a max %r3 %r1 %r2 into max %r3 abs %r1 %r2
18:00 karolherbst: yeah
18:00 karolherbst: just as an experiment though
18:00 imirkin: not just for abs stuff
18:00 imirkin: for cmp things as well
18:00 karolherbst: I already worked on something like that, but just in a stupid way
18:00 karolherbst: right
18:00 karolherbst: it was just an example
18:01 karolherbst: it would be really useful if we could just do if (src1 >= src2) in our passes, but I have no good idea how to do it right...
18:02 karolherbst: especially because src1 >= src2 and src2 >= src1 could both be false, because it simply doesn't know yet
18:03 imirkin: well
18:03 imirkin: i've seen this a lot
18:03 karolherbst: imirkin: does vrp work like this: you mark every instruction result with a min/max value (or maybe even a list of value ranges) and do checks against those?
18:04 imirkin: yes
18:04 karolherbst: k
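The min/max idea confirmed above can be sketched in a few lines of Python. This is a toy model, not an existing nouveau pass: Range, range_max, abs_is_noop and known_ge are names invented for the example, and NaN handling is ignored entirely. The comparison returns None when the ranges overlap, which covers the "it simply doesn't know yet" case raised earlier.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Range:
    """Closed interval a value is known to lie in; default is 'anything'."""
    lo: float = -math.inf
    hi: float = math.inf

def range_max(a, b):
    """Range of max(x, y) for x in a and y in b."""
    return Range(max(a.lo, b.lo), max(a.hi, b.hi))

def abs_is_noop(r):
    """True if every value in r is non-negative, so abs changes nothing."""
    return r.lo >= 0.0

def known_ge(a, b):
    """Three-valued x >= y: True, False, or None when not provable."""
    if a.lo >= b.hi:
        return True
    if a.hi < b.lo:
        return False
    return None

# max %r1 %r2 1.0: %r2 is unknown, yet the result is at least 1.0.
r2 = Range()
one = Range(1.0, 1.0)
r1 = range_max(r2, one)
```

With these definitions r1 is Range(1.0, inf), abs_is_noop(r1) holds, and known_ge(r2, one) is None rather than False, so a pass can distinguish "proven false" from "unknown".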
18:04 imirkin: so the sequence i've seen
18:04 imirkin: is like
18:04 imirkin: set $r0 $r1 < $r2
18:04 imirkin: or whatever
18:04 imirkin: er hm, we should handle this already
18:04 imirkin: but basically under some circumstances
18:04 imirkin: it'll do the compare again
18:04 imirkin: against 0
18:04 imirkin: which is really the value itself all over again
18:05 karolherbst: yeah I think I saw situations like this too
18:05 karolherbst: especially sequences like those: https://gist.github.com/karolherbst/b1cb4522c8697cff6ef8ab63308ed88d :/
18:05 karolherbst: so many immediates
18:06 imirkin: :)
18:06 karolherbst: 90% of the pixmark shader looks like this
18:06 karolherbst: and then a ton of cos/sins ...
18:07 karolherbst: I doubt it has anything significant besides mul,add,sin,cos,sqrt,max,min
18:08 karolherbst: a few loops, but meh
18:08 karolherbst: mhh, I think I would want to implement something like this, but it may get messy really fast
20:16 docmax: is it possible to switch from nvidia to nouveau without reboot?
20:17 karolherbst: docmax: depends
20:17 docmax: on what?
20:18 karolherbst: it is possible without many issues if you can turn off the nvidia card
20:18 karolherbst: otherwise you depend on a bit of luck
20:18 docmax: i do a rmmod nvidia which works
20:18 karolherbst: sure, but the gpu stays on
20:18 karolherbst: is it a normal desktop gpu?
20:18 docmax: nvidia gtx 970
20:18 imirkin: depending on your gpu, nouveau may not be able to properly reinitialize the display unit after nvidia has had a go at it
20:19 karolherbst: imirkin: I assume that a force POST may mess things up too much, too?
20:19 docmax: when doing modprobe nouveau the screen goes black
20:19 karolherbst: right
20:19 karolherbst: if you can't turn off your gpu, you have to be lucky
20:19 docmax: how do i know gpu is off?
20:20 karolherbst: you can't on a desktop
20:20 karolherbst: well, usually you can't
20:20 karolherbst: don't think there is any desktop gpu which would support it
20:20 karolherbst: maybe
20:20 karolherbst: you can unplug the gpu and plug it in again
20:20 karolherbst: PCIe _should_ support it
20:20 imirkin: man, i just update X + freetype ... it's gonna take time to get used to the new fonts
20:20 tobijk: lol :D
20:20 docmax: if modprobe nouveau goes black doesnt it mean gpu was off?
20:20 karolherbst: docmax: I mean off in like no power
20:21 imirkin: docmax: screen != gpu
20:21 imirkin: last i checked, which admittedly was a while ago, nvidia did something which caused nouveau to no longer be able to update the displayed image.
20:22 imirkin: perhaps the failure mode is different now. either way, it's not a "regularly" supported action
20:22 karolherbst: although most MBs don't implement the pcie hotplugging thing right, so you may even damage your gpu
20:22 imirkin: it can work in some, esp laptop, setups
20:22 karolherbst: well, usually it should work with all laptop setups where the gpu is turned off
20:22 tobijk: mh spirv is in need of proper var names: var->var->data.mode = nir_mode;
20:23 imirkin: what's wrong with all vars being named var? :)
20:23 tobijk: if you know all the code by heart, nothing ;-)
20:24 karolherbst: although I don't believe that a variable contains a variable anyway
20:24 karolherbst: mhhh
20:24 karolherbst: it could though
20:24 karolherbst: ohh no, it can't
20:24 karolherbst: :D
20:24 karolherbst: silly me
20:24 tobijk: spirv/vtn_variables.c:1236:30
20:25 karolherbst: nir_variable *var;
20:25 karolherbst: :D
20:25 karolherbst: there you go
20:26 imirkin: unfortunately you kinda have to know the history of the code to really make sense of it
20:26 karolherbst: right
20:26 karolherbst: but I also don't get why the spirv thing should have _any_ references to nir...
20:26 karolherbst: even if it might make sense for some
20:26 karolherbst: technically it is wrong
21:13 imirkin: oh, well the anv code to process spirv is 100% tied to nir
21:13 mooch3: karolherbst, yeaaaaah, on my machine, if you unplug a gpu and plug it back in, the thing turns off
21:13 imirkin: its function is to convert from spirv to nir, so ... that kinda makes sense :)
21:15 mooch3: i know this because my radeon has to sit unscrewed in the socket due to the fact that it's not tall enough to meet the screw bracket
21:18 karolherbst: mooch3: no shit...
21:18 karolherbst: :D
21:19 mooch: no i mean
21:19 mooch: the whole MACHINE
21:19 karolherbst: ahhh .D
21:19 karolherbst: :D
21:19 mooch: the entire machine turns off
21:19 karolherbst: I see
21:19 karolherbst: yeah, silly mbs
21:19 mwk: that sounds brutal
21:19 karolherbst: you usually need a server mb for that
21:19 karolherbst: mwk: lazy bios devs
21:19 karolherbst: 95% of all uefis are garbage
21:20 mooch: mwk: you said you had nv1 emulation code right? if so, can i see it? i'm thinking about implementing nv1 into the 86box emulator
21:20 karolherbst: and then the pcie controllers are shit too
21:20 mwk: mooch: hwtest/nv01_pgraph.cc
21:20 mwk: that's pretty much all I have
21:20 mooch: oh that
21:20 mooch: welp
21:21 karolherbst: mooch: sorry for that then, I just thought you were being sarcastic or something like that :p
21:21 mooch: i'd definitely need more than that
21:21 mwk: I also know a few things about PFB/PFIFO/PDAC/PRM/PDMA
21:21 mwk: but nothing that resembles an emulator
21:21 mooch: tbh we're going to need a lot of RE work done on NV1 to emulate it
21:21 mwk: tbh I think we're closer than for nv3
21:22 mwk: the display engine is dead simple to emulate, for one...
21:22 mooch: eh, at least nv3 is svga
21:22 mwk: exactly
21:22 mooch: so you can emulate the vesa modes pretty easy
21:22 mwk: svga == not dead simple :)
21:23 mooch: yeah, but there's already code and docs for it
21:23 mooch: i have the vesa modes mostly working on nv3 and nv4
21:36 mjg59: mooch: I admire your dedication
21:41 mooch: mjg59, dedication? pffft. this is just me being stupidly stubborn
21:42 mwk: it's not that different...
21:42 mwk: eh
21:43 mwk: should I try to RE nv1's cliprects, or sleep...
21:44 imirkin: mwk: do you have any idea what the deal is with VP_A vs VP_B on nvc0 (and maybe nv50 had that bs too?)
21:44 mwk: uh, no
21:44 mwk: NFI about that thing
21:44 mwk: and Tesla had nothing like it
21:45 imirkin: k
21:48 mjg59: mwk: Was it you trying to get text mode switching working on NV1?
21:49 mwk: yep
21:49 mjg59: I think I'd tried to blank that out
21:50 mjg59: How far had you got?
21:50 mwk: I succeeded, sort of
21:50 mjg59: The idea of a working kms driver for nv1 is still just utterly hilarious
21:50 mwk: I think my specimen has a defective DAC that screws up the palette in 4bpp mode
21:50 mjg59: Ah!
21:51 mjg59: That would explain the weirdness you were seeing
21:51 mjg59: You weren't able to test it under DOS?
21:51 mwk: no
21:51 mwk: my specimen also has a defective BIOS that hangs the machine when POSTed
21:51 mjg59: Spectacular
21:51 mwk: I think it belonged to some other hacker before, who flashed it and fucked it up
21:52 mwk: I mean, the code is just... broken
21:52 mwk: in the place where the entry point should be, there's a near return instruction
21:52 mwk: in a context that is far-called
21:53 mwk: and the following bytes look like someone just patched a jump to a ret
21:53 mwk: gods only know what else happened to that card
21:54 mwk: so... I managed to POST the card by putting it as secondary and writing a libpciaccess program to POST it
21:54 mwk: based on a part of the BIOS that looked like the init script
21:55 mwk: which ALSO has been patched and cannot be parsed by the BIOS' own init script parser
21:55 mjg59: The only NV1 on Ebay is 500 Euros
21:55 mjg59: WTF
21:55 mwk: it's a collector's item
21:55 mjg59: And it's even marked as not working
21:55 mwk: that's better than I expected, last time I checked there were no NV1s on ebay, period
21:56 mwk: got a link?
21:56 mwk: anyhow
21:56 mjg59: http://www.ebay.com/itm/RARE-Diamond-Edge-3D-3240-Multimedia-1995-Video-Audio-Card-NVIDIA-NV1-SCHEDA-PCI-/201688756797?hash=item2ef596323d:g:F7MAAOSwLF1X~faL
21:56 mwk: I managed to use all parts of the card already, I think
21:57 mwk: the only problems I ran into are the fucked BIOS and broken 4bpp
21:57 mwk: well, I haven't tried MIDI synth yet, or audio capture
21:57 mjg59: Sounds like you should just merge it
21:57 mjg59: Support for the Saturn joypad port would be wonderful
21:57 mwk: that's quite easy, actually
21:58 mwk: I don't have an actual saturn pad, but the whole thing is just bitbanging through some DAC registers, and I verified that stuff with a multimeter
21:58 mooch: mjg59, i think my feats of emulation are even more impressive given the fact that i don't even have a riva card
21:58 mooch: wow mwk
21:59 mwk: mjg59: merge what?
21:59 mooch: so i could just look at your findings, and a saturn emulator, and get the correct results???
21:59 mwk: I didn't write a driver...
21:59 mwk: mooch: ... for saturn gamepad support on NV1, yes
22:00 mwk: but that's not particularly interesting...
22:01 mjg59: mwk: Boo
22:01 mwk: also, my NV1 is an early revision
22:01 mooch: joy
22:01 mjg59: mwk: All I want is to be able to get an arbitrary resolution kmscon on nv1. And an nv1. And a machine with PCI slots.
22:01 mwk: from what I've seen in the windows driver, newer DACs have some proper saturn support
22:02 mwk: as in, they actually support talking some weirdo protocols
22:02 mwk: but mine has only the bitbang thing
22:02 mwk: mjg59: by "arbitrary resolution" you mean "as big as you can fit in 2MB", right? :)
22:03 mwk: sorry, this nv1 is taken, till death do us part
22:03 imirkin: mjg59: don't most desktop mobos still have a PCI slot?
22:05 imirkin: hakzsam: you've been looking at blob-generated maxwell code... have you noticed if they do 128-bit const loads or not?
22:06 mwk: mjg59: also, there's lots of other fun to be had with an NV1
22:06 mwk: an ALSA driver, for one :)
22:06 mwk: that'd be one fun driver
22:06 mwk: ALSA + KMS + input device
22:08 mwk: too bad there's no chance of supporting DRI..
22:09 mjg59: mwk: That's quite a lot in 2bpp
22:09 mjg59: imirkin: Ha I uh the only desktop I own is a Mac Pro
22:10 mwk: mjg59: it would, if the thing supported 2bpp
22:10 mjg59: Which has zero slots
22:10 mjg59: mwk: Is it all one PCI device?
22:10 mwk: you can have 4bpp, 8bpp, 16bpp, or 32bpp
22:10 mwk: yes, single function
22:10 mjg59: Well, even 4bpp is pretty high res
22:10 mwk: and you can't have hw accel with 4bpp
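For a sense of scale, the "as big as you can fit in 2MB" arithmetic can be worked out with a short Python sketch. It assumes "2MB" means 2 MiB, that the whole of VRAM is available for a single framebuffer, and that rows are packed with no stride padding; none of that is confirmed for the NV1, and the function names are made up.

```python
VRAM_BYTES = 2 * 1024 * 1024  # assumed: 2 MiB, all usable for one framebuffer

def max_pixels(bpp, vram_bytes=VRAM_BYTES):
    """Largest pixel count that fits at the given bits per pixel."""
    return vram_bytes * 8 // bpp

def fits(width, height, bpp, vram_bytes=VRAM_BYTES):
    """True if a tightly packed width x height framebuffer fits in VRAM."""
    return width * height * bpp <= vram_bytes * 8

# Pixel budget for each depth mwk lists (4, 8, 16, 32 bpp).
budget = {bpp: max_pixels(bpp) for bpp in (4, 8, 16, 32)}
```

Under those assumptions the budget is 4194304 pixels at 4bpp (for example 2048x2048), 2097152 at 8bpp, 1048576 at 16bpp and 524288 at 32bpp, so fits(1600, 1200, 8) is true while fits(1600, 1200, 16) is not. Whether the pixel clock can actually drive such modes is a separate question.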
22:11 mwk: hmm
22:11 mwk: I wonder how high the pixel clock can go...
22:11 mjg59: Yeah that was going to be my next question
22:11 mwk: well
22:11 mwk: I can set the PLL quite high
22:12 mwk: the question is what will happen...
22:12 mjg59: Does it have any similarities with nv3, or would it be logical as an entirely separate driver?
22:12 mwk: oh fuck no
22:12 mwk: the 2d engine is similar
22:12 mwk: other than that, everything is different
22:14 mwk: the audio and graph parts are tied in many funny ways, btw
22:14 mwk: same DMA engine
22:14 mjg59: That doesn't surprise me
22:14 mjg59: I think people used to think of DMA engines as magical expensive things
22:14 mwk: and you need to allocate structs in VRAM to describe your sounds
22:14 mjg59: Ha that's wonderful
22:15 mwk: for that matter, the DOS driver puts MIDI fonts in VRAM and plays them from there
22:17 mjg59: mwk: omfg
22:17 mwk: mjg59: actually, there are three DMA engines on that thing
22:17 mwk: one for graphics, one for audio, and one for DOS craziness
22:17 mjg59: mwk: This is like the kind of custom chip you found on systems in the 80s
22:17 mjg59: …or, I guess, console hardware
22:17 mwk: but they have a little problem
22:18 mjg59: Which kind of makes sense
22:18 mwk: they share a "reset" button
22:18 mwk: so if you fuck up graphics DMA, you have to blow up audio too
22:19 mwk: I haven't figured out what exactly the 3rd DMA engine is for, yet
22:19 mwk: AFAICT it's not connected to anywhere in the card
22:20 mwk: and it's only supposed to be used by the DOS driver's sound blaster emulation to read/write memory over 1MB mark
22:20 mwk: because, you know, protected mode is expensive
22:20 mwk: it's better to use a PCI device for that purpose
22:20 mjg59: Is the DMA engine limited to 24 bit addresses?
22:20 mwk: no
22:20 mwk: it's fully 33-bit
22:20 mjg59: Better than some hardware, then
22:20 mwk: and I wish that was a typo.
22:21 mwk: I mean, I guess the high bit is clipped away at some point before the PCI bus
22:21 mwk: but the DMA engine clearly counts in 33-bit registers
22:22 mjg59: Yeah ok it's 9 whole bits better than some hardware
22:25 mwk: mjg59: it's even better, it supports demand paging in the DMA engine
22:26 pmoreau: Damned, completely forgot to add the v2 to the mail subject… --"
22:31 mwk: mjg59: but the most hilarious part of that GPU has to be the double-buffer mode
22:31 mjg59: mwk: That's wonderful
22:31 mwk: to turn double buffering on or off, you have to flip a bit in memory controller, which makes contents of the whole memory basically invalid
22:31 mwk: ... including the structs describing audio playback
22:32 mjg59: mwk: Hahaha
22:32 mjg59: mwk: Please write all of this up somewhere
23:13 pmoreau: imirkin_: Thanks for the review! I’ll take care of those tomorrow and send out at least a patch for adding 64-bit integer constant folding (for almost all of them). CVT might need some extra time.