11:01pendingchaos: karolherbst: it seems shladd and shl have half the throughput of xmad
11:01pendingchaos: so, in isolation, I think your approach with a few more specializations would probably be the same or better than mine
11:02pendingchaos: though mine seems to interact better with other optimizations
11:02pendingchaos: so I'm not sure which would be better
11:02karolherbst: pendingchaos: yeah, I expected as much
11:02karolherbst: pendingchaos: well, in the end speed > stats
11:02karolherbst: sure, fewer instructions are generally better, but if you replace two fast ones with a super slow one...
11:03karolherbst: pendingchaos: but that also explains why nvidia generally uses one shl/shladd instead of 2 or 3 xmads
11:05karolherbst: pendingchaos: I mean, we can always do that mul -> 1x shl/shladd thing
11:05karolherbst: with that I think we should be pretty good already, no?
11:06karolherbst: maybe even optimize a mul a*5 to (a << 2) + a
11:06karolherbst: nvidia doesn't do it, which keeps me wondering if there is something up with that
11:06karolherbst: maybe shladd without the add is in some way faster?
11:06karolherbst: or maybe not?
11:06karolherbst: or maybe they simply didn't bother
11:06karolherbst: for small immediates we still have that 2 xmad variant
11:07karolherbst: but it doesn't work with negative ones
11:07karolherbst: so we could use shladd for a * -4
11:08karolherbst: but this is kind of hitting our possibilities within codegen
11:08karolherbst: we don't try to eliminate neg/abs instructions by using shladd or something instead
11:13pendingchaos: so perhaps: do what the 4th patch already does but only if it creates 1 shladd/shl, otherwise try to create two xmads
11:15pendingchaos:disappears for a short while
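The rule suggested above (turn a multiply by an immediate into a shift only when exactly one shl/shladd results, otherwise fall back to xmads) can be sketched roughly as follows. This is a minimal model, not the Mesa patch: `single_shift_form` and `evaluate` are hypothetical names, shladd is assumed to compute `(src << shift) + addend` with an optional negated source, and only the cases mentioned in the chat (powers of two, 2^k + 1, negative powers of two) are handled.

```python
MASK32 = 0xFFFFFFFF


def is_pow2(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def single_shift_form(imm):
    """Return (op, shift, negate_src, add_a) if a * imm needs exactly one
    shl/shladd, else None (the caller would then try the xmad path).

    The tuple layout is an assumption made for this sketch.
    """
    if imm in (0, 1):
        return None  # trivial cases, handled elsewhere
    if is_pow2(imm):
        # a * 2^k == a << k
        return ("shl", imm.bit_length() - 1, False, False)
    if is_pow2(imm - 1):
        # a * (2^k + 1) == (a << k) + a, e.g. a * 5 == (a << 2) + a
        return ("shladd", (imm - 1).bit_length() - 1, False, True)
    if imm < 0 and is_pow2(-imm):
        # a * -(2^k) == (-a << k) + 0, e.g. a * -4
        return ("shladd", (-imm).bit_length() - 1, True, False)
    return None


def evaluate(form, a):
    """Apply a form to a 32-bit value, wrapping like the hardware would."""
    _op, shift, negate_src, add_a = form
    src = -a if negate_src else a
    return ((src << shift) + (a if add_a else 0)) & MASK32
```

With this model, 5 and -4 each map to one shladd, while 7 and -5 return None and would go to the xmad variants instead.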
11:24imirkin: wtf?? shl is slower than xmad?
11:27RSpliet: xmad == integer mult-add?
11:27RSpliet: ... I keep forgetting which is which
11:29imirkin: xmad == integer mul-add with limitations
11:29imirkin: of the various 16-bit variety iirc
11:31pendingchaos: imirkin: I think it can be if too many warps are using it
11:37karolherbst: imirkin: it seems so
11:37karolherbst: imirkin: it matches with nvidias decision making
11:38karolherbst: nvidia only opts muls into shifts if it would produce exactly one shift
11:38karolherbst: and generally 2 xmads are preferred over one shift
11:39karolherbst: mhhh, let me check something
11:39RSpliet: I suspect they simply added fewer barrel shifters than xmads to their design... stream threads through. Does it have a long issue delay too?
11:40karolherbst: pendingchaos: nvidia has opts for when the input of the mul is small, it only uses one xmad
11:40karolherbst: like you AND the input with 0xff before
11:41karolherbst: but nvidia prefers one shift over one xmad
11:42karolherbst: but if I multiply with 9, it uses xmad again
11:42karolherbst: maybe shladd is slower than shl?
11:42RSpliet: that is to be expected, add would be a second pipeline stage
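The xmad counts mentioned above (three for a full 32-bit multiply, two for a small immediate, one when both inputs are known to fit in 16 bits) follow from splitting the operands into 16-bit halves. The sketch below only models the arithmetic: `mad16` is a hypothetical stand-in for a 16x16-bit multiply plus 32-bit add, and the shifts by 16 are written as plain Python rather than as whatever modifiers the real instruction provides.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def mad16(a16, b16, c):
    """Simplified model: 16-bit x 16-bit multiply plus a 32-bit addend."""
    assert 0 <= a16 <= MASK16 and 0 <= b16 <= MASK16
    return (a16 * b16 + c) & MASK32


def mul32_three_mads(a, b):
    """Full 32-bit a * b (low 32 bits) from three 16-bit multiply-adds."""
    a_lo, a_hi = a & MASK16, (a >> 16) & MASK16
    b_lo, b_hi = b & MASK16, (b >> 16) & MASK16
    cross = mad16(a_lo, b_hi, 0)
    cross = mad16(a_hi, b_lo, cross)
    # a_hi * b_hi would land above bit 31, so it is never needed.
    return mad16(a_lo, b_lo, (cross << 16) & MASK32)


def mul32_two_mads(a, imm16):
    """a * imm16 when the immediate fits in 16 bits (its high half is 0)."""
    a_lo, a_hi = a & MASK16, (a >> 16) & MASK16
    high = mad16(a_hi, imm16, 0)
    return mad16(a_lo, imm16, (high << 16) & MASK32)


def mul32_one_mad(a16, b16):
    """a * b when both inputs are known to fit in 16 bits."""
    return mad16(a16, b16, 0)
```

The one-mad case is only valid when something proves the value range of the inputs, for example a preceding AND with 0xff.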
11:52RSpliet: On GM107+, is the ratio of SFUs vs. FPU/ALU by any chance 1/8?
11:53karolherbst: no idea?
11:57RSpliet: Hmmmmm no, seems to be 1/4
11:57RSpliet: Gosh darnit, I want to play with nouveau compilers :'(
11:58karolherbst: you have to follow what is important in life
11:58RSpliet: Yeah, last weekend I followed an adorable dog around the streets for a good 10 mins...
12:12pendingchaos: this table has information on the throughput of a bunch of instructions: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#arithmetic-instructions__throughput-native-arithmetic-instructions
12:12pendingchaos: (showing that the throughput of sqrt,sin,... is 1/4 that of fadd,fmul,...)
12:19karolherbst: yay, I get the vulkan CTS past the init phase
12:19pendingchaos: karolherbst: nvidia could just accidentally be creating suboptimal code
12:19pendingchaos: anyway, does my suggestion sound good?
12:19karolherbst: pendingchaos: yeah, that's the "non caring" part I talk about
12:19mooch3: karolherbst, pls show me ur code
12:20karolherbst: pendingchaos: well, I would be a bit careful here, but doing the 3 xmad -> 2/1 xmad opts is a good idea in every case
12:20karolherbst: pendingchaos: doing the mul -> 1 shl/shladd is also good
12:20karolherbst: everything else we would have to verify it benefits
12:20karolherbst: pendingchaos: the one xmad thing only works if you know the input is inside a certain value range though
12:21karolherbst: might be not worth the trouble
12:21pendingchaos: perhaps it might be best to do it with value range analysis?
12:21karolherbst: like if both operands are ANDed with 0xffff
12:21pendingchaos: (not now of course)
12:21karolherbst: pendingchaos: yeah
12:21karolherbst: we need to deal a different way with that
12:21karolherbst: there is also min/max and so on
12:22karolherbst: anyway, this is more of a GSoC project kind of thing
12:23karolherbst: mooch3: https://github.com/karolherbst/mesa/commits/nouveau_vulkan
12:24HdkR: Happy day
12:24karolherbst: now I have to get it running with piglit
12:25karolherbst: as the main issues will be crashes
12:25karolherbst: and "dEQP-VK.api.smoke.create_sampler" doesn't sound like a nice test to start a vulkan driver with
12:25HdkR: 1k LOC is quite nice
12:25karolherbst: well, mostly copy and paste
12:26karolherbst: with some adjustments
12:26karolherbst: next step would be to wire up the winsys stuff
12:26karolherbst: and wsi
12:26karolherbst: mhh, well winsys first
12:26RSpliet: pendingchaos: If you want to make sure your optimisations actually benefit us, you might want to double-check the values in TargetGM107::getLatency()
12:27RSpliet: I suspect they need a lot more detail to reflect the differences in successive architectures, but without adjusting them we make the GPU stall a lot
12:27RSpliet: (because instructions are sort-of-statically scheduled)
12:38pendingchaos: karolherbst: in the first patch, I'll be adding a check in insnCanLoad so XMADs don't have immediates outside of [0,65535]
12:38pendingchaos: does it still have your R-b or is the change too significant?
12:56karolherbst: pendingchaos: no, that's fine
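The immediate check described above amounts to a range test. The sketch below is only an illustration of that test, not the actual `insnCanLoad` code; `fits_xmad_immediate` and `as_u32` are hypothetical helpers. It also shows why negative immediates fall outside the range whether they are viewed as signed or as 32-bit unsigned values.

```python
def as_u32(value):
    """View a Python int as a 32-bit unsigned register value."""
    return value & 0xFFFFFFFF


def fits_xmad_immediate(value):
    """True if the immediate lies in [0, 65535], the range from the chat."""
    return 0 <= value <= 65535


# -4 is 0xFFFFFFFC as a 32-bit pattern, so it fails either way.
small_ok = fits_xmad_immediate(5)
negative_ok = fits_xmad_immediate(-4) or fits_xmad_immediate(as_u32(-4))
```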
15:47karolherbst: I need a nice name for the nouveau vk lib and nouveau_vk looks a bit too boring
15:53nyef: karolherbst: ValKyrie ?
15:56karolherbst: until I find something better, I will go with "nouv"
15:56karolherbst: I simply have a name conflict
16:00nyef: Oh, wait... This is the vulkan thing?
16:01nyef: You could use "de-novo"?
16:03nyef: Yeah, that fits. Vulcan is the Roman god corresponding to Hephaestus, so the use of Latin is appropriate.
16:07karolherbst: pendingchaos: I don't know if we should lower mul to shl+add on GPUs without shladd support
16:08karolherbst: maybe that would be better, maybe not, let me check
16:08pendingchaos: karolherbst: to fit in with radv and anv, you could maybe use something like "nvk" or "nvvk"
16:09karolherbst: I kind of like nvk more than nouv
16:09pendingchaos: MUL seemed to already be turned into SHL, so I assumed MUL was expensive
16:09pendingchaos: I don't know much about the nv50 target
16:10karolherbst: yeah, MUL -> SHL is good
16:10karolherbst: but maybe MUL is faster than SHL+ADD?
16:10karolherbst: I really don't know that
16:11karolherbst: the only case where I saw nvidia doing mul ->shladd was with negative immediates
16:11karolherbst: and only power of two ones
16:11karolherbst: * -5 falls back to the 3 XMADs
16:11karolherbst: but I assume nvidia didn't care enough here
16:11nyef: Has anyone tried using a "superoptimizer" for the nvidia ISA and then analyzed the results to try and determine appropriate optimization rules?
16:12karolherbst: pendingchaos: anyway, I would leave those opts a bit more separated overall
16:12karolherbst: 1 patch for MUL -> SHL
16:12karolherbst: 1 patch for MUL -> SHLADD
16:12karolherbst: 1 patch for MUL -> 2 XMADs
16:14karolherbst: it would be nice to find also applications affected in terms of performance by those patches
16:14karolherbst: just that we have numbers to verify it
16:14karolherbst: more or less
16:14karolherbst: I mean, we kind of know that the trivial cases might be fine
16:14karolherbst: but maybe there is a surprise
16:16karolherbst: pendingchaos: patch 2 is also reviewed by me
16:16karolherbst: pendingchaos: you forgot the immediate = true change though
16:17pendingchaos: seems so
16:17pendingchaos: it should be "immediate = false"
16:17pendingchaos: I think I'll try running a few Feral games with the patches
16:17karolherbst: mhh, I think dolphin would be better
16:17karolherbst: those feral games still do mainly fp stuff
16:18karolherbst: so I expect the impact to be smaller
16:18karolherbst: but maybe it is significant
16:18pendingchaos: they seem to affect some of the Feral shaders a decent bit
16:18pendingchaos: Dolphin is strongly affected by them
16:18karolherbst: sure, but those games usually have quite a lot of shaders
16:19karolherbst: pendingchaos: also what do you mean with it should be immediate = false?
16:19pendingchaos: I meant it should be "immediate = true"
16:19karolherbst: ahh, k
16:20karolherbst: anyway, next step is to add the libdrm code... this will be fun
16:20karolherbst: for me that is
17:05pendingchaos: karolherbst: in "it would be nice to find also applications affected in terms of performance by those patches"
17:06pendingchaos: "those patches" = the 4th patch or the xmad patches?
17:06karolherbst: more the 4th patch
17:06karolherbst: we already know that xmad improves performance :)
17:06pendingchaos: I don't think the 4th patch affects Dolphin much
17:07pendingchaos: I haven't tested it, though shader-db seems to show very little change for its shaders
17:42karolherbst: I see
19:44Lyude: nice, so it looks like I can actually trigger that disp failure without needing it to happen really on, which means I can get mmiotraces :D
19:52karolherbst: soo, winsys connected. I am able to read out the pci device id through the kernel module, nice :)
19:52Lyude: ...how does one manually set the chipset variant being used in demmio?
19:52Lyude: it's been a while since I've needed to do this
19:53karolherbst: Lyude: -a I think
19:53karolherbst: or -c or.... -d?
19:53Lyude: demmio: invalid option -- 'a'
19:53karolherbst: I think those are all the variants we have
19:53Lyude: oh, it doesn't even look like demmio has an option for that
19:54Lyude: *adds that*
19:56karolherbst: didn't we have some fancy method to name GPUs in nouveau?
19:56karolherbst: besides that NVXX stuff?
19:56Lyude: you mean like gm107?
19:58karolherbst: actually it is inside the kernel
19:58karolherbst: or maybe not
19:58karolherbst: I thought we had it
19:58karolherbst: intel has this fancy "deviceName = Intel(R) HD Graphics 630 (Kaby Lake GT2)" thing
19:58karolherbst: I want that as well :D
19:59Lyude: add it!
19:59karolherbst: too much work I think
19:59karolherbst: as we aren't always right as well
19:59karolherbst: maybe the string is inside the vbios somewhere?
19:59karolherbst: although I doubt that
19:59Lyude: probably not
19:59karolherbst: oh well, chipset should be enough for now
19:59Lyude: iirc intel autogenerates theirs
20:02karolherbst: yeah, but they don't have naming as crappy as nvidia's
20:02karolherbst: Lyude: sometimes even same vendor _and_ device id have different marketing names
20:03karolherbst: maybe we could just ask hwdb though
20:05karolherbst: mooch: what lspci is using
20:07mooch: ah lol
20:07Lyude: wait, is the vendor ID not always nvidia?
20:07Lyude: erm, *nvidia's VID
20:08mooch: well, not on pre-NV4 anyway lol
20:08mooch: but nouveau doesn't even support that
20:08Lyude: i was going to say yeah
20:08Lyude: i won't feel that bad if I break <nv4 :)
20:08karolherbst: Lyude: yeah, I guess so
20:09karolherbst: isn't there a safe sprintf?
20:09karolherbst: ahh, true
20:10karolherbst: mhh "deviceName = NV311"
20:10karolherbst: ohh no, that is correct
20:10karolherbst: I have to parse it as hex, silly me
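The "NV311" mix-up above is what happens when a chipset id is formatted as decimal instead of hex. A minimal sketch, assuming the id arrives as a plain integer and that the value in question was 0x137 (311 in decimal); `chipset_name` is a hypothetical helper, not a nouveau function:

```python
def chipset_name(chipset):
    """Format a chipset id the way the NVxx names are usually written."""
    return f"NV{chipset:X}"


decimal_name = f"NV{0x137}"      # formats the id as decimal
hex_name = chipset_name(0x137)   # formats the id as hex
```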
20:10Lyude: karolherbst: you aren't adding this to demmio as we speak are you?
20:11Lyude: or is this for the pci name stuff in nouveau
20:11karolherbst: no, I am simply hacking on some vulkan driver
20:11karolherbst: even without the interface changes we might need
20:11karolherbst: there is still quite a lot of boilerplate stuff we have to do anyway
20:12karolherbst: Lyude: https://github.com/karolherbst/mesa/commit/dbb9e3c365ca398eebbe0e4eba6e7a81152d3685
20:13airlied: karolherbst: btw you probably don't need a winsys
20:13karolherbst: airlied: probably we do actually
20:14karolherbst: but I don't really want to talk about the details in public (yet)
20:14airlied: though there may be code sharing and useful abstractions in that area :-P
20:16karolherbst: it seems there is quite a lot of code sharing possible between the vulkan drivers as well, but maybe that's just for the most basic things and it stops now
20:17airlied: karolherbst: I think you've gotten to the limit already :-)
20:17karolherbst: yeah... maybe
20:17airlied: we've been pretty diligent about merging anv/radv stuff as we go
20:17karolherbst: sharing that dispatching stuff is already super helpful
20:46Lyude: skeggsb: hm, nvif_mem_init_map() in nv50_dmac_create() is where we map the bo that we allocated for the pushbuffer that's used for the evo channel, correct?
20:46Lyude: if that is correct, should that be NVIF_MEM_UNCACHED and not NVIF_MEM_COHERENT? (assuming coherent == cache coherent)
20:54Lyude: whoops, nope, that's definitely not it
21:02Lyude: karolherbst: I keep thinking, so: I figured out that we can actually trigger the disp fail bug even when we're not loading the module early, and that we can actually tell what boots are going to fail to init the GPU because this always seems to happen during those boots (but only when nouveau isn't loaded, I don't think I see this at all when nouveau is loaded on boot)
21:03Lyude: that seems to be for the smbus but, is there a chance that's actually coming from the nvidia GPU?
22:33mooch: hey, Lyude, how do you even get a job at redhat, anyway?
22:33mooch: like, for things like virtualization imean
22:33mooch: *i mean
22:33mooch: because last i checked, they literally had NO jobs in that field in my entire country (i live in the us)
22:37Lyude: mooch: poked enough people on fdo and did gsoc with a red hatter
22:39mooch: ah lol
22:39mooch: well, too bad qemu's code is a nightmare for me to work with :c
22:39Lyude: mooch: tbh, it is kind of hard to get reqs for my team
22:39mooch: i've only ever worked on emulators anyway
22:39mooch: i did make an iphone emulator tho :/
22:40Lyude: i mean
22:40Lyude: we have a couple of people who work for us who started off just doing emulators
22:41Lyude: hans de goede for instance started with working on GBA emulators I think
22:41karolherbst: mooch: I am sure you wouldn't really be able to get a job when you want to work on emulators only
22:41mooch: oh? why not
22:41karolherbst: even qemu is mainly used as a kvm hypervisor to do virtual machines, not to emulate ancient hardware
22:42mooch: emulators are close enough to virtualizers :p
22:42mooch: Lyude, oh? da heck do they work on now?
22:42karolherbst: I mean, there needs to be some business value for the stuff you would do. Well in most cases
22:42Lyude: mooch: "yes"
22:42Lyude: (they work on so much stuff at this point I've lost track, but they're working on the hw enablement team here atm)
22:43mooch: karolherbst, i mean, emulators do sell a bunch on the play store lol
22:43Lyude: kind of like me! :)
22:43mooch: Lyude, ah nice!
22:43karolherbst: mooch: well...
22:43karolherbst: that is questionable business anyway
22:43mooch: how so? emulation is completely legal in the us
22:43karolherbst: and I am sure a court can rule those illegal anyway
22:43mooch: this has already been decided in courts
22:43mooch: it's legal
22:43karolherbst: selling them?
22:43mooch: sony v bleem
22:44Lyude: karolherbst: yeah emu is legal
22:44mooch: even having comparison screenshots in your advertising is legal
22:44karolherbst: I know that emus are legal
22:44Lyude: even in Hell Land
22:44karolherbst: I mean, selling them if you include copyrighted material
22:44mooch: bleem was a commercial emulator
22:44karolherbst: which you usually have to do
22:44mooch: karolherbst, nope
22:44mooch: you don't
22:44mooch: none of the play store ones do
22:44karolherbst: I see
22:44karolherbst: well anyway, there isn't much money there
22:44mooch: really? supergnes sold 1 million copies
22:45mooch: and that's a snes emulator
22:45mooch: at $4 a pop, that's 4 million dollars
22:45karolherbst: that is at most 100k in revenue
22:45karolherbst: at most
22:45mooch: uh how? lol
22:45mooch: they sell it for $4 a pop
22:45karolherbst: I meant profit
22:46mooch: oh sorry
22:46karolherbst: anyway, that isn't something which will get you money for a long time
22:46mooch: why not?
22:46mooch: besides, emulation is fairly similar to virtualization
22:47mooch: you still have to emulate the peripherals :p
22:47karolherbst: and you could probably work on that _as_well_
22:47karolherbst: but I hardly think it would be worth a full time job
22:47mooch: virtualization isn't worth a full time job? lol
22:47mooch: how is it not?
22:47karolherbst: emulating hardware for virtualization
22:48karolherbst: because, you don't want to emulate hardware in the first place
22:48karolherbst: so you only emulate everything which doesn't matter really and for everything else you try to be tricky
22:48karolherbst: emulating a GPU is not great ;)
22:49karolherbst: so what you do instead is, to do something like virgl
22:49karolherbst: but that's not emulating hardware anymore
22:49karolherbst: working on virgl would be something maybe
22:50HdkR: Emulating is good for learning about architecture but it being the sole reason to get a jorb is difficult :)
22:50mooch: i mean, i got an interview with raytheon just based on my emulation work :/
22:50mooch: my mom tells me that's an honor
22:50karolherbst: that doesn't mean you would do emulating stuff full time ;)
22:50karolherbst: I wouldn't even go to such companies
22:51karolherbst: because this is military stuff
22:51mooch: yeah, so?
22:51karolherbst: yeah, that's stupid
22:51karolherbst: I would feel terrible
22:51mooch: this was a non-military division
22:51karolherbst: same thing
22:51mooch: hey now, don't forget, i'm borderline psychopathic in some ways
22:51mooch: so i really don't give a shit
22:52karolherbst: I also was invited to watch some military stuff on some airport or something, never followed up on that, so I don't know if it was fake or not
22:52karolherbst: looked real though
22:52mooch: i mean, i don't want to go into combat, mind you, because i'm literally incapable of using a gun safely (tourette's. SEVERE tourette's)
22:52mooch: but still
22:52karolherbst: well, you work for a company whos purpose is to create stuff to kill people
22:53karolherbst: if people are fine with working for such companies, good for them. Doesn't change the fact that they are implicitly assholes
22:53mooch: okay, sure, but still
22:53mooch: i don't exactly back down from calling myself an asshole either :3
22:54karolherbst: as long as you are aware, that's fine
22:54HdkR: mooch: They play Smash brothers religiously there, have fun? :P
22:54mooch: HdkR, at raytheon? lol
22:54karolherbst: the world could be a more terrible place without assholes, who knows
22:54Lyude: Can a core timeout error from nouveau come from something else other than the hardware dying? like, a thread being blocked at just the right place?
22:54mooch: karolherbst, i mean, i'm not exactly MORALLY an asshole, i'm just an asshole to those i deem "stupid"
22:54mooch: ...which is at least 60 million americans
22:55mooch: ...and 50 million russians
22:55Lyude: famous last words
22:55mooch: and various other people
22:55mooch: usually people who can't figure out basic shit about computers
22:55karolherbst: 50% of all people are less intelligent than half of all people :p
22:55mooch: i get that joke
22:55Lyude: but really though, anyone know the answer to the notifier timeout question?
22:56nyef: "On average, half of the population is below average. This also applies to their ability to figure out which half they're in."
22:56mooch: okay, fair enough
22:56karolherbst: nyef: :p
22:56mooch: but i know that iq-wise at least, i'm above average :p
22:57mooch: in most other aspects, i'm below average tho lol
22:57mooch: especially in the "ability to function" category
22:57Lyude: iq is a lie, intelligence is an oversimplification, etc.
22:57mooch: yeah, but still :p
22:57karolherbst: kind of gets boring
22:58mooch: if you can write an emulator, you've got to at least be SORTA intelligent, right?
22:58karolherbst: most people are in some sort intelligent
22:58karolherbst: doesn't have to be math or science stuff
22:58mooch: ehhhh i wouldn't say that
22:58karolherbst: or thinking stuff
22:58mooch: i mean, trump voters are pretty fucking idiotic lol
22:58karolherbst: being able to get along with people is also a form of intelligence
22:58Lyude: that is true at least :)
22:58karolherbst: or having good control over your body
22:59mooch: so is everyone who still supports them
22:59karolherbst: being emotionally stable
22:59mooch: i mean, i'm not even completely sane, so lol
22:59mooch: then again, with my past, who would be?
23:00karolherbst: some people
23:00karolherbst: there would be always somebody, who would
23:00mooch: protip: telling the school or even your parents about how often you get bullied is NEVER going to stop it
23:00karolherbst: well, except for the obvious things
23:00karolherbst: like flying
23:01mooch: karolherbst, i mean, i had an attempt on my life when i was like 10 or 11, by another kid so :/
23:01mooch: at least i'm not in a mental hospital right now like she is!
23:01mooch: at least, last i checked
23:01mooch: that btw, is uh, a good chunk of the reason why i'm not sane
23:02karolherbst: yeah... stuff like that is always difficult to talk about
23:02mooch: not for me but eh
23:02mooch: i told my parents RIGHT AFTER it happened
23:02mooch: yet they didn't call the cops for some reason...
23:02mooch: merely chewed the fuck out of the kid that did that to me
23:03mooch: tho i think my mom might've threatened her or something? i dunno my memory's vague as fuck
23:03Lyude: ugh, not another one. aaaaaaaaaaaaaaaaa. so it looks like core notifiers do only come from the evo channel, which means I'm now seeing another bug with this gm107 trying to bring up two displays
23:03mooch: Lyude, jeez :/
23:03karolherbst: Lyude: :D
23:03Lyude: specifically with mst, but it looks like something that is very likely fixed in mainline kernels
23:03Lyude: mooch: yeah i've been fixing nouveau for like
23:04Lyude: nearing a month now
23:04karolherbst: Lyude: there is always more work :p
23:04mooch: tbh, i don't have NEARLY the knowledge to work on mesa at all
23:04karolherbst: mooch: just start
23:04mooch: or the kernel or anything in the ecosystem really
23:04mooch: i've tried, but i don't know WHERE to start
23:04Lyude: neither did i
23:04karolherbst: mooch: fix bugs
23:04mooch: yeah, which bugs?
23:04karolherbst: bugs which annoy you
23:04mooch: i only have a gm107, mind you
23:04mooch: none of them really annoy me
23:05Lyude: i'm not even sure this was a full year ago; but I came into here asking how I could start contributing to mesa with nouveau :)
23:05nyef: mooch: Find bugs. Preferably ones that affect you. Try to fix them.
23:05mooch: i've never had any bugs on nouveau or linux :/
23:05karolherbst: mooch: I started to fix memory reclocking on my gk106, this was fun
23:05mooch: at least, not recently
23:05mooch: also, i'm REALLY bad at reverse engineering
23:05mooch: as in, my skills are non-existent in that field
23:06nyef: I've been here on and off (mostly off) for years. I think the first time was an issue with an NV17 laptop, when it was reasonably new.
23:06nyef: Later with a connector mapping issue with a PowerPC mac.
23:06karolherbst: mooch: I never learned reverse engineering anywhere
23:06karolherbst: on paper I am not even qualified for that job :D
23:06mooch: i can't even read asm, and i've tried for years
23:07karolherbst: I couldn't either before starting to work on nouveau :p
23:07mooch: i'm not qualified on paper for programming at all, even though i've been doing it for 9 years
23:07mooch: karolherbst, except i've tried various types of asm for years
23:07mooch: can't read any of them
23:07karolherbst: the main issue most people have is that they _think_ they can't and never will be able to do something
23:07karolherbst: this is the main issue for most learning issues
23:07mooch: the problem is that i can't fucking figure out the structure of the asm program
23:08Lyude: who cares, i never properly learned asm technically.
23:08karolherbst: asm has no structure
23:08Lyude: i understand it pretty dang well at this point, seeing as i've written one assembler :P
23:08mooch: well, i kinda need to figure out the structure of the program i'm disassembling tho
23:08Lyude: but it's not a thing you actually need to understand that well to work in this community
23:08mooch: ya know, to figure out dafuq it's doing
23:08karolherbst: mooch: sure, but the asm won't tell you
23:08karolherbst: that is the result of figuring out what the asm does
23:09Lyude: unless you already have a working disassembler, and even then it still might not tell you
23:09mooch: yeah, but i can't figure that out
23:09mooch: because i don't know the structure of the program
23:09Lyude: (...wat asm is this even?)
23:09mooch: i can't figure that out
23:09karolherbst: extracting jumps makes kind of sense
23:09mooch: Lyude, 6502, arm, mips, x86...
23:09mooch: basically any of them
23:09karolherbst: but relative jumps are messy
23:09mooch: can't read em
23:09karolherbst: is only a problem if you want to read them
23:10karolherbst: sometimes it is also a valid way to just ignore something and never look back :p
23:10mooch: yeah, but i need to read them to, say, get my emulation working
23:10mooch: this is also why my 3c501 emulation in 86box never worked
23:10karolherbst: I think there are some fancy llvm disassemblers
23:10nyef: mooch: https://stackingthebricks.com/master-new-skills/ might be of interest to you.
23:10mooch: i couldn't fucking read the goto soup that is the diagnostics program for it
23:10HdkR: IDA/Hopper will show control flow as well which is what I have a hard time tracking in ASM
23:11mooch: nyef, i AM wildly passionate about this tho
23:11mooch: that's the thing, and it's fucking frustrating
23:11mooch: HdkR, this was a 16-bit x86 program
23:11mooch: so hex rays can't decompile it
23:11nyef: mooch: The same skills apply. Try reading the article anyway.
23:11karolherbst: nyef: I kind of got to a level where laziness is my main issue, which is kind of neat, as this is more easily fixable than convincing yourself you aren't able to do something...
23:12HdkR: Time to write an architecture backend for the disassembler :P
23:12mooch: also, this program has a TON of functions, and the compiler that made it couldn't even inline functions
23:12nyef: karolherbst: Amen to that.
23:12mooch: HdkR, uh, you mean the decompiler?
23:12mooch: because i've tried reading asm in ida pro, and i still can't
23:12mooch: seriously, this shit is goto soup
23:12HdkR: mooch: Yea. Writing a plugin for a new architecture that ida/hopper/etc doesn't support is fairly straightforward :)
23:13karolherbst: mooch: any relative jumps in there?
23:13mooch: HdkR, ida's disassembler supports 16-bit x86 just fine
23:13mooch: karolherbst, yeah, but those are easy with ida pro
23:13karolherbst: I think there is some llvm based asm -> C for supported arch things
23:13karolherbst: which might produce more understandable code
23:13mooch: the only problem is the fact that even when decompiled, the code still looks like shit
23:14mooch: decompiled with radare2, mind you
23:14HdkR: mooch: Binary.ninja supports IR lifting for not dealing with ASM directly :)
23:14mooch: ...which can't decompile entire programs at once
23:14karolherbst: mooch: well, chances are that the original code was already bad enough
23:14mooch: well, it was in c, and compiled by lattice c 2.1
23:14mooch: for the 8088 i think
23:15karolherbst: at some point trial&error is the faster route to success
23:15mooch: HdkR, that doesn't support 16-bit x86
23:15mooch: karolherbst, well, i tried that too, but then the diagnostic program was doing nonsensical bullshit
23:15karolherbst: that's good
23:15mooch: and giving nonsensical results
23:15karolherbst: because then you know what not to do ;)
23:15mooch: da heck do you mean?
23:16HdkR: mooch: That's why you can write an architecture plugin. I find it really helps with learning the ASM :)
23:16mooch: this was AFTER fixing shit
23:16karolherbst: if you find enough stuff not to do, then there isn't much left
23:16mooch: HdkR, for binary ninja?
23:16mooch: karolherbst, well, i couldn't figure out WHY it was doing this shit is the problem
23:17karolherbst: sometimes there is no reason. This is kind of a bigger issue with especially older software
23:17karolherbst: where software and hardware were working around issues inside software/hardware
23:17karolherbst: and it doesn't make sense if you look at them as separate things
23:17karolherbst: or two different kind of software
23:17karolherbst: or maybe even inside the same thing to workaround bugs somewhere else
23:18nyef: https://www.tinaja.com/ebooks/tearing_rework.pdf is a bit Apple ][-specific, but it's an interesting approach.
23:18mooch: well, it didn't even make sense according to the hardware docs i had tho
23:18karolherbst: I already saw enough shitty software to just assume that sometimes things are just illogical and won't make sense like ever
23:18mooch: or the linux driver, for that matter
23:18karolherbst: hw docs can be buggy as well ;)
23:18mooch: the old one
23:18nyef: Also, some diagnostic software does really stupid-seeming things purely to test the error response of the hardware.
23:19nyef: Or to make sure that some completely bizarre edge case works properly.
23:19mooch: well, it was failing the test tho
23:19nyef: On real hardware?
23:19mooch: no, in my emulation
23:19mooch: i don't have the real hardware
23:20nyef: Not having real hardware to compare against makes everything harder.
23:20mooch: but this diagnostics program seems to have been updated for YEARS after the card came out
23:20mooch: and it's official
23:20mooch: so i assume it runs on real hardware
23:20karolherbst: mooch: the point was, you not having the hardware
23:20mooch: okay, true
23:21karolherbst: as you can't reverse engineer it
23:21mooch: well, having the hardware wouldn't exactly help much either, as i can't decap shit either (no money)
23:21karolherbst: anyway, if there is some linear progression through the tests you can kind of use regressions as an indicator if you get closer to what the hardware does or not ;)
23:21karolherbst: mooch: but you could use software to speak with the hardware
23:22mooch: okay, true
23:22nyef: mooch: Decapping not necessary. Even having an I/O trace of a /passing/ diagnostic run would help.
23:22karolherbst: I mean, in the end the goal of the emulator is to get as close as possible to the real hardware
23:22karolherbst: _not_ to do something sane or logical
23:22mooch: how would i get an i/o trace of a passing diagnostic on an 8-bit isa bus?
23:22mooch: karolherbst, i know, but still
23:23mooch: i have to understand how the hardware works to make that happen
23:23nyef: Stick the damned thing in a PC/AT or better, run the diagnostic in Ring 3.
23:23karolherbst: to some degree, yes
23:24nyef: With IOPL at 0, any I/O access will trap, and you can emulate the access against the real hardware, and log it at the same time.
23:24nyef: And you can make this happen, even running a real-mode DOS program on a 286 in protected-mode.
23:25nyef: A 386? Even better! Now you have vm86 mode.
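The trap-and-log idea above (forward each I/O access to the real card and record it at the same time) can be modelled in a few lines. This is only a Python simulation of the logging layer, not protected-mode code: `FakeCard` stands in for real hardware, and `TracingBus` plays the role the trap handler would play. Both names are hypothetical.

```python
class FakeCard:
    """Stand-in for the real device: a tiny byte-wide register file."""

    def __init__(self):
        self.regs = {}

    def inb(self, port):
        # Unwritten ports read back as 0xFF in this model.
        return self.regs.get(port, 0xFF)

    def outb(self, port, value):
        self.regs[port] = value & 0xFF


class TracingBus:
    """Forwards every port access to a backend and records it."""

    def __init__(self, backend):
        self.backend = backend
        self.trace = []

    def inb(self, port):
        value = self.backend.inb(port)
        self.trace.append(("in", port, value))
        return value

    def outb(self, port, value):
        self.backend.outb(port, value)
        self.trace.append(("out", port, value & 0xFF))


bus = TracingBus(FakeCard())
bus.outb(0x300, 0x12)
readback = bus.inb(0x300)
```

A trace recorded from a passing run on real hardware could then be replayed against an emulated card to find the first access where the two diverge.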
23:46mooch: ehhh, i can't write asm either tho
23:47mooch: also, i can't afford an old computer like that anyway