09:29 hermier: imirkin: can you confirm that this is the patch you want me to test: https://lists.freedesktop.org/archives/nouveau/2015-March/020421.html
16:26 imirkin: hermier: confirmed
16:49 hermier: imirkin: ok I modified the patch so it works with head, and will give it a try this night ;)
16:51 hermier: because the code was moved, and functions renamed
17:02 booti386: Hello, after switching to the proprietary firmware (for NVE6), it seems to no longer crash
17:04 imirkin: yea
17:04 booti386: However, it still crashes when I try to set the pstate to the highest level (0f: core 1032 MHz memory 5400 MHz)
17:04 karolherbst: booti386: for that you need 4.9
17:05 booti386: Oh.
17:05 booti386: Well :D
17:05 karolherbst: :p
17:05 karolherbst: well hopefully, no idea if the nouveau drm stuff got merged already
17:05 imirkin: booti386: according to skeggsb, your hw doesn't exist.
17:05 karolherbst: imirkin: what is special about his hw?
17:06 imirkin: it works with blob ctxsw fw, and not nouveau one
17:06 karolherbst: I see
17:06 booti386: Oh. This is embarrassing :( OpenGL renderer string: Gallium 0.4 on NVE6
17:06 booti386: Ahaha
17:07 booti386: Oh, and one more question, why does the gpu (nearly?) never recover without a reboot?
17:13 karolherbst: booti386: because nouveau is bad at trying
17:14 booti386: karolherbst: What do you mean?
17:15 karolherbst: nouveau is just bad at recovering from faults in general
17:20 booti386: Why? Is simply resetting the hardware and then initializing it again not enough? (I guess no, but well)
17:20 karolherbst: booti386: what about opengl states?
17:20 booti386: Kill everything! :)
17:20 karolherbst: sure, and then you could also reboot
17:20 karolherbst: cause your entire desktop would vanish anyway
17:21 booti386: Sad
17:21 Tom^: meh recovering from errors is pointless, fix the errors instead and there aint anything to recover from!
17:22 karolherbst: Tom^: you can try :p
17:23 Tom^: =D
17:25 karolherbst: booti386: but yeah, if we manage to recover, we indeed kill every process using the gpu I guess, but this has to be improved and we are currently working on it somewhat
17:25 imirkin: the biggest issue right now is we don't properly reset the GPU
17:25 imirkin: everything else is downstream of that
17:27 booti386: Oh, cool, at least if I could switch to a TTY, it's enough to me :D
17:27 booti386: Oh, ok...
17:28 karolherbst: and sometimes the gpu even falls off the bus
17:28 booti386: And it can be a difficult thing to trace in the blob, I guess
17:28 imirkin: with KHR_robustness you can do things like report lost contexts in GL, etc
17:28 imirkin: and things like the ddx don't really care, and should be able to bring things back up pretty easily
17:29 karolherbst: booti386: well, usually it is nouveau's fault... so we can't really get nvidia into doing the same
17:29 imirkin: well, when the gpu falls off the bus, not a ton you can do. but that's rare.
17:29 imirkin: much more common is that some (important) engine gets wedged
17:29 imirkin: and our basic attempts at unwedging have no effect
17:29 imirkin: ideally we'd just shut nouveau down and start it back up, including running the vbios
17:29 imirkin: but there's no logic for that right now
17:30 booti386: Ok, I see...
17:30 imirkin: i believe in like 75% of cases, that should be enough to bring things back up (number is 87% made up)
17:32 imirkin: i bet it's a month's work for a few people to drastically improve the situation
17:38 mupuf: imirkin: if we do this, we lose all the context
17:38 mupuf: and the X server needs to be restarted
17:38 mupuf: but yeah, we need to have a way to reliably reset anything
17:43 imirkin: mupuf: not necessarily
17:44 imirkin: if it's properly reported to the clients
17:44 imirkin: they can mitigate the issues one way or another
17:44 imirkin: mupuf: and restarting X server gracefully seems like a much better outcome than forcing me to force-reboot and force a raid rebuild
17:45 mupuf: imirkin: yeah, this gl extension ... but no app supports it
17:45 mupuf: arguably, what matters the most is the compositor
17:45 mupuf: that's about it
17:45 mupuf: thinks that we should fix it for wayland and forget about X
17:45 karolherbst: mupuf: and like toolkits using gl
17:46 booti386: I'm all fine if I get only a single TTY back, you know :)
17:46 mupuf: right
17:46 mupuf: this is true
17:46 mupuf: yeah, we should at least succeed at this
17:47 karolherbst: mupuf: although users will be equally pissed
17:47 booti386: Not me :D
17:48 karolherbst: doubt it, cause it doesn't matter if you have to reboot or get your tty if your 5 hour writing stuff is just gone
17:48 karolherbst: or any other work
17:48 karolherbst: or unsaved game or whatever
17:49 booti386: Err... Regular saves? :/
17:49 karolherbst: well, not all games support autosaves
17:49 imirkin: mupuf: not sure what wayland or X have to do with it. the kernel has to report the reset to userspace.
17:50 imirkin: then userspace can deal with it however it wants, that's a problem for later :)
17:50 mupuf: imirkin: hehe
17:50 mupuf: we will need this reporting actually
17:50 imirkin: but without step 1, there are no later steps
17:50 mupuf: we discussed it with Ben and Samuel at XDC
17:50 mupuf: it was to properly handle the -EBUSY command submission issue
17:51 booti386: Or at least let the computer reboot gracefully (because it often ends up hanging the whole system, with only Alt+SysRq+B working to reboot)
17:53 booti386: (To be sure not to damage the filesystems)
17:55 imirkin: right, so there's the "submit too fast" thing as well, but that's separate.
17:58 mupuf: imirkin: not entirely, because the userspace needs to distinguish between: The kernel killed the context and the hw is unhappy
17:59 imirkin: why
17:59 imirkin: (why would the kernel kill the context?)
18:00 imirkin: oh, you mean submitting too fast needs a different code -
18:00 imirkin: yes
18:00 imirkin: but you can do more than just fill in an errno
18:00 imirkin: you could fill in a whole return structure
18:01 mupuf: yep, or you just use the mechanism to report back to the userspace that the context has been killed
18:01 mupuf: it is necessary for robustness anyway
18:01 mupuf: this way, one stone, two birds
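The "whole return structure" idea from the exchange above could look roughly like the sketch below. Everything in it is hypothetical: `SubmitStatus`, `SubmitResult`, `can_retry` and `context_lost` are names made up for illustration and are not part of any nouveau interface. It only shows how one result type could separate the "submitted too fast" case from "the kernel killed the context" and "the hw is unhappy".

```cpp
#include <cstdint>

// Hypothetical status codes, invented for this sketch only.
enum class SubmitStatus : std::uint32_t {
    Ok,             // work was queued
    Throttled,      // submitted too fast (the -EBUSY case)
    ContextKilled,  // the kernel tore this context down
    HwFault,        // the hardware is unhappy
};

// Hypothetical structure returned instead of a bare errno.
struct SubmitResult {
    SubmitStatus status = SubmitStatus::Ok;
};

// Only the throttled case is worth resubmitting unchanged.
inline bool can_retry(const SubmitResult &r) {
    return r.status == SubmitStatus::Throttled;
}

// Both remaining failure cases mean the client's context is gone and
// has to be reported as lost, for example through a robustness API.
inline bool context_lost(const SubmitResult &r) {
    return r.status == SubmitStatus::ContextKilled ||
           r.status == SubmitStatus::HwFault;
}
```

With a plain errno the client could not tell these cases apart without extra queries; with a structure like this the same return path serves both throttling and robustness reporting, which is the "one stone, two birds" point.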
18:04 imirkin: right.
18:04 imirkin: anyways. when i said "that's separate", i just meant it's a separate issue.
18:05 imirkin: making resets possible is a LOT more than just reporting the error to userspace
18:06 mupuf: yep
18:06 mupuf: agreed ;)
18:09 booti386: Would it help to do a mmiotrace while crashing a GPU?
18:09 imirkin: booti386: i think just re-running the vbios should be enough to get things going
18:09 imirkin: however i think that would end up disconnecting nouveau's idea of the state from the card's
18:09 imirkin: so it's a little more subtle
18:10 booti386: But the internal state of nouveau needs to be rebuilt?
18:10 booti386: Yes, ok
18:12 imirkin: and a bunch of things on the card as well
18:12 imirkin: like ... you know, mmu setup
18:12 imirkin: little things like that
18:13 imirkin: and it'd be nice to let userspace continue as if nothing had gone wrong, which means recreating a lot of the [hw] contexts, etc.
18:15 booti386: Yes, of course...
18:17 imirkin: and i have to imagine this will be quite tricky with ben's upcoming DP-MST work
18:18 imirkin: since those "connectors" are a lot more ephemeral
18:21 booti386: Oh, they are sort of "USB hubs" for display
18:25 imirkin: yep.
19:07 mupuf: imirkin: simple, just use the fini and init functions again
19:07 mupuf: that will take care of everything, wouldn't it?
19:08 mupuf: but seriously, we just need to handle pgraph, the rest just never crashes
19:09 karolherbst: mupuf: what stays on the gpu while the machine gets suspended?
19:09 karolherbst: or is suspend + nouveau == bad?
19:10 mupuf: nothing
19:10 karolherbst: ohh, so in theory everything is already there
19:10 mupuf: nouveau swaps out everything from the vram
19:10 karolherbst: ohh right
19:10 orbea: suspend usually works here, black screen once in a while after resuming
19:10 karolherbst: ohh wasn't there this issue that nouveau fails to suspend if there is no place in sysram to suspend?
19:11 orbea: can get past that by killing xorg, but i have to reboot if I want anything with GL to not freeze xorg
19:11 imirkin: mupuf: yes ... very simple ...
19:11 imirkin: mupuf: sounds like you have a patch ready? :p
19:11 mupuf: ;)
19:12 mupuf: just saying that the MMU and such is not a real issue, because of the way nouveau is architected. The thing is that no-one cared enough to add this capability
19:12 mupuf: has not seen a hard hang with nouveau in a while
19:13 karolherbst: true
19:13 mupuf: what else but a page fault can generate that?
19:13 karolherbst: I was lazy today, so I was playing civ 5 the entire day
19:13 karolherbst: not one crash so far
19:13 mupuf: and even then, the pagefault would just kill the process
19:13 orbea: at first it seemed xorg issue since xorg was segfaulting, but a xorg patch fixed that until it came back with a new segfault. Another patch fixed that and now I just get the freeze with no segfault...
19:14 karolherbst: uhh, nearly 9 h cpu time :O
19:14 karolherbst: mupuf: I have no idea, but we had one report here in IRC
19:15 karolherbst: mupuf: where basically vram is bigger than swap and things go south on suspend if ram is too full or something like that
19:16 mupuf: karolherbst: right
19:50 booti386: Oh, so theoretically a crash in nouveau could be solved by hibernate/resume? (if it does not lock itself waiting for events or similar)
19:53 tobijk: karolherbst: stop playing civ5 and make its shaders faster ;-) (beside epic round load times :D)
19:54 imirkin: it probably would lock itself up, if some engine is hung
19:55 booti386: :/
19:55 karolherbst: tobijk: I want to write some documentation on how to RE the vbios though
19:56 karolherbst: tobijk: feel free to look into my patches and grab whatever interests you and just continue to work on those. I still plan to finish these things though
19:56 tobijk: docu is always nice :)
19:56 tobijk: karolherbst: yeah i was on making little adjustments, but imirkin redirected me
19:56 karolherbst: there is one thing though, which is really good
19:57 tobijk: ?
19:57 karolherbst: three things actually
19:57 karolherbst: 1. better pow lowering
19:57 karolherbst: 2. selp
19:57 karolherbst: 3. make CSE smarter
19:58 tobijk: 4. LoadPropagation is missing mov's
19:58 tobijk: to complete the list :)
19:58 karolherbst: tobijk: CSE is silly in cases like mul(neg(a), b) == mul(a, neg(b))
19:58 karolherbst: it won't detect that
19:58 karolherbst: tobijk: https://github.com/karolherbst/mesa/commit/73fb44f494714f5c0ae9fa3cfcba5f9672c5bd54
19:58 karolherbst: no idea if we can merge that into the CSE pass and be such smart there
19:59 karolherbst: or if it is indeed fit for a new pass
19:59 karolherbst: ohh I was even smart enough to detech mul(a, b) == mul(neg(a), neg(b))
20:00 karolherbst: *detect
20:00 tobijk: hehe
20:00 tobijk: not sure if that needs a new pass though
20:00 karolherbst: exactly
20:00 karolherbst: but it helps
20:00 hermier: stupid idea, but isn't it better to transform mul(neg(a), b) to neg(mul(a,b)) .
20:01 hermier: ?
20:01 imirkin: karolherbst: it'd be easier if something normalized things so that the neg always went on the first arg
20:01 karolherbst: tobijk: those passes improve pixmark_piano by around 10%: https://github.com/karolherbst/mesa/commits/pixmark_piano
20:01 imirkin: hermier: neg is a modifier
20:01 karolherbst: imirkin: yeah well, maybe
20:01 karolherbst: will be painful to get it right
20:01 hermier: imirkin: that means ?
20:02 karolherbst: because passes also need to be not silly about it
20:02 imirkin: hermier: it's not like neg is a separate op. it's a modifier on the mul instruction.
20:02 karolherbst: especially if you do mod ^ Modifier(NEG)
20:02 karolherbst: cause you can't anymore
20:02 imirkin: hermier: whereas there's no way to negate the result of a mul
20:02 hermier: imirkin: ok, so it has to do with assembly
20:02 karolherbst: hermier: ISA ;)
20:02 imirkin: it has to do with which operations that are possible and which aren't
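A minimal sketch of the normalization imirkin suggests, on a toy model rather than the real nv50_ir classes: `Src`, `Mul`, `canonicalize` and `same_value` are illustrative names only. Because NEG is a source modifier, the two modifiers of a mul collapse into a single sign (their XOR), which is then always placed on the first source; ordering the operands by value id additionally covers commutativity.

```cpp
#include <tuple>
#include <utility>

// Toy source operand: a value id plus an optional NEG modifier.
struct Src {
    int value;
    bool neg;
};

// Toy multiply with two modified sources.
struct Mul {
    Src a, b;
};

// Canonical form: operands ordered by value id, at most one NEG, and that
// NEG always sits on the first source.
inline Mul canonicalize(Mul m) {
    const bool neg = m.a.neg != m.b.neg;  // XOR: two negations cancel
    if (m.b.value < m.a.value)
        std::swap(m.a.value, m.b.value);
    m.a.neg = neg;
    m.b.neg = false;
    return m;
}

// A CSE pass could compare canonical forms instead of raw operands.
inline bool same_value(const Mul &x, const Mul &y) {
    const Mul cx = canonicalize(x);
    const Mul cy = canonicalize(y);
    return std::tie(cx.a.value, cx.a.neg, cx.b.value, cx.b.neg) ==
           std::tie(cy.a.value, cy.a.neg, cy.b.value, cy.b.neg);
}
```

In this model `mul(neg(a), b)` and `mul(a, neg(b))` compare equal, as do `mul(a, b)` and `mul(neg(a), neg(b))`. The cost karolherbst points out still applies: every later pass that toggles a modifier has to preserve the canonical form.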
20:08 tobijk: karolherbst: nice, 10%, any idea for "real" workloads
20:08 karolherbst: it is real
20:08 karolherbst: it just doesn't need any memory stuff
20:08 karolherbst: so it is pure compute performance
20:10 booti386: Wow
20:10 tobijk: karolherbst: i mean a non-benchmark workload
20:12 tobijk: anyway 10% is impressive as is for piano :)
20:13 karolherbst: yeah, but there are some experiments in it and stuff
20:13 karolherbst: so the "normal" things are like 2-3%
20:13 karolherbst: especially that pow lowering helps
21:49 karolherbst: wip for vbios reing: https://gist.github.com/karolherbst/4341e3c33b85640eaaa56ff69a094713
21:54 mupuf: karolherbst: nice!
21:54 karolherbst: the line breaking is killing me though
21:54 mupuf: but don't forget to say one thing: this does not work on GM2xx
21:54 karolherbst: uhh right
21:54 karolherbst: we can't mention that often enough
21:55 mupuf: hehe
21:55 karolherbst: :D
21:55 mupuf: and you also forgot a very important thing: A flow chart to explain when to do what
21:55 mupuf: and where one should look
21:55 karolherbst: uhh right
21:56 karolherbst: but currently I am in text mode, fancy graphical stuff has to wait
21:56 mupuf: for instance: parameters that should be tunable between cards: vbios!
21:56 karolherbst: mh?
21:56 mupuf: parameters depending on the HW, check out what the strap says!
21:56 mupuf: and be prepared for indexed vbios tables
21:56 karolherbst: never ever looked into that
21:57 mupuf: we should add some examples, for the sake of completeness
21:57 mupuf: but yeah
21:57 mupuf: we need to provide some workflows
21:57 mupuf: userspace tracing, kernelspace tracing
21:57 mupuf: what is in the kernel space, what is in the userspace
21:58 mupuf: this is IMO more important than documenting the flags of nvafakebios :)
21:58 karolherbst: yeah
21:58 karolherbst: I was thinking to write stuff for kernelspace tracing after I am done with the vbios
21:58 mupuf: sounds great
21:59 karolherbst: will also write useful things for optimus systems :)
21:59 mupuf: hehe
21:59 mupuf: but yeah, writing down the general workflows and what each tool allows people to do is definitely something we should have done a while back
22:00 karolherbst: yep
22:00 karolherbst: I will do exactly this :p
22:00 mupuf: way to go!
22:00 karolherbst: something according to this scheme: task, tools, workflows
22:01 mupuf: but don't go too deep in the hand-holding
22:01 karolherbst: so first describing the task, then document useful tools for that, last describe the workflows
22:01 karolherbst: I know ;)
22:01 mupuf: better go for breadth first
22:01 karolherbst: yeah
22:01 karolherbst: no idea if I will write any workflow things for now anyway
22:01 karolherbst: documenting the tools is more important
22:04 mupuf: really?
22:04 mupuf: for me, the most important is: Say how the blob is structured
22:04 mupuf: what part does what
22:04 mupuf: and where to look
22:04 mupuf: then come the tools
22:05 mupuf: say, I am interested in checking something related to tessellation
22:05 mupuf: what tool should I use?
22:05 karolherbst: ahh well, I am already too far with those docs where I don't care about the broad view anymore
22:05 mupuf: how to use is sort of documented already
22:05 mupuf: worst case, the source code is available
22:05 karolherbst: well ...
22:05 karolherbst: we also have the source code of the kernel module, doesn't mean anybody understands what's going on
22:06 karolherbst: and nobody really wants to read the source to find this out
22:06 karolherbst: so most will just say: meh, then I won't help
22:06 mupuf: sure, but, for high level stuff, it is not a good idea
22:06 karolherbst: true
22:06 mupuf: and your documentation is meant for REing the blob, right?
22:06 karolherbst: yeah
22:06 mupuf: so, the source code is not there
22:06 karolherbst: true
22:07 mupuf: better start with our knowledge and then build to the tools
22:07 mupuf: I am not talking about writing a book
22:07 karolherbst: but I start where the (new) dev already knows where to look at
22:07 karolherbst: sort of
22:07 mupuf: which is ... absolutely false
22:07 mupuf: why would this dev know already where to go?
22:07 karolherbst: I meant if you know it is something in the vbios
22:08 karolherbst: then how do you RE it?
22:08 mupuf: sure, but that is the second level
22:08 karolherbst: so, currently it isn't written anyway
22:08 karolherbst: I know it is the second level
22:08 mupuf: first level is to know where to look ;)
22:08 karolherbst: but the first level can be discussed in IRC already
22:08 mupuf: hence what I said about breadth first :)
22:08 karolherbst: I just think the second level is actually a bit more important than the first one
22:08 mupuf: don't go out of your way to document reverse engineering tools
22:09 mupuf: if one cannot check out the source code anyway (or run -h), then they are doomed
22:09 karolherbst: well, for most tools, we can't do -h
22:09 mupuf: sees way more value in documenting what does what
22:09 karolherbst: also, how should anybody know which are the right tools anyway?
22:09 mupuf: and what tools to use
22:09 karolherbst: right
22:10 mupuf: well: Kernel-space == initial set-up, power management
22:10 karolherbst: that's what I try to do, just I think that the broader view isn't as important as the actual tasks
22:10 mupuf: AKA, non-client-dependent code
22:10 karolherbst: if nobody writes stuff for the broader view, I may write that in the end anyway
22:10 mupuf: Tool to use: mmiotrace + demmio
22:11 karolherbst: sure, that would be part of the "Nvidia kernel module" section ;)
22:11 mupuf: User-space == Per-client work, command submission (link to pfifo's documentation)
22:12 mupuf: OpenGL, OpenCL, etc... is implemented there
22:12 mupuf: Tool: valgrind-mmt + demmt
22:13 mupuf: Vbios: Contains the manufacturer-provided specifications. Look there for information related to power management, connectors, clocks, memory, init scripts, gpios and external devices
22:15 mupuf: WARNING: Some data may depend on the value of the strap register, which allows manufacturers to share the same vbios with multiple revisions of a GPU. Information such as the memory information may be indexed based on this value, check out the uses of strap_peek in nvbios for more information
22:15 mupuf: Tool to use: nvbios
22:15 mupuf: INFO: Nouveau developers have access to a repository of user bioses, ask for it if you feel like you need it.
22:16 karolherbst: If you want, you can also write stuff ;)
22:16 mupuf: Microcodes: NVIDIA is known to use a lot of different microcodes. You may use nvdis to de-assemble them
22:16 mupuf: well, I am done for what I know about the first level
22:16 mupuf: just copy paste that in your document ;)
22:17 karolherbst: k
22:17 karolherbst: yeah
22:17 karolherbst: done already
22:17 mupuf: as I said, not a book
22:17 mupuf: just a general overview
22:17 karolherbst: sure
22:17 mupuf: I am sure I forgot many things
22:17 mupuf: especially examples
22:17 karolherbst: well, it will get better over time :p
22:18 mupuf: Final advice: In case of doubt, just ask yourself how you would implement it if you were an engineer at NVIDIA.
22:19 karolherbst: uhhh
22:28 mupuf: karolherbst: I suggest you create wiki pages for this :)
22:28 mupuf: that will make it easier to share
22:28 mupuf: or better, put the documentation as much as possible in the actual tool ;)
22:30 karolherbst: mupuf: well, we did try to discuss what wiki software we want to use, but well
22:31 karolherbst: and no, that isn't documentation which should be in the tools. Documenting parameters, yes, that should be there, but documenting the tasks? no
22:31 karolherbst: we could add doc files to envytools though
22:31 karolherbst: and have everything in there, but well
22:33 mupuf: agreed, hence why I said, as much as possible ;)
22:33 karolherbst: I see
22:37 karolherbst: uhh it is late again, lucky me I don't have to work tomorrow :p
22:41 karolherbst: imirkin: the reason for the one patch for MAD is, that the MAD constantfolding thing won't fold all sources in
22:42 karolherbst: so you end up with a non optimal optimized MAD instruction
22:43 imirkin: karolherbst: what's the *precise* issue
22:43 karolherbst: mad doesn't get opted into a single mov
22:43 imirkin: he was modifying the MAD d, x, 0, y -> MOV d, y situation
22:44 imirkin: which seems like a pretty reasonable opt to make
22:44 karolherbst: no, he wasn't
22:44 karolherbst: well
22:44 imirkin: yeah. he was.
22:44 karolherbst: he was.. but there was an issue
22:44 imirkin: which lies elsewhere.
22:44 imirkin: in LoadPropagation i guess?
22:44 karolherbst: ohh I remember
22:45 karolherbst: the thing ends up with chained muls
22:45 karolherbst: *movs
22:45 imirkin: right, but why
22:45 imirkin: LoadPropagation should take care of it
22:45 karolherbst: mov $r1 $r0; mov $r2 $r1;
22:45 imirkin: but doesn't for some reason?
22:45 karolherbst: nope, it is the fault of that pass
22:45 imirkin: nope
22:45 karolherbst: it is
22:45 imirkin: ConstantFolding has nothing to do with load propagation
22:46 imirkin: if you have mov a, imm; mov b, c; mov d, b; mov e, d; that should be propagated all the way through
22:46 imirkin: that modification was only dealing with a minor and rare sub-case of that situation
22:46 imirkin: so clearly there's something going wrong, but we need to determine what
22:46 imirkin: rather than try to paper over it
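The "propagated all the way through" behaviour imirkin describes can be sketched on a toy straight-line program. This is not the Mesa LoadPropagation pass: `Mov` and `propagate` are illustrative names, registers and immediates are plain strings, and the sketch assumes SSA form (each register is written at most once).

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Toy instruction: "mov dst, src". A source is either a register name or
// an immediate written as a literal such as "0x3f800000".
struct Mov {
    std::string dst, src;
};

// Rewrite every mov so that it reads the original value instead of an
// intermediate copy. Assumes each register is defined only once.
inline std::vector<Mov> propagate(std::vector<Mov> insns) {
    std::unordered_map<std::string, std::string> origin;
    for (Mov &m : insns) {
        auto it = origin.find(m.src);
        if (it != origin.end())
            m.src = it->second;  // skip the intermediate copy
        origin[m.dst] = m.src;   // remember where dst really comes from
    }
    return insns;
}
```

For `mov a, imm; mov b, c; mov d, b; mov e, d` this rewrites the last two movs to read `c` directly, after which `b` and `d` are dead and a later dead-code pass can drop them. The chained `mov $r1 $r0; mov $r2 $r1` case from the log collapses the same way.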
22:47 karolherbst: well the thing is something like this
22:49 karolherbst: https://gist.github.com/karolherbst/c969071995bb2640ba2423c5383f987d
22:50 karolherbst: and that thing wasn't really handled later
22:50 imirkin: right... so ... why not
22:50 imirkin: fix THAT problem
22:50 imirkin: rather than paper over it in mad const folding
22:50 karolherbst: there is a reason, and I know it was a painful one... let me check
22:50 imirkin: normally LoadPropagation does it
22:53 karolherbst: imirkin: http://hastebin.com/uqurorihat.pl
22:54 karolherbst: line 173 and 594
22:54 karolherbst: that is directly after constantfolding (the lower part)
22:54 karolherbst: ohh
22:55 karolherbst: maybe that would be silly, but does loadpropagation handle it well, if the movs have different dTypes?
22:55 imirkin: dunno, maybe not
22:55 imirkin: in any case - the problem is in LoadPropagation
22:56 karolherbst: mhh
22:56 karolherbst: I also don't see why that doesn't work
22:57 tobijk: mh i was just going to look at it
22:57 karolherbst: imirkin: ....
22:57 karolherbst: insnCanLoad fails
22:58 karolherbst: if (reg.data.s32 > 0x7ffff || reg.data.s32 < -0x80000) return false
22:58 karolherbst: maybe this is why?
22:58 karolherbst: mhh
22:58 karolherbst: that would be silly though
22:58 tobijk: karolherbst: http://hastebin.com/letazayatu.pas
22:58 karolherbst: tobijk: well, yeah, you could check why that mov isn't opted away in loadpropagation
22:58 tobijk: for more overview
22:59 tobijk: karolherbst: yeah that is what i wanted to do, but it seems you are doing it already :)
22:59 imirkin: karolherbst: for the mul, sure - is there nothing to remove the intermediate move?
23:00 karolherbst: imirkin: well, load propagation should handle that... let me check
23:00 imirkin: maybe it doesn't....
23:00 imirkin: i forget exactly how that's supposed to work
23:00 tobijk: it doesnt work for add as well
23:00 tobijk: and who knows for what it does :D
23:00 karolherbst: I am sure it fails due to the u32 != f32 thing
23:00 karolherbst: maybe
23:03 karolherbst: imirkin: https://gist.github.com/karolherbst/741afc624d21f8efd9c4361c9a445cf1
23:03 tobijk: karolherbst: actually he saw that example already :)
23:03 karolherbst: not this way
23:04 imirkin: ok, so you fixed it somehow? added a special condition for OP_MOV somewhere?
23:04 karolherbst: no, I just printed the result pre and post loadpropagation
23:05 imirkin: so... loadprop *does* take care of the issue?
23:05 tobijk: karolherbst: which backend?
23:05 tobijk: meaning which version of insnCanLoad
23:05 karolherbst: mhhh
23:05 karolherbst: imirkin: memoryopt does crazy things
23:05 karolherbst: https://gist.github.com/karolherbst/741afc624d21f8efd9c4361c9a445cf1#file-cpostmemory
23:06 karolherbst: but huh...
23:06 karolherbst: tobijk: sure that thing helps to show the issue?
23:06 karolherbst: ...
23:06 imirkin: right, so that makes sense
23:06 imirkin: it's a constraint move
23:06 imirkin: so that the RA can deal with various idiocy
23:07 karolherbst: ohh wait, I see it now
23:07 imirkin: in the RA logic, you should modify the InsertSomethingPass (forget the name)
23:07 imirkin: to special-case if the source is an immed
23:07 imirkin: and THAT will fix your issue
23:07 imirkin: er... maybe
23:08 karolherbst: odd
23:08 karolherbst: loadpropagation should just fold the immed in of the add, no?
23:08 tobijk: it should :)
23:09 karolherbst: gdb then...
23:11 mupuf: hakzsam, karolherbst, imirkin: please do not use reator, it is doing some performance testing
23:11 mupuf: will give you a link to know when it is over
23:11 imirkin: hmmm... right. there should be FADD32I that can be used
23:11 imirkin: and even if there weren't, 1.0 fits nicely under the 20-bit limit
23:11 tobijk: yep
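A small sketch of why the operand type matters for the 20-bit limit mentioned above. The integer rule is the range check quoted earlier in the log. The float rule is an assumption made for this sketch: that a short float immediate keeps only the upper 20 bits of the 32-bit pattern, so the low 12 bits must be zero. The function names are illustrative.

```cpp
#include <cstdint>
#include <cstring>

// Raw bit pattern of a float, obtained without undefined behaviour.
inline std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Integer rule quoted in the log: the value must fit in 20 signed bits.
inline bool fits_short_int(std::int32_t v) {
    return v <= 0x7ffff && v >= -0x80000;
}

// Assumed float rule: only the upper 20 bits are encoded, so the low
// 12 bits of the pattern have to be zero.
inline bool fits_short_float(float f) {
    return (float_bits(f) & 0xfffu) == 0;
}
```

Under these assumptions 1.0 (pattern 0x3f800000) passes the float rule, while the same bits checked as an s32 are far outside the integer range. That is one way a `mov u32` of a float constant could be rejected even though the value "fits nicely".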
23:12 imirkin: mupuf: haven't used reator in quite some time, but thanks for the heads up
23:12 karolherbst: I gdb that thing now, maybe I will find the issue
23:12 mupuf: imirkin: it is now testing the rendering and performance with some benchmarks of samuel's WIP patch
23:13 karolherbst: imirkin: uhhh....
23:14 imirkin: mupuf: which one?
23:14 karolherbst: imirkin: well
23:15 karolherbst: imirkin: should load propagation propagate this? mov u32 %r39 0x3f800000; mov f32 %r40 %r39 ?
23:15 karolherbst: cause it doesn't
23:15 karolherbst: and then it propagates the "mov f32 %r40 %r39 " for the add, so it changes the last source from r40 to r39
23:15 imirkin: probably. the type on a mov is pretty irrelevant.
23:16 karolherbst: imirkin: k
23:17 karolherbst: !targ->insnCanLoad(i, s, ld) is true
23:17 karolherbst: for the source of the mov
23:17 tobijk: so that is fine
23:18 karolherbst: nope
23:18 karolherbst: ! false == true ;)
23:18 karolherbst: imirkin: "if (!(opInfo[i->op].srcFiles[s] & (1 << (int)sf)))" triggers
23:18 imirkin: what's i->op?
23:18 karolherbst: i: mov f32 %r40 %r39
23:19 karolherbst: ld: mov u32 %r39 0x3f800000
23:20 karolherbst: sf is nv50_ir::FILE_IMMEDIATE
23:20 imirkin: right, makes sense
23:21 karolherbst: so?
23:22 tobijk: does line 360 trigger?
23:22 tobijk: 359/360
23:23 karolherbst: tobijk: doesn't matter
23:24 karolherbst: imirkin: k, so how would you fix this issue?
23:24 imirkin: not offhand
23:24 imirkin: i'd have to RTFS
23:26 karolherbst: is there a good reason why insnCanLoad returns false?
23:27 imirkin: no, but probably lots of bad ones
23:27 karolherbst: k
23:27 karolherbst: so insnCanLoad is the issue
23:27 karolherbst: but I don't see why it should return false for MOVs and immediates anyway
23:28 karolherbst: I would expect that to return true in this case
23:28 imirkin: me too... i guess. there could be some subtle reason, but i can't think of one
23:29 karolherbst: uhhh
23:29 karolherbst: opProperties _initProps
23:29 karolherbst: no case for OP_MOV
23:33 karolherbst: yay, profit
23:33 karolherbst: imirkin: "+ { OP_MOV, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1 }," into _initProps?
23:33 karolherbst: or would there be anything wrong with this
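The failing check and the proposed table entry can be modelled as below. This is a toy reconstruction, not the Mesa source: the field name `immdSrcs`, the two-value `DataFile` numbering and the "GPR only unless listed" default are assumptions chosen to mirror the `srcFiles[s] & (1 << sf)` test quoted above.

```cpp
#include <array>
#include <vector>

enum Op { OP_MOV, OP_ADD, OP_COUNT };
enum DataFile { FILE_GPR = 1, FILE_IMMEDIATE = 2 };  // toy numbering

constexpr int kMaxSrc = 3;

// Per-op, per-source bitmask of register files a source may come from.
struct OpInfo {
    unsigned srcFiles[kMaxSrc];
};

// Hypothetical properties entry: bit s set means source s may be an
// immediate.
struct OpProps {
    Op op;
    unsigned immdSrcs;
};

using OpTable = std::array<OpInfo, OP_COUNT>;

// Every source defaults to GPR only; listed ops gain extra files.
inline OpTable build_table(const std::vector<OpProps> &props) {
    OpTable table{};
    for (OpInfo &info : table)
        for (int s = 0; s < kMaxSrc; ++s)
            info.srcFiles[s] = 1u << FILE_GPR;
    for (const OpProps &p : props)
        for (int s = 0; s < kMaxSrc; ++s)
            if (p.immdSrcs & (1u << s))
                table[p.op].srcFiles[s] |= 1u << FILE_IMMEDIATE;
    return table;
}

// Same shape as the check that triggers in the log.
inline bool can_load(const OpTable &table, Op op, int s, DataFile sf) {
    return (table[op].srcFiles[s] & (1u << (int)sf)) != 0;
}
```

In this model an op with no properties entry can never take an immediate source, which matches what karolherbst observes for OP_MOV. Adding an entry with bit 0 set is the analogue of the proposed `{ OP_MOV, ..., 0x1 }` line; whether that is safe in the real backend is exactly what the rest of the conversation questions.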
23:37 tobijk: karolherbst: looks fine
23:37 karolherbst: mhh, something is fishy though
23:38 karolherbst: crash, nice
23:38 karolherbst: uhh, without my change
23:38 karolherbst: ...
23:39 karolherbst: "ERROR: no viable spill candidates left" crap
23:39 karolherbst: have to sleep anyway
23:39 tobijk: heh, which shader?
23:40 karolherbst: shadow_warrior/8964.shader_test
23:43 imirkin: karolherbst: probably folds something it shouldn't
23:44 tobijk: but it is without his change?!
23:46 tobijk: imirkin: yeah something is odd with karol's change, we are hurting regs and do need more insts :/