01:03 karolherbst: inglor: check the trello
01:03 karolherbst: or bugzilla
01:04 karolherbst: inglor: there is a lot to RE actually, and looking at the code doesn't really help in general, unless you want to fix a bug, but then you usually have to RE something else anyway
07:33 kloofy: i dunno how good an idea this is, i can't read the code entirely yet.. but it seems like tiling has mmio regs per buffer, on radeon for instance, not sure about nouveau
07:34 kloofy: and those can be queried from shader, having the ringbuffer poll certain regs..and those parameters could encode masks too somehow
07:39 kloofy: it should respond that buffer starting from 123, has tiling max or some slice of this and that, if mem controller can't be queried with partial PC address
07:43 kloofy: but for all of that, this code is again under agd5f's expertise
07:47 kloofy: point being, with him it's probably best to discuss what is the cheapest option to get more than pc counter bits out from the address, and i think it is still possible to push the code for radeon and others too
07:59 kloofy: but again this is all in the interest of those who use those chips, i do not use them yet, so i just tried to offer some theories on how to do stuff
08:19 kloofy: i head off now, my head hurts terribly again, i have overexerted myself and sacrificed my health; if we could manage to have comprehensive discussions with the experts, then there are still many options to push for highly performant paths in code
10:00 kloofy: i mean the nouveau way is the one i have inspected the most, and i know how to make the patch there in detail too, for intel it's also quite easy but i remember the docs only vaguely, it's doable
10:01 kloofy: but for radeon i'd need to have some discussion, as the relevant docs are mostly missing, but devs know those bits
10:04 kloofy: i am sure that if everything is optimized and precompiled for specific hw, things will definitely work 10 times faster for most cards
10:06 kloofy: there is nothing wrong with what vendor devs+hackers have done, since the codegen from high level was needed anyway
10:10 kloofy: well intel is a marvelous contributor to open source, those docs seem to be very good, i think i would deal with all their stuff if i had not been banned there for ages already
10:13 kloofy: it's safe to hack on the drivers there because of the acquisition of altera, but at the same time a rom governed by intel intellectual property should work on those devices with minor modifications without my help too
10:14 kloofy: my view is that i want to have a choice of many drivers to play with on fpgas, if one driver fails the other one works etc.
10:15 kloofy: but i think the safest would be to pick only intel and appropriate arria 10 devices to run and distribute such code roms
10:23 kloofy: currently running a haswell gpu, and it is running very well, it's the only one i still have besides the nv50, because i gave my kepler card to my sister, could get another one as compensation
10:23 kloofy: but... i already have haswell and this one has a good isa too
10:27 kloofy: nv50 yeah i have too, and it's almost clear how to schedule with this card too, it does not have cctll but still has push and pull methods of the cache
11:43 Yoshimo: what is left to do before nouveau exposes OpenGL 4.3 on a maxwellv2 card?
13:21 mupuf: karolherbst: nice tool!
13:24 karolherbst: the table gives me headaches though :/ there are some conditionals in there, which are a bit annoying to track down: https://gist.github.com/karolherbst/7532bf9ebb0298fe3ca69b24802585bc
13:28 karolherbst: I guess I have to crawl some data from every vbios and see which fields are more important and check what those do first
13:34 karolherbst: there should also be a field which tells the driver how many checks to do for the unk11 == 0x30 case, or maybe an "after x MHz, drop to lowest" rule
13:35 karolherbst: and maybe also some flags for the "recover" strategy, like what the driver should do between the downclock_t and one of the others
18:28 karolherbst: any ideas what we might be able to do to reduce power consumption on light desktop workloads? Except for that "simple" engine clock gating