00:05 mogorva: commenting out those lines doesn't change the picture in Civ5
00:52 mlankhorst: are the tegra video encoding/decoding drivers open?
00:52 mlankhorst: I want to toy some with a re-encoding from a webcam that supports h.264
00:52 imirkin: hakzsam_: for the \% i mean... why the \? i.e. why not just "%"?
00:54 hakzsam_: I thought it wouldn't display correctly with the '%' only
00:54 imirkin: hakzsam_: are you *sure* you can't use fences? please explain why in a future submission... it seems like you're just copying it almost entirely.
00:54 hakzsam_: btw, those descriptions are not currently exposed
00:54 imirkin: well, if you were to pass those as format strings to printf you'd have to do %%, not \%
00:54 imirkin: but there's no reason to pass them in as format strings...
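[editor's note] A minimal sketch of the escaping point above. It uses Python's printf-style `%` operator as a stand-in for C's printf, which is an assumption, since the patch under discussion is C. The description text is hypothetical and not taken from that patch.

```python
# Hypothetical description text; not taken from the patch under review.
description = "percentage of cycles stalled, in %"

# Passed as an argument to a fixed "%s" format, the percent sign is plain data
# and needs no escaping at all.
as_argument = "%s" % description

# Used as the format string itself, a literal percent sign must be doubled.
as_format = "percentage of cycles stalled, in %%" % ()

print(as_argument)
print(as_format)
```

Both lines print the same text, which is why storing the descriptions with a bare "%" is fine as long as they are only ever passed as data.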
00:55 hakzsam_: I'm not sure but I'll think about it :)
01:01 hakzsam_: btw, some other calls to nouveau_bo_map are not checked...
01:06 mlankhorst: oh looks like it has gstreamer support at least
01:27 imirkin: mogorva: if you get a chance, try out the new fix i put up for bug #90887 -- you don't have to try all the various games it helps, one or two should be enough
01:28 mogorva: kk
01:28 imirkin: actually wait... i think it's wrong
01:28 imirkin: urgh
01:29 mogorva: nouveau has just crashed on me again, almost 3 hours of log is missing from dmesg probably due to improper shutdown :(
01:29 imirkin: i'm tired, i'll fix it later
01:29 mogorva: kk
02:56 mlankhorst: anyone here experienced with streaming videos?
03:05 RSpliet: watching them, sure!... what kind of experience are you looking for? ;-)
03:12 mlankhorst: looking to make a video stream accessible over the internet
03:17 mlankhorst: I don't have the bw at home, so I want to stream to a remote server, but shutdown the stream if nobody's watching
03:26 Guest61914: i have a question, does linux kernel 3.14.43 have good support for nvidia cards with nouveau?
04:08 RSpliet: mlankhorst: isn't that a fairly basic Icecast set-up?
04:09 RSpliet: see the "relays-on-demand" option
05:05 mlankhorst: RSpliet: probably then :)
05:06 mlankhorst: no h264?
05:14 shakesoda: imirkin: checked the trace @47769 and @47786, both are fine. frame 34 is the first one with bad shading.
05:21 shakesoda: imirkin: @565510 looks like the first place where things are screwy
07:14 Stag: Hi guys, I am using elementary os freya x86-64... I was using nvidia prop. drivers for the last couple of days, but today it stopped detecting my monitor, and now I am stuck with a horrible resolution
07:14 Stag: I tried going back to nouveau using the GUI settings in elementary
07:14 Stag: but it didn't help
07:16 Stag: is there a way to reconfigure nouveau ?
11:11 imirkin: shakesoda: ok, from the looks of it, shade model flat is a *lot* more broken on nv50 than on nvc0. on nvc0 it's broken for funny *really* weird cases, while on nv50 it appears to just be plain broken.
11:13 imirkin: i think i see a way to fix it up better than on nvc0 though. we'll see... famous last words.
11:24 imirkin: ugh. we really need more people working on nouveau =/ suggestions for recruitment welcome.
11:39 pmoreau: imirkin: I should have my whole month of August to work on Nouveau (and find a job/PhD position).
11:42 imirkin: whichever comes first ;)
11:43 pmoreau: :)
11:43 imirkin: pmoreau: gonna try to get your mac to work?
11:43 pmoreau: Of course :)
11:44 pmoreau: I could try to work on something else on the side, too
11:46 imirkin: hehe ok
11:46 pmoreau: I was thinking of power/clock gating on Tesla, or OpenCL/Cuda
11:47 pmoreau: But maybe some OpenGL fixes would be better?
11:47 imirkin: uhh... yeah. opencl ain't gonna happen in a single month's time
11:48 imirkin: i dunno... your call. reclocking on tesla's would certainly be welcome... rspliet made good progress on it, probably just needs some finishing touches
11:51 pmoreau: I will be working on Nouveau before and after August, just that August should be 100% on Nouveau, and the rest more like 10%-25%.
11:52 imirkin: ok, well opencl would certainly be very welcome
11:52 imirkin: however it's a substantial effort without a very clear way forward
11:52 pmoreau: Why not reclocking. I wanted to continue Roy's work, but he started working on NVA0 support, and he seems to be continuing for G96 and below
11:54 pmoreau: Right, but I plan to start using OpenCL and/or Cuda a lot on personal projects.
11:54 imirkin: ok
11:54 imirkin: well happy to provide advice and whatnot
11:55 imirkin: i started on it a year ago
11:55 imirkin: but got stuck
11:55 pmoreau: Samuel was telling me that some work has been done already (by curro iirc)
11:55 imirkin: and disinterested
11:55 imirkin: curro also made a separate and unrelated effort in a different way
11:55 imirkin: probably a couple of years before that
11:55 pmoreau: Ok
11:56 pmoreau: Should I pick up the work or start from scratch?
11:56 imirkin: your call. imho curro's approach, while it made sense back then, no longer makes as much sense now
11:57 imirkin: my approach also has shortcomings
11:57 imirkin: curro wanted to make a TGSI backend in llvm
11:57 imirkin: apparently he ran into trouble, although he claims that with more time he could have worked it all out -- there were no fundamental issues
11:58 imirkin: my approach was to take SPIR 1.0 (which is approximately llvm ir) and transform it directly into nv50 ir
11:58 imirkin: my theory being that the opencl state tracker will have to start accepting SPIR eventually anyways, and this will plug nicely into that
11:59 imirkin: of course now there's SPIR-du-jour which is totally different
11:59 pmoreau: ;)
11:59 imirkin: i don't know if llvm can do opencl -> spir-v yet
11:59 imirkin: but if it can't now, it'll be able to soon
11:59 pmoreau: Right
11:59 imirkin: so an approach would be to make a SPIR-V -> nv50 ir importer
11:59 imirkin: the nice thing about that is that vulkan will eventually use SPIR-V as well
11:59 airlied: just set llvm to generate PTX and translate that to nv50 ir :)
12:00 imirkin: [or rather, it uses SPIR-V, and eventually we'll want to support vulkan somehow]
12:00 airlied: but yeah spir-v might be a better plan
12:00 imirkin: yes, something like PTX -> nv50 ir might work too, but i think you underestimate the complexity of PTX
12:00 pmoreau: But, was spir-v spec released?
12:00 imirkin: the spec is out
12:00 pmoreau: Ok
12:00 imirkin: has been for a while
12:00 airlied: I think there are still discussions about how to get spir-v into llvm
12:00 imirkin: or at least a preliminary one, i don't think it's been fully finalized
12:00 airlied: what form it should take
12:01 pmoreau: oh right, we're only waiting for vulkan spec
12:01 imirkin: right, and vulkan's of no concern for this effort
12:01 imirkin: so basically all the efforts are of the form OpenCL C -> X and X -> nv50 ir
12:01 imirkin: and the hope is that OpenCL C -> X is handled by not-you
12:01 airlied: there was debate on backend vs converters or something
12:01 pmoreau: and I guess, even for Nouveau in general: first get OpenGL 4.5?
12:02 airlied: convert AMDGPU to nv50 Ir :-P
12:02 imirkin: airlied: it's not THE worst thought :)
12:02 imirkin: certainly r600 would be the worst thought... but gcn is moderately similar
12:03 imirkin: the annoyance with SPIR and SPIR-V is that they're both SSA forms
12:03 imirkin: while nv50 ir has to be non-ssa on input
12:03 imirkin: so you have to go out of ssa first... of course since there's an optimizing compiler on the other end, you don't have to worry as much about having moves-galore as a result
12:04 pmoreau: I'll begin by reading some stuff about it first: even though I heard about it here and there, roughly know what it is about, it still remains somewhat vague.
12:04 imirkin: what is the "it" that you mean?
12:04 pmoreau: SPIR-V, NV50 IR at least
12:04 imirkin: have you ever worked with compilers?
12:05 pmoreau: Does compiling programs count? :D
12:05 imirkin: heheh
12:05 pmoreau: I did a bit
12:05 imirkin: as long as the programs you're compiling compile other programs :p
12:05 imirkin: ok, well SSA is all the new rage in compilers
12:05 pmoreau: I had a course on compilers, and did implement a really really basic compiler during labs
12:06 imirkin: Static Single Assignment
12:06 imirkin: so basically each thing is only ever assigned to once
12:06 imirkin: this causes problems for code which goes like x = 1; if (foo) x = 2;
12:06 pmoreau: x = (foo) ? 2 : 1
12:07 imirkin: so it resolves that with x_a = 1; if (foo) x_b = 2; x = phi(x_a, x_b);
12:07 pmoreau: what is phi?
12:07 imirkin: magic :)
12:07 pmoreau: Same function as in the bug report?
12:07 imirkin: it's a combining operator which lets you have only a single assignment per thing
12:07 imirkin: even when the program wants to assign to the same thing multiple times and you can't undo it
12:08 imirkin: sure, my example was simple enough to convert it into a sel
12:08 imirkin: but i could come up with more complex examples where you can't do that :p
12:08 pmoreau: ;)
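[editor's note] A small sketch of the phi example above, written as a toy interpreter rather than real compiler IR. The block names ("entry", "then", "merge") and the helper names are assumptions made for illustration; nothing here is nv50 ir code.

```python
def phi(taken_edge, incoming):
    """Pick the value that arrived along the control-flow edge actually taken."""
    return incoming[taken_edge]


def run(foo):
    # taken_edge and incoming are interpreter bookkeeping, not SSA values.
    # Block "entry": the first and only assignment to x_a.
    x_a = 1
    incoming = {"entry": x_a}
    taken_edge = "entry"
    if foo:
        # Block "then": the first and only assignment to x_b.
        x_b = 2
        incoming["then"] = x_b
        taken_edge = "then"
    # Block "merge": x is assigned exactly once, from the phi.
    x = phi(taken_edge, incoming)
    return x


print(run(False), run(True))
```

Each of x_a, x_b and x has a single definition; the phi only records which definition reaches the merge point.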
12:08 imirkin: nouveau also has the concept of "psi" nodes
12:08 imirkin: which are effectively join points for conditional execution
12:08 imirkin: er, predicated
12:09 imirkin: phi nodes merge basic block outputs into a single value
12:09 imirkin: psi nodes work within basic blocks
12:09 pmoreau: Hum...
12:09 imirkin: in nouveau, this basically just means that the RA coalesces the diff values
12:09 imirkin: and ensures that they all get assigned to the same register
12:10 pmoreau: so psi would be to eliminate some rvalues?
12:10 imirkin: no, it's again a pseudo-op like phi
12:10 imirkin: but instead of having separate bb's you could do like
12:10 imirkin: (foo) x_a = 1;
12:10 imirkin: (!foo) x_b = 2;
12:10 imirkin: x = psi(x_a, x_b)
12:11 imirkin: (in nv50 ir parlance, psi == union. but if you read literature, it'll talk about psi.)
12:11 pmoreau: Ok
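[editor's note] The psi example above can be sketched the same way. This is a toy model of predicated execution inside one basic block, with hypothetical helper names; it assumes that when several predicates hold, the last definition wins.

```python
def psi(*guarded_values):
    """Merge predicated definitions: the last one whose predicate held wins."""
    result = None
    for predicate, value in guarded_values:
        if predicate:
            result = value
    return result


def run_predicated(foo):
    # (foo)  x_a = 1  and  (!foo) x_b = 2, both in the same basic block.
    x_a = (foo, 1)
    x_b = (not foo, 2)
    # x is still assigned exactly once.
    x = psi(x_a, x_b)
    return x


print(run_predicated(True), run_predicated(False))
```

Unlike the phi case there is no branch: both definitions sit in one block and only the predicates decide which one takes effect.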
12:12 imirkin: but basically SSA form enables opts to be much simpler
12:12 imirkin: since you don't have to worry about multiple things ever assigning a single value
12:12 imirkin: and the phi/psi ops act as helpers to the RA to tell it which values have to go into the same registers across bb's
12:13 pmoreau: But I guess it's more complicated in the first place as you might have to move code around due to possible dependencies
12:13 imirkin: connor had some blog post with a bunch of links to compiler literature if you're interested. you don't really have to though -- the compiler is written and (usually) works
12:13 imirkin: well, the code as written is supposed to work. so if you don't move ops around, it's fine :)
12:14 imirkin: and nouveau doesn't schedule instructions, even though it should.
12:14 imirkin: i suspect it could be a 10-20% gain in perf
12:14 pmoreau: That's something!
12:17 pmoreau: I'll have a look at the patches enabling SSA for GLSL in Mesa
12:18 imirkin: glsl ir doesn't do ssa
12:18 imirkin: nir does though
12:19 pmoreau: http://lists.freedesktop.org/archives/mesa-dev/2014-January/052245.html
12:19 pmoreau: Maybe it wasn't merged, but it existed
12:20 imirkin: Wed Jan 22 09:16:47 PST 2014
12:20 imirkin: yeah, it wasn't merged :)
12:22 imirkin: calim: ping
12:23 RSpliet: pmoreau: if you want, you can look into getting all the gating registers "charged"
12:23 RSpliet: the registers formerly known as JOE
12:24 imirkin: if one of you nv50-owning people wants a relatively simple task
12:24 imirkin: i have one right now...
12:25 RSpliet: pmoreau: or look into PCIe bus speed switching for NVA3/5/8 :-)
12:25 RSpliet: imirkin: I'm a tad busy, but fire away anyway, perhaps I can help when I can't sleep
12:25 pmoreau: So many things to do! :)
12:25 imirkin: RSpliet: btw, if i wanted to try to get GDDR5 reclocking to work on my nva3, where would i start? [after the initial mmiotrace]
12:26 imirkin: as for the nv50 thing, there are reports that c0[$a0] doesn't appear to work in some instructions (notably fadd)
12:26 imirkin: i need to figure out which instructions, if any [other than mov] it works on
12:27 imirkin: or if it's some entirely different issue causing the problems in the first place
12:27 imirkin: [i.e. that it appears to work on fadd]
12:27 pmoreau: imirkin: how should we figure that out?
12:27 imirkin: see bug https://bugs.freedesktop.org/show_bug.cgi?id=91056
12:27 imirkin: write a shader_test that does indirect accesses on a uniform array and try various ops and make sure that the generated shader code has the things we expect
12:33 pmoreau: I could give it a try
12:33 imirkin: [and make sure that it either passes or fails...]
12:58 RSpliet: imirkin: for GDDR5 I think the first (and potentially easiest) step would be figuring out how to generate all the different MR values
12:59 pmoreau: Meh... I should first fix my WiFi issues on my dev partition... And maybe clean it altogether and allocate more space to it.
12:59 imirkin: RSpliet: and how might one go about performing that step?
12:59 RSpliet: imirkin: find GDDR5 datasheets documenting the various bits, see which ones are changed by the blob and which aren't, find matching values in the VBIOS
13:00 imirkin: fantastic.
13:00 imirkin: well there's already gddr5 reclocking support in general for kepler
13:00 imirkin: presumably i can reuse that (obviously not for finding in vbios)
13:00 RSpliet: that'd be of some help yes :-)
13:01 imirkin: i'll also pull down some of the nva3 gddr5 vbioses... i think we have at least 3 or 4 of them
13:01 RSpliet: and every value required for GDDR and DDR2 has already been found in the VBIOS, just need GDDR5 specific encoding (DLL, ODT, WL, CAS...)
13:02 imirkin: i'm suspecting that we need to do a full reclock in order for those cards to render properly
13:02 imirkin: that's why i'm asking
13:02 RSpliet: okay :-) I gave mine to Ben, so I can't assist in the search much
13:02 RSpliet: a lot of the work is VBIOS fuzzing
13:03 imirkin: yea np
13:03 RSpliet: (unless skeggsb_ sends it back... after I moved to Cambs ;-))
13:03 RSpliet: *skeggsb
13:04 imirkin: and i figure out what values to look for by looking at the mmiotrace and look at what it writes in the various clocking regs?
13:06 imirkin: RSpliet: http://hastebin.com/apimuxoqah.coffee -- those values seem reasonable right?
13:07 RSpliet: oh heh, most of it is close enough :-)
13:07 imirkin: what do i search for to find those scripts?
13:09 imirkin: ah yeah, SEQ
13:10 RSpliet: yeah, sorry, couldn't recall what strings I used
13:10 RSpliet: was already looking ;-)
13:10 RSpliet: my advice is to verify and correct one cluster at a time
13:11 imirkin: well, none of the values are matching
13:11 imirkin: at least not what nvbios is printing
13:11 RSpliet: that's impossible, at least the 100220 register should be good, its mapping is trivial
13:12 imirkin: fine.
13:12 imirkin: ONE value is matching :p
13:12 imirkin: http://hastebin.com/oritayayad.sm
13:13 RSpliet: ok, don't expect the VBIOS tool to be 100% correct... there's definitely some tricky bits with tCWL
13:14 RSpliet: that depends on the memory type, which we can't properly detect in envytools because it technically relies on a register value not trivially found in the vbios
13:15 imirkin: did you try to repro the nvidia script 1:1?
13:17 RSpliet: mostly
13:18 imirkin: hrmph, the demmio seq op names don't match the docs :(
13:23 imirkin: so the script i see does a
13:23 imirkin: MOV TS OUT[0xa]
13:23 imirkin: while your code has
13:23 imirkin: ram_wait(fuc, 0x002504, 0x10, 0x10, 20000);
13:23 imirkin: those are different things right?
13:23 imirkin: [otherwise they match up so far]
13:25 RSpliet: they are... the "MOV TS OUT" has been omitted in our scripts
13:26 RSpliet: in fact, not implemented in our flavour of sequencer-scripts; we have only 5 opcodes or so
13:27 imirkin: ok. so presumably that timestamp stuff wasn't deemed important?
13:27 RSpliet: NVIDIA uses this to time the script execution
13:27 imirkin: ah ok
13:27 RSpliet: we do that in a different way :-)
13:28 RSpliet: in the start and end hooks
13:28 imirkin: so... my script doesn't have that ram_wait at all then
13:28 imirkin: i guess it's prudent to wait on that fifo bit though
13:31 RSpliet: I think waiting for 2504 is something they did in pre-NVA3 as what is documented in envytools as FB PAUSE/resume
13:31 imirkin: ah ok
13:32 RSpliet: can't tell you the exact details of that, I recall experimenting with that back then
13:32 imirkin: ramcfg_10_DLLoff -- where do i see that in nvbios output?
13:34 RSpliet: I'm not sure if you do; ramcfg.c is better documentation
13:35 RSpliet: in the kernel tree
13:35 imirkin: k
13:42 RSpliet: oh, and note that I carried some of the writes into the script that the blob does outside
13:43 imirkin: yeah, but you commented on that
13:43 imirkin: so it's all good
13:45 imirkin: hrm, it writes the dll off thing to 0x1002f0 instead of 1002c4
13:47 RSpliet: 00100x8x?
13:48 imirkin: ?
13:48 imirkin: the value written is 0x00100014
13:48 imirkin: but to register 1002f0 instead of mr[1]
13:48 RSpliet: that implies a disabled DLL yes
13:48 RSpliet: so, in principle it doesn't matter to which of the MR registers you do your write
13:48 imirkin: the previous value is 0x00100030
13:49 imirkin: [of 1002f0]
13:49 RSpliet: bits 20:23 select the MR to write (in this case 1, MR1), low bits hold the value
13:50 imirkin: oh, interesting
13:50 RSpliet: they use a specific system for shadowing reasons
13:50 RSpliet: not because the memory controller requires it
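[editor's note] A sketch of the field layout RSpliet describes: bits 20:23 select the mode register and the low bits carry the value. The function and constant names are hypothetical, and treating everything below bit 20 as payload is an assumption, since the log only says "low bits".

```python
MR_SELECT_SHIFT = 20
MR_SELECT_MASK = 0xF      # bits 20..23 pick which MR is written
MR_VALUE_MASK = 0xFFFFF   # assumption: everything below bit 20 is the value


def decode_mr_write(word):
    """Split a 32-bit MR write into (mode register index, value)."""
    index = (word >> MR_SELECT_SHIFT) & MR_SELECT_MASK
    value = word & MR_VALUE_MASK
    return index, value


for word in (0x00100014, 0x00100030):
    index, value = decode_mr_write(word)
    print("0x%08x -> MR%d = 0x%x" % (word, index, value))
```

Under that assumption both words from the trace address MR1, regardless of which register offset they were written through.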
13:50 imirkin: ok, so i can continue to just use the gddr3_dll_off thing?
13:51 imirkin: i'll refrain from asking how writing the same value on there disables dll...
13:51 RSpliet: well, if the value really changed from 14 to 30, it means "disable data termination, raise adr/cmd termination from ZQ/2 to something reserved for future use"
13:52 imirkin: well it's also for GDDR5
13:52 imirkin: which is the future ;)
13:52 RSpliet: oh the other way around actually :-)
14:44 pmoreau: imirkin: Ah ah ah, you changed the Trello background. :-) You didn't like the default blue?
14:45 imirkin: wanted to try the green
14:45 imirkin: not sure i like it
14:45 imirkin: i wanted a lighter color but... that wasn't an option
14:45 pmoreau: :/
14:45 imirkin: i think that trello board isn't working too great anymore... i need to think about how to organize it better
14:46 imirkin: open to ideas
14:46 pmoreau: It feels like the bugzilla: over-filled! :D
14:47 imirkin: maybe i should make high and low priority items
14:47 pmoreau: As it seems we have a Nouveau organisation, we could have multiple boards, if it helps.
14:47 pmoreau: Oh right, we only have difficulty levels
14:51 pmoreau: Not sure if adding the voting power-up would help prioritise some tasks, either from users or devs.
14:51 imirkin: those tasks aren't clear to users
14:52 imirkin: i know which ones are important and which aren't
14:52 imirkin: but the list grows
14:52 imirkin: the reality right now is that i'm the only person working on 3d stuff anyways...
14:52 imirkin: which is partly why it's in a bit of a disrepair
14:53 imirkin: i think i should break out features and optimizations
14:53 pmoreau: And concentrate on fixes?
14:53 imirkin: no
14:54 imirkin: not necessarily :p
14:54 imirkin: some of those fixes are issues that just aren't very important
14:55 pmoreau: You are the one who knows :-)
15:18 imirkin: ok, i've updated it a bit
15:59 imirkin: skeggsb: this seems wrong: "data = nv_ro32(bios, bit_M.offset + 0x05);" (in M0205.c). seems like that should be ro16, no?
22:38 imirkin: skeggsb: ok, so i'm seeing this on a trace replay on my nv44: http://hastebin.com/ajumaduwir.css
22:38 imirkin: skeggsb: the interesting bit is that this happens right after a nouveau_pushbuf_validate
22:39 imirkin: i.e. those commands happen after a pushbuf_validate happens
22:39 imirkin: coincidence?
22:42 skeggsb: hrm, the *usual* way that'd happen is if there was no object bound to that subchannel.. but, it should be in this case, and that error can happen for other reasons too
22:44 imirkin: well, it's the 3d channel, things *ought* to be bound
22:45 imirkin: the card was very unhappy too... but putting it through an unbind/bind made it all better
22:45 imirkin: i.e. echo 0000:04:00.0 > /sys/bus/pci/drivers/nouveau/unbind