00:05mogorva: commenting out those lines doesn't change the picture in Civ5
00:52mlankhorst: are the tegra video encoding/decoding drivers open?
00:52mlankhorst: I want to toy some with a re-encoding from a webcam that supports h.264
00:52imirkin: hakzsam_: for the \% i mean... why the \? i.e. why not just "%"?
00:54hakzsam_: I thought it won't display correctly with the '%' only
00:54imirkin: hakzsam_: are you *sure* you can't use fences? please explain why in a future submission... it seems like you're just copying it almost entirely.
00:54hakzsam_: btw, those descriptions are not currently exposed
00:54imirkin: well, if you were to pass those as format strings to printf you'd have to do %%, not \%
00:54imirkin: but there's no reason to pass them in as format strings...
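A minimal C sketch of the escaping point made above. The helper names and the description text are hypothetical, made up for illustration and not taken from the patch under review. It shows that `%%` is only needed when the text itself is the format string, and that a plain `%` is fine when the text is passed as data through `%s`.

```c
#include <stdio.h>

/* Hypothetical helper: the percent sign is part of the format string,
 * so it has to be doubled ("%%") to produce a single '%'. */
static int describe_via_format(char *buf, size_t len)
{
    return snprintf(buf, len, "ratio of busy cycles (%%)");
}

/* Hypothetical helper: the description is plain data passed through "%s",
 * so a lone '%' needs no escaping at all. */
static int describe_via_argument(char *buf, size_t len)
{
    const char *desc = "ratio of busy cycles (%)";
    return snprintf(buf, len, "%s", desc);
}
```

Both helpers write the same text into `buf`. Neither one needs a backslash before the percent sign.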
00:55hakzsam_: I'm not sure but I'll think about it :)
01:01hakzsam_: btw, some other calls to nouveau_bo_map are not checked...
01:06mlankhorst: oh looks like it has gstreamer support at least
01:27imirkin: mogorva: if you get a chance, try out the new fix i put up for bug #90887 -- you don't have to try all the various games it helps, one or two should be enough
01:28imirkin: actually wait... i think it's wrong
01:29mogorva: nouveau has just crashed on me again, almost 3 hours of log is missing from dmesg probably due to improper shutdown :(
01:29imirkin: i'm tired, i'll fix it later
02:56mlankhorst: anyone here experienced with streaming videos?
03:05RSpliet: watching them, sure!... what kind of experience are you looking for? ;-)
03:12mlankhorst: looking to make a video stream accessible over the internet
03:17mlankhorst: I don't have the bw at home, so I want to stream to a remote server, but shutdown the stream if nobody's watching
03:26Guest61914: i have a question, does linux kernel 3.14.43 have good support for nvidia cards with nouveau?
04:08RSpliet: mlankhorst: isn't that a fairly basic Icecast set-up?
04:09RSpliet: see the "relays-on-demand" option
05:05mlankhorst: RSpliet: probably then :)
05:06mlankhorst: no h264?
05:14shakesoda: imirkin: checked the trace @47769 and @47786, both are fine. frame 34 is the first one with bad shading.
05:21shakesoda: imirkin: @565510 looks like the first place where things are screwy
07:14Stag: Hi guys, I am using elementary os freya x86-64... I was using nvidia prop. drivers for the last couple of days, but today it stopped detecting my monitor, and now I am stuck with a horrible resolution
07:14Stag: I tried going back to nouveau using the GUI settings in elementary
07:14Stag: but it didn't help
07:16Stag: is there a way to reconfigure nouveau ?
11:11imirkin: shakesoda: ok, from the looks of it, shade model flat is a *lot* more broken on nv50 than on nvc0. on nvc0 it's broken for funny *really* weird cases, while on nv50 it appears to just be plain broken.
11:13imirkin: i think i see a way to fix it up better than on nvc0 though. we'll see... famous last words.
11:24imirkin: ugh. we really need more people working on nouveau =/ suggestions for recruitment welcome.
11:39pmoreau: imirkin: I should have my whole month of August to work on Nouveau (and find a job/PhD position).
11:42imirkin: whichever comes first ;)
11:43imirkin: pmoreau: gonna try to get your mac to work?
11:43pmoreau: Of course :)
11:44pmoreau: I could try to work on something else on the side, too
11:46imirkin: hehe ok
11:46pmoreau: I was thinking of power/clock gating on Tesla, or OpenCL/Cuda
11:47pmoreau: But maybe some OpenGL fixes would be better?
11:47imirkin: uhh... yeah. opencl ain't gonna happen in a single month's time
11:48imirkin: i dunno... your call. reclocking on tesla's would certainly be welcome... rspliet made good progress on it, probably just needs some finishing touches
11:51pmoreau: I will be working on Nouveau before and after August, just that August should be 100% on Nouveau, and the rest more like 10%-25%.
11:52imirkin: ok, well opencl would certainly be very welcome
11:52imirkin: however it's a substantial effort without a very clear way forward
11:52pmoreau: Why not reclocking. I wanted to continue Roy's work, but he started working on NVA0 support, and he seems to be continuing for G96 and below
11:54pmoreau: Right, but I plan to start using OpenCL and/or Cuda a lot on personal projects.
11:54imirkin: well happy to provide advice and whatnot
11:55imirkin: i started on it a year ago
11:55imirkin: but got stuck
11:55pmoreau: Samuel was telling me that some work has been done already (by curro iirc)
11:55imirkin: and disinterested
11:55imirkin: curro also made a separate and unrelated effort in a different way
11:55imirkin: probably a couple of years before that
11:56pmoreau: Should I pick up the work or start from scratch?
11:56imirkin: your call. imho curro's approach, while it made sense back then, no longer makes as much sense now
11:57imirkin: my approach also has shortcomings
11:57imirkin: curro wanted to make a TGSI backend in llvm
11:57imirkin: apparently he ran into trouble, although he claims that with more time he could have worked it all out -- there were no fundamental issues
11:58imirkin: my approach was to take SPIR 1.0 (which is approximately llvm ir) and transform it directly into nv50 ir
11:58imirkin: my theory being that the opencl state tracker will have to start accepting SPIR eventually anyways, and this will plug nicely into that
11:59imirkin: of course now there's SPIR-du-jour which is totally different
11:59imirkin: i don't know if llvm can do opencl -> spir-v yet
11:59imirkin: but if it can't now, it'll be able to soon
11:59imirkin: so an approach would be to make a SPIR-V -> nv50 ir importer
11:59imirkin: the nice thing about that is that vulkan will eventually use SPIR-V as well
11:59airlied: just set llvm to generate PTX and translate that to nv50 ir :)
12:00imirkin: [or rather, it uses SPIR-V, and eventually we'll want to support vulkan somehow]
12:00airlied: but yeah spir-v might be a better plan
12:00imirkin: yes, something like PTX -> nv50 ir might work too, but i think you underestimate the complexity of PTX
12:00pmoreau: But, was spir-v spec released?
12:00imirkin: the spec is out
12:00imirkin: has been for a while
12:00airlied: I think there are still discussions about how to get spir-v into llvm
12:00imirkin: or at least a preliminary one, i don't think it's been fully finalized
12:00airlied: what form it should take
12:01pmoreau: oh right, we're only waiting for vulkan spec
12:01imirkin: right, and vulkan's of no concern for this effort
12:01imirkin: so basically all the efforts are of the form OpenCL C -> X and X -> nv50 ir
12:01imirkin: and the hope is that OpenCL C -> X is handled by not-you
12:01airlied: there was debate on backend vs converters or something
12:01pmoreau: and I guess, even for Nouveau in general: first get OpenGL 4.5?
12:02airlied: convert AMDGPU to nv50 Ir :-P
12:02imirkin: airlied: it's not THE worst thought :)
12:02imirkin: certainly r600 would be the worst thought... but gcn is moderately similar
12:03imirkin: the annoyance with SPIR and SPIR-V is that they're both SSA forms
12:03imirkin: while nv50 ir has to be non-ssa on input
12:03imirkin: so you have to go out of ssa first... of course since there's an optimizing compiler on the other end, you don't have to worry as much about having moves-galore as a result
12:04pmoreau: I'll begin by reading some stuff about it first: even though I heard about it here and there, roughly know what it is about, it still remains somewhat vague.
12:04imirkin: what is the "it" that you mean?
12:04pmoreau: SPIR-V, NV50 IR at least
12:04imirkin: have you ever worked with compilers?
12:05pmoreau: Does compiling programs count? :D
12:05pmoreau: I did a bit
12:05imirkin: as long as the programs you're compiling compile other programs :p
12:05imirkin: ok, well SSA is all the new rage in compilers
12:05pmoreau: I had a course on compilers, and did implement a really really basic compiler during labs
12:06imirkin: Something Single Assignment
12:06imirkin: so basically each thing is only ever assigned to once
12:06imirkin: this causes problems for code which goes like x = 1; if (foo) x = 2;
12:06pmoreau: x = (foo) ? 2 : 1
12:07imirkin: so it resolves that with x_a = 1; if (foo) x_b = 2; x = phi(x_a, x_b);
12:07pmoreau: what is phi?
12:07imirkin: magic :)
12:07pmoreau: Same function as in the bug report?
12:07imirkin: it's a combining operator which lets you have only a single assignment per thing
12:07imirkin: even when the program wants to assign to the same thing multiple times and you can't undo it
12:08imirkin: sure, my example was simple enough to convert it into a sel
12:08imirkin: but i could come up with more complex examples where you can't do that :p
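A small C sketch of the example just discussed, under the assumption that the branch is simple enough to fold into a select. C has no phi operator, so the merge is written with `?:`, which only works here because both values are cheap constants. The function names are hypothetical.

```c
#include <stdbool.h>

/* Original, non-SSA form: x is assigned twice. */
static int non_ssa(bool foo)
{
    int x = 1;
    if (foo)
        x = 2;
    return x;
}

/* SSA-style form: every name is assigned exactly once.
 * The select stands in for x = phi(x_a, x_b). */
static int ssa_style(bool foo)
{
    const int x_a = 1;
    const int x_b = 2;
    const int x = foo ? x_b : x_a;
    return x;
}
```

Both functions return the same result for either value of `foo`. In a real compiler the phi stays a pseudo-op attached to the join block and is not lowered to a select in general.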
12:08imirkin: nouveau also has the concept of "psi" nodes
12:08imirkin: which are effectively join points for conditional execution
12:08imirkin: er, predicated
12:09imirkin: phi nodes merge basic block outputs into a single value
12:09imirkin: psi nodes work within basic blocks
12:09imirkin: in nouveau, this basically just means that the RA coalesces the diff values
12:09imirkin: and ensures that they all get assigned to the same register
12:10pmoreau: so psi would be to eliminate some rvalues?
12:10imirkin: no, it's again a pseudo-op like phi
12:10imirkin: but instead of having separate bb's you could do like
12:10imirkin: (foo) x_a = 1;
12:10imirkin: (!foo) x_b = 2;
12:10imirkin: x = psi(x_a, x_b)
12:11imirkin: (in nv50 ir parlance, psi == union. but if you read literature, it'll talk about psi.)
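The predicated example above can be modelled in plain C as a rough sketch. The `pred_val` type and the helpers are hypothetical and do not come from nv50 ir. The rule that the last operand whose predicate held wins is an assumption made for this sketch.

```c
#include <stdbool.h>

/* A value together with the predicate that guarded its definition. */
struct pred_val {
    bool valid;
    int value;
};

/* Models a predicated move: "(pred) x = value". */
static struct pred_val pred_mov(bool pred, int value)
{
    struct pred_val v = { pred, value };
    return v;
}

/* Models x = psi(a, b): pick b if its predicate held, otherwise a.
 * Assumption: later operands take priority over earlier ones. */
static int psi(struct pred_val a, struct pred_val b)
{
    return b.valid ? b.value : a.value;
}

/* (foo) x_a = 1; (!foo) x_b = 2; x = psi(x_a, x_b) */
static int psi_example(bool foo)
{
    struct pred_val x_a = pred_mov(foo, 1);
    struct pred_val x_b = pred_mov(!foo, 2);
    return psi(x_a, x_b);
}
```

Unlike the phi case, everything stays in one straight-line block. The register allocator would then place `x_a`, `x_b` and `x` in the same register, so no real merge instruction is needed.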
12:12imirkin: but basically SSA form enables opts to be much simpler
12:12imirkin: since you don't have to worry about multiple things ever assigning a single value
12:12imirkin: and the phi/psi ops act as helpers to the RA to tell it which values have to go into the same registers across bb's
12:13pmoreau: But I guess it's more complicated in the first place as you might have to move code around due to possible dependencies
12:13imirkin: connor had some blog post with a bunch of links to compiler literature if you're interested. you don't really have to though -- the compiler is written and (usually) works
12:13imirkin: well, the code as written is supposed to work. so if you don't move ops around, it's fine :)
12:14imirkin: and nouveau doesn't schedule instructions, even though it should.
12:14imirkin: i suspect it could be a 10-20% gain in perf
12:14pmoreau: That's something!
12:17pmoreau: I'll have a look at the patches enabling SSA for GLSL in Mesa
12:18imirkin: glsl ir doesn't do ssa
12:18imirkin: nir does though
12:19pmoreau: Maybe it wasn't merged, but it existed
12:20imirkin: Wed Jan 22 09:16:47 PST 2014
12:20imirkin: yeah, it wasn't merged :)
12:22imirkin: calim: ping
12:23RSpliet: pmoreau: if you want, you can look into getting all the gating registers "charged"
12:23RSpliet: the registers formerly known as JOE
12:24imirkin: if one of you nv50-owning people wants a relatively simple task
12:24imirkin: i have one right now...
12:25RSpliet: pmoreau: or look into PCIe bus speed switching for NVA3/5/8 :-)
12:25RSpliet: imirkin: I'm a tad busy, but fire away anyway, perhaps I can help when I can't sleep
12:25pmoreau: So many things to do! :)
12:25imirkin: RSpliet: btw, if i wanted to try to get GDDR5 reclocking to work on my nva3, where would i start? [after the initial mmiotrace]
12:26imirkin: as for the nv50 thing, there are reports that c0[$a0] doesn't appear to work in some instructions (notably fadd)
12:26imirkin: i need to figure out which instructions, if any [other than mov] it works on
12:27imirkin: or if it's some entirely different issue causing the problems in the first place
12:27imirkin: [i.e. that it appears to work on fadd]
12:27pmoreau: imirkin: how should we figure that out?
12:27imirkin: see bug https://bugs.freedesktop.org/show_bug.cgi?id=91056
12:27imirkin: write a shader_test that does indirect accesses on a uniform array and try various ops and make sure that the generated shader code has the things we expect
12:33pmoreau: I could give it a try
12:33imirkin: [and make sure that it either passes or fails...]
12:58RSpliet: imirkin: for GDDR5 I think the first (and potentially easiest) step would be figuring out how to generate all the different MR values
12:59pmoreau: Meh... I should first fix my WiFi issues on my dev partition... And maybe clean it altogether and allocate more space to it.
12:59imirkin: RSpliet: and how might one go about performing that step?
12:59RSpliet: imirkin: find GDDR5 datasheets documenting the various bits, see which ones are changed by the blob and which aren't, find matching values in the VBIOS
13:00imirkin: well there's already gddr5 reclocking support in general for kepler
13:00imirkin: presumably i can reuse that (obviously not for finding in vbios)
13:00RSpliet: that'd be of some help yes :-)
13:01imirkin: i'll also pull down some of the nva3 gddr5 vbioses... i think we have at least 3 or 4 of them
13:01RSpliet: and every value required for GDDR and DDR2 has already been found in the VBIOS, just need GDDR5 specific encoding (DLL, ODT, WL, CAS...)
13:02imirkin: i'm suspecting that we need to do a full reclock in order for those cards to render properly
13:02imirkin: that's why i'm asking
13:02RSpliet: okay :-) I gave mine to Ben, so I can't assist in the search much
13:02RSpliet: a lot of the work is VBIOS fuzzing
13:03imirkin: yea np
13:03RSpliet: (unless skeggsb_ sends it back... after I moved to Cambs ;-))
13:04imirkin: and i figure out what values to look for by looking at the mmiotrace and look at what it writes in the various clocking regs?
13:06imirkin: RSpliet: http://hastebin.com/apimuxoqah.coffee -- those values seem reasonable right?
13:07RSpliet: oh heh, most of it is close enough :-)
13:07imirkin: what do i search for to find those scripts?
13:09imirkin: ah yeah, SEQ
13:10RSpliet: yeah, sorry, couldn't recall what strings I used
13:10RSpliet: was already looking ;-)
13:10RSpliet: my advice is to verify and correct one cluster at a time
13:11imirkin: well, none of the values are matching
13:11imirkin: at least not what nvbios is printing
13:11RSpliet: that's impossible, at least the 100220 register should be good, its mapping is trivial
13:12imirkin: ONE value is matching :p
13:13RSpliet: ok, don't expect the VBIOS tool to be 100% correct... there's definitely some tricky bits with tCWL
13:14RSpliet: that depends on the memory type, which we can't properly detect in envytools because it technically relies on a register value not trivially found in the vbios
13:15imirkin: did you try to repro the nvidia script 1:1?
13:18imirkin: hrmph, the demmio seq op names don't match the docs :(
13:23imirkin: so the script i see does a
13:23imirkin: MOV TS OUT[0xa]
13:23imirkin: while your code has
13:23imirkin: ram_wait(fuc, 0x002504, 0x10, 0x10, 20000);
13:23imirkin: those are different things right?
13:23imirkin: [otherwise they match up so far]
13:25RSpliet: they are... the "MOV TS OUT" has been omitted in our scripts
13:26RSpliet: in fact, not implemented in our flavour of sequencer-scripts; we have only 5 opcodes or so
13:27imirkin: ok. so presumably that timestamp stuff wasn't deemed important?
13:27RSpliet: NVIDIA uses this to time the script execution
13:27imirkin: ah ok
13:27RSpliet: we do that in a different way :-)
13:28RSpliet: in the start and end hooks
13:28imirkin: so... my script doesn't have that ram_wait at all then
13:28imirkin: i guess it's prudent to wait on that fifo bit though
13:31RSpliet: I think waiting for 2504 is something they did in pre-NVA3 as what is documented in envytools as FB PAUSE/resume
13:31imirkin: ah ok
13:32RSpliet: can't tell you the exact details of that, I recall experimenting with that back then
13:32imirkin: ramcfg_10_DLLoff -- where do i see that in nvbios output?
13:34RSpliet: I'm not sure if you do; ramcfg.c is better documentation
13:35RSpliet: in the kernel tree
13:42RSpliet: oh, and note that I carried some of the writes into the script that the blob does outside
13:43imirkin: yeah, but you commented on that
13:43imirkin: so it's all good
13:45imirkin: hrm, it writes the dll off thing to 0x1002f0 instead of 1002c4
13:48imirkin: the value written is 0x00100014
13:48imirkin: but to register 1002f0 instead of mr
13:48RSpliet: that implies a disabled DLL yes
13:48RSpliet: so, in principle it doesn't matter to which of the MR registers you do your write
13:48imirkin: the previous value is 0x00100030
13:49imirkin: [of 1002f0]
13:49RSpliet: bits 20:23 select the MR to write (in this case 1, MR1), low bits hold the value
13:50imirkin: oh, interesting
13:50RSpliet: they use a specific system for shadowing reasons
13:50RSpliet: not because the memory controller requires it
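A short C sketch of the field split described above, based only on what was said in this conversation: bits 20..23 select the mode register and the low bits hold the value. The helper names are hypothetical, and treating all of bits 0..19 as the value field is an assumption.

```c
#include <stdint.h>

/* Bits 20..23: which mode register (MR) the write targets. */
static unsigned mr_index(uint32_t reg)
{
    return (reg >> 20) & 0xf;
}

/* Bits 0..19 (assumed width): the value destined for that MR. */
static uint32_t mr_value(uint32_t reg)
{
    return reg & 0xfffff;
}
```

With the two values quoted above, `0x00100014` and `0x00100030` both decode to MR1, with values `0x14` and `0x30` respectively.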
13:50imirkin: ok, so i can continue to just use the gddr3_dll_off thing?
13:51imirkin: i'll refrain from asking how writing the same value on there disables dll...
13:51RSpliet: well, if the value really changed from 14 to 30, it means "disable data termination, raise adr/cmd termination from ZQ/2 to something reserved for future use"
13:52imirkin: well it's also for GDDR5
13:52imirkin: which is the future ;)
13:52RSpliet: oh the other way around actually :-)
14:44pmoreau: imirkin: Ah ah ah, you changed the Trello background. :-) You didn't like the default blue?
14:45imirkin: wanted to try the green
14:45imirkin: not sure i like it
14:45imirkin: i wanted a lighter color but... that wasn't an option
14:45imirkin: i think that trello board isn't working too great anymore... i need to think about how to organize it better
14:46imirkin: open to ideas
14:46pmoreau: It feels like the bugzilla: over-filled! :D
14:47imirkin: maybe i should make high and low priority items
14:47pmoreau: As it seems we have a Nouveau organisation, we could have multiple boards, if it helps.
14:47pmoreau: Oh right, we only have difficulty levels
14:51pmoreau: Not sure if adding the voting power-up would help prioritise some tasks, either from users or devs.
14:51imirkin: those tasks aren't clear to users
14:52imirkin: i know which ones are important and which aren't
14:52imirkin: but the list grows
14:52imirkin: the reality right now is that i'm the only person working on 3d stuff anyways...
14:52imirkin: which is partly why it's in a bit of a disrepair
14:53imirkin: i think i should break out features and optimizations
14:53pmoreau: And concentrate on fixes?
14:54imirkin: not necessarily :p
14:54imirkin: some of those fixes are issues that just aren't very important
14:55pmoreau: You are the one who knows :-)
15:18imirkin: ok, i've updated it a bit
15:59imirkin: skeggsb: this seems wrong: "data = nv_ro32(bios, bit_M.offset + 0x05);" (in M0205.c). seems like that should be ro16, no?
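A minimal C sketch of why the access width matters here, assuming the image is little-endian and that the field at that offset really is 16 bits wide, which is only the suspicion raised above. `read_le16` and `read_le32` are hypothetical stand-ins, not the driver's `nv_ro16`/`nv_ro32`.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit field from a byte image. */
static uint16_t read_le16(const uint8_t *img, size_t off)
{
    return (uint16_t)(img[off] | (img[off + 1] << 8));
}

/* Read a little-endian 32-bit field: same two bytes as above,
 * plus the two bytes that follow them in the upper half. */
static uint32_t read_le32(const uint8_t *img, size_t off)
{
    return (uint32_t)img[off] |
           ((uint32_t)img[off + 1] << 8) |
           ((uint32_t)img[off + 2] << 16) |
           ((uint32_t)img[off + 3] << 24);
}
```

If the field is really 16 bits, a 32-bit read returns the right pointer in its low half and whatever bytes follow the field in its high half, so the result is only correct when those two bytes happen to be zero.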
22:38imirkin: skeggsb: ok, so i'm seeing this on a trace replay on my nv44: http://hastebin.com/ajumaduwir.css
22:38imirkin: skeggsb: the interesting bit is that this happens right after a nouveau_pushbuf_validate
22:39imirkin: i.e. those commands happen after a pushbuf_validate happens
22:42skeggsb: hrm, the *usual* way that'd happen is if there was no object bound to that subchannel.. but, it should be in this case, and that error can happen for other reasons too
22:44imirkin: well, it's the 3d channel, things *ought* to be bound
22:45imirkin: the card was very unhappy too... but putting it through a unbind/bind made it all better
22:45imirkin: i.e. echo 0000:04:00.0 > /sys/bus/pci/drivers/nouveau/unbind