00:00airlied: imirkin: yeah does nvidia use the macro language to do it?
00:00airlied: I suppose you could put an atomic counter inc into every compute shader :-P
00:00imirkin: airlied: i haven't figured out how nvidia does it
00:01imirkin: airlied: yeah, real performant :p
00:18karolherbst: 4 more fails with 4.5
00:20karolherbst: ohh right, KHR-GL45.conditional_render_inverted.functional
00:22karolherbst: mslusarz: huh, they moved the VEX tree into valgrind, like finally?
00:28karolherbst: imirkin: a little atomic add on global memory shouldn't hurt that much :p
00:28karolherbst: especially if you look at how heavy compute shaders usually are
00:28imirkin: in every local invocation, with the whole group fighting for that loc?
00:28imirkin: i'd rather not, just for the sake of stupid counters
00:28karolherbst: I don't know if you really have to increase it by the amount of threads though
00:29karolherbst: I think if you invoke a compute shader running at 1k threads, you still only increase the counter by one
00:29karolherbst: but I don't really know the spec that much
00:32imirkin: doing it the macro way seems fairly feasible
00:32imirkin: you increment 1 for each "run" of the compute shader
00:32imirkin: i.e. per lane, not per group
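A minimal sketch of the counting rule discussed above, assuming the per-invocation reading (one increment per lane, not one per work group) is what the query expects. The function name and the tuple layout are made up for illustration and are not taken from any driver.

```python
def expected_compute_invocations(num_groups, local_size):
    """Return work groups dispatched times invocations per work group."""
    gx, gy, gz = num_groups      # dispatch dimensions (number of work groups)
    lx, ly, lz = local_size      # local work group size declared in the shader
    return (gx * gy * gz) * (lx * ly * lz)


# One work group of 1024 invocations counts as 1024 under this reading, not 1.
print(expected_compute_invocations((1, 1, 1), (1024, 1, 1)))
```

Under the per-group reading karolherbst suggested, the same dispatch would count as 1, so the two readings differ by the local work group size.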
00:33karolherbst: I kind of guess nvidia does the same?
00:34mwk: shader invocation stats?
00:34mwk: shouldn't there be a hw counter for that?
00:34karolherbst: it isn't implemented for us :)
00:34karolherbst: but the others are
00:34mwk: I mean, G80 has pipeline stats counter for a few things
00:35mwk: shader invocations being one of them
00:35imirkin: mwk: should, but isn't, afaik
00:35imirkin: for compute
00:35karolherbst: anyway, I'll check if valgrind works now with my stack
00:37karolherbst: yo... fun :( "Valgrind: debuginfo reader: Possibly corrupted debuginfo file."
00:38karolherbst: nice, it works
00:41karolherbst: uhm, except the compute shader doesn't show up?
00:42karolherbst: demmt isn't able to parse the Pascal compute stuff?
00:43imirkin: compute stuff is sometimes on a diff ring
00:43imirkin: it doesn't always get decoded super-properly
00:43karolherbst: mhh, any trick how I can check if the stuff is there?
00:46imirkin: i have a local patch which removes caching of the fifo ring
00:47imirkin: or more accurately - had
01:04karolherbst: imirkin: inside demmt or mmt?
01:12karolherbst: imirkin: you mean that cache_entry stuff?
01:13imirkin: i forget
01:13karolherbst: well, it is the only thing named cache and it is inside pushbuf.c
01:13imirkin: i don't think it's called cache
01:13imirkin: but it caches the fifo ib something-or-other
01:19karolherbst: I guess I will try to fix "KHR-GL32.packed_depth_stencil.blit.depth32f_stencil8" tomorrow
01:20karolherbst: it is the only fail from a GL32 run, so we would have everything fixed there at least
01:26imirkin: is this kepler or maxwell btw?
07:25karolherbst: imirkin: pascal actually
08:10karolherbst: imirkin: the painful part about Kepler1 is that a lot of stuff is failing due to spilling bugs
08:11karolherbst: so unless we fix that, I don't see a near future of kepler1 being able to pass the CTS :(
08:12karolherbst: we have to submit a CTS result for every gen anyway, so I will start with maxwell/pascal (when we are done with all the fails) and go to kepler2 then
08:19karolherbst: imirkin: and we still need to figure out what to do with BGRA4 :(
11:41karolherbst: mupuf: do you think you will find some time to look into https://bugs.freedesktop.org/show_bug.cgi?id=107016 ?
11:42karolherbst: it is basically a vbios where the voltage table says PWM mode, but there is no PWM and no VID entries
11:43mupuf: oh dear, likely won't have time for that... I need something to get the passion going and that does not look like a good candidate
11:44karolherbst: but mhh
11:44karolherbst: there is an interesting i2c device I never saw
11:44mupuf: you think it could be an external PWM controller? That sounds odd
11:44karolherbst: mupuf: "GPIO 11: line 11 tag 0x04 [VID_0] OUT DEF 0 gpio: normal"
11:45karolherbst: does this mean it is an output only GPIO for the VID?
11:45mupuf: well, that's what VIDs have been used for, yes
11:45karolherbst: mupuf: https://eu.mouser.com/new/Infineon-Technologies/irchil/ :D
11:45karolherbst: mupuf: uhm, we still set those, no?
11:46karolherbst: so it should be IN OUT or whatever?
11:46mupuf: it should be OUT only, why should it be IN?
11:46karolherbst: because we read it and write it?
11:46mupuf: you read the register value, not the voltage connected to the pin
11:47karolherbst: uhm, with VID?
11:47karolherbst: not talking about PWM
11:47karolherbst: I mean those VID GPIO pin groups we get
11:47mupuf: yes, this has been done like this before PWM
11:48mupuf: we would have VID_X, with X going up to 8 IIRC
11:48karolherbst: I just thought those would be both, because we read and write to the GPIO
11:49karolherbst: that GPU indeed seems to have an i2c based voltage PWM
11:50karolherbst: mupuf: any way to verify that from an mmiotrace?
11:51karolherbst: oh crap
11:52mupuf: karolherbst: the PWM control here is not to output PWM, it is to set the voltage
11:52mupuf: and it does not take PWM as an input (to replace having many lines as an input), but rather using i2c
11:53karolherbst: this falls out of the voltage table btw:"Mode PWM, acceptable range [712500, 1150000] µV, frequency 1 kHz, base voltage 306250 µV (unk = 16), range 6250 µV"
11:54karolherbst: there are just 3 VID GPIO pins though, which is a bit odd
11:56mupuf: in the datasheet, I only see I2C as an input
11:56karolherbst: I don't know if it is indeed this one though
11:56karolherbst: anyway, there is a lot of I2C stuff going on around PCLOCK compared to other kepler traces
11:57mupuf: yeah, I would assume this is it
11:57mupuf: previous controllers during the tesla time could sometimes be on the i2c bus
11:57mupuf: but that's the first time I see one that mandated i2c
11:58karolherbst: mupuf: https://gist.githubusercontent.com/karolherbst/6409cdfe34430146266145e725e853e4/raw/733276be124ee8e9c37c1343bfae067b41cae04c/gistfile1.txt
11:59karolherbst: sadly that i2c doesn't really show up there :(
12:00karolherbst: docs are helping
12:00karolherbst: mupuf: 0x43 = CHiL CHL8203/8212/8213/8214
12:01mupuf: getting there then :)
12:01karolherbst: one doc for all of them :)
12:01mupuf: but we should definitely not allow reclocking if we cannot set voltage
12:01karolherbst: yeah, we don't :)
12:01karolherbst: that's the bug
12:01karolherbst: but we did before I think
12:03karolherbst: mupuf: "Up to 3 VID select lines for dynamic voltage transitions" seems like that fits :)
12:06karolherbst: anyway, this will be fun without having such hardware
12:07karolherbst: pmoreau: you have such a GPU :D
12:07karolherbst: your nve7
12:07karolherbst: with two of those :O
12:07karolherbst: and RSpliet as well
12:07karolherbst: ohh wait, that's a CHIL_SMBUS0
12:08karolherbst: which is a CHiL CHL8112A/B, CHL8225/8228
12:08karolherbst: there are also CHiL CHL8266, CHL8316
12:08karolherbst: mupuf: your nvd9 also has a "CHiL CHL8266, CHL8316"
12:09karolherbst: *"CHL8112A/B, CHL8225/8228"
12:10RSpliet: Oh I suspect that NVC3 is not among the living any more
12:10RSpliet: Not to mention... Fermi and reclocking ;-)
12:10karolherbst: doesn't matter
12:10karolherbst: it is just for controlling the voltage
12:10karolherbst: I am sure we can read the output via the VIDs or something
12:11karolherbst: anyway, just having a GPU with such a device would be helpful already
12:11RSpliet: If that machine still existed yes
12:11RSpliet: Yeah, I suspect pmoreau's NVE7s might be more useful. Perhaps RH can just get you one though? :-P
12:14karolherbst: RSpliet: well
12:14karolherbst: RSpliet: for that I would really need to know which GPU to buy
12:14karolherbst: like exact model
12:14RSpliet: Oh I know the pain...
12:15karolherbst: but yeah, maybe I could simply get one
12:15karolherbst: it doesn't seem to be _that_ rare
12:16karolherbst: but then we are biosed by having more vbios with issues than GPUs where things just work
12:16RSpliet: With a bit of luck pmoreau can hint you a brand/type to look for. Although I suspect it's not easy finding mid-range Keplers on ebay
12:17RSpliet: "My" (old housemate's) NVC3 IIRC is a laptop GPU, not much use for you to dig up what that was
14:07Armada: imirkin, are push buffers that reference the same render target supposed to be split up? I have a command buffer that contains a buffer clear and a draw call and it looks like those commands are executed at the same time and competing for the rendertarget
14:08Armada: Splitting up the push buffer fixes it for me
14:12Armada: Hmmm, maybe I'm just not swapping the buffers correctly
17:13karolherbst: imirkin: I've extracted that depth stencil test: https://github.com/karolherbst/piglit/commit/3f9e82195786b720c7cd5727f084e63758d2b417
17:14mslusarz: karolherbst: can you upload pascal compute trace somewhere?
17:14karolherbst: mslusarz: yeah
17:20karolherbst: oh seriously, why are all google results providing "file uploaders" just crap?
17:21karolherbst: what has the world become :D
17:22karolherbst: mslusarz: curl https://gist.githubusercontent.com/karolherbst/256d8a472ac29e67e8de35ac9a7f128f/raw/779b26553251f3aa87022d100b750eaf5fe8c93f/gistfile1.txt | base64 -d | xz -d
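The shell pipeline above unpacks a trace that was xz-compressed and then base64-encoded for the gist. A rough Python equivalent of the decode step, as a sketch only: the helper name `decode_trace` is hypothetical, and the sample bytes stand in for a real mmt trace.

```python
import base64
import lzma


def decode_trace(b64_text: bytes) -> bytes:
    """Undo the packing: base64-decode first, then xz-decompress."""
    # b64decode ignores newlines by default, so line-wrapped input is fine.
    return lzma.decompress(base64.b64decode(b64_text))


# Round trip with stand-in data instead of a downloaded gist.
sample = b"stand-in for mmt trace bytes"
packed = base64.encodebytes(lzma.compress(sample))
assert decode_trace(packed) == sample
```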
17:28karolherbst: mslusarz: I don't know exactly where the compute shader should be, but I was running the KHR-GL44.pipeline_statistics_query_tests_ARB.functional_compute_shader_invocations CTS test
17:51mslusarz: can you do 2 more traces for the same command with --mmt-ioctl-create-fuzzer=2 and --mmt-ioctl-create-fuzzer=1?
17:51mslusarz: oh, he left
18:14karolherbst: mslusarz: I am back :)
18:14mslusarz: can you do 2 more traces for the same command with --mmt-ioctl-create-fuzzer=2 and --mmt-ioctl-create-fuzzer=1?
18:15karolherbst: mslusarz: with 1 the test fails, is that to be expected?
18:16mslusarz: IIRC it's possible
18:16karolherbst: well it already fails at context creation
18:16karolherbst: so I don't know how useful that trace would be to you
18:17mslusarz: it might still be useful
18:17karolherbst: here is the fuzzer=2 output (base64 encoded xz): https://gist.githubusercontent.com/karolherbst/a03d211383ad6c96f31c48a795790093/raw/4ae74ff761e1ce94e6d57785e3b187171c3e05ee/gistfile1.txt
18:17karolherbst: fuzzer 1: https://gist.githubusercontent.com/karolherbst/4785ebf15c0da63b1ffd12a2937946f3/raw/266b84d2feae5c607e3ae45795af6c9db439787d/gistfile1.txt
18:27mslusarz: karolherbst: try this patch for mmt: http://pastebin.pl/view/b111384e
18:27mslusarz: ... without --mmt-ioctl-create-fuzzer
18:32karolherbst: mslusarz: doesn't seem to help with the compute shader
18:32mslusarz: can I see new trace?
18:33karolherbst: mslusarz: https://gist.githubusercontent.com/karolherbst/f5459c5a651dea2736853edf91433312/raw/fe6747c5646aa9911036da69cd02b5ddecba52ca/gistfile1.txt
18:33mslusarz: probably something needs to be updated on the demmt side
18:33karolherbst: yeah, might be
18:57mslusarz: I can't find what is wrong with this trace
18:58mslusarz: I pushed the mmt change with 2 additional object types I found in your fuzzer=2 trace
18:59mslusarz: (but those 2 types shouldn't matter for decoding your trace)
19:12mslusarz: there are a lot of unknown methods in GP104_COMPUTE, between 0x31c and 0x41c
19:19mslusarz: methods 0x3a0/0x3a4 and 0x3b0/0x3b4 look like addresses of previously used buffers
19:20karolherbst: I don't know how much we actually use from this inside mesa
19:20mslusarz: anyway, I'll shut up now, that's not my area of expertise
19:20karolherbst: but my guess is we just reuse the maxwell stuff
19:21karolherbst: ohh, actually we use the GP100 and GP104 compute class
19:22karolherbst: mslusarz: anyway, thanks for the mmt patches
22:01karolherbst: imirkin: any ideas for that multisample renderbuffer bug?
22:07karolherbst: imirkin: I think the problem is when we read from a non multisampled framebuffer and draw into a multisampled one
22:07karolherbst: uhm, blit I mean
23:33karolherbst: uhhh, looking at a mmt trace and... uhm yeah... nice
23:36karolherbst: imirkin: does this look like a blitter to you we might have to come up with for the stencil bug? https://gist.githubusercontent.com/karolherbst/ced2012ca68665f12486af34af5ffaed/raw/a73ae199b2f0e96eb6aaca67f6620998ff5df268/gistfile1.txt