01:51 KungFuJesus: can I get any sort of documentation on this struct? https://github.com/grate-driver/mesa/blob/master/src/gallium/drivers/nouveau/nv30/nv30_transfer.h
03:05 imirkin: i wouldn't necessarily use grate-driver as your core repo...
03:05 imirkin: https://cgit.freedesktop.org/mesa/mesa
03:05 imirkin: that's the canonical repo
03:05 imirkin: but ... what kind of info are you looking for? it's just a struct used internally inside of the driver
03:42 KungFuJesus: for one, knowing what this cpp variable is that is being switched on
04:18 imirkin: cpp = characters per pixel
04:18 imirkin: i.e. bytes
06:08 KungFuJesus: ah, so bpp
06:08 KungFuJesus: suppose characters resolves the bits/bytes ambiguity
06:09 imirkin: it's a pretty standard way to describe it
06:09 imirkin: cpp and bpp are used interchangeably
06:10 imirkin: but yeah, bpp can mean bits too
06:11 KungFuJesus: sifm?
06:11 KungFuJesus: and texswz? I'm guessing texture swizzle, some sort of reorder?
06:13 imirkin: sifm: https://github.com/envytools/envytools/blob/master/rnndb/graph/nv1_ifm.xml#L27
06:14 imirkin: where do you see texswz specifically?
06:15 imirkin: it probably refers to the NV4_SURFACE_SWZ object
06:15 imirkin: (or rather, the NV40_SURFACE_SWZ object)
06:16 imirkin: (you can look in nv1_ctxobj.xml for definitions of various objects)
06:16 KungFuJesus: in this function: nv30_transfer_rect_blit
06:16 imirkin: oh
06:16 KungFuJesus: a surprising amount of swizzling done in software for something I'd have expected to have dedicated hardware/shaders for
06:17 imirkin: that's this: https://github.com/envytools/envytools/blob/master/rnndb/graph/nv30-40_3d.xml#L561
06:17 imirkin: the "blit" method basically runs a 3d shader
06:18 imirkin: which takes the source as a texture, and destination as a render target
06:19 KungFuJesus: is that xml a command set mapped to the registers on the card?
06:20 imirkin: the commands are fed via a fifo to ... the fifo engine
06:20 imirkin: the fifo engine then dispatches the commands to the proper "engine"
06:21 imirkin: the fifo has 8 (?) subchannels, and you bind a class to each subchannel, and the commands you send are associated with a subchannel
06:21 imirkin: the class indicates to the fifo engine which underlying engine to send the commands to
06:24 KungFuJesus: good lord there are a lot of special purpose registers
06:26 imirkin: not registers
06:26 imirkin: commands
06:26 imirkin: think of them as function calls
06:27 KungFuJesus: oh I see, so more register writes than specific registers when they say "reg32" in the XML
06:27 imirkin: but yeah ... the API is basically a 16-bit address + 32-bit function argument
06:27 imirkin: each 16-bit address is a diff function
06:28 imirkin: and does whatever with the 32-bit argument
06:28 imirkin: most of it is storing up state
06:28 imirkin: for a later "go" command
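[editor's sketch] The "16-bit address + 32-bit argument" model described above can be pictured as a tiny dispatcher. This is a toy illustration only: the method numbers, the `toy_engine` struct and `toy_submit` are invented here and do not correspond to any real NV30 class or rnndb entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Invented method addresses; real ones live in the rnndb XML files. */
enum { TOY_SET_WIDTH = 0x0100, TOY_SET_HEIGHT = 0x0104, TOY_GO = 0x0108 };

struct toy_engine {
    uint32_t width, height;  /* state stored up by earlier commands */
    uint32_t pixels_done;    /* work performed by the last "go" */
};

/* Each 16-bit method address behaves like a function taking one 32-bit argument. */
static void toy_submit(struct toy_engine *e, uint16_t method, uint32_t arg)
{
    assert(e != NULL);
    switch (method) {
    case TOY_SET_WIDTH:  e->width = arg;  break;  /* only records state */
    case TOY_SET_HEIGHT: e->height = arg; break;  /* only records state */
    case TOY_GO:         e->pixels_done = e->width * e->height; break;
    default:             break;                   /* unknown methods ignored in this toy */
    }
}
```

Nothing happens on the two "set" calls beyond remembering the value; only the final `TOY_GO` consumes the accumulated state, which is the pattern described in the chat.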
06:29 KungFuJesus: so TEX_FORMAT looks somewhat interesting/relevant
06:33 KungFuJesus: nv40_tex_format might be a good place to look
06:39 KungFuJesus: ok, so in nv30_transfer_rect, this is where you talked about the transfer methods, I'm guessing
06:40 KungFuJesus: heh, never seen a static struct quite declared that way before
06:41 KungFuJesus: cpu is obviously basically crappy programmed io, and I'm guessing these are in order of priority
06:42 KungFuJesus: what are m2mf, sifm, and blit? Obviously those are all probably some form of DMA
06:45 KungFuJesus: guess I gotta figure out which of these work properly and which are creating issues, first
06:46 KungFuJesus: the fact that many of these textures are scaled and filtered means that some of these methods are probably not possible from the getgo
06:46 KungFuJesus: the pure cpu based one won't work if the textures are being scaled
06:48 KungFuJesus: I've relied on Gentoo to do the building for me to this point, which specific DSO am I trying to rebuild and test?
06:56 imirkin: cpu is writing to the mmap'd buffer
06:57 imirkin: m2mf and sifm are different command classes
06:57 imirkin: blit is using the 3d engine to perform a copy by texturing the source onto the destination
06:57 imirkin: for each method there's a predicate which checks to make sure that the copy is of a type that the method can handle
06:57 imirkin: the so is nouveau_dri.so
06:57 imirkin: part of the 'mesa' package
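[editor's sketch] The "ordered list of methods, each with a predicate" idea mentioned above could look roughly like this. It is a simplified assumption, not the contents of nv30_transfer.c: the `toy_rect` struct, the method names and the predicates themselves are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified description of one side of a copy. */
struct toy_rect { int w, h; };

struct toy_method {
    const char *name;
    /* Predicate: can this method handle the requested copy? */
    bool (*possible)(const struct toy_rect *src, const struct toy_rect *dst);
};

static bool toy_same_size(const struct toy_rect *s, const struct toy_rect *d)
{
    return s->w == d->w && s->h == d->h;   /* no scaling allowed */
}

static bool toy_always(const struct toy_rect *s, const struct toy_rect *d)
{
    (void)s; (void)d;
    return true;                           /* can also scale */
}

/* Tried in order; the first method whose predicate passes wins. */
static const struct toy_method toy_methods[] = {
    { "plain_copy", toy_same_size },
    { "scaler",     toy_always    },
};

/* Returns the index of the chosen method, or -1 if none applies. */
static int toy_pick(const struct toy_rect *src, const struct toy_rect *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < sizeof(toy_methods) / sizeof(toy_methods[0]); i++)
        if (toy_methods[i].possible(src, dst))
            return (int)i;
    return -1;
}
```

With a table like this, disabling one method for debugging amounts to removing its entry or making its predicate return false, so the next entry in the list gets tried instead.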
07:00 KungFuJesus: eapply_user might just be easier for me to test
07:00 imirkin: not really
07:00 imirkin: you want a fast compile + test cycle
07:00 KungFuJesus: rather than learn the specifics of this meson bootstrap
07:00 imirkin: what i normally do is install mesa to a side directory
07:00 imirkin: and then run the application with LD_LIBRARY_PATH=/path/to/dir foo-program
07:00 KungFuJesus: ah yeah good point, waiting for rebuilds will be slow
07:00 imirkin: use autoconf
07:00 imirkin: it's easy and it works
07:01 KungFuJesus: heh, I avoid autotools with a passion, but I guess I don't have to worry much about the m4 macros and such
07:06 KungFuJesus: hmm, configure is complaining about missing libdrm with some radeon crap
07:10 imirkin: ./configure --with-gallium-drivers=nouveau
07:10 imirkin: ./configure --with-gallium-drivers=nouveau --with-dri-drivers=
07:10 imirkin: that should do the trick.
07:10 KungFuJesus: ah, wasn't doing the last bit
07:13 KungFuJesus: textures need to be a modulus size of 64 for blit to work, eh?
07:13 KungFuJesus: even modulus of 64 that is
07:18 KungFuJesus: one thing I don't think I tried - what's the magic environment variable to force mesa to use llvmpipe?
07:19 imirkin: LIBGL_ALWAYS_SOFTWARE=1
07:20 imirkin: iirc it should work mostly ok on power be
07:21 KungFuJesus: yeah, it's correct when I do that. No surprises
07:21 KungFuJesus: horrendously slow, though
07:27 KungFuJesus: nothing fruitful yet, just CPU and blit aren't enough to get any textures other than the console that pulls down. Just disabling m2mf gets us back to the problem
07:27 KungFuJesus: I guess we can say sifm is probably the issue
07:31 KungFuJesus: in transfer_rect_sifm, what is the "destination" in this context?
07:31 KungFuJesus: texture memory?
07:32 KungFuJesus: code seems to allow there to be different color formats for the src and destination
07:34 KungFuJesus: also why does it seem there are two identical commands sent to the card a couple of times in this function?
07:43 imirkin: how do you mean?
07:43 imirkin: the pitch vs non-pitch thing?
07:44 imirkin: the source and dest are textures, yes
07:44 imirkin: that's the whole point of all these functions, to copy a rectangle from one texture to another
07:44 imirkin: aka transfer :)
07:45 KungFuJesus: Heh, gotta grep a lot to unroll all these macros
07:45 KungFuJesus: what does it mean if the destination pitch is 0?
07:46 KungFuJesus: pitch is usually a row size, isn't it? Usually that's what we mean by it in image processing
07:46 KungFuJesus: (all so far have had a destination pitch of zero)
07:47 imirkin: textures can be laid out in one of two ways
07:47 imirkin: linear, and swizzled
07:47 imirkin: linear means that they're laid out the way you think -- pixel after pixel
07:47 imirkin: and each row is "pitch" away from each other row
07:48 imirkin: (the pitch must be a multiple of ... something, i think)
07:48 imirkin: in the swizzled layout, the pixels are laid out via ... magic. it doesn't matter how, but it's not linear
07:48 imirkin: this is done for POT textures
07:48 KungFuJesus: hmm, these cards must not handle non power of 2 sized textures
07:49 KungFuJesus: seeing as the command that pushes the format and texture size in uses log base 2 on the width and height and OR's it in to the same word
07:49 imirkin: nv4x definitely does
07:49 imirkin: that path must be for the non-pitch case
07:49 KungFuJesus: yes it is
07:50 imirkin: in that case, it's always POT
07:50 imirkin: (POT = power of two, btw)
07:50 KungFuJesus: yeah, figured
07:50 imirkin: NPOT = non-power of two :)
07:51 KungFuJesus: ok so pitch==0 is swizzled?
07:51 imirkin: yes
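[editor's sketch] For the linear case, "each row is pitch away from each other row" translates into a one-line address computation. The second helper illustrates why a format word that stores log2 of the width and height only makes sense for power-of-two sizes. The bit positions in `toy_pack_swz_format` are assumptions made for this example, not the real register layout, and the swizzled pixel order itself is deliberately not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of pixel (x, y) in a linear surface.
 * pitch = bytes between the starts of consecutive rows, cpp = bytes per pixel. */
static uint32_t linear_offset(uint32_t x, uint32_t y, uint32_t pitch, uint32_t cpp)
{
    assert(pitch >= cpp);
    return y * pitch + x * cpp;
}

/* log2 of a power-of-two value; any other input is rejected. */
static uint32_t toy_log2(uint32_t v)
{
    assert(v != 0 && (v & (v - 1)) == 0);
    uint32_t n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Hypothetical packing: format in the low bits, log2(width) at bit 16,
 * log2(height) at bit 24. Only power-of-two sizes can be expressed. */
static uint32_t toy_pack_swz_format(uint32_t fmt, uint32_t w, uint32_t h)
{
    return fmt | (toy_log2(w) << 16) | (toy_log2(h) << 24);
}
```

A 100-pixel-wide texture has no exact log2, which is consistent with the swizzled (pitch == 0) path being reserved for POT textures.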
07:52 KungFuJesus: that is the troublesome code path
07:52 imirkin: wait, so is it definitely SIFM?
07:52 KungFuJesus: well, haven't established that the linear path is bug free either, but the current branch being taken is swizzle
07:53 KungFuJesus: yes
07:53 imirkin: heh
07:53 imirkin: well, i dunno if you can make that assertion
07:53 imirkin: just that there's a swizzled texture that gets messed up
07:53 imirkin: check out the code in nv30_miptree.c
07:53 imirkin: in nv30_miptree_create
07:53 imirkin: see that giant if
07:54 imirkin: which gets hit for, among other things, NPOT sizes
07:54 imirkin: force it to always get it -- just add a || true in there
07:55 KungFuJesus: bug still persists
07:56 imirkin: as i expected :)
07:56 imirkin: but you'll note that the non-pitch path no longer gets hit
07:58 KungFuJesus: nope, I'm still hitting it
08:01 KungFuJesus: actually I can't tell anymore, my fprintfs aren't showing up anymore for some reason
08:07 KungFuJesus: oh wait, it didn't get to that bit unless I went into the game
08:17 KungFuJesus: What's somewhat weird is that the color formats chosen here are implicitly endian independent. It's as if there's a command missing that says "don't swap the bytes"
08:18 KungFuJesus: what is dst->bo? Is that byte order?
08:18 imirkin: buffer object
08:19 imirkin: and yes, there's some amount of implicit swapping
08:19 imirkin: some of it is good
08:19 imirkin: some of it is bad :)
08:21 KungFuJesus: for the Y8 formats it's somewhat obvious that you'd want it off here
08:21 imirkin: that never gets used
08:21 imirkin: i think in practice, only the RGBA8 formats get used for your situation
08:24 KungFuJesus: I meant for the SIFM method
08:28 imirkin: i think there's an extra method that tells it to not byteswap even though it would normally (due to the overall setting of "big endian platform")
08:28 imirkin: but i'm not sure where it is
08:35 KungFuJesus: as in you think it exists somewhere in the command set and hasn't been discovered, or you've found it once before?
08:42 KungFuJesus: you'd think it'd be a flag in SRCCOPY or something
08:43 imirkin: i'm 99.9% sure it exists
08:43 imirkin: i've seen references to it somewhere
08:48 KungFuJesus: grepping that repo you sent me, I found "BIG_ENDIAN" mentioned a few times in memory/nv10_pfb.xml, but it seems like that's just enumeration stuff
08:49 KungFuJesus: <bitfield pos="31" name="ENDIAN" variants="NV1A-">
08:49 KungFuJesus: <value value="0" name="LITTLE" />
08:49 KungFuJesus: <value value="1" name="BIG" />
08:49 KungFuJesus: </bitfield>
08:49 KungFuJesus: that's in DMA_FETCH
08:50 KungFuJesus: fifo/nv4_pfifo.xml
08:53 imirkin: yeah, we set that to BIG
08:53 imirkin: KungFuJesus: https://github.com/skeggsb/nouveau/blob/master/drm/nouveau/nvkm/engine/fifo/dmanv04.c#L212
08:57 imirkin: that's the thing that flips the endianness of the commands in the pushbuf
08:57 imirkin: (that's all the PUSH_DATA stuff)
08:57 imirkin: so that the cpu can write it out natively
08:58 imirkin: and then the fifo engine does a byteswap before interpreting the values
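[editor's sketch] The effect of a 32-bit byteswap can be shown in a few lines. This is only a model of what "swap before interpreting" means for a word. Whether texture data goes through any such swap is exactly the open question in this conversation, so the second helper is a "what if" illustration, not a statement about the hardware. Both function names are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t toy_bswap32(uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) |
           (v << 24);
}

/* What a per-word swap would do to one pixel stored as four bytes in memory:
 * the channel order is reversed, e.g. R,G,B,A becomes A,B,G,R. */
static void toy_swap_pixel_bytes(uint8_t px[4])
{
    assert(px != NULL);
    uint8_t t0 = px[0], t1 = px[1];
    px[0] = px[3];
    px[1] = px[2];
    px[2] = t1;
    px[3] = t0;
}
```

Swapping is harmless for command words because writer and reader agree on it. Applied once too often (or once too few) to byte-ordered pixel data, it would permute the color channels.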
08:59 KungFuJesus: does it apply to all DMA, including textures?
08:59 KungFuJesus: what's a good thing for me to try here just to see if undoing implicit endianness will fix the issue?
09:00 KungFuJesus: man there's so little decent documentation on this
09:05 imirkin: this is where i get a little hazy
09:05 imirkin: i'm pretty sure MMIO accesses are byte-swapped
09:05 imirkin: but ... what about like ... cpu accesses? probably not
09:06 imirkin: i wonder if nv30_transfer_rect_cpu gets hit a lot
09:06 imirkin: it will never do a byteswap
09:06 KungFuJesus: seemed to hardly at all
09:06 imirkin: although it's copying from a bo to another bo
09:06 imirkin: whereas we have to ensure that the cpu data -> etc gets treated properly
09:06 imirkin: it's a giant mess =/
09:07 imirkin: one of my bits of advice is to load up a trace in qapitrace
09:07 imirkin: and then look at what the bound textures look like in there
09:09 KungFuJesus: I mean I'm 100% certain the in memory textures will be linear RGBA
09:10 imirkin: yeah
09:11 imirkin: but if you pull it up in qapitrace
09:11 imirkin: perhaps that will make it easier to track down wtf is going on
09:12 KungFuJesus: actually I don't even see the textures in qapitrace
09:12 KungFuJesus: just the framebuffer
09:12 KungFuJesus: there's one texture, and it seems transparent?
09:12 imirkin: try clicking on "opaque"
09:13 KungFuJesus: it's one texture, a 32x32, and it seems to be just a black square
09:15 KungFuJesus: for a given frame, I see several glBindTexture calls, though
09:16 KungFuJesus: ah, maybe I'm not doing lookup state in the proper spot in the frame
09:17 KungFuJesus: tried it later in the frame, 50 textures down
09:17 KungFuJesus: I think X11 is hung
09:19 imirkin: =/
09:24 KungFuJesus: scp'd the trace over to x86 machine, I see the textures
09:24 KungFuJesus: not sure if that says anything
09:25 KungFuJesus: damn, display was very hung on that thing, had to reboot
09:29 KungFuJesus: hmm, well, I gotta get to sleep, let me know if you have any suggestions to try
10:50 cosurgi: pkill -SIGSTOP chromium ; ## solves lots of problems, while I don't use it.
10:55 cosurgi: I added this to the script that locks xserver. And -SIGCONT upon unlocking. That will make life easier to other xservers.
23:34 imirkin: ok, so ... i THINK that at least most of the issues with the rotation are actually core Xorg/EXA issues
23:35 imirkin: so i plan on trying to do a release tonight of xf86-video-nouveau unless i hear any objections