01:51KungFuJesus: can I get any sort of documentation on this struct? https://github.com/grate-driver/mesa/blob/master/src/gallium/drivers/nouveau/nv30/nv30_transfer.h
03:05imirkin: i wouldn't necessarily use grate-driver as your core repo...
03:05imirkin: https://cgit.freedesktop.org/mesa/mesa
03:05imirkin: that's the canonical repo
03:05imirkin: but ... what kind of info are you looking for? it's just a struct used internally inside of the driver
03:42KungFuJesus: for one, knowing what this cpp variable is that is being switched on
04:18imirkin: cpp = characters per pixel
04:18imirkin: i.e. bytes
06:08KungFuJesus: ah, so bpp
06:08KungFuJesus: suppose characters resolves the bits/bytes ambiguity
06:09imirkin: it's a pretty standard way to describe it
06:09imirkin: cpp and bpp are used interchangeably
06:10imirkin: but yeah, bpp can mean bits too
06:11KungFuJesus: sifm?
06:11KungFuJesus: and texswz? I'm guessing texture swizzle, some sort of reorder?
06:13imirkin: sifm: https://github.com/envytools/envytools/blob/master/rnndb/graph/nv1_ifm.xml#L27
06:14imirkin: where do you see texswz specifically?
06:15imirkin: it probably refers to the NV4_SURFACE_SWZ object
06:15imirkin: (or rather, the NV40_SURFACE_SWZ object)
06:16imirkin: (you can look in nv1_ctxobj.xml for definitions of various objects)
06:16KungFuJesus: in this function: nv30_transfer_rect_blit
06:16imirkin: oh
06:16KungFuJesus: a surprising amount of swizzling done in software for something I'd have expected to have dedicated hardware/shaders for
06:17imirkin: that's this: https://github.com/envytools/envytools/blob/master/rnndb/graph/nv30-40_3d.xml#L561
06:17imirkin: the "blit" method basically runs a 3d shader
06:18imirkin: which takes the source as a texture, and destination as a render target
06:19KungFuJesus: is that xml a command set mapped to the registers on the card?
06:20imirkin: the commands are fed via a fifo to ... the fifo engine
06:20imirkin: the fifo engine then dispatches the commands to the proper "engine"
06:21imirkin: the fifo has 8 (?) subchannels, and you bind a class to each subchannel, and the commands you send are associated with a subchannel
06:21imirkin: the class indicates to the fifo engine which underlying engine to send the commands to
06:24KungFuJesus: good lord there are a lot of special purpose registers
06:26imirkin: not registers
06:26imirkin: commands
06:26imirkin: think of them as function calls
06:27KungFuJesus: oh I see, so more register writes than specific registers when they say "reg32" in the XML
06:27imirkin: but yeah ... the API is basically a 16-bit address + 32-bit function argument
06:27imirkin: each 16-bit address is a diff function
06:28imirkin: and does whatever with the 32-bit argument
06:28imirkin: most of it is storing up state
06:28imirkin: for a later "go" command
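The "16-bit address + 32-bit argument" model described above can be sketched roughly as follows. This is only a toy illustration: the method addresses, the `toy_engine` struct, and the `toy_method` function are all hypothetical names made up for the example and are not taken from rnndb or the nouveau driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical method addresses, invented for this sketch only. */
enum {
    TOY_SET_WIDTH  = 0x0100,
    TOY_SET_HEIGHT = 0x0104,
    TOY_GO         = 0x0108
};

struct toy_engine {
    uint32_t width;        /* state stored up by earlier methods */
    uint32_t height;
    uint32_t pixels_done;  /* work performed by the last "go" */
};

/* One "function call": a 16-bit method address plus a 32-bit argument. */
void toy_method(struct toy_engine *e, uint16_t addr, uint32_t arg)
{
    assert(e != NULL);
    switch (addr) {
    case TOY_SET_WIDTH:  e->width = arg;  break;  /* just stores state */
    case TOY_SET_HEIGHT: e->height = arg; break;  /* just stores state */
    case TOY_GO:         e->pixels_done = e->width * e->height; break;
    default:             break;  /* unknown methods are ignored here */
    }
}
```

Nothing happens until `TOY_GO` is sent; the two earlier calls only accumulate state, which is the pattern described in the conversation.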
06:29KungFuJesus: so TEX_FORMAT looks somewhat interesting/relevant
06:33KungFuJesus: nv40_tex_format might be a good place to look
06:39KungFuJesus: ok, so in nv30_transfer_rect, this is where you talked about the transfer methods, I'm guessing
06:40KungFuJesus: heh, never seen a static struct quite declared that way before
06:41KungFuJesus: cpu is obviously basically crappy programmed io, and I'm guessing these are in order of priority
06:42KungFuJesus: what are m2mf, sifm, and blit? Obviously those are all probably some form of DMA
06:45KungFuJesus: guess I gotta figure out which of these work properly and which are creating issues, first
06:46KungFuJesus: the fact that many of these textures are scaled and filtered means that some of these methods are probably not possible from the getgo
06:46KungFuJesus: the pure cpu based one won't work if the textures are being scaled
06:48KungFuJesus: I've relied on Gentoo to do the building for me to this point, which specific DSO am I trying to rebuild and test?
06:56imirkin: cpu is writing to the mmap'd buffer
06:57imirkin: m2mf and sifm are different command classes
06:57imirkin: blit is using the 3d engine to perform a copy by texturing the source onto the destination
06:57imirkin: for each method there's a predicate which checks to make sure that the copy is of a type that the method can handle
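The "table of methods, each with a predicate" idea can be sketched as below. This is a minimal sketch and not the actual table in nv30_transfer.c: the `toy_rect` fields, both predicates, and the method names are hypothetical, and trying the entries in table order is an assumption based on the guess earlier in the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one side of a copy. */
struct toy_rect {
    int w, h;
};

typedef bool (*toy_pred)(const struct toy_rect *src,
                         const struct toy_rect *dst);

struct toy_method {
    const char *name;
    toy_pred can_handle;   /* does this method support the copy? */
};

/* Invented predicate: only unscaled (same-size) copies are accepted. */
bool toy_can_same_size(const struct toy_rect *src, const struct toy_rect *dst)
{
    return src->w == dst->w && src->h == dst->h;
}

/* Invented predicate: a fallback that accepts everything. */
bool toy_can_always(const struct toy_rect *src, const struct toy_rect *dst)
{
    (void)src;
    (void)dst;
    return true;
}

const struct toy_method toy_methods[] = {
    { "unscaled_copy", toy_can_same_size },
    { "fallback",      toy_can_always    },
};

/* Return the first method whose predicate accepts the copy, or NULL. */
const struct toy_method *toy_pick(const struct toy_rect *src,
                                  const struct toy_rect *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < sizeof(toy_methods) / sizeof(toy_methods[0]); i++)
        if (toy_methods[i].can_handle(src, dst))
            return &toy_methods[i];
    return NULL;
}
```

With a table like this, disabling a method for debugging amounts to removing its entry or making its predicate return false, so the next entry that accepts the copy is used instead.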
06:57imirkin: the so is nouveau_dri.so
06:57imirkin: part of the 'mesa' package
07:00KungFuJesus: eapply_user might just be easier for me to test
07:00imirkin: not really
07:00imirkin: you want a fast compile + test cycle
07:00KungFuJesus: rather than learn the specifics of this meson bootstrap
07:00imirkin: what i normally do is install mesa to a side directory
07:00imirkin: and then run the application with LD_LIBRARY_PATH=/path/to/dir foo-program
07:00KungFuJesus: ah yeah good point, waiting for rebuilds will be slow
07:00imirkin: use autoconf
07:00imirkin: it's easy and it works
07:01KungFuJesus: heh, I avoid autotools with a passion, but I guess I don't have to worry much about the m4 macros and such
07:06KungFuJesus: hmm, configure is complaining about missing libdrm with some radeon crap
07:10imirkin: ./configure --with-gallium-drivers=nouveau
07:10imirkin: ./configure --with-gallium-drivers=nouveau --with-dri-drivers=
07:10imirkin: that should do the trick.
07:10KungFuJesus: ah, wasn't doing the last bit
07:13KungFuJesus: textures need to be a modulus size of 64 for blit to work, eh?
07:13KungFuJesus: even modulus of 64 that is
07:18KungFuJesus: one thing I don't think I tried - what's the magic environment variable to force mesa to use llvmpipe?
07:19imirkin: LIBGL_ALWAYS_SOFTWARE=1
07:20imirkin: iirc it should work mostly ok on power be
07:21KungFuJesus: yeah, it's correct when I do that. No surprises
07:21KungFuJesus: horrendously slow, though
07:27KungFuJesus: nothing fruitful yet, just CPU and blit aren't enough to get any textures other than the console that pulls down. Just disabling m2mf gets us back to the problem
07:27KungFuJesus: I guess we can say sifm is probably the issue
07:31KungFuJesus: in transfer_rect_sifm, what is the "destination" in this context?
07:31KungFuJesus: texture memory?
07:32KungFuJesus: code seems to allow there to be different color formats for the src and destination
07:34KungFuJesus: also why does it seem there are two identical commands sent to the card a couple of times in this function?
07:43imirkin: how do you mean?
07:43imirkin: the pitch vs non-pitch thing?
07:44imirkin: the source and dest are textures, yes
07:44imirkin: that's the whole point of all these functions, to copy a rectangle from one texture to another
07:44imirkin: aka transfer :)
07:45KungFuJesus: Heh, gotta grep a lot to unroll all these macros
07:45KungFuJesus: what does it mean if the destination pitch is 0?
07:46KungFuJesus: pitch is usually a row size, isn't it? Usually that's what we mean by it in image processing
07:46KungFuJesus: (all so far have had a destination pitch of zero)
07:47imirkin: textures can be laid out in one of two ways
07:47imirkin: linear, and swizzled
07:47imirkin: linear means that they're laid out the way you think -- pixel after pixel
07:47imirkin: and each row is "pitch" away from each other row
07:48imirkin: (the pitch must be a multiple of ... something, i think)
07:48imirkin: in the swizzled layout, the pixels are laid out via ... magic. it doesn't matter how, but it's not linear
07:48imirkin: this is done for POT textures
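The difference between the two layouts can be sketched as follows. The linear formula follows directly from the description above (rows are "pitch" bytes apart, pixels are cpp bytes each). The swizzled function uses Morton (Z-order) bit interleaving purely as a stand-in for a non-linear layout: the log only says the real layout is "magic", so treating it as Morton order is an assumption, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of pixel (x, y) in a linear texture:
 * each row starts `pitch` bytes after the previous one. */
uint32_t toy_linear_offset(uint32_t x, uint32_t y,
                           uint32_t pitch, uint32_t cpp)
{
    return y * pitch + x * cpp;
}

/* Pixel index of (x, y) in Morton (Z-order) layout: the bits of x and y
 * are interleaved, x in the even bit positions and y in the odd ones.
 * Assumption: used only as an example of a non-linear layout. */
uint32_t toy_morton_index(uint32_t x, uint32_t y)
{
    uint32_t idx = 0;
    assert(x < 65536u && y < 65536u);  /* result must fit in 32 bits */
    for (unsigned bit = 0; bit < 16; bit++) {
        idx |= ((x >> bit) & 1u) << (2 * bit);
        idx |= ((y >> bit) & 1u) << (2 * bit + 1);
    }
    return idx;
}
```

Note that the swizzled function takes no pitch at all: in a layout like this the position is derived from the coordinates alone, so there is no row stride to specify.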
07:48KungFuJesus: hmm, these cards must not handle non power of 2 sized textures
07:49KungFuJesus: seeing as the command that pushes the format and texture size in uses log base 2 on the width and height and OR's it in to the same word
07:49imirkin: nv4x definitely does
07:49imirkin: that path must be for the non-pitch case
07:49KungFuJesus: yes it is
07:50imirkin: in that case, it's always POT
07:50imirkin: (POT = power of two, btw)
07:50KungFuJesus: yeah, figured
07:50imirkin: NPOT = non-power of two :)
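Packing log2 of the width and height into one word, as described above for the non-pitch path, could look roughly like this. The bit positions (16 and 24) and all function names are hypothetical and chosen only for illustration; the real field layout is defined in rnndb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v is a power of two (POT). */
bool toy_is_pot(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* log2 of a power-of-two value. */
unsigned toy_log2_pot(uint32_t v)
{
    unsigned n = 0;
    assert(toy_is_pot(v));  /* NPOT sizes cannot be encoded this way */
    while (v >>= 1)
        n++;
    return n;
}

/* Pack log2(width) and log2(height) into one 32-bit word.
 * The shifts are made up for this sketch. */
uint32_t toy_pack_pot_size(uint32_t w, uint32_t h)
{
    return ((uint32_t)toy_log2_pot(w) << 16) |
           ((uint32_t)toy_log2_pot(h) << 24);
}
```

An encoding like this has no way to represent a size such as 48, which is why that path can only ever be taken for POT textures.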
07:51KungFuJesus: ok so pitch==0 is swizzled?
07:51imirkin: yes
07:52KungFuJesus: that is the troublesome code path
07:52imirkin: wait, so is it definitely SIFM?
07:53KungFuJesus: well, haven't established that the linear path is bug free either, but the current branch being taken is swizzle
07:53KungFuJesus: yes
07:53imirkin: heh
07:53imirkin: well, i dunno if you can make that assertion
07:53imirkin: just that there's a swizzled texture that gets messed up
07:53imirkin: check out the code in nv30_miptree.c
07:53imirkin: in nv30_miptree_create
07:53imirkin: see that giant if
07:54imirkin: which gets hit for, among other things, NPOT sizes
07:54imirkin: force it to always get it -- just add a || true in there
07:55KungFuJesus: bug still persists
07:56imirkin: as i expected :)
07:56imirkin: but you'll note that the non-pitch path no longer gets hit
07:58KungFuJesus: nope, I'm still hitting it
08:01KungFuJesus: actually I can't tell anymore, my fprintfs aren't showing up anymore for some reason
08:07KungFuJesus: oh wait, it didn't get to that bit unless I went into the game
08:17KungFuJesus: What's somewhat weird is that the color formats chosen here are implicitly endian independent. It's as if there's a command missing that says "don't swap the bytes"
08:18KungFuJesus: what is dst->bo? Is that byte order?
08:18imirkin: buffer object
08:19imirkin: and yes, there's some amount of implicit swapping
08:19imirkin: some of it is good
08:19imirkin: some of it is bad :)
08:21KungFuJesus: for the Y8 formats it's somewhat obvious that you'd want it off here
08:21imirkin: that never gets used
08:21imirkin: i think in practice, only the RGBA8 formats get used for your situation
08:24KungFuJesus: I meant for the SIFM method
08:28imirkin: i think there's an extra method that tells it to not byteswap even though it would normally (due to the overall setting of "big endian platform")
08:28imirkin: but i'm not sure where it is
08:35KungFuJesus: as in you think it exists somewhere in the command set and hasn't been discovered, or you've found it once before?
08:42KungFuJesus: you'd think it'd be a flag in SRCCOPY or something
08:43imirkin: i'm 99.9% sure it exists
08:43imirkin: i've seen references to it somewhere
08:48KungFuJesus: grepping that repo you sent me, I found "BIG_ENDIAN" mentioned a few times in memory/nv10_pfb.xml, but it seems like that's just enumeration stuff
08:49KungFuJesus: ``` <bitfield pos="31" name="ENDIAN" variants="NV1A-">
08:49KungFuJesus: <value value="0" name="LITTLE" />
08:49KungFuJesus: <value value="1" name="BIG" />
08:49KungFuJesus: </bitfield>
08:49KungFuJesus: ignore the ```, sorry used to slack now
08:49KungFuJesus: that's in DMA_FETCH
08:50KungFuJesus: fifo/nv4_pfifo.xml
08:53imirkin: yeah, we set that to BIG
08:53imirkin: KungFuJesus: https://github.com/skeggsb/nouveau/blob/master/drm/nouveau/nvkm/engine/fifo/dmanv04.c#L212
08:57imirkin: that's the thing that flips the endianness of the commands in the pushbuf
08:57imirkin: (that's all the PUSH_DATA stuff)
08:57imirkin: so that the cpu can write it out natively
08:58imirkin: and then the fifo engine does a byteswap before interpreting the values
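The effect of that byteswap can be modeled in a few lines. This is only a sketch of the idea (a big-endian CPU stores a 32-bit word natively, and the consumer swaps it before interpreting it); the function names are hypothetical and none of this is driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t toy_bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Store v the way a big-endian CPU would: most significant byte first. */
void toy_store_be32(uint8_t *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Read four bytes as a little-endian consumer would. */
uint32_t toy_load_le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

A little-endian reader sees the stored word scrambled, and a single swap recovers the value the CPU meant. The open question in the conversation is which other data paths (texture uploads, for instance) get the same swap applied, and whether that is wanted for byte-oriented pixel data.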
08:59KungFuJesus: does it apply to all DMA, including textures?
08:59KungFuJesus: what's a good thing for me to try here just to see if undoing implicit endianness will fix the issue?
09:00KungFuJesus: man there's so little decent documentation on this
09:05imirkin: this is where i get a little hazy
09:05imirkin: i'm pretty sure MMIO accesses are byte-swapped
09:05imirkin: but ... what about like ... cpu accesses? probably not
09:06imirkin: i wonder if nv30_transfer_rect_cpu gets hit a lot
09:06imirkin: it will never do a byteswap
09:06KungFuJesus: seemed to hardly at all
09:06imirkin: although it's copying from a bo to another bo
09:06imirkin: whereas we have to ensure that the cpu data -> etc gets treated properly
09:06imirkin: it's a giant mess =/
09:07imirkin: one of my bits of advice is to load up a trace in qapitrace
09:07imirkin: and then look at what the bound textures look like in there
09:09KungFuJesus: I mean I'm 100% certain the in memory textures will be linear RGBA
09:10imirkin: yeah
09:11imirkin: but if you pull it up in qapitrace
09:11imirkin: perhaps that will make it easier to track down wtf is going on
09:12KungFuJesus: actually I don't even see the textures in qapitrace
09:12KungFuJesus: just the framebuffer
09:12KungFuJesus: there's one texture, and it seems transparent?
09:12imirkin: try clicking on "opaque"
09:13KungFuJesus: it's one texture, a 32x32, and it seems to be just a black square
09:15KungFuJesus: for a given frame, I see several glBindTexture calls, though
09:16KungFuJesus: ah, maybe I'm not doing lookup state in the proper spot in the frame
09:17KungFuJesus: tried it later in the frame, 50 textures down
09:17KungFuJesus: I think X11 is hung
09:19imirkin: =/
09:24KungFuJesus: scp'd the trace over to x86 machine, I see the textures
09:24KungFuJesus: not sure if that says anything
09:25KungFuJesus: damn, display was very hung on that thing, had to reboot
09:29KungFuJesus: hmm, well, I gotta get to sleep, let me know if you have any suggestions to try
10:50cosurgi: pkill -SIGSTOP chromium ; ## solves lots of problems, while I don't use it.
10:55cosurgi: I added this to the script that locks xserver. And -SIGCONT upon unlocking. That will make life easier to other xservers.
23:34imirkin: ok, so ... i THINK that at least most of the issues with the rotation are actually core Xorg/EXA issues
23:35imirkin: so i plan on trying to do a release tonight of xf86-video-nouveau unless i hear any objections