00:56emile_3455: Hello all, I get this error: [ 16.326490] nouveau E[ DISPLAY][0000:01:00.0] 01:0130: func 08 lookup failed, -2
00:56emile_3455: And i want to file a bug. What information do you need?
00:57emile_3455: And btw, I never filed a bug before.
01:03pmoreau: emile_3455: Short answer would be: look at the channel's topic
05:28imirkin_: mwk: if ((op == 4 || op == 5) && sz & 1 && ctx->chipset < 0xa0)
05:28imirkin_: what's 1 there for?
05:32karlmag: imirkin_: ( sz & 1 )
05:32karlmag: as opposed to ( sz && 1 )
05:32imirkin_: oh. &. not &&.
05:33imirkin_: thank you :)
05:33karlmag: easy to overlook.
05:34karlmag: Personally I might be inclined to add () for clarity, but technically they're not needed
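
For reference, a minimal sketch of how C parses the condition quoted above. Bitwise `&` binds tighter than logical `&&`, so `sz & 1` is evaluated first and tests the low bit of `sz`. The function names here are hypothetical and only illustrate the precedence rule; they are not taken from envytools.

```c
#include <stdbool.h>

/* Hypothetical helper mirroring the quoted condition, with the
 * optional parentheses karlmag suggests written out. */
bool odd_sz_incdec_pre_a0(int op, int sz, int chipset)
{
    /* Parsed as: (op is 4 or 5) && ((sz & 1) != 0) && (chipset < 0xa0) */
    return (op == 4 || op == 5) && (sz & 1) && chipset < 0xa0;
}

/* What the misreading "sz && 1" would compute instead:
 * true for any nonzero sz, regardless of its low bit. */
bool misread_variant(int op, int sz, int chipset)
{
    return (op == 4 || op == 5) && (sz && 1) && chipset < 0xa0;
}
```

With `op = 4`, `sz = 2`, `chipset = 0x50`, the first function is false (even `sz`) while the misread variant is true.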
05:34karolherbst: https://i.imgur.com/EC0dvFf.png :)
05:34imirkin_: karolherbst: yay! :)
05:35imirkin_: why isn't temp going way up?
05:35karlmag: karolherbst: that looks nice
05:35karolherbst: imirkin_: because 62 is pretty hot already
05:35imirkin_: i guess
05:35imirkin_: 110 is hot :p
05:35karolherbst: I think my gpu maxes out at 80
05:35karolherbst: in heaven
05:35karolherbst: but this after like 10 minutes
05:35karolherbst: it starts with 47 though
05:36karolherbst: but I think I should add a value every second and not every frame?
05:38mwk: imirkin_: ops 4 and 5 (inc, dec) on signed operands (odd sz field) on pre-0xa0 behave weirdly and I haven't pinned them down yet
05:38mwk: I intend to remove that line when I do
05:39imirkin_: mwk: i was just wondering about the 1 more than anything
05:39imirkin_: i didn't notice the & vs && bit
05:39mwk: anyhow, nothing interesting in atomic ops
05:39mwk: they work as advertised
05:40mwk: except when you use an invalid op/size combo... you don't get an invalid operation error, weird shit happens instead
05:41mwk: as in, if you try red.add u8, what really happens is a 32-bit atomic add of the 8-bit source operand replicated 4 times
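
A rough software model of the behaviour mwk describes, assuming only what is said above: the 8-bit source is replicated into all four bytes and added as an ordinary 32-bit value. The function names are made up for illustration, and the model is not atomic; it only shows the arithmetic.

```c
#include <stdint.h>

/* Replicate an 8-bit value into all four bytes of a 32-bit word. */
uint32_t replicate_u8(uint8_t v)
{
    return (uint32_t)v * 0x01010101u;
}

/* Model of the described "red.add u8" result: a plain 32-bit add
 * (wrapping modulo 2^32) of the replicated operand. Carries cross
 * byte boundaries, so this is not four independent 8-bit adds. */
uint32_t model_red_add_u8(uint32_t mem, uint8_t src)
{
    return mem + replicate_u8(src);
}
```

For example, adding `1` to a word holding `0x000000ff` gives `0x01010200` under this model, because the carry out of the low byte spills into the next one.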
05:52karolherbst: imirkin_, karlmag: better? https://i.imgur.com/oHoan4F.png
05:52imirkin_: karolherbst: that's a lot of power consumption
05:52imirkin_: not MW? :)
05:52karolherbst: but hud doesn't have a way to add something besides "%"
05:53karolherbst: seems like I have to hack that in too
05:56karolherbst: imirkin_: https://github.com/karolherbst/mesa/commit/71050c885275a9d4c005ccfe7c341672d4228fdc
05:56RSpliet: on falcon, can you interpret $r3 in ld $r1 D[$r2+$r3] as unsigned (as you'd expect from two's complement arithmetic)?
05:56karolherbst: I hope everything looks fine besides the hud_sysfs file
05:56imirkin_: karolherbst: well, you can't add a dep on libudev for all of gallium
05:56imirkin_: among other things, it's built on windows/etc
05:56karolherbst: I figured
05:57imirkin_: you should be able to stick it in auxiliary/somewhere
05:57imirkin_: let me check
05:57karolherbst: yeah this won't work on windows anyway
05:57imirkin_: take a look at the Makefile.am in auxiliary
05:57imirkin_: note how it does stuff conditionally
05:57imirkin_: basically you want to stick this hud thing into a separate .la file
05:58imirkin_: er, the hud_sysfs thing
05:58imirkin_: and only link it in when stuff is sufficiently defined
05:58RSpliet: *cough* what I meant was, can you interpret $r3 as *signed*
05:58karolherbst: mesa always open only those /dev/drm files right?
05:58imirkin_: RSpliet: wow, it lets you add two registers? that's uncommon.
05:59imirkin_: karolherbst: /dev/dri... yes.
05:59imirkin_: karolherbst: i mean, they're not guaranteed to be in /dev/dri :) but it's those files.
05:59RSpliet: imirkin_: there's a REG3 format yes, and the third is interpreted as an index, so will be multiplied by the size of the insn
05:59karolherbst: imirkin_: and what if they aren't there?
05:59imirkin_: RSpliet: fancy
05:59RSpliet: only the finest for your falcon
06:26imirkin_: gnurou: any comment on having more than 8 constbufs on kepler+ compute? is it just not possible?
06:50gnurou: imirkin_: err sorry, haven't checked yet. first I have to figure out what this means. >_<
06:50imirkin_: gnurou: so you know how there's CB_BIND methods on the 3d pipeline?
06:50imirkin_: (even if you don't, they're there -- trust me)
06:51imirkin_: on compute, starting with kepler, there's a huge launch descriptor that goes with compute kernels
06:51imirkin_: it includes what appears to be only room for 8 constbufs
06:51imirkin_: and there are comments in nouveau that the "high" constbufs, i.e. constbufs 8..15 just alias 0..7
06:51imirkin_: however perhaps there's a way to not do that
06:52imirkin_: and have access to all 16 constbufs
06:52imirkin_: gnurou: this is the kepler compute launch descriptor: https://github.com/envytools/envytools/blob/master/rnndb/graph/gk104_compute.xml#L172
06:54gnurou: imirkin_: great, thanks for the pointers. this is helpful for my education (like with the Maxwell texture descriptors) and maybe we can release some doc again. In any case I will try to come up with an answer tomorrow
06:55imirkin_: gnurou: the reason i'm asking is that GL mandates a minimum of 12 constbufs, and it'd be incredibly convenient to just keep doing the same thing in kepler compute as we do on fermi and kepler 3d.
06:56imirkin_: gnurou: btw, this might sound odd, but feel free to ask about how your hw works ;) it can be a lot faster to ask someone than to try to trudge through long-winded documentation
06:57imirkin_: gnurou: though obviously take what i or others here say with a grain of salt. basically everything is prefaced with "to the best of my understanding" :)
06:57gnurou: imirkin_: haha, well when I know what to look for, it's usually easy to find the documentation. The issue is with stuff whose existence I don't even suspect :)
06:58imirkin_: the other problem is that docs tend to give you a micro view, i.e. method X does Y, not like "how is this whole thing supposed to even work" :)
07:04karolherbst: gnurou: [vbios] I see a small chance of success with that, but maybe you could help us figure out how to map the power sensors from the extdev table to the 0x28 P table, this is somewhat important for proper dynamic reclocking :/ This would really help a lot
07:05gnurou: karolherbst: yay, more homework :)
07:05imirkin_: gnurou: if you're going to look at vbios stuff, nobody ever answered how we're supposed to tell the max HDMI frequency
07:06gnurou: karolherbst: although I'd suggest to send that question to the email address dedicated to this first
07:06gnurou: imirkin_: ah. I guess my last line sounds stupid now :/
07:06karolherbst: yeah maybe I should :/ but somehow I think asking you has a higher success rate
07:07gnurou: karolherbst: not sure, I'm really juggling with 50 different tasks you know :/
07:07imirkin_: gnurou: no good deed goes unpunished. your helpfulness is being rewarded with more questions.
07:07gnurou: which is why none ever gets completed /o\
07:08imirkin_: that's why it pays to be a curmudgeonly jerk
07:08karolherbst: this could be a full time position: hanging around in IRC and answering questions from nouveau devs :D
07:09karolherbst: sorry for asking you so much though, but I am sure nobody would ask if there were a way to figure that out ourselves :/
07:11gnurou: let's try something to make sure we don't lose track of all this: could you maybe open an issue on https://github.com/Gnurou/nouveau/issues with your questions and as much detail as possible, and tag it with the "question" label? That way we would have a public, permanent track for this and I would have a link to send to the people who know
07:11Aquarina: hi again
07:11gnurou: the problem with email is that it gets lost and buried under new email, and IRC is even worse
07:12karolherbst: gnurou: nice idea! thank you
07:12Aquarina: My Xorg seems to start ok, since I can play with the mouse pointer... but no gdm3!
07:12imirkin_: gnurou: sounds good. want me to backfill it with the 1000 questions i've asked that have gone unanswered?
07:12gnurou: at least if stuff is recorded on Github I will remember to look at it once in a while and will feel bad to see questions unanswered
07:12gnurou: imirkin_: by all means
07:12imirkin_: gnurou: ok, will do. (not this second, but... soonish)
07:13gnurou: it's nothing official, and I'm not sure other people at NVIDIA will like this, but that can only be better than what we have now
07:13pmoreau: gnurou: So, just open an issue on your bug tracker? Or also send an email with a link to the issue on github?
07:13gnurou:thinks about sending a weekly nag email from that list to people of knowledge
07:14pmoreau: gnurou: Wasn't there some discussion at some point to have an official "bug tracker"? Or maybe something you suggested earlier?
07:14gnurou: pmoreau: just opening an issue will be fine, it's just to keep track of things
07:14pmoreau: Or have a bot send emails for you even :-)
07:14gnurou: pmoreau: that's nothing official, let's be clear about this
07:15imirkin_: Aquarina: remind me what hw you're using?
07:15pmoreau: gnurou: Right, but the talk of having an official one rings a bell, so I was wondering where I heard about "having an official one would be nice"
07:15Aquarina: imirkin that GeForce Go 7300
07:15Aquarina: old card
07:16gnurou: pmoreau: I still think it would be nice, but I did not get the internal traction I wanted to concretize this
07:16Aquarina: I added nouveau.config=NvMSI=0 to linux line in grub
07:16imirkin_: Aquarina: ok yeah, i'm not surprised that nouveau doesn't work with gnome-shell
07:16Aquarina: and worked ok with 3.16 kernel
07:16gnurou: pmoreau: so let's just go with this "unofficial" solution for now
07:16pmoreau: gnurou: Eh, who wants to be bugged with questions! ;-)
07:17Aquarina: but now with 4.3 it's not working again!
07:17Aquarina: but I think the problem must be somewhat different
07:17Aquarina: since I can see the mouse pointer and move it around
07:18imirkin_: Aquarina: 4.3 helpfully changed the names of everything
07:18imirkin_: er hm. NvMSI should still work though.
07:19imirkin_: Aquarina: did you still boot with NvMSI=0?
07:19karolherbst: gnurou: ohh I see that in the opengpu docs there is something about the extdev table, but I don't know if that helps :D
07:19Aquarina: I tried with that setting in gnome, in /etc/modprobe.d/nouveau.conf... nothing
07:20imirkin_: Aquarina: oh wait, i think 4.3 was released with a fail for pre-nv50 as well
07:20Aquarina: even after update-initramfs -u
07:20imirkin_: Aquarina: can you try 4.4? :)
07:20Aquarina: imirkin hum... I don't think it is in the repos yet!
07:20Aquarina: otherwise I should have got it
07:23imirkin_: Aquarina: well it was released a week or so ago
07:24Aquarina: no 4.3 is the latest in the repos
07:24Aquarina: so.. they fixed it?
07:25imirkin_: Aquarina: well, there was an issue, and it was fixed. dunno if it's *your* issue
07:26Aquarina: I'll wait for a fixed kernel then
07:27RSpliet: Aquarina: waiting is often times the worst strategy to resolve an issue :-P
07:28Aquarina: just for your reference - if interested: 3.16 is the debian jessie (stable) kernel, could be fixed with NvMSI=0 thing. 4.3 is the backports kernel for jessie. doesn't work. :-(
07:28imirkin_: Aquarina: right, but it's an unrelated issue
07:28imirkin_: Aquarina: it had something to do with page flipping iirc
07:28Aquarina: RSpliet: you're right, but I wouldn't know how to fix it
07:29Aquarina: imirkin_: are you aware of any workaround?
07:29RSpliet: Aquarina: testing something newer and providing us with all the feedback that is asked for, then filing a bug to make it official if the ppl here get stuck, is a more productive path :-)
07:31RSpliet: (albeit a bit more time consuming)
07:31imirkin_: Aquarina: sure, update to 4.4
07:31Aquarina: RSpliet: I've been having a problem with that approach (which I would very much like to take), that is: it normally implies installing a bunch of stuff and breaking my system
07:32Aquarina: I'd still like to start to do just that
07:33Aquarina: as soon as I manage to move to jessie and prepare a bunch of containers, I might look into doing things like that
07:33Aquarina: in the meantime... trying to fix this!...
07:34Aquarina: what is NvMSI=0 doing anyway?...
07:34imirkin_: it turns off MSI
07:34Aquarina: what is that?
07:34imirkin_: stupid technology
07:35imirkin_: message-signaled interrupts? something like that
07:35imirkin_: theoretically bigger better faster than regular plain ol' interrupts
07:35Aquarina: but does it affect performance all that much?
07:36imirkin_: if you have a *ton* of interrupts, yes
07:36imirkin_: on nv4x hw, probably not a big deal
07:36karolherbst: I think it makes it possible to have more interrupts at all anyway
07:36Aquarina: (you can tell by the age of my board how much I care about performance up to a certain level)
07:37imirkin_: and fwiw a patch went in to fix msi's on nv46
07:37imirkin_: should be in 4.4 or 4.5, i forget
07:38Aquarina: I do want graphics to be beautiful and some acceleration is needed even for web stuff, and gnome shell, other than that...
07:38imirkin_: Aquarina: well part of your problem is that the nv30 mesa gallium backend is kinda crap
07:39imirkin_: (the nv30 backend covers both nv3x and nv4x gpu's)
07:39Aquarina: nv30 is the kernel module?
07:39imirkin_: no, the mesa 3d driver backend
07:40Aquarina: the software that takes care of 3d graphics? is that it?
07:40imirkin_: the software that provides the OpenGL implementation
07:40imirkin_: "3d" is a simplistic way of thinking about it, since e.g. gnome-shell uses it, but it has nothing to do with 3d
07:41Aquarina: hum... so I'll never be able to view http://zygotebody.com/ properly with this card! right?
07:41imirkin_: dunno, some stuff will work, others won't
07:42Aquarina: well, for now I think I'll just have to wait for a fixed kernel to emerge in package manager
07:42imirkin_: but where i think that the nv50 and nvc0 3d drivers are "pretty good", the nv30 driver is "pretty crap"
07:44Aquarina: but I promise to look (with care) into contributing with bug reports properly... I'm sorry I'll have to pass on this opportunity with you RSpliet
07:44Aquarina: imirkin_: it's a laptop, cannot change cards anyway
07:45imirkin_: Aquarina: at least not without some *very* careful footwork
07:45Aquarina: it's an asus old laptop
07:47Aquarina: thank you all for 1. put up with me (I think that's how you say it) 2. working on the nouveau driver
09:40karolherbst: imirkin_: the get_fd way seems to work
09:40karolherbst: I just do a return nouveau_screen(pscreen)->drm->fd; inside nouveau
09:53karolherbst: imirkin_: I think procfs is the only way to get the path of a fd :/
09:53karolherbst: or do you know something else?
09:54imirkin_: karolherbst: why do you need the filename? for udev?
09:54karolherbst: mhhh, I only need that renderD* thing though
09:54karolherbst: I call udev_device_new_from_subsystem_sysname(udev, "drm", "renderD129")
09:54karolherbst: so only the last part is the problem
09:55imirkin_: hold on
09:56imirkin_: karolherbst: udev_device_new_from_devnum
09:56imirkin_: karolherbst: that should have info from the fd
09:58karolherbst: fstat(fd, &tb); udev_device_new_from_devnum(udev, 'b', tb.st_dev);? or do I have to use 'c'?
09:58imirkin_: S_ISCHR(st_mode) means 'c', S_ISBLK(st_mode) means 'b'
09:58imirkin_: and you want st_rdev
09:58imirkin_: not st_dev
09:59imirkin_: but basically the device will always be a chrdev
09:59imirkin_: if it's not you can just fail
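
A minimal sketch of the fd-to-device-number step discussed above, assuming a POSIX system. It takes `st_rdev` (the device the node represents) rather than `st_dev` (the filesystem holding the node) and fails for anything that is not a character device. The helper names are hypothetical; the resulting `dev_t` is the kind of value one would pass as the last argument of `udev_device_new_from_devnum(udev, 'c', devnum)`.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store the device number of the character
 * device behind fd in *out. Returns 0 on success, -1 if fstat()
 * fails or the fd does not refer to a character device. */
int chardev_devnum_from_fd(int fd, dev_t *out)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    if (!S_ISCHR(st.st_mode))
        return -1;          /* DRM nodes are chardevs, so just fail */
    *out = st.st_rdev;      /* not st_dev */
    return 0;
}

/* Convenience wrapper for trying the helper on a path. */
int chardev_devnum_from_path(const char *path, dev_t *out)
{
    int fd = open(path, O_RDONLY);
    int ret;

    if (fd < 0)
        return -1;
    ret = chardev_devnum_from_fd(fd, out);
    close(fd);
    return ret;
}
```

Calling the wrapper on `/dev/null` should succeed on a typical Linux system, while a directory such as `/` is rejected.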
10:08karolherbst: nice, seems to work
10:09karolherbst: imirkin_: https://github.com/karolherbst/mesa/commit/c4b61875c2d25395f6685d2bbeaa8bce8d46cf68
10:11vita_cell: do I need to compile nouveau.ko when I updated the kernel?
10:12imirkin_: karolherbst: yay
10:12karolherbst: imirkin_: and I think it is enough when this just supports one hwmon node
10:12karolherbst: when drivers will come up with more than one, we can add that later
10:13imirkin_: karolherbst: ok
10:14karolherbst: now I have to do something about that root thing :/
10:14karolherbst: I doubt it will overflow _ever_, but who knows
10:21imirkin_: karolherbst: PATH_MAX
10:22imirkin_: some genius will try to use numascale or whatever
10:22imirkin_: and have 10 levels of pci buses
10:22karolherbst: now I am prepared
10:24imirkin_: karolherbst: also just do it in nouveau_screen_init
10:24imirkin_: that way you don't need to do a separate thing for each backend
10:24karolherbst: ahh for the nouveau ones
10:25karolherbst: makes sense
10:25imirkin_: i realize that's minor, but might as well get it right up front
10:27karolherbst: I was thinking about changing it anyway
10:27vita_cell: where I can download the latest "nouveau" source folder?
10:28karolherbst: vita_cell: kernel module: https://github.com/skeggsb/nouveau
10:30vita_cell: this is the latest one? includes kepler voltage fix and PCIe 3 support?
10:30karolherbst: pcie support is already in linux 4.5
10:30karolherbst: and there is no voltage fix
10:30vita_cell: and volt fix?
10:30karolherbst: we still have to figure out how to do that the right way
10:30vita_cell: I will try
10:30vita_cell: thank you karolherbst
10:30karolherbst: there also won't be my kepler reclocking stuff, but I have no clue if you need those
10:31karolherbst: this has to wait until we figure out the volting thing anyway
10:33karolherbst: imirkin_: any complaints besides the issue of making that thing optional depending on udev support?
10:33karolherbst: udev is dlopened I think anyway?
10:35imirkin_: karolherbst: the current approach in e.g. the loader is to dlopen it, since its api is fairly small
10:35imirkin_: karolherbst: basically it's gonna cause giant headaches with things like steam
10:35imirkin_: if you link against libudev.so.1 but they have libudev.so.0
10:35imirkin_: or... something
10:36imirkin_: so the idea is that you should dlopen + dlfunc the 2 functions you need
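
A small sketch of the run-time lookup approach described above, using POSIX `dlopen` and `dlsym` (the Linux counterpart of `dlfunc`). The helper name `load_symbol` is hypothetical and this is not the code in Mesa's loader; it only shows the pattern of resolving a function by name instead of linking against a fixed soname.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: open a shared library at run time and look
 * up one symbol. Returns NULL if the library or the symbol is
 * missing. On success the handle is deliberately left open so the
 * returned pointer stays valid for the life of the process. */
void *load_symbol(const char *soname, const char *symbol)
{
    void *handle = dlopen(soname, RTLD_LAZY | RTLD_LOCAL);
    void *sym;

    if (!handle)
        return NULL;
    sym = dlsym(handle, symbol);
    if (!sym)
        dlclose(handle);
    return sym;
}
```

A caller could try `load_symbol("libudev.so.1", "udev_device_new_from_devnum")`, fall back to `"libudev.so.0"`, and simply disable the feature if both return NULL. On older glibc the program needs to be linked with `-ldl`.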
10:36vita_cell: please guys, what am I doing wrong when compiling? I can't compile my saved nouveau sources or the latest nouveau-master
10:37karolherbst: vita_cell: you need a newer drm tree :/
10:37vita_cell: how I can get it?
10:38vita_cell: I am on 4.4.0 kernel
10:38karolherbst: it is also a kernel module
10:38vita_cell: I could compile for 4.3
10:42vita_cell: wow, how to download it?
10:44pmoreau: git clone it?
10:47vita_cell: $ git://anongit.freedesktop.org/drm-intel ??
10:48vita_cell: $ git://git.freedesktop.org/~airlied/linux/log/?h=drm-next
10:51pmoreau: `git clone -b drm-next git://people.freedesktop.org/~airlied/linux` should do it
10:54vita_cell: thank you, one thing more learned
10:54imirkin_: vita_cell: fyi the git url is written on the summary page of those cgit things
10:56karolherbst: imirkin_: maybe it would make sense to move the udev stuff out of loader.c, I really don't feel good about duplicating all this dlopen stuff :/....
10:56karolherbst: imirkin_: well loader.h has stuff like loader_get_device_name_for_fd
10:56karolherbst: maybe I should implement a loader_get_sysfs_path_from_fd?
10:56imirkin_: no opinion.
11:02imirkin_: gnurou: it's probably obvious but it doesn't seem as though any of us can add labels...
11:03pmoreau: imirkin_: I followed karolherbst and put the label in the issue title
11:03karolherbst: yeah, only owners or moderators can manage labels
11:04pmoreau: Kinda makes sense
11:04imirkin_: pmoreau: there are special things called "labels"
11:05imirkin_: pmoreau: and we can't set them. any text we write in the issue title is not a label.
11:05karolherbst: imirkin_: there are teslas which can drive 2560x1600 displays
11:05imirkin_: karolherbst: over hdmi?
11:05pmoreau: imirkin_: I know about them, but since there weren't any created and I couldn't make some myself, I "added" it to the title
11:05karolherbst: imirkin_: ohh right, don't think so
11:06karolherbst: yeah okay, didn't notice this was about hdmi only :/
11:06imirkin_: karolherbst: i don't exclude the possibility that tesla's, esp GT21x could do > 165MHz HDMI.
11:07imirkin_: karolherbst: i don't have the requisite hw to check, unfortunately
11:07imirkin_: (i have a GT215, just no hdmi monitor)
11:08pmoreau: Works over DP, no idea about HDMI…
11:09imirkin_: yeah, DP is a different beast.
11:11karolherbst: imirkin_: I am sure even pre tesla can do 2560x1600 over dual DVI :/ but I have no clue what the connection is here with HDMI/DP and if that's somehow related at all....
11:11imirkin_: it definitely can.
11:11imirkin_: dual-link dvi is 330mhz
11:11imirkin_: (165mhz * 2)
11:11karolherbst: ohhh okay
11:11imirkin_: however hdmi is a single-link tmds
11:11imirkin_: so if it's just the regular dvi, that's 165mhz
11:12imirkin_: however because of better cabling/etc it allows a higher rate over that single link
11:12karolherbst: so dual-link dvi is kind of a hack, where you bind two dvi signals to the same display and like drive one half with one signal?
11:12imirkin_: karolherbst: well, it's 2 separate signals in one cable
11:12imirkin_: but HDMI just doesn't have the pins for it
11:14imirkin_: and neither does DP for the passive DP -> DVI adapters
11:14imirkin_: that's why if you want > 165mhz pixclock with DP -> DVI, you need an active converter
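
The arithmetic behind the 165 MHz / 330 MHz figures above can be sketched as follows. The pixel clock is total pixels per frame (including blanking) times the refresh rate. The function names are hypothetical, and the timing totals used in the example afterwards are assumptions for illustration (2200x1125 for 1080p, and a reduced-blanking 2720x1646 for 2560x1600), not values taken from this discussion.

```c
#include <stdint.h>

#define SINGLE_LINK_TMDS_KHZ 165000u  /* per-link DVI limit quoted above */

/* Pixel clock in kHz from total horizontal/vertical timings
 * (active + blanking) and the refresh rate in Hz. */
uint32_t pixel_clock_khz(uint32_t htotal, uint32_t vtotal, uint32_t refresh_hz)
{
    return (uint32_t)(((uint64_t)htotal * vtotal * refresh_hz) / 1000u);
}

/* Number of DVI TMDS links needed for a given pixel clock:
 * 1 up to 165 MHz, 2 up to 330 MHz, 0 if even dual-link is too slow. */
unsigned dvi_links_needed(uint32_t pixclk_khz)
{
    if (pixclk_khz <= SINGLE_LINK_TMDS_KHZ)
        return 1;
    if (pixclk_khz <= 2u * SINGLE_LINK_TMDS_KHZ)
        return 2;
    return 0;
}
```

Under those assumed timings, 1080p at 60 Hz comes to 148500 kHz and fits a single link, while 2560x1600 at 60 Hz comes to roughly 268.6 MHz and needs both links, which matches why a passive single-link path cannot carry it.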
11:14pmoreau: Hum… Have to check my OpSplit64 for mul, seems like something changed after some rebases.
11:15imirkin_: pmoreau: i made a very minor change to it the other day
11:15pmoreau: Changing the def to defFlag for the carry? I got a conflict on that one :-)
11:15imirkin_: pmoreau: yeah, and iirc another very minor change
11:16imirkin_: but i can't for the life of me remember what
11:16pmoreau: Don't worry, I'll find it :-)
11:16vita_cell: guys, I downloaded from git://people.freedesktop.org/~airlied/linux and I have the "linux" folder, now what must I do to update the drm tree?
11:16imirkin_: ah right, to consider SHADER_OUTPUT as a memory area that needs to be incremented
11:17pmoreau: vita_cell: You need to set LINUX_DIR to the Linux folder, before building Nouveau. Otherwise it defaults to the source of the running kernel
11:18karolherbst: imirkin_: in /sys/dev/char: "226:129 -> ../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/renderD129"
11:18imirkin_: ... ok
11:18karolherbst: and there is a dev_node_from_fd function in loader.c
11:19karolherbst: which gives you both numbers
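
Given the two numbers, building the `/sys/dev/char` link name shown above is a one-line format. A sketch with a hypothetical helper name, assuming the major and minor have already been obtained elsewhere:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "/sys/dev/char/MAJOR:MINOR" into buf.
 * Returns 0 on success, -1 if the result would not fit. */
int sysfs_char_link(unsigned maj, unsigned min, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/sys/dev/char/%u:%u", maj, min);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For the node quoted above, `sysfs_char_link(226, 129, buf, sizeof(buf))` yields `/sys/dev/char/226:129`, which is the symlink pointing at the `renderD129` device directory.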
11:19vita_cell: pmoreau how to? I don't know what LINUX_DIR is, is this my kernel headers?
11:21pmoreau: Go to the drm folder of nouveau, and run `LINUX_DIR=/path/to/the/linux/version/of/airlied/you/just/cloned make`
11:22pmoreau: The patch still looks fine after rebase, could be I messed something earlier about the types of the regs, and the code exits before doing the split.
11:25vita_cell: always in all nouveau source folders it gives me a nv50_display error 1
11:26pmoreau: Sorry, it's LINUXDIR not LINUX_DIR… --'
11:26vita_cell: going to try
11:28vita_cell: make oldconfig && make prepare??
11:29pmoreau: Probably… you might want to compile the kernel anyway, otherwise you'll get versions conflict when trying to load Nouveau
11:30pmoreau: Grrr… I'm sure the sources are TYPE_NONE… --"
11:31vita_cell: I already have my /lib/modules/4.4.0-gnu/kernel source folder
11:32vita_cell: yes, I want to compile with mine, I previously compiled my nouveau source folder and installed it for 4.3
11:53pmoreau: hakzsam: :-( Looks like I'm completely mixing up 64- and 32-bit values…
12:43karolherbst: imirkin_: that should do https://github.com/karolherbst/mesa/commit/b23172b1f0cdeef657154356987318dbdd2e7b93
12:44karolherbst: for dirname
12:44karolherbst: or do you think adding ../../ to the path is a good solution?
12:44imirkin_: i think you want #if defined(HAVE_LIBUDEV) && defined(HAVE_SYSFS)
12:44karolherbst: HAVE_SYSFS isn't defined for me
12:44imirkin_: oh hm
12:44imirkin_: maybe it means something else
12:44karolherbst: it is a fallback when libudev isn't there
12:44karolherbst: I could add a sysfs based implementation
12:45karolherbst: imirkin_: like here: https://github.com/karolherbst/mesa/blob/b23172b1f0cdeef657154356987318dbdd2e7b93/src/loader/loader.c#L677-L690
12:45imirkin_: dri nodes are always chardevs btw
12:45imirkin_: i'd nuke the block bit of it
12:46karolherbst: yeah, sysfs_get_device_name_for_fd also just assumes this
12:49karolherbst: imirkin_: https://github.com/karolherbst/mesa/commit/2442db9acc84934bcb42509d9ff7fcf9374e71af
12:49karolherbst: any other comment?
12:51karolherbst: the gallium patch looks much cleaner with that: https://github.com/karolherbst/mesa/commit/ffd9a571e1ebaf46711c8eff6aa2eda5d760f935
12:52imirkin_: no need to make nouveau_screen_get_fd global
12:52karolherbst: ohh right
12:52imirkin_: also please don't use the dirname thing
12:53imirkin_: if you must, just append /../.. to the dir string
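
A sketch of the "append /../.." alternative to `dirname`, with a hypothetical helper name. It reports truncation instead of silently overflowing, which is the concern raised earlier about buffer size; a caller would typically pass a `char buf[PATH_MAX]` from `<limits.h>`.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "<dir>/../.." into out, leaving it to
 * the kernel to resolve the ".." components when the path is opened.
 * Returns 0 on success, -1 if the result would not fit in len bytes. */
int two_levels_up(const char *dir, char *out, size_t len)
{
    int n = snprintf(out, len, "%s/../..", dir);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Unlike `dirname`, this does not modify or depend on the contents of the input string, and it needs no extra header beyond `<stdio.h>`.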
12:56karolherbst: imirkin_: any specific reasons for that by the way? Afaik dirname is posix
12:57imirkin_: karolherbst: it's a weird header, i've never heard of it, easier to just not deal with it
12:57karolherbst: this header only contains dirname and basename, maybe that's why you never heard of it :/
12:58karolherbst: but I also didn't know about it yesterday :D
13:01vita_cell: karolherbst how to update drm tree? I already downloaded "linux" folder from your link, but $ /Escritorio/nouveau/drm$ LINUXDIR=/home/vita/linux/ make
13:01vita_cell: doesn't work
13:01karolherbst: I never used the drm tree, but it is basically a kernel tree
13:02vita_cell: every time I try to compile nouveau source it throws me a nv50_display.o error1
13:07karolherbst: imirkin_: seems like every c standard lib implementation has this header and implements dirname, so I think I will stick to that, because I don't have any technical reasons why not to use it, will ask in dri-devel though
13:26Aquarina: imirkin_: a piece of info you could find interesting: new debian kernel, same version - just security fixes, same bug. No luck.
13:27imirkin_: not surprising
13:27Aquarina: they must have fixed a newly found bug in linux kernel, but the problem with nouveau driver remains
13:27Aquarina: yes, not very surprised, no...
13:38metalhead33: Hello everyone! I happen to have a problem. Today - listening to the advice of the folks on the #xfce channel - I decided to migrate from the closed-source binary Nvidia driver to the Nouveau driver. I have Optimus, second-generation Intel I915 + Nvidia GeForce GT 540M.
13:39metalhead33: After I switched to Nouveau, Xorg refuses to start up, claiming that i915 has some sort of invalid arguments or something.
13:39imirkin_: i assume you mean you have gen6, sandybridge, not gen2
13:39metalhead33: Not really sure.
13:39metalhead33: Also, when I do "eselect opengl list", it only says xorg-x11, it does not say nouveau.
13:39imirkin_: that's the right thing
13:40imirkin_: have a look at http://nouveau.freedesktop.org/wiki/Optimus/
13:40imirkin_: in the meanwhile, pastebin dmesg and xorg logs
13:41metalhead33: I will try, but it won't be easy, considering how the only way I can access this IRC is to log in with my old kernel - the one I had before migrating to Nouveau.
13:41imirkin_: well, the thing is that with optimus, it'll actually be intel doing most of the work
13:41imirkin_: and if you want to offload a particular application, you'll be able to do so
13:42imirkin_: perhaps there are old logs on your system, dunno
13:42metalhead33: The weird thing is, I did not change anything about Intel. It all happened only after I migrated to Nouveau.
13:42metalhead33: Lemme check.
13:42imirkin_: but the only way you resolve your issues is to communicate information here
13:42imirkin_: if you have no way to communicate that info, i'm not sure how to help you
13:43imirkin_: perhaps you have another computer and you can ssh in and retrieve the various info
13:43imirkin_: but note that it's most likely that your issues have absolutely nothing to do with nouveau
13:44imirkin_: since it'll be the intel chip driving your panel and probably other screens
13:44imirkin_: Number of created screens does not match number of detected devices.
13:44imirkin_: that means you have a funky xorg.conf situation
13:44imirkin_: you should just blow away whatever you have, a blank xorg.conf will work best
13:44metalhead33: [ 144.297180] xfce4-display-s: segfault at 10 ip 0000000000407af4 sp 00007ffedaa44e50 error 4 in xfce4-display-settings[400000+48000]
13:45imirkin_: [ 4559.176] (EE) [drm] KMS not enabled
13:45imirkin_: that's a big problem.
13:45metalhead33: [ 17.508088] nvidia: module license 'NVIDIA' taints kernel.
13:45metalhead33: [ 17.508929] Disabling lock debugging due to kernel taint
13:45imirkin_: [ 4559.167] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied
13:45karolherbst: imirkin_: you can ignore those permission denied issue
13:45metalhead33: looks like I forgot to disable Nvidia in Kernel?
13:45imirkin_: sounds like you still have leftovers of nvidia
13:46metalhead33: Hmmm... they won't be easy to remove.
13:46metalhead33: [ 17.524180] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 361.16 Tue Dec 29 23:02:26 PST 2015
13:46metalhead33: What the hell is it doing here?
13:46karolherbst: metalhead33: remove nvidia.ko inside /lib/modules ;)
13:46imirkin_: also check /etc/modprobe* -- nvidia likes to install files that prevent nouveau from loading
13:46metalhead33: rm nvidiako - no such file
13:47metalhead33: don't you mean /usr/src/linux?
13:47karolherbst: metalhead33: it is /lib/modules/4.4.0-gentoo/video for me
13:47imirkin_: it'll be deep in /lib/modules
13:47karolherbst: I have no clue what it is for you
13:47metalhead33: found it
13:47karolherbst: I would remove all three modules
13:47imirkin_: you should just remove the nvidia-drivers package
13:48imirkin_: that should remove all that junk
13:48metalhead33: so, nvidia.ko and nvidia-driverset.ko
13:48karolherbst: imirkin_: portage doesn't remove compiled modules
13:49karolherbst: metalhead33: everything which starts with nvidia
13:49imirkin_: that :)
13:49metalhead33: wait... does Nouveau support GeForce GT 540M?
13:49karolherbst: yes it does
13:50imirkin_: for some definition of support
13:50imirkin_: there's no reclocking so you'll get shit perf.
13:50karolherbst: metalhead33: but, intel will still drive your main display ;)
13:50metalhead33: I think I got a better result now
13:50metalhead33: look at this
13:51karolherbst: metalhead33: you have to enable kernel mode setting
13:51imirkin_: wtf is in /root/xorg.conf.new ?
13:51imirkin_: there's something seriously bad going on...
13:51imirkin_: [ 4911.532] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied
13:51imirkin_: that means something's using it... or something.
13:51imirkin_: i'd just try rebooting.
13:51imirkin_: looks like you have systemd insanity in there too
13:52imirkin_: which certainly won't be helping system debugability
13:52metalhead33: I will enable KMS
13:52karolherbst: why did you disable it?
13:52metalhead33: I never disabled it... I think.
13:52karolherbst: now I am curious too
13:52metalhead33: Oh wait, Maybe I should wait for MESA to finish first...
13:52karolherbst: what is inside /root/xorg.conf.new
13:52metalhead33: it's still emerging/compiling LLVM
13:53imirkin_: that's not important
13:53metalhead33: Nah, screw that, I gonna just reboot
13:53metalhead33: Should I just reboot or try to enable kernel mode setting first?
13:53imirkin_: metalhead33: grep -r nouveau /etc/modprobe*
13:53metalhead33: no result
13:54metalhead33: Was that sarcasm? :v
13:54imirkin_: grep -r i915 /etc/modprobe*
13:54imirkin_: no, it was not.
13:54imirkin_: just making sure you didn't have weird options accidentally set
13:54metalhead33: Well then... ready to reboot?
14:00metalhead33: Didn't work.
14:01imirkin_: pastebin dmesg
14:01imirkin_: and xorg log
14:03imirkin_: wtf is in your xorg.conf.new?
14:03imirkin_: (pastebin it)
14:04imirkin_: get rid of all that
14:04imirkin_: you should just use an empty xorg.conf
14:04imirkin_: that will work way better
14:04metalhead33: Hmmm, all right...
14:05imirkin_: also, nouveau wasn't loaded, i'm guessing your kernel doesn't have nouveau built at all
14:05metalhead33: It does
14:05metalhead33: I did build it with the Nouveau modules
14:05imirkin_: then something's preventing it from getting loaded
14:05metalhead33: I did make && make modules && make install
14:05imirkin_: try 'modprobe nouveau'
14:05imirkin_: i have nfc what 'make install' would do, but i can't imagine it'd be anything good.
14:05metalhead33: modprobe: FATAL: Module nouveau not found. - (reminder: I am using the old kernel. After all, how else would I access the IRC?)
14:06imirkin_: was the dmesg from the old kernel?
14:06imirkin_: ah yeah, it has nvidia in there
14:06metalhead33: I think so. It's next to impossible to get it from the new kernel, considering how I only have access to the command line, no X server.
14:06imirkin_: dmesg > foo
14:07metalhead33: Well then, back to the restarting grind again.
14:07metalhead33: One moment.
14:07imirkin_: and try with an empty xorg.conf
14:10imirkin_: that kernel is in deep trouble
14:10imirkin_: no i915, and no nouveau
14:11metalhead33: I have an idea... how about removing the configurations and copying over the configurations of the old kernel?
14:11imirkin_: pastebin the config for that kernel?
14:12imirkin_: hmmmm ok. you do have them as modules
14:12imirkin_: which means that something's wrong with your modules
14:12imirkin_: did you not do "make modules_install"?
14:12metalhead33: oh wait... nope.
14:12imirkin_: do you have a /lib/modules/4.4.0-gentoo ?
14:13metalhead33: One moment, I make modules_install
14:13metalhead33: Mein Gott... so this was the big problem.
14:13metalhead33: I always forget this part.
14:13metalhead33: and yes, I do have it
14:13metalhead33: I'm gonna restart again.
14:13metalhead33: See if it works.
14:13imirkin_: well, now that you did the install :p
14:17metalhead33: Yep, now it's working.
14:17metalhead33: At least, the Xorg server is working.
14:17metalhead33: Now I have to deal with bumblebee.
14:17imirkin_: no bumblebee
14:17imirkin_: bumblebee will just break everything
14:17metalhead33: No bumblebee?
14:17metalhead33: Then... how?
14:18imirkin_: look at the "offloading 3d" section
14:18metalhead33: Speaking of which, I have... a problem. Now I might get back to #xfce to solve it.
14:18imirkin_: or if you change your ebuild to do --enable-dri3 in the intel ddx, then you can use the DRI3 bits on that page
14:19metalhead33: Whenever I click on the settings of the screen, it seems to segfault. Also, I have to manually edit files to get my 1366x768 resolution.
14:19metalhead33: arandr doesn't allow me to set it anything other than 640x480
14:20metalhead33: Provider 0: id: 0x47 cap: 0xb, Source Output, Sink Output, Sink Offload crtcs: 3 outputs: 5 associated providers: 0 name:Intel
14:20metalhead33: It doesn't recognize my nouveau?
14:21imirkin_: pastebin dmesg and xorg logs
14:23metalhead33: [ 14.240663] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 1 - okay, so it DOES recognize Nouveau
14:23imirkin_: i told you to clean out your xorg config
14:23imirkin_: but you did not
14:23imirkin_: i'll repeat again
14:23metalhead33: ah, forgot about it...
14:23imirkin_: clear out your xorg config
14:23imirkin_: and the good times will roll
14:23metalhead33: You mean, delete it?
14:23imirkin_: delete it, make a blank file, whatever
14:23imirkin_: having the Device section makes it so that dri2 offloading doesn't work
14:24imirkin_: (or rather, makes it so that it doesn't autodetect the other device and load a driver for it)
14:24metalhead33: okay, got rid of xorg.conf
14:24metalhead33: Now rebooting shall make it all work out just fine.
14:24imirkin_: just restart X is enough
14:24metalhead33: I don't know how to do that... does logging out and back suffice?
14:24imirkin_: depends on the login manager
14:25metalhead33: I'll reboot to be safe then
14:31metalhead33: Oh yeah
14:32metalhead33: now I can access the settings for resolution
14:32metalhead33: Hell yeah.
14:32metalhead33: Thanks everyone.
14:32metalhead33: Especially imirkin.
14:32metalhead33: Without you guys, I wouldn't be able to use this computer properly.
14:33metalhead33: However... xrandr --listproviders still gives me only Intel.
14:33metalhead33: Maybe KMS is the problem.
14:33imirkin_: errrr.... hm
14:34imirkin_: [ 26.105] (**) | |-->Device "Device0"
14:34imirkin_: you did not clear out your config
14:34imirkin_: grep -r Device /etc/X11/xorg.conf*
14:34metalhead33: /etc/X11/xorg.conf.backup: InputDevice "Keyboard0" "CoreKeyboard"
14:34metalhead33: /etc/X11/xorg.conf.backup: InputDevice "Mouse0" "CorePointer"
14:34metalhead33: /etc/X11/xorg.conf.backup:Section "InputDevice"
14:34metalhead33: /etc/X11/xorg.conf.backup: Option "Device" "/dev/psaux"
14:34metalhead33: /etc/X11/xorg.conf.backup:Section "InputDevice"
14:34metalhead33: /etc/X11/xorg.conf.backup:Section "Device"
14:34metalhead33: /etc/X11/xorg.conf.backup: Identifier "Device0"
14:34metalhead33: /etc/X11/xorg.conf.backup: Device "Device0"
14:34metalhead33: /etc/X11/xorg.conf.d/device_no_vesa.conf:Section "Device"
14:34metalhead33: /etc/X11/xorg.conf.d/device_no_vesa.conf: Identifier "Device0"
14:34imirkin_: please use pastebin
14:35metalhead33: Ah, so I should get rid of the backups too...
14:35metalhead33: Oh, sorry.
14:35metalhead33: rm -r /etc/X11/xorg.conf* ? then
14:35metalhead33: This is the moment where you are going to impersonate Palpatine and say "Do it"
14:36imirkin_: that's probably a little harsh
14:36imirkin_: just that device_no_vesa.conf thing
14:36metalhead33: It's history
14:37metalhead33: I still have /etc/X11/xorg.conf.backup left, but not sure if that'll cause trouble...
14:37metalhead33: I think I'll rename it to something random, just in case.
14:37imirkin_: .backup should be fine... i think it's just *.conf
14:38metalhead33: 00-keyboard.conf 20opengl.conf - and these guys? They are from xorg.conf.d
14:40metalhead33: Now it lists both
14:40metalhead33: both Intel and Nouveau
14:43pmoreau: vita_cell: pong?
14:44vita_cell: LINUXDIR can be KERNELDIR?
14:44vita_cell: still doesn't work
14:44pmoreau: What is KERNELDIR?
14:45vita_cell: compiler stops at nv50_display.o
14:45vita_cell: don't know how to compile nouveau source with git cloned linux
14:46vita_cell: include/generated/autoconf.h or include/config/auto.conf are missing
14:47pmoreau: I guess they are generated when creating the .config file?
14:47imirkin_: metalhead33: ok so you're all good now?
14:48imirkin_: metalhead33: note that while your sandybridge will only ever provide up to GL 3.3, nouveau should expose GL 4.1 for you, and eventually will go up to GL 4.5, but ... not yet.
14:48metalhead33: DRI_PRIME=1 glxinfo | grep "OpenGL vendor string" says that it cannot load the Nouveau driver. I'm pretty sure all I have to do is to just re-emerge the driver.
14:48imirkin_: yeah, make sure nouveau is in your VIDEO_CARDS define
14:48metalhead33: It is
14:49metalhead33: libGL error: unable to load driver: nouveau_dri.so
14:49imirkin_: did you reinstall mesa?
14:49metalhead33: Not yet.
14:49imirkin_: after you changed VIDEO_CARDS
14:49imirkin_: ah, well, that's kinda important ;)
14:49metalhead33: well, here it goes then
14:49metalhead33: The long wait begins for LLVM
14:50imirkin_: hmmmm... it should be fine with whatever llvm you have. maybe the ebuild wants something newer though.
14:50metalhead33: I wonder why does it rely on LLVM though
14:50imirkin_: it doesn't
14:50imirkin_: but you probably have swrast enabled
14:50imirkin_: which will want to build llvmpipe
14:50imirkin_: which, as you might be able to guess, relies on llvm
14:50metalhead33: well, as far as I know, LLVM/Clang produces faster applications
14:50metalhead33: faster, but larger in size
14:50imirkin_: pretty sure that's false
14:50imirkin_: also this has nothing to do with clang
14:51imirkin_: it's using llvm as a jit to convert shaders into cpu-targeted machine code
14:53metalhead33: While the long wait is on... May I ask how will I deal with games?
14:53metalhead33: Ones that require my more advanced Nouveau driven card.
14:53imirkin_: DRI_PRIME=1 foo-game
14:53metalhead33: Previously, I would just go "optirun game" or "optirun wine game.exe"
14:54metalhead33: so DRI_PRIME=1
14:54imirkin_: however do note that, like i said earlier, performance is likely to be crap
14:54imirkin_: since (a) it's a GF108, so it's crap to start with, and (b) nouveau doesn't reclock, so you get the clock that you get... which is probably a low or middle clock, not the highest
14:54metalhead33: I don't usually play the newest games.
14:55metalhead33: The newest game I play is... Skyrim? And that lags for entirely different reasons.
14:55vita_cell: imirkin, that is the gpu switcher for laptop with intel + nvidia/amd?
14:55imirkin_: vita_cell: it's just gpu switcher, it doesn't care about gpu brands :)
14:55vita_cell: I will need one, how to get it?
14:56imirkin_: vita_cell: http://nouveau.freedesktop.org/wiki/Optimus/
14:56vita_cell: metalhead33 I am with gtx770 and Nouveau, works fine
14:56imirkin_: vita_cell: yours is a kepler, so it can reclock
14:56imirkin_: his is a fermi, so.. no reclocking, at least not yet
14:57vita_cell: I already did it on my desktop computer, I was playing games 3 months ago
14:57imirkin_: Fermi. it's a gpu family.
14:57vita_cell: I asked for my laptop, with Intel 4000 and AMD 8600
14:58vita_cell: Fermi works bad I think
14:58vita_cell: I tested gtx650, which is Kepler, works fine, but gtx770 4gb refurbished, much better option
14:59metalhead33: Too bad I can't replace a GPU in a laptop
14:59metalhead33: Not that I'm in the mood to upgrade anyway.
14:59vita_cell: ahhh it is a laptop
14:59metalhead33: Acer Aspire 5755 G
14:59vita_cell: you can add an external GPU on laptop, but I dont like that idea
14:59vita_cell: the device costs you like 100-150€
15:00metalhead33: Intel Core i5-2430M 2.4 GHz, with Turbo Boost up to 3.0 GHz.
15:00metalhead33: Nvidia GeForce GT 540M UP TO 3765 mb Turbo Cache
15:00vita_cell: laptop for gaming, anyway, it is not a good idea
15:00vita_cell: not bad laptop
15:01metalhead33: The only other computer I have is older, has an even less advanced Nvidia card
15:01metalhead33: And I last used it before I migrated to Gentoo Linux
15:01metalhead33: It still has Windows 7 on it, and it lags soooooo bad...
15:01vita_cell: I hate any non-free software
15:01metalhead33: I loathe it too.
15:01imirkin_: vita_cell: is that why you play proprietary games?
15:02vita_cell: ogghh yes, I need them
15:02vita_cell: just Steam
15:02imirkin_: lol, that counts as '1'?
15:02metalhead33: Will I get banned from this channel if I proclaim that I'm an etarip?
15:02vita_cell: but with deblobbed kernel, GNU trisquel OS fully free and nouveau
15:02imirkin_: fun dichotomy
15:03metalhead33: I was always a pirate. Ever since I was a kid.
15:03vita_cell: Steam has a nice GNU support
15:03imirkin_: metalhead33: you should move to somalia
15:03metalhead33: My mom thought it was stupid to spend money on something you can get for free and get away with.
15:03metalhead33: We have been pirating games at home since like... 2002.
15:03metalhead33: Or even before that.
15:03vita_cell: you are not pirate, you are a freedom sharer
15:04metalhead33: I still have a lot of CDs with pirated DOS games on it
15:04metalhead33: CDs from the late 90's
15:04imirkin_: this is not the place to discuss any of that.
15:04metalhead33: Ah okay, sorry...
15:05vita_cell: I dont like to spend my money in privative-proprietary software, but I donate sometimes to free software developers, they give me their software fully free and with free software licence, full freedom
15:08metalhead33: "In June 2014 a breakthrough was finally achieved, and initial re-clocking support was added to nouveau"
15:09metalhead33: Wait so... it doesn't have reclocking for... certain GPUs only?
15:09imirkin_: there are many gpu families
15:09imirkin_: yours is not one of the ones with any support
15:11vita_cell: is Kepler the first gpu that got reclocked?
15:11imirkin_: tesla and kepler families support reclocking (not all of tesla)
15:12metalhead33: Will Fermi ever support reclocking?
15:12imirkin_: maybe, maybe not
15:13vita_cell: I have here gtx460, and it works really bad
15:13imirkin_: nouveau is a largely volunteer effort, so hard to predict timelines :)
15:13metalhead33: http://www.linux.org/threads/nouveau-commits-fermi-reclocking-pm-mxm-etc.1328/ didn't they add Fermi reclocking 5 years ago? :V
15:14imirkin_: don't believe everything you read.
15:14imirkin_: that article is talking about engine clocks, not memory clocks
15:15imirkin_: and yeah, we can adjust engine clocks. but without the memory clock adjustment, that's only a very small bump in perf
15:15vita_cell: and memory reclocking makes a big difference in performance
15:16metalhead33: In other words, it's like overclocking a CPU but keeping the RAM the same.
15:16metalhead33: Except that it's for a GPU, not a whole PC.
15:16imirkin_: and the change in memory speed is something like 8-20x
15:16mupuf: oh my, 11 frames / minute for pixmark_piano on my nvd9
15:16mupuf: probably not a good benchmark to test on this machine then!
15:16metalhead33: I will test Minecraft tomorrow :v
15:16vita_cell: reclocking a GPU is not overclocking, it is getting the stock clocks for a pstate
15:16imirkin_: mupuf: not a good machine to benchmark on :p
15:17mupuf: imirkin: but ... but ... this is the best I can do right now :s
15:17metalhead33: If Minecraft plays decently, anything can play decently.
15:17mupuf: I can try to reclock the core at least
15:17mupuf: that should help piano
15:17mupuf: at least, the gpu does not hang
15:17mupuf: that's ... good!
15:17imirkin_: always a silver lining
15:18mupuf: voplosion is around 1 fps
15:18mupuf: I guess I just need to render in a smaller resolution
15:18imirkin_: 2560x1600 is probably not a good idea.
15:18metalhead33: What should I expect from Medieval 2 Total War? :v
15:18mupuf: I went for fullhd
15:18imirkin_: mupuf: try 640x480
15:19mupuf: 1024*640 is 8 FPS
15:19imirkin_: metalhead33: depends a lot on your gpu's boot clocks
15:19metalhead33: Boot clocks?
15:19imirkin_: the clock speed it boots at
15:19imirkin_: and different ones do different things
15:19imirkin_: so difficult to predict
15:19scaroo_: imirkin_: hi! so I am looking at the TCP prelude stuff on maxwell. I got that the complexity is that tcp invocations for the same patch share their outputs, so somehow the prelude should "inject" the shared addresses in invocations (then accessible in the program through vertexTexCoord and data), sounds correct ?
15:19metalhead33: One question regarding performance though.
15:20imirkin_: scaroo_: you're *mostly* right
15:20metalhead33: Fermi with Nouveau is still faster than Intel, right?
15:20metalhead33: Or should be generally faster, right?
15:20imirkin_: metalhead33: *probably* :) for something like glxgears, definitely not. for heavier gpu things, probably.
15:20mupuf: imirkin: my bad, it was with Intel :D Nouveau is at 600ms/frame :D
15:20mupuf: now, for the silver lining .... it renders correctly!
15:21imirkin_: scaroo_: so... the basic deal is that yes, TCP outputs are effectively shared
15:21mupuf: horrible tearing though
15:21metalhead33: I'll genuinely eat my nonexistent hat if somehow, games will have better performance with nouveau.
15:21imirkin_: scaroo_: however thankfully the GPU was designed for that
15:21metalhead33: Unlikely, but I experienced similar impossibilities.
15:21metalhead33: Medieval 2 Total War for example ran much faster on Wine than on a real Windows.
15:21scaroo_: imirkin_: are you referring to the barrier sync ?
15:21imirkin_: scaroo_: on kepler and earlier gens, you could use the regular AST and ALD op with the .O modifier to interact with them
15:22imirkin_: scaroo_: however starting with maxwell, you're supposed to do something else to read and write them
15:22imirkin_: scaroo_: what needs to be figured out is the "something else" bit of it
15:22karolherbst: well there are rumours that fermi core reclocking just works, but without memory reclocking there won't be much more perf than 25%
15:22imirkin_: scaroo_: i.e. you have to read those special regs with addresses in them
15:22imirkin_: scaroo_: and then do something with isberd
15:22imirkin_: scaroo_: and then more stuff.
15:22imirkin_: scaroo_: someone needs to go through and itemize it all and figure out how it all fits together
15:23imirkin_: scaroo_: so that the nouveau compiler can generate the right sequence of stuff when trying to read/write TCP outputs
15:23imirkin_: scaroo_: i also suspect that TEP inputs may suffer a similar fate, but i haven't checked
15:23imirkin_: scaroo_: you might want to look at what we do for GS inputs, iirc those also got some annoying treatment, but skeggsb worked it out
15:23mwk: hum, sounds like lots of fun
15:24imirkin_: mwk: yeah, my solution was to disable tess on maxwell
15:24imirkin_: problem solved ;)
15:25imirkin_: mwk: do you happen to know what XMAD is?
15:25imirkin_: not to be confused with IMAD.X of course
15:25mwk: not really
15:25mwk: maybe it involves carry bits?
15:25imirkin_: is it like when you started out mad, but then you're no longer mad. so you're ex-mad? :)
15:25metalhead33: Speaking of which... is it true that Nvidia's employees added some input into Nouveau?
15:26imirkin_: metalhead33: it is true, what you have heard.
15:26metalhead33: I like open-source stuff, but... how does Nvidia benefit from this?
15:26scaroo_: imirkin_: i am afraid this is quite over my current understanding of gl/nouveau/ISA and co. But who knows... beginner luck or something :)
15:26imirkin_: dunno. i suspect they're being pressured by google to have upstream kernels work on their tegra devices
15:27imirkin_: scaroo_: well, the nice thing is that you don't really have to worry about the GL or nouveau bits
15:27imirkin_: scaroo_: the key is to figure out wtf is going on
15:27imirkin_: the rest is easy
15:27imirkin_: or at least i can help with :)
15:29scaroo_: imirkin_: well, I must say it is quite overwhelming. How would you go to figure that stuff out? What do you mean by "itemize" ?
15:30imirkin_: scaroo_: look at the shaders the blob generates, feed it different ones and see what it does differently, and figure out the logic behind it
15:33scaroo_: imirkin_: first thing first, I have the tcp prelude for gk110 and this one side by side and well.... they don't look the same. At all
15:34scaroo_: imirkin_: is ssy some kind of jump ?
15:34metalhead33: 1440/1842 in LLVM....
15:35metalhead33: Not a too long wait before it will get to MESA
15:36scaroo_: metalhead33: you could have installed a binary-packaged distribution in less time than it takes to compile llvm, just sayin' :)
15:44imirkin_: scaroo_: SSY is like a pre-jump
15:45imirkin_: scaroo_: basically when there's an instruction with a .S flag (or a sync op on maxwell), it'll jump to that location
15:45imirkin_: it's used to manage divergent control flow
15:46imirkin_: all this stuff *looks* like it's operating on a single value, but in reality it's on a huge SIMD thing
15:46imirkin_: so you can't really have divergent control flow. so if you ever do diverge, it executes them one thread at a time
15:46imirkin_: (or perhaps a thread subgroup at a time, i dunno)
15:46imirkin_: and SSY says "divergence ends here"
15:47scaroo_: imirkin_: mmm okay
15:47imirkin_: usually for if/else situations, it points to the end of the else block
15:48imirkin_: and we tend to call it "joinat"
15:48mwk: it's thread group FWIW
15:49mwk: the way it works inside, there are two global control registers: PC (program counter, as usual) and thread mask
15:50mwk: and there's a big stack of saved (PC, thread mask, event) tuples
15:51mwk: whenever you do a conditional branch, if the threads don't agree on the target, one of the targets is chosen - and the other is pushed on the stack
15:51mwk: only the threads that took the current direction are left in the current thread mask, while the mask saved on stack along with the other PC contains the threads that chose the other direction
15:52mwk: when execution ends, by the way of exit, the stack is popped, and the execution continues with the remaining group of threads
15:53scaroo_: clever stuff! thanks for explaining it. BTW is there a wiki where you guys keep that kinda knowledge ?
15:53mwk: now, as for SSY and .S (aka joinat and join) - suppose you have a short if block. it'd be good to have the two split thread groups rejoin after the if
15:54mwk: so - SSY is an instruction that pushes a special type of entry on the stack, an SSY entry [the previous one was a BRA entry]
15:54mwk: you execute that one before the if
15:54mwk: then you execute the branch as usual
15:54mwk: and then, after the if, you execute a .S instruction
15:55mwk: .S basically means "end the current execution line and pop a BRA or SSY from stack"
15:55scaroo_: so both if and else branches converge back on the same codepath, right ?
15:55scaroo_: what follows the conditional, i mean
15:56mwk: so - if another branch is left on the stack to be executed later, it gets executed after the first branch executes .S; otherwise it jumps directly to the address mentioned by SSY
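The mechanism mwk has described so far (a global PC and thread mask, a stack of saved entries, BRA entries pushed on divergence, SSY entries pushed before the if, and .S popping whichever is on top) can be played through in a small toy model. This is only a sketch under assumptions: the opcode names, the program layout and the choice to run the fall-through side first are made up for illustration, and the exit/return masks mentioned later are left out entirely.

```python
# Toy model of the divergence stack. Opcode names, program layout and the
# "fall-through side runs first" policy are illustrative assumptions, not the
# real ISA encoding or the real hardware policy.

def run(program, n_threads=4):
    full_mask = (1 << n_threads) - 1
    odd_mask = sum(1 << t for t in range(n_threads) if t % 2)
    acc = [0] * n_threads      # one accumulator register per thread
    pc, mask = 0, full_mask    # the two global control registers
    stack = []                 # saved (kind, pc, mask) entries
    trace = []                 # (pc, active mask) recorded at every step

    while True:
        op = program[pc]
        trace.append((pc, mask))
        if op[0] == "ssy":
            # joinat: remember the rejoin address and who should rejoin there
            stack.append(("SSY", op[1], mask))
            pc += 1
        elif op[0] == "bra_odd":
            # conditional branch that only odd-numbered threads want to take
            taken = mask & odd_mask
            fallthrough = mask & ~odd_mask
            if taken and fallthrough:
                # threads disagree: run one side now, save the other side
                stack.append(("BRA", op[1], taken))
                mask = fallthrough
                pc += 1
            elif taken:
                pc = op[1]
            else:
                pc += 1
        elif op[0] == "add":
            # ordinary instruction: only threads in the current mask execute
            for t in range(n_threads):
                if (mask >> t) & 1:
                    acc[t] += op[1]
            pc += 1
        elif op[0] == "join":
            # .S: end this execution line and pop a BRA or SSY entry
            _, pc, mask = stack.pop()
        elif op[0] == "exit":
            # simplified: no exit mask is tracked in this sketch
            if not stack:
                break
            _, pc, mask = stack.pop()
        else:
            raise ValueError(f"unknown op {op[0]!r}")
    return acc, trace


PROGRAM = [
    ("ssy", 6),      # 0: both sides rejoin at instruction 6
    ("bra_odd", 4),  # 1: odd threads branch to 4, even threads fall through
    ("add", 10),     # 2: even threads only
    ("join",),       # 3: .S, pops the BRA entry and runs the odd threads
    ("add", 100),    # 4: odd threads only
    ("join",),       # 5: .S, pops the SSY entry and restores the full mask
    ("add", 1),      # 6: all threads again
    ("exit",),       # 7
]

if __name__ == "__main__":
    acc, trace = run(PROGRAM)
    print(acc)
    for pc, mask in trace:
        print(pc, format(mask, "04b"))
```

With four threads, the trace shows the mask going from 1111 to 0101 for the fall-through side, then 1010 for the saved branch, and back to 1111 once the SSY entry is popped.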
15:56mwk: now... what I just said is very simplified
15:56mwk: it's complicated because there are LOTS of types of entries
15:57mwk: there's another register, the exit mask - it says which threads have executed the exit instruction
15:57mwk: this is necessary, since you don't want popping the SSY entry from stack to resurrect a thread that executed exit in the meantime
15:58mwk: so when you pop SSY, the mask from the stack is modified to exclude all threads that are in the exit mask
15:58mwk: but but but... consider a return; statement, it has similar semantics
15:59mwk: so there's another global register, the "return mask"
15:59mwk: when a thread executes a "ret" instruction, it's added to this mask, and removed from the currently executing one
15:59scaroo_: what happens if the shader exits/returns within a conditional branch ?
16:00mwk: so that, likewise, popping SSY won't resurrect a thread that already returned from the current function
16:00mwk: but, popping a CALL entry from the stack will make the processor consider the function finished, and the return mask will be reset to what it was before the call
16:01mwk: scaroo_: if it exits, it's simple - the thread is permanently marked in the exit mask register
16:01mwk: popping any stack entry will just remove it from the popped mask
16:01mwk: so it won't be resurrected in any way
16:02mwk: with return, it works in a similar way with the "return mask" register, but popping the CALL entry will remove the thread from that register
16:03mwk: and... if you think it's complex... there are actually 6 or so types of stack entries and mask registers
16:03mwk: call/return, prebrk/break, precont/cont, prelongjmp/longjmp, joinat/join
16:04mwk: and I'm not sure what the rules are if you nest them in sufficiently funny ways
16:04mwk: I'm going to guess you get a trap if you try anything smartass
16:06mwk: basically call/return is for functions, prebrk/break for loops and break statement, precont/cont is for loop individual iterations and continue statement, joinat/join for plain ifs/switches, and prelongjmp/longjmp for exception handling... I guess
16:06mwk: now, you might ask yourself what to do if you have to compile something with a "goto" in it
16:08scaroo_: "waking up from incomprehension" Oh yeah that is exactly what I am wondering ;)
16:08mwk: I guess... bang on it hard until the damn goto is mutated into a sane loop/if/return/whatever, otherwise just say screw it and completely ignore all these features, using plain bras
16:09imirkin: mwk: yeah, just kill perf and do a bra :)
16:09imirkin: moral of the story: don't use goto (or use structured control flow)
16:09scaroo_: hmmm, it is actually complex. I need to integrate that stuff slowly. I guess I'll reread this irc log for a while, Thanks! Is there an offline resource where you keep all of that ?
16:09scaroo_: bras as in... ?
16:10mwk: good old branch instruction
16:10imirkin: scaroo_: but none of this stuff really matters to what you're doing :) but i guess it's good to understand what's going on in the shaders
16:10mwk: scaroo_: http://envytools.readthedocs.org/en/latest/hw/graph/tesla/cuda/control.html
16:10scaroo_: yeah, interesting anyway
16:10mwk: unfortunately, that's for tesla, and half of that is lies
16:11mwk: oh yeah, I saved you the gory details of quadon/quadoff instructions
16:11imirkin: mwk: where does it lie? in a material manner?
16:11mwk: but then I have no idea how exactly these work either
16:12imirkin: mwk: when are you doing hwtests for *those* :p
16:12mwk: imirkin: first and foremost, it doesn't mention the masks and general stack format at all
16:12imirkin: mwk: and all the lanemasking
16:12imirkin: on nv50
16:12mwk: oh, the mov feature?
16:13mwk: yep, that's fun
16:13mwk: it's on fermi too IIRC, if you run out of allowed tex inputs to TXD
16:13mwk: as for hwtests, I do have them planned, don't worry
16:13mwk: but... tesla only, I'm afraid
16:13imirkin: no fermi?
16:14mwk: and it involves first getting dangerously intimate with the grctx
16:14mwk: so that I can actually mess with the stack
16:14mwk: well, if you can tell me how to access the MP inner state on fermi...
16:15imirkin: fermi: the hw even mwk couldn't test
16:15mwk: tesla is quite nice, just poke an address here and read data from there
16:15mwk: and use that address over there to single-step an instruction
16:16mwk: testing that should actually be quite simple
16:17mwk: on fermi, the only way I know is to write a trap handler that's supposed to save all state, wait for host, and restore it
16:17imirkin: why do you care about this?
16:17mwk: care about what?
16:17imirkin: inner mp state
16:18mwk: everything in hwtest is in hwtest because it's *easy* to test
16:18imirkin: why not just have the compute shader just write out all its state to gmem at the end and done
16:18mwk: all the tests are pretty much the same: randomize a context, upload the context, hit run
16:19mwk: if you fail a test, keep masking off bits from the context until you figure out which are causing the trouble
16:19mwk: that's all built on an assumption that individual "test units" will be small
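The "keep masking off bits until you figure out which are causing the trouble" step can be sketched as a greedy minimization over the context bits. This is a minimal illustration and not hwtest's actual code: `still_fails` is a hypothetical callback standing in for "upload this context, hit run, compare against the model", and the context is flattened into a single integer for simplicity.

```python
# Greedy bit minimization of a failing context. `still_fails` is a
# hypothetical stand-in for uploading the context to the hardware and
# rerunning the test; the real context is not a single integer.

def minimize_failing_bits(ctx, width, still_fails):
    """Clear every bit of ctx that is not needed to keep the test failing."""
    for bit in range(width):
        candidate = ctx & ~(1 << bit)
        # Keep the bit cleared only if the failure still reproduces without it.
        if candidate != ctx and still_fails(candidate):
            ctx = candidate
    return ctx


def fake_hw_test_fails(ctx):
    # Made-up failure condition for the demo: the test breaks only when
    # bits 3 and 17 are both set.
    needed = (1 << 3) | (1 << 17)
    return ctx & needed == needed


if __name__ == "__main__":
    failing_ctx = 0xFFFFFFFF  # pretend this randomized context failed
    culprit = minimize_failing_bits(failing_ctx, 32, fake_hw_test_fails)
    print(hex(culprit))
```

A single pass like this is enough when clearing an unrelated bit never changes the outcome; if bits interact in less regular ways, the pass may need to be repeated or replaced by something smarter.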
16:20mwk: to test control flow instructions, you have to first get to the state you want
16:20imirkin: i just wanna know what xmad does
16:20mwk: in tesla, I can just poke the masks and stack straight into the context
16:20imirkin: yeah, control flow would be a huge pain
16:20imirkin: anyways, you can use the gr falcon to upload any ctx you want
16:20imirkin: but the state space is enormous
16:21mwk: in fermi... I'd have to figure out the right sequence of SSYs, branches, calls, prebrks and whatnot to run
16:21mwk: not fun
16:21mwk: and then I'd have to somehow dump the state
16:21mwk: I think that might be doable by the trap handler, but it was tricky
16:22imirkin: iirc you had a doc that described it
16:22imirkin: anyways... don't start with control flow
16:22mwk: but there's a very real possibility of there being some kind of state I can't reach into
16:22imirkin: start with everything else :p
16:22mwk: and yeah
16:23mwk: I'm not going to start with control flow, don't worry
16:23mwk: you *will* have your XMAD tested
16:23mwk: in fact, I'll likely do that before I get to tesla control flow
16:23imirkin: and MADSP
16:24imirkin: there are just a lot of angry opcodes i guess :)
16:25mwk: I don't exactly have much time for nouveau right now, but I should be done with Tesla data-only tests in 2 weeks or so, Fermi data-only would be next in line
16:25imirkin: ok cool
16:25imirkin: looking forward to it :)
16:25mwk: and as for what you said about using falcon to access grctx... you think I haven't tried that?
16:25imirkin: i won't be mean and ask you to test surface ops
16:25imirkin: no, i think you've tried everything i can think of and about a dozen other things
16:26mwk: mp state, ie. GPRs, is *not* part of the grctx state switched by falcon
16:26mwk: or of tesla state switched by ctxprog
16:26imirkin: yeah that actually makes sense
16:26imirkin: there's gotta be a stupid debug area *somewhere* though =/
16:26mwk: nor of any analogous construct
16:26mwk: I'm hoping for one
16:27mwk: there is one on tesla, and it's entirely unrelated to any other way of accessing any context
16:27mwk: if there is one on fermi, it's very different from tesla
16:28mwk: you know
16:28mwk: nv claimed to be working on full preemption for quite a few generations now
16:29mwk: so there just *has* to be a way to dump *and reload* that state
16:29imirkin: but not necessarily on fermi
16:29mwk: I don't know, it may be buggy, it may be unable to perform a full and proper context switch, it may be horribly slow
16:29mwk: but it should definitely be there because they tried
16:31imirkin: maybe not for resuming shaders in funky states
16:31imirkin: who knows
16:31imirkin: they could just wait for invocations to quiesce
16:32mwk: and another argument
16:32mwk: they had it on tesla, they used it for cuda debugging
16:33mwk: sure, they had the shiny new trap handler thing for fermi... but, bringing up a piece of silicon is hard and expensive
16:33mwk: it'd make all kinds of sense to keep some simpler mechanism as a precaution if something goes wrong with traps
16:34imirkin: could be they had some stuff on GF100 that disappeared later on
16:34mwk: removing stuff from a proven design... expensive
16:34mwk: and even if they did... luckily I do own a GF100 :)
16:34imirkin: GF100 wasn't exactly perfect
16:35imirkin: wasn't it massively inefficient?
16:36mwk: yep, but the fix to that was called "Kepler"
16:40mwk: there are lots of pieces in fermi that I don't understand
16:40mwk: in fact, one of the major problems is that I don't know what pieces there are
16:41imirkin: yeah, for me there's just one piece i don't understand
16:41imirkin: "the gpu" :)
16:42imirkin: i feel like the more pieces you don't understand, the closer you are to understanding everything
16:44mwk: and I still have to test the crap out of tesla
16:45mwk: would be nice to finally know all the intermediate buffers with the funny memory spaces like a
16:46mwk: imirkin: well, I do have a nice map of tesla PGRAPH
16:46mwk: it has well-defined units and I know what each does and what the rough path of data is
16:47imirkin: pretty soon you'll be fab'ing your own tesla's
16:47mwk: for fermi... most of my map has "hic sunt dracones" written over it
16:47imirkin: not entirely inaccurate :)
16:49mwk: and it's so complex to set up...
16:49mwk: you have to allocate all these buffers in memory
16:50mwk: who knows what they're doing
16:50mwk: and set up the crazy MMIO regs
16:51mwk: a Tesla PGRAPH just needs FIFO access enabled and it's ready to accept methods
17:02mupuf: mwk: maybe there is a fused-out JTAG interface to access the inner state
17:02mupuf: you know, as a way to hinder reversers
17:24imirkin: mwk: do you know how to do atomic adds on shared memory on fermi/kepler? nouveau tries to do it by using g[$sbase] but that doesn't work -- the op only works on real gmem, not the windowed-away areas
17:30imirkin: i wonder if i need to do a quadop or something
17:34imirkin: mwk: something to do with ld lock/st lock?
17:38imirkin: and wtf is LDS_LDU??
17:46imirkin: hakzsam: can you do a trace of what the blob does with atomics on shared mem at some point?
23:24hakzsam: imirkin, we already have one for fermi, but demmt doesn't decode it properly
23:24hakzsam: I will do it on kepler too
23:27hakzsam: imirkin, another way: write a little test in cuda
23:55mwk: imirkin: indeed it involves the lock/unlock
23:56mwk: forget everything you know about the g atomics, they won't work on shared
23:56mwk: instead, read up on load locked & store conditional
23:56mwk: ... except this here is *not* it
23:57mwk: but similar
23:57mwk: here's how it works: first, you load the word from s memory, with ld lock s instruction
23:58mwk: this instruction returns a status in a $c [tesla] or $p [fermi] register
23:59mwk: I forgot what value means what, but, if you get "successful", the word is locked for you
23:59mwk: you can do any operations on the thing you loaded