[11:13:13]  ajax: we really do need a way for a server to tell the kernel when it's doing work on behalf of a particular client
[11:13:31]  that would solve the whole dbus auditing problem steve grubb has too
[11:13:54]  maybe something to talk about at plumbers...
[11:14:49]  solaris had an X extension for this, apparently?
[11:14:50]  https://bugs.freedesktop.org/show_bug.cgi?id=2192
[11:15:30]  i forget most of the details, i think that one was mostly about bumping client priority to reduce frontend latency, not about throttling stupid shit
[11:16:22]  hmm too hard to grok that in passing
[11:17:19]  ***  pjones has left chat #fedora-kernel (Changing host).
[11:17:19]  ***  pjones (~pjones@fedora/pjones) has joined chat #fedora-kernel.
[11:18:08]  basically i think we need something like prctl(PR_SET_CLIENT, client_fd)
[11:18:57]  and from that point on the kernel knows the process is doing work on behalf of the process on the other side of the fd
[11:19:08]  i think that's too much overhead for X, but for audit sure.
[11:19:24]  oh i see
[11:19:27]  i really can't afford two more syscalls per request
[11:19:36]  well we could do it in a smarter way
[11:19:40]  well, okay, per soft-ctxsw, but sure.
[11:19:54]  fcntl
[11:20:48]  maybe.  if you make it an fcntl then obviously you enter a client context once you do read()
[11:20:56]  but when do you exit that client's context?
[11:21:06]  write?  next read/select?
[11:21:12]  Optimizing Unix Resource Scheduling for User Interaction
[11:21:12]  Steve Evans, Kevin Clarke, Dave Singleton, Bart Smaalders
[11:21:13]  SunSoft Inc.
[11:21:16]  that's a blast from the past. :)
[11:21:47]  ajax: hmm
[11:22:04]  yeah, there's been plenty of work on this in the past. remember when reading it that HZ used to be 100.
[11:22:15]  which is _way_ too slow for graphics
[11:22:52]  yeah, i can't think of many ways to do this that aren't a completely horrid hack.
[11:23:04]  i'll read the paper after lunch.
[11:23:18]  i think rcvbuf does actually get me a lot of the way to where i want to be
[11:24:15]  i don't know what my typical readq is for a greedy client, but i can find that out, that's just numerology
[11:28:27]  so we could have a magic futex associated with each client
[11:28:40]  when you hold it, the kernel knows you're serving that client
[11:29:40]  hmm but the kernel doesn't get involved when it's uncontended i guess, reading the man page
[11:30:47]  i guess the point is, you can't add any new calls at all
[11:30:55]  it has to be implicitly figured out from what you already do
[11:30:57]  i wonder if we could do something awesome with fuse.
[11:31:02]  since what you already do is performance critical
[11:31:53]  unless it was something really fast, like writing to a special address in memory?
[11:33:07]  hrm.
[11:33:17]  we could do the prctl thing with a vdso. that would be relatively fast.
[11:33:51]  oh like the gettimeofday hack?
[11:33:58]  yeh.
[11:34:03]  and getpid.
[11:34:12]  (Wait, did we ever put getpid in there?)
[11:34:28]  pretty sure we did
[11:34:36]  i thought that only worked for reading from the kernel not writing to the kernel?
[11:34:49]  back in a bit.
[11:35:00]  halfline, we can't write to kernel space, but we can put something somewhere the kernel can easily get.
[11:35:01]  vdso is readonly right now, yeah
[11:35:28]  a writeable vdso segment isn't _that_ much logically different from a futex
[11:35:46]  it's just a bunch of predefined futexes..
[11:37:10]  okay i don't know much about them
[11:40:03]  i think we just need some writeable mapped memory, a single integer where we write to the kernel "i'm handling this client now" and afterwards "i'm done"
[11:40:43]  that doesn't help scheduling
[11:41:03]  accounting, sure, because that will read the ctx value out when it needs it
[11:41:29]  but once the kernel has the accounting information, it can perform scheduling tweaks for you
[11:41:36]  but the scheduler assumes ctx transitions happen at scheduling, you'd have to tell it more explicitly
[11:41:41]  the write itself won't trigger anything
[11:41:44]  unless it's a pagefault
[11:41:58]  and page faults are very expensive
[11:43:24]  anyway, something to gnaw on.
[11:44:13]  maybe the answer is what you said originally, force all apps to only deal with one fd per iteration of poll
[11:44:37]  and mark that one fd ahead as one to use for accounting
[11:44:49]  accounting ends on next poll
[11:45:05]  maybe.
[11:45:18]  but i mean, that's sort of secondary?
[11:45:30]  or maybe accounting ends on next poll or on next read of some other fd in the fdset
[11:45:35]  the problem that attempts to solve is the scheduler making bad decisions
[11:46:03]  and i don't think it's making bad decisions.  there appears to be more work to do so it's doing it.
[11:46:34]  well the issue is, the kernel can only make decisions based on the available information
[11:46:38]  if i want to influence that i should make it look like there's nothing to do.
[11:46:50]  and for a server, one important piece of information is which clients it's serving and when
[11:46:54]  but the kernel doesn't have that information
[11:46:58]  ***  adamw has left chat #fedora-kernel (Quit: Coyote finally caught me).
[11:47:16]  that's why i'm saying try shrinking the receive buffer
[11:47:28]  ajax: your recvbuf thing will probably be "good enough" for your specific issue
[11:47:33]  client write()s, it blocks because the buffer is full.
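Editor's sketch of the receive-buffer idea discussed above: a rough illustration, not anything taken from the X server. It shrinks SO_RCVBUF on the server side and shows that a client which keeps writing eventually cannot queue more data. The function name `fill_until_blocked`, the 4096-byte size, and the use of TCP loopback are all assumptions made for the example; TCP is used because Linux's unix(7) documents SO_RCVBUF as having no effect on UNIX domain sockets. The client is non-blocking here so the stall shows up as an error instead of a hang.

```python
import socket


def fill_until_blocked(rcvbuf_bytes=4096, chunk=b"x" * 1024, limit=1 << 26):
    """Write to a server that never reads; return (bytes_queued, blocked)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the small buffer before listen() so the accepted socket inherits it.
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()

    # A blocking client would simply sleep in write(); non-blocking mode lets
    # the example observe the same condition as BlockingIOError.
    client.setblocking(False)
    queued = 0
    blocked = False
    try:
        while queued < limit:
            try:
                queued += client.send(chunk)
            except BlockingIOError:
                blocked = True  # both socket buffers are full
                break
    finally:
        client.close()
        server.close()
        listener.close()
    return queued, blocked
```

The number of bytes queued before the stall depends on the kernel's buffer rounding and on the client's own send buffer, so the sketch reports it without assuming a particular value.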
[11:47:35]  and that's fine
[11:48:00]  i suspect fixing that will actually make it so accounting tricks aren't even needed though
[11:48:06]  was just hoping to kill some other birds with a new stone
[11:48:27]  well i'd like to fix the sgrubb auditing problem too
[11:49:03]  but i guess you probably don't want an audit entry for every draw operation in the x server so...
[11:49:10]  maybe i'm shoehorning where i shouldn't be
[11:49:11]  yeah.  i think those end up being different enough problems that you don't want to conflate them.
[11:49:19]  good thought and all, but.
[11:51:50]  ***  adamw (~adamw@redhat/adamw) has joined chat #fedora-kernel.
[11:52:53]  part of the issue is, the accounting has to be very fast to be something X could make use of for improved scheduling, which probably means it has to be deduced implicitly
[11:53:12]  but the accounting has to be very trustworthy and accurate for it to be something audit could make use of
[11:53:22]  which probably means it can't be deduced implicitly
[11:53:34]  yeah.  make it work, make it good, then make it fast.
[11:54:44]  prctl seems entirely reasonable for audit's needs and if i ever think i need it for X i can probably make it work
[11:54:59]  like, right now we don't do any estimation of request cost
[11:55:21]  which is lame.  i've got a ton of information about that.
[11:55:36]  and i try to drain multiple reqs per read(), so.
[11:56:07]  i could amortize the prctl across multiple reqs and only fire it if i think the next timeslice is going to be expensive
[11:57:14]  yea maybe something like that would work
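Editor's sketch of the amortization idea in the last few messages: a guess at the shape of the logic, not code from any server. PR_SET_CLIENT does not exist, so the `mark_client` callable below is a hypothetical stand-in for it. `serve_batch`, `estimate_cost`, `handle`, and the threshold value are likewise invented for illustration. The marker fires at most once per drained batch, and only when the estimated cost of the batch crosses the threshold.

```python
def serve_batch(client_id, requests, estimate_cost, mark_client, handle,
                threshold=100):
    """Handle one batch drained from a single read().

    mark_client is a hypothetical hook standing in for the proposed
    prctl(PR_SET_CLIENT, client_fd); it is called with the client id on
    entry and with None on exit, and only for batches judged expensive.
    Returns True if the marker was used.
    """
    total = sum(estimate_cost(req) for req in requests)
    marked = total >= threshold
    if marked:
        mark_client(client_id)  # "i'm handling this client now"
    try:
        for req in requests:
            handle(req)
    finally:
        if marked:
            mark_client(None)  # "i'm done"
    return marked
```

With a cost table such as `{"noop": 1, "blit": 80}` and a threshold of 100, a batch of two `noop` requests is served with no marker calls, while a batch of two `blit` requests pays for exactly one enter and one exit call.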