public inbox for [email protected]
 help / color / mirror / Atom feed
* [RFC] struct filename, io_uring and audit troubles
@ 2024-09-22  0:49 Al Viro
  2024-09-22  4:10 ` Al Viro
  2024-09-23  1:50 ` Al Viro
  0 siblings, 2 replies; 23+ messages in thread
From: Al Viro @ 2024-09-22  0:49 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: audit, io-uring

	Looks like things like async unlink might fuck the audit
very badly.  io_uring does getname() in originating thread and uses the
result at the time of operation, which can happen in a different thread.
Moreover, by that time the original syscall might have very well returned.

	The trouble is, getname() establishes linkage between the struct
filename and struct audit_name; filename->aname and audit_name->name
respectively.  struct filename can get moved from one thread to another;
struct audit_name is very much tied to audit_context, which is per-thread
- first few (5, currently) audit_name instances are embedded into
audit_context.	The rest gets allocated dynamically, but all of them
are placed on audit_context::names_list.

	At audit_free_names() they are all wiped out - references
back to filename instances are dropped, dynamically allocated ones
are freed, and while embedded ones survive, they are zeroed out on
reuse by audit_alloc_name().  audit_free_names() is called on each
audit_reset_context(), which is done by __audit_syscall_exit() and
(in states other than AUDIT_SYSCALL_CONTEXT) __audit_uring_exit().

	Linkage from filename to audit_name is used by __audit_inode().
It definitely expects the reference back to filename to be stable.
And in situation when io_uring has offloaded a directory operation to
helper thread, that is not guaranteed.

	Another fun bit is that both audit_inode() and audit_inode_child()
may bump the refcount on struct filename.  Which can get really fishy
if they get called by helper thread while the originator is exiting the
syscall - putname() from audit_free_names() in originator vs. refcount
increment in helper is Not Nice(tm), what with the refcount not being
atomic.

	Potential solutions:

* don't bother with audit_name creation and linkage in getname(); do that
when we start using the sucker.  Doing that from __set_nameidata() will
catch the majority of the stuff that ever gets audit_inode* called for it
(the only exceptions are mq_open(2) and mq_unlink(2)).  Unfortunately,
each audit_name instance gets spewed into logs, so we would need to
bring the rest of that shite in, including the things like symlink
bodies (note that for io_uring-originating symlink we'd need that done
in do_symlinkat()), etc.  Unpleasant, that.

* make refcount atomic, add a pointer to audit_context or even task_struct
in audit_name, have the "use name->aname if the type is acceptable"
logics in audit_inode dependent upon the name->aname->owner matching
what we want.  With some locking to make the check itself safe.

* make refcount atomic, get rid of ->aname and have audit_inode() scan
the names_list for entries with matching ->name and type - and that
before the existing scan with ->name->name comparisons.

* something else?

	Suggestions _not_ involving creative uses of TARDIS would
be welcome.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2024-09-26  3:56 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-22  0:49 [RFC] struct filename, io_uring and audit troubles Al Viro
2024-09-22  4:10 ` Al Viro
2024-09-22 15:09   ` Al Viro
2024-09-23  1:50 ` Al Viro
2024-09-23  6:30   ` Jens Axboe
2024-09-23 12:54     ` Paul Moore
2024-09-23 14:48       ` Al Viro
2024-09-23 16:14         ` Paul Moore
2024-09-23 18:17           ` Al Viro
2024-09-23 23:49             ` Paul Moore
2024-09-23 20:36           ` Al Viro
2024-09-24  0:11             ` Paul Moore
2024-09-24  7:01               ` Al Viro
2024-09-24 23:17                 ` Paul Moore
2024-09-25 20:44               ` Al Viro
2024-09-25 20:58                 ` Paul Moore
2024-09-24 21:40             ` Al Viro
2024-09-25  6:01               ` Jens Axboe
2024-09-25 17:39                 ` Al Viro
2024-09-25 17:58                   ` Jens Axboe
2024-09-26  3:56                 ` Al Viro
2024-09-23 15:07     ` Al Viro
2024-09-24 11:15       ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox