commit 3a3877de44342d0a09216dfbe674a404e8f5e96f
Author: Greg Kroah-Hartman
Date:   Fri Jun 21 14:54:16 2024 +0200

    Linux 5.10.220

    Link: https://lore.kernel.org/r/20240618123407.280171066@linuxfoundation.org
    Tested-by: Jon Hunter
    Tested-by: Pavel Machek (CIP)
    Tested-by: Dominique Martinet
    Tested-by: Florian Fainelli
    Tested-by: Linux Kernel Functional Testing
    Tested-by: Salvatore Bonaccorso
    Signed-off-by: Greg Kroah-Hartman

commit 9444ce5cd48802188b73315dfde9e341fc3f75a2
Author: Trond Myklebust
Date:   Thu Feb 15 20:24:50 2024 -0500

    nfsd: Fix a regression in nfsd_setattr()

    [ Upstream commit 6412e44c40aaf8f1d7320b2099c5bdd6cb9126ac ]

    Commit bb4d53d66e4b ("NFSD: use (un)lock_inode instead of fh_(un)lock for file operations") broke the NFSv3 pre/post op attributes behaviour when doing a SETATTR rpc call by stripping out the calls to fh_fill_pre_attrs() and fh_fill_post_attrs().

    Fixes: bb4d53d66e4b ("NFSD: use (un)lock_inode instead of fh_(un)lock for file operations")
    Signed-off-by: Trond Myklebust
    Reviewed-by: Jeff Layton
    Reviewed-by: NeilBrown
    Message-ID: <20240216012451.22725-1-trondmy@kernel.org>
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit a1a153fc73cc9184bcdfcd3a9f619705e35e53b7
Author: NeilBrown
Date:   Wed Jan 31 11:17:40 2024 +1100

    nfsd: don't call locks_release_private() twice concurrently

    [ Upstream commit 05eda6e75773592760285e10ac86c56d683be17f ]

    It is possible for free_blocked_lock() to be called twice concurrently, once from nfsd4_lock() and once from nfsd4_release_lockowner() calling remove_blocked_locks(). This is why a kref was added.

    It is perfectly safe for locks_delete_block() and kref_put() to be called in parallel as they use locking or atomicity respectively as protection. However locks_release_private() has no locking. It is safe for it to be called twice sequentially, but not concurrently.
    This patch moves that call from free_blocked_lock() where it could race with itself, to free_nbl() where it cannot. This will slightly delay the freeing of private info or release of the owner - but not by much. It is arguably more natural for this freeing to happen in free_nbl() where the structure itself is freed.

    This bug was found by code inspection - it has not been seen in practice.

    Fixes: 47446d74f170 ("nfsd4: add refcount for nfsd4_blocked_lock")
    Signed-off-by: NeilBrown
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit feb3352af74273424cd4ae68650196a73a5a9d54
Author: NeilBrown
Date:   Mon Feb 5 13:22:39 2024 +1100

    nfsd: don't take fi_lock in nfsd_break_deleg_cb()

    [ Upstream commit 5ea9a7c5fe4149f165f0e3b624fe08df02b6c301 ]

    A recent change to check_for_locks() changed it to take ->flc_lock while holding ->fi_lock. This creates a lock inversion (reported by lockdep) because there is a case where ->fi_lock is taken while holding ->flc_lock.

    ->flc_lock is held across ->fl_lmops callbacks, and nfsd_break_deleg_cb() is one of those and does take ->fi_lock. However it doesn't need to.

    Prior to v4.17-rc1~110^2~22 ("nfsd: create a separate lease for each delegation") nfsd_break_deleg_cb() would walk the ->fi_delegations list and so needed the lock. Since then it doesn't walk the list and doesn't need the lock.

    Two actions are performed under the lock. One is to call nfsd_break_one_deleg which calls nfsd4_run_cb(). These don't act on the nfs4_file at all, so they don't need the lock.

    The other is to set ->fi_had_conflict which is in the nfs4_file. This field is only ever set here (except when initialised to false) so there is no possible problem with multiple threads racing when setting it.

    The field is tested twice in nfs4_set_delegation(). The first test does not hold a lock and is documented as an opportunistic optimisation, so it doesn't impose any need to hold ->fi_lock while setting ->fi_had_conflict.
    The second test in nfs4_set_delegation() *is* made under ->fi_lock, so removing the locking when ->fi_had_conflict is set could make a change. The change could only be interesting if ->fi_had_conflict tested as false even though nfsd_break_one_deleg() ran before ->fi_lock was unlocked. i.e. while hash_delegation_locked() was running. As hash_delegation_locked() doesn't interact in any way with nfsd4_run_cb() there can be no importance to this interaction.

    So this patch removes the locking from nfsd_break_one_deleg() and moves the final test on ->fi_had_conflict out of the locked region to make it clear that locking isn't important to the test. It is still tested *after* vfs_setlease() has succeeded. This might be significant, and as vfs_setlease() takes ->flc_lock and nfsd_break_one_deleg() is called under ->flc_lock, this "after" is a true ordering provided by a spinlock.

    Fixes: edcf9725150e ("nfsd: fix RELEASE_LOCKOWNER")
    Signed-off-by: NeilBrown
    Reviewed-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 99fb654d01dc3f08b5905c663ad6c89a9d83302f
Author: NeilBrown
Date:   Mon Jan 22 14:58:16 2024 +1100

    nfsd: fix RELEASE_LOCKOWNER

    [ Upstream commit edcf9725150e42beeca42d085149f4c88fa97afd ]

    The test on so_count in nfsd4_release_lockowner() is nonsense and harmful. Revert to using check_for_locks(), changing that to not sleep.

    First: harmful. As is documented in the kdoc comment for nfsd4_release_lockowner(), the test on so_count can transiently return a false positive resulting in a return of NFS4ERR_LOCKS_HELD when in fact no locks are held. This is clearly a protocol violation and with the Linux NFS client it can cause incorrect behaviour.
    If RELEASE_LOCKOWNER is sent while some other thread is still processing a LOCK request which failed because, at the time that request was received, the given owner held a conflicting lock, then the nfsd thread processing that LOCK request can hold a reference (conflock) to the lock owner that causes nfsd4_release_lockowner() to return an incorrect error.

    The Linux NFS client ignores that NFS4ERR_LOCKS_HELD error because it never sends NFS4_RELEASE_LOCKOWNER without first releasing any locks, so it knows that the error is impossible. It assumes the lock owner was in fact released so it feels free to use the same lock owner identifier in some later locking request.

    When it does reuse a lock owner identifier for which a previous RELEASE failed, it will naturally use a lock_seqid of zero. However the server, which didn't release the lock owner, will expect a larger lock_seqid and so will respond with NFS4ERR_BAD_SEQID.

    So clearly it is harmful to allow a false positive, which testing so_count allows.

    The test is nonsense because ... well... it doesn't mean anything.

    so_count is the sum of three different counts.
    1/ the set of states listed on so_stateids
    2/ the set of active vfs locks owned by any of those states
    3/ various transient counts such as for conflicting locks.

    When it is tested against '2' it is clear that one of these is the transient reference obtained by find_lockowner_str_locked(). It is not clear what the other one is expected to be.

    In practice, the count is often 2 because there is precisely one state on so_stateids. If there were more, this would fail.

    In my testing I see two circumstances when RELEASE_LOCKOWNER is called. In one case, CLOSE is called before RELEASE_LOCKOWNER. That results in all the lock states being removed, and so the lockowner being discarded (it is removed when there are no more references which usually happens when the lock state is discarded).
    When nfsd4_release_lockowner() finds that the lock owner doesn't exist, it returns success.

    The other case shows an so_count of '2' and precisely one state listed in so_stateids. It appears that the Linux client uses a separate lock owner for each file resulting in one lock state per lock owner, so this test on '2' is safe. For another client it might not be safe.

    So this patch changes check_for_locks() to use the (newish) find_any_file_locked() so that it doesn't take a reference on the nfs4_file and so never calls nfsd_file_put(), and so never sleeps. With this check it is safe to restore the use of check_for_locks() rather than testing so_count against the mysterious '2'.

    Fixes: ce3c4ad7f4ce ("NFSD: Fix possible sleep during nfsd4_release_lockowner()")
    Signed-off-by: NeilBrown
    Reviewed-by: Jeff Layton
    Cc: stable@vger.kernel.org # v6.2+
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit ca791e1a31cfa36c51f5c895258828f890edbc2e
Author: Jeff Layton
Date:   Wed Jan 3 08:36:52 2024 -0500

    nfsd: drop the nfsd_put helper

    [ Upstream commit 64e6304169f1e1f078e7f0798033f80a7fb0ea46 ]

    It's not safe to call nfsd_put once nfsd_last_thread has been called, as that function will zero out the nn->nfsd_serv pointer.

    Drop the nfsd_put helper altogether and open-code the svc_put in its callers instead. That allows us to not be reliant on the value of that pointer when handling an error.

    Fixes: 2a501f55cd64 ("nfsd: call nfsd_last_thread() before final nfsd_put()")
    Reported-by: Zhi Li
    Cc: NeilBrown
    Signed-off-by: Jeffrey Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 838a602db75d017c891af8db2e2a0a438f7bd70e
Author: NeilBrown
Date:   Fri Dec 15 11:56:31 2023 +1100

    nfsd: call nfsd_last_thread() before final nfsd_put()

    [ Upstream commit 2a501f55cd641eb4d3c16a2eab0d678693fac663 ]

    If write_ports_addfd or write_ports_addxprt fail, they call nfsd_put() without calling nfsd_last_thread(). This leaves nn->nfsd_serv pointing to a structure that has been freed.
    So remove 'static' from nfsd_last_thread() and call it when the nfsd_serv is about to be destroyed.

    Fixes: ec52361df99b ("SUNRPC: stop using ->sv_nrthreads as a refcount")
    Signed-off-by: NeilBrown
    Reviewed-by: Jeff Layton
    Cc:
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit e35cb663a462918a36183cec73cf55dc8963ac0c
Author: NeilBrown
Date:   Tue Sep 12 11:25:00 2023 +1000

    NFSD: fix possible oops when nfsd/pool_stats is closed.

    [ Upstream commit 88956eabfdea7d01d550535af120d4ef265b1d02 ]

    If /proc/fs/nfsd/pool_stats is open when the last nfsd thread exits, then when the file is closed a NULL pointer is dereferenced. This is because nfsd_pool_stats_release() assumes that the pointer to the svc_serv cannot become NULL while a reference is held.

    This used to be the case but a recent patch split nfsd_last_thread() out from nfsd_put(), and clearing the pointer is done in nfsd_last_thread().

    This is easily reproduced by running
       rpc.nfsd 8 ; ( rpc.nfsd 0;true) < /proc/fs/nfsd/pool_stats

    Fortunately nfsd_pool_stats_release() has easy access to the svc_serv pointer, and so can call svc_put() on it directly.

    Fixes: 9f28a971ee9f ("nfsd: separate nfsd_last_thread() from nfsd_put()")
    Signed-off-by: NeilBrown
    Reviewed-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 3add01e067483833c302c1815b5df02491832fb9
Author: Chuck Lever
Date:   Fri Aug 25 15:04:23 2023 -0400

    Documentation: Add missing documentation for EXPORT_OP flags

    [ Upstream commit b38a6023da6a12b561f0421c6a5a1f7624a1529c ]

    The commits that introduced these flags neglected to update the Documentation/filesystems/nfs/exporting.rst file.
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit d31cd25f5501d28dfe654856647253bb6eedd724
Author: NeilBrown
Date:   Mon Jul 31 16:48:32 2023 +1000

    nfsd: separate nfsd_last_thread() from nfsd_put()

    [ Upstream commit 9f28a971ee9fdf1bf8ce8c88b103f483be610277 ]

    Now that the last nfsd thread is stopped by an explicit act of calling svc_set_num_threads() with a count of zero, we only have a limited number of places that can happen, and don't need to call nfsd_last_thread() in nfsd_put().

    So separate that out and call it at the two places where the number of threads is set to zero.

    Move the clearing of ->nfsd_serv and the call to svc_xprt_destroy_all() into nfsd_last_thread(), as they are really part of the same action.

    nfsd_put() is now a thin wrapper around svc_put(), so make it a static inline.

    nfsd_put() cannot be called after nfsd_last_thread(), so in a couple of places we have to use svc_put() instead.

    Signed-off-by: NeilBrown
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 987c0e102874a27f40c06f4c94853e9979c9ee78
Author: NeilBrown
Date:   Mon Jul 31 16:48:31 2023 +1000

    nfsd: Simplify code around svc_exit_thread() call in nfsd()

    [ Upstream commit 18e4cf915543257eae2925671934937163f5639b ]

    Previously a thread could exit asynchronously (due to a signal) so some care was needed to hold nfsd_mutex over the last svc_put() call. Now a thread can only exit when svc_set_num_threads() is called, and this is always called under nfsd_mutex. So no care is needed.

    Not only is the mutex held when a thread exits now, but the svc refcount is elevated, so the svc_put() in svc_exit_thread() will never be a final put, so the mutex isn't even needed at this point in the code.

    Signed-off-by: NeilBrown
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 7229200f68662660bb4d55f19247eaf3c79a4217
Author: Chuck Lever
Date:   Mon Jun 3 10:35:02 2024 -0400

    nfsd: don't allow nfsd threads to be signalled.
    [ Upstream commit 3903902401451b1cd9d797a8c79769eb26ac7fe5 ]

    The original implementation of nfsd used signals to stop threads during shutdown. In Linux 2.3.46pre5 nfsd gained the ability to shut down threads internally if it was asked to run "0" threads. After this user-space transitioned to using "rpc.nfsd 0" to stop nfsd and sending signals to threads was no longer an important part of the API.

    In commit 3ebdbe5203a8 ("SUNRPC: discard svo_setup and rename svc_set_num_threads_sync()") (v5.17-rc1~75^2~41) we finally removed the use of signals for stopping threads, using kthread_stop() instead.

    This patch makes the "obvious" next step and removes the ability to signal nfsd threads - or any svc threads. nfsd stops allowing signals and we don't check for their delivery any more.

    This will allow for some simplification in later patches.

    A change worth noting is in nfsd4_ssc_setup_dul(). There was previously a signal_pending() check which would only succeed when the thread was being shut down. It should really have tested kthread_should_stop() as well. Now it just does the latter, not the former.

    Signed-off-by: NeilBrown
    Reviewed-by: Jeff Layton
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 8ef87fe6e87f754daa9d1b9f2cff71008f07da44
Author: Tavian Barnes
Date:   Fri Jun 23 17:09:06 2023 -0400

    nfsd: Fix creation time serialization order

    [ Upstream commit d7dbed457c2ef83709a2a2723a2d58de43623449 ]

    In nfsd4_encode_fattr(), TIME_CREATE was being written out after all other times. However, they should be written out in an order that matches the bit flags in bmval1, which in this case are

        #define FATTR4_WORD1_TIME_ACCESS   (1UL << 15)
        #define FATTR4_WORD1_TIME_CREATE   (1UL << 18)
        #define FATTR4_WORD1_TIME_DELTA    (1UL << 19)
        #define FATTR4_WORD1_TIME_METADATA (1UL << 20)
        #define FATTR4_WORD1_TIME_MODIFY   (1UL << 21)

    so TIME_CREATE should come second.

    I noticed this on a FreeBSD NFSv4.2 client, which supports creation times.
    On this client, file times were weirdly permuted. With this patch applied on the server, times looked normal on the client.

    Fixes: e377a3e698fb ("nfsd: Add support for the birth time attribute")
    Link: https://unix.stackexchange.com/q/749605/56202
    Signed-off-by: Tavian Barnes
    Reviewed-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 72f28b5ad0b5f81208df16e75d241f13924d0d43
Author: Chuck Lever
Date:   Mon Jun 12 10:13:39 2023 -0400

    NFSD: Add an nfsd4_encode_nfstime4() helper

    [ Upstream commit 262176798b18b12fd8ab84c94cfece0a6a652476 ]

    Clean up: de-duplicate some common code.

    Reviewed-by: Jeff Layton
    Acked-by: Tom Talpey
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit b4417c53d4f90c6bd25b7778b1e956179fd36f0c
Author: NeilBrown
Date:   Sat Jun 3 07:14:14 2023 +1000

    lockd: drop inappropriate svc_get() from locked_get()

    [ Upstream commit 665e89ab7c5af1f2d260834c861a74b01a30f95f ]

    The below-mentioned patch was intended to simplify refcounting on the svc_serv used by lockd.

    The goal was to only ever have a single reference from the single thread. To that end we dropped a call to lockd_start_svc() (except when creating thread) which would take a reference, and dropped the svc_put(serv) that would drop that reference.

    Unfortunately we didn't also remove the svc_get() from lockd_create_svc() in the case where the svc_serv already existed. So after the patch:
     - on the first call the svc_serv was allocated and the one reference was given to the thread, so there are no extra references
     - on subsequent calls svc_get() was called so there is now an extra reference.
    This is clearly not consistent.

    The inconsistency is also clear in the current code: lockd_get() takes *two* references, one on nlmsvc_serv and one by incrementing nlmsvc_users. This clearly does not match lockd_put().

    So: drop that svc_get() from lockd_get() (which used to be in lockd_create_svc()).
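The balanced counting that the lockd message above argues for can be illustrated with a small user-space model. This is only a sketch and not the lockd code: `toy_serv`, `toy_users`, `toy_get()` and `toy_put()` are hypothetical stand-ins for the service object, its user count and the get/put pair. The assumption is that one reference is created together with the service and that later callers only increase the user count.

```c
#include <stddef.h>

/* Hypothetical stand-in for the service object; not a kernel structure. */
struct toy_serv {
	int refs;
};

static struct toy_serv toy_storage;
static struct toy_serv *toy_serv;   /* NULL while no service exists */
static unsigned int toy_users;      /* number of outstanding toy_get() calls */

/* Take a user count; create the service on the first call only. */
static int toy_get(void)
{
	if (toy_serv == NULL) {
		toy_storage.refs = 1;   /* the single reference, held for the thread */
		toy_serv = &toy_storage;
	}
	/* Later calls add no extra reference, only a user. */
	toy_users++;
	return 0;
}

/* Drop a user count; tear the service down when the last user goes away. */
static void toy_put(void)
{
	if (toy_users == 0)
		return;
	if (--toy_users == 0) {
		toy_serv->refs--;       /* the thread's reference is dropped */
		toy_serv = NULL;
	}
}
```

With this shape, two `toy_get()` calls followed by two `toy_put()` calls leave the reference count at zero, which is the symmetry the message says was missing.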
    Reported-by: Ido Schimmel
    Closes: https://lore.kernel.org/linux-nfs/ZHsI%2FH16VX9kJQX1@shredder/T/#u
    Fixes: b73a2972041b ("lockd: move lockd_start_svc() call into lockd_create_svc()")
    Signed-off-by: NeilBrown
    Tested-by: Ido Schimmel
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit b28b5c726e49e8020d7143b5beede75850639aa3
Author: Dan Carpenter
Date:   Mon May 29 14:35:55 2023 +0300

    nfsd: fix double fget() bug in __write_ports_addfd()

    [ Upstream commit c034203b6a9dae6751ef4371c18cb77983e30c28 ]

    The bug here is that you cannot rely on getting the same socket from multiple calls to fget() because userspace can influence that. This is a kind of double fetch bug.

    The fix is to delete the svc_alien_sock() function and instead do the checking inside the svc_addsock() function.

    Fixes: 3064639423c4 ("nfsd: check passed socket's net matches NFSd superblock's one")
    Signed-off-by: Dan Carpenter
    Reviewed-by: NeilBrown
    Reviewed-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 8157832461bdae946485a17f7701c1215b85fd77
Author: Jeff Layton
Date:   Wed May 17 12:26:44 2023 -0400

    nfsd: make a copy of struct iattr before calling notify_change

    [ Upstream commit d53d70084d27f56bcdf5074328f2c9ec861be596 ]

    notify_change can modify the iattr structure. In particular it can end up setting ATTR_MODE when ATTR_KILL_SUID is already set, causing a BUG() if the same iattr is passed to notify_change more than once.

    Make a copy of the struct iattr before calling notify_change.
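The copy-before-call pattern described above can be sketched in user-space C. Everything here is hypothetical: `toy_iattr`, the `TOY_ATTR_*` flags and `toy_notify_change()` only imitate the behaviour the message describes (a callee that rewrites its argument and rejects an argument it has already rewritten), and a `-1` return stands in for the BUG().

```c
#define TOY_ATTR_MODE      (1u << 0)
#define TOY_ATTR_KILL_SUID (1u << 1)

/* Hypothetical stand-in for struct iattr. */
struct toy_iattr {
	unsigned int valid;
};

/*
 * Imitates a callee that modifies its argument: it refuses input with
 * both flags set, and otherwise adds MODE when KILL_SUID is requested.
 */
static int toy_notify_change(struct toy_iattr *attr)
{
	if ((attr->valid & TOY_ATTR_KILL_SUID) && (attr->valid & TOY_ATTR_MODE))
		return -1;
	if (attr->valid & TOY_ATTR_KILL_SUID)
		attr->valid |= TOY_ATTR_MODE;
	return 0;
}

/* Retry loop that hands the callee a fresh copy on every attempt. */
static int toy_setattr_with_retry(const struct toy_iattr *request, int attempts)
{
	int ret = 0;

	for (int i = 0; i < attempts; i++) {
		struct toy_iattr copy = *request;   /* caller's request stays untouched */

		ret = toy_notify_change(&copy);
		if (ret != 0)
			break;
	}
	return ret;
}
```

Passing the same structure to `toy_notify_change()` twice fails on the second call, while `toy_setattr_with_retry()` can repeat the call safely because each attempt works on its own copy.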
    Reported-by: Zhi Li
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2207969
    Tested-by: Zhi Li
    Fixes: 34b91dda7124 ("NFSD: Make nfsd4_setattr() wait before returning NFS4ERR_DELAY")
    Signed-off-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 05f45f3981d392f9f3ced9bd5302ad981bb55499
Author: Dai Ngo
Date:   Wed Apr 19 10:53:18 2023 -0700

    NFSD: Fix problem of COMMIT and NFS4ERR_DELAY in infinite loop

    [ Upstream commit 147abcacee33781e75588869e944ddb07528a897 ]

    The following request sequence to the same file causes the NFS client and server to get into an infinite loop with COMMIT and NFS4ERR_DELAY:

    OPEN
    REMOVE
    WRITE
    COMMIT

    Problem reported by recall11, recall12, recall14, recall20, recall22, recall40, recall42, recall48, recall50 of nfstest suite.

    This patch restores the handling of the race condition in nfsd_file_do_acquire with unlink to what it was prior to the regression.

    Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache")
    Signed-off-by: Dai Ngo
    Reviewed-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 6c05d25ca8990158aeea90f731a7a1ee5bef4ffa
Author: Jeff Layton
Date:   Fri Apr 14 17:31:44 2023 -0400

    nfsd: simplify the delayed disposal list code

    [ Upstream commit 92e4a6733f922f0fef1d0995f7b2d0eaff86c7ea ]

    When queueing a dispose list to the appropriate "freeme" lists, it pointlessly queues the objects one at a time to an intermediate list.

    Remove a few helpers and just open code a list_move to make it more clear and efficient. Better document the resulting functions with kerneldoc comments.
    Signed-off-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 56b36b8960e5d0a6650d5e6081fd4e994ca40d34
Author: Chuck Lever
Date:   Thu Nov 24 15:09:04 2022 -0500

    NFSD: Convert filecache to rhltable

    [ Upstream commit c4c649ab413ba6a785b25f0edbb12f617c87db2a ]

    While we were converting the nfs4_file hashtable to use the kernel's resizable hashtable data structure, Neil Brown observed that the list variant (rhltable) would be better for managing nfsd_file items as well. The nfsd_file hash table will contain multiple entries for the same inode -- these should be kept together on a list. And, it could be possible for exotic or malicious client behavior to cause the hash table to resize itself on every insertion.

    A nice simplification is that rhltable_lookup() can return a list that contains only nfsd_file items that match a given inode, which enables us to eliminate specialized hash table helper functions and use the default functions provided by the rhashtable implementation.

    Since we are now storing nfsd_file items for the same inode on a single list, that effectively reduces the number of hash entries that have to be tracked in the hash table. The minimum bucket count is therefore lowered.

    Light testing with fstests generic/531 shows no regressions.

    Suggested-by: Neil Brown
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 5a132ffa76bdd48a2ecfe7ff543a406c883e42f4
Author: Jeff Layton
Date:   Wed Feb 15 06:53:54 2023 -0500

    nfsd: allow reaping files still under writeback

    [ Upstream commit dcb779fcd4ed5984ad15991d574943d12a8693d1 ]

    On most filesystems, there is no reason to delay reaping an nfsd_file just because its underlying inode is still under writeback. nfsd just relies on client activity or the local flusher threads to do writeback.

    The main exception is NFS, which flushes all of its dirty data on last close.
    Add a new EXPORT_OP_FLUSH_ON_CLOSE flag to allow filesystems to signal that they do this, and only skip closing files under writeback on such filesystems.

    Also, remove a redundant NULL file pointer check in nfsd_file_check_writeback, and clean up nfs's export op flag definitions.

    Signed-off-by: Jeff Layton
    Acked-by: Anna Schumaker
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit f7b157737c64adeac414f31262ef3ba4f322d5e4
Author: Jeff Layton
Date:   Thu Jan 26 12:21:16 2023 -0500

    nfsd: update comment over __nfsd_file_cache_purge

    [ Upstream commit 972cc0e0924598cb293b919d39c848dc038b2c28 ]

    Signed-off-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit f593ea1423c66aa6fb3ab0f2243a1c4ec9f8a1af
Author: Jeff Layton
Date:   Wed Jan 18 12:31:37 2023 -0500

    nfsd: don't take/put an extra reference when putting a file

    [ Upstream commit b2ff1bd71db2a1b193a6dde0845adcd69cbcf75e ]

    The last thing that filp_close does is an fput, so don't bother taking and putting the extra reference.

    Signed-off-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit c3677c14b3d4fc7e537d8a2dfc2f6c96732fd8a7
Author: Jeff Layton
Date:   Thu Jan 5 07:15:12 2023 -0500

    nfsd: add some comments to nfsd_file_do_acquire

    [ Upstream commit b680cb9b737331aad271feebbedafb865504e234 ]

    David Howells mentioned that he found this bit of code confusing, so sprinkle in some comments to clarify.

    Reported-by: David Howells
    Signed-off-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit c9e8ed6efabe6580dbf6c6d6477d5b2dec4a4ec8
Author: Jeff Layton
Date:   Thu Jan 5 07:15:11 2023 -0500

    nfsd: don't kill nfsd_files because of lease break error

    [ Upstream commit c6593366c0bf222be9c7561354dfb921c611745e ]

    An error from break_lease is non-fatal, so we needn't destroy the nfsd_file in that case. Just put the reference like we normally would and return the error.
    Signed-off-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 2c95ad0a0cb9e624f79c89b697c6725692c7beee
Author: Jeff Layton
Date:   Fri Jan 6 10:39:01 2023 -0500

    nfsd: simplify test_bit return in NFSD_FILE_KEY_FULL comparator

    [ Upstream commit d69b8dbfd0866abc5ec84652cc1c10fc3d4d91ef ]

    test_bit returns bool, so we can just compare the result of that to the key->gc value without the "!!".

    Signed-off-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit e378da83577f6131a37eb79df12cb4276e657dfb
Author: Jeff Layton
Date:   Fri Jan 6 10:39:00 2023 -0500

    nfsd: NFSD_FILE_KEY_INODE only needs to find GC'ed entries

    [ Upstream commit 6c31e4c98853a4ba47355ea151b36a77c42b7734 ]

    Since v4 files are expected to be long-lived, there's little value in closing them out of the cache when there is conflicting access.

    Change the comparator to also match the gc value in the key. Change both of the current users of that key to set the gc value in the key to "true".

    Signed-off-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 9c599dee875463ba3888e9c05d1755a3896eba53
Author: Jeff Layton
Date:   Thu Jan 5 07:15:09 2023 -0500

    nfsd: don't open-code clear_and_wake_up_bit

    [ Upstream commit b8bea9f6cdd7236c7c2238d022145e9b2f8aac22 ]

    Signed-off-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 65a33135e91e6dd661ecdf1194b9d90c49ae3570
Author: Jeff Layton
Date:   Mon Mar 27 06:21:37 2023 -0400

    nfsd: call op_release, even when op_func returns an error

    [ Upstream commit 15a8b55dbb1ba154d82627547c5761cac884d810 ]

    For ops with "trivial" replies, nfsd4_encode_operation will shortcut most of the encoding work and skip to just marshalling up the status. One of the things it skips is calling op_release. This could cause a memory leak in the layoutget codepath if there is an error at an inopportune time.

    Have the compound processing engine always call op_release, even when op_func sets an error in op->status.
    With this change, we also need nfsd4_block_get_device_info_scsi to set the gd_device pointer to NULL on error to avoid a double free.

    Reported-by: Zhi Li
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2181403
    Fixes: 34b1744c91cc ("nfsd4: define ->op_release for compound ops")
    Signed-off-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 50827896c365e0f6c8b55ed56d444dafd87c92c5
Author: Chuck Lever
Date:   Fri Mar 31 16:31:19 2023 -0400

    NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL

    [ Upstream commit 804d8e0a6e54427268790472781e03bc243f4ee3 ]

    OPDESC() simply indexes into nfsd4_ops[] by the op's operation number, without range checking that value. It assumes callers are careful to avoid calling it with an out-of-bounds opnum value.

    nfsd4_decode_compound() is not so careful, and can invoke OPDESC() with opnum set to OP_ILLEGAL, which is 10044 -- well beyond the end of nfsd4_ops[].

    Reported-by: Jeff Layton
    Fixes: f4f9ef4a1b0a ("nfsd4: opdesc will be useful outside nfs4proc.c")
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 8235cd619db6e67f1d7d26c55f1f3e4e575c947d
Author: Jeff Layton
Date:   Fri Mar 17 13:13:08 2023 -0400

    nfsd: don't replace page in rq_pages if it's a continuation of last page

    [ Upstream commit 27c934dd8832dd40fd34776f916dc201e18b319b ]

    The splice read calls nfsd_splice_actor to put the pages containing file data into the svc_rqst->rq_pages array. It's possible however to get a splice result that only has a partial page at the end, if (e.g.) the filesystem hands back a short read that doesn't cover the whole page.

    nfsd_splice_actor will plop the partial page into its rq_pages array and return. Then later, when nfsd_splice_actor is called again, the remainder of the page may end up being filled out. At this point, nfsd_splice_actor will put the page into the array _again_ corrupting the reply.
    If this is done enough times, rq_next_page will overrun the array and corrupt the trailing fields -- the rq_respages and rq_next_page pointers themselves.

    If we've already added the page to the array in the last pass, don't add it to the array a second time when dealing with a splice continuation. This was originally handled properly in nfsd_splice_actor, but commit 91e23b1c3982 ("NFSD: Clean up nfsd_splice_actor()") removed the check for it.

    Fixes: 91e23b1c3982 ("NFSD: Clean up nfsd_splice_actor()")
    Cc: Al Viro
    Reported-by: Dario Lesca
    Tested-by: David Critch
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2150630
    Signed-off-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 37b34eb5677073ec6972f395dfe650cf44b421eb
Author: Jeff Layton
Date:   Tue Mar 14 06:20:58 2023 -0400

    lockd: set file_lock start and end when decoding nlm4 testargs

    [ Upstream commit 7ff84910c66c9144cc0de9d9deed9fb84c03aff0 ]

    Commit 6930bcbfb6ce dropped the setting of the file_lock range when decoding a nlm_lock off the wire. This causes the client side grant callback to miss matching blocks and reject the lock, only to rerequest it 30s later.

    Add a helper function to set the file_lock range from the start and end values that the protocol uses, and have the nlm_lock decoder call that to set up the file_lock args properly.
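A helper of the kind described above might look like the following user-space sketch. It is not the kernel helper: `toy_file_lock`, `TOY_OFFSET_MAX` and `toy_set_lock_range()` are hypothetical names, and the sketch assumes the wire format carries an unsigned 64-bit offset and length, with a length of zero meaning "to end of file" and any range that would pass the maximum signed offset being clamped to it.

```c
#include <stdint.h>

#define TOY_OFFSET_MAX INT64_MAX   /* assumed largest representable file offset */

/* Hypothetical stand-in for the start/end fields of a file lock. */
struct toy_file_lock {
	int64_t start;
	int64_t end;
};

/*
 * Convert an (offset, length) pair into an inclusive [start, end] range.
 * Returns -1 if the offset itself cannot be represented.
 */
static int toy_set_lock_range(struct toy_file_lock *fl, uint64_t off, uint64_t len)
{
	const uint64_t max = (uint64_t)TOY_OFFSET_MAX;

	if (off > max)
		return -1;
	fl->start = (int64_t)off;

	/* Zero length, or a range running past the maximum, means "to EOF". */
	if (len == 0 || len - 1 > max - off)
		fl->end = TOY_OFFSET_MAX;
	else
		fl->end = (int64_t)(off + len - 1);
	return 0;
}
```

Keeping the conversion in one function means every decoder that needs a range applies the same clamping rule, which is the point of introducing a shared helper.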
    Fixes: 6930bcbfb6ce ("lockd: detect and reject lock arguments that overflow")
    Reported-by: Amir Goldstein
    Signed-off-by: Jeff Layton
    Tested-by: Amir Goldstein
    Cc: stable@vger.kernel.org #6.0
    Signed-off-by: Anna Schumaker
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit b0f33732796b45d7833257ea30d56de42bf80403
Author: Chuck Lever
Date:   Mon Mar 6 10:43:47 2023 -0500

    NFSD: Protect against filesystem freezing

    [ Upstream commit fd9a2e1d513823e840960cb3bc26d8b7749d4ac2 ]

    Flole observes this WARNING on occasion:

    [1210423.486503] WARNING: CPU: 8 PID: 1524732 at fs/ext4/ext4_jbd2.c:75 ext4_journal_check_start+0x68/0xb0

    Reported-by:
    Suggested-by: Jan Kara
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=217123
    Fixes: 73da852e3831 ("nfsd: use vfs_iter_read/write")
    Reviewed-by: Jeff Layton
    Reviewed-by: Jan Kara
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 37cd49faaa944d1c14fb7adc30550fe27f08e7e6
Author: Chuck Lever
Date:   Tue Feb 14 10:07:59 2023 -0500

    NFSD: copy the whole verifier in nfsd_copy_write_verifier

    [ Upstream commit 90d2175572470ba7f55da8447c72ddd4942923c4 ]

    Currently, we're only memcpy'ing the first __be32. Ensure we copy into both words.

    Fixes: 91d2e9b56cf5 ("NFSD: Clean up the nfsd_net::nfssvc_boot field")
    Reported-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit dd7d50c695a6194bdd26ecbd832aefdc36c6d77a
Author: Jeff Layton
Date:   Tue Feb 7 12:02:46 2023 -0500

    nfsd: don't fsync nfsd_files on last close

    [ Upstream commit 4c475eee02375ade6e864f1db16976ba0d96a0a2 ]

    Most of the time, NFSv4 clients issue a COMMIT before the final CLOSE of an open stateid, so with NFSv4, the fsync in the nfsd_file_free path is usually a no-op and doesn't block.

    We have a customer running knfsd over very slow storage (XFS over Ceph RBD). They were using the "async" export option because performance was more important than data integrity for this application. That export option turns NFSv4 COMMIT calls into no-ops.
    Due to the fsync in this codepath however, their final CLOSE calls would still stall (since a CLOSE effectively became a COMMIT).

    I think this fsync is not strictly necessary. We only use that result to reset the write verifier. Instead of fsync'ing all of the data when we free an nfsd_file, we can just check for writeback errors when one is acquired and when it is freed.

    If the client never comes back, then it'll never see the error anyway and there is no point in resetting it. If an error occurs after the nfsd_file is removed from the cache but before the inode is evicted, then it will reset the write verifier on the next nfsd_file_acquire, (since there will be an unseen error).

    The only exception here is if something else opens and fsyncs the file during that window. Given that local applications work with this limitation today, I don't see that as an issue.

    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2166658
    Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache")
    Reported-and-tested-by: Pierguido Lambri
    Signed-off-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Signed-off-by: Sasha Levin

commit 1178547637a2b05aeff2d65e90c145826e5d865c
Author: Jeff Layton
Date:   Fri Feb 3 13:18:34 2023 -0500

    nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open

    [ Upstream commit dcd779dc46540e174a6ac8d52fbed23593407317 ]

    The nested if statements here make no sense, as you can never reach "else" branch in the nested statement. Fix the error handling for when there is a courtesy client that holds a conflicting deny mode.
Fixes: 3d6942715180 ("NFSD: add support for share reservation conflict to courteous server") Reported-by: 張智諺 Signed-off-by: Jeff Layton Reviewed-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3db6c79de9232f04c62b8a13c9b951fd42d1aeb0 Author: Dai Ngo Date: Tue Jan 31 11:12:29 2023 -0800 NFSD: fix problems with cleanup on errors in nfsd4_copy [ Upstream commit 81e722978ad21072470b73d8f6a50ad62c7d5b7d ] When nfsd4_copy fails to allocate memory for async_copy->cp_src, or nfs4_init_copy_state fails, it calls cleanup_async_copy to do the cleanup for the async_copy which causes a page fault since async_copy is not yet initialized. This patch rearranges the order of initializing the fields in async_copy and adds checks in cleanup_async_copy to skip un-initialized fields. Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy") Fixes: 87689df69491 ("NFSD: Shrink size of struct nfsd4_copy") Signed-off-by: Dai Ngo Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e5e1dc828499735e7bd525174cd3141a316da97a Author: Jeff Layton Date: Fri Jan 27 07:09:33 2023 -0500 nfsd: don't hand out delegation on setuid files being opened for write [ Upstream commit 826b67e6376c2a788e3a62c4860dcd79500a27d5 ] We had a bug report that xfstest generic/355 was failing on NFSv4.0. This test sets various combinations of setuid/setgid modes and tests whether DIO writes will cause them to be stripped. What I found was that the server did properly strip those bits, but the client didn't notice because it held a delegation that was not recalled. The recall didn't occur because the client itself was the one generating the activity and we avoid recalls in that case. Clearing setuid bits is an "implicit" activity. The client didn't specifically request that we do that, so we need the server to issue a CB_RECALL, or avoid the situation entirely by not issuing a delegation. 
The easiest fix here is to simply not give out a delegation if the file is being opened for write, and the mode has the setuid and/or setgid bit set. Note that there is a potential race between the mode and lease being set, so we test for this condition both before and after setting the lease. This patch fixes generic/355, generic/683 and generic/684 for me. (Note that 355 fails only on v4.0, and 683 and 684 require NFSv4.2 to run and fail). Reported-by: Boyang Xue Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2da50149981d05955e51c28e982e9ac29bd73417 Author: Dai Ngo Date: Mon Jan 23 21:34:13 2023 -0800 NFSD: fix leaked reference count of nfsd4_ssc_umount_item [ Upstream commit 34e8f9ec4c9ac235f917747b23a200a5e0ec857b ] The reference count of nfsd4_ssc_umount_item is not decremented on error conditions. This prevents the laundromat from unmounting the vfsmount of the source file. This patch decrements the reference count of nfsd4_ssc_umount_item on error. Fixes: f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed.") Signed-off-by: Dai Ngo Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit fd63299db8090307eae66f2aef17c8f00aafa0a9 Author: Jeff Layton Date: Tue Jan 17 14:38:31 2023 -0500 nfsd: clean up potential nfsd_file refcount leaks in COPY codepath [ Upstream commit 6ba434cb1a8d403ea9aad1b667c3ea3ad8b3191f ] There are two different flavors of the nfsd4_copy struct. One is embedded in the compound and is used directly in synchronous copies. The other is dynamically allocated, refcounted and tracked in the client structure. For the embedded one, the cleanup just involves releasing any nfsd_files held on its behalf. For the async one, the cleanup is a bit more involved, and we need to dequeue it from lists, unhash it, etc. There is at least one potential refcount leak in this code now. 
If the kthread_create call fails, then both the src and dst nfsd_files in the original nfsd4_copy object are leaked. The cleanup in this codepath is also sort of weird. In the async copy case, we'll have up to four nfsd_file references (src and dst for both flavors of copy structure). They are both put at the end of nfsd4_do_async_copy, even though the ones held on behalf of the embedded one outlive that structure. Change it so that we always clean up the nfsd_file refs held by the embedded copy structure before nfsd4_copy returns. Rework cleanup_async_copy to handle both inter and intra copies. Eliminate nfsd4_cleanup_intra_ssc since it now becomes a no-op. Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3c7b9b3487c06e3ffe1b82caf34c1af39fb1e326 Author: Jeff Layton Date: Fri Jan 6 10:33:47 2023 -0500 nfsd: allow nfsd_file_get to sanely handle a NULL pointer [ Upstream commit 70f62231cdfd52357836733dd31db787e0412ab2 ] ...and remove some now-useless NULL pointer checks in its callers. Suggested-by: NeilBrown Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9d7608dc4bd1ab1593ce84f802e12b3bb6b27fd4 Author: Dai Ngo Date: Sun Dec 18 16:55:53 2022 -0800 NFSD: enhance inter-server copy cleanup [ Upstream commit df24ac7a2e3a9d0bc68f1756a880e50bfe4b4522 ] Currently nfsd4_setup_inter_ssc returns the vfsmount of the source server's export when the mount completes. After the copy is done nfsd4_cleanup_inter_ssc is called with the vfsmount of the source server and it searches nfsd_ssc_mount_list for a matching entry to do the clean up. The problems with this approach are (1) the need to search the nfsd_ssc_mount_list and (2) the code has to handle the case where the matching entry is not found which looks ugly. The enhancement is instead of nfsd4_setup_inter_ssc returning the vfsmount, it returns the nfsd4_ssc_umount_item which has the vfsmount embedded in it. 
When nfsd4_cleanup_inter_ssc is called it's passed with the nfsd4_ssc_umount_item directly to do the clean up so no searching is needed and there is no need to handle the 'not found' case. Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever [ cel: adjusted whitespace and variable/function names ] Reviewed-by: Olga Kornievskaia Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6856f1385d62ce7cc88f6300326df1038df8ec86 Author: Jeff Layton Date: Sat Feb 11 07:50:08 2023 -0500 nfsd: don't destroy global nfs4_file table in per-net shutdown [ Upstream commit 4102db175b5d884d133270fdbd0e59111ce688fc ] The nfs4_file table is global, so shutting it down when a containerized nfsd is shut down is wrong and can lead to double-frees. Tear down the nfs4_file_rhltable in nfs4_state_shutdown instead of nfs4_state_shutdown_net. Fixes: d47b295e8d76 ("NFSD: Use rhashtable for managing nfs4_file objects") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2169017 Reported-by: JianHong Yin Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e997a230d854c65626a5463ea447c02ac3710747 Author: Jeff Layton Date: Fri Jan 20 14:52:14 2023 -0500 nfsd: don't free files unconditionally in __nfsd_file_cache_purge [ Upstream commit 4bdbba54e9b1c769da8ded9abd209d765715e1d6 ] nfsd_file_cache_purge is called when the server is shutting down, in which case, tearing things down is generally fine, but it also gets called when the exports cache is flushed. Instead of walking the cache and freeing everything unconditionally, handle it the same as when we have a notification of conflicting access. 
Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache") Reported-by: Ruben Vestergaard Reported-by: Torkil Svensgaard Reported-by: Shachar Kagan Signed-off-by: Jeff Layton Tested-by: Shachar Kagan Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2bbf10861d51dae76c6da7113516d0071c782653 Author: Dai Ngo Date: Wed Jan 11 16:06:51 2023 -0800 NFSD: replace delayed_work with work_struct for nfsd_client_shrinker [ Upstream commit 7c24fa225081f31bc6da6a355c1ba801889ab29a ] Since nfsd4_state_shrinker_count always calls mod_delayed_work with 0 delay, we can replace delayed_work with work_struct to save some space and overhead. Also add the call to cancel_work after unregister the shrinker in nfs4_state_shutdown_net. Signed-off-by: Dai Ngo Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 438ef64bbfe44f99f0ffe890af44b687c12d089f Author: Dai Ngo Date: Wed Jan 11 12:17:09 2023 -0800 NFSD: register/unregister of nfsd-client shrinker at nfsd startup/shutdown time [ Upstream commit f385f7d244134246f984975ed34cd75f77de479f ] Currently the nfsd-client shrinker is registered and unregistered at the time the nfsd module is loaded and unloaded. The problem with this is the shrinker is being registered before all of the relevant fields in nfsd_net are initialized when nfsd is started. This can lead to an oops when memory is low and the shrinker is called while nfsd is not running. This patch moves the register/unregister of nfsd-client shrinker from module load/unload time to nfsd startup/shutdown time. 
Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition") Reported-by: Mike Galbraith Signed-off-by: Dai Ngo [ cel: adjusted to apply without e33c267ab70d ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6ac4c383c39f8f2f955f868d1ad9365c2363e80b Author: Xingyuan Mo Date: Thu Jan 12 00:24:53 2023 +0800 NFSD: fix use-after-free in nfsd4_ssc_setup_dul() [ Upstream commit e6cf91b7b47ff82b624bdfe2fdcde32bb52e71dd ] If signal_pending() returns true, schedule_timeout() will not be executed, causing the waiting task to remain in the wait queue. Fixed by adding a call to finish_wait(), which ensures that the waiting task will always be removed from the wait queue. Fixes: f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed.") Signed-off-by: Xingyuan Mo Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2ecc439931ef44ec37a220185a4f0718e9159c98 Author: Chuck Lever Date: Sat Jan 7 10:15:35 2023 -0500 NFSD: Use set_bit(RQ_DROPME) [ Upstream commit 5304930dbae82d259bcf7e5611db7c81e7a42eff ] The premise that "Once an svc thread is scheduled and executing an RPC, no other processes will touch svc_rqst::rq_flags" is false. svc_xprt_enqueue() examines the RQ_BUSY flag in scheduled nfsd threads when determining which thread to wake up next. Fixes: 9315564747cb ("NFSD: Use only RQ_DROPME to signal the need to drop a reply") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 115b58b56f8875569cbea90cca607d56382de614 Author: Chuck Lever Date: Fri Jan 6 12:43:37 2023 -0500 Revert "SUNRPC: Use RMW bitops in single-threaded hot paths" [ Upstream commit 7827c81f0248e3c2f40d438b020f3d222f002171 ] The premise that "Once an svc thread is scheduled and executing an RPC, no other processes will touch svc_rqst::rq_flags" is false. svc_xprt_enqueue() examines the RQ_BUSY flag in scheduled nfsd threads when determining which thread to wake up next. Found via KCSAN. 
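The hazard described above can be sketched in user space. This is only an illustration, assuming C11 atomics as a stand-in for the kernel's set_bit()/test_bit(); FLAG_BUSY, FLAG_DROPME, flag_set_atomic() and flag_test() are hypothetical names, not the real RQ_* definitions or kernel helpers. A plain "flags |= mask" is a separate load, OR and store, so an update made by another thread between the load and the store can be lost; an atomic fetch-or avoids that.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit positions; the real RQ_* values live in kernel headers. */
enum { FLAG_BUSY = 0, FLAG_DROPME = 1 };

/* Atomic read-modify-write, analogous in spirit to the kernel's set_bit(). */
static void flag_set_atomic(atomic_ulong *flags, unsigned int bit)
{
	atomic_fetch_or(flags, 1UL << bit);
}

/* Atomic load, so another thread may inspect the word at any time. */
static bool flag_test(atomic_ulong *flags, unsigned int bit)
{
	return (atomic_load(flags) >> bit) & 1UL;
}
```

With this shape, one thread can set FLAG_DROPME while another examines FLAG_BUSY in the same word without either access tearing the other's update.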
Fixes: 28df0988815f ("SUNRPC: Use RMW bitops in single-threaded hot paths") Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 45c08a752982116f3287afcd1bd9c50f4fab0c28 Author: Jeff Layton Date: Thu Jan 5 14:55:56 2023 -0500 nfsd: fix handling of cached open files in nfsd4_open codepath [ Upstream commit 0b3a551fa58b4da941efeb209b3770868e2eddd7 ] Commit fb70bf124b05 ("NFSD: Instantiate a struct file when creating a regular NFSv4 file") added the ability to cache an open fd over a compound. There are a couple of problems with the way this currently works: It's racy, as a newly-created nfsd_file can end up with its PENDING bit cleared while the nf is hashed, and the nf_file pointer is still zeroed out. Other tasks can find it in this state and they expect to see a valid nf_file, and can oops if nf_file is NULL. Also, there is no guarantee that we'll end up creating a new nfsd_file if one is already in the hash. If an extant entry is in the hash with a valid nf_file, nfs4_get_vfs_file will clobber its nf_file pointer with the value of op_file and the old nf_file will leak. Fix both issues by making a new nfsd_file_acquire_opened variant that takes an optional file pointer. If one is present when this is called, we'll take a new reference to it instead of trying to open the file. If the nfsd_file already has a valid nf_file, we'll just ignore the optional file and pass the nfsd_file back as-is. Also rework the tracepoints a bit to allow for an "opened" variant and don't try to avoid counting acquisitions in the case where we already have a cached open file. 
Fixes: fb70bf124b05 ("NFSD: Instantiate a struct file when creating a regular NFSv4 file") Cc: Trond Myklebust Reported-by: Stanislav Saner Reported-and-Tested-by: Ruben Vestergaard Reported-and-Tested-by: Torkil Svensgaard Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f31bc0bc12f3062feba2bb56c2c4a9bb27bae580 Author: Jeff Layton Date: Sun Dec 11 06:19:33 2022 -0500 nfsd: rework refcounting in filecache [ Upstream commit ac3a2585f018f10039b4a856dcb122da88c1c1c9 ] The filecache refcounting is a bit non-standard for something searchable by RCU, in that we maintain a sentinel reference while it's hashed. This in turn requires that we have to do things differently in the "put" depending on whether its hashed, which we believe to have led to races. There are other problems in here too. nfsd_file_close_inode_sync can end up freeing an nfsd_file while there are still outstanding references to it, and there are a number of subtle ToC/ToU races. Rework the code so that the refcount is what drives the lifecycle. When the refcount goes to zero, then unhash and rcu free the object. A task searching for a nfsd_file is allowed to bump its refcount, but only if it's not already 0. Ensure that we don't make any other changes to it until a reference is held. With this change, the LRU carries a reference. Take special care to deal with it when removing an entry from the list, and ensure that we only repurpose the nf_lru list_head when the refcount is 0 to ensure exclusive access to it. Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit dfbf3066d9739e58f24ddc98e4a660af9b525315 Author: Kees Cook Date: Fri Dec 2 12:48:59 2022 -0800 NFSD: Avoid clashing function prototypes [ Upstream commit e78e274eb22d966258a3845acc71d3c5b8ee2ea8 ] When built with Control Flow Integrity, function prototypes between caller and function declaration must match. 
These mismatches are visible at compile time with the new -Wcast-function-type-strict in Clang[1]. There were 97 warnings produced by NFS. For example: fs/nfsd/nfs4xdr.c:2228:17: warning: cast from '__be32 (*)(struct nfsd4_compoundargs *, struct nfsd4_access *)' (aka 'unsigned int (*)(struct nfsd4_compoundargs *, struct nfsd4_access *)') to 'nfsd4_dec' (aka 'unsigned int (*)(struct nfsd4_compoundargs *, void *)') converts to incompatible function type [-Wcast-function-type-strict] [OP_ACCESS] = (nfsd4_dec)nfsd4_decode_access, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The enc/dec callbacks were defined as passing "void *" as the second argument, but were being implicitly cast to a new type. Replace the argument with union nfsd4_op_u, and perform explicit member selection in the function body. There are no resulting binary differences. Changes were made mechanically using the following Coccinelle script, with minor by-hand fixes for members that didn't already match their existing argument name: @find@ identifier func; type T, opsT; identifier ops, N; @@ opsT ops[] = { [N] = (T) func, }; @already_void@ identifier find.func; identifier name; @@ func(..., -void +union nfsd4_op_u *name) { ... } @proto depends on !already_void@ identifier find.func; type T; identifier name; position p; @@ func@p(..., T name ) { ... } @script:python get_member@ type_name << proto.T; member; @@ coccinelle.member = cocci.make_ident(type_name.split("_", 1)[1].split(' ',1)[0]) @convert@ identifier find.func; type proto.T; identifier proto.name; position proto.p; identifier get_member.member; @@ func@p(..., - T name + union nfsd4_op_u *u ) { + T name = &u->member; ... } @cast@ identifier find.func; type T, opsT; identifier ops, N; @@ opsT ops[] = { [N] = - (T) func, }; Cc: Chuck Lever Cc: Jeff Layton Cc: Gustavo A. R. 
Silva Cc: linux-nfs@vger.kernel.org Signed-off-by: Kees Cook Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ea468098605ef6a02d5a6ff54a8e45fdf0c08d93 Author: Chuck Lever Date: Sat Nov 26 15:55:30 2022 -0500 NFSD: Use only RQ_DROPME to signal the need to drop a reply [ Upstream commit 9315564747cb6a570e99196b3a4880fb817635fd ] Clean up: NFSv2 has the only two usages of rpc_drop_reply in the NFSD code base. Since NFSv2 is going away at some point, replace these in order to simplify the "drop this reply?" check in nfsd_dispatch(). Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 71a98737cdcfda6dee009522db1196bbc4a215f6 Author: Dai Ngo Date: Wed Nov 16 19:44:47 2022 -0800 NFSD: add delegation reaper to react to low memory condition [ Upstream commit 44df6f439a1790a5f602e3842879efa88f346672 ] The delegation reaper is called by the nfsd memory shrinker's 'count' callback. It scans the client list and sends the courtesy CB_RECALL_ANY to the clients that hold delegations. To avoid flooding the clients with CB_RECALL_ANY requests, the delegation reaper sends only one CB_RECALL_ANY request to each client per 5 seconds. Signed-off-by: Dai Ngo [ cel: moved definition of RCA4_TYPE_MASK_RDATA_DLG ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 80a81db01ab0521c9355739a9276836264141da6 Author: Dai Ngo Date: Wed Nov 16 19:44:46 2022 -0800 NFSD: add support for sending CB_RECALL_ANY [ Upstream commit 3959066b697b5dfbb7141124ae9665337d4bc638 ] Add XDR encode and decode function for CB_RECALL_ANY. 
Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 87098b663f42f4f0866fa876afdfab502a10b4de Author: Dai Ngo Date: Wed Nov 16 19:44:45 2022 -0800 NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker [ Upstream commit a1049eb47f20b9eabf9afb218578fff16b4baca6 ] Refactoring courtesy_client_reaper to generic low memory shrinker so it can be used for other purposes. Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 35a48412f6a4d6ca8473c91dc853d0b5f3953fe0 Author: Brian Foster Date: Wed Nov 16 10:28:36 2022 -0500 NFSD: pass range end to vfs_fsync_range() instead of count [ Upstream commit 79a1d88a36f77374c77fd41a4386d8c2736b8704 ] _nfsd_copy_file_range() calls vfs_fsync_range() with an offset and count (bytes written), but the former wants the start and end bytes of the range to sync. Fix it up. Fixes: eac0b17a77fb ("NFSD add vfs_fsync after async copy is done") Signed-off-by: Brian Foster Tested-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0d5f3de2b422c43f968f0048e1ad62be78a3e0af Author: Jeff Layton Date: Fri Nov 11 14:36:38 2022 -0500 lockd: fix file selection in nlmsvc_cancel_blocked [ Upstream commit 9f27783b4dd235ef3c8dbf69fc6322777450323c ] We currently do a lock_to_openmode call based on the arguments from the NLM_UNLOCK call, but that will always set the fl_type of the lock to F_UNLCK, and the O_RDONLY descriptor is always chosen. Fix it to use the file_lock from the block instead. Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7ecaa9aff9f5b2f9b2b634c9f52ec726749374e4 Author: Jeff Layton Date: Fri Nov 11 14:36:37 2022 -0500 lockd: ensure we use the correct file descriptor when unlocking [ Upstream commit 69efce009f7df888e1fede3cb2913690eb829f52 ] Shared locks are set on O_RDONLY descriptors and exclusive locks are set on O_WRONLY ones. 
nlmsvc_unlock however calls vfs_lock_file twice, once for each descriptor, but it doesn't reset fl_file. Ensure that it does. Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 781c3f3d18125ab852c038e938f74f2c7f55f46e Author: Jeff Layton Date: Fri Nov 11 14:36:36 2022 -0500 lockd: set missing fl_flags field when retrieving args [ Upstream commit 75c7940d2a86d3f1b60a0a265478cb8fc887b970 ] Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ae8f2bb3dd3440be214b1f59c6aef4b3a8c2631c Author: Xiu Jianfeng Date: Fri Nov 11 17:18:35 2022 +0800 NFSD: Use struct_size() helper in alloc_session() [ Upstream commit 85a0d0c9a58002ef7d1bf5e3ea630f4fbd42a4f0 ] Use struct_size() helper to simplify the code, no functional changes. Signed-off-by: Xiu Jianfeng Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e2505cb851641451273f9f8913e4c3ff5f5c6621 Author: Jeff Layton Date: Mon Nov 7 06:58:41 2022 -0500 nfsd: return error if nfs4_setacl fails [ Upstream commit 01d53a88c08951f88f2a42f1f1e6568928e0590e ] With the addition of POSIX ACLs to struct nfsd_attrs, we no longer return an error if setting the ACL fails. Ensure we return the na_aclerr error on SETATTR if there is one. Fixes: c0cbe70742f4 ("NFSD: add posix ACLs to struct nfsd_attrs") Cc: Neil Brown Reported-by: Yongcheng Yang Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 31c93ee5f1e4dc278b562e20f3c3274ac34997f3 Author: Trond Myklebust Date: Sun Nov 6 14:02:39 2022 -0500 lockd: set other missing fields when unlocking files [ Upstream commit 18ebd35b61b4693a0ddc270b6d4f18def232e770 ] vfs_lock_file() expects the struct file_lock to be fully initialised by the caller. Re-exported NFSv3 has been seen to Oops if the fl_file field is NULL. 
Fixes: aec158242b87 ("lockd: set fl_owner when unlocking files") Signed-off-by: Trond Myklebust Reviewed-by: Jeff Layton Link: https://bugzilla.kernel.org/show_bug.cgi?id=216582 Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 739202b2b9cfee0941bd22c3838f545a6c7f278c Author: Chuck Lever Date: Thu Nov 3 16:22:48 2022 -0400 NFSD: Add an nfsd_file_fsync tracepoint [ Upstream commit d7064eaf688cfe454c50db9f59298463d80d403c ] Add a tracepoint to capture the number of filecache-triggered fsync calls and which files needed it. Also, record when an fsync triggers a write verifier reset. Examples: <...>-97 [007] 262.505611: nfsd_file_free: inode=0xffff888171e08140 ref=0 flags=GC may=WRITE nf_file=0xffff8881373d2400 <...>-97 [007] 262.505612: nfsd_file_fsync: inode=0xffff888171e08140 ref=0 flags=GC may=WRITE nf_file=0xffff8881373d2400 ret=0 <...>-97 [007] 262.505623: nfsd_file_free: inode=0xffff888171e08dc0 ref=0 flags=GC may=WRITE nf_file=0xffff8881373d1e00 <...>-97 [007] 262.505624: nfsd_file_fsync: inode=0xffff888171e08dc0 ref=0 flags=GC may=WRITE nf_file=0xffff8881373d1e00 ret=0 Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4453e0c1bbabd660b6de0a790dbf907131635fda Author: Jeff Layton Date: Wed Nov 2 14:44:50 2022 -0400 nfsd: fix up the filecache laundrette scheduling [ Upstream commit 22ae4c114f77b55a4c5036e8f70409a0799a08f8 ] We don't really care whether there are hashed entries when it comes to scheduling the laundrette. They might all be non-gc entries, after all. We only want to schedule it if there are entries on the LRU. Switch to using list_lru_count, and move the check into nfsd_file_gc_worker. The other callsite in nfsd_file_put doesn't need to count entries, since it only schedules the laundrette after adding an entry to the LRU. 
Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3d479899f4fe0ce70ab12ef212e438cc3621b3c4 Author: Jeff Layton Date: Wed Nov 2 14:44:48 2022 -0400 nfsd: reorganize filecache.c [ Upstream commit 8214118589881b2d390284410c5ff275e7a5e03c ] In a coming patch, we're going to rework how the filecache refcounting works. Move some code around in the function to reduce the churn in the later patches, and rename some of the functions with (hopefully) clearer names: nfsd_file_flush becomes nfsd_file_fsync, and nfsd_file_unhash_and_dispose is renamed to nfsd_file_unhash_and_queue. Also, the nfsd_file_put_final tracepoint is renamed to nfsd_file_free, to better match the name of the function from which it's called. Signed-off-by: Jeff Layton Reviewed-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 605a5acd6f42033f490cb71ff9594678929a40ca Author: Jeff Layton Date: Wed Nov 2 14:44:47 2022 -0400 nfsd: remove the pages_flushed statistic from filecache [ Upstream commit 1f696e230ea5198e393368b319eb55651828d687 ] We're counting mapping->nrpages, but not all of those are necessarily dirty. We don't really have a simple way to count just the dirty pages, so just remove this stat since it's not accurate. Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 384b23f13672aebbf0bd56339b749ea4e162b11e Author: Chuck Lever Date: Mon Oct 31 09:53:26 2022 -0400 NFSD: Fix licensing header in filecache.c [ Upstream commit 3f054211b29c0fa06dfdcab402c795fd7e906be1 ] Add a missing SPDX header. Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 56eedeaf71b0dd30b607e5f05a92c18bd8bf2fcd Author: Chuck Lever Date: Fri Oct 28 10:47:53 2022 -0400 NFSD: Use rhashtable for managing nfs4_file objects [ Upstream commit d47b295e8d76a4d69f0e2ea0cd8a79c9d3488280 ] fh_match() is costly, especially when filehandles are large (as is the case for NFSv4). 
It needs to be used sparingly when searching data structures. Unfortunately, with common workloads, I see multiple thousands of objects stored in file_hashtbl[], which has just 256 buckets, making its bucket hash chains quite lengthy. Walking long hash chains with the state_lock held blocks other activity that needs that lock. Sizable hash chains are a common occurrence once the server has handed out some delegations, for example -- IIUC, each delegated file is held open on the server by an nfs4_file object. To help mitigate the cost of searching with fh_match(), replace the nfs4_file hash table with an rhashtable, which can dynamically resize its bucket array to minimize hash chain length. The result of this modification is an improvement in the latency of NFSv4 operations, and the reduction of nfsd CPU utilization due to eliminating the cost of multiple calls to fh_match() and reducing the CPU cache misses incurred while walking long hash chains in the nfs4_file hash table. Signed-off-by: Chuck Lever Reviewed-by: NeilBrown Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8fdef896122f53a5cbf5871cfa6c40070e83d676 Author: Chuck Lever Date: Fri Oct 28 10:47:47 2022 -0400 NFSD: Refactor find_file() [ Upstream commit 15424748001a9b5ea62b3e6ad45f0a8b27f01df9 ] find_file() is now the only caller of find_file_locked(), so just fold these two together. Signed-off-by: Chuck Lever Reviewed-by: NeilBrown Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5e92a168495cd7c8a789004c78313583176e8816 Author: Chuck Lever Date: Fri Oct 28 10:47:41 2022 -0400 NFSD: Clean up find_or_add_file() [ Upstream commit 9270fc514ba7d415636b23bcb937573a1ce54f6a ] Remove the call to find_file_locked() in insert_nfs4_file(). Tracing shows that over 99% of these calls return NULL. Thus it is not worth the expense of the extra bucket list traversal. 
insert_file() already deals correctly with the case where the item is already in the hash bucket. Since nfsd4_file_hash_insert() is now just a wrapper around insert_file(), move the meat of insert_file() into nfsd4_file_hash_insert() and get rid of it. Signed-off-by: Chuck Lever Reviewed-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5aa2c4a1fe2806f081997dfbc12992c5d50ec33d Author: Chuck Lever Date: Fri Oct 28 10:47:34 2022 -0400 NFSD: Add a nfsd4_file_hash_remove() helper [ Upstream commit 3341678f2fd6106055cead09e513fad6950a0d19 ] Refactor to relocate hash deletion operation to a helper function that is close to most other nfs4_file data structure operations. The "noinline" annotation will become useful in a moment when the hlist_del_rcu() is replaced with a more complex rhash remove operation. It also guarantees that hash remove operations can be traced with "-p function -l remove_nfs4_file_locked". This also simplifies the organization of forward declarations: the to-be-added rhashtable and its param structure will be defined /after/ put_nfs4_file(). Signed-off-by: Chuck Lever Reviewed-by: NeilBrown Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e77b1d63c02e6d14aa8fd4c95a4010933efb5ffd Author: Chuck Lever Date: Fri Oct 28 10:47:28 2022 -0400 NFSD: Clean up nfsd4_init_file() [ Upstream commit 81a21fa3e7fdecb3c5b97014f0fc5a17d5806cae ] Name this function more consistently. I'm going to use nfsd4_file_ and nfsd4_file_hash_ for these helpers. Change the @fh parameter to be const pointer for better type safety. Finally, move the hash insertion operation to the caller. This is typical for most other "init_object" type helpers, and it is where most of the other nfs4_file hash table operations are located. 
Signed-off-by: Chuck Lever Reviewed-by: NeilBrown Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c152e4ffb9e85b3788a002a79184f0674aa8b62f Author: Chuck Lever Date: Fri Oct 28 10:47:22 2022 -0400 NFSD: Update file_hashtbl() helpers [ Upstream commit 3fe828caddd81e68e9d29353c6e9285a658ca056 ] Enable callers to use const pointers for type safety. Signed-off-by: Chuck Lever Reviewed-by: NeilBrown Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b0952d49483a08ecdd04fea6a09d013cb8956171 Author: Chuck Lever Date: Fri Oct 28 10:47:16 2022 -0400 NFSD: Use const pointers as parameters to fh_ helpers [ Upstream commit b48f8056c034f28dd54668399f1d22be421b0bef ] Enable callers to use const pointers where they are able to. Signed-off-by: Chuck Lever Tested-by: Jeff Layton Reviewed-by: Jeff Layton Reviewed-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a10d111fd09fb5090269f0f856a64a0f0ba8d9cc Author: Chuck Lever Date: Fri Oct 28 10:47:09 2022 -0400 NFSD: Trace delegation revocations [ Upstream commit a1c74569bbde91299f24535abf711be5c84df9de ] Delegation revocation is an exceptional event that is not otherwise visible externally (eg, no network traffic is emitted). Generate a trace record when it occurs so that revocation can be observed or other activity can be triggered. Example: nfsd-1104 [005] 1912.002544: nfsd_stid_revoke: client 633c9343:4e82788d stateid 00000003:00000001 ref=2 type=DELEG Trace infrastructure is provided for subsequent additional tracing related to nfs4_stid activity. 
Signed-off-by: Chuck Lever Tested-by: Jeff Layton Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 88cf6a1e76aa40e4e71df2b4f6b93eff0845790f Author: Chuck Lever Date: Fri Oct 28 10:47:03 2022 -0400 NFSD: Trace stateids returned via DELEGRETURN [ Upstream commit 20eee313ff4b8a7e71ae9560f5c4ba27cd763005 ] Handing out a delegation stateid is recorded with the nfsd_deleg_read tracepoint, but there isn't a matching tracepoint for recording when the stateid is returned. Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 14c9c091f2a6a2c5e196b9ae832136e2e15c94ab Author: Chuck Lever Date: Fri Oct 28 10:46:57 2022 -0400 NFSD: Clean up nfs4_preprocess_stateid_op() call sites [ Upstream commit eeff73f7c1c583f79a401284f46c619294859310 ] Remove the lame-duck dprintk()s around nfs4_preprocess_stateid_op() call sites. Signed-off-by: Chuck Lever Tested-by: Jeff Layton Reviewed-by: Jeff Layton Reviewed-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d9991b0b9dd5ad53701d14ea194d59f4833ec829 Author: Chuck Lever Date: Tue Nov 1 13:30:46 2022 -0400 NFSD: Flesh out a documenting comment for filecache.c [ Upstream commit b3276c1f5b268ff56622e9e125b792b4c3dc03ac ] Record what we've learned recently about the NFSD filecache in a documenting comment so our future selves don't forget what all this is for. Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5f866f5a8611bacc40c1f16831f96af6e937eff3 Author: Chuck Lever Date: Fri Oct 28 10:46:51 2022 -0400 NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection [ Upstream commit 4d1ea8455716ca070e3cd85767e6f6a562a58b1b ] NFSv4 operations manage the lifetime of nfsd_file items they use by means of NFSv4 OPEN and CLOSE. Hence there's no need for them to be garbage collected. 
Introduce a mechanism to enable garbage collection for nfsd_file items used only by NFSv2/3 callers. Note that the change in nfsd_file_put() ensures that both CLOSE and DELEGRETURN will actually close out and free an nfsd_file on last reference of a non-garbage-collected file. Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=394 Suggested-by: Trond Myklebust Signed-off-by: Chuck Lever Tested-by: Jeff Layton Reviewed-by: NeilBrown Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c09b456a81d23a508fb017d7814e875399fb7dca Author: Chuck Lever Date: Fri Oct 28 10:46:44 2022 -0400 NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file immediately" [ Upstream commit dcf3f80965ca787c70def402cdf1553c93c75529 ] This reverts commit 5e138c4a750dc140d881dab4a8804b094bbc08d2. That commit attempted to make files available to other users as soon as all NFSv4 clients were done with them, rather than waiting until the filecache LRU had garbage collected them. It gets the reference counting wrong, for one thing. But it also misses that DELEGRETURN should release a file in the same fashion. In fact, any nfsd_file_put() on a file held open by an NFSv4 client potentially needs to release the file immediately... Clear the way for implementing that idea. Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Reviewed-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit caa627020132b571f79e3c65b8d85ad4f23b1a84 Author: Chuck Lever Date: Fri Oct 28 10:46:38 2022 -0400 NFSD: Pass the target nfsd_file to nfsd_commit() [ Upstream commit c252849082ff525af18b4f253b3c9ece94e951ed ] In a moment I'm going to introduce separate nfsd_file types, one of which is garbage-collected; the other, not. The garbage-collected variety is to be used by NFSv2 and v3, and the non-garbage-collected variety is to be used by NFSv4. nfsd_commit() is invoked by both NFSv3 and NFSv4 consumers.
We want nfsd_commit() to find and use the correct variety of cached nfsd_file object for the NFS version that is in use. Signed-off-by: Chuck Lever Tested-by: Jeff Layton Reviewed-by: Jeff Layton Reviewed-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 599d5c22912f8891e3e50f7f4c8ac4ae943cf13e Author: David Disseldorp Date: Fri Oct 21 14:24:14 2022 +0200 exportfs: use pr_debug for unreachable debug statements [ Upstream commit 427505ffeaa464f683faba945a88d3e3248f6979 ] expfs.c has a bunch of dprintk statements which are unusable due to: #define dprintk(fmt, args...) do{}while(0) Use pr_debug so that they can be enabled dynamically. Also make some minor changes to the debug statements to fix some incorrect types, and remove __func__ which can be handled by dynamic debug separately. Signed-off-by: David Disseldorp Reviewed-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4ab1211c28f10b22b01200f07e600921ebf191af Author: Jeff Layton Date: Tue Oct 18 07:47:56 2022 -0400 nfsd: allow disabling NFSv2 at compile time [ Upstream commit 2f3a4b2ac2f28b9be78ad21f401f31e263845214 ] rpc.nfsd stopped supporting NFSv2 a year ago. Take the next logical step toward deprecating it and allow NFSv2 support to be compiled out. Add a new CONFIG_NFSD_V2 option that can be turned off and rework the CONFIG_NFSD_V?_ACL option dependencies. Add a description that discourages enabling it. Also, change the description of CONFIG_NFSD to state that the always-on version is now 3 instead of 2. Finally, add an #ifdef around "case 2:" in __write_versions. When NFSv2 is disabled at compile time, this should make the kernel ignore attempts to disable it at runtime, but still error out when trying to enable it. 
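The runtime behaviour described for a compiled-out NFSv2 can be sketched in user-space C. This is only a rough model, not the kernel's __write_versions: MODEL_NFSD_V2 is a hypothetical stand-in for CONFIG_NFSD_V2, and model_set_version() and its return values are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model: MODEL_NFSD_V2 plays the role of CONFIG_NFSD_V2.
 * Leave it undefined to model a kernel built without NFSv2 support.
 */

/*
 * Apply a "+N" (enable) or "-N" (disable) request to a version table.
 * Returns 0 on success or a negative errno value.
 */
static int model_set_version(unsigned int vers, bool enable, bool enabled[5])
{
	switch (vers) {
#ifdef MODEL_NFSD_V2
	case 2:
#endif
	case 3:
	case 4:
		enabled[vers] = enable;
		return 0;
	default:
		/*
		 * Unknown or compiled-out version: disabling it is a
		 * harmless no-op, enabling it is an error.
		 */
		return enable ? -EINVAL : 0;
	}
}
```

With MODEL_NFSD_V2 undefined, a request to disable version 2 returns 0 without touching the table, while a request to enable it returns -EINVAL.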
Signed-off-by: Jeff Layton Reviewed-by: Tom Talpey Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 68f7bd7f29a014e0994ae66747803d434ee11eb0 Author: Jeff Layton Date: Tue Oct 18 07:47:55 2022 -0400 nfsd: move nfserrno() to vfs.c [ Upstream commit cb12fae1c34b1fa7eaae92c5aadc72d86d7fae19 ] nfserrno() is common to all nfs versions, but nfsproc.c is specifically for NFSv2. Move it to vfs.c, and the prototype to vfs.h. While we're in here, remove the #ifdef EDQUOT check in this function. It's apparently a holdover from the initial merge of the nfsd code in 1997. No other place in the kernel checks that that symbol is defined before using it, so I think we can dispense with it here. Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit abbd1215c3f91545b8f244672ec054d149e46f89 Author: Jeff Layton Date: Tue Oct 18 07:47:54 2022 -0400 nfsd: ignore requests to disable unsupported versions [ Upstream commit 8e823bafff2308753d430566256c83d8085952da ] The kernel currently errors out if you attempt to enable or disable a version that it doesn't recognize. Change it to ignore attempts to disable an unrecognized version. If we don't support it, then there is no harm in doing so. Signed-off-by: Jeff Layton Reviewed-by: Tom Talpey Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 81714ef8e3ef8052df358fbccc3fe5125bf0ac67 Author: Chuck Lever Date: Sun Oct 16 11:47:08 2022 -0400 NFSD: Finish converting the NFSv3 GETACL result encoder [ Upstream commit 841fd0a3cb490eae5dfd262eccb8c8b11d57f8b8 ] For some reason, the NFSv2 GETACL result encoder was fully converted to use the new nfs_stream_encode_acl(), but the NFSv3 equivalent was not similarly converted. 
Fixes: 20798dfe249a ("NFSD: Update the NFSv3 GETACL result encoder to use struct xdr_stream") Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a20b0abab966a189a79aba6ebf41f59024a3224d Author: Chuck Lever Date: Sun Oct 16 11:47:02 2022 -0400 NFSD: Finish converting the NFSv2 GETACL result encoder [ Upstream commit ea5021e911d3479346a75ac9b7d9dcd751b0fb99 ] The xdr_stream conversion inadvertently left some code that set the page_len of the send buffer. The XDR stream encoders should handle this automatically now. This oversight adds garbage past the end of the Reply message. Clients typically ignore the garbage, but NFSD does not need to send it, as it leaks stale memory contents onto the wire. Fixes: f8cba47344f7 ("NFSD: Update the NFSv2 GETACL result encoder to use struct xdr_stream") Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1dd04600f62953390e6f35827e49b0662ecd2e65 Author: Colin Ian King Date: Mon Oct 10 21:24:23 2022 +0100 NFSD: Remove redundant assignment to variable host_err [ Upstream commit 69eed23baf877bbb1f14d7f4df54f89807c9ee2a ] Variable host_err is assigned a value that is never read, it is being re-assigned a value in every different execution path in the following switch statement. The assignment is redundant and can be removed. Cleans up clang-scan warning: warning: Value stored to 'host_err' is never read [deadcode.DeadStores] Signed-off-by: Colin Ian King Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 48a237cb5e52106e111a3afbe61d2e7bb5c1f49f Author: Anna Schumaker Date: Tue Sep 13 14:01:51 2022 -0400 NFSD: Simplify READ_PLUS [ Upstream commit eeadcb75794516839078c28b3730132aeb700ce6 ] Chuck had suggested reverting READ_PLUS so it returns a single DATA segment covering the requested read range. 
This prepares the server for a future "sparse read" function so support can easily be added without needing to rip out the old READ_PLUS code at the same time. Signed-off-by: Anna Schumaker Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 10727ce312c6a23614afe7ae24978aeb00d53e29 Author: Jeff Layton Date: Wed Nov 16 09:36:07 2022 -0500 nfsd: use locks_inode_context helper [ Upstream commit 77c67530e1f95ac25c7075635f32f04367380894 ] nfsd currently doesn't access i_flctx safely everywhere. This requires a smp_load_acquire, as the pointer is set via cmpxchg (a release operation). Acked-by: Chuck Lever Reviewed-by: Christoph Hellwig Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 32c59062f86837668cfc0feee968b083fdb2afb6 Author: Jeff Layton Date: Wed Nov 16 09:19:43 2022 -0500 lockd: use locks_inode_context helper [ Upstream commit 98b41ffe0afdfeaa1439a5d6bd2db4a94277e31b ] lockd currently doesn't access i_flctx safely. This requires a smp_load_acquire, as the pointer is set via cmpxchg (a release operation). Cc: Trond Myklebust Cc: Anna Schumaker Cc: Chuck Lever Reviewed-by: Christoph Hellwig Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 70ffaa7896d9002687eaba43caf437737b424704 Author: Jeff Layton Date: Wed Nov 16 09:02:30 2022 -0500 filelock: add a new locks_inode_context accessor function [ Upstream commit 401a8b8fd5acd51582b15238d72a8d0edd580e9f ] There are a number of places in the kernel that are accessing the inode->i_flctx field without smp_load_acquire. This is required to ensure that the caller doesn't see a partially-initialized structure. Add a new accessor function for it to make this clear and convert all of the relevant accesses in locks.c to use it. Also, convert locks_free_lock_context to use the helper as well instead of just doing a "bare" assignment. 
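The accessor pattern described above can be illustrated with C11 atomics in user space. This is a sketch under stated assumptions rather than the kernel helper itself: the model_ types and functions are hypothetical, and atomic_load_explicit() with memory_order_acquire stands in for smp_load_acquire().

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the lock context and the inode that owns it. */
struct model_lock_context {
	int nlocks;
};

struct model_inode {
	_Atomic(struct model_lock_context *) flctx;
};

/*
 * Accessor: the acquire load pairs with the release in the publisher, so a
 * reader that sees a non-NULL pointer also sees the initialised fields.
 */
static struct model_lock_context *model_inode_context(struct model_inode *inode)
{
	return atomic_load_explicit(&inode->flctx, memory_order_acquire);
}

/*
 * Publisher: install ctx only if no context exists yet. Returns whichever
 * context ended up installed (ours, or the one that was there first).
 */
static struct model_lock_context *
model_publish_context(struct model_inode *inode, struct model_lock_context *ctx)
{
	struct model_lock_context *expected = NULL;

	if (atomic_compare_exchange_strong_explicit(&inode->flctx, &expected, ctx,
						    memory_order_acq_rel,
						    memory_order_acquire))
		return ctx;
	return expected;
}
```

Routing every read through one accessor keeps the ordering requirement in a single place instead of relying on each caller to remember it.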
Reviewed-by: Christoph Hellwig Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7ea635fc47af797fea5250fcba22acc4c159e4de Author: Chuck Lever Date: Wed Nov 23 14:14:32 2022 -0500 NFSD: Fix reads with a non-zero offset that don't end on a page boundary [ Upstream commit ac8db824ead0de2e9111337c401409d010fba2f0 ] This was found when virtual machines with nfs-mounted qcow2 disks failed to boot properly. Reported-by: Anders Blomdell Suggested-by: Al Viro Link: https://bugzilla.redhat.com/show_bug.cgi?id=2142132 Fixes: bfbfb6182ad1 ("nfsd_splice_actor(): handle compound pages") [ cel: "‘for’ loop initial declarations are only allowed in C99 or C11 mode" ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7d867c6c30e1c5abd7ef01418108af9911163a67 Author: Jeff Layton Date: Tue Nov 8 11:23:11 2022 -0500 nfsd: put the export reference in nfsd4_verify_deleg_dentry [ Upstream commit 50256e4793a5e5ab77703c82a47344ad2e774a59 ] nfsd_lookup_dentry returns an export reference in addition to the dentry ref. Ensure that we put it too. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2138866 Fixes: 876c553cb410 ("NFSD: verify the opened dentry after setting a delegation") Reported-by: Yongcheng Yang Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 551f17db6508b45eb3d984bfcec61b1c3dba4806 Author: Jeff Layton Date: Sat Nov 5 09:49:26 2022 -0400 nfsd: fix use-after-free in nfsd_file_do_acquire tracepoint [ Upstream commit bdd6b5624c62d0acd350d07564f1c82fe649235f ] When we fail to insert into the hashtable with a non-retryable error, we'll free the object and then goto out_status. If the tracepoint is enabled, it'll end up accessing the freed object when it tries to grab the fields out of it. Set nf to NULL after freeing it to avoid the issue. 
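A minimal user-space sketch of the fix follows. All names are hypothetical, and it assumes the trace hook tolerates a NULL object, which the message implies but does not state.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the cached file object. */
struct model_file {
	int id;
};

/*
 * Hypothetical stand-in for a tracepoint that copies fields out of its
 * argument. It must only dereference the object if it still exists.
 */
static int model_trace_acquire(const struct model_file *nf, int status)
{
	(void)status;
	return nf ? nf->id : -1;
}

/*
 * Model of the acquire path: on an insertion error the object is freed and
 * the pointer is cleared, so the shared exit path never reads freed memory.
 */
static int model_acquire(int insert_err, int *traced_id)
{
	struct model_file *nf = malloc(sizeof(*nf));
	int status = 0;

	if (!nf)
		return -1;
	nf->id = 42;

	if (insert_err) {
		free(nf);
		nf = NULL;	/* the fix: do not leave a dangling pointer */
		status = insert_err;
	}

	/* Shared exit path, reached on both success and failure. */
	*traced_id = model_trace_acquire(nf, status);
	free(nf);		/* free(NULL) is a no-op */
	return status;
}
```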
Fixes: 243a5263014a ("nfsd: rework hashtable handling in nfsd_do_file_acquire") Reported-by: kernel test robot Reported-by: Dan Carpenter Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 31268eb4572b155f69c527cb3274a0721dc1482e Author: Jeff Layton Date: Mon Oct 31 11:49:21 2022 -0400 nfsd: fix net-namespace logic in __nfsd_file_cache_purge [ Upstream commit d3aefd2b29ff5ffdeb5c06a7d3191a027a18cdb8 ] If the namespace doesn't match the one in "net", then we'll continue, but that doesn't cause another rhashtable_walk_next call, so it will loop infinitely. Fixes: ce502f81ba88 ("NFSD: Convert the filecache to use rhashtable") Reported-by: Petr Vorel Link: https://lore.kernel.org/ltp/Y1%2FP8gDAcWC%2F+VR3@pevik/ Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5428383c6fb3184c8be00e8b874678ae89129f23 Author: Tetsuo Handa Date: Mon Oct 10 14:59:02 2022 +0900 NFSD: unregister shrinker when nfsd_init_net() fails [ Upstream commit bd86c69dae65de30f6d47249418ba7889809e31a ] syzbot is reporting UAF read at register_shrinker_prepared() [1], for commit 7746b32f467b3813 ("NFSD: add shrinker to reap courtesy clients on low memory condition") missed that nfsd4_leases_net_shutdown() from nfsd_exit_net() is called only when nfsd_init_net() succeeded. If nfsd_init_net() fails due to nfsd_reply_cache_init() failure, register_shrinker() from nfsd4_init_leases_net() has to be undone before nfsd_init_net() returns. 
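The unwinding rule described above, where each step that succeeded must be undone when a later step fails, can be sketched as follows. This is a user-space model with hypothetical names, not the actual nfsd_init_net() code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-namespace state. */
struct model_net {
	bool shrinker_registered;
	bool cache_ready;
};

static int model_register_shrinker(struct model_net *nn)
{
	nn->shrinker_registered = true;
	return 0;
}

static void model_unregister_shrinker(struct model_net *nn)
{
	nn->shrinker_registered = false;
}

/* The fail flag lets a caller simulate an allocation failure. */
static int model_reply_cache_init(struct model_net *nn, bool fail)
{
	if (fail)
		return -ENOMEM;
	nn->cache_ready = true;
	return 0;
}

/*
 * Init path: the exit hook runs only if this function succeeded, so any
 * failure after the shrinker is registered must unregister it here.
 */
static int model_init_net(struct model_net *nn, bool fail_cache)
{
	int ret;

	ret = model_register_shrinker(nn);
	if (ret)
		return ret;

	ret = model_reply_cache_init(nn, fail_cache);
	if (ret)
		goto out_shrinker;
	return 0;

out_shrinker:
	model_unregister_shrinker(nn);
	return ret;
}
```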
Link: https://syzkaller.appspot.com/bug?extid=ff796f04613b4c84ad89 [1] Reported-by: syzbot Signed-off-by: Tetsuo Handa Fixes: 7746b32f467b3813 ("NFSD: add shrinker to reap courtesy clients on low memory condition") Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1bb33492578c18a837632d64de0e19b7e77c6afe Author: Jeff Layton Date: Tue Oct 4 15:41:10 2022 -0400 nfsd: rework hashtable handling in nfsd_do_file_acquire [ Upstream commit 243a5263014a30436c93ed3f1f864c1da845455e ] nfsd_file is RCU-freed, so we need to hold the rcu_read_lock long enough to get a reference after finding it in the hash. Take the rcu_read_lock() and call rhashtable_lookup directly. Switch to using rhashtable_lookup_insert_key as well, and use the usual retry mechanism if we hit an -EEXIST. Rename the "retry" bool to open_retry, and eliminate the insert_err goto target. Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2db3e73f9afd256f4d45640324100f7cd61ea375 Author: Jeff Layton Date: Fri Sep 30 16:56:02 2022 -0400 nfsd: fix nfsd_file_unhash_and_dispose [ Upstream commit 8d0d254b15cc5b7d46d85fb7ab8ecede9575e672 ] nfsd_file_unhash_and_dispose() is called for two reasons: We're either shutting down and purging the filecache, or we've gotten a notification about a file delete, so we want to go ahead and unhash it so that it'll get cleaned up when we close. We're either walking the hashtable or doing a lookup in it and we don't take a reference in either case. What we want to do in both cases is to try and unhash the object and put it on the dispose list if that was successful. If it's no longer hashed, then we don't want to touch it, with the assumption being that something else is already cleaning up the sentinel reference. Instead of trying to selectively decrement the refcount in this function, just unhash it, and if that was successful, move it to the dispose list.
Then, the disposal routine will just clean that up as usual. Also, just make this a void function, and drop the WARN_ON_ONCE and the comments about deadlocking, since the nature of the purported deadlock is no longer clear. Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 683fb922e7b59dcd56a73316eacddd797f6965a9 Author: Gaosheng Cui Date: Mon Sep 26 10:30:18 2022 +0800 fanotify: Remove obsoleted fanotify_event_has_path() [ Upstream commit 7a80bf902d2bc722b4477442ee772e8574603185 ] All uses of fanotify_event_has_path() have been removed since commit 9c61f3b560f5 ("fanotify: break up fanotify_alloc_event()"). It is now unused, so remove it. Link: https://lore.kernel.org/r/20220926023018.1505270-1-cuigaosheng1@huawei.com Signed-off-by: Gaosheng Cui Signed-off-by: Jan Kara [ cel: resolved merge conflict ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 229e73a0f4071ba699cd55836e95e7953b4c45bb Author: Gaosheng Cui Date: Fri Sep 9 11:38:28 2022 +0800 fsnotify: remove unused declaration [ Upstream commit f847c74d6e89f10926db58649a05b99237258691 ] fsnotify_alloc_event_holder() and fsnotify_destroy_event_holder() have been removed since commit 7053aee26a35 ("fsnotify: do not share events between notification groups"), so remove their declarations.
Reviewed-by: Ritesh Harjani (IBM) Signed-off-by: Gaosheng Cui Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a2d440dce60311cc33d3a1b85933287155179bd6 Author: Al Viro Date: Thu Aug 4 12:57:38 2022 -0400 fs/notify: constify path [ Upstream commit d5bf88895f24686641c39420ee6df716dc1d95d8 ] Reviewed-by: Matthew Bobrowski Reviewed-by: Christian Brauner (Microsoft) Signed-off-by: Al Viro Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 241685bab2774f54d347525889f06e265a0a133c Author: Jeff Layton Date: Mon Sep 26 14:41:02 2022 -0400 nfsd: extra checks when freeing delegation stateids [ Upstream commit 895ddf5ed4c54ea9e3533606d7a8b4e4f27f95ef ] We've had some reports of problems in the refcounting for delegation stateids that we've yet to track down. Add some extra checks to ensure that we've removed the object from various lists before freeing it. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2127067 Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 345e3bb5e82a922ffc81cec9746290a8ef0930a9 Author: Jeff Layton Date: Mon Sep 26 14:41:01 2022 -0400 nfsd: make nfsd4_run_cb a bool return function [ Upstream commit b95239ca4954a0d48b19c09ce7e8f31b453b4216 ] queue_work can return false and not queue anything, if the work is already queued. If that happens in the case of a CB_RECALL, we'll have taken an extra reference to the stid that will never be put. Ensure we throw a warning in that case. 
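The hazard can be modelled in a few lines of user-space C. Everything here is hypothetical: model_queue_work() only imitates the "already pending" behaviour the message attributes to queue_work(), and a counter stands in for the kernel warning. As in the message, the sketch only surfaces the failed queueing and does not attempt to drop the extra reference.

```c
#include <stdbool.h>

/* Hypothetical work item: may be queued at most once at a time. */
struct model_work {
	bool pending;
};

/* Hypothetical delegation with a reference count and a recall work item. */
struct model_deleg {
	int refcount;
	struct model_work recall;
	int warnings;	/* stand-in for a one-time kernel warning */
};

/* Returns false, and queues nothing, if the work is already pending. */
static bool model_queue_work(struct model_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Report to the caller whether the callback was actually queued. */
static bool model_run_cb(struct model_work *w)
{
	return model_queue_work(w);
}

/*
 * Take a reference on behalf of the queued callback. If queueing fails,
 * nothing will ever put that reference, so flag it loudly.
 */
static void model_break_deleg(struct model_deleg *dp)
{
	dp->refcount++;
	if (!model_run_cb(&dp->recall))
		dp->warnings++;
}
```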
Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d7f2774d8c59df6497de0751458f76319c75970d Author: Jeff Layton Date: Mon Sep 26 12:38:45 2022 -0400 nfsd: fix comments about spinlock handling with delegations [ Upstream commit 25fbe1fca14142beae6c882f7906510363d42bff ] Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 89b63627049083f63c1113910e422775c22b213c Author: Jeff Layton Date: Mon Sep 26 12:38:44 2022 -0400 nfsd: only fill out return pointer on success in nfsd4_lookup_stateid [ Upstream commit 4d01416ab41540bb13ec4a39ac4e6c4aa5934bc9 ] In the case of a revoked delegation, we still fill out the pointer even when returning an error, which is bad form. Only overwrite the pointer on success. Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 31b16e6b0b7841bfecb489a08f957d79586b30e8 Author: Chuck Lever Date: Thu Sep 1 15:29:55 2022 -0400 NFSD: Cap rsize_bop result based on send buffer size [ Upstream commit 76ce4dcec0dc08a032db916841ddc4e3998be317 ] Since before the git era, NFSD has conserved the number of pages held by each nfsd thread by combining the RPC receive and send buffers into a single array of pages. This works because there are no cases where an operation needs a large RPC Call message and a large RPC Reply at the same time. Once an RPC Call has been received, svc_process() updates svc_rqst::rq_res to describe the part of rq_pages that can be used for constructing the Reply. This means that the send buffer (rq_res) shrinks when the received RPC record containing the RPC Call is large. Add an NFSv4 helper that computes the size of the send buffer. It replaces svc_max_payload() in spots where svc_max_payload() returns a value that might be larger than the remaining send buffer space. Callers who need to know the transport's actual maximum payload size will continue to use svc_max_payload(). 
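The idea of capping a reply size by the space actually left in the send buffer can be sketched as below. The struct, field names, and page size are assumptions made for illustration; the real helper works on struct svc_rqst and may account for overhead differently.

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical summary of a request's buffer state after the Call arrived. */
struct model_rqst {
	uint32_t pages_left;	/* pages still available for the Reply */
	uint32_t auth_slack;	/* bytes reserved for security trailers */
	uint32_t head_len;	/* bytes already used in the reply head */
	uint32_t transport_max;	/* transport's maximum payload size */
};

/*
 * Largest payload that fits in the remaining send buffer, never exceeding
 * the transport maximum. A large received Call leaves fewer pages, so the
 * result shrinks accordingly.
 */
static uint32_t model_send_payload(const struct model_rqst *rq)
{
	uint32_t buflen = rq->pages_left * MODEL_PAGE_SIZE;
	uint32_t overhead = rq->auth_slack + rq->head_len;

	buflen = buflen > overhead ? buflen - overhead : 0;
	return buflen < rq->transport_max ? buflen : rq->transport_max;
}
```

A caller that sizes a READ or READDIR result would use the smaller value, while a caller that only needs the negotiated limit would keep using the transport maximum.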
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 60b46564e0b6e9e5fade655df3552f01cd1a6b4f Author: Chuck Lever Date: Thu Sep 22 13:10:35 2022 -0400 NFSD: Rename the fields in copy_stateid_t [ Upstream commit 781fde1a2ba2391f31142f46f964cf1148ca1791 ] Code maintenance: The name of the copy_stateid_t::sc_count field collides with the sc_count field in struct nfs4_stid, making the latter difficult to grep for when auditing stateid reference counting. No behavior change expected. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b7aea45a67e94bacaa7ce3d76fc629b1380a681e Author: ChenXiaoSong Date: Fri Sep 23 00:31:56 2022 +0800 nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_file_cache_stats_fops [ Upstream commit 1342f9dd3fc219089deeb2620f6790f19b4129b1 ] Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code. Signed-off-by: ChenXiaoSong Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 21e18dd5eba48855b1800b24af9d3dc32b1b8788 Author: ChenXiaoSong Date: Fri Sep 23 00:31:55 2022 +0800 nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_reply_cache_stats_fops [ Upstream commit 64776611a06322b99386f8dfe3b3ba1aa0347a38 ] Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code. nfsd_net is converted from seq_file->file instead of seq_file->private in nfsd_reply_cache_stats_show(). Signed-off-by: ChenXiaoSong [ cel: reduce line length ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 443e6484259f134c8acad4ae50955ac357cca3ea Author: ChenXiaoSong Date: Fri Sep 23 00:31:54 2022 +0800 nfsd: use DEFINE_SHOW_ATTRIBUTE to define client_info_fops [ Upstream commit 1d7f6b302b75ff7acb9eb3cab0c631b10cfa7542 ] Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code. inode is converted from seq_file->file instead of seq_file->private in client_info_show(). 
Signed-off-by: ChenXiaoSong Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 615d761a6b99d126d733d13803cd53dd5735676b Author: ChenXiaoSong Date: Fri Sep 23 00:31:53 2022 +0800 nfsd: use DEFINE_SHOW_ATTRIBUTE to define export_features_fops and supported_enctypes_fops [ Upstream commit 9beeaab8e05d353d709103cafa1941714b4d5d94 ] Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code. Signed-off-by: ChenXiaoSong [ cel: reduce line length ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a063abefc6a5253d4f2bcdc46f80e560530f536e Author: ChenXiaoSong Date: Fri Sep 23 00:31:52 2022 +0800 nfsd: use DEFINE_PROC_SHOW_ATTRIBUTE to define nfsd_proc_ops [ Upstream commit 0cfb0c4228a5c8e2ed2b58f8309b660b187cef02 ] Use DEFINE_PROC_SHOW_ATTRIBUTE helper macro to simplify the code. Signed-off-by: ChenXiaoSong Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit cda3e9b8cd5eafe922b2d253fe75cbc3f646e186 Author: Chuck Lever Date: Mon Sep 12 17:23:36 2022 -0400 NFSD: Pack struct nfsd4_compoundres [ Upstream commit 9f553e61bd36c1048543ac2f6945103dd2f742be ] Remove a couple of 4-byte holes on platforms with 64-bit pointers. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a54822e64d3a0b7af015bb6e4eca5635078493a2 Author: Chuck Lever Date: Mon Sep 12 17:23:30 2022 -0400 NFSD: Remove unused nfsd4_compoundargs::cachetype field [ Upstream commit 77e378cf2a595d8e39cddf28a31efe6afd9394a0 ] This field was added by commit 1091006c5eb1 ("nfsd: turn on reply cache for NFSv4") but was never put to use. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 17bb698078678c71d2a98a060e3d1c2dab8f6f1b Author: Chuck Lever Date: Mon Sep 12 17:23:25 2022 -0400 NFSD: Remove "inline" directives on op_rsize_bop helpers [ Upstream commit 6604148cf961b57fc735e4204f8996536da9253c ] These helpers are always invoked indirectly, so the compiler can't inline these anyway. 
While we're updating the synopses of these helpers, defensively convert their parameters to const pointers. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f533a01b0982fdd292858ad68e18986f8773fc96 Author: Chuck Lever Date: Mon Sep 12 17:23:19 2022 -0400 NFSD: Clean up nfs4svc_encode_compoundres() [ Upstream commit 9993a66317fc9951322483a9edbfae95a640b210 ] In today's Linux NFS server implementation, the NFS dispatcher initializes each XDR result stream, and the NFSv4 .pc_func and .pc_encode methods all use xdr_stream-based encoding. This keeps rq_res.len automatically updated. There is no longer a need for the WARN_ON_ONCE() check in nfs4svc_encode_compoundres(). Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 918054d2d8ac981314af3f00ae1cbf5545d01cd0 Author: Chuck Lever Date: Mon Sep 12 17:23:07 2022 -0400 NFSD: Clean up WRITE arg decoders [ Upstream commit d4da5baa533215b14625458e645056baf646bb2e ] xdr_stream_subsegment() already returns a boolean value. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c92e8b295ae86a79f1c17d8ae54043db47953271 Author: Chuck Lever Date: Mon Sep 12 17:23:02 2022 -0400 NFSD: Use xdr_inline_decode() to decode NFSv3 symlinks [ Upstream commit c3d2a04f05c590303c125a176e6e43df4a436fdb ] Replace the check for buffer over/underflow with a helper that is commonly used for this purpose. The helper also sets xdr->nwords correctly after successfully linearizing the symlink argument into the stream's scratch buffer. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d08acee648f1e1237a3b045a7af382d434a779fb Author: Chuck Lever Date: Mon Sep 12 17:22:56 2022 -0400 NFSD: Refactor common code out of dirlist helpers [ Upstream commit 98124f5bd6c76699d514fbe491dd95265369cc99 ] The dust has settled a bit and it's become obvious what code is totally common between nfsd_init_dirlist_pages() and nfsd3_init_dirlist_pages(). 
Move that common code to SUNRPC. The new helper brackets the existing xdr_init_decode_pages() API. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5e76b25d7cc82c148d391c0c43b884e6427cb302 Author: Chuck Lever Date: Mon Sep 12 17:22:44 2022 -0400 NFSD: Reduce amount of struct nfsd4_compoundargs that needs clearing [ Upstream commit 3fdc546462348b8a497c72bc894e0cde9f10fc40 ] Have SunRPC clear everything except for the iops array. Then have each NFSv4 XDR decoder clear its own argument before decoding. Now individual operations may have a large argument struct while not penalizing the vast majority of operations with a small struct. And, clearing the argument structure occurs as the argument fields are initialized, enabling the CPU to do write combining on that memory. In some cases, clearing is not even necessary because all of the fields in the argument structure are initialized by the decoder. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5ed252489368015326022adf541f08f2f073382d Author: Chuck Lever Date: Mon Sep 12 17:22:38 2022 -0400 SUNRPC: Parametrize how much of argsize should be zeroed [ Upstream commit 103cc1fafee48adb91fca0e19deb869fd23e46ab ] Currently, SUNRPC clears the whole of .pc_argsize before processing each incoming RPC transaction. Add an extra parameter to struct svc_procedure to enable upper layers to reduce the amount of each operation's argument structure that is zeroed by SUNRPC. The size of struct nfsd4_compoundargs, in particular, is a lot to clear on each incoming RPC Call. A subsequent patch will cut this down to something closer to what NFSv2 and NFSv3 use. This patch should cause no behavior changes.
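A small user-space sketch of partial zeroing follows. The struct layout and the field names argsize and argzero are hypothetical; the only point is that the generic layer clears a leading prefix and leaves the large trailing array to be initialised by whoever decodes into it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical argument structure: small header, large trailing array. */
struct model_args {
	int opcnt;
	int minorversion;
	char iops[1024];
};

/* Hypothetical per-procedure description used by the generic layer. */
struct model_procedure {
	size_t argsize;	/* full size of the argument structure */
	size_t argzero;	/* leading bytes the generic layer must clear */
};

/* Generic layer: clear only the prefix that the procedure asks for. */
static void model_prepare_args(void *argp, const struct model_procedure *proc)
{
	memset(argp, 0, proc->argzero);
}
```

Setting argzero to offsetof(struct model_args, iops) clears the header fields and skips the array, while setting it equal to argsize reproduces the old clear-everything behaviour.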
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6e50de3b3a289c3166d20620abef1885156ceaa8 Author: Dai Ngo Date: Wed Sep 14 08:54:26 2022 -0700 NFSD: add shrinker to reap courtesy clients on low memory condition [ Upstream commit 7746b32f467b3813fb61faaab3258de35806a7ac ] Add courtesy_client_reaper to react to low memory condition triggered by the system memory shrinker. The delayed_work for the courtesy_client_reaper is scheduled on the shrinker's count callback using the laundry_wq. The shrinker's scan callback is not used for expiring the courtesy clients due to potential deadlocks. Signed-off-by: Dai Ngo [ cel: adjusted to apply without e33c267ab70d ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 67302ef04e5411d7d81279c5957f8bdffb800c65 Author: Dai Ngo Date: Wed Sep 14 08:54:25 2022 -0700 NFSD: keep track of the number of courtesy clients in the system [ Upstream commit 3a4ea23d86a317c4b68b9a69d51f7e84e1e04357 ] Add counter nfs4_courtesy_client_count to nfsd_net to keep track of the number of courtesy clients in the system. Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1022fe63c57e4e24a57316d8ddfdf88bd293e2f8 Author: Chuck Lever Date: Thu Sep 8 18:14:25 2022 -0400 NFSD: Make nfsd4_remove() wait before returning NFS4ERR_DELAY [ Upstream commit 5f5f8b6d655fd947e899b1771c2f7cb581a06764 ] nfsd_unlink() can kick off a CB_RECALL (via vfs_unlink() -> leases_conflict()) if a delegation is present. Before returning NFS4ERR_DELAY, give the client holding that delegation a chance to return it and then retry the nfsd_unlink() again, once. 
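The wait-then-retry-once flow can be modelled as below. The status codes, the state struct, and the model_ functions are all hypothetical; in particular, the wait is reduced to a flag saying whether the client returns its delegation before the timeout.

```c
#include <stdbool.h>

/* Hypothetical status codes standing in for success and NFS4ERR_DELAY. */
enum {
	MODEL_OK = 0,
	MODEL_DELAY = 1
};

/* Hypothetical state of the file being removed. */
struct model_state {
	bool deleg_present;	/* a client currently holds a delegation */
	bool client_returns;	/* the client will return it in time */
	int attempts;		/* number of unlink attempts made */
};

/* The unlink attempt fails with a delay status while a delegation exists. */
static int model_unlink(struct model_state *s)
{
	s->attempts++;
	return s->deleg_present ? MODEL_DELAY : MODEL_OK;
}

/* Returns true if the delegation came back, false if the wait timed out. */
static bool model_wait_for_delegreturn(struct model_state *s)
{
	if (!s->client_returns)
		return false;
	s->deleg_present = false;
	return true;
}

/* Try once; on a delay status, wait for the return and retry exactly once. */
static int model_remove(struct model_state *s)
{
	int status = model_unlink(s);

	if (status == MODEL_DELAY && model_wait_for_delegreturn(s))
		status = model_unlink(s);
	return status;
}
```

Limiting the retry to a single attempt bounds how long a server thread can be held up by one slow client.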
Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=354 Tested-by: Igor Mammedov Reviewed-by: Jeff Layton [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 235738ccea3b6b2dd58440d91dcadd3e88ba5d7d Author: Chuck Lever Date: Thu Sep 8 18:14:19 2022 -0400 NFSD: Make nfsd4_rename() wait before returning NFS4ERR_DELAY [ Upstream commit 68c522afd0b1936b48a03a4c8b81261e7597c62d ] nfsd_rename() can kick off a CB_RECALL (via vfs_rename() -> leases_conflict()) if a delegation is present. Before returning NFS4ERR_DELAY, give the client holding that delegation a chance to return it and then retry the nfsd_rename() again, once. This version of the patch handles renaming an existing file, but does not deal with renaming onto an existing file. That case will still always trigger an NFS4ERR_DELAY. Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=354 Tested-by: Igor Mammedov Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b6c6c7153bdbcdbab07a08fbbe330ca71a2305af Author: Chuck Lever Date: Thu Sep 8 18:14:13 2022 -0400 NFSD: Make nfsd4_setattr() wait before returning NFS4ERR_DELAY [ Upstream commit 34b91dda7124fc3259e4b2ae53e0c933dedfec01 ] nfsd_setattr() can kick off a CB_RECALL (via notify_change() -> break_lease()) if a delegation is present. Before returning NFS4ERR_DELAY, give the client holding that delegation a chance to return it and then retry the nfsd_setattr() again, once. Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=354 Tested-by: Igor Mammedov Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f326970df189b622fe73f217514847668042590a Author: Chuck Lever Date: Thu Sep 8 18:14:07 2022 -0400 NFSD: Refactor nfsd_setattr() [ Upstream commit c0aa1913db57219e91a0a8832363cbafb3a9cf8f ] Move code that will be retried (in a subsequent patch) into a helper function. 
Reviewed-by: Jeff Layton [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 95dce2279c8172cab0ba5d4c23a84ddd1290d3d2 Author: Chuck Lever Date: Thu Sep 8 18:14:00 2022 -0400 NFSD: Add a mechanism to wait for a DELEGRETURN [ Upstream commit c035362eb935fe9381d9d1cc453bc2a37460e24c ] Subsequent patches will use this mechanism to wake up an operation that is waiting for a client to return a delegation. The new tracepoint records whether the wait timed out or was properly awoken by the expected DELEGRETURN: nfsd-1155 [002] 83799.493199: nfsd_delegret_wakeup: xid=0x14b7d6ef fh_hash=0xf6826792 (timed out) Suggested-by: Jeff Layton Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3c0e831b87c6a424f85e6ad4cf254198421b6794 Author: Chuck Lever Date: Thu Sep 8 18:13:54 2022 -0400 NFSD: Add tracepoints to report NFSv4 callback completions [ Upstream commit 1035d65446a018ca2dd179e29a2fcd6d29057781 ] Wireshark has always been lousy about dissecting NFSv4 callbacks, especially NFSv4.0 backchannel requests. Add tracepoints so we can surgically capture these events in the trace log. Tracepoints are time-stamped and ordered so that we can now observe the timing relationship between a CB_RECALL Reply and the client's DELEGRETURN Call. 
Example: nfsd-1153 [002] 211.986391: nfsd_cb_recall: addr=192.168.1.67:45767 client 62ea82e4:fee7492a stateid 00000003:00000001 nfsd-1153 [002] 212.095634: nfsd_compound: xid=0x0000002c opcnt=2 nfsd-1153 [002] 212.095647: nfsd_compound_status: op=1/2 OP_PUTFH status=0 nfsd-1153 [002] 212.095658: nfsd_file_put: hash=0xf72 inode=0xffff9291148c7410 ref=3 flags=HASHED|REFERENCED may=READ file=0xffff929103b3ea00 nfsd-1153 [002] 212.095661: nfsd_compound_status: op=2/2 OP_DELEGRETURN status=0 kworker/u25:8-148 [002] 212.096713: nfsd_cb_recall_done: client 62ea82e4:fee7492a stateid 00000003:00000001 status=0 Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bc6bead0af16f1343f7e8d0e71a3b5a64d658df9 Author: Gaosheng Cui Date: Fri Sep 9 14:59:10 2022 +0800 nfsd: remove nfsd4_prepare_cb_recall() declaration [ Upstream commit 18224dc58d960c65446971930d0487fc72d00598 ] nfsd4_prepare_cb_recall() has been removed since commit 0162ac2b978e ("nfsd: introduce nfsd4_callback_ops"), so remove its declaration. Signed-off-by: Gaosheng Cui Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 330914c342451a8e229fbe67bb37a63e40564767 Author: Jeff Layton Date: Thu Sep 8 12:31:07 2022 -0400 nfsd: clean up mounted_on_fileid handling [ Upstream commit 6106d9119b6599fa23dc556b429d887b4c2d9f62 ] We only need the inode number for this, not a full rack of attributes. Rename this function, make it take a pointer to a u64 instead of a struct kstat, and change it to request only STATX_INO.
Signed-off-by: Jeff Layton [ cel: renamed get_mounted_on_ino() ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f574d41b1bda4fa3687b9c34ff3a9e5251833e65 Author: Chuck Lever Date: Mon Sep 5 15:33:32 2022 -0400 NFSD: Fix handling of oversized NFSv4 COMPOUND requests [ Upstream commit 7518a3dc5ea249d4112156ce71b8b184eb786151 ] If an NFS server returns NFS4ERR_RESOURCE on the first operation in an NFSv4 COMPOUND, there's no way for a client to know where the problem is and then simplify the compound to make forward progress. So instead, make NFSD process as many operations in an oversized COMPOUND as it can and then return NFS4ERR_RESOURCE on the first operation it did not process. pynfs NFSv4.0 COMP6 exercises this case, but checks only for the COMPOUND status code, not whether the server has processed any of the operations. pynfs NFSv4.1 SEQ6 and SEQ7 exercise the NFSv4.1 case, which detects too many operations per COMPOUND by checking against the limits negotiated when the session was created. Suggested-by: Bruce Fields Fixes: 0078117c6d91 ("nfsd: return RESOURCE not GARBAGE_ARGS on too many ops") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b0062184a18435036223174aff2ff3bd5a260029 Author: NeilBrown Date: Tue Sep 6 10:42:19 2022 +1000 NFSD: drop fname and flen args from nfsd_create_locked() [ Upstream commit 9558f9304ca1903090fa5d995a3269a8e82804b4 ] nfsd_create_locked() does not use the "fname" and "flen" arguments, so drop them from declaration and all callers. 
Signed-off-by: NeilBrown Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c23687911f82a63fa2977ce9c992b395e90f8ba0 Author: Chuck Lever Date: Thu Sep 1 15:10:24 2022 -0400 NFSD: Protect against send buffer overflow in NFSv3 READ [ Upstream commit fa6be9cc6e80ec79892ddf08a8c10cabab9baf38 ] Since before the git era, NFSD has conserved the number of pages held by each nfsd thread by combining the RPC receive and send buffers into a single array of pages. This works because there are no cases where an operation needs a large RPC Call message and a large RPC Reply at the same time. Once an RPC Call has been received, svc_process() updates svc_rqst::rq_res to describe the part of rq_pages that can be used for constructing the Reply. This means that the send buffer (rq_res) shrinks when the received RPC record containing the RPC Call is large. A client can force this shrinkage on TCP by sending a correctly- formed RPC Call header contained in an RPC record that is excessively large. The full maximum payload size cannot be constructed in that case. Cc: Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2007867c5874134f2271eb276398208070049dd3 Author: Chuck Lever Date: Thu Sep 1 15:10:18 2022 -0400 NFSD: Protect against send buffer overflow in NFSv2 READ [ Upstream commit 401bc1f90874280a80b93f23be33a0e7e2d1f912 ] Since before the git era, NFSD has conserved the number of pages held by each nfsd thread by combining the RPC receive and send buffers into a single array of pages. This works because there are no cases where an operation needs a large RPC Call message and a large RPC Reply at the same time. Once an RPC Call has been received, svc_process() updates svc_rqst::rq_res to describe the part of rq_pages that can be used for constructing the Reply. This means that the send buffer (rq_res) shrinks when the received RPC record containing the RPC Call is large. 
A client can force this shrinkage on TCP by sending a correctly- formed RPC Call header contained in an RPC record that is excessively large. The full maximum payload size cannot be constructed in that case. Cc: Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 57774b1526163766403167b7bf00b136fb103761 Author: Chuck Lever Date: Thu Sep 1 15:10:12 2022 -0400 NFSD: Protect against send buffer overflow in NFSv3 READDIR [ Upstream commit 640f87c190e0d1b2a0fcb2ecf6d2cd53b1c41991 ] Since before the git era, NFSD has conserved the number of pages held by each nfsd thread by combining the RPC receive and send buffers into a single array of pages. This works because there are no cases where an operation needs a large RPC Call message and a large RPC Reply message at the same time. Once an RPC Call has been received, svc_process() updates svc_rqst::rq_res to describe the part of rq_pages that can be used for constructing the Reply. This means that the send buffer (rq_res) shrinks when the received RPC record containing the RPC Call is large. A client can force this shrinkage on TCP by sending a correctly- formed RPC Call header contained in an RPC record that is excessively large. The full maximum payload size cannot be constructed in that case. Thanks to Aleksi Illikainen and Kari Hulkko for uncovering this issue. Reported-by: Ben Ronallo Cc: Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0e57d696f60dee6117a8ace0cac7c5761d375277 Author: Chuck Lever Date: Thu Sep 1 15:10:05 2022 -0400 NFSD: Protect against send buffer overflow in NFSv2 READDIR [ Upstream commit 00b4492686e0497fdb924a9d4c8f6f99377e176c ] Restore the previous limit on the @count argument to prevent a buffer overflow attack. 
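For illustration only, a minimal userspace sketch of the kind of limit described above: an untrusted, client-supplied count is clamped to the size of the reply buffer before it is used. This is not the code from the patch; REPLY_BUF_SIZE, MIN_REPLY_LEN, clamp_u32() and readdir_reply_limit() are hypothetical names, and the 4096-byte buffer and 8-byte floor are assumed values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reply buffer size and minimum length; assumed values. */
#define REPLY_BUF_SIZE 4096u
#define MIN_REPLY_LEN  8u

/* Clamp val into the inclusive range [lo, hi]. */
static uint32_t clamp_u32(uint32_t val, uint32_t lo, uint32_t hi)
{
	assert(lo <= hi);
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/*
 * Return how many bytes the server may encode for a READDIR reply.
 * client_count comes off the wire and is therefore untrusted, so it
 * must never be allowed to exceed the buffer actually available.
 */
static uint32_t readdir_reply_limit(uint32_t client_count)
{
	return clamp_u32(client_count, MIN_REPLY_LEN, REPLY_BUF_SIZE);
}
```

With such a limit in place, a request that asks for 0xffffffff bytes is served as if it had asked for REPLY_BUF_SIZE bytes, and the encoder cannot run past the end of the buffer.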
Fixes: 53b1119a6e50 ("NFSD: Fix READDIR buffer overflow") Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2bd6f95ff9911d06d922fdbd49db9b52f6609182 Author: Chuck Lever Date: Fri Sep 2 18:18:16 2022 -0400 NFSD: Increase NFSD_MAX_OPS_PER_COMPOUND [ Upstream commit 80e591ce636f3ae6855a0ca26963da1fdd6d4508 ] When attempting an NFSv4 mount, a Solaris NFSv4 client builds a single large COMPOUND that chains a series of LOOKUPs to get to the pseudo filesystem root directory that is to be mounted. The Linux NFS server's current maximum of 16 operations per NFSv4 COMPOUND is not large enough to ensure that this works for paths that are more than a few components deep. Since NFSD_MAX_OPS_PER_COMPOUND is mostly a sanity check, and most NFSv4 COMPOUNDS are between 3 and 6 operations (thus they do not trigger any re-allocation of the operation array on the server), increasing this maximum should result in little to no impact. The ops array can get large now, so allocate it via vmalloc() to help ensure memory fragmentation won't cause an allocation failure. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216383 Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d40bef3801cdf7bbb13035067de16dd99b115593 Author: Christophe JAILLET Date: Thu Sep 1 07:27:19 2022 +0200 nfsd: Propagate some error code returned by memdup_user() [ Upstream commit 30a30fcc3fc1ad4c5d017c9fcb75dc8f59e7bdad ] Propagate the error code returned by memdup_user() instead of a hard coded -EFAULT. Suggested-by: Dan Carpenter Signed-off-by: Christophe JAILLET Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 490af5b07d854b4eb3cd46d58910230ab86a2103 Author: Christophe JAILLET Date: Thu Sep 1 07:27:11 2022 +0200 nfsd: Avoid some useless tests [ Upstream commit d44899b8bb0b919f923186c616a84f0e70e04772 ] memdup_user() can't return NULL, so there is no point for checking for it. Simplify some tests accordingly. 
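The error-handling pattern behind this cleanup and the preceding memdup_user() patch can be sketched in userspace as follows. This is a simplified model, not kernel code: ERR_PTR(), PTR_ERR() and IS_ERR() are re-implemented here to mirror the kernel helpers of the same names, while mock_memdup() and copy_arg() are hypothetical stand-ins. The point is that a function returning ERR_PTR() values never returns NULL, so the caller needs only an IS_ERR() test and should propagate PTR_ERR() rather than a hard-coded -EFAULT.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ERRNO 4095

/* Userspace models of the kernel's error-pointer helpers. */
static void *ERR_PTR(long error) { return (void *)error; }
static long PTR_ERR(const void *ptr) { return (long)ptr; }
static int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical stand-in for memdup_user(): duplicate len bytes from src.
 * On failure it returns an error pointer, never NULL.
 */
static void *mock_memdup(const void *src, size_t len)
{
	void *p;

	if (!src)
		return ERR_PTR(-EFAULT);	/* models a bad user address */
	p = malloc(len ? len : 1);
	if (!p)
		return ERR_PTR(-ENOMEM);
	memcpy(p, src, len);
	return p;
}

/* Caller: one IS_ERR() test, no NULL test, real error code propagated. */
static int copy_arg(const void *src, size_t len, void **out)
{
	void *buf = mock_memdup(src, len);

	if (IS_ERR(buf))
		return (int)PTR_ERR(buf);
	*out = buf;
	return 0;
}
```

On success the caller owns the returned buffer and must free it. On failure it sees the specific cause (for example -ENOMEM) instead of a blanket -EFAULT.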
Suggested-by: Dan Carpenter Signed-off-by: Christophe JAILLET Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit cef1ab71ae3705fd8203d506866f00e6aea5a104 Author: Jinpeng Cui Date: Wed Aug 31 14:20:02 2022 +0000 NFSD: remove redundant variable status [ Upstream commit 4ab3442ca384a02abf8b1f2b3449a6c547851873 ] Return value directly from fh_verify() do_open_permission() exp_pseudoroot() instead of getting value from redundant variable status. Reported-by: Zeal Robot Signed-off-by: Jinpeng Cui Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 30b0e49a957435fcd184d1e39255a4f1f5c42960 Author: Olga Kornievskaia Date: Fri Aug 19 15:16:36 2022 -0400 NFSD enforce filehandle check for source file in COPY [ Upstream commit 754035ff79a14886e68c0c9f6fa80adb21f12b53 ] If the passed in filehandle for the source file in the COPY operation is not a regular file, the server MUST return NFS4ERR_WRONG_TYPE. Signed-off-by: Olga Kornievskaia Reviewed-by: Jeff Layton [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9dc20a662fb8389ad1e8c766bc0cb6a2028c2e82 Author: Wolfram Sang Date: Thu Aug 18 23:01:16 2022 +0200 lockd: move from strlcpy with unused retval to strscpy [ Upstream commit 97f8e62572555f8ad578d7b1739ba64d5d2cac0f ] Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 91eebaa181b5cfec8fca647bb33c6dc167051397 Author: Wolfram Sang Date: Thu Aug 18 23:01:14 2022 +0200 NFSD: move from strlcpy with unused retval to strscpy [ Upstream commit 72f78ae00a8e5d7abe13abac8305a300f6afd74b ] Follow the advice of the below link and prefer 'strscpy' in this subsystem. 
Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 57afda7bf248389290b81c33370a8e320df59ab8 Author: Al Viro Date: Sat Sep 10 22:14:02 2022 +0100 nfsd_splice_actor(): handle compound pages [ Upstream commit bfbfb6182ad1d7d184b16f25165faad879147f79 ] pipe_buffer might refer to a compound page (and contain more than a PAGE_SIZE worth of data). Theoretically it had been possible since way back, but nfsd_splice_actor() hadn't run into that until copy_page_to_iter() change. Fortunately, the only thing that changes for compound pages is that we need to stuff each relevant subpage in and convert the offset into offset in the first subpage. Acked-by: Chuck Lever Tested-by: Benjamin Coddington Fixes: f0f6b614f83d "copy_page_to_iter(): don't split high-order page in case of ITER_PIPE" Signed-off-by: Al Viro [ cel: "‘for’ loop initial declarations are only allowed in C99 or C11 mode" ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c7d320e62066039c5863dc637e616042fc387d05 Author: NeilBrown Date: Thu Sep 8 12:08:40 2022 +1000 NFSD: fix regression with setting ACLs. [ Upstream commit 00801cd92d91e94aa04d687f9bb9a9104e7c3d46 ] A recent patch moved ACL setting into nfsd_setattr(). Unfortunately it didn't work as nfsd_setattr() aborts early if iap->ia_valid is 0. Remove this test, and instead avoid calling notify_change() when ia_valid is 0. This means that nfsd_setattr() will now *always* lock the inode. Previously it didn't if only a ATTR_MODE change was requested on a symlink (see Commit 15b7a1b86d66 ("[PATCH] knfsd: fix setattr-on-symlink error return")). I don't think this change really matters. 
Fixes: c0cbe70742f4 ("NFSD: add posix ACLs to struct nfsd_attrs") Signed-off-by: NeilBrown Reviewed-by: Jeff Layton [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1f87122d348ea4e2c9f6286c86a9dc7c433d711f Author: Jeff Layton Date: Mon Aug 1 15:57:26 2022 -0400 lockd: detect and reject lock arguments that overflow [ Upstream commit 6930bcbfb6ceda63e298c6af6d733ecdf6bd4cde ] lockd doesn't currently vet the start and length in nlm4 requests like it should, and can end up generating lock requests with arguments that overflow when passed to the filesystem. The NLM4 protocol uses unsigned 64-bit arguments for both start and length, whereas struct file_lock tracks the start and end as loff_t values. By the time we get around to calling nlm4svc_retrieve_args, we've lost the information that would allow us to determine if there was an overflow. Start tracking the actual start and len for NLM4 requests in the nlm_lock. In nlm4svc_retrieve_args, vet these values to ensure they won't cause an overflow, and return NLM4_FBIG if they do. Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=392 Reported-by: Jan Kasiak Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Cc: # 5.14+ Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b15656dfa283dad1b2a328753bdc1f5b19d1cf5c Author: NeilBrown Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: discard fh_locked flag and fh_lock/fh_unlock [ Upstream commit dd8dd403d7b223cc77ee89d8d09caf045e90e648 ] As all inode locking is now fully balanced, fh_put() does not need to call fh_unlock(). fh_lock() and fh_unlock() are no longer used, so discard them. These are the only real users of ->fh_locked, so discard that too. 
Reviewed-by: Jeff Layton Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5a8d428f5e373f146ef76a004b38630e2c8bc30a Author: NeilBrown Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: use (un)lock_inode instead of fh_(un)lock for file operations [ Upstream commit bb4d53d66e4b8c8b8e5634802262e53851a2d2db ] When locking a file to access ACLs and xattrs etc, use explicit locking with inode_lock() instead of fh_lock(). This means that the calls to fh_fill_pre/post_attr() are also explicit which improves readability and allows us to place them only where they are needed. Only the xattr calls need pre/post information. When locking a file we don't need I_MUTEX_PARENT as the file is not a parent of anything, so we can use inode_lock() directly rather than the inode_lock_nested() call that fh_lock() uses. Reviewed-by: Jeff Layton Signed-off-by: NeilBrown [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9ef325edeade4d5c67121553ea158fa17b215bb4 Author: NeilBrown Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: use explicit lock/unlock for directory ops [ Upstream commit debf16f0c671cb8db154a9ebcd6014cfff683b80 ] When creating or unlinking a name in a directory use explicit inode_lock_nested() instead of fh_lock(), and explicit calls to fh_fill_pre_attrs() and fh_fill_post_attrs(). This is already done for renames, with lock_rename() as the explicit locking. Also move the 'fill' calls closer to the operation that might change the attributes. This way they are avoided on some error paths. For the v2-only code in nfsproc.c, the fill calls are not replaced as they aren't needed. Making the locking explicit will simplify proposed future changes to locking for directories. It also makes it easily visible exactly where pre/post attributes are used - not all callers of fh_lock() actually need the pre/post attributes. 
Reviewed-by: Jeff Layton Signed-off-by: NeilBrown [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 203f09fae4e21ee8fe14077f6081516722af12f4 Author: NeilBrown Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: reduce locking in nfsd_lookup() [ Upstream commit 19d008b46941b8c668402170522e0f7a9258409c ] nfsd_lookup() takes an exclusive lock on the parent inode, but no callers want the lock and it may not be needed at all if the result is in the dcache. Change nfsd_lookup_dentry() to not take the lock, and call lookup_one_len_locked() which takes lock only if needed. nfsd4_open() currently expects the lock to still be held, but that isn't necessary as nfsd_validate_delegated_dentry() provides required guarantees without the lock. NOTE: NFSv4 requires directory changeinfo for OPEN even when a create wasn't requested and no change happened. Now that nfsd_lookup() doesn't use fh_lock(), we need to explicitly fill the attributes when no create happens. A new fh_fill_both_attrs() is provided for that task. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bedd266b1fe3a787cc0c3e8b3d5ecaac7a85232d Author: NeilBrown Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: only call fh_unlock() once in nfsd_link() [ Upstream commit e18bcb33bc5b69bccc2b532075aa00bb49cc01c5 ] On non-error paths, nfsd_link() calls fh_unlock() twice. This is safe because fh_unlock() records that the unlock has been done and doesn't repeat it. However it makes the code a little confusing and interferes with changes that are planned for directory locking. So rearrange the code to ensure fh_unlock() is called exactly once if fh_lock() was called. 
Reviewed-by: Jeff Layton Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 77f83bc2ed03b3038e12b80dee5c46cf094590e5 Author: NeilBrown Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: always drop directory lock in nfsd_unlink() [ Upstream commit b677c0c63a135a916493c064906582e9f3ed4802 ] Some error paths in nfsd_unlink() allow it to exit without unlocking the directory. This is not a problem in practice as the directory will be unlocked by fh_put(), but it is untidy and potentially confusing. This allows us to remove all the fh_unlock() calls that are immediately after nfsd_unlink() calls. Reviewed-by: Jeff Layton Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 617f72a1aa6d9eb3e370c12deefaaac139f74003 Author: NeilBrown Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning. [ Upstream commit 927bfc5600cd6333c9ef9f090f19e66b7d4c8ee1 ] nfsd_create() usually returns with the directory still locked. nfsd_symlink() usually returns with it unlocked. This is clumsy. Until recently nfsd_create() needed to keep the directory locked until ACLs and security label had been set. These are now set inside nfsd_create() (in nfsd_setattr()) so this need is gone. So change nfsd_create() and nfsd_symlink() to always unlock, and remove any fh_unlock() calls that follow calls to these functions. Signed-off-by: NeilBrown [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c5409ce523af40d5c3019717bc5b4f72038d48be Author: NeilBrown Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: add posix ACLs to struct nfsd_attrs [ Upstream commit c0cbe70742f4a70893cd6e5f6b10b6e89b6db95b ] pacl and dpacl pointers are added to struct nfsd_attrs, which requires that we have an nfsd_attrs_free() function to free them. Those nfsv4 functions that can set ACLs now set up these pointers based on the passed-in NFSv4 ACL.
nfsd_setattr() sets the ACLs as appropriate. Errors are handled as with security labels. Signed-off-by: NeilBrown [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 18ee0869d6f3357043ba6a191edc3b53655e4942 Author: NeilBrown Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: add security label to struct nfsd_attrs [ Upstream commit d6a97d3f589a3a46a16183e03f3774daee251317 ] nfsd_setattr() now sets a security label if provided, and nfsv4 provides it in the 'open' and 'create' paths and the 'setattr' path. If setting the label failed (including because the kernel doesn't support labels), an error field in 'struct nfsd_attrs' is set, and the caller can respond. The open/create callers clear FATTR4_WORD2_SECURITY_LABEL in the returned attr set in this case. The setattr caller returns the error. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2a5642abeb7249beb529ea31a3c000db9c51b729 Author: NeilBrown Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: set attributes when creating symlinks [ Upstream commit 93adc1e391a761441d783828b93979b38093d011 ] The NFS protocol includes attributes when creating symlinks. Linux does store attributes for symlinks and allows them to be set, though they are not used for permission checking. NFSD currently doesn't set standard (struct iattr) attributes when creating symlinks, but for NFSv4 it does set ACLs and security labels. This is inconsistent. To improve consistency, pass the provided attributes into nfsd_symlink() and call nfsd_create_setattr() to set them. NOTE: this results in a behaviour change for all NFS versions when the client sends non-default attributes with a SYMLINK request. With the Linux client, the only attributes are: attr.ia_mode = S_IFLNK | S_IRWXUGO; attr.ia_valid = ATTR_MODE; so the final outcome will be unchanged. Other clients might send different attributes, and if they did they probably expect them to be honoured.
We ignore any error from nfsd_create_setattr(). It isn't really clear what should be done if a file is successfully created, but the attributes cannot be set. NFS doesn't allow partial success to be reported. Reporting failure is probably more misleading than reporting success, so the status is ignored. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 45cf4b1bb10fc848657bdc69ffe014165d1780f4 Author: NeilBrown Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: introduce struct nfsd_attrs [ Upstream commit 7fe2a71dda349a1afa75781f0cc7975be9784d15 ] The attributes that nfsd might want to set on a file include 'struct iattr' as well as an ACL and security label. The latter two are passed around quite separately from the first, in part because they are only needed for NFSv4. This leads to some clumsiness in the code, such as the attributes NOT being set in nfsd_create_setattr(). We need to keep the directory locked until all attributes are set to ensure the file is never visible without all its attributes. This need combined with the inconsistent handling of attributes leads to more clumsiness. As a first step towards tidying this up, introduce 'struct nfsd_attrs'. This is passed (by reference) to vfs.c functions that work with attributes, and is assembled by the various nfs*proc functions which call them. As yet only iattr is included, but future patches will expand this. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3aac39eaa675259e313ab5418fc3b6476f5b8712 Author: Jeff Layton Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: verify the opened dentry after setting a delegation [ Upstream commit 876c553cb41026cb6ad3cef970a35e5f69c42a25 ] Between opening a file and setting a delegation on it, someone could rename or unlink the dentry. If this happens, we do not want to grant a delegation on the open.
On a CLAIM_NULL open, we're opening by filename, and we may (in the non-create case) or may not (in the create case) be holding i_rwsem when attempting to set a delegation. The latter case allows a race. After getting a lease, redo the lookup of the file being opened and validate that the resulting dentry matches the one in the open file description. To properly redo the lookup we need an rqst pointer to pass to nfsd_lookup_dentry(), so make sure that is available. Signed-off-by: Jeff Layton Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 820bf1383d66a4e2c5fce7277d6f99ee68e5f98a Author: Jeff Layton Date: Tue Jul 26 16:45:30 2022 +1000 NFSD: drop fh argument from alloc_init_deleg [ Upstream commit bbf936edd543e7220f60f9cbd6933b916550396d ] Currently, we pass the fh of the opened file down through several functions so that alloc_init_deleg can pass it to delegation_blocked. The filehandle of the open file is available in the nfs4_file however, so there's no need to pass it in a separate argument. Drop the argument from alloc_init_deleg, nfs4_open_delegation and nfs4_set_delegation. Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c62dcf86332e030dc7c427cf5582a769926e2a1e Author: Chuck Lever Date: Wed Jul 27 14:41:18 2022 -0400 NFSD: Move copy offload callback arguments into a separate structure [ Upstream commit a11ada99ce93a79393dc6683d22f7915748c8f6b ] Refactor so that CB_OFFLOAD arguments can be passed without allocating a whole struct nfsd4_copy object. On my system (x86_64) this removes another 96 bytes from struct nfsd4_copy. [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e1d1b6574e7b7bb593b1c612379f6632fa90ea30 Author: Chuck Lever Date: Wed Jul 27 14:41:12 2022 -0400 NFSD: Add nfsd4_send_cb_offload() [ Upstream commit e72f9bc006c08841c46d27747a4debc747a8fe13 ] Refactor for legibility. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d87486acbd6eef1d5ebc997d70115007594b7f6d Author: Chuck Lever Date: Wed Jul 27 14:41:06 2022 -0400 NFSD: Remove kmalloc from nfsd4_do_async_copy() [ Upstream commit ad1e46c9b07b13659635ee5405f83ad0df143116 ] Instead of manufacturing a phony struct nfsd_file, pass the struct file returned by nfs42_ssc_open() directly to nfsd4_do_copy(). [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a860bd179e7a0483f4d4ad3df57ffe5f7775afc9 Author: Chuck Lever Date: Wed Jul 27 14:40:59 2022 -0400 NFSD: Refactor nfsd4_do_copy() [ Upstream commit 3b7bf5933cada732783554edf0dc61283551c6cf ] Refactor: Now that nfsd4_do_copy() no longer calls the cleanup helpers, plumb the use of struct file pointers all the way down to _nfsd_copy_file_range(). Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8153ed38cc9d43ccb40d5bc4dcd3046e9f3f5be7 Author: Chuck Lever Date: Wed Jul 27 14:40:53 2022 -0400 NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2) [ Upstream commit 478ed7b10d875da2743d1a22822b9f8a82df8f12 ] Move the nfsd4_cleanup_*() call sites out of nfsd4_do_copy(). A subsequent patch will modify one of the new call sites to avoid the need to manufacture the phony struct nfsd_file. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0d592d96d6c60de28ff8877e75d55ea39b29f33f Author: Chuck Lever Date: Wed Jul 27 14:40:47 2022 -0400 NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2) [ Upstream commit 24d796ea383b8a4c8234e06d1b14bbcd371192ea ] The @src parameter is sometimes a pointer to a struct nfsd_file and sometimes a pointer to struct file hiding in a phony struct nfsd_file. Refactor nfsd4_cleanup_inter_ssc() so the @src parameter is always an explicit struct file. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ac774e1eebe8cc93a29530252cbc97a4c21fab23 Author: Chuck Lever Date: Wed Jul 27 14:40:41 2022 -0400 NFSD: Replace boolean fields in struct nfsd4_copy [ Upstream commit 1913cdf56cb5bfbc8170873728d13598cbecda23 ] Clean up: saves 8 bytes, and we can replace check_and_set_stop_copy() with an atomic bitop. [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 627b896c5219ed1f973a594fd9c0bebcc4f0d89f Author: Chuck Lever Date: Wed Jul 27 14:40:35 2022 -0400 NFSD: Make nfs4_put_copy() static [ Upstream commit 8ea6e2c90bb0eb74a595a12e23a1dff9abbc760a ] Clean up: All call sites are in fs/nfsd/nfs4proc.c. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0d7e3df76b50276db781ce7a2a09df74b2147a61 Author: Chuck Lever Date: Wed Jul 27 14:40:28 2022 -0400 NFSD: Reorder the fields in struct nfsd4_op [ Upstream commit d314309425ad5dc1b6facdb2d456580fb5fa5e3a ] Pack the fields to reduce the size of struct nfsd4_op, which is used as an array in struct nfsd4_compoundargs. sizeof(struct nfsd4_op): Before: /* size: 672, cachelines: 11, members: 5 */ After: /* size: 640, cachelines: 10, members: 5 */ Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 94fd87568e91dad8641f4e648b1bfd782652a245 Author: Chuck Lever Date: Wed Jul 27 14:40:22 2022 -0400 NFSD: Shrink size of struct nfsd4_copy [ Upstream commit 87689df694916c40e8e6c179ab1c8710f65cb6c6 ] struct nfsd4_copy is part of struct nfsd4_op, which resides in an 8-element array.
sizeof(struct nfsd4_op): Before: /* size: 1696, cachelines: 27, members: 5 */ After: /* size: 672, cachelines: 11, members: 5 */ Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7c6fd14057a7dab2032aa617870a49f5b05b867e Author: Chuck Lever Date: Wed Jul 27 14:40:16 2022 -0400 NFSD: Shrink size of struct nfsd4_copy_notify [ Upstream commit 09426ef2a64ee189ca1e3298f1e874842dbf35ea ] struct nfsd4_copy_notify is part of struct nfsd4_op, which resides in an 8-element array. sizeof(struct nfsd4_op): Before: /* size: 2208, cachelines: 35, members: 5 */ After: /* size: 1696, cachelines: 27, members: 5 */ Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 02bc4d514c2553ca2461102bb960635e4b844593 Author: Chuck Lever Date: Wed Jul 27 14:40:09 2022 -0400 NFSD: nfserrno(-ENOMEM) is nfserr_jukebox [ Upstream commit bb4d842722b84a2731257054b6405f2d866fc5f3 ] Suggested-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8ce03085cc536770516af77a5ac84b8f171ead99 Author: Chuck Lever Date: Wed Jul 27 14:40:03 2022 -0400 NFSD: Fix strncpy() fortify warning [ Upstream commit 5304877936c0a67e1a01464d113bae4c81eacdb6 ] In function ‘strncpy’, inlined from ‘nfsd4_ssc_setup_dul’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1392:3, inlined from ‘nfsd4_interssc_connect’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1489:11: /home/cel/src/linux/manet/include/linux/fortify-string.h:52:33: warning: ‘__builtin_strncpy’ specified bound 63 equals destination size [-Wstringop-truncation] 52 | #define __underlying_strncpy __builtin_strncpy | ^ /home/cel/src/linux/manet/include/linux/fortify-string.h:89:16: note: in expansion of macro ‘__underlying_strncpy’ 89 | return __underlying_strncpy(p, q, size); | ^~~~~~~~~~~~~~~~~~~~ Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0a1b9a216f7f3f19cd85ebf4f812cc4bfc9e583b Author: Chuck Lever Date: Fri Jul 22 16:09:23 2022 -0400 NFSD: Clean up nfsd4_encode_readlink() [ Upstream commit 
99b002a1fa00d90e66357315757e7277447ce973 ] Similar changes to nfsd4_encode_readv(), all bundled into a single patch. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c7863472e57e19570b5fbae08f88b270f45cafd9 Author: Chuck Lever Date: Fri Jul 22 16:09:16 2022 -0400 NFSD: Use xdr_pad_size() [ Upstream commit 5e64d85c7d0c59cfcd61d899720b8ccfe895d743 ] Clean up: Use a helper instead of open-coding the calculation of the XDR pad size. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c587004a7634d02d1aec63cd94c50a1a8bcd8347 Author: Chuck Lever Date: Fri Jul 22 16:09:10 2022 -0400 NFSD: Simplify starting_len [ Upstream commit 071ae99feadfc55979f89287d6ad2c6a315cb46d ] Clean-up: Now that nfsd4_encode_readv() does not have to encode the EOF or rd_length values, it no longer needs to subtract 8 from @starting_len. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e77d3f5ee50fac3d3b1e31705af68fc2a8da24fb Author: Chuck Lever Date: Fri Jul 22 16:09:04 2022 -0400 NFSD: Optimize nfsd4_encode_readv() [ Upstream commit 28d5bc468efe74b790e052f758ce083a5015c665 ] write_bytes_to_xdr_buf() is pretty expensive to use for inserting an XDR data item that is always 1 XDR_UNIT at an address that is always XDR word-aligned. Since both the readv and splice read paths encode EOF and maxcount values, move both to a common code path. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d176e7348bd05822aee549e3e99326eef6940a86 Author: Chuck Lever Date: Fri Jul 22 16:08:57 2022 -0400 NFSD: Add an nfsd4_read::rd_eof field [ Upstream commit 24c7fb85498eda1d4c6b42cc4886328429814990 ] Refactor: Make the EOF result available in the entire NFSv4 READ path. 
Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 427bd174a4d345bd0e8f9d9b34a45fdd073d795b Author: Chuck Lever Date: Fri Jul 22 16:08:51 2022 -0400 NFSD: Clean up SPLICE_OK in nfsd4_encode_read() [ Upstream commit c738b218a2e5a753a336b4b7fee6720b902c7ace ] Do the test_bit() once -- this reduces the number of locked-bus operations and makes the function a little easier to read. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8fd87bf897bcb3af491f0d6ad9586d8f041db679 Author: Chuck Lever Date: Fri Jul 22 16:08:45 2022 -0400 NFSD: Optimize nfsd4_encode_fattr() [ Upstream commit ab04de60ae1cc64ae16b77feae795311b97720c7 ] write_bytes_to_xdr_buf() is a generic way to place a variable-length data item in an already-reserved spot in the encoding buffer. However, it is costly. In nfsd4_encode_fattr(), it is unnecessary because the data item is fixed in size and the buffer destination address is always word-aligned. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d8c3d704085cf197f95289029bd927e4c5fa9163 Author: Chuck Lever Date: Fri Jul 22 16:08:38 2022 -0400 NFSD: Optimize nfsd4_encode_operation() [ Upstream commit 095a764b7afb06c9499b798c04eaa3cbf70ebe2d ] write_bytes_to_xdr_buf() is a generic way to place a variable-length data item in an already-reserved spot in the encoding buffer. However, it is costly, and here, it is unnecessary because the data item is fixed in size, the buffer destination address is always word-aligned, and the destination location is already in @p. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3b5dcf6b46d99801049240111f78db0d3435c265 Author: Jeff Layton Date: Wed Jul 20 08:39:23 2022 -0400 nfsd: silence extraneous printk on nfsd.ko insertion [ Upstream commit 3a5940bfa17fb9964bf9688b4356ca643a8f5e2d ] This printk pops every time nfsd.ko gets plugged in. 
Most kmods don't do that and this one is not very informative. Olaf's email address seems to be defunct at this point anyway. Just drop it. Cc: Olaf Kirch Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f81ab23756abfc1829f20aaf6469f76d0ed4326c Author: Dai Ngo Date: Fri Jul 15 16:54:53 2022 -0700 NFSD: limit the number of v4 clients to 1024 per 1GB of system memory [ Upstream commit 4271c2c0887562318a0afef97d32d8a71cbe0743 ] Currently there is no limit on how many v4 clients are supported by the system. Systems with a small memory configuration can fail to function properly when a very large number of clients exists and creates memory shortage conditions. This patch enforces a limit of 1024 NFSv4 clients, including courtesy clients, per 1GB of system memory. When the number of clients reaches the limit, requests that create new clients are returned with NFS4ERR_DELAY and the laundromat is kick-started to trim old clients. Due to the overhead of the upcall to remove the client record, the maximum number of clients the laundromat removes on each run is limited to 128. This is done to ensure the laundromat can still process the other tasks in a timely manner. Since there is now a limit on the number of clients, the 24-hr idle time limit of courtesy clients is no longer needed and was removed. Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ec16f5f7faaa6e0cf602d0c6ebe678a332e9dbea Author: Dai Ngo Date: Fri Jul 15 16:54:52 2022 -0700 NFSD: keep track of the number of v4 clients in the system [ Upstream commit 0926c39515aa065a296e97dfc8790026f1e53f86 ] Add counter nfs4_client_count to keep track of the total number of v4 clients, including courtesy clients, in the system.
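The client counter and the limit of 1024 clients per 1GB described in the two entries above might combine roughly as in this user-space sketch. All names here are hypothetical, and the sketch assumes a plain integer division of total memory by 1GB with no minimum floor. It is not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Figures taken from the commit message above. */
#define SKETCH_CLIENTS_PER_GB   1024u
#define SKETCH_LAUNDROMAT_BATCH 128u

/* Assumed policy: whole gigabytes of RAM times 1024 clients. */
static uint64_t sketch_max_clients(uint64_t total_ram_bytes)
{
	return (total_ram_bytes >> 30) * SKETCH_CLIENTS_PER_GB;
}

/*
 * Returns true when a request that would create a new client should be
 * answered with NFS4ERR_DELAY because the limit has been reached.
 */
static bool sketch_must_delay_new_client(uint64_t client_count,
					 uint64_t max_clients)
{
	return client_count >= max_clients;
}

/* Cap on how many old clients one laundromat run may remove. */
static uint64_t sketch_reap_budget(uint64_t reapable_clients)
{
	return reapable_clients < SKETCH_LAUNDROMAT_BATCH ?
		reapable_clients : SKETCH_LAUNDROMAT_BATCH;
}
```

With these assumptions a 4GB server admits 4096 clients, and a single laundromat pass trims at most 128 of them.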
Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4e7a739f6372422a38a69dce96413641f9d623b2 Author: Dai Ngo Date: Fri Jul 15 16:54:51 2022 -0700 NFSD: refactoring v4 specific code to a helper in nfs4state.c [ Upstream commit 6867137ebcf4155fe25f2ecf7c29b9fb90a76d1d ] This patch moves the v4 specific code from nfsd_init_net() to nfsd4_init_leases_net() helper in nfs4state.c Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 705e2cb1fec053c0354943c3c10d6d95e71c21e7 Author: Chuck Lever Date: Fri Jul 8 14:27:09 2022 -0400 NFSD: Ensure nf_inode is never dereferenced [ Upstream commit 427f5f83a3191cbf024c5aea6e5b601cdf88d895 ] The documenting comment for struct nf_file states: /* * A representation of a file that has been opened by knfsd. These are hashed * in the hashtable by inode pointer value. Note that this object doesn't * hold a reference to the inode by itself, so the nf_inode pointer should * never be dereferenced, only used for comparison. */ Replace the two existing dereferences to make the comment always true. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 451b2c2125dfa98d1a050bd149a8e10a1288ac94 Author: Chuck Lever Date: Fri Jul 8 14:27:02 2022 -0400 NFSD: NFSv4 CLOSE should release an nfsd_file immediately [ Upstream commit 5e138c4a750dc140d881dab4a8804b094bbc08d2 ] The last close of a file should enable other accessors to open and use that file immediately. Leaving the file open in the filecache prevents other users from accessing that file until the filecache garbage-collects the file -- sometimes that takes several seconds. 
Reported-by: Wang Yugui Link: https://bugzilla.linux-nfs.org/show_bug.cgi?387 Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c553e79c08038cd92b61a2d38549a4f5b865512e Author: Chuck Lever Date: Fri Jul 8 14:26:49 2022 -0400 NFSD: Move nfsd_file_trace_alloc() tracepoint [ Upstream commit b40a2839470cd62ed68c4a32d72a18ee8975b1ac ] Avoid recording the allocation of an nfsd_file item that is immediately released because a matching item was already inserted in the hash. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 26664203ddebce2cbadcca756cc114ea07953301 Author: Chuck Lever Date: Fri Jul 8 14:26:43 2022 -0400 NFSD: Separate tracepoints for acquire and create [ Upstream commit be0230069fcbf7d332d010b57c1d0cfd623a84d6 ] These tracepoints collect different information: the create case does not open a file, so there's no nf_file available. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit de070f66d23ff90e4e25122d5809bd21d2441e52 Author: Chuck Lever Date: Fri Jul 8 14:26:36 2022 -0400 NFSD: Clean up unused code after rhashtable conversion [ Upstream commit 0ec8e9d1539a7b8109a554028bbce441052f847e ] Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a174ce98b302958394d8fdbc1a5b8b30dadcdd72 Author: Chuck Lever Date: Fri Jul 8 14:26:30 2022 -0400 NFSD: Convert the filecache to use rhashtable [ Upstream commit ce502f81ba884c1fe45dc0ebddbcaaa4ec0fc5fb ] Enable the filecache hash table to start small, then grow with the workload. Smaller server deployments benefit because there should be lower memory utilization. Larger server deployments should see improved scaling with the number of open files. 
Suggested-by: Jeff Layton Suggested-by: Dave Chinner Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ebe886ac37d2d566a1a310d75cc47fe65dae4f14 Author: Chuck Lever Date: Fri Jul 8 14:26:23 2022 -0400 NFSD: Set up an rhashtable for the filecache [ Upstream commit fc22945ecc2a0a028f3683115f98a922d506c284 ] Add code to initialize and tear down an rhashtable. The rhashtable is not used yet. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1ea9b51f738c27ca81447a01565a5b9a4045616e Author: Chuck Lever Date: Fri Jul 8 14:26:16 2022 -0400 NFSD: Replace the "init once" mechanism [ Upstream commit c7b824c3d06c85e054caf86e227255112c5e3c38 ] In a moment, the nfsd_file_hashtbl global will be replaced with an rhashtable. Replace the one or two spots that need to check if the hash table is available. We can easily reuse the SHUTDOWN flag for this purpose. Document that this mechanism relies on callers to hold the nfsd_mutex to prevent init, shutdown, and purging to run concurrently. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bbb260f3ce9ff499f841df36b83d890c2ea11ec4 Author: Chuck Lever Date: Fri Jul 8 14:26:10 2022 -0400 NFSD: Remove nfsd_file::nf_hashval [ Upstream commit f0743c2b25c65debd4f599a7c861428cd9de5906 ] The value in this field can always be computed from nf_inode, thus it is no longer used. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 12494d98fea94e4cedcd52e819c3fa1a8db176b5 Author: Chuck Lever Date: Fri Jul 8 14:26:03 2022 -0400 NFSD: nfsd_file_hash_remove can compute hashval [ Upstream commit cb7ec76e73ff6640241c8f1f2f35c81d4005a2d6 ] Remove an unnecessary use of nf_hashval. 
Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 10ba39f788868da1a20c9a32e3276a5364681c16 Author: Chuck Lever Date: Fri Jul 8 14:25:57 2022 -0400 NFSD: Refactor __nfsd_file_close_inode() [ Upstream commit a845511007a63467fee575353c706806c21218b1 ] The code that computes the hashval is the same in both callers. To prevent them from going stale, reframe the documenting comments to remove descriptions of the underlying hash table structure, which is about to be replaced. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a86953523ea9608e233d24a91cd5233406bdb638 Author: Chuck Lever Date: Fri Jul 8 14:25:50 2022 -0400 NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode [ Upstream commit 8755326399f471ec3b31e2ab8c5074c0d28a0fb5 ] Remove an unnecessary usage of nf_hashval. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ef7fe4908a1a6187246becc6d1e7ae74c38aa8b4 Author: Chuck Lever Date: Fri Jul 8 14:25:44 2022 -0400 NFSD: Remove lockdep assertion from unhash_and_release_locked() [ Upstream commit f53cef15dddec7203df702cdc62e554190385450 ] IIUC, holding the hash bucket lock is needed only in nfsd_file_unhash, and there is already a lockdep assertion there. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 525c2c81fdccbac305ca10ec76f648321941a4d9 Author: Chuck Lever Date: Fri Jul 8 14:25:37 2022 -0400 NFSD: No longer record nf_hashval in the trace log [ Upstream commit 54f7df7094b329ca35d9f9808692bb16c48b13e9 ] I'm about to replace nfsd_file_hashtbl with an rhashtable. The individual hash values will no longer be visible or relevant, so remove them from the tracepoints. 
Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 99735b8d82d1d9defb439d0125f1654d96fd7a91 Author: Chuck Lever Date: Fri Jul 8 14:25:30 2022 -0400 NFSD: Never call nfsd_file_gc() in foreground paths [ Upstream commit 6df19411367a5fb4ef61854cbd1af269c077f917 ] The checks in nfsd_file_acquire() and nfsd_file_put() that directly invoke filecache garbage collection are intended to keep cache occupancy between a low- and high-watermark. The reason to limit the capacity of the filecache is to keep filecache lookups reasonably fast. However, invoking garbage collection at those points has some undesirable negative impacts. Files that are held open by NFSv4 clients often push the occupancy of the filecache over these watermarks. At that point: - Every call to nfsd_file_acquire() and nfsd_file_put() results in an LRU walk. This has the same effect on lookup latency as long chains in the hash table. - Garbage collection will then run on every nfsd thread, causing a lot of unnecessary lock contention. - Limiting cache capacity pushes out files used only by NFSv3 clients, which are the type of files the filecache is supposed to help. To address those negative impacts, remove the direct calls to the garbage collector. Subsequent patches will address maintaining lookup efficiency as cache capacity increases. Suggested-by: Wang Yugui Suggested-by: Dave Chinner Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 586e8d6c3dc3f88e0491458c05740aa0abca5f42 Author: Chuck Lever Date: Fri Jul 8 14:25:24 2022 -0400 NFSD: Fix the filecache LRU shrinker [ Upstream commit edead3a55804739b2e4af0f35e9c7326264e7b22 ] Without LRU item rotation, the shrinker visits only a few items on the end of the LRU list, and those would always be long-term OPEN files for NFSv4 workloads. That makes the filecache shrinker completely ineffective. Adopt the same strategy as the inode LRU by using LRU_ROTATE. 
Suggested-by: Dave Chinner Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 51fc2b2c797108ba058633a4c0fe674fcabbe8a8 Author: Chuck Lever Date: Fri Jul 8 14:25:17 2022 -0400 NFSD: Leave open files out of the filecache LRU [ Upstream commit 4a0e73e635e3f36b616ad5c943e3d23debe4632f ] There have been reports of problems when running fstests generic/531 against Linux NFS servers with NFSv4. The NFS server that hosts the test's SCRATCH_DEV suffers from CPU soft lock-ups during the test. Analysis shows that: fs/nfsd/filecache.c 482 ret = list_lru_walk(&nfsd_file_lru, 483 nfsd_file_lru_cb, 484 &head, LONG_MAX); causes nfsd_file_gc() to walk the entire length of the filecache LRU list every time it is called (which is quite frequently). The walk holds a spinlock the entire time that prevents other nfsd threads from accessing the filecache. What's more, for NFSv4 workloads, none of the items that are visited during this walk may be evicted, since they are all files that are held OPEN by NFS clients. Address this by ensuring that open files are not kept on the LRU list. Reported-by: Frank van der Linden Reported-by: Wang Yugui Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=386 Suggested-by: Trond Myklebust Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c15db0869e97cc8d13ff536d2ab59b33b5d2cd95 Author: Chuck Lever Date: Fri Jul 8 14:25:11 2022 -0400 NFSD: Trace filecache LRU activity [ Upstream commit c46203acddd9b9200dbc53d0603c97355fd3a03b ] Observe the operation of garbage collection and the lifetime of filecache items. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7cca6908fa148d6dedc9d78d673b2f8d7889b8af Author: Chuck Lever Date: Fri Jul 8 14:25:04 2022 -0400 NFSD: WARN when freeing an item still linked via nf_lru [ Upstream commit 668ed92e651d3c25f9b6e8cb7ceca54d00daa96d ] Add a guardrail to prevent freeing memory that is still on a list. 
This includes either a dispose list or the LRU list. This is the sign of a bug, but this class of bugs can be detected so that they don't endanger system stability, especially while debugging. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0c426d4621c862ba9975573ed42f349c4918f8d3 Author: Chuck Lever Date: Fri Jul 8 14:24:58 2022 -0400 NFSD: Hook up the filecache stat file [ Upstream commit 2e6c6e4c4375bfd3defa5b1ff3604d9f33d1c936 ] There has always been the capability of exporting filecache metrics via /proc, but it was never hooked up. Let's surface these metrics to enable better observability of the filecache. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6dc5cab80881746e3f97227718723e3c426148af Author: Chuck Lever Date: Fri Jul 8 14:24:51 2022 -0400 NFSD: Zero counters when the filecache is re-initialized [ Upstream commit 8b330f78040cbe16cf8029df70391b2a491f17e2 ] If nfsd_file_cache_init() is called after a shutdown, be sure the stat counters are reset. 
Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 04b9376a106f2004777d6cbfb7c07573ebf06207 Author: Chuck Lever Date: Fri Jul 8 14:24:45 2022 -0400 NFSD: Record number of flush calls [ Upstream commit df2aff524faceaf743b7c5ab0f4fb86cb511f782 ] Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2cba48b3d0a000ff119fb98b380ce27911fdcabc Author: Chuck Lever Date: Fri Jul 8 14:24:38 2022 -0400 NFSD: Report the number of items evicted by the LRU walk [ Upstream commit 94660cc19c75083af046b0f8362e3d3bc2eba21d ] Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit af057e5884adc6100720a515b6815eb67ccb4d49 Author: Chuck Lever Date: Fri Jul 8 14:24:31 2022 -0400 NFSD: Refactor nfsd_file_lru_scan() [ Upstream commit 39f1d1ff8148902c5692ffb0e1c4479416ab44a7 ] Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e7d5efd20ea92d5648056cdf570bc6dcb5873744 Author: Chuck Lever Date: Fri Jul 8 14:24:25 2022 -0400 NFSD: Refactor nfsd_file_gc() [ Upstream commit 3bc6d3470fe412f818f9bff6b71d1be3a76af8f3 ] Refactor nfsd_file_gc() to use the new list_lru helper. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8d038e72e7ad559fe02a759557f2d535aca595ad Author: Chuck Lever Date: Fri Jul 8 14:24:18 2022 -0400 NFSD: Add nfsd_file_lru_dispose_list() helper [ Upstream commit 0bac5a264d9a923f5b01f3521e1519a8d0358342 ] Refactor the invariant part of nfsd_file_lru_walk_list() into a separate helper function. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d176e98400712387d7dc36a91026f0827dbf84b2 Author: Chuck Lever Date: Fri Jul 8 14:24:12 2022 -0400 NFSD: Report average age of filecache items [ Upstream commit 904940e94a887701db24401e3ed6928a1d4e329f ] This is a measure of how long items stay in the filecache, to help assess how efficient the cache is. 
Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ca9cc17ec04f50c9ac46f625d74f382b0844d75e Author: Chuck Lever Date: Fri Jul 8 14:24:05 2022 -0400 NFSD: Report count of freed filecache items [ Upstream commit d63293272abb51c02457f1017dfd61c3270d9ae3 ] Surface the count of freed nfsd_file items. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a38dff5964f30710b2fad3fb4cc17a2b9f19d69a Author: Chuck Lever Date: Fri Jul 8 14:23:59 2022 -0400 NFSD: Report count of calls to nfsd_file_acquire() [ Upstream commit 29d4bdbbb910f33d6058d2c51278f00f656df325 ] Count the number of successful acquisitions that did not create a file (ie, acquisitions that do not result in a compulsory cache miss). This count can be compared directly with the reported hit count to compute a hit ratio. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 91c03a61241f7e42fad58f28d4b8f02b3d52de87 Author: Chuck Lever Date: Fri Jul 8 14:23:52 2022 -0400 NFSD: Report filecache LRU size [ Upstream commit 0fd244c115f0321fc5e34ad2291f2a572508e3f7 ] Surface the NFSD filecache's LRU list length to help field troubleshooters monitor filecache issues. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4ff0e22e547e5201ee5c321db8f31472abca1bec Author: Chuck Lever Date: Fri Jul 8 14:23:45 2022 -0400 NFSD: Demote a WARN to a pr_warn() [ Upstream commit ca3f9acb6d3faf78da2b63324f7c737dbddf7f69 ] The call trace doesn't add much value, but it sure is noisy. Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit cc3b111e3b020edf1ef68b4864a26cd122b1f44d Author: Colin Ian King Date: Tue Jun 28 22:25:25 2022 +0100 nfsd: remove redundant assignment to variable len [ Upstream commit 842e00ac3aa3b4a4f7f750c8ab54f8578fc875d3 ] Variable len is being assigned a value zero and this is never read, it is being re-assigned later. 
The assignment is redundant and can be removed. Cleans up clang scan-build warning: fs/nfsd/nfsctl.c:636:2: warning: Value stored to 'len' is never read Signed-off-by: Colin Ian King Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0a18cd2b946b544371fc1f0d6433c279f8c7e35a Author: Zhang Jiaming Date: Thu Jun 23 16:20:05 2022 +0800 NFSD: Fix space and spelling mistake [ Upstream commit f532c9ff103897be0e2a787c0876683c3dc39ed3 ] Add a blank space after ','. Change 'succesful' to 'successful'. Signed-off-by: Zhang Jiaming Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b5b79fc3ff4f74060eac092fd2ebd7046aa9788e Author: Benjamin Coddington Date: Mon Jun 13 09:40:06 2022 -0400 NLM: Defend against file_lock changes after vfs_test_lock() [ Upstream commit 184cefbe62627730c30282df12bcff9aae4816ea ] Instead of trusting that struct file_lock returns completely unchanged after vfs_test_lock() when there's no conflicting lock, stash away our nlm_lockowner reference so we can properly release it for all cases. This defends against another file_lock implementation overwriting fl_owner when the return type is F_UNLCK. Reported-by: Roberto Bergantinos Corpas Tested-by: Roberto Bergantinos Corpas Signed-off-by: Benjamin Coddington Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 16acc0677f801d98b2de7f83f0f3389ce36c2806 Author: Chuck Lever Date: Tue Jul 19 09:18:35 2022 -0400 SUNRPC: Fix xdr_encode_bool() [ Upstream commit c770f31d8f580ed4b965c64f924ec1cc50e41734 ] I discovered that xdr_encode_bool() was returning the same address that was passed in the @p parameter. The documenting comment states that the intent is to return the address of the next buffer location, just like the other "xdr_encode_*" helpers. The result was the encoded results of NFSv3 PATHCONF operations were not formed correctly. 
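The contract described in the xdr_encode_bool() entry above, where an encoder writes one XDR word and returns the address of the next buffer location, can be illustrated with a user-space sketch. The function name sketch_encode_bool() is hypothetical, and htonl() stands in for the kernel's big-endian conversion.

```c
#include <arpa/inet.h>
#include <stdint.h>

/*
 * Encode a boolean as one big-endian 32-bit XDR word at @p.
 * Returns the address of the next word, so calls can be chained.
 * Returning @p itself would make the next encoder overwrite this word.
 */
static uint32_t *sketch_encode_bool(uint32_t *p, int value)
{
	*p = htonl(value ? 1u : 0u);
	return p + 1;
}
```

A caller that encodes several consecutive booleans, as a PATHCONF-style reply does, relies on each call advancing the pointer by exactly one word.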
Fixes: ded04a587f6c ("NFSD: Update the NFSv3 PATHCONF3res encoder to use struct xdr_stream") Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bcaac325dd955ba808f507a0baf0aaf444e1db50 Author: Jeff Layton Date: Fri Jul 29 17:01:07 2022 -0400 nfsd: eliminate the NFSD_FILE_BREAK_* flags [ Upstream commit 23ba98de6dcec665e15c0ca19244379bb0d30932 ] We had a report from the spring Bake-a-thon of data corruption in some nfstest_interop tests. Looking at the traces showed the NFS server allowing a v3 WRITE to proceed while a read delegation was still outstanding. Currently, we only set NFSD_FILE_BREAK_* flags if NFSD_MAY_NOT_BREAK_LEASE was set when we call nfsd_file_alloc. NFSD_MAY_NOT_BREAK_LEASE was intended to be set when finding files for COMMIT ops, where we need a writeable filehandle but don't need to break read leases. It doesn't make any sense to consult that flag when allocating a file since the file may be used on subsequent calls where we do want to break the lease (and the usage of it here seems to be reverse from what it should be anyway). Also, after calling nfsd_open_break_lease, we don't want to clear the BREAK_* bits. A lease could end up being set on it later (more than once) and we need to be able to break those leases as well. This means that the NFSD_FILE_BREAK_* flags now just mirror NFSD_MAY_{READ,WRITE} flags, so there's no need for them at all. Just drop those flags and unconditionally call nfsd_open_break_lease every time. 
Reported-by: Olga Kornieskaia Link: https://bugzilla.redhat.com/show_bug.cgi?id=2107360 Fixes: 65294c1f2c5e (nfsd: add a new struct file caching facility to nfsd) Cc: # 5.4.x : bb283ca18d1e NFSD: Clean up the show_nf_flags() macro Cc: # 5.4.x Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 302ae1fb80a3689397747cf7da06d70ea9fb5aab Author: Xin Gao Date: Sat Jul 23 03:46:39 2022 +0800 fsnotify: Fix comment typo [ Upstream commit feee1ce45a5666bbdb08c5bb2f5f394047b1915b ] The double `if' is duplicated in line 104, remove one. Signed-off-by: Xin Gao Signed-off-by: Jan Kara Link: https://lore.kernel.org/r/20220722194639.18545-1-gaoxin@cdjrlc.com Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 85c640adf9fc204eb770d9789ea689bcc56d288a Author: Amir Goldstein Date: Wed Jun 29 17:42:10 2022 +0300 fanotify: introduce FAN_MARK_IGNORE [ Upstream commit e252f2ed1c8c6c3884ab5dd34e003ed21f1fe6e0 ] This flag is a new way to configure ignore mask which allows adding and removing the event flags FAN_ONDIR and FAN_EVENT_ON_CHILD in ignore mask. The legacy FAN_MARK_IGNORED_MASK flag would always ignore events on directories and would ignore events on children depending on whether the FAN_EVENT_ON_CHILD flag was set in the (non ignored) mask. FAN_MARK_IGNORE can be used to ignore events on children without setting FAN_EVENT_ON_CHILD in the mark's mask and will not ignore events on directories unconditionally, only when FAN_ONDIR is set in ignore mask. The new behavior is non-downgradable. After calling fanotify_mark() with FAN_MARK_IGNORE once, calling fanotify_mark() with FAN_MARK_IGNORED_MASK on the same object will return EEXIST error. Setting the event flags with FAN_MARK_IGNORE on a non-dir inode mark has no meaning and will return ENOTDIR error. The meaning of FAN_MARK_IGNORED_SURV_MODIFY is preserved with the new FAN_MARK_IGNORE flag, but with a few semantic differences: 1. 
FAN_MARK_IGNORED_SURV_MODIFY is required for filesystem and mount marks and on an inode mark on a directory. Omitting this flag will return EINVAL or EISDIR error. 2. An ignore mask on a non-directory inode that survives modify could never be downgraded to an ignore mask that does not survive modify. With new FAN_MARK_IGNORE semantics we make that rule explicit - trying to update a surviving ignore mask without the flag FAN_MARK_IGNORED_SURV_MODIFY will return EEXIST error. The convenience macro FAN_MARK_IGNORE_SURV is added for (FAN_MARK_IGNORE | FAN_MARK_IGNORED_SURV_MODIFY), because the common case should use short constant names. Link: https://lore.kernel.org/r/20220629144210.2983229-4-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 99a022c4bcbbcc2a9c53c706353081c542518bfe Author: Amir Goldstein Date: Wed Jun 29 17:42:09 2022 +0300 fanotify: cleanups for fanotify_mark() input validations [ Upstream commit 8afd7215aa97f8868d033f6e1d01a276ab2d29c0 ] Create helper fanotify_may_update_existing_mark() for checking for conflicts between existing mark flags and fanotify_mark() flags. Use variable mark_cmd to make the checks for mark command bits cleaner. Link: https://lore.kernel.org/r/20220629144210.2983229-3-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b8d06d1187961381e0c40c70f27f3ff79c323e3a Author: Amir Goldstein Date: Wed Jun 29 17:42:08 2022 +0300 fanotify: prepare for setting event flags in ignore mask [ Upstream commit 31a371e419c885e0f137ce70395356ba8639dc52 ] Setting flags FAN_ONDIR FAN_EVENT_ON_CHILD in ignore mask has no effect. The FAN_EVENT_ON_CHILD flag in mask implicitly applies to ignore mask and ignore mask is always implicitly applied to events on directories.
Define a mark flag that replaces this legacy behavior with logic of applying the ignore mask according to event flags in ignore mask. Implement the new logic to prepare for supporting an ignore mask that ignores events on children and ignore mask that does not ignore events on directories. To emphasize the change in terminology, also rename ignored_mask mark member to ignore_mask and use accessors to get only the effective ignored events or the ignored events and flags. This change in terminology finally aligns with the "ignore mask" language in man pages and in most of the comments. Link: https://lore.kernel.org/r/20220629144210.2983229-2-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 71860cc4e4365496f748c999ef2a6915005d5b3a Author: Oliver Ford Date: Wed May 18 15:59:59 2022 +0100 fs: inotify: Fix typo in inotify comment [ Upstream commit c05787b4c2f80a3bebcb9cdbf255d4fa5c1e24e1 ] Correct spelling in comment. Signed-off-by: Oliver Ford Signed-off-by: Jan Kara Link: https://lore.kernel.org/r/20220518145959.41-1-ojford@gmail.com Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 795f9fa1b50b6578eeb16c98cf90d50cd9a0961c Author: Jeff Layton Date: Mon Jul 11 14:30:14 2022 -0400 lockd: fix nlm_close_files [ Upstream commit 1197eb5906a5464dbaea24cac296dfc38499cc00 ] This loop condition tries a bit too hard to be clever. Just test for the two indices we care about explicitly. Cc: J. Bruce Fields Fixes: 7f024fcd5c97 ("Keep read and write fds with each nlm_file") Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 486c1acf14233b1ff7b37e6f026a737a2f7f53f1 Author: Jeff Layton Date: Mon Jul 11 14:30:13 2022 -0400 lockd: set fl_owner when unlocking files [ Upstream commit aec158242b87a43d83322e99bc71ab4428e5ab79 ] Unlocking a POSIX lock on an inode with vfs_lock_file only works if the owner matches. Ensure we set it in the request. Cc: J. 
Bruce Fields Fixes: 7f024fcd5c97 ("Keep read and write fds with each nlm_file") Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 845b309cf58604925de6fd3e4cc11c9d788ca0f7 Author: Chuck Lever Date: Sun Jul 10 14:46:04 2022 -0400 NFSD: Decode NFSv4 birth time attribute [ Upstream commit 5b2f3e0777da2a5dd62824bbe2fdab1d12caaf8f ] NFSD has advertised support for the NFSv4 time_create attribute since commit e377a3e698fb ("nfsd: Add support for the birth time attribute"). Igor Mammedov reports that Mac OS clients attempt to set the NFSv4 birth time attribute via OPEN(CREATE) and SETATTR if the server indicates that it supports it, but since the above commit was merged, those attempts now fail. Table 5 in RFC 8881 lists the time_create attribute as one that can be both set and retrieved, but the above commit did not add server support for clients to provide a time_create attribute. IMO that's a bug in our implementation of the NFSv4 protocol, which this commit addresses. Whether NFSD silently ignores the new birth time or actually sets it is another matter. I haven't found another filesystem service in the Linux kernel that enables users or clients to modify a file's birth time attribute. This commit reflects my (perhaps incorrect) understanding of whether Linux users can set a file's birth time. NFSD will now recognize a time_create attribute but it ignores its value. It clears the time_create bit in the returned attribute bitmask to indicate that the value was not used. Reported-by: Igor Mammedov Fixes: e377a3e698fb ("nfsd: Add support for the birth time attribute") Tested-by: Igor Mammedov Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 58f985d688aad7e3ee970e43377dfe8a79cf594c Author: NeilBrown Date: Thu Jun 23 14:47:34 2022 +1000 NFS: restore module put when manager exits. 
[ Upstream commit 080abad71e99d2becf38c978572982130b927a28 ] Commit f49169c97fce ("NFSD: Remove svc_serv_ops::svo_module") removed calls to module_put_and_kthread_exit() from threads that acted as SUNRPC servers and had a related svc_serv_ops structure. This was correct. It ALSO removed the module_put_and_kthread_exit() call from nfs4_run_state_manager() which is NOT a SUNRPC service. Consequently every time the NFSv4 state manager runs the module count increments and won't be decremented. So the nfsv4 module cannot be unloaded. So restore the module_put_and_kthread_exit() call. Fixes: f49169c97fce ("NFSD: Remove svc_serv_ops::svo_module") Signed-off-by: NeilBrown Signed-off-by: Anna Schumaker Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e9156a243175cd4f4b574709122dcef4cc14d829 Author: Amir Goldstein Date: Mon Jun 27 20:47:19 2022 +0300 fanotify: refine the validation checks on non-dir inode mask [ Upstream commit 8698e3bab4dd7968666e84e111d0bfd17c040e77 ] Commit ceaf69f8eadc ("fanotify: do not allow setting dirent events in mask of non-dir") added restrictions about setting dirent events in the mask of a non-dir inode mark, which does not make any sense. For backward compatibility, these restrictions were added only to new (v5.17+) APIs. It also does not make any sense to set the flags FAN_EVENT_ON_CHILD or FAN_ONDIR in the mask of a non-dir inode. Add these flags to the dir-only restriction of the new APIs as well. Move the check of the dir-only flags for new APIs into the helper fanotify_events_supported(), which is only called for FAN_MARK_ADD, because there is no need to error on an attempt to remove the dir-only flags from a non-dir inode.
Fixes: ceaf69f8eadc ("fanotify: do not allow setting dirent events in mask of non-dir") Link: https://lore.kernel.org/linux-fsdevel/20220627113224.kr2725conevh53u4@quack3.lan/ Link: https://lore.kernel.org/r/20220627174719.2838175-1-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6943f1073abe910e7cd8c1383e7f330c4111b7a0 Author: Chuck Lever Date: Tue Jun 7 16:47:58 2022 -0400 SUNRPC: Optimize xdr_reserve_space() [ Upstream commit 62ed448cc53b654036f7d7f3c99f299d79ad14c3 ] Transitioning between encode buffers is quite infrequent. It happens about 1 time in 400 calls to xdr_reserve_space(), measured on NFSD with a typical build/test workload. Force the compiler to remove that code from xdr_reserve_space(), which is a hot path on both the server and the client. This change reduces the size of xdr_reserve_space() from 10 cache lines to 2 when compiled with -Os. Signed-off-by: Chuck Lever Reviewed-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ada1757b259f353cade47037ee0a0249b4cddad3 Author: Chuck Lever Date: Tue May 31 19:49:01 2022 -0400 NFSD: Fix potential use-after-free in nfsd_file_put() [ Upstream commit b6c71c66b0ad8f2b59d9bc08c7a5079b110bec01 ] nfsd_file_put_noref() can free @nf, so don't dereference @nf immediately upon return from nfsd_file_put_noref(). Suggested-by: Trond Myklebust Fixes: 999397926ab3 ("nfsd: Clean up nfsd_file_put()") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4862b618860332d23218f2caef372d1a6d251e97 Author: Chuck Lever Date: Wed May 11 13:02:21 2022 -0400 NFSD: nfsd_file_put() can sleep [ Upstream commit 08af54b3e5729bc1d56ad3190af811301bdc37a1 ] Now that there are no more callers of nfsd_file_put() that might hold a spin lock, ensure the lockdep infrastructure can catch newly introduced calls to nfsd_file_put() made while a spinlock is held. 
Link: https://lore.kernel.org/linux-nfs/ece7fd1d-5fb3-5155-54ba-347cfc19bd9a@oracle.com/T/#mf1855552570cf9a9c80d1e49d91438cd9085aada Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 06252d1bd57ab7f1cc9c9e8b94d158a059e63d83 Author: Chuck Lever Date: Sun May 22 12:34:38 2022 -0400 NFSD: Add documenting comment for nfsd4_release_lockowner() [ Upstream commit 043862b09cc00273e35e6c3a6389957953a34207 ] And return explicit nfserr values that match what is documented in the new comment / API contract. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 345e2e48d8df5941ceddf283cf4977c13620c669 Author: Chuck Lever Date: Sun May 22 12:07:18 2022 -0400 NFSD: Modernize nfsd4_release_lockowner() [ Upstream commit bd8fdb6e545f950f4654a9a10d7e819ad48146e5 ] Refactor: Use existing helpers that other lock operations use. This change removes several automatic variables, so re-organize the variable declarations for readability. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 13459d22256aa189ea6de1a8fbf9415b2fafb988 Author: Julian Schroeder Date: Mon May 23 18:52:26 2022 +0000 nfsd: destroy percpu stats counters after reply cache shutdown [ Upstream commit fd5e363eac77ef81542db77ddad0559fa0f9204e ] Upon nfsd shutdown any pending DRC cache is freed. DRC cache use is tracked via a percpu counter. In the current code the percpu counter is destroyed before the cache is freed. If any pending cache is still present, percpu_counter_add is called with a percpu counter==NULL. This causes a kernel crash. The solution is to destroy the percpu counter after the cache is freed.
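The ordering problem described above can be sketched in plain userspace C. This is only an illustrative model, not the nfsd code: struct counter, struct reply_cache and every function name below are hypothetical stand-ins for the percpu counter and the DRC, and the assert() marks the point where the kernel would crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel percpu counter. */
struct counter {
	long *value;	/* NULL once the counter has been destroyed */
};

int counter_init(struct counter *c)
{
	c->value = calloc(1, sizeof(*c->value));
	return c->value ? 0 : -1;
}

void counter_add(struct counter *c, long delta)
{
	/* Updating a destroyed counter models the crash described above. */
	assert(c->value != NULL);
	*c->value += delta;
}

void counter_destroy(struct counter *c)
{
	free(c->value);
	c->value = NULL;
}

/* Hypothetical reply cache that accounts its memory use in a counter. */
struct reply_cache {
	struct counter *mem_used;
	long pending_bytes;
};

void reply_cache_shutdown(struct reply_cache *rc)
{
	/* Freeing pending entries still updates the counter. */
	counter_add(rc->mem_used, -rc->pending_bytes);
	rc->pending_bytes = 0;
}

/* Safe order: free the cache first, destroy the counter last. */
void server_shutdown(struct reply_cache *rc, struct counter *c)
{
	reply_cache_shutdown(rc);
	counter_destroy(c);
}
```

Swapping the two calls in server_shutdown() would trip the assert whenever pending entries remain, which mirrors the failure this commit addresses.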
Fixes: e567b98ce9a4b (“nfsd: protect concurrent access to nfsd stats counters”) Signed-off-by: Julian Schroeder Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 15081df04a6ec7ff7adc03d897751c048395fa18 Author: Zhang Xiaoxu Date: Sat May 21 12:08:45 2022 +0800 nfsd: Fix null-ptr-deref in nfsd_fill_super() [ Upstream commit 6f6f84aa215f7b6665ccbb937db50860f9ec2989 ] KASAN reports a null-ptr-deref as follows: BUG: KASAN: null-ptr-deref in nfsd_fill_super+0xc6/0xe0 [nfsd] Write of size 8 at addr 000000000000005d by task a.out/852 CPU: 7 PID: 852 Comm: a.out Not tainted 5.18.0-rc7-dirty #66 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 Call Trace: dump_stack_lvl+0x34/0x44 kasan_report+0xab/0x120 ? nfsd_mkdir+0x71/0x1c0 [nfsd] ? nfsd_fill_super+0xc6/0xe0 [nfsd] nfsd_fill_super+0xc6/0xe0 [nfsd] ? nfsd_mkdir+0x1c0/0x1c0 [nfsd] get_tree_keyed+0x8e/0x100 vfs_get_tree+0x41/0xf0 __do_sys_fsconfig+0x590/0x670 ? fscontext_read+0x180/0x180 ? anon_inode_getfd+0x4f/0x70 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae This can be reproduced by concurrent operations: 1. fsopen(nfsd)/fsconfig 2. insmod/rmmod nfsd Since the nfsd file system is registered before nfsd_net is allocated, a caller may get the file_system_type and use the nfsd_net before it is allocated, causing the null-ptr-deref. So init_nfsd() should call register_filesystem() last. Fixes: bd5ae9288d64 ("nfsd: register pernet ops last, unregister first") Signed-off-by: Zhang Xiaoxu Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ff4e7a4b497a437c9091efcf5d4e44362a9fb961 Author: Zhang Xiaoxu Date: Sat May 21 12:08:44 2022 +0800 nfsd: Unregister the cld notifier when laundry_wq create failed [ Upstream commit 62fdb65edb6c43306c774939001f3a00974832aa ] If laundry_wq creation fails, the cld notifier should be unregistered.
Signed-off-by: Zhang Xiaoxu Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e1e87709c4539a0cd11cf0fbca737c861f20efcf Author: Chuck Lever Date: Fri Apr 29 10:06:21 2022 -0400 SUNRPC: Use RMW bitops in single-threaded hot paths [ Upstream commit 28df0988815f63e2af5e6718193c9f68681ad7ff ] I noticed CPU pipeline stalls while using perf. Once an svc thread is scheduled and executing an RPC, no other processes will touch svc_rqst::rq_flags. Thus bus-locked atomics are not needed outside the svc thread scheduler. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f7a1ecf2aa4b1d37854d0d3c8d64351e384ba314 Author: Chuck Lever Date: Sun Mar 27 16:43:03 2022 -0400 NFSD: Clean up the show_nf_flags() macro [ Upstream commit bb283ca18d1e67c82d22a329c96c9d6036a74790 ] The flags are defined using C macros, so TRACE_DEFINE_ENUM is unnecessary. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7b8462f22a63a9ed341df3aa6dd9af437bb0c5ef Author: Chuck Lever Date: Sun Mar 27 16:42:20 2022 -0400 NFSD: Trace filecache opens [ Upstream commit 0122e882119ddbd9efa6edfeeac3f5c704a7aeea ] Instrument calls to nfsd_open_verified() to get a sense of the filecache hit rate. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a38be004749649badfbcfad967b557a65258e2a2 Author: Chuck Lever Date: Wed Mar 23 13:55:37 2022 -0400 NFSD: Move documenting comment for nfsd4_process_open2() [ Upstream commit 7e2ce0cc15a509b859199235a2bad9cece00f67a ] Clean up nfsd4_open() by converting a large comment at the only call site for nfsd4_process_open2() to a kerneldoc comment in front of that function. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bfe9aab120b236c18d8e3a575d4d5b9cdf5dec55 Author: Chuck Lever Date: Mon Mar 21 16:41:32 2022 -0400 NFSD: Fix whitespace [ Upstream commit 26320d7e317c37404c811603d50d811132aef78c ] Clean up: Pull case arms back one tab stop to conform to every other switch statement in fs/nfsd/nfs4proc.c.
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2805c5439c9587eb263aa85f16f259f215a3fd9b Author: Chuck Lever Date: Wed Mar 30 14:28:51 2022 -0400 NFSD: Remove dprintk call sites from tail of nfsd4_open() [ Upstream commit f67a16b147045815b6aaafeef8663e5faeb6d569 ] Clean up: These relics are not likely to benefit server administrators. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c20097329d2c196b818c4666c7820c1378d69d61 Author: Chuck Lever Date: Wed Mar 30 10:30:54 2022 -0400 NFSD: Instantiate a struct file when creating a regular NFSv4 file [ Upstream commit fb70bf124b051d4ded4ce57511dfec6d3ebf2b43 ] There have been reports of races that cause NFSv4 OPEN(CREATE) to return an error even though the requested file was created. NFSv4 does not provide a status code for this case. To mitigate some of these problems, reorganize the NFSv4 OPEN(CREATE) logic to allocate resources before the file is actually created, and open the new file while the parent directory is still locked. Two new APIs are added: + Add an API that works like nfsd_file_acquire() but does not open the underlying file. The OPEN(CREATE) path can use this API when it already has an open file. + Add an API that is kin to dentry_open(). NFSD needs to create a file and grab an open "struct file *" atomically. The alloc_empty_file() has to be done before the inode create. If it fails (for example, because the NFS server has exceeded its max_files limit), we avoid creating the file and can still return an error to the NFS client. 
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=382 Signed-off-by: Chuck Lever Tested-by: JianHong Yin [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d8714bda3f692de172b0226f35b607df63a8c5a4 Author: Chuck Lever Date: Sun Mar 27 16:46:47 2022 -0400 NFSD: Clean up nfsd_open_verified() [ Upstream commit f4d84c52643ae1d63a8e73e2585464470e7944d1 ] Its only caller always passes S_IFREG as the @type parameter. As an additional clean-up, add a kerneldoc comment. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 274fd0f9c26105daac1991ee55243e08e0696d90 Author: Chuck Lever Date: Mon Mar 28 15:36:58 2022 -0400 NFSD: Remove do_nfsd_create() [ Upstream commit 1c388f27759c5d9271d4fca081f7ee138986eb7d ] Now that its two callers have their own version-specific instance of this function, do_nfsd_create() is no longer used. [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 66af1db0cc37870629d71bb1a1577173e473a2d6 Author: Chuck Lever Date: Mon Mar 28 14:47:34 2022 -0400 NFSD: Refactor NFSv4 OPEN(CREATE) [ Upstream commit 254454a5aa4a9f696d6bae080c08d5863e650f49 ] Copy do_nfsd_create() to nfs4proc.c and remove NFSv3-specific logic. [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a019add1b4561f42e867535f2bc186168d5298e0 Author: Chuck Lever Date: Mon Mar 28 13:29:23 2022 -0400 NFSD: Refactor NFSv3 CREATE [ Upstream commit df9606abddfb01090d5ece7dcc2441d848f690f0 ] The NFSv3 CREATE and NFSv4 OPEN(CREATE) use cases are about to diverge such that it makes sense to split do_nfsd_create() into one version for NFSv3 and one for NFSv4. As a first step, copy do_nfsd_create() to nfs3proc.c and remove NFSv4-specific logic. One immediate legibility benefit is that the logic for handling NFSv3 createhow is now quite straightforward. 
NFSv4 createhow has some subtleties that IMO do not belong in generic code. [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a132795b61fe113a3588015c5a0c0ff621e4f9c6 Author: Chuck Lever Date: Mon Mar 28 16:10:17 2022 -0400 NFSD: Refactor nfsd_create_setattr() [ Upstream commit 5f46e950c395b9c14c282b53ba78c5fd46d6c256 ] I'd like to move do_nfsd_create() out of vfs.c. Therefore nfsd_create_setattr() needs to be made publicly visible. Note that both call sites in vfs.c commit both the new object and its parent directory, so just combine those common metadata commits into nfsd_create_setattr(). Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ee0742a93ccba786bc06e00bb0d10a5123acd3ba Author: Chuck Lever Date: Mon Mar 28 10:16:42 2022 -0400 NFSD: Avoid calling fh_drop_write() twice in do_nfsd_create() [ Upstream commit 14ee45b70dd0d9ae76fb066cd8c0652d657353f6 ] Clean up: The "out" label already invokes fh_drop_write(). Note that fh_drop_write() is already careful not to invoke mnt_drop_write() if either it has already been done or there is nothing to drop. Therefore no change in behavior is expected. [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 304505e2e89c45bfe9d78b1ff0296f6abbb26f00 Author: Chuck Lever Date: Fri Mar 25 14:47:54 2022 -0400 NFSD: Clean up nfsd3_proc_create() [ Upstream commit e61568599c9ad638fdaba150fee07d7065e31851 ] As near as I can tell, mode bit masking and setting S_IFREG is already done by do_nfsd_create() and vfs_create(). The NFSv4 path (do_open_lookup), for example, does not bother with this special processing. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c6207942b2554bbc9a8b4414457ff312f214c6cc Author: Dai Ngo Date: Mon May 2 14:19:27 2022 -0700 NFSD: Show state of courtesy client in client info [ Upstream commit e9488d5ae13c0a72223c507e2508dc2ac66cad4f ] Update client_info_show to show state of courtesy client and seconds since last renew. Reviewed-by: J. Bruce Fields Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4a39f029e7e3436d9b6be1900a2a89bda01fcd69 Author: Dai Ngo Date: Mon May 2 14:19:26 2022 -0700 NFSD: add support for lock conflict to courteous server [ Upstream commit 27431affb0dbc259ac6ffe6071243a576c8f38f1 ] This patch allows expired client with lock state to be in COURTESY state. Lock conflict with COURTESY client is resolved by the fs/lock code using the lm_lock_expirable and lm_expire_lock callback in the struct lock_manager_operations. If conflict client is in COURTESY state, set it to EXPIRABLE and schedule the laundromat to run immediately to expire the client. The callback lm_expire_lock waits for the laundromat to flush its work queue before returning to caller. Reviewed-by: J. Bruce Fields Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 97f77d7d501bee5bf8dd96b6c0b3f666b5b5bc99 Author: Dai Ngo Date: Mon May 2 14:19:25 2022 -0700 fs/lock: add 2 callbacks to lock_manager_operations to resolve conflict [ Upstream commit 2443da2259e97688f93d64d17ab69b15f466078a ] Add 2 new callbacks, lm_lock_expirable and lm_expire_lock, to lock_manager_operations to allow the lock manager to take appropriate action to resolve the lock conflict if possible. A new field, lm_mod_owner, is also added to lock_manager_operations. The lm_mod_owner is used by the fs/lock code to make sure the lock manager module such as nfsd, is not freed while lock conflict is being resolved. lm_lock_expirable checks and returns true to indicate that the lock conflict can be resolved else return false. 
This callback must be called with the flc_lock held so it can not block. lm_expire_lock is called to resolve the lock conflict if the returned value from lm_lock_expirable is true. This callback is called without the flc_lock held since it's allowed to block. Upon returning from this callback, the lock conflict should be resolved and the caller is expected to restart the conflict check from the beginning of the list. A lock manager, such as the NFSv4 courteous server, uses this callback to resolve the conflict by destroying the lock owner, or the NFSv4 courtesy client (a client that has expired but is allowed to maintain its state) that owns the lock. Reviewed-by: J. Bruce Fields Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit eb2eb6b6afdf602b779824ce4e3629ae234ca67a Author: Dai Ngo Date: Mon May 2 14:19:24 2022 -0700 fs/lock: add helper locks_owner_has_blockers to check for blockers [ Upstream commit 591502c5cb325b1c6ec59ab161927d606b918aa0 ] Add helper locks_owner_has_blockers to check if there are any blockers for a given lockowner. Reviewed-by: J. Bruce Fields Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 461d0b57c9f31335d22c9e91e33f9a6c68b85440 Author: Dai Ngo Date: Mon May 2 14:19:23 2022 -0700 NFSD: move create/destroy of laundry_wq to init_nfsd and exit_nfsd [ Upstream commit d76cc46b37e123e8d245cc3490978dbda56f979d ] This patch moves create/destroy of laundry_wq from nfs4_state_start and nfs4_state_shutdown_net to init_nfsd and exit_nfsd to prevent the laundromat from being freed while a thread is processing a conflicting lock. Reviewed-by: J.
Bruce Fields Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a26848e2bcc9c40b2f9791f6d827504c513a8307 Author: Dai Ngo Date: Mon May 2 14:19:22 2022 -0700 NFSD: add support for share reservation conflict to courteous server [ Upstream commit 3d69427151806656abf129342028f3f4e5e1fee0 ] This patch allows an expired client with open state to be in COURTESY state. A share/access conflict with a COURTESY client is resolved by setting the COURTESY client to EXPIRABLE state, scheduling the laundromat to run, and returning nfserr_jukebox to the requesting client. Reviewed-by: J. Bruce Fields Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 67ef9e5fd737eab2495f2586df7e9ea30caa1b77 Author: Dai Ngo Date: Mon May 2 14:19:21 2022 -0700 NFSD: add courteous server support for thread with only delegation [ Upstream commit 66af25799940b26efd41ea6e648f75c41a48a2c2 ] This patch provides courteous server support for delegation only. Only an expired client with delegation but no conflict and no open or lock state is allowed to be in COURTESY state. A delegation conflict with a COURTESY/EXPIRABLE client is resolved by setting it to EXPIRABLE, queuing work for the laundromat, and returning delay to the caller. The conflict is resolved when the laundromat runs and expires the EXPIRABLE client while the NFS client retries the OPEN request. Local thread request that gets conflict is doing the retry in _break_lease. A client in COURTESY or EXPIRABLE state is allowed to reconnect and continues to have access to its state. Access to the nfs4_client by the reconnecting thread and the laundromat is serialized via the client_lock. Reviewed-by: J.
Bruce Fields Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bf1cbe2f3650b4f4a8add6af933c6d7f6af1f361 Author: Chuck Lever Date: Thu Apr 7 16:48:24 2022 -0400 NFSD: Clean up nfsd_splice_actor() [ Upstream commit 91e23b1c39820bfed642119ff6b6ef9f43cf09ce ] nfsd_splice_actor() checks that the page being spliced does not match the previous element in the svc_rqst::rq_pages array. We believe this is to prevent a double put_page() in cases where the READ payload is partially contained in the xdr_buf's head buffer. However, the NFSD READ proc functions no longer place any part of the READ payload in the head buffer, in order to properly support NFS/RDMA READ with Write chunks. Therefore, simplify the logic in nfsd_splice_actor() to remove this unnecessary check. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2723d479f51f2afab7fa0db2aef1948635b95ef7 Author: Vasily Averin Date: Sun May 22 15:08:02 2022 +0300 fanotify: fix incorrect fmode_t casts [ Upstream commit dccd855771b37820b6d976a99729c88259549f85 ] Fixes sparse warnings: fs/notify/fanotify/fanotify_user.c:267:63: sparse: warning: restricted fmode_t degrades to integer fs/notify/fanotify/fanotify_user.c:1351:28: sparse: warning: restricted fmode_t degrades to integer FMODE_NONOTIFY has bitwise fmode_t type and requires the __force attribute for any casts.
Signed-off-by: Vasily Averin Reviewed-by: Christian Brauner (Microsoft) Reviewed-by: Christoph Hellwig Signed-off-by: Jan Kara Link: https://lore.kernel.org/r/9adfd6ac-1b89-791e-796b-49ada3293985@openvz.org Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4cd725129e65e700d7abc5b5dff34a841f69a65d Author: Amir Goldstein Date: Wed May 11 22:02:13 2022 +0300 fsnotify: consistent behavior for parent not watching children [ Upstream commit e730558adffb88a52e562db089e969ee9510184a ] The logic for handling events on child in groups that have a mark on the parent inode, but without FS_EVENT_ON_CHILD flag in the mask is duplicated in several places and inconsistent. Move the logic into the preparation of mark type iterator, so that the parent mark type will be excluded from all mark type iterations in that case. This results in several subtle changes of behavior, hopefully all desired changes of behavior, for example: - Group A has a mount mark with FS_MODIFY in mask - Group A has a mark with ignore mask that does not survive FS_MODIFY and does not watch children on directory D. - Group B has a mark with FS_MODIFY in mask that does watch children on directory D. 
- FS_MODIFY event on file D/foo should not clear the ignore mask of group A, but before this change it does And if group A ignore mask was set to survive FS_MODIFY: - FS_MODIFY event on file D/foo should be reported to group A on account of the mount mark, but before this change it is wrongly ignored Fixes: 2f02fd3fa13e ("fanotify: fix ignore mask logic for events on child and on dir") Reported-by: Jan Kara Link: https://lore.kernel.org/linux-fsdevel/20220314113337.j7slrb5srxukztje@quack3.lan/ Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Link: https://lore.kernel.org/r/20220511190213.831646-3-amir73il@gmail.com Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e3bce57ffc7b020f983a5fcd22cd06ddd3520916 Author: Amir Goldstein Date: Wed May 11 22:02:12 2022 +0300 fsnotify: introduce mark type iterator [ Upstream commit 14362a2541797cf9df0e86fb12dcd7950baf566e ] fsnotify_foreach_iter_mark_type() is used to reduce boilerplate code of iterating all marks of a specific group interested in an event by consulting the iterator report_mask. Use an open coded version of that iterator in fsnotify_iter_next() that collects all marks of the current iteration group without consulting the iterator report_mask. At the moment, the two iterator variants are the same, but this decoupling will allow us to exclude some of the group's marks from reporting the event, for example for event on child and inode marks on parent did not request to watch events on children. 
Fixes: 2f02fd3fa13e ("fanotify: fix ignore mask logic for events on child and on dir") Reported-by: Jan Kara Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Link: https://lore.kernel.org/r/20220511190213.831646-2-amir73il@gmail.com Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f6017a718b6398c6ddba8be33ebcb3ab96c83fb2 Author: Amir Goldstein Date: Fri Apr 22 15:03:27 2022 +0300 fanotify: enable "evictable" inode marks [ Upstream commit 5f9d3bd520261fd7a850818c71809fd580e0f30c ] Now that the direct reclaim path is handled we can enable evictable inode marks. Link: https://lore.kernel.org/r/20220422120327.3459282-17-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3083d602ba91d374a0baf87e56e8733c7c507e70 Author: Amir Goldstein Date: Fri Apr 22 15:03:26 2022 +0300 fanotify: use fsnotify group lock helpers [ Upstream commit e79719a2ca5c61912c0493bc1367db52759cf6fd ] Direct reclaim from fanotify mark allocation context may try to evict inodes with evictable marks of the same group and hit this deadlock: [<0>] fsnotify_destroy_mark+0x1f/0x3a [<0>] fsnotify_destroy_marks+0x71/0xd9 [<0>] __destroy_inode+0x24/0x7e [<0>] destroy_inode+0x2c/0x67 [<0>] dispose_list+0x49/0x68 [<0>] prune_icache_sb+0x5b/0x79 [<0>] super_cache_scan+0x11c/0x16f [<0>] shrink_slab.constprop.0+0x23e/0x40f [<0>] shrink_node+0x218/0x3e7 [<0>] do_try_to_free_pages+0x12a/0x2d2 [<0>] try_to_free_pages+0x166/0x242 [<0>] __alloc_pages_slowpath.constprop.0+0x30c/0x903 [<0>] __alloc_pages+0xeb/0x1c7 [<0>] cache_grow_begin+0x6f/0x31e [<0>] fallback_alloc+0xe0/0x12d [<0>] ____cache_alloc_node+0x15a/0x17e [<0>] kmem_cache_alloc_trace+0xa1/0x143 [<0>] fanotify_add_mark+0xd5/0x2b2 [<0>] do_fanotify_mark+0x566/0x5eb [<0>] __x64_sys_fanotify_mark+0x21/0x24 [<0>] do_syscall_64+0x6d/0x80 [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae Set the FSNOTIFY_GROUP_NOFS flag to prevent going into direct reclaim from 
allocations under fanotify group lock and use the safe group lock helpers. Link: https://lore.kernel.org/r/20220422120327.3459282-16-amir73il@gmail.com Suggested-by: Jan Kara Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/ Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f85d590059532ac2efe344ba0e3d30442527b5b3 Author: Amir Goldstein Date: Fri Apr 22 15:03:25 2022 +0300 fanotify: implement "evictable" inode marks [ Upstream commit 7d5e005d982527e4029b0139823d179986e34cdc ] When an inode mark is created with flag FAN_MARK_EVICTABLE, it will not pin the marked inode to inode cache, so when inode is evicted from cache due to memory pressure, the mark will be lost. When an inode mark with flag FAN_MARK_EVICTABLE is updated without using this flag, the marked inode is pinned to inode cache. When an inode mark is updated with flag FAN_MARK_EVICTABLE but an existing mark already has the inode pinned, the mark update fails with error EEXIST. Evictable inode marks can be used to set up inode marks with ignored mask to suppress events from uninteresting files or directories in a lazy manner, upon receiving the first event, without having to iterate all the uninteresting files or directories beforehand. The evictable inode mark feature allows performing this lazy marks setup without exhausting the system memory with pinned inodes. This change does not enable the feature yet.
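The three update rules above can be written down as a small decision function. The following is a userspace sketch only, assuming a hypothetical struct inode_mark and a made-up MARK_EVICTABLE bit in place of FAN_MARK_EVICTABLE; it models the stated semantics and is not the fanotify implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value, for illustration only. */
#define MARK_EVICTABLE 0x1u

/* Hypothetical model of one inode mark. */
struct inode_mark {
	bool exists;		/* a mark is already attached to the inode */
	bool pins_inode;	/* the mark holds the inode in the inode cache */
};

/* Returns 0 on success or -EEXIST, following the rules described above. */
int mark_add_or_update(struct inode_mark *m, unsigned int flags)
{
	bool evictable = (flags & MARK_EVICTABLE) != 0;

	assert(m != NULL);

	if (!m->exists) {
		/* New mark: pin the inode unless the caller asked for evictable. */
		m->exists = true;
		m->pins_inode = !evictable;
		return 0;
	}

	/* An already pinned mark cannot be turned into an evictable one. */
	if (evictable && m->pins_inode)
		return -EEXIST;

	/* Updating an evictable mark without the flag pins the inode. */
	if (!evictable)
		m->pins_inode = true;

	return 0;
}
```

Note the asymmetry: the model allows an evictable mark to become pinned, but never the reverse, which is why the third rule reports EEXIST.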
Link: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiRDpuS=2uA6+ZUM7yG9vVU-u212tkunBmSnP_u=mkv=Q@mail.gmail.com/ Link: https://lore.kernel.org/r/20220422120327.3459282-15-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 80fb0ae4b14560ddadd67cef61250bd43c9320a7 Author: Amir Goldstein Date: Fri Apr 22 15:03:24 2022 +0300 fanotify: factor out helper fanotify_mark_update_flags() [ Upstream commit 8998d110835e3781ccd3f1ae061a590b4aaba911 ] Handle FAN_MARK_IGNORED_SURV_MODIFY flag change in a helper that is called after updating the mark mask. Replace the added and removed return values and help variables with bool recalc return values and help variable, which makes the code a bit easier to follow. Rename flags argument to fan_flags to emphasize the difference from mark->flags. Link: https://lore.kernel.org/r/20220422120327.3459282-14-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b9576077eee32f9e7ce6112b21aad63393f5cbdd Author: Amir Goldstein Date: Fri Apr 22 15:03:23 2022 +0300 fanotify: create helper fanotify_mark_user_flags() [ Upstream commit 4adce25ccfff215939ee465b8c0aa70526d5c352 ] To translate from fsnotify mark flags to user visible flags. Link: https://lore.kernel.org/r/20220422120327.3459282-13-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ff34ebaa6f6dc1eebce6a8d6f12a1566f33d00fe Author: Amir Goldstein Date: Fri Apr 22 15:03:22 2022 +0300 fsnotify: allow adding an inode mark without pinning inode [ Upstream commit c3638b5b13740fa31762d414bbce8b7a694e582a ] fsnotify_add_mark() and variants implicitly take a reference on inode when attaching a mark to an inode. Make that behavior opt-out with the mark flag FSNOTIFY_MARK_FLAG_NO_IREF. 
Instead of taking the inode reference when attaching connector to inode and dropping the inode reference when detaching connector from inode, take the inode reference on attach of the first mark that wants to hold an inode reference and drop the inode reference on detach of the last mark that wants to hold an inode reference. Backends can "upgrade" an existing mark to take an inode reference, but cannot "downgrade" a mark with inode reference to release the reference. This leaves the choice to the backend whether or not to pin the inode when adding an inode mark. This is intended to be used when adding a mark with ignored mask that is used for optimization in cases where group can afford getting unneeded events and reinstate the mark with ignored mask when inode is accessed again after being evicted. Link: https://lore.kernel.org/r/20220422120327.3459282-12-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3bd557cfdf991a5f050aeeca528b7899846bacea Author: Amir Goldstein Date: Fri Apr 22 15:03:21 2022 +0300 dnotify: use fsnotify group lock helpers [ Upstream commit aabb45fdcb31f00f1e7cae2bce83e83474a87c03 ] Before commit 9542e6a643fc6 ("nfsd: Containerise filecache laundrette") nfsd would close open files in direct reclaim context. There is no guarantee that other memory shrinkers don't do the same and no guarantee that future shrinkers won't do that. For example, if overlayfs implemented an inode cache, or fscache kept open files to cached objects, inode shrinkers could end up closing open files on the underlying fs. Direct reclaim from dnotify mark allocation context may try to close open files that have dnotify marks of the same group and hit a deadlock on mark_mutex. Set the FSNOTIFY_GROUP_NOFS flag to prevent going into direct reclaim from allocations under dnotify group lock and use the safe group lock helpers.
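The idea behind the "safe" group lock helpers can be modeled in userspace as a lock that also opens a "no filesystem reclaim" scope for groups carrying the NOFS flag. Everything in this sketch is an assumption made for illustration: GROUP_NOFS stands in for FSNOTIFY_GROUP_NOFS, a plain boolean stands in for the task's allocation scope, and the mutex is reduced to a single-threaded "locked" flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for FSNOTIFY_GROUP_NOFS. */
#define GROUP_NOFS 0x1u

/* Models the current task's "do not enter fs reclaim" allocation scope. */
static bool task_nofs;

struct group {
	unsigned int flags;
	bool locked;		/* single-threaded stand-in for mark_mutex */
	bool saved_nofs;	/* scope to restore on unlock */
};

void group_lock(struct group *g)
{
	assert(!g->locked);
	g->locked = true;
	if (g->flags & GROUP_NOFS) {
		/* Remember the previous scope, then forbid fs reclaim. */
		g->saved_nofs = task_nofs;
		task_nofs = true;
	}
}

void group_unlock(struct group *g)
{
	assert(g->locked);
	if (g->flags & GROUP_NOFS)
		task_nofs = g->saved_nofs;
	g->locked = false;
}

/* An allocation may recurse into filesystem reclaim only outside NOFS scope. */
bool alloc_may_reclaim_fs(void)
{
	return !task_nofs;
}
```

In this model an allocation made while a NOFS group is locked can never reach a shrinker that would try to take the same lock again, which is the deadlock the commit describes.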
Link: https://lore.kernel.org/r/20220422120327.3459282-11-amir73il@gmail.com Suggested-by: Jan Kara Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/ Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit cc1c875b6960fc372fdb9c6f889a7a1f2072bde1 Author: Amir Goldstein Date: Fri Apr 22 15:03:20 2022 +0300 nfsd: use fsnotify group lock helpers [ Upstream commit b8962a9d8cc2d8c93362e2f684091c79f702f6f3 ] Before commit 9542e6a643fc6 ("nfsd: Containerise filecache laundrette") nfsd would close open files in direct reclaim context and that could cause a deadlock when fsnotify mark allocation went into direct reclaim and nfsd shrinker tried to free existing fsnotify marks. To avoid issues like this in future code, set the FSNOTIFY_GROUP_NOFS flag on nfsd fsnotify group to prevent going into direct reclaim from fsnotify_add_inode_mark(). Link: https://lore.kernel.org/r/20220422120327.3459282-10-amir73il@gmail.com Suggested-by: Jan Kara Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/ Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c2c6ced500ad26fb4a30277dbe26932d78f65ae7 Author: Amir Goldstein Date: Fri Apr 22 15:03:18 2022 +0300 inotify: use fsnotify group lock helpers [ Upstream commit 642054b87058019be36033f73c3e48ffff1915aa ] inotify inode marks pin the inode so there is no need to set the FSNOTIFY_GROUP_NOFS flag. 
Link: https://lore.kernel.org/r/20220422120327.3459282-8-amir73il@gmail.com Suggested-by: Jan Kara Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/ Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f91ba4a49b6ee69f64b130f4324e7f1826ac14a3 Author: Amir Goldstein Date: Fri Apr 22 15:03:17 2022 +0300 fsnotify: create helpers for group mark_mutex lock [ Upstream commit 43b245a788e2d8f1bb742668a9bdace02fcb3e96 ] Create helpers to take and release the group mark_mutex lock. Define a flag FSNOTIFY_GROUP_NOFS in fsnotify_group that determines if the mark_mutex lock is fs reclaim safe or not. If not safe, the lock helpers take the lock and disable direct fs reclaim. In that case we annotate the mutex with a different lockdep class to express to lockdep that an allocation of mark of an fs reclaim safe group may take the group lock of another "NOFS" group to evict inodes. For now, converted only the callers in common code and no backend defines the NOFS flag. It is intended to be set by fanotify for evictable marks support. Link: https://lore.kernel.org/r/20220422120327.3459282-7-amir73il@gmail.com Suggested-by: Jan Kara Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/ Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 74f9be7f64ed0aff7b859ca23a21ee8b3fae6616 Author: Amir Goldstein Date: Fri Apr 22 15:03:16 2022 +0300 fsnotify: make allow_dups a property of the group [ Upstream commit f3010343d9e119da35ee864b3a28993bb5c78ed7 ] Instead of passing the allow_dups argument to fsnotify_add_mark() as an argument, define the group flag FSNOTIFY_GROUP_DUPS to express the allow_dups behavior and set this behavior at group creation time for all calls of fsnotify_add_mark(). Rename the allow_dups argument to generic add_flags argument for future use. 
Link: https://lore.kernel.org/r/20220422120327.3459282-6-amir73il@gmail.com Suggested-by: Jan Kara Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4dc30393bd7b909c3d6083f8d734bf6b65cf82cc Author: Amir Goldstein Date: Fri Apr 22 15:03:15 2022 +0300 fsnotify: pass flags argument to fsnotify_alloc_group() [ Upstream commit 867a448d587e7fa845bceaf4ee1c632448f2a9fa ] Add flags argument to fsnotify_alloc_group(), define and use the flag FSNOTIFY_GROUP_USER in inotify and fanotify instead of the helper fsnotify_alloc_user_group() to indicate user allocation. Although the flag FSNOTIFY_GROUP_USER is currently not used after group allocation, we store the flags argument in the group struct for future use of other group flags. Link: https://lore.kernel.org/r/20220422120327.3459282-5-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1c47d87317e29eee56b74e82492ce22c171a7cf8 Author: Amir Goldstein Date: Fri Apr 22 15:03:13 2022 +0300 inotify: move control flags from mask to mark flags [ Upstream commit 38035c04f5865c4ef9597d6beed6a7178f90f64a ] The inotify control flags in the mark mask (e.g. FS_IN_ONE_SHOT) are not relevant to object interest mask, so move them to the mark flags. This frees up some bits in the object interest mask. Link: https://lore.kernel.org/r/20220422120327.3459282-3-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit aecfd231bf53be3495fa6eb8c22cf39fc4cd3a86 Author: Dai Ngo Date: Sat Feb 12 10:12:52 2022 -0800 fs/lock: documentation cleanup. Replace inode->i_lock with flc_lock. [ Upstream commit 9d6647762b9c6b555bc83d97d7c93be6057a990f ] Update lock usage of lock_manager_operations' functions to reflect the changes in commit 6109c85037e5 ("locks: add a dedicated spinlock to protect i_flctx lists"). 
Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d71ea54835dfa69019b91a9465b299d5e6984956 Author: Amir Goldstein Date: Sat May 7 11:00:28 2022 +0300 fanotify: do not allow setting dirent events in mask of non-dir [ Upstream commit ceaf69f8eadcafb323392be88e7a5248c415d423 ] Dirent events (create/delete/move) are only reported on watched directory inodes, but in fanotify as well as in legacy inotify, it was always allowed to set them on non-dir inode, which does not result in any meaningful outcome. Until kernel v5.17, dirent events in fanotify also differed from events "on child" (e.g. FAN_OPEN) in the information provided in the event. For example, FAN_OPEN could be set in the mask of a non-dir or the mask of its parent and event would report the fid of the child regardless of the marked object. By contrast, FAN_DELETE is not reported if the child is marked and the child fid was not reported in the events. Since kernel v5.17, with fanotify group flag FAN_REPORT_TARGET_FID, the fid of the child is reported with dirent events, like events "on child", which may create confusion for users expecting the same behavior as events "on child" when setting events in the mask on a child. The desired semantics of setting dirent events in the mask of a child are not clear, so for now, deny this action for a group initialized with flag FAN_REPORT_TARGET_FID and for the new event FAN_RENAME. We may relax this restriction in the future if we decide on the semantics and implement them. 
Fixes: d61fd650e9d2 ("fanotify: introduce group flag FAN_REPORT_TARGET_FID") Fixes: 8cc3b1ccd930 ("fanotify: wire up FAN_RENAME event") Link: https://lore.kernel.org/linux-fsdevel/20220505133057.zm5t6vumc4xdcnsg@quack3.lan/ Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Link: https://lore.kernel.org/r/20220507080028.219826-1-amir73il@gmail.com Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9862064ca81f9b9da8ebe26c86cf2b10c293ba1a Author: Trond Myklebust Date: Thu Mar 31 09:54:02 2022 -0400 nfsd: Clean up nfsd_file_put() [ Upstream commit 999397926ab3f78c7d1235cc4ca6e3c89d2769bf ] Make it a little less racy, by removing the refcount_read() test. Then remove the redundant 'is_hashed' variable. Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit cf04df21a46f5cb0382d04ea4384c60893d26046 Author: Trond Myklebust Date: Thu Mar 31 09:54:01 2022 -0400 nfsd: Fix a write performance regression [ Upstream commit 6b8a94332ee4f7d9a8ae0cbac7609f79c212f06c ] The call to filemap_flush() in nfsd_file_put() is there to ensure that we clear out any writes belonging to a NFSv3 client relatively quickly and avoid situations where the file can't be evicted by the garbage collector. It also ensures that we detect write errors quickly. The problem is this causes a regression in performance for some workloads. So try to improve matters by deferring writeback until we're ready to close the file, and need to detect errors so that we can force the client to resend. 
Tested-by: Jan Kara Fixes: b6669305d35a ("nfsd: Reduce the number of calls to nfsd_file_gc()") Signed-off-by: Trond Myklebust Link: https://lore.kernel.org/all/20220330103457.r4xrhy2d6nhtouzk@quack3.lan Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 997575f1a1b593a1fd6ffb6a390d335dfe6e4e71 Author: Haowen Bai Date: Mon Mar 28 10:48:59 2022 +0800 SUNRPC: Return true/false (not 1/0) from bool functions [ Upstream commit 5f7b839d47dbc74cf4a07beeab5191f93678673e ] Return boolean values ("true" or "false") instead of 1 or 0 from bool functions. This fixes the following warnings from coccicheck: ./fs/nfsd/nfs2acl.c:289:9-10: WARNING: return of 0/1 in function 'nfsaclsvc_encode_accessres' with return type bool ./fs/nfsd/nfs2acl.c:252:9-10: WARNING: return of 0/1 in function 'nfsaclsvc_encode_getaclres' with return type bool Signed-off-by: Haowen Bai Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a5fa9c824db839c1b7f702b2731c37ee2d700d81 Author: Bang Li Date: Fri Mar 11 23:12:40 2022 +0800 fsnotify: remove redundant parameter judgment [ Upstream commit f92ca72b0263d601807bbd23ed25cbe6f4da89f4 ] iput() already checks its argument for NULL, so there is no need to repeat the check here. Link: https://lore.kernel.org/r/20220311151240.62045-1-libang.linuxer@gmail.com Signed-off-by: Bang Li Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 552c24a32ce8b149436c7a6d7fdb9285fcd581ce Author: Amir Goldstein Date: Wed Feb 23 17:14:38 2022 +0200 fsnotify: optimize FS_MODIFY events with no ignored masks [ Upstream commit 04e317ba72d07901b03399b3d1525e83424df5b3 ] fsnotify() treats FS_MODIFY events specially - it does not skip them even if the FS_MODIFY event does not appear in the object's fsnotify mask. This is because send_to_group() checks if FS_MODIFY needs to clear ignored mask of marks.
The common case is that an object does not have any mark with ignored mask and in particular, that it does not have a mark with ignored mask and without the FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY flag. Set FS_MODIFY in object's fsnotify mask during fsnotify_recalc_mask() if object has a mark with an ignored mask and without the FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY flag and remove the special treatment of FS_MODIFY in fsnotify(), so that FS_MODIFY events could be optimized in the common case. Call fsnotify_recalc_mask() from fanotify after adding or removing an ignored mask from a mark without FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY or when adding the FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY flag to a mark with ignored mask (the flag cannot be removed by fanotify uapi). Performance results for doing 10000000 write(2)s to tmpfs:

                               vanilla          patched
    without notification mark  25.486+-1.054    24.965+-0.244
    with notification mark     30.111+-0.139    26.891+-1.355

So we can see the overhead of notification subsystem has been drastically reduced. Link: https://lore.kernel.org/r/20220223151438.790268-3-amir73il@gmail.com Suggested-by: Jan Kara Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5e84e33832d50f707843be79859cecddde6a7f34 Author: Amir Goldstein Date: Wed Feb 23 17:14:37 2022 +0200 fsnotify: fix merge with parent's ignored mask [ Upstream commit 4f0b903ded728c505850daf2914bfc08841f0ae6 ] fsnotify_parent() does not consider the parent's mark at all unless the parent inode shows interest in events on children and in the specific event. So unless parent added an event to both its mark mask and ignored mask, the event will not be ignored. Fix this by declaring the interest of an object in an event when the event is in either a mark mask or ignored mask.
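The rule in the last sentence above can be illustrated with a small user-space sketch. This is not the kernel's code: the struct, the helper names and the event bit value are hypothetical stand-ins, chosen only to contrast "interest comes from mark masks" with "interest comes from mark masks or ignored masks".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a mark; only the two masks matter here. */
struct sketch_mark {
	uint32_t mask;         /* events the watcher wants reported */
	uint32_t ignored_mask; /* events the watcher wants suppressed */
};

/* Illustrative event bit; the name and value are assumptions of this sketch. */
#define SKETCH_EV_OPEN 0x20u

/* Rule before the fix: only mark masks declare interest. */
static uint32_t interest_before(const struct sketch_mark *marks, size_t n)
{
	uint32_t interest = 0;

	for (size_t i = 0; i < n; i++)
		interest |= marks[i].mask;
	return interest;
}

/* Rule after the fix: an event in either mask declares interest. */
static uint32_t interest_after(const struct sketch_mark *marks, size_t n)
{
	uint32_t interest = 0;

	for (size_t i = 0; i < n; i++)
		interest |= marks[i].mask | marks[i].ignored_mask;
	return interest;
}
```

For a parent whose only mark carries SKETCH_EV_OPEN in its ignored mask, interest_before() yields 0, so the parent's mark would never be consulted, while interest_after() includes the event and the ignore request can take effect.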
Link: https://lore.kernel.org/r/20220223151438.790268-2-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 62fa144b858700ce644720b558491fc097b0096b Author: Jakob Koschel Date: Sat Mar 19 21:27:04 2022 +0100 nfsd: fix using the correct variable for sizeof() [ Upstream commit 4fc5f5346592cdc91689455d83885b0af65d71b8 ] While the original code is valid, it is not the obvious choice for the sizeof() call and in preparation to limit the scope of the list iterator variable the sizeof should be changed to the size of the destination. Signed-off-by: Jakob Koschel Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e96076f5790f93ff7166519cd8e3548ddfc80743 Author: Chuck Lever Date: Wed Feb 16 11:26:06 2022 -0500 NFSD: Clean up _lm_ operation names [ Upstream commit 35aff0678f99b0623bb72d50112de9e163a19559 ] The common practice is to name function instances the same as the method names, but with a uniquifying prefix. Commit aef9583b234a ("NFSD: Get reference of lockowner when coping file_lock") missed this -- the new function names should both have been of the form "nfsd4_lm_*". Before more lock manager operations are added in NFSD, rename these two functions for consistency. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ec3b252a55f0a188ef3c7db87056c4d5e0696365 Author: Chuck Lever Date: Sun Feb 6 12:25:47 2022 -0500 NFSD: Remove CONFIG_NFSD_V3 [ Upstream commit 5f9a62ff7d2808c7b56c0ec90f3b7eae5872afe6 ] Eventually support for NFSv2 in the Linux NFS server is to be deprecated and then removed. However, NFSv2 is the "always supported" version that is available as soon as CONFIG_NFSD is set. Before NFSv2 support can be removed, we need to choose a different "always supported" version. This patch removes CONFIG_NFSD_V3 so that NFSv3 is always supported, as NFSv2 is today. When NFSv2 support is removed, NFSv3 will become the only "always supported" NFS version. 
The defconfigs still need to be updated to remove CONFIG_NFSD_V3=y. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7e4328b3b98fa6c8ee47298d110abe3c17221ba4 Author: Chuck Lever Date: Wed Feb 16 12:16:27 2022 -0500 NFSD: Move svc_serv_ops::svo_function into struct svc_serv [ Upstream commit 37902c6313090235c847af89c5515591261ee338 ] Hoist svo_function back into svc_serv and remove struct svc_serv_ops, since the struct is now devoid of fields. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9802c5746038a1b1e5cedb151352780230e983da Author: Chuck Lever Date: Wed Feb 16 12:31:09 2022 -0500 NFSD: Remove svc_serv_ops::svo_module [ Upstream commit f49169c97fceb21ad6a0aaf671c50b0f520f15a5 ] struct svc_serv_ops is about to be removed. Neil Brown says: > I suspect svo_module can go as well - I don't think the thread is > ever the thing that primarily keeps a module active. A random sample of kthread_create() callers shows sunrpc is the only one that manages module reference count in this way. Suggested-by: Neil Brown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 36c57b27a7d83aeb232e78eaee645dea26a037b3 Author: Chuck Lever Date: Wed Jan 26 11:30:55 2022 -0500 SUNRPC: Remove svc_shutdown_net() [ Upstream commit c7d7ec8f043e53ad16e30f5ebb8b9df415ec0f2b ] Clean up: svc_shutdown_net() now does nothing but call svc_close_net(). Replace all external call sites. svc_close_net() is renamed to be the inverse of svc_xprt_create(). Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a4bbb1ab69abbd5735a38916f3eae1cf7781eb02 Author: Chuck Lever Date: Mon Jan 31 13:34:29 2022 -0500 SUNRPC: Rename svc_close_xprt() [ Upstream commit 4355d767a21b9445958fc11bce9a9701f76529d3 ] Clean up: Use the "svc_xprt_" function naming convention as is used for other external APIs. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c58a9cfd2091c99a84f0a9d811ce2bc3db2adf28 Author: Chuck Lever Date: Wed Jan 26 11:42:08 2022 -0500 SUNRPC: Rename svc_create_xprt() [ Upstream commit 352ad31448fecc78a2e9b78da64eea5d63b8d0ce ] Clean up: Use the "svc_xprt_" function naming convention as is used for other external APIs. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9a43ddd6b626c3da45abd5eab74492b65a238991 Author: Chuck Lever Date: Tue Jan 25 13:49:29 2022 -0500 SUNRPC: Remove svo_shutdown method [ Upstream commit 87cdd8641c8a1ec6afd2468265e20840a57fd888 ] Clean up. Neil observed that "any code that calls svc_shutdown_net() knows what the shutdown function should be, and so can call it directly." Signed-off-by: Chuck Lever Reviewed-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8c60a476704da0e2cf3f865e20295d79d988115c Author: Chuck Lever Date: Tue Jan 25 17:57:23 2022 -0500 SUNRPC: Merge svc_do_enqueue_xprt() into svc_enqueue_xprt() [ Upstream commit c0219c499799c1e92bd570c15a47e6257a27bb15 ] Neil says: "These functions were separated in commit 0971374e2818 ("SUNRPC: Reduce contention in svc_xprt_enqueue()") so that the XPT_BUSY check happened before taking any spinlocks. We have since moved or removed the spinlocks so the extra test is fairly pointless." I've made this a separate patch in case the XPT_BUSY change has unexpected consequences and needs to be reverted. Suggested-by: Neil Brown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 99ab6abc88edd907caf97f52ba13825d18aedfad Author: Chuck Lever Date: Tue Jan 25 10:17:59 2022 -0500 SUNRPC: Remove the .svo_enqueue_xprt method [ Upstream commit a9ff2e99e9fa501ec965da03c18a5422b37a2f44 ] We have never been able to track down and address the underlying cause of the performance issues with workqueue-based service support. 
svo_enqueue_xprt is called multiple times per RPC, so it adds instruction path length, but always ends up at the same function: svc_xprt_do_enqueue(). We do not anticipate needing this flexibility for dynamic nfsd thread management support. As a micro-optimization, remove .svo_enqueue_xprt because Spectre/Meltdown makes virtual function calls more costly. This change essentially reverts commit b9e13cdfac70 ("nfsd/sunrpc: turn enqueueing a svc_xprt into a svc_serv operation"). Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 194071d46c5cc8b3fab17768a9214376eee25a30 Author: Chuck Lever Date: Tue Sep 28 11:40:59 2021 -0400 NFSD: Streamline the rare "found" case [ Upstream commit add1511c38166cf1036765f8c4aa939f0275a799 ] Move a rarely called function call site out of the hot path. This is an exceptionally small improvement because the compiler inlines most of the functions that nfsd_cache_lookup() calls. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3304d16c24f5388763b2a89212a3143b7da41d12 Author: Chuck Lever Date: Tue Sep 28 11:39:02 2021 -0400 NFSD: Skip extra computation for RC_NOCACHE case [ Upstream commit 0f29ce32fbc56cfdb304eec8a4deb920ccfd89c3 ] Force the compiler to skip unneeded initialization for cases that don't need those values. For example, NFSv4 COMPOUND operations are RC_NOCACHE. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4aa8dac58c17a95bb3e301782d612c1236af4648 Author: Chuck Lever Date: Thu Sep 30 19:19:57 2021 -0400 NFSD: De-duplicate hash bucket indexing [ Upstream commit 378a6109dd142a678f629b740f558365150f60f9 ] Clean up: The details of finding the right hash bucket are exactly the same in both nfsd_cache_lookup() and nfsd_cache_update(). 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ca6761d39ad27ac73bd40254ebfbe4b6331c20a8 Author: Ondrej Valousek Date: Tue Jan 11 13:08:42 2022 +0100 nfsd: Add support for the birth time attribute [ Upstream commit e377a3e698fb56cb63f6bddbebe7da76dc37e316 ] For filesystems that support the "btime" timestamp (most modern filesystems do), we share it via kernel nfsd. Btime support for the NFS client was recently added by Trond. Suggested-by: Bruce Fields Signed-off-by: Ondrej Valousek [ cel: addressed some whitespace/checkpatch nits ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0d1bbb0efe5a2304a3b322bd480b16d07d703a85 Author: Chuck Lever Date: Tue Jan 25 15:57:45 2022 -0500 NFSD: Deprecate NFS_OFFSET_MAX [ Upstream commit c306d737691ef84305d4ed0d302c63db2932f0bb ] NFS_OFFSET_MAX was introduced way back in Linux v2.3.y before there was a kernel-wide OFFSET_MAX value. As a clean up, replace the last few uses of it with its generic equivalent, and get rid of it. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 70a80c7e8d5b88d86e2167dc903d7e14c0aea189 Author: Chuck Lever Date: Mon Jan 24 15:50:31 2022 -0500 NFSD: COMMIT operations must not return NFS?ERR_INVAL [ Upstream commit 3f965021c8bc38965ecb1924f570c4842b33d408 ] Since, well, forever, the Linux NFS server's nfsd_commit() function has returned nfserr_inval when the passed-in byte range arguments were nonsensical. However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests are permitted to return only the following non-zero status codes: NFS3ERR_IO, NFS3ERR_STALE, NFS3ERR_BADHANDLE, NFS3ERR_SERVERFAULT. NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL is not listed in the COMMIT row of Table 6 in RFC 8881. RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not specify when it can or should be used.
Instead of dropping or failing a COMMIT request in a byte range that is not supported, turn it into a valid request by treating one or both arguments as zero. Offset zero means start-of-file, count zero means until-end-of-file, so we only ever extend the commit range. NFS servers are always allowed to commit more and sooner than requested. The range check is no longer bounded by NFS_OFFSET_MAX, but rather by the value that is returned in the maxfilesize field of the NFSv3 FSINFO procedure or the NFSv4 maxfilesize file attribute. Note that this change results in a new pynfs failure:

    CMT4 st_commit.testCommitOverflow : RUNNING
    CMT4 st_commit.testCommitOverflow : FAILURE
         COMMIT with offset + count overflow should return NFS4ERR_INVAL, instead got NFS4_OK

IMO the test is not correct as written: RFC 8881 does not allow the COMMIT operation to return NFS4ERR_INVAL. Reported-by: Dan Aloni Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever Reviewed-by: Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a231ae6bb50e7c0a9e9efd7b0d10687f1d71b3a3 Author: Chuck Lever Date: Tue Jan 25 15:59:57 2022 -0500 NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes [ Upstream commit a648fdeb7c0e17177a2280344d015dba3fbe3314 ] iattr::ia_size is a loff_t, so these NFSv3 procedures must be careful to deal with incoming client size values that are larger than s64_max without corrupting the value. Silently capping the value results in storing a different value than the client passed in which is unexpected behavior, so remove the min_t() check in decode_sattr3(). Note that RFC 1813 permits only the WRITE procedure to return NFS3ERR_FBIG. We believe that NFSv3 reference implementations also return NFS3ERR_FBIG when ia_size is too large.
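The difference between silently capping an oversized size and refusing it can be shown with a short user-space sketch. It is only an illustration under stated assumptions: the cap is taken to be the largest signed 64-bit value, the status codes are hypothetical stand-ins for the protocol's "OK" and "file too big" results, and none of the names below come from the kernel.

```c
#include <stdint.h>

/* Stand-ins for protocol status codes; names and values are hypothetical. */
enum sketch_status { SKETCH_OK = 0, SKETCH_ERR_FBIG = 1 };

/* Capping approach: the stored size can differ from what the client sent. */
static int64_t size_capped(uint64_t wire_size)
{
	return wire_size > (uint64_t)INT64_MAX ? INT64_MAX : (int64_t)wire_size;
}

/*
 * Checking approach: a size that a signed 64-bit type cannot hold is
 * refused, and *out is left untouched in that case.
 */
static enum sketch_status size_checked(uint64_t wire_size, int64_t *out)
{
	if (wire_size > (uint64_t)INT64_MAX)
		return SKETCH_ERR_FBIG;
	*out = (int64_t)wire_size;
	return SKETCH_OK;
}
```

With the capping helper, a client that sends the largest unsigned 64-bit size ends up with a different size stored than the one it asked for; with the checking helper, the same request is refused and the caller can report an error instead.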
Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 38d02ba22e43b6fc7d291cf724bc6e3b7be6626b Author: Chuck Lever Date: Mon Jan 31 13:01:53 2022 -0500 NFSD: Fix ia_size underflow [ Upstream commit e6faac3f58c7c4176b66f63def17a34232a17b0e ] iattr::ia_size is a loff_t, which is a signed 64-bit type. NFSv3 and NFSv4 both define file size as an unsigned 64-bit type. Thus there is a range of valid file size values an NFS client can send that is already larger than Linux can handle. Currently decode_fattr4() dumps a full u64 value into ia_size. If that value happens to be larger than S64_MAX, then ia_size underflows. I'm about to fix up the NFSv3 behavior as well, so let's catch the underflow in the common code path: nfsd_setattr(). Cc: stable@vger.kernel.org [ cel: context adjusted, 2f221d6f7b88 has not been applied ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1726a39b0879acfb490b22dca643f26f4f907da9 Author: Chuck Lever Date: Fri Feb 4 15:19:34 2022 -0500 NFSD: Fix the behavior of READ near OFFSET_MAX [ Upstream commit 0cb4d23ae08c48f6bf3c29a8e5c4a74b8388b960 ] Dan Aloni reports: > Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to > the RPC read layers") on the client, a read of 0xfff is aligned up > to server rsize of 0x1000. > > As a result, in a test where the server has a file of size > 0x7fffffffffffffff, and the client tries to read from the offset > 0x7ffffffffffff000, the read causes loff_t overflow in the server > and it returns an NFS code of EINVAL to the client. The client as > a result indefinitely retries the request. The Linux NFS client does not handle NFS?ERR_INVAL, even though all NFS specifications permit servers to return that status code for a READ. Instead of NFS?ERR_INVAL, have out-of-range READ requests succeed and return a short result. Set the EOF flag in the result to prevent the client from retrying the READ request. 
This behavior appears to be consistent with Solaris NFS servers. Note that NFSv3 and NFSv4 use u64 offset values on the wire. These must be converted to loff_t internally before use -- an implicit type cast is not adequate for this purpose. Otherwise VFS checks against sb->s_maxbytes do not work properly. Reported-by: Dan Aloni Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit fc2d8c153d52ff8ed0ddb88b764a8ee6a2cc0fe5 Author: J. Bruce Fields Date: Tue Jan 18 17:00:51 2022 -0500 lockd: fix failure to cleanup client locks [ Upstream commit d19a7af73b5ecaac8168712d18be72b9db166768 ] In my testing, we're sometimes hitting the request->fl_flags & FL_EXISTS case in posix_lock_inode, presumably just by random luck since we're not actually initializing fl_flags here. This probably didn't matter before commit 7f024fcd5c97 ("Keep read and write fds with each nlm_file") since we wouldn't previously unlock unless we knew there were locks. But now it causes lockd to give up on removing more locks. We could just initialize fl_flags, but really it seems dubious to be calling vfs_lock_file with random values in some of the fields. Fixes: 7f024fcd5c97 ("Keep read and write fds with each nlm_file") Signed-off-by: J. Bruce Fields [ cel: fixed checkpatch.pl nit ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 20a74a69119e032e431f143697ac1c0bdf938a3e Author: J. Bruce Fields Date: Tue Jan 18 17:00:16 2022 -0500 lockd: fix server crash on reboot of client holding lock [ Upstream commit 6e7f90d163afa8fc2efd6ae318e7c20156a5621f ] I thought I was iterating over the array when actually the iteration is over the values contained in the array? Ugh, keep it simple. Symptoms were a NULL dereference in vfs_lock_file() when an NFSv3 client that previously held a lock came back up and sent a notify. Reported-by: Jonathan Woithe Fixes: 7f024fcd5c97 ("Keep read and write fds with each nlm_file") Signed-off-by: J.
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a667e1df409e5708d53bcc5819c5930bef4c9d44 Author: Yang Li Date: Thu Jan 20 13:57:22 2022 +0100 fanotify: remove variable set but not used [ Upstream commit 217663f101a56ef77f82273818253fff082bf503 ] The code that uses the pointer info has been removed in 7326e382c21e ("fanotify: report old and/or new parent+name in FAN_RENAME event"). and fanotify_event_info() doesn't change 'event', so the declaration and assignment of info can be removed. Eliminate the following clang warning: fs/notify/fanotify/fanotify_user.c:161:24: warning: variable ‘info’ set but not used Reported-by: Abaci Robot Signed-off-by: Yang Li Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 11bcfabf24815ae90c60978a1f21a874cd483561 Author: J. Bruce Fields Date: Wed Jan 5 14:15:03 2022 -0500 nfsd: fix crash on COPY_NOTIFY with special stateid [ Upstream commit 074b07d94e0bb6ddce5690a9b7e2373088e8b33a ] RTM says "If the special ONE stateid is passed to nfs4_preprocess_stateid_op(), it returns status=0 but does not set *cstid. nfsd4_copy_notify() depends on stid being set if status=0, and thus can crash if the client sends the right COPY_NOTIFY RPC." RFC 7862 says "The cna_src_stateid MUST refer to either open or locking states provided earlier by the server. If it is invalid, then the operation MUST fail." The RFC doesn't specify an error, and the choice doesn't matter much as this is clearly illegal client behavior, but bad_stateid seems reasonable. Simplest is just to guarantee that nfs4_preprocess_stateid_op, called with non-NULL cstid, errors out if it can't return a stateid. Reported-by: rtm@csail.mit.edu Fixes: 624322f1adc5 ("NFSD add COPY_NOTIFY operation") Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Reviewed-by: Olga Kornievskaia Tested-by: Olga Kornievskaia Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4eefd1125b96168a265a9ef2fd6daecebb3f0ad6 Author: Chuck Lever Date: Fri Dec 24 14:36:49 2021 -0500 NFSD: Move fill_pre_wcc() and fill_post_wcc() [ Upstream commit fcb5e3fa012351f3b96024c07bc44834c2478213 ] These functions are related to file handle processing and have nothing to do with XDR encoding or decoding. Also they are no longer NFSv3-specific. As a clean-up, move their definitions to a more appropriate location. WCC is also an NFSv3-specific term, so rename them as general-purpose helpers. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 695719e5e6b972bea0107fd795205b510b6db9be Author: Chuck Lever Date: Fri Dec 24 14:22:28 2021 -0500 Revert "nfsd: skip some unnecessary stats in the v4 case" [ Upstream commit 58f258f65267542959487dbe8b5641754411843d ] On the wire, I observed NFSv4 OPEN(CREATE) operations sometimes returning a reasonable-looking value in the cinfo.before field and zero in the cinfo.after field. RFC 8881 Section 10.8.1 says: > When a client is making changes to a given directory, it needs to > determine whether there have been changes made to the directory by > other clients. It does this by using the change attribute as > reported before and after the directory operation in the associated > change_info4 value returned for the operation. and > ... The post-operation change > value needs to be saved as the basis for future change_info4 > comparisons. A good quality client implementation therefore saves the zero cinfo.after value. During a subsequent OPEN operation, it will receive a different non-zero value in the cinfo.before field for that directory, and it will incorrectly believe the directory has changed, triggering an undesirable directory cache invalidation. There are filesystem types where fs_supports_change_attribute() returns false, tmpfs being one. 
On NFSv4 mounts, this means the fh_getattr() call site in fill_pre_wcc() and fill_post_wcc() is never invoked. Subsequently, nfsd4_change_attribute() is invoked with an uninitialized @stat argument. In fill_pre_wcc(), @stat contains stale stack garbage, which is then placed on the wire. In fill_post_wcc(), ->fh_post_wc is all zeroes, so zero is placed on the wire. Both of these values are meaningless. This fix can be applied immediately to stable kernels. Once there are more regression tests in this area, this optimization can be attempted again. Fixes: 428a23d2bf0c ("nfsd: skip some unnecessary stats in the v4 case") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5e07d49f4abd0e2b2e764d54d327a22e54266ef7 Author: Chuck Lever Date: Tue Dec 28 14:27:56 2021 -0500 NFSD: Trace boot verifier resets [ Upstream commit 75acacb6583df0b9328dc701d8eeea05af49b8b5 ] According to commit bbf2f098838a ("nfsd: Reset the boot verifier on all write I/O errors"), the Linux NFS server forces all clients to resend pending unstable writes if any server-side write or commit operation encounters an error (say, ENOSPC). This is a rare and quite exceptional event that could require administrative recovery action, so it should be made trace-able. 
Example trace event: nfsd-938 [002] 7174.945558: nfsd_writeverf_reset: boot_time= 61cc920d xid=0xdcd62036 error=-28 new verifier=0x08aecc6142515904 [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a1c9bcfd16f31e6ac1e2bc97ac7914f8d0fa2a75 Author: Chuck Lever Date: Thu Dec 30 10:22:05 2021 -0500 NFSD: Rename boot verifier functions [ Upstream commit 3988a57885eeac05ef89f0ab4d7e47b52fbcf630 ] Clean up: These functions handle what the specs call a write verifier, which in the Linux NFS server implementation is now divorced from the server's boot instance [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e49677ff33f3e3bfb52427d7f9f4a7783d264719 Author: Chuck Lever Date: Wed Dec 29 14:43:16 2021 -0500 NFSD: Clean up the nfsd_net::nfssvc_boot field [ Upstream commit 91d2e9b56cf5c80f9efc530d494968369a8a0e0d ] There are two boot-time fields in struct nfsd_net: one called boot_time and one called nfssvc_boot. The latter is used only to form write verifiers, but its documenting comment declares: /* Time of server startup */ Since commit 27c438f53e79 ("nfsd: Support the server resetting the boot verifier"), this field can be reset at any time; it's no longer tied to server restart. So that comment is stale. Also, according to pahole, struct timespec64 is 16 bytes long on x86_64. The nfssvc_boot field is used only to form a write verifier, which is 8 bytes long. Let's clarify this situation by manufacturing an 8-byte verifier in nfs_reset_boot_verifier() and storing only that in struct nfsd_net. We're grabbing 128 bits of time, so compress all of those into a 64-bit verifier instead of throwing out the high-order bits. In the future, the siphash_key can be re-used for other hashed objects per-nfsd_net. 
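To illustrate why compressing both time fields beats throwing out the high-order bits, here is a small user-space sketch. The commit text mentions a siphash_key, which suggests a keyed hash in the kernel; the mixer below is only a simple stand-in (a splitmix64-style finalizer), not siphash, and all function names are hypothetical.

```c
#include <stdint.h>

/* Truncating approach: keep only the low 32 bits of each field. */
static uint64_t verifier_truncated(uint64_t sec, uint64_t nsec)
{
	return ((uint64_t)(uint32_t)sec << 32) | (uint32_t)nsec;
}

/* Stand-in 64-bit mixer (splitmix64-style finalizer); NOT siphash. */
static uint64_t mix64(uint64_t x)
{
	x ^= x >> 30;
	x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27;
	x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

/* Mixing approach: every bit of both fields influences the verifier. */
static uint64_t verifier_mixed(uint64_t sec, uint64_t nsec, uint64_t key)
{
	return mix64(sec ^ key) ^ mix64(nsec);
}
```

Two timestamps whose seconds differ only above bit 31 collide under verifier_truncated() but not under verifier_mixed(), because each step of mix64() is invertible and so distinct seconds values always map to distinct outputs for a fixed nanoseconds value and key.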
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 083d44094ff12f17c6538420a1da12d3f08756e5 Author: Chuck Lever Date: Thu Dec 30 10:26:18 2021 -0500 NFSD: Write verifier might go backwards [ Upstream commit cdc556600c0133575487cc69fb3128440b3c3e92 ] When vfs_iter_write() starts to fail because a file system is full, a bunch of writes can fail at once with ENOSPC. These writes repeatedly invoke nfsd_reset_boot_verifier() in quick succession. Ensure that the time it grabs doesn't go backwards due to an ntp adjustment going on at the same time. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 306d2c1c080360fd466d244da5c77b6433a15c9e Author: Trond Myklebust Date: Sat Dec 18 20:38:00 2021 -0500 nfsd: Add a tracepoint for errors in nfsd4_clone_file_range() [ Upstream commit a2f4c3fa4db94ba44d32a72201927cfd132a8e82 ] Since a clone error commit can cause the boot verifier to change, we should trace those errors. Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever [ cel: Addressed a checkpatch.pl splat in fs/nfsd/vfs.h ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 45ef8b7aea36e6c179661e5da639a6193aa5cb2f Author: Chuck Lever Date: Tue Dec 28 14:26:03 2021 -0500 NFSD: De-duplicate net_generic(nf->nf_net, nfsd_net_id) [ Upstream commit 2c445a0e72cb1fbfbdb7f9473c53556ee27c1d90 ] Since this pointer is used repeatedly, move it to a stack variable. [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5a1575c02baa0fa13db84134d15b670540c83c38 Author: Chuck Lever Date: Tue Dec 28 12:41:32 2021 -0500 NFSD: De-duplicate net_generic(SVC_NET(rqstp), nfsd_net_id) [ Upstream commit fb7622c2dbd1aa41133a8c73e1137b833c074519 ] Since this pointer is used repeatedly, move it to a stack variable. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit aa9ea9ec295f70b646e00f842fd9bfad0c3052d6 Author: Chuck Lever Date: Tue Dec 28 14:19:41 2021 -0500 NFSD: Clean up nfsd_vfs_write() [ Upstream commit 33388b3aefefd4d83764dab8038cb54068161a44 ] The RWF_SYNC and !RWF_SYNC arms are now exactly alike except that the RWF_SYNC arm resets the boot verifier twice in a row. Fix that redundancy and de-duplicate the code. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 30282a70aac12b05b329d9a78f45ba847615b5e1 Author: Jeff Layton Date: Sat Dec 18 20:37:56 2021 -0500 nfsd: Retry once in nfsd_open on an -EOPENSTALE return [ Upstream commit 12bcbd40fd931472c7fc9cf3bfe66799ece93ed8 ] If we get back -EOPENSTALE from an NFSv4 open, then we either got some unhandled error or the inode we got back was not the same as the one associated with the dentry. We really have no recourse in that situation other than to retry the open, and if it fails to just return nfserr_stale back to the client. Signed-off-by: Jeff Layton Signed-off-by: Lance Shelton Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3128aa9c984d3eab63b07b39690bcfa7321b62c0 Author: Jeff Layton Date: Sat Dec 18 20:37:55 2021 -0500 nfsd: Add errno mapping for EREMOTEIO [ Upstream commit a2694e51f60c5a18c7e43d1a9feaa46d7f153e65 ] The NFS client can occasionally return EREMOTEIO when signalling issues with the server. ...map to NFSERR_IO. Signed-off-by: Jeff Layton Signed-off-by: Lance Shelton Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f12557372b7685c22d77458ca405f5b5264b668f Author: Peng Tao Date: Sat Dec 18 20:37:54 2021 -0500 nfsd: map EBADF [ Upstream commit b3d0db706c77d02055910fcfe2f6eb5155ff9d5e ] Now that we have open file cache, it is possible that another client deletes the file and DP will not know about it. 
Then IO to MDS would fail with BADSTATEID and knfsd would start state recovery, which should fail as well and then nfs read/write will fail with EBADF. And it triggers a WARN() in nfserrno(). -----------[ cut here ]------------ WARNING: CPU: 0 PID: 13529 at fs/nfsd/nfsproc.c:758 nfserrno+0x58/0x70 [nfsd]() nfsd: non-standard errno: -9 modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_connt pata_acpi floppy CPU: 0 PID: 13529 Comm: nfsd Tainted: G W 4.1.5-00307-g6e6579b #7 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014 0000000000000000 00000000464e6c9c ffff88079085fba8 ffffffff81789936 0000000000000000 ffff88079085fc00 ffff88079085fbe8 ffffffff810a08ea ffff88079085fbe8 ffff88080f45c900 ffff88080f627d50 ffff880790c46a48 all Trace: [] dump_stack+0x45/0x57 [] warn_slowpath_common+0x8a/0xc0 [] warn_slowpath_fmt+0x55/0x70 [] ? splice_direct_to_actor+0x148/0x230 [] ? fsid_source+0x60/0x60 [nfsd] [] nfserrno+0x58/0x70 [nfsd] [] nfsd_finish_read+0x97/0xb0 [nfsd] [] nfsd_splice_read+0x76/0xa0 [nfsd] [] nfsd_read+0xc1/0xd0 [nfsd] [] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc] [] nfsd3_proc_read+0xba/0x150 [nfsd] [] nfsd_dispatch+0xc3/0x210 [nfsd] [] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc] [] svc_process_common+0x453/0x6f0 [sunrpc] [] svc_process+0x113/0x1b0 [sunrpc] [] nfsd+0xff/0x170 [nfsd] [] ? nfsd_destroy+0x80/0x80 [nfsd] [] kthread+0xd8/0xf0 [] ? kthread_create_on_node+0x1b0/0x1b0 [] ret_from_fork+0x42/0x70 [] ? 
kthread_create_on_node+0x1b0/0x1b0 Signed-off-by: Peng Tao Signed-off-by: Lance Shelton Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9175fcf39c20ffd2fe3084dd671827b96c6422f3 Author: Chuck Lever Date: Tue Dec 21 11:52:06 2021 -0500 NFSD: Fix zero-length NFSv3 WRITEs [ Upstream commit 6a2f774424bfdcc2df3e17de0cefe74a4269cad5 ] The Linux NFS server currently responds to a zero-length NFSv3 WRITE request with NFS3ERR_IO. It responds to a zero-length NFSv4 WRITE with NFS4_OK and count of zero. RFC 1813 says of the WRITE procedure's @count argument: count The number of bytes of data to be written. If count is 0, the WRITE will succeed and return a count of 0, barring errors due to permissions checking. RFC 8881 has similar language for NFSv4, though NFSv4 removed the explicit @count argument because that value is already contained in the opaque payload array. The synthetic client pynfs's WRT4 and WRT15 tests do emit zero- length WRITEs to exercise this spec requirement. Commit fdec6114ee1f ("nfsd4: zero-length WRITE should succeed") addressed the same problem there with the same fix. But interestingly the Linux NFS client does not appear to emit zero- length WRITEs, instead squelching them. I'm not aware of a test that can generate such WRITEs for NFSv3, so I wrote a naive C program to generate a zero-length WRITE and test this fix. Fixes: 8154ef2776aa ("NFSD: Clean up legacy NFS WRITE argument XDR decoders") Reported-by: Trond Myklebust Signed-off-by: Chuck Lever Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit fab02e979949d3cf5122475db0d28161256773d5 Author: Vasily Averin Date: Fri Dec 17 09:49:39 2021 +0300 nfsd4: add refcount for nfsd4_blocked_lock [ Upstream commit 47446d74f1707049067fee038507cdffda805631 ] nbl allocated in nfsd4_lock can be released by a several ways: directly in nfsd4_lock(), via nfs4_laundromat(), via another nfs command RELEASE_LOCKOWNER or via nfsd4_callback. 
This structure should be refcounted to be used and released correctly in all these cases. Refcount is initialized to 1 during allocation and is incremented when nbl is added into nbl_list/nbl_lru lists. Usually nbl is linked into both lists together, so only one refcount is used for both lists. However nfsd4_lock() should keep in mind that nbl can be present in only one of the lists. This can happen if nbl was handled already by nfs4_laundromat/nfsd4_callback/etc. Refcount is decremented if vfs_lock_file() returns FILE_LOCK_DEFERRED, because nbl can be handled already by nfs4_laundromat/nfsd4_callback/etc. Refcount is not changed in find_blocked_lock() because it reuses the count released after removing nbl from lists. Signed-off-by: Vasily Averin Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 535204ecaed0b75bcba00280101e2e90ab5858b9 Author: J. Bruce Fields Date: Thu Dec 16 12:20:13 2021 -0500 nfs: block notification on fs with its own ->lock [ Upstream commit 40595cdc93edf4110c0f0c0b06f8d82008f23929 ] NFSv4.1 supports an optional lock notification feature which notifies the client when a lock becomes available. (Normally NFSv4 clients just poll for locks if necessary.) To make that work, we need to request a blocking lock from the filesystem. We turned that off for NFS in commit f657f8eef3ff ("nfs: don't atempt blocking locks on nfs reexports") [sic] because it actually blocks the nfsd thread while waiting for the lock. Thanks to Vasily Averin for pointing out that NFS isn't the only filesystem with that problem. Any filesystem that leaves ->lock NULL will use posix_lock_file(), which does the right thing. Simplest is just to assume that any filesystem that defines its own ->lock is not safe to request a blocking lock from.
So, this patch mostly reverts commit f657f8eef3ff ("nfs: don't atempt blocking locks on nfs reexports") [sic] and commit b840be2f00c0 ("lockd: don't attempt blocking locks on nfs reexports"), and instead uses a check of ->lock (Vasily's suggestion) to decide whether to support blocking lock notifications on a given filesystem. Also add a little documentation. Perhaps someday we could add back an export flag later to allow filesystems with "good" ->lock methods to support blocking lock notifications. Reported-by: Vasily Averin Signed-off-by: J. Bruce Fields [ cel: Description rewritten to address checkpatch nits ] [ cel: Fixed warning when SUNRPC debugging is disabled ] [ cel: Fixed NULL check ] Signed-off-by: Chuck Lever Reviewed-by: Vasily Averin Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bf5e7e1fa1dbd50c986530c4408881ce991ef4b0 Author: Chuck Lever Date: Mon Dec 13 10:20:45 2021 -0500 NFSD: De-duplicate nfsd4_decode_bitmap4() [ Upstream commit cd2e999c7c394ae916d8be741418b3c6c1dddea8 ] Clean up. Trond points out that xdr_stream_decode_uint32_array() does the same thing as nfsd4_decode_bitmap4(). Suggested-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5a0710a6b40a44b96ec64d2cec70d030cf018a7d Author: J. Bruce Fields Date: Tue Dec 7 17:32:21 2021 -0500 nfsd: improve stateid access bitmask documentation [ Upstream commit 3dcd1d8aab00c5d3a0a3725253c86440b1a0f5a7 ] The use of the bitmaps is confusing. Add a cross-reference to make it easier to find the existing comment. Add an updated reference with URL to make it quicker to look up. And a bit more editorializing about the value of this. Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f0dbe05f6df2cc9d8d16c25e36f82a9d92492886 Author: Chuck Lever Date: Thu Oct 21 12:11:45 2021 -0400 NFSD: Combine XDR error tracepoints [ Upstream commit 70e94d757b3e1f46486d573729d84c8955c81dce ] Clean up: The garbage_args and cant_encode tracepoints report the same information as each other, so combine them into a single tracepoint class to reduce code duplication and slightly reduce the size of trace.o. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e8f923e1e9fcc0832a12c8a2461792e5a9544c03 Author: NeilBrown Date: Wed Dec 1 10:58:14 2021 +1100 NFSD: simplify per-net file cache management [ Upstream commit 1463b38e7cf34d4cc60f41daff459ad807b2e408 ] We currently have a 'laundrette' for closing cached files - a different work-item for each network-namespace. These 'laundrettes' (aka struct nfsd_fcache_disposal) are currently on a list, and are freed using rcu. The list is not necessary as we have a per-namespace structure (struct nfsd_net) which can hold a link to the nfsd_fcache_disposal. The use of kfree_rcu is also unnecessary as the cache is cleaned of all files associated with a given namespace, and no new files can be added, before the nfsd_fcache_disposal is freed. So add a '->fcache_disposal' link to nfsd_net, and discard the list management and rcu usage. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 677fd67d8b804bbe2ef24f4404b03e8a42359723 Author: Jiapeng Chong Date: Thu Dec 2 16:35:42 2021 +0800 NFSD: Fix inconsistent indenting [ Upstream commit 1e37d0e5bda45881eea1bec4b812def72c7d4aea ] Eliminate the following smatch warning: fs/nfsd/nfs4xdr.c:4766 nfsd4_encode_read_plus_hole() warn: inconsistent indenting.
Reported-by: Abaci Robot Signed-off-by: Jiapeng Chong Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0bc12c128940544d1a19de44c88d3b4e210704ca Author: Chuck Lever Date: Thu Sep 30 19:10:03 2021 -0400 NFSD: Remove be32_to_cpu() from DRC hash function [ Upstream commit 7578b2f628db27281d3165af0aa862311883a858 ] Commit 7142b98d9fd7 ("nfsd: Clean up drc cache in preparation for global spinlock elimination"), billed as a clean-up, added be32_to_cpu() to the DRC hash function without explanation. That commit removed two comments that state that byte-swapping in the hash function is unnecessary without explaining whether there was a need for that change. On some Intel CPUs, the swab32 instruction is known to cause a CPU pipeline stall. be32_to_cpu() does not add extra randomness, since the hash multiplication is done /before/ shifting to the high-order bits of the result. As a micro-optimization, remove the unnecessary transform from the DRC hash function. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e072a635c1ef9d17cf0102401be855eae7305b9f Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 NFS: switch the callback service back to non-pooled. [ Upstream commit 23a1a573c61ccb5e7829c1f5472d3e025293a031 ] Now that thread management is consistent there is no need for nfs-callback to use svc_create_pooled() as introduced in Commit df807fffaabd ("NFSv4.x/callback: Create the callback service through svc_create_pooled"). So switch back to svc_create(). If service pools were configured, but the number of threads were left at '1', nfs callback may not work reliably when svc_create_pooled() is used. 
Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 948e4664cc3705ea73d5bf9bdaa2b5ea17a22223 Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 lockd: use svc_set_num_threads() for thread start and stop [ Upstream commit 6b044fbaab02292fedb17565dbb3f2528083b169 ] svc_set_num_threads() does everything that lockd_start_svc() does, except set sv_maxconn. It also (when passed 0) finds the threads and stops them with kthread_stop(). So move the setting for sv_maxconn, and use svc_set_num_threads(). We now don't need nlmsvc_task. Now that we use svc_set_num_threads() it makes sense to set svo_module. This requests that the thread exit with module_put_and_exit(). Also fix the documentation for svo_module to make this explicit. svc_prepare_thread is now only used where it is defined, so it can be made static. Signed-off-by: NeilBrown [ cel: address merge conflict with fd2468fa1301 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit deeda24a6762d10c3364f34680afa7f6cb48bcc1 Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 SUNRPC: always treat sv_nrpools==1 as "not pooled" [ Upstream commit 93aa619eb0b42eec2f3a9b4d9db41f5095390aec ] Currently 'pooled' services hold a reference on the pool_map, and 'unpooled' services do not. svc_destroy() uses the presence of ->svo_function (via svc_serv_is_pooled()) to determine if the reference should be dropped. There is no direct correlation between being pooled and the use of svo_function, though in practice, lockd is the only non-pooled service, and the only one not to use svo_function. This is untidy and would cause problems if we changed lockd to use svc_set_num_threads(), which requires the use of ->svo_function. So change the test for "is the service pooled" to "is sv_nrpools > 1". This means that when svc_pool_map_get() returns 1, it must NOT take a reference to the pool. We discard svc_serv_is_pooled(), and test sv_nrpools directly.
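The "sv_nrpools > 1" rule described above can be illustrated with a small user-space sketch. This is only an assumption-laden model, not the kernel code: struct pool_map_sketch, pool_map_get_sketch(), pool_map_put_sketch() and serv_is_pooled_sketch() are hypothetical names, and the locking that a real pool map would need is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the shared pool map; not the kernel's struct. */
struct pool_map_sketch {
	int count;	/* references held by pooled services */
	int npools;	/* number of pools while count > 0 */
};

/* A service counts as pooled only when it has more than one pool. */
static bool serv_is_pooled_sketch(int nrpools)
{
	return nrpools > 1;
}

/*
 * Return the number of pools the caller should use.  When the answer
 * is 1 the service is "not pooled" and no reference is taken.
 */
static int pool_map_get_sketch(struct pool_map_sketch *m, int configured_pools)
{
	if (configured_pools <= 1)
		return 1;
	if (m->count++ == 0)
		m->npools = configured_pools;
	return m->npools;
}

/* Drop the reference, but only if one was taken in the first place. */
static void pool_map_put_sketch(struct pool_map_sketch *m, int nrpools)
{
	if (!serv_is_pooled_sketch(nrpools))
		return;
	if (--m->count == 0)
		m->npools = 0;
}
```

The point of the sketch is that get and put use the same test on the pool count, so an unpooled service never touches the reference count.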
Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 74a0e37a20995f48af61be4fc24fa19175540b92 Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 SUNRPC: move the pool_map definitions (back) into svc.c [ Upstream commit cf0e124e0a489944d08fcc3c694d2b234d2cc658 ] These definitions are not used outside of svc.c, and there is no evidence that they ever have been. So move them into svc.c and make the declarations 'static'. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9fe19a48a3bf62d57b0f2110e5b286c31e7996cd Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 lockd: rename lockd_create_svc() to lockd_get() [ Upstream commit ecd3ad68d2c6d3ae178a63a2d9a02c392904fd36 ] lockd_create_svc() already does an svc_get() if the service already exists, so it is more like a "get" than a "create". So: - Move the increment of nlmsvc_users into the function as well - rename to lockd_get(). It is now the inverse of lockd_put(). Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e5087b3d584f3e18bf43954a5bf068352f498bc7 Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 lockd: introduce lockd_put() [ Upstream commit 865b674069e05e5779fcf8cf7a166d2acb7e930b ] There is some cleanup that is duplicated in lockd_down() and the failure path of lockd_up(). Factor these out into a new lockd_put() and call it from both places. lockd_put() does *not* take the mutex - that must be held by the caller. It decrements nlmsvc_users and if that reaches zero, it cleans up. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8304dd04fb7b6c734c7ce2e0e09a164f890db6ee Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 lockd: move svc_exit_thread() into the thread [ Upstream commit 6a4e2527a63620a820c4ebf3596b57176da26fb3 ] The normal place to call svc_exit_thread() is from the thread itself just before it exits. Do this for lockd.
This means that nlmsvc_rqst is not used outside of lockd_start_svc(), so it can be made local to that function, and renamed to 'rqst'. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7077a007037517faf7006eec94dbf0c3d2b76137 Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 lockd: move lockd_start_svc() call into lockd_create_svc() [ Upstream commit b73a2972041bee70eb0cbbb25fa77828c63c916b ] lockd_start_svc() only needs to be called once, just after the svc is created. If the start fails, the svc is discarded too. It thus makes sense to call lockd_start_svc() from lockd_create_svc(). This allows us to remove the test against nlmsvc_rqst at the start of lockd_start_svc() - it must always be NULL. lockd_up() only held an extra reference on the svc until a thread was created - then it dropped it. The thread - and thus the extra reference - will remain until kthread_stop() is called. Now that the thread is created in lockd_create_svc(), the extra reference can be dropped there. So the 'serv' variable is no longer needed in lockd_up(). Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a389baad9137f26e50ba524ed1d3bcebd0daa93f Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 lockd: simplify management of network status notifiers [ Upstream commit 5a8a7ff57421b7de3ae72019938ffb5daaee36e7 ] Now that the network status notifiers use nlmsvc_serv rather than nlmsvc_rqst the management can be simplified. Notifier unregistration synchronises with any pending notifications so providing we unregister before nlm_serv is freed no further interlock is required. So we move the unregister call to just before the thread is killed (which destroys the service) and just before the service is destroyed in the failure-path of lockd_up(). Then nlm_ntf_refcnt and nlm_ntf_wq can be removed.
Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 32f3e5a70f283105cb1325dd82f39ed7cbf8059e Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 lockd: introduce nlmsvc_serv [ Upstream commit 2840fe864c91a0fe822169b1fbfddbcac9aeac43 ] lockd has two globals - nlmsvc_task and nlmsvc_rqst - but mostly it wants the 'struct svc_serv', and when it doesn't want it exactly it can get to what it wants from the serv. This patch is a first step to removing nlmsvc_task and nlmsvc_rqst. It introduces nlmsvc_serv to store the 'struct svc_serv*'. This is set as soon as the serv is created, and cleared only when it is destroyed. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d95899dadb4d7865b1457d1de09ac6923f0c9f1f Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 NFSD: simplify locking for network notifier. [ Upstream commit d057cfec4940ce6eeffa22b4a71dec203b06cd55 ] nfsd currently maintains an open-coded read/write semaphore (refcount and wait queue) for each network namespace to ensure the nfs service isn't shut down while the notifier is running. This is excessive. As there is unlikely to be contention between notifiers and they run without sleeping, a single spinlock is sufficient to avoid problems. Signed-off-by: NeilBrown [ cel: ensure nfsd_notifier_lock is static ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7149250beeea91f5de506e9a2f3195a385340210 Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 SUNRPC: discard svo_setup and rename svc_set_num_threads_sync() [ Upstream commit 3ebdbe5203a874614819700d3f470724cb803709 ] The ->svo_setup callback serves no purpose. It is always called from within the same module that chooses which callback is needed. So discard it and call the relevant function directly. Now that svc_set_num_threads() is no longer used remove it and rename svc_set_num_threads_sync() to remove the "_sync" suffix. 
Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 361452374168b572feae8f377f386564bdcb4308 Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 NFSD: Make it possible to use svc_set_num_threads_sync [ Upstream commit 3409e4f1e8f239f0ed81be0b068ecf4e73e2e826 ] nfsd cannot currently use svc_set_num_threads_sync. It instead uses svc_set_num_threads which does *not* wait for threads to all exit, and has a separate mechanism (nfsd_shutdown_complete) to wait for completion. The reason that nfsd is unlike other services is that nfsd threads can exit separately from svc_set_num_threads being called - they die on receipt of SIGKILL. Also, when the last thread exits, the service must be shut down (sockets closed). For this, the nfsd_mutex needs to be taken, and as that mutex needs to be held while svc_set_num_threads is called, the one cannot wait for the other. This patch changes the nfsd thread so that it can drop the ref on the service without blocking on nfsd_mutex, so that svc_set_num_threads_sync can be used: - if it can drop a non-last reference, it does that. This does not trigger shutdown and does not require a mutex. This will likely happen for all but the last thread signalled, and for all threads being shut down by nfsd_shutdown_threads() - if it can get the mutex without blocking (trylock), it does that and then drops the reference. This will likely happen for the last thread killed by SIGKILL - Otherwise there might be an unrelated task holding the mutex, possibly in another network namespace, or nfsd_shutdown_threads() might be just about to get a reference on the service, after which we can drop ours safely. We cannot conveniently get wakeup notifications on these events, and we are unlikely to need to, so we sleep briefly and check again. With this we can discard nfsd_shutdown_complete and nfsd_complete_shutdown(), and switch to svc_set_num_threads_sync. 
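The three cases listed above (drop a non-last reference, otherwise trylock, otherwise sleep briefly and retry) can be sketched in user space roughly as follows. This is a minimal model under stated assumptions, not the nfsd implementation: struct service_sketch, put_unless_last() and thread_put() are hypothetical names, a pthread mutex plays the role of nfsd_mutex, and "shutdown" is reduced to setting a flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a refcounted service. */
struct service_sketch {
	atomic_int refs;
	pthread_mutex_t *mutex;	/* plays the role of nfsd_mutex */
	bool shut_down;		/* set when the last reference is dropped */
};

/* Drop one reference only if it is not the last one. */
static bool put_unless_last(struct service_sketch *s)
{
	int old = atomic_load(&s->refs);

	while (old > 1) {
		/* On failure, 'old' is reloaded and the test is repeated. */
		if (atomic_compare_exchange_weak(&s->refs, &old, old - 1))
			return true;
	}
	return false;
}

/* Called by an exiting thread to give up its reference. */
static void thread_put(struct service_sketch *s)
{
	for (;;) {
		/* Case 1: not the last reference, no mutex needed. */
		if (put_unless_last(s))
			return;

		/* Case 2: last reference and the mutex is free right now. */
		if (pthread_mutex_trylock(s->mutex) == 0) {
			if (atomic_fetch_sub(&s->refs, 1) == 1)
				s->shut_down = true;
			pthread_mutex_unlock(s->mutex);
			return;
		}

		/* Case 3: someone else holds the mutex; sleep and retry. */
		struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };
		nanosleep(&ts, NULL);
	}
}
```

The retry loop never blocks on the mutex, which is the property the commit message relies on: whoever holds the mutex can wait for the threads to exit without deadlocking against them.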
Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6343271d5315f82c5fcb483e71c271a64fb96f83 Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 NFSD: narrow nfsd_mutex protection in nfsd thread [ Upstream commit 9d3792aefdcda71d20c2b1ecc589c17ae71eb523 ] There is nothing happening in the start of nfsd() that requires protection by the mutex, so don't take it until shutting down the thread - which does still require protection - but only for nfsd_put(). Signed-off-by: NeilBrown [ cel: address merge conflict with fd2468fa1301 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 61d12fc30a5e2fa2296f0afc759f023b1e7f3937 Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 SUNRPC: use sv_lock to protect updates to sv_nrthreads. [ Upstream commit 2a36395fac3b72771f87c3ee4387e3a96d85a7cc ] Using sv_lock means we don't need to hold the service mutex over these updates. In particular, svc_exit_thread() no longer requires synchronisation, so threads can exit asynchronously. Note that we could use an atomic_t, but as there are many more read sites than writes, that would add unnecessary noise to the code. Some reads are already racy, and there is no need for them to not be. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4efe0b9d11fc5d650d627dba4f43973da10e11ac Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 nfsd: make nfsd_stats.th_cnt atomic_t [ Upstream commit 9b6c8c9bebccd5fb785c306b948c08874a88874d ] This allows us to move the updates for th_cnt out of the mutex. This is a step towards reducing mutex coverage in nfsd(). 
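As a rough user-space analogue of making th_cnt atomic, the counter below can be updated by any thread without taking a lock. The names (struct stats_sketch, thread_started(), thread_exited(), thread_count()) are hypothetical, and C11 atomics stand in for the kernel's atomic_t.

```c
#include <stdatomic.h>

/* Hypothetical statistics block; th_cnt mirrors the field named above. */
struct stats_sketch {
	atomic_int th_cnt;	/* number of running threads */
};

/* Safe to call from any thread without holding a mutex. */
static void thread_started(struct stats_sketch *st)
{
	atomic_fetch_add(&st->th_cnt, 1);
}

static void thread_exited(struct stats_sketch *st)
{
	atomic_fetch_sub(&st->th_cnt, 1);
}

/* Readers get a snapshot; it may be stale by the time it is used. */
static int thread_count(struct stats_sketch *st)
{
	return atomic_load(&st->th_cnt);
}
```

Because each update is a single atomic operation, the increments and decrements no longer need to sit inside the region protected by the mutex.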
Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 17041f014060a759a890d601e75356cef3835043 Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 SUNRPC: stop using ->sv_nrthreads as a refcount [ Upstream commit ec52361df99b490f6af412b046df9799b92c1050 ] The use of sv_nrthreads as a general refcount results in clumsy code, as is seen by various comments needed to explain the situation. This patch introduces a 'struct kref' and uses that for reference counting, leaving sv_nrthreads to be a pure count of threads. The kref is managed particularly in svc_get() and svc_put(), and also nfsd_put(); svc_destroy() now takes a pointer to the embedded kref, rather than to the serv. nfsd allows the svc_serv to exist with ->sv_nrthreads being zero. This happens when a transport is created before the first thread is started. To support this, a 'keep_active' flag is introduced which holds a ref on the svc_serv. This is set when any listening socket is successfully added (unless there are running threads), and cleared when the number of threads is set. So when the last thread exits, the nfs_serv will be destroyed. The use of 'keep_active' replaces previous code which checked if there were any permanent sockets. We no longer clear ->rq_server when nfsd() exits. This was done to prevent svc_exit_thread() from calling svc_destroy(). Instead we take an extra reference to the svc_serv to prevent svc_destroy() from being called. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 64312a7c9fa1676afedbf1ad5e08a10eb272ad59 Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 SUNRPC/NFSD: clean up get/put functions. [ Upstream commit 8c62d12740a1450d2e8456d5747f440e10db281a ] svc_destroy() is poorly named - it doesn't necessarily destroy the svc, it might just reduce the ref count. nfsd_destroy() is poorly named for the same reason.
This patch: - removes the refcount functionality from svc_destroy(), moving it to a new svc_put(). Almost all previous callers of svc_destroy() now call svc_put(). - renames nfsd_destroy() to nfsd_put() and improves the code, using the new svc_destroy() rather than svc_put() - removes a few comments that explain the importance of balanced get/put calls. This should be obvious. The only non-trivial part of this is that svc_destroy() would call svc_sock_update() on a non-final decrement. It can no longer do that, and svc_put() isn't really a good place for it. This call is now made from svc_exit_thread() which seems like a good place. This makes the call *before* sv_nrthreads is decremented rather than after. This is not particularly important as the call just sets a flag which causes sv_nrthreads to be checked later. A subsequent patch will improve the ordering. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e9a4156137cf3051648b17512a77104009ec3a35 Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 SUNRPC: change svc_get() to return the svc. [ Upstream commit df5e49c880ea0776806b8a9f8ab95e035272cf6f ] It is common for 'get' functions to return the object that was 'got', and there are a couple of places where users of svc_get() would be a little simpler if svc_get() did that. Make it so. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e0bf8993522038f5f9881c966ca17bf8885a535f Author: NeilBrown Date: Mon Nov 29 15:51:25 2021 +1100 NFSD: handle errors better in write_ports_addfd() [ Upstream commit 89b24336f03a8ba560e96b0c47a8434a7fa48e3c ] If write_ports_add() fails, we shouldn't destroy the serv, unless we had only just created it. So if there are any permanent sockets already attached, leave the serv in place.
Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 307b391221ce5cac86554d5df8cff903921f06ee Author: Chuck Lever Date: Wed Oct 13 16:44:20 2021 -0400 NFSD: Fix sparse warning [ Upstream commit c2f1c4bd20621175c581f298b4943df0cffbd841 ] /home/cel/src/linux/linux/fs/nfsd/nfs4proc.c:1539:24: warning: incorrect type in assignment (different base types) /home/cel/src/linux/linux/fs/nfsd/nfs4proc.c:1539:24: expected restricted __be32 [usertype] status /home/cel/src/linux/linux/fs/nfsd/nfs4proc.c:1539:24: got int Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c59dc174b2e4bb1731aaf93f8fad23b9feab00f5 Author: Eric W. Biederman Date: Fri Dec 3 11:00:19 2021 -0600 exit: Rename module_put_and_exit to module_put_and_kthread_exit [ Upstream commit ca3574bd653aba234a4b31955f2778947403be16 ] Update module_put_and_exit to call kthread_exit instead of do_exit. Change the name to reflect this change in functionality. All of the users of module_put_and_exit are causing the current kthread to exit so this change makes it clear what is happening. There is no functional change. Signed-off-by: "Eric W. Biederman" Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 15606c8d5200d83cb50c26495bdb2ef00c0893fc Author: Eric W. Biederman Date: Mon Nov 22 10:27:36 2021 -0600 exit: Implement kthread_exit [ Upstream commit bbda86e988d4c124e4cfa816291cbd583ae8bfb1 ] The way the per task_struct exit_code is used by kernel threads is not quite compatible how it is used by userspace applications. The low byte of the userspace exit_code value encodes the exit signal. While kthreads just use the value as an int holding ordinary kernel function exit status like -EPERM. Add kthread_exit to clearly separate the two kinds of uses. Signed-off-by: "Eric W. 
Biederman" Stable-dep-of: ca3574bd653a ("exit: Rename module_put_and_exit to module_put_and_kthread_exit") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 63b8c1923117f483aa6d05bf8240cc7b22ba06a7 Author: Amir Goldstein Date: Mon Nov 29 22:15:37 2021 +0200 fanotify: wire up FAN_RENAME event [ Upstream commit 8cc3b1ccd930fe6971e1527f0c4f1bdc8cb56026 ] FAN_RENAME is the successor of FAN_MOVED_FROM and FAN_MOVED_TO and can be used to get the old and new parent+name information in a single event. FAN_MOVED_FROM and FAN_MOVED_TO are still supported for backward compatibility, but it makes little sense to use them together with FAN_RENAME in the same group. FAN_RENAME uses special info type records to report the old and new parent+name, so reporting only old and new parent id is less useful and was not implemented. Therefore, FAN_RENAME requires a group with flag FAN_REPORT_NAME. Link: https://lore.kernel.org/r/20211129201537.1932819-12-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a860dd8bf5712a3e4a569e36735cc7c4a9b5fa53 Author: Amir Goldstein Date: Mon Nov 29 22:15:36 2021 +0200 fanotify: report old and/or new parent+name in FAN_RENAME event [ Upstream commit 7326e382c21e9c23c89c88369afdc90b82a14da8 ] In the special case of FAN_RENAME event, we report old or new or both old and new parent+name. A single info record will be reported if either the old or new dir is watched and two records will be reported if both old and new dir (or their filesystem) are watched. The old and new parent+name are reported using new info record types FAN_EVENT_INFO_TYPE_{OLD,NEW}_DFID_NAME, so if a single info record is reported, it is clear to the application which dir entry the fid+name info refers to.
Link: https://lore.kernel.org/r/20211129201537.1932819-11-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c76fa8515949d5188a85956b57679e29795a6481 Author: Amir Goldstein Date: Mon Nov 29 22:15:35 2021 +0200 fanotify: record either old name new name or both for FAN_RENAME [ Upstream commit 2bfbcccde6e7a787feabad4645f628f963fe0663 ] We do not want to report the dirfid+name of a directory whose inode/sb are not watched, because watcher may not have permissions to see the directory content. Use an internal iter_info to indicate to fanotify_alloc_event() which marks of this group are watching FAN_RENAME, so it can decide if we need to record only the old parent+name, new parent+name or both. Link: https://lore.kernel.org/r/20211129201537.1932819-10-amir73il@gmail.com Signed-off-by: Amir Goldstein [JK: Modified code to pass around only mask of mark types matching generated event] Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit da527da33bcd5c8a36fadef60ae9569852f2137f Author: Amir Goldstein Date: Mon Nov 29 22:15:34 2021 +0200 fanotify: record old and new parent and name in FAN_RENAME event [ Upstream commit 3982534ba5ce45e890b2f5ef5e7372c1accd14c7 ] In the special case of FAN_RENAME event, we record both the old and new parent and name. Link: https://lore.kernel.org/r/20211129201537.1932819-9-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f59e978cfa9f20663e618294026676b35c2da033 Author: Amir Goldstein Date: Mon Nov 29 22:15:33 2021 +0200 fanotify: support secondary dir fh and name in fanotify_info [ Upstream commit 3cf984e950c1c3f41d407ed31db33beb996be132 ] Allow storing a secondary dir fh and name tuple in fanotify_info. This will be used to store the new parent and name information in FAN_RENAME event.
Link: https://lore.kernel.org/r/20211129201537.1932819-8-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 967ae137209ce2d983fe16cb5519ff6b21eecde2 Author: Amir Goldstein Date: Mon Nov 29 22:15:32 2021 +0200 fanotify: use helpers to parcel fanotify_info buffer [ Upstream commit 1a9515ac9e55e68d733bab81bd408463ab1e25b1 ] fanotify_info buffer is parceled into variable sized records, so the records must be written in order: dir_fh, file_fh, name. Use helpers to assert that order and make fanotify_alloc_name_event() a bit more generic to allow empty dir_fh record and to allow expanding to more records (i.e. name2) soon. Link: https://lore.kernel.org/r/20211129201537.1932819-7-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4e63ce91997a6ac8b8337e87aafb539590ba9ae9 Author: Amir Goldstein Date: Mon Nov 29 22:15:31 2021 +0200 fanotify: use macros to get the offset to fanotify_info buffer [ Upstream commit 2d9374f095136206a02eb0b6cd9ef94632c1e9f7 ] The fanotify_info buffer contains up to two file handles and a name. Use macros to simplify the code that accesses the different items within the buffer. Add assertions to verify that stored fh len and name len do not overflow the u8 stored value in fanotify_info header. Remove the unused fanotify_info_len() helper.
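The buffer layout described in the two fanotify_info changes above (records written in the order dir_fh, file_fh, name, with offsets derived from u8 lengths) might be modelled like this. It is a simplified sketch under assumptions: struct info_sketch, INFO_BUF_MAX, the INFO_*_OFFSET macros and the info_set_*() helpers are hypothetical and do not reproduce the kernel's struct fanotify_info.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INFO_BUF_MAX 128	/* hypothetical capacity for this sketch */

/* Variable-sized records packed back to back: dir_fh, file_fh, name. */
struct info_sketch {
	uint8_t dir_fh_len;
	uint8_t file_fh_len;
	uint8_t name_len;
	unsigned char buf[INFO_BUF_MAX];
};

/* Each offset is the sum of the lengths of the records before it. */
#define INFO_DIR_FH_OFFSET(i)	0
#define INFO_FILE_FH_OFFSET(i)	(INFO_DIR_FH_OFFSET(i) + (i)->dir_fh_len)
#define INFO_NAME_OFFSET(i)	(INFO_FILE_FH_OFFSET(i) + (i)->file_fh_len)

static void info_set_dir_fh(struct info_sketch *i, const void *fh, size_t len)
{
	/* dir_fh must be written before the records that follow it. */
	assert(i->file_fh_len == 0 && i->name_len == 0);
	assert(len <= UINT8_MAX && len <= INFO_BUF_MAX);
	memcpy(i->buf + INFO_DIR_FH_OFFSET(i), fh, len);
	i->dir_fh_len = (uint8_t)len;
}

static void info_set_file_fh(struct info_sketch *i, const void *fh, size_t len)
{
	/* file_fh must be written before the name. */
	assert(i->name_len == 0);
	assert(len <= UINT8_MAX && INFO_FILE_FH_OFFSET(i) + len <= INFO_BUF_MAX);
	memcpy(i->buf + INFO_FILE_FH_OFFSET(i), fh, len);
	i->file_fh_len = (uint8_t)len;
}

static void info_set_name(struct info_sketch *i, const char *name)
{
	size_t len = strlen(name);

	/* The length must fit in the u8 header field and in the buffer. */
	assert(len <= UINT8_MAX && INFO_NAME_OFFSET(i) + len + 1 <= INFO_BUF_MAX);
	memcpy(i->buf + INFO_NAME_OFFSET(i), name, len + 1);
	i->name_len = (uint8_t)len;
}

static const char *info_name(const struct info_sketch *i)
{
	return (const char *)i->buf + INFO_NAME_OFFSET(i);
}
```

Writing the records out of order would shift the later offsets under data that is already stored, which is why the helpers assert on the order rather than trusting the caller.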
Link: https://lore.kernel.org/r/20211129201537.1932819-6-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 580eb8de84704fe37a34b8a974f22a11ed385300 Author: Amir Goldstein Date: Mon Nov 29 22:15:30 2021 +0200 fsnotify: generate FS_RENAME event with rich information [ Upstream commit e54183fa7047c15819bc155f4c58501d9a9a3489 ] The dnotify FS_DN_RENAME event is used to request notification about a move within the same parent directory and was always coupled with the FS_MOVED_FROM event. Rename the FS_DN_RENAME event flag to FS_RENAME, decouple it from FS_MOVED_FROM and report it with the moved dentry instead of the moved inode, so it has the information about both old and new parent and name. Generate the FS_RENAME event regardless of same parent dir and apply the "same parent" rule in the generic fsnotify_handle_event() helper that is used to call backends with ->handle_inode_event() method (i.e. dnotify). The ->handle_inode_event() method is not rich enough to report both old and new parent and name anyway. The enriched event is reported to fanotify over the ->handle_event() method with the old and new dir inode marks in marks array slots for ITER_TYPE_INODE and a new iter type slot ITER_TYPE_INODE2. The enriched event will be used for reporting old and new parent+name to fanotify groups with FAN_RENAME events. Link: https://lore.kernel.org/r/20211129201537.1932819-5-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4e59c7b3e3b666822144fcb857c5009d921cda30 Author: Amir Goldstein Date: Mon Nov 29 22:15:29 2021 +0200 fanotify: introduce group flag FAN_REPORT_TARGET_FID [ Upstream commit d61fd650e9d206a71fda789f02a1ced4b19944c4 ] FAN_REPORT_FID is ambiguous in that it reports the fid of the child for some events and the fid of the parent for create/delete/move events. 
The new FAN_REPORT_TARGET_FID flag is an implicit request to report the fid of the target object of the operation (a.k.a. the child inode) also in create/delete/move events in addition to the fid of the parent and the name of the child. To reduce the test matrix for uninteresting use cases, the new FAN_REPORT_TARGET_FID flag requires both FAN_REPORT_NAME and FAN_REPORT_FID. The convenience macro FAN_REPORT_DFID_NAME_TARGET combines FAN_REPORT_TARGET_FID with all the required flags. Link: https://lore.kernel.org/r/20211129201537.1932819-4-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit be14cab43ddf78ba1cd34663ef6edf4b97e6a6b8 Author: Amir Goldstein Date: Mon Nov 29 22:15:28 2021 +0200 fsnotify: separate mark iterator type from object type enum [ Upstream commit 1c9007d62bea6fd164285314f7553f73e5308863 ] They are two different types that use the same enum, so this is confusing. Use the object type to indicate the type of object the mark is attached to and the iter type to indicate the type of watch. A group can have two different watches of the same object type (parent and child watches) that match the same event. Link: https://lore.kernel.org/r/20211129201537.1932819-3-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c0a5f0b561c8ab2af72ab23da5a6e13d6b5ee154 Author: Amir Goldstein Date: Mon Nov 29 22:15:27 2021 +0200 fsnotify: clarify object type argument [ Upstream commit ad69cd9972e79aba103ba5365de0acd35770c265 ] In preparation for separating object type from iterator type, rename some 'type' arguments in functions to 'obj_type' and remove the unused interface to clear marks by object type mask.
Link: https://lore.kernel.org/r/20211129201537.1932819-2-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9e291a6a28d32545ed2fd959a8165144d1724df1 Author: Chuck Lever Date: Thu Dec 16 11:12:11 2021 -0500 NFSD: Fix READDIR buffer overflow [ Upstream commit 53b1119a6e5028b125f431a0116ba73510d82a72 ] If a client sends a READDIR count argument that is too small (say, zero), then the buffer size calculation in the new init_dirlist helper functions results in an underflow, allowing the XDR stream functions to write beyond the actual buffer. This calculation has always been suspect. NFSD has never sanity- checked the READDIR count argument, but the old entry encoders managed the problem correctly. With the commits below, entry encoding changed, exposing the underflow to the pointer arithmetic in xdr_reserve_space(). Modern NFS clients attempt to retrieve as much data as possible for each READDIR request. Also, we have no unit tests that exercise the behavior of READDIR at the lower bound of @count values. Thus this case was missed during testing. Reported-by: Anatoly Trosinenko Fixes: f5dcccd647da ("NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream") Fixes: 7f87fc2d34d4 ("NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1abf3ec5587741cbc248c756a1b1bf933ad79506 Author: Chuck Lever Date: Sun Nov 14 15:16:04 2021 -0500 NFSD: Fix exposure in nfsd4_decode_bitmap() [ Upstream commit c0019b7db1d7ac62c711cda6b357a659d46428fe ] rtm@csail.mit.edu reports: > nfsd4_decode_bitmap4() will write beyond bmval[bmlen-1] if the RPC > directs it to do so. This can cause nfsd4_decode_state_protect4_a() > to write client-supplied data beyond the end of > nfsd4_exchange_id.spo_must_allow[] when called by > nfsd4_decode_exchange_id(). Rewrite the loops so nfsd4_decode_bitmap() cannot iterate beyond @bmlen. 
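The idea behind the rewritten loops, bounding the iteration by the destination size @bmlen instead of by the count the client supplies, can be illustrated in userspace. This is a hedged Python sketch, not the kernel code: `decode_bitmap()` is a hypothetical function that assumes the usual XDR encoding of a bitmap (a big-endian 32-bit word count followed by that many 32-bit words).

```python
import struct
from typing import List


def decode_bitmap(xdr: bytes, bmlen: int) -> List[int]:
    """Decode an XDR-encoded bitmap into exactly `bmlen` words.

    Hypothetical userspace model: words the client sent beyond `bmlen`
    are ignored, and words it did not send are zero-filled.
    """
    if len(xdr) < 4:
        raise ValueError("truncated bitmap: missing word count")
    (count,) = struct.unpack_from(">I", xdr, 0)
    if len(xdr) < 4 + 4 * count:
        raise ValueError("truncated bitmap: fewer words than advertised")
    words = struct.unpack_from(f">{count}I", xdr, 4)
    # The loop bound is the destination size, never the client's count,
    # so the result can never grow past `bmlen` entries.
    return [words[i] if i < count else 0 for i in range(bmlen)]
```

With this shape, a client that advertises five words to a three-word destination gets its first three words stored and the rest dropped, rather than overrunning the destination array.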
Reported by: rtm@csail.mit.edu Fixes: d1c263a031e8 ("NFSD: Replace READ* macros in nfsd4_decode_fattr()") Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 88ccda1a814323f6d7934a76bea43a2331ab1994 Author: J. Bruce Fields Date: Tue Oct 26 12:56:55 2021 -0400 nfsd4: remove obselete comment [ Upstream commit 80479eb862102f9513e93fcf726c78cc0be2e3b2 ] Mandatory locking has been removed. And the rest of this comment is redundant with the code. Reported-by: Jeff layton Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f4e9e9565e4240ba7bb7d903adbbc96ce758814f Author: Changcheng Deng Date: Tue Oct 19 04:14:22 2021 +0000 NFSD:fix boolreturn.cocci warning [ Upstream commit 291cd656da04163f4bba67953c1f2f823e0d1231 ] ./fs/nfsd/nfssvc.c: 1072: 8-9: :WARNING return of 0/1 in function 'nfssvc_decode_voidarg' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Reported-by: Zeal Robot Signed-off-by: Changcheng Deng Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 022723fe15074fd5132a4c55b1d98753d70bc53f Author: J. Bruce Fields Date: Fri Oct 15 14:42:11 2021 -0400 nfsd: update create verifier comment [ Upstream commit 2336d696862186fd4a6ddd1ea0cb243b3e32847c ] I don't know if that Solaris behavior matters any more or if it's still possible to look up that bug ID any more. The XFS behavior's definitely still relevant, though; any but the most recent XFS filesystems will lose the top bits. Reported-by: Frank S. Filz Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c7b0a9c75d3c832c919d7ff0d24b410ba4591074 Author: Chuck Lever Date: Wed Oct 13 10:41:13 2021 -0400 SUNRPC: Change return value type of .pc_encode [ Upstream commit 130e2054d4a652a2bd79fb1557ddcd19c053cb37 ] Returning an undecorated integer is an age-old trope, but it's not clear (even to previous experts in this code) that the only valid return values are 1 and 0. These functions do not return a negative errno, rpc_stat value, or a positive length. Document there are only two valid return values by having .pc_encode return only true or false. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 61cf6815070a8d27d40b31e4eb4e34fea10a76c5 Author: Chuck Lever Date: Wed Oct 13 10:41:06 2021 -0400 SUNRPC: Replace the "__be32 *p" parameter to .pc_encode [ Upstream commit fda494411485aff91768842c532f90fb8eb54943 ] The passed-in value of the "__be32 *p" parameter is now unused in every server-side XDR encoder, and can be removed. Note also that there is a line in each encoder that sets up a local pointer to a struct xdr_stream. Passing that pointer from the dispatcher instead saves one line per encoder function. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 47047d40af7b5b42d5055a19f66725d8517bb63a Author: Chuck Lever Date: Wed Oct 13 10:40:59 2021 -0400 NFSD: Save location of NFSv4 COMPOUND status [ Upstream commit 3b0ebb255fdc49a3d340846deebf045ef58ec744 ] Refactor: Currently nfs4svc_encode_compoundres() relies on the NFS dispatcher to pass in the buffer location of the COMPOUND status. Instead, save that buffer location in struct nfsd4_compoundres. The compound tag follows immediately after. Signed-off-by: Chuck Lever Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f747ce574c4a4baa65e9a15b4524245cecc12045 Author: Chuck Lever Date: Tue Oct 12 11:57:28 2021 -0400 SUNRPC: Change return value type of .pc_decode [ Upstream commit c44b31c263798ec34614dd394c31ef1a2e7e716e ] Returning an undecorated integer is an age-old trope, but it's not clear (even to previous experts in this code) that the only valid return values are 1 and 0. These functions do not return a negative errno, rpc_stat value, or a positive length. Document there are only two valid return values by having .pc_decode return only true or false. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0696b6b513a78d8c87aed664266dbc03846a08b4 Author: Chuck Lever Date: Tue Oct 12 11:57:22 2021 -0400 SUNRPC: Replace the "__be32 *p" parameter to .pc_decode [ Upstream commit 16c663642c7ec03cd4cee5fec520bb69e97babe4 ] The passed-in value of the "__be32 *p" parameter is now unused in every server-side XDR decoder, and can be removed. Note also that there is a line in each decoder that sets up a local pointer to a struct xdr_stream. Passing that pointer from the dispatcher instead saves one line per decoder function. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 396b359832e7db6da1ce0307a9628f86ccae65a0 Author: Chuck Lever Date: Thu Sep 30 17:06:21 2021 -0400 NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment() [ Upstream commit dae9a6cab8009e526570e7477ce858dcdfeb256e ] Refactor. Now that the NFSv2 and NFSv3 XDR decoders have been converted to use xdr_streams, the WRITE decoder functions can use xdr_stream_subsegment() to extract the WRITE payload into its own xdr_buf, just as the NFSv4 WRITE XDR decoder currently does. That makes it possible to pass the first kvec, pages array + length, page_base, and total payload length via a single function parameter. 
The payload's page_base is not yet assigned or used, but will be in subsequent patches. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c23b25dd19288db06d8167314cc89365ba649feb Author: Colin Ian King Date: Sat Sep 25 23:58:41 2021 +0100 NFSD: Initialize pointer ni with NULL and not plain integer 0 [ Upstream commit 8e70bf27fd20cc17e87150327a640e546bfbee64 ] Pointer ni is being initialized with plain integer zero. Fix this by initializing with NULL. Signed-off-by: Colin Ian King Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 67841880909219887688ee0ee1e53012c1be1569 Author: NeilBrown Date: Thu Sep 2 11:16:32 2021 +1000 NFSD: simplify struct nfsfh [ Upstream commit d8b26071e65e80a348602b939e333242f989221b ] Most of the fields in 'struct knfsd_fh' are 2 levels deep (a union and a struct) and are accessed using macros like: #define fh_FOO fh_base.fh_new.fb_FOO This patch makes the union and struct anonymous, so that "fh_FOO" can be a name directly within 'struct knfsd_fh' and the #defines aren't needed. The file handle as a whole is sometimes accessed as "fh_base" or "fh_base.fh_pad", neither of which are particularly helpful names. As the struct holding the filehandle is now anonymous, we cannot use the name of that, so we union it with 'fh_raw' and use that where the raw filehandle is needed. fh_raw also ensure the structure is large enough for the largest possible filehandle. fh_raw is a 'char' array, removing any need to cast it for memcpy etc. SVCFH_fmt() is simplified using the "%ph" printk format. This changes the appearance of filehandles in dprintk() debugging, making them a little more precise. Reviewed-by: Christoph Hellwig Signed-off-by: NeilBrown Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 25054b04ec92cceccf2c1da75c5abd52c8fdc5b7 Author: NeilBrown Date: Thu Sep 2 11:15:29 2021 +1000 NFSD: drop support for ancient filehandles [ Upstream commit c645a883df34ee10b884ec921e850def54b7f461 ] Filehandles not in the "new" or "version 1" format have not been handed out for new mounts since Linux 2.4, which was released 20 years ago. I think it is safe to say that no such file handles are still in use, and that we can drop support for them. Signed-off-by: NeilBrown Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 918bc45a57bcd0bfa5673cba8844765a466f8e7b Author: NeilBrown Date: Thu Sep 2 11:14:47 2021 +1000 NFSD: move filehandle format declarations out of "uapi". [ Upstream commit ef5825e3cf0d0af657f5fb4dd86d750ed42fee0a ] A small part of the declarations concerning the filehandle format is currently in the "uapi" include directory: include/uapi/linux/nfsd/nfsfh.h There is a lot more to the filehandle format, including "enum fid_type" and "enum nfsd_fsid" which are not exported via "uapi". This small part of the filehandle definition is of minimal use outside of the kernel, and I can find no evidence that any other code is using it. Certainly nfs-utils and wireshark (the most likely candidates) do not use these declarations. So move it out of "uapi" by copying the content from include/uapi/linux/nfsd/nfsfh.h into fs/nfsd/nfsfh.h A few unnecessary "#include" directives are not copied, and neither is the #define of fh_auth, which is annotated as being for userspace only. The copyright claims in the uapi file are identical to those in the nfsd file, so there is no need to copy those. The "__u32" style integer types are only needed in "uapi". In kernel-only code we can use the more familiar "u32" style. Signed-off-by: NeilBrown Signed-off-by: J.
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d2815110a7418a3155bd74e6a15db5ea6261d333 Author: Chuck Lever Date: Mon Sep 20 15:25:21 2021 -0400 NFSD: Optimize DRC bucket pruning [ Upstream commit 8847ecc9274a14114385d1cb4030326baa0766eb ] DRC bucket pruning is done by nfsd_cache_lookup(), which is part of every NFSv2 and NFSv3 dispatch (ie, it's done while the client is waiting). I added a trace_printk() in prune_bucket() to see just how long it takes to prune. Here are two ends of the spectrum: prune_bucket: Scanned 1 and freed 0 in 90 ns, 62 entries remaining prune_bucket: Scanned 2 and freed 1 in 716 ns, 63 entries remaining ... prune_bucket: Scanned 75 and freed 74 in 34149 ns, 1 entries remaining Pruning latency is noticeable on fast transports with fast storage. By noticeable, I mean that the latency measured here in the worst case is the same order of magnitude as the round trip time for cached server operations. We could do something like moving expired entries to an expired list and then free them later instead of freeing them right in prune_bucket(). But simply limiting the number of entries that can be pruned by a lookup is simple and retains more entries in the cache, making the DRC somewhat more effective. Comparison with a 70/30 fio 8KB 12 thread direct I/O test: Before: write: IOPS=61.6k, BW=481MiB/s (505MB/s)(14.1GiB/30001msec); 0 zone resets WRITE: 1848726 ops (30%) avg bytes sent per op: 8340 avg bytes received per op: 136 backlog wait: 0.635158 RTT: 0.128525 total execute time: 0.827242 (milliseconds) After: write: IOPS=63.0k, BW=492MiB/s (516MB/s)(14.4GiB/30001msec); 0 zone resets WRITE: 1891144 ops (30%) avg bytes sent per op: 8340 avg bytes received per op: 136 backlog wait: 0.616114 RTT: 0.126842 total execute time: 0.805348 (milliseconds) Signed-off-by: Chuck Lever Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2b2963c72c8ab618237d9bccf1fbfea6092ed8a0 Author: Chuck Lever Date: Sat Oct 16 18:02:57 2021 -0400 SUNRPC: Trace calls to .rpc_call_done [ Upstream commit b40887e10dcacc5e8ae3c1a99dcba20877c4831b ] Introduce a single tracepoint that can replace simple dprintk call sites in upper layer "rpc_call_done" callbacks. Example: kworker/u24:2-1254 [001] 771.026677: rpc_stats_latency: task:00000001@00000002 xid=0x16a6f3c0 rpcbindv2 GETPORT backlog=446 rtt=101 execute=555 kworker/u24:2-1254 [001] 771.026677: rpc_task_call_done: task:00000001@00000002 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpcb_getport_done kworker/u24:2-1254 [001] 771.026678: rpcb_setport: task:00000001@00000002 status=0 port=20048 Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2eda01447798f7ee30a26583a684bbd00ac0d54c Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:43 2021 -0300 fanotify: Allow users to request FAN_FS_ERROR events [ Upstream commit 9709bd548f11a092d124698118013f66e1740f9b ] Wire up the FAN_FS_ERROR event in the fanotify_mark syscall, allowing user space to request the monitoring of FAN_FS_ERROR events. These events are limited to filesystem marks, so check it is the case in the syscall handler. Link: https://lore.kernel.org/r/20211025192746.66445-29-krisman@collabora.com Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b0f01b7c080896cb32689de598622125fbd62bc3 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:42 2021 -0300 fanotify: Emit generic error info for error event [ Upstream commit 130a3c742107acff985541c28360c8b40203559c ] The error info is a record sent to users on FAN_FS_ERROR events documenting the type of error. 
It also carries an error count, documenting how many errors were observed since the last report. Link: https://lore.kernel.org/r/20211025192746.66445-28-krisman@collabora.com Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit aefd9029fa501e1bc661c1e436838959d996b075 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:41 2021 -0300 fanotify: Report fid info for file related file system errors [ Upstream commit 936d6a38be39177495af38497bf8da1c6128fa1b ] Plumb the pieces to add a FID report to error records. Since all error event memory must be pre-allocated, we pre-allocate the maximum file handle size possible, such that it should always fit. For errors that don't expose a file handle, report it with an invalid FID. Internally we use a zero-length FILEID_ROOT file handle for passing the information (which we report as a zero-length FILEID_INVALID file handle to userspace) so we update the handle reporting code to deal with this case correctly. Link: https://lore.kernel.org/r/20211025192746.66445-27-krisman@collabora.com Link: https://lore.kernel.org/r/20211025192746.66445-25-krisman@collabora.com Signed-off-by: Gabriel Krisman Bertazi Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara [Folded two patches into 2 to make series bisectable] Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bb247feb22d76161d28d67d80da06c6ca459fd34 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:40 2021 -0300 fanotify: WARN_ON against too large file handles [ Upstream commit 572c28f27a269f88e2d8d7b6b1507f114d637337 ] struct fanotify_error_event, at least, is preallocated and isn't able to handle arbitrarily large file handles. Future-proof the code by complaining loudly if a handle larger than MAX_HANDLE_SZ is ever found.
Link: https://lore.kernel.org/r/20211025192746.66445-26-krisman@collabora.com Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7fa20568b6e5150ce6ccf437769a92a8144fd407 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:38 2021 -0300 fanotify: Add helpers to decide whether to report FID/DFID [ Upstream commit 4bd5a5c8e6e5cd964e9738e6ef87f6c2cb453edf ] Now that there is an event that reports FID records even for a zeroed file handle, wrap the logic that decides whether to issue the records into helper functions. This shouldn't have any impact on the code, but simplifies further patches. Link: https://lore.kernel.org/r/20211025192746.66445-24-krisman@collabora.com Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Reviewed-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7935cf4070c42c9a49577221317d5e0198cf600c Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:37 2021 -0300 fanotify: Wrap object_fh inline space in a creator macro [ Upstream commit 2c5069433a3adc01ff9c5673567961bb7f138074 ] fanotify_error_event would duplicate this sequence of declarations that already exists elsewhere with a slightly different size. Create a helper macro to avoid code duplication. Link: https://lore.kernel.org/r/20211025192746.66445-23-krisman@collabora.com Suggested-by: Jan Kara Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b974c8aa0081321d1e908a79c2b025219edbe2de Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:36 2021 -0300 fanotify: Support merging of error events [ Upstream commit 8a6ae64132fd27a944faed7bc38484827609eb76 ] Error events (FAN_FS_ERROR) against the same file system can be merged by simply iterating the error count.
The hash is taken from the fsid, without considering the FH. This means that only the first error object is reported. Link: https://lore.kernel.org/r/20211025192746.66445-22-krisman@collabora.com Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9b98f4ff5186a06eac16bc81a337b15d49261653 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:35 2021 -0300 fanotify: Support enqueueing of error events [ Upstream commit 83e9acbe13dc1b767f91b5c1350f7a65689b26f6 ] Once an error event is triggered, enqueue it in the notification group, similarly to what is done for other events. FAN_FS_ERROR is not handled specially, since the memory is now handled by a preallocated mempool. For now, make the event unhashed. A future patch implements merging of this kind of event. Link: https://lore.kernel.org/r/20211025192746.66445-21-krisman@collabora.com Reviewed-by: Jan Kara Reviewed-by: Amir Goldstein Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 68aacb60a799ed556a1910d6df31cd89ff894dd0 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:34 2021 -0300 fanotify: Pre-allocate pool of error events [ Upstream commit 734a1a5eccc5f7473002b0669f788e135f1f64aa ] Pre-allocate slots for file system errors to have greater chances of succeeding, since error events can happen in GFP_NOFS context. This patch introduces a group-wide mempool of error events, shared by all FAN_FS_ERROR marks in this group. 
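The pre-allocation idea can be sketched in userspace: reserve a fixed number of event slots up front so that reporting an error never depends on a fresh allocation succeeding. The Python model below is an illustration only. `ErrorEvent` and `EventPool` are hypothetical names, and the sketch covers just the fixed reserve; it does not model the kernel mempool API or its allocation-context rules.

```python
from typing import List, Optional


class ErrorEvent:
    """Hypothetical stand-in for a preallocated error event slot."""

    def __init__(self) -> None:
        self.error = 0
        self.err_count = 0


class EventPool:
    """Fixed reserve of events, created up front and shared by one group."""

    def __init__(self, slots: int) -> None:
        self._free: List[ErrorEvent] = [ErrorEvent() for _ in range(slots)]

    def alloc(self) -> Optional[ErrorEvent]:
        # Never creates a new object: either a reserved slot is free or
        # the caller is told that the reserve is exhausted.
        return self._free.pop() if self._free else None

    def free(self, event: ErrorEvent) -> None:
        # Reset the slot before returning it to the reserve.
        event.error = 0
        event.err_count = 0
        self._free.append(event)
```

Because the reserve belongs to the group rather than to an individual mark, every mark in the group draws from, and returns to, the same set of slots.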
Link: https://lore.kernel.org/r/20211025192746.66445-20-krisman@collabora.com Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit eec22d03a98e1e10e0754259e2d56234dc6891de Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:33 2021 -0300 fanotify: Reserve UAPI bits for FAN_FS_ERROR [ Upstream commit 8d11a4f43ef4679be0908026907a7613b33d7127 ] FAN_FS_ERROR allows reporting of event type FS_ERROR to userspace, which is a mechanism to report file system wide problems via fanotify. This commit preallocates userspace-visible bits to match the FS_ERROR event. Link: https://lore.kernel.org/r/20211025192746.66445-19-krisman@collabora.com Reviewed-by: Jan Kara Reviewed-by: Amir Goldstein Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit badbf879decac83029ad22b9f2a38ee8fa489fd6 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:32 2021 -0300 fsnotify: Support FS_ERROR event type [ Upstream commit 9daa811073fa19c08e8aad3b90f9235fed161acf ] Expose a new type of fsnotify event for filesystems to report errors for userspace monitoring tools. fanotify will send this type of notification for FAN_FS_ERROR events. This also introduces a helper for generating the new event. Link: https://lore.kernel.org/r/20211025192746.66445-18-krisman@collabora.com Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8ccc724f50706836092aa361c5a31677b4873bcc Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:31 2021 -0300 fanotify: Require fid_mode for any non-fd event [ Upstream commit 4fe595cf1c80e7a5af4d00c4da29def64aff57a2 ] Like inode events, FAN_FS_ERROR will require fid mode. Therefore, convert the verification during fanotify_mark(2) to require fid for any non-fd event.
This means fid_mode will not only be required for inode events, but for any event that doesn't provide a descriptor. Link: https://lore.kernel.org/r/20211025192746.66445-17-krisman@collabora.com Suggested-by: Amir Goldstein Reviewed-by: Jan Kara Reviewed-by: Amir Goldstein Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2f65be620948af5e7308e23faaa752177e1be3f4 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:30 2021 -0300 fanotify: Encode empty file handle when no inode is provided [ Upstream commit 272531ac619b374ab474e989eb387162fded553f ] Instead of failing, encode an invalid file handle in fanotify_encode_fh if no inode is provided. This bogus file handle will be reported by FAN_FS_ERROR for non-inode errors. Link: https://lore.kernel.org/r/20211025192746.66445-16-krisman@collabora.com Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 86bda2d7525206651acd055276981459d20724df Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:29 2021 -0300 fanotify: Allow file handle encoding for unhashed events [ Upstream commit 74fe4734897a2da2ae2a665a5e622cd490d36eaf ] Allow passing a NULL hash to fanotify_encode_fh and avoid calculating the hash if not needed. Link: https://lore.kernel.org/r/20211025192746.66445-15-krisman@collabora.com Reviewed-by: Jan Kara Reviewed-by: Amir Goldstein Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 44ce59c254109b9692db2e925b2e129e12989811 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:28 2021 -0300 fanotify: Support null inode event in fanotify_dfid_inode [ Upstream commit 12f47bf0f0990933d95d021d13d31bda010648fd ] FAN_FS_ERROR doesn't support DFID, but this function is still called for every event. 
The problem is that it is not capable of handling null inodes, which now can happen in case of superblock error events. For this case, just returning dir will be enough. Link: https://lore.kernel.org/r/20211025192746.66445-14-krisman@collabora.com Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 313234a93ea1517df516608925cc431996ac5709 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:27 2021 -0300 fsnotify: Pass group argument to free_event [ Upstream commit 330ae77d2a5b0af32c0f29e139bf28ec8591de59 ] For group-wide mempool backed events, like FS_ERROR, the free_event callback will need to reference the group's mempool to free the memory. Wire that argument into the current callers. Link: https://lore.kernel.org/r/20211025192746.66445-13-krisman@collabora.com Reviewed-by: Jan Kara Reviewed-by: Amir Goldstein Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c9f9d99ea4c3f3e00979f042f5039790c32748f1 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:26 2021 -0300 fsnotify: Protect fsnotify_handle_inode_event from no-inode events [ Upstream commit 24dca90590509a7a6cbe0650100c90c5b8a3468a ] FAN_FS_ERROR allows events without inodes - i.e. for file system-wide errors. Even though fsnotify_handle_inode_event is not currently used by fanotify, this patch protects other backends from cases where neither inode or dir are provided. Also document the constraints of the interface (inode and dir cannot be both NULL). 
Link: https://lore.kernel.org/r/20211025192746.66445-12-krisman@collabora.com Suggested-by: Amir Goldstein Signed-off-by: Gabriel Krisman Bertazi Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5c4ce075c92b6714b74c9a68abc78bde6d72b96b Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:25 2021 -0300 fsnotify: Retrieve super block from the data field [ Upstream commit 29335033c574a15334015d8c4e36862cff3d3384 ] Some file system events (i.e. FS_ERROR) might not be associated with an inode or directory. For these, we can retrieve the super block from the data field. But, since the super_block is available in the data field on every event type, simplify the code to always retrieve it from there, through a new helper. Link: https://lore.kernel.org/r/20211025192746.66445-11-krisman@collabora.com Suggested-by: Jan Kara Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 44844158eea621f8b9d3485196310ef48caa9a7b Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:24 2021 -0300 fsnotify: Add wrapper around fsnotify_add_event [ Upstream commit 1ad03c3a326a86e259389592117252c851873395 ] fsnotify_add_event is growing in number of parameters, which in most case are just passed a NULL pointer. So, split out a new fsnotify_insert_event function to clean things up for users who don't need an insert hook. 
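The refactoring pattern described here, one full-featured entry point plus a thin wrapper for callers that have no insert hook, can be sketched generically. The Python functions below are a hypothetical model: `insert_event()` and `add_event()` only mirror the shape of the split, and their boolean return convention is an assumption made for this illustration, not the kernel's.

```python
from typing import Callable, List, Optional

Event = str
MergeHook = Callable[[List[Event], Event], bool]
InsertHook = Callable[[List[Event], Event], None]


def insert_event(queue: List[Event], event: Event,
                 merge: Optional[MergeHook] = None,
                 insert: Optional[InsertHook] = None) -> bool:
    """Full entry point: returns True if queued, False if merged away."""
    if merge is not None and merge(queue, event):
        return False
    queue.append(event)
    if insert is not None:
        # Only callers that maintain extra state pass an insert hook.
        insert(queue, event)
    return True


def add_event(queue: List[Event], event: Event,
              merge: Optional[MergeHook] = None) -> bool:
    """Thin wrapper for the common case of no insert hook."""
    return insert_event(queue, event, merge, None)
```

Callers that previously passed a NULL hook now call the shorter wrapper, and only the callers that need the hook spell out the extra argument.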
Link: https://lore.kernel.org/r/20211025192746.66445-10-krisman@collabora.com Suggested-by: Amir Goldstein Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 24eda1b5e6f6e17fc4d8a1962dac8b8daaff977e Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:23 2021 -0300 fsnotify: Add helper to detect overflow_event [ Upstream commit 808967a0a4d2f4ce6a2005c5692fffbecaf018c1 ] Similarly to fanotify_is_perm_event and friends, provide a helper predicate to say whether a mask is of an overflow event. Link: https://lore.kernel.org/r/20211025192746.66445-9-krisman@collabora.com Suggested-by: Amir Goldstein Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7c9ba74cb30b1073783a6edbd5aacf14f4714fd8 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:22 2021 -0300 inotify: Don't force FS_IN_IGNORED [ Upstream commit e0462f91d24756916fded4313d508e0fc52f39c9 ] According to Amir: "FS_IN_IGNORED is completely internal to inotify and there is no need to set it in i_fsnotify_mask at all, so if we remove the bit from the output of inotify_arg_to_mask() no functionality will change and we will be able to overload the event bit for FS_ERROR." This is done in preparation to overload FS_ERROR with the notification mechanism in fanotify. 
Link: https://lore.kernel.org/r/20211025192746.66445-8-krisman@collabora.com Suggested-by: Amir Goldstein Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9539a89f28ed814fe3fd2d89cf011fb78d405f8e Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:21 2021 -0300 fanotify: Split fsid check from other fid mode checks [ Upstream commit 8299212cbdb01a5867e230e961f82e5c02a6de34 ] FAN_FS_ERROR will require fsid, but not necessarily require the filesystem to expose a file handle. Split those checks into different functions, so they can be used separately when setting up an event. While there, update a comment about tmpfs having 0 fsid, which is no longer true. Link: https://lore.kernel.org/r/20211025192746.66445-7-krisman@collabora.com Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 326be73a59858d4aa51fd4eeea98f25ef886a531 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:20 2021 -0300 fanotify: Fold event size calculation to its own function [ Upstream commit b9928e80dda84b349ba8de01780b9bef2fc36ffa ] Every time this function is invoked, it is immediately added to FAN_EVENT_METADATA_LEN, since there is no need to just calculate the length of info records. This minor clean up folds the rest of the calculation into the function, which now operates in terms of events, returning the size of the entire event, including metadata. 
Link: https://lore.kernel.org/r/20211025192746.66445-6-krisman@collabora.com Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7fee789540e920f294955603e44197f397b997c5 Author: Gabriel Krisman Bertazi Date: Mon Oct 25 16:27:19 2021 -0300 fsnotify: Don't insert unmergeable events in hashtable [ Upstream commit cc53b55f697fe5aa98bdbfdfe67c6401da242155 ] Some events, like the overflow event, are not mergeable, so they are not hashed. But, when failing inside fsnotify_add_event for lack of space, fsnotify_add_event() still calls the insert hook, which adds the overflow event to the merge list. Add a check to prevent any kind of unmergeable event from being inserted in the hashtable. Fixes: 94e00d28a680 ("fsnotify: use hash table for faster events merge") Link: https://lore.kernel.org/r/20211025192746.66445-5-krisman@collabora.com Reviewed-by: Amir Goldstein Reviewed-by: Jan Kara Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 60b6dab8c81eb058e506456c4650d067bfb57fe2 Author: Amir Goldstein Date: Mon Oct 25 16:27:18 2021 -0300 fsnotify: clarify contract for create event hooks [ Upstream commit dabe729dddca550446e9cc118c96d1f91703345b ] Clarify argument names and contract for fsnotify_create() and fsnotify_mkdir() to reflect the anomaly of kernfs, which leaves dentries negative after mkdir/create. Remove the WARN_ON(!inode) in audit code that was added by the Fixes commit under the wrong assumption that dentries cannot be negative after mkdir/create.
Fixes: aa93bdc5500c ("fsnotify: use helpers to access data by data_type") Link: https://lore.kernel.org/linux-fsdevel/87mtp5yz0q.fsf@collabora.com/ Link: https://lore.kernel.org/r/20211025192746.66445-4-krisman@collabora.com Reviewed-by: Jan Kara Reported-by: Gabriel Krisman Bertazi Signed-off-by: Amir Goldstein Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9601d20734061ea0891e9effaf36eeeab0dad897 Author: Amir Goldstein Date: Mon Oct 25 16:27:17 2021 -0300 fsnotify: pass dentry instead of inode data [ Upstream commit fd5a3ff49a19aa69e2bc1e26e98037c2d778e61a ] Define a new data type to pass for event - FSNOTIFY_EVENT_DENTRY. Use it to pass the dentry instead of its ->d_inode where available. This is needed in preparation for the refactor to retrieve the super block from the data field. In some cases (e.g. mkdir in kernfs), the data inode comes from a negative dentry, such that no super block information would be available. By receiving the dentry itself, instead of the inode, fsnotify can derive the super block even in these cases. Link: https://lore.kernel.org/r/20211025192746.66445-3-krisman@collabora.com Reviewed-by: Jan Kara Signed-off-by: Amir Goldstein [Expand explanation in commit message] Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f114860f727950e8227b0bafe26216e71a681ce4 Author: Amir Goldstein Date: Mon Oct 25 16:27:16 2021 -0300 fsnotify: pass data_type to fsnotify_name() [ Upstream commit 9baf93d68bcc3d0a6042283b82603c076e25e4f5 ] Align the arguments of fsnotify_name() to those of fsnotify().
Link: https://lore.kernel.org/r/20211025192746.66445-2-krisman@collabora.com Reviewed-by: Jan Kara Signed-off-by: Amir Goldstein Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jan Kara [ cel: adjust fsnotify_delete as well, a37d9a17f099 is already applied ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6719531e67134984b951948563f47740271b4b74 Author: Trond Myklebust Date: Thu Sep 30 15:44:42 2021 -0400 nfsd: Fix a warning for nfsd_file_close_inode [ Upstream commit 19598141f40dff728dd50799e510805261f48850 ] Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7918a95bc2267f6a6404ed4d580c19e768be9e43 Author: Chuck Lever Date: Thu Sep 16 17:24:54 2021 -0400 NLM: Fix svcxdr_encode_owner() [ Upstream commit 89c485c7a3ecbc2ebd568f9c9c2edf3a8cf7485b ] Dai Ngo reports that, since the XDR overhaul, the NLM server crashes when the TEST procedure wants to return NLM_DENIED. There is a bug in svcxdr_encode_owner() that none of our standard test cases found. Replace the open-coded function with a call to an appropriate pre-fabricated XDR helper. Reported-by: Dai Ngo Fixes: a6a63ca5652e ("lockd: Common NLM XDR helpers") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b801327ba3c3b3672db67f3efdbc299bae6ece8b Author: Amir Goldstein Date: Thu Sep 9 14:56:34 2021 +0300 fsnotify: fix sb_connectors leak [ Upstream commit 4396a73115fc8739083536162e2228c0c0c3ed1a ] Fix a leak in s_fsnotify_connectors counter in case of a race between concurrent add of new fsnotify mark to an object. The task that lost the race fails to drop the counter before freeing the unused connector. Following umount() hangs in fsnotify_sb_delete()/wait_var_event(), because s_fsnotify_connectors never drops to zero. 
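The rule this fix enforces can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions, not the kernel code: sb_model, conn_model, obj_model, attach_connector() and detach_connector() are hypothetical stand-ins, and the connectors field merely models s_fsnotify_connectors. The point shown is that a task which counted its connector and then lost the attach race must drop the count again before freeing the unused connector.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct super_block. */
struct sb_model {
	atomic_long connectors;             /* models s_fsnotify_connectors */
};

/* Hypothetical stand-in for the fsnotify mark connector. */
struct conn_model {
	struct sb_model *sb;
};

/* Hypothetical object (inode, mount or sb) that can carry one connector. */
struct obj_model {
	struct sb_model *sb;
	_Atomic(struct conn_model *) conn;  /* NULL until a connector is attached */
};

/* Attach a connector unless one is already attached; return the attached one. */
static struct conn_model *attach_connector(struct obj_model *obj)
{
	struct conn_model *expected = NULL;
	struct conn_model *conn = malloc(sizeof(*conn));

	if (!conn)
		return NULL;
	conn->sb = obj->sb;
	atomic_fetch_add(&obj->sb->connectors, 1);  /* count the new connector */

	if (!atomic_compare_exchange_strong(&obj->conn, &expected, conn)) {
		/* Lost the race: drop the count before freeing the unused connector. */
		atomic_fetch_sub(&obj->sb->connectors, 1);
		free(conn);
		return expected;                    /* connector installed by the winner */
	}
	return conn;
}

/* Detach and free the connector, dropping the per-sb count. */
static void detach_connector(struct obj_model *obj)
{
	struct conn_model *conn = atomic_exchange(&obj->conn, NULL);

	if (conn) {
		atomic_fetch_sub(&conn->sb->connectors, 1);
		free(conn);
	}
}
```

Without the atomic_fetch_sub() in the losing branch, the counter in this model would stay above zero after every connector is detached, which corresponds to the umount() hang described above.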
Fixes: ec44610fe2b8 ("fsnotify: count all objects with attached connectors") Reported-by: Murphy Zhou Link: https://lore.kernel.org/linux-fsdevel/20210907063338.ycaw6wvhzrfsfdlp@xzhoux.usersys.redhat.com/ Signed-off-by: Amir Goldstein Signed-off-by: Linus Torvalds Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1773901afb33b28fce99c978cffae598210f689b Author: Chuck Lever Date: Thu Jul 15 15:52:31 2021 -0400 NFS: Remove unused callback void decoder [ Upstream commit c35a810ce59524971c4a3b45faed4d0121e5a305 ] Clean up: The callback RPC dispatcher no longer invokes these call outs, although svc_process_common() relies on seeing a .pc_encode function. Signed-off-by: Chuck Lever Signed-off-by: Anna Schumaker Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit edf220fe151691f35ee19f86d32291f932d780f7 Author: Chuck Lever Date: Thu Jul 15 15:52:25 2021 -0400 NFS: Add a private local dispatcher for NFSv4 callback operations [ Upstream commit 7d34c96217cf3c2d37ca0a56ca0bc3c3bef1e189 ] The client's NFSv4 callback service is the only remaining user of svc_generic_dispatch(). Note that the NFSv4 callback service doesn't use the .pc_encode and .pc_decode callouts in any substantial way, so they are removed. Signed-off-by: Chuck Lever Signed-off-by: Anna Schumaker Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 91bbbffece637eccb3e744670cb3b5877060aabe Author: Chuck Lever Date: Thu Jul 15 15:52:19 2021 -0400 SUNRPC: Eliminate the RQ_AUTHERR flag [ Upstream commit 9082e1d914f8b27114352b1940bbcc7522f682e7 ] Now that there is an alternate method for returning an auth_stat value, replace the RQ_AUTHERR flag with use of that new method. 
Signed-off-by: Chuck Lever Signed-off-by: Anna Schumaker Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit febf43bcdc2bbb5aef8d27df5104288562c765fe Author: Chuck Lever Date: Thu Jul 15 15:52:12 2021 -0400 SUNRPC: Set rq_auth_stat in the pg_authenticate() callout [ Upstream commit 5c2465dfd457f3015eebcc3ace50570e1d896aeb ] In a few moments, rq_auth_stat will need to be explicitly set to rpc_auth_ok before execution gets to the dispatcher. svc_authenticate() already sets it, but it often gets reset to rpc_autherr_badcred right after that call, even when authentication is successful. Let's ensure that the pg_authenticate callout and svc_set_client() set it properly in every case. Signed-off-by: Chuck Lever Signed-off-by: Anna Schumaker Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a96da583ff54c0935ed44a80d79e54f3c84e8843 Author: Chuck Lever Date: Thu Jul 15 15:52:06 2021 -0400 SUNRPC: Add svc_rqst::rq_auth_stat [ Upstream commit 438623a06bacd69c40c4af633bb09a3bbb9dfc78 ] I'd like to take commit 4532608d71c8 ("SUNRPC: Clean up generic dispatcher code") even further by using only private local SVC dispatchers for all kernel RPC services. This change would enable the removal of the logic that switches between svc_generic_dispatch() and a service's private dispatcher, and simplify the invocation of the service's pc_release method so that humans can visually verify that it is always invoked properly. All that will come later. First, let's provide a better way to return authentication errors from SVC dispatcher functions. Instead of overloading the dispatch method's *statp argument, add a field to struct svc_rqst that can hold an error value. Signed-off-by: Chuck Lever Signed-off-by: Anna Schumaker Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit efea5d558ef3a24b9938fed1668931c464461a0d Author: J. 
Bruce Fields Date: Fri Aug 20 17:02:06 2021 -0400 nfs: don't allow reexport reclaims [ Upstream commit bb0a55bb7148a49e549ee992200860e7a040d3a5 ] In the reexport case, nfsd is currently passing along locks with the reclaim bit set. The client sends a new lock request, which is granted if there's currently no conflict--even if it's possible a conflicting lock could have been briefly held in the interim. We don't currently have any way to safely grant reclaim, so for now let's just deny them all. I'm doing this by passing the reclaim bit to nfs and letting it fail the call, with the idea that eventually the client might be able to do something more forgiving here. Signed-off-by: J. Bruce Fields Acked-by: Anna Schumaker Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bd5b3deed01afe27154465f06c973b4dbea6b8c3 Author: J. Bruce Fields Date: Fri Aug 20 17:02:05 2021 -0400 lockd: don't attempt blocking locks on nfs reexports [ Upstream commit b840be2f00c0bc00d993f8f76e251052b83e4382 ] As in the v4 case, it doesn't work well to block waiting for a lock on an nfs filesystem. As in the v4 case, that means we're depending on the client to poll. It's probably incorrect to depend on that, but I *think* clients do poll in practice. In any case, it's an improvement over hanging the lockd thread indefinitely as we currently are. Signed-off-by: J. Bruce Fields Acked-by: Anna Schumaker Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5ea5be84ddd7bb81277598512b11f2116b566bcd Author: J. Bruce Fields Date: Fri Aug 20 17:02:04 2021 -0400 nfs: don't atempt blocking locks on nfs reexports [ Upstream commit f657f8eef3ff870552c9fd2839e0061046f44618 ] NFS implements blocking locks by blocking inside its lock method. In the reexport case, this blocks the nfs server thread, which could lead to deadlocks since an nfs server thread might be required to unlock the conflicting lock. 
It also causes a crash, since the nfs server thread assumes it can free the lock when its lm_notify lock callback is called. Ideal would be to make the nfs lock method return without blocking in this case, but for now it works just not to attempt blocking locks. The difference is just that the original client will have to poll (as it does in the v4.0 case) instead of getting a callback when the lock's available. Signed-off-by: J. Bruce Fields Acked-by: Anna Schumaker Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e580323ac0b51ad10ec2e181d1f777479b7983e7 Author: J. Bruce Fields Date: Mon Aug 23 16:44:00 2021 -0400 Keep read and write fds with each nlm_file [ Upstream commit 7f024fcd5c97dc70bb9121c80407cf3cf9be7159 ] We shouldn't really be using a read-only file descriptor to take a write lock. Most filesystems will put up with it. But NFS, for example, won't. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b4bf52174b4f39b0e8db01568914920a3d2e80d0 Author: J. Bruce Fields Date: Fri Aug 20 17:02:02 2021 -0400 lockd: update nlm_lookup_file reexport comment [ Upstream commit b661601a9fdf1af8516e1100de8bba84bd41cca4 ] Update comment to reflect that we *do* allow reexport, whether it's a good idea or not.... Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 14c2a0fad5410aac061916e49042d9729db52852 Author: J. Bruce Fields Date: Mon Aug 23 11:26:39 2021 -0400 nlm: minor refactoring [ Upstream commit a81041b7d8f08c4e1014173c5483a0f18724a576 ] Make this lookup slightly more concise, and prepare for changing how we look this up in a following patch. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3fbc744783dd3be199ded1d8f84ca1d283e0467f Author: J. 
Bruce Fields Date: Mon Aug 23 12:01:18 2021 -0400 nlm: minor nlm_lookup_file argument change [ Upstream commit 2dc6f19e4f438d4c14987cb17aee38aaf7304e7f ] It'll come in handy to get the whole nlm_lock. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 860f01260e5379ea15b90adb86edc22e9c7da3b6 Author: Jia He Date: Tue Aug 3 12:59:37 2021 +0200 lockd: change the proc_handler for nsm_use_hostnames [ Upstream commit d02a3a2cb25d384005a6e3446a445013342024b7 ] nsm_use_hostnames is a module parameter and it will be exported to sysctl procfs. This lets the user change it from userspace. But the minimal unit for sysctl procfs read/write is sizeof(int). On big-endian systems, converting between bool and int causes errors for proc items. This patch uses a new proc_handler, proc_dobool, to fix it. Signed-off-by: Jia He Reviewed-by: Pan Xinhui [thuth: Fix typo in commit message] Signed-off-by: Thomas Huth Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f469e60f9a0fd2b211b723dbd19c8f87294da3e3 Author: Jia He Date: Tue Aug 3 12:59:36 2021 +0200 sysctl: introduce new proc handler proc_dobool [ Upstream commit a2071573d6346819cc4e5787b4206f2184985160 ] This lets bool variables be displayed correctly in sysctl procfs on both big- and little-endian systems. sizeof(bool) is arch dependent, proc_dobool should work on all arches. Suggested-by: Pan Xinhui Signed-off-by: Jia He [thuth: rebased the patch to the current kernel version] Signed-off-by: Thomas Huth Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 130dcbf77a7ec948fd3f6c4c90d7a788e685f7c6 Author: NeilBrown Date: Wed Jul 28 08:56:09 2021 +1000 NFSD: remove vanity comments [ Upstream commit ea49dc79002c416a9003f3204bc14f846a0dbcae ] Including one's name in copyright claims is appropriate. Including it in random comments is just vanity. After 2 decades, it is time for these to be gone.
Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 86df138e8d4d378b4b7c61632ef5fb243672e02f Author: Chuck Lever Date: Mon Jun 28 17:24:27 2021 -0400 NFSD: Batch release pages during splice read [ Upstream commit 496d83cf0f2fa70cfe256c2499e2d3523d3868f3 ] Large splice reads call put_page() repeatedly. put_page() is relatively expensive to call, so replace it with the new svc_rqst_replace_page() helper to help amortize that cost. Signed-off-by: Chuck Lever Reviewed-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a4f616afb4ee5ad56ac37a640bd9b1613d463113 Author: Chuck Lever Date: Thu Jul 1 10:03:10 2021 -0400 SUNRPC: Add svc_rqst_replace_page() API [ Upstream commit 2f0f88f42f2eab0421ed37d7494de9124fdf0d34 ] Replacing a page in rq_pages[] requires a get_page(), which is a bus-locked operation, and a put_page(), which can be even more costly. To reduce the cost of replacing a page in rq_pages[], batch the put_page() operations by collecting "freed" pages in a pagevec, and then release those pages when the pagevec is full. This pagevec is also emptied when each RPC completes. [ cel: adjusted to apply without f6e70aab9dfe ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9e5f2e0ae0196ff8be932f26f5d4e5cd4e572e69 Author: Chuck Lever Date: Mon Jun 28 16:34:20 2021 -0400 NFSD: Clean up splice actor [ Upstream commit c7e0b781b73c2e26e442ed71397cc2bc5945a732 ] A few useful observations: - The value in @size is never modified. - splice_desc.len is an unsigned int, and so is xdr_buf.page_len. An implicit cast to size_t is unnecessary. - The computation of .page_len is the same in all three arms of the "if" statement, so hoist it out to make it clear that the operation is an unconditional invariant. The resulting function is 18 bytes shorter on my system (-Os). 
Signed-off-by: Chuck Lever Reviewed-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 860893f9e35178d6894f190b8476ce7e1fb9481f Author: Amir Goldstein Date: Tue Aug 10 18:12:20 2021 +0300 fsnotify: optimize the case of no marks of any type [ Upstream commit e43de7f0862b8598cd1ef440e3b4701cd107ea40 ] Add a simple check in the inline helpers to avoid calling fsnotify() and __fsnotify_parent() in case there are no marks of any type (inode/sb/mount) for an inode's sb, so there can be no objects of any type interested in the event. Link: https://lore.kernel.org/r/20210810151220.285179-5-amir73il@gmail.com Reviewed-by: Matthew Bobrowski Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9917e1bda3d7a4c5ac9a4ec7795a2fceb0d6f8fa Author: Amir Goldstein Date: Tue Aug 10 18:12:19 2021 +0300 fsnotify: count all objects with attached connectors [ Upstream commit ec44610fe2b86daef70f3f53f47d2a2542d7094f ] Rename s_fsnotify_inode_refs to s_fsnotify_connectors and count all objects with attached connectors, not only inodes with attached connectors. This will be used to optimize fsnotify() calls on sb without any type of marks. Link: https://lore.kernel.org/r/20210810151220.285179-4-amir73il@gmail.com Signed-off-by: Amir Goldstein Reviewed-by: Matthew Bobrowski Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 44858a348881c66db3a5e346cd53585b5ac6ce7e Author: Amir Goldstein Date: Tue Aug 10 18:12:18 2021 +0300 fsnotify: count s_fsnotify_inode_refs for attached connectors [ Upstream commit 11fa333b58ba1518e7c69fafb6513a0117f8fe33 ] Instead of incrementing s_fsnotify_inode_refs when detaching connector from inode, increment it earlier when attaching connector to inode. Next patch is going to use s_fsnotify_inode_refs to count all objects with attached connectors. 
Link: https://lore.kernel.org/r/20210810151220.285179-3-amir73il@gmail.com Reviewed-by: Matthew Bobrowski Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit cdbf9c5f81d0951521f0dc1670851f4cabf0fba9 Author: Amir Goldstein Date: Tue Aug 10 18:12:17 2021 +0300 fsnotify: replace igrab() with ihold() on attach connector [ Upstream commit 09ddbe69c9925b42cb9529f60678c25b241d8b18 ] We must have a reference on inode, so ihold is cheaper. Link: https://lore.kernel.org/r/20210810151220.285179-2-amir73il@gmail.com Reviewed-by: Matthew Bobrowski Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit cde8883b0b292f94a2f2e0e6096892e0e13cac0f Author: Matthew Bobrowski Date: Sun Aug 8 15:26:25 2021 +1000 fanotify: add pidfd support to the fanotify API [ Upstream commit af579beb666aefb17e9a335c12c788c92932baf1 ] Introduce a new flag FAN_REPORT_PIDFD for fanotify_init(2) which allows userspace applications to control whether a pidfd information record containing a pidfd is to be returned alongside the generic event metadata for each event. If FAN_REPORT_PIDFD is enabled for a notification group, an additional struct fanotify_event_info_pidfd object type will be supplied alongside the generic struct fanotify_event_metadata for a single event. This functionality is analogous to that of FAN_REPORT_FID in terms of how the event structure is supplied to a userspace application. Usage of FAN_REPORT_PIDFD with FAN_REPORT_FID/FAN_REPORT_DFID_NAME is permitted, and in this case a struct fanotify_event_info_pidfd object will likely follow any struct fanotify_event_info_fid object. Currently, the usage of the FAN_REPORT_TID flag is not permitted along with FAN_REPORT_PIDFD as the pidfd API currently only supports the creation of pidfds for thread-group leaders. Additionally, usage of the FAN_REPORT_PIDFD flag is limited to privileged processes only i.e. 
event listeners that are running with the CAP_SYS_ADMIN capability. Attempting to supply the FAN_REPORT_TID initialization flags with FAN_REPORT_PIDFD or creating a notification group without CAP_SYS_ADMIN will result with -EINVAL being returned to the caller. In the event of a pidfd creation error, there are two types of error values that can be reported back to the listener. There is FAN_NOPIDFD, which will be reported in cases where the process responsible for generating the event has terminated prior to the event listener being able to read the event. Then there is FAN_EPIDFD, which will be reported when a more generic pidfd creation error has occurred when fanotify calls pidfd_create(). Link: https://lore.kernel.org/r/5f9e09cff7ed62bfaa51c1369e0f7ea5f16a91aa.1628398044.git.repnop@google.com Signed-off-by: Matthew Bobrowski Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 77bc7f529abd6f3031bffbb53cfc3dc9c8e20db3 Author: Matthew Bobrowski Date: Sun Aug 8 15:25:58 2021 +1000 fanotify: introduce a generic info record copying helper [ Upstream commit 0aca67bb7f0d8c997dfef8ff0bfeb0afb361f0e6 ] The copy_info_records_to_user() helper allows for the separation of info record copying routines/conditionals from copy_event_to_user(), which reduces the overall clutter within this function. This becomes especially true as we start introducing additional info records in the future i.e. struct fanotify_event_info_pidfd. On success, this helper returns the total amount of bytes that have been copied into the user supplied buffer and on error, a negative value is returned to the caller. The newly defined macro FANOTIFY_INFO_MODES can be used to obtain info record types that have been enabled for a specific notification group. This macro becomes useful in the subsequent patch when the FAN_REPORT_PIDFD initialization flag is introduced. 
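The return contract described above (total bytes copied on success, a negative value on error) can be sketched in plain user-space C. This is a simplified model, not the fanotify implementation: struct info_record and copy_records() are hypothetical names, memcpy() stands in for the copy to the user buffer, and the choice of -ENOSPC for a too-small buffer is an assumption made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical info record: an opaque blob standing in for fid/pidfd records. */
struct info_record {
	const void *data;
	size_t len;
};

/*
 * Copy nrec records back to back into buf, which holds count bytes.
 * Returns the total number of bytes copied, or a negative errno value
 * (assumed here to be -ENOSPC) if a record does not fit.
 */
static long copy_records(const struct info_record *rec, size_t nrec,
			 char *buf, size_t count)
{
	long total = 0;

	for (size_t i = 0; i < nrec; i++) {
		if (rec[i].len > count)
			return -ENOSPC;             /* error path: negative value */
		memcpy(buf, rec[i].data, rec[i].len);
		buf += rec[i].len;                  /* advance past the copied record */
		count -= rec[i].len;                /* shrink the remaining space */
		total += (long)rec[i].len;
	}
	return total;
}
```

Keeping the per-record loop in one helper means a caller only has to check the sign of a single return value, which is the decluttering effect the commit message attributes to the helper.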
Link: https://lore.kernel.org/r/8872947dfe12ce8ae6e9a7f2d49ea29bc8006af0.1628398044.git.repnop@google.com Signed-off-by: Matthew Bobrowski Reviewed-by: Amir Goldstein Signed-off-by: Jan Kara [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3ddcb1939608af343d2242efcbc099e34f340ab9 Author: Matthew Bobrowski Date: Sun Aug 8 15:25:32 2021 +1000 fanotify: minor cosmetic adjustments to fid labels [ Upstream commit d3424c9bac893bd06f38a20474cd622881d384ca ] With the idea to support additional info record types in the future i.e. fanotify_event_info_pidfd, it's a good idea to rename some of the labels assigned to some of the existing fid related functions, parameters, etc which more accurately represent the intent behind their usage. For example, copy_info_to_user() was defined with a generic function label, which arguably reads as being supportive of different info record types, however the parameter list for this function is explicitly tailored towards the creation and copying of the fanotify_event_info_fid records. This same point applies to the macro defined as FANOTIFY_INFO_HDR_LEN. With fanotify_event_info_len(), we change the parameter label so that the function implies that it can be extended to calculate the length for additional info record types. Link: https://lore.kernel.org/r/7c3ec33f3c718dac40764305d4d494d858f59c51.1628398044.git.repnop@google.com Signed-off-by: Matthew Bobrowski Reviewed-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 03b5d3ee505bae04c8612eda4d92b9d29899ca0d Author: Matthew Bobrowski Date: Sun Aug 8 15:25:05 2021 +1000 kernel/pid.c: implement additional checks upon pidfd_create() parameters [ Upstream commit 490b9ba881e2c6337bb09b68010803ae98e59f4a ] By adding the pidfd_create() declaration to linux/pid.h, we effectively expose this function to the rest of the kernel. 
In order to avoid any unintended behavior, or set false expectations upon this function, ensure that constraints are forced upon each of the passed parameters. This includes the checking of whether the passed struct pid is a thread-group leader as pidfd creation is currently limited to such pid types. Link: https://lore.kernel.org/r/2e9b91c2d529d52a003b8b86c45f866153be9eb5.1628398044.git.repnop@google.com Signed-off-by: Matthew Bobrowski Acked-by: Christian Brauner Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 774c2dbca76e9c63b63a43b059b7b5b6d0ed9440 Author: Matthew Bobrowski Date: Sun Aug 8 15:24:33 2021 +1000 kernel/pid.c: remove static qualifier from pidfd_create() [ Upstream commit c576e0fcd6188d0edb50b0fb83f853433ef4819b ] With the idea of returning pidfds from the fanotify API, we need to expose a mechanism for creating pidfds. We drop the static qualifier from pidfd_create() and add its declaration to linux/pid.h so that the pidfd_create() helper can be called from other kernel subsystems i.e. fanotify. Link: https://lore.kernel.org/r/0c68653ec32f1b7143301f0231f7ed14062fd82b.1628398044.git.repnop@google.com Signed-off-by: Matthew Bobrowski Acked-by: Christian Brauner Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e79057d15d96ef19de4de6d7e479bae3d58a2a8d Author: J. Bruce Fields Date: Thu Jul 1 20:06:56 2021 -0400 nfsd: fix NULL dereference in nfs3svc_encode_getaclres [ Upstream commit ab1016d39cc052064e32f25ad18ef8767a0ee3b8 ] In error cases the dentry may be NULL. Before 20798dfe249a, the encoder also checked dentry and d_really_is_positive(dentry), but that looks like overkill to me--zero status should be enough to guarantee a positive dentry. This isn't the first time we've seen an error-case NULL dereference hidden in the initialization of a local variable in an xdr encoder. But I went back through the other recent rewrites and didn't spot any similar bugs. 
Reported-by: JianHong Yin Reviewed-by: Chuck Lever III Fixes: 20798dfe249a ("NFSD: Update the NFSv3 GETACL result encoder...") Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5610ed80e86022e13733715a381db75f724392e2 Author: Chuck Lever Date: Fri Jun 25 11:12:49 2021 -0400 NFSD: Prevent a possible oops in the nfs_dirent() tracepoint [ Upstream commit 7b08cf62b1239a4322427d677ea9363f0ab677c6 ] The double copy of the string is a mistake, plus __assign_str() uses strlen(), which is wrong to do on a string that isn't guaranteed to be NUL-terminated. Fixes: 6019ce0742ca ("NFSD: Add a tracepoint to record directory entry encoding") Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 17600880e1534b8851c82014b160bc4fa7a18fe5 Author: Colin Ian King Date: Thu May 13 16:16:39 2021 +0100 nfsd: remove redundant assignment to pointer 'this' [ Upstream commit e34c0ce9136a0fe96f0f547898d14c44f3c9f147 ] The pointer 'this' is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ce1819876203d27af97d527da7eb21f1e5f496bb Author: Chuck Lever Date: Thu Jun 3 16:53:29 2021 -0400 lockd: Update the NLMv4 SHARE results encoder to use struct xdr_stream [ Upstream commit 0ff5b50ab1f7f39862d0cdf6803978d31b27f25e ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Sasha Levin commit fec07309928151ea78d37ba4e8afc303b761a57f Author: Chuck Lever Date: Thu Jun 3 16:53:23 2021 -0400 lockd: Update the NLMv4 nlm_res results encoder to use struct xdr_stream [ Upstream commit 447c14d48968d0d4c2733c3f8052cb63aa1deb38 ] Signed-off-by: Chuck Lever Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e1e61d647f264d9196b4ddb1ec518405c2318ee4 Author: Chuck Lever Date: Thu Jun 3 16:53:17 2021 -0400 lockd: Update the NLMv4 TEST results encoder to use struct xdr_stream [ Upstream commit 1beef1473ccaa70a2d54f9e76fba5f534931ea23 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4f5ba2e6b434d4ca5535ba43ca5a1bdb471348cd Author: Chuck Lever Date: Thu Jun 3 16:53:11 2021 -0400 lockd: Update the NLMv4 void results encoder to use struct xdr_stream [ Upstream commit ec757e423b4fcd6e5ea4405d1e8243c040458d78 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0add7c13bf78f8ecc8ac74f753d368c0e3e0089e Author: Chuck Lever Date: Thu Jun 3 16:53:04 2021 -0400 lockd: Update the NLMv4 FREE_ALL arguments decoder to use struct xdr_stream [ Upstream commit 3049e974a7c7cfa0c15fb807f4a3e75b2ab8517a ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 604c8a432c6c76368d7e21a2defc5cefbcc0d240 Author: Chuck Lever Date: Thu Jun 3 16:52:58 2021 -0400 lockd: Update the NLMv4 SHARE arguments decoder to use struct xdr_stream [ Upstream commit 7cf96b6d0104b12aa30961901879e428884b1695 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 300a4b1632c34bd7d9a0d48a0bd3c93929719d4e Author: Chuck Lever Date: Thu Jun 3 16:52:52 2021 -0400 lockd: Update the NLMv4 SM_NOTIFY arguments decoder to use struct xdr_stream [ Upstream commit bc3665fd718b325cfff3abd383b00d1a87e028dc ] Signed-off-by: Chuck Lever Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 33f31f6e85d1ec9cb1cf9f8aadb800f340920bf4 Author: Chuck Lever Date: Thu Jun 3 16:52:46 2021 -0400 lockd: Update the NLMv4 nlm_res arguments decoder to use struct xdr_stream [ Upstream commit b4c24b5a41da63e5f3a9b6ea56cbe2a1efe49579 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9e1daae6303a550e6e9c1e138ac44aa59bc27c2f Author: Chuck Lever Date: Thu Jun 3 16:52:40 2021 -0400 lockd: Update the NLMv4 UNLOCK arguments decoder to use struct xdr_stream [ Upstream commit d76d8c25cea794f65615f3a2324052afa4b5f900 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0652983fbe1810066c531f1abe1cf0a7be82dd12 Author: Chuck Lever Date: Thu Jun 3 16:52:34 2021 -0400 lockd: Update the NLMv4 CANCEL arguments decoder to use struct xdr_stream [ Upstream commit 1e1f38dcf3c031715191e1fd26f70a0affca4dbd ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 101d45274abae2ad4d9021b2305be22ab63c819a Author: Chuck Lever Date: Thu Jun 3 16:52:28 2021 -0400 lockd: Update the NLMv4 LOCK arguments decoder to use struct xdr_stream [ Upstream commit 0e5977af4fdc277984fca7d8c2e0c880935775a0 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 360159aafa8b0717db40c3c188f4eab33d4fc9b2 Author: Chuck Lever Date: Thu Jun 3 16:52:22 2021 -0400 lockd: Update the NLMv4 TEST arguments decoder to use struct xdr_stream [ Upstream commit 345b4159a075b15dc4ae70f1db90fa8abf85d2e7 ] Signed-off-by: Chuck Lever Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c8f404825085edcc5cffc744ad35562a643d95d0 Author: Chuck Lever Date: Thu Jun 3 16:52:16 2021 -0400 lockd: Update the NLMv4 void arguments decoder to use struct xdr_stream [ Upstream commit 7956521aac58e434a05cf3c68c1b66c1312e5649 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 45c1384bd767026dff0bef2a716ed92984980509 Author: Chuck Lever Date: Thu Jun 3 16:52:10 2021 -0400 lockd: Update the NLMv1 SHARE results encoder to use struct xdr_stream [ Upstream commit 529ca3a116e8978575fec061a71fa6865a344891 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b049476790166a844aa37cb550ef1e8c3fe73ed6 Author: Chuck Lever Date: Thu Jun 3 16:52:04 2021 -0400 lockd: Update the NLMv1 nlm_res results encoder to use struct xdr_stream [ Upstream commit e96735a6980574ecbdb24c760b8d294095e47074 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d0ddd21bd52c23c9bc524660823cd92893f21a6a Author: Chuck Lever Date: Thu Jun 3 16:51:58 2021 -0400 lockd: Update the NLMv1 TEST results encoder to use struct xdr_stream [ Upstream commit adf98a4850b9ede9fc174c78a885845fb08499a5 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e6c92714e9a6007d6a804d2598c09c7599009611 Author: Chuck Lever Date: Thu Jun 3 16:51:52 2021 -0400 lockd: Update the NLMv1 void results encoder to use struct xdr_stream [ Upstream commit e26ec898b68b2ab64f379ba0fc0a615b2ad41f40 ] Signed-off-by: Chuck Lever Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 02a3c81665ac71960794b27c34a1ef0280d5cbbc Author: Chuck Lever Date: Thu Jun 3 16:51:46 2021 -0400 lockd: Update the NLMv1 FREE_ALL arguments decoder to use struct xdr_stream [ Upstream commit 14e105256b9dcdf50a003e2e9a0da77e06770a4b ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6c522daf60925469b5b302a5cb4cf238e392fc23 Author: Chuck Lever Date: Thu Jun 3 16:51:40 2021 -0400 lockd: Update the NLMv1 SHARE arguments decoder to use struct xdr_stream [ Upstream commit 890939e1266b9adf3b0acd5e0385b39813cb8f11 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 56c936af53e3de2ad5ee3de70ebd811d0dca2541 Author: Chuck Lever Date: Thu Jun 3 16:51:34 2021 -0400 lockd: Update the NLMv1 SM_NOTIFY arguments decoder to use struct xdr_stream [ Upstream commit 137e05e2f735f696e117553f7fa5ef8fb09953e1 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 90f483a775446fe4f731824fab7a4bfb71469ab5 Author: Chuck Lever Date: Thu Jun 3 16:51:28 2021 -0400 lockd: Update the NLMv1 nlm_res arguments decoder to use struct xdr_stream [ Upstream commit 16ddcabe6240c4fb01c97f6fce6c35ddf8626ad5 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b4ea38d69d891dec5025a27e3e81b2de3f871d6d Author: Chuck Lever Date: Thu Jun 3 16:51:22 2021 -0400 lockd: Update the NLMv1 UNLOCK arguments decoder to use struct xdr_stream [ Upstream commit c27045d302b022ed11d24a2653bceb6af56c6327 ] Signed-off-by: Chuck Lever Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2025b3acf6555114c13850bf32f6e7ea5365c1bb Author: Chuck Lever Date: Thu Jun 3 16:51:16 2021 -0400 lockd: Update the NLMv1 CANCEL arguments decoder to use struct xdr_stream [ Upstream commit f4e08f3ac8c4945ea54a740e3afcf44b34e7cf44 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3e8675ff1ebc87d144ea3b875e5627c71484782f Author: Chuck Lever Date: Thu Jun 3 16:51:10 2021 -0400 lockd: Update the NLMv1 LOCK arguments decoder to use struct xdr_stream [ Upstream commit c1adb8c672ca2b085c400695ef064547d77eda29 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8f9f41ebfa176ba6f61de7a84eefdf0be265d40d Author: Chuck Lever Date: Thu Jun 3 16:51:04 2021 -0400 lockd: Update the NLMv1 TEST arguments decoder to use struct xdr_stream [ Upstream commit 2fd0c67aabcf0f8821450b00ee511faa0b7761bf ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4c3f448aaa0bb251663c363b10d17a051e3b0e63 Author: Chuck Lever Date: Thu Jun 3 16:50:58 2021 -0400 lockd: Update the NLMv1 void argument decoder to use struct xdr_stream [ Upstream commit cc1029b51273da5b342683e9ae14ab4eeaa15997 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit fa4b890c0da06873b0cb23ac96746fa180e8f3e9 Author: Chuck Lever Date: Thu Jun 3 16:50:52 2021 -0400 lockd: Common NLM XDR helpers [ Upstream commit a6a63ca5652ea05637ecfe349f9e895031529556 ] Add a .h file containing xdr_stream-based XDR helpers common to both NLMv3 and NLMv4. Signed-off-by: Chuck Lever Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3595ff1c2caa0b81bc2b3885e1b7589dd43b9a4c Author: Chuck Lever Date: Thu Jun 3 16:50:46 2021 -0400 lockd: Create a simplified .vs_dispatch method for NLM requests [ Upstream commit a9ad1a8090f58b2ed1774dd0f4c7cdb8210a3793 ] To enable xdr_stream-based encoding and decoding, create a bespoke RPC dispatch function for the lockd service. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit eeea3b96d15040a023008cb5c83738aa49260b9f Author: Chuck Lever Date: Thu Jun 3 16:50:40 2021 -0400 lockd: Remove stale comments [ Upstream commit 99cdf57b33e68df7afc876739c93a11f0b1ba807 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c58120ab476503cb6fcb4ecab052aaa56309c9b5 Author: J. Bruce Fields Date: Mon Jun 14 11:20:49 2021 -0400 nfsd: rpc_peeraddr2str needs rcu lock [ Upstream commit 05570a2b01117209b500e1989ce8f1b0524c489f ] I'm not even sure cl_xprt can change here, but we're getting "suspicious RCU usage" warnings, and other rpc_peeraddr2str callers are taking the rcu lock. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2983611a663e1ab84eddda8a1752e6f43b6f6246 Author: Wei Yongjun Date: Fri Jun 4 10:12:37 2021 +0000 NFSD: Fix error return code in nfsd4_interssc_connect() [ Upstream commit 54185267e1fe476875e649bb18e1c4254c123305 ] 'status' has been overwritten to 0 after nfsd4_ssc_setup_dul(), which causes 0 to be returned in the vfs_kern_mount() error case. Fix this by returning nfserr_nodev in that case. Fixes: f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed.") Reported-by: Hulk Robot Signed-off-by: Wei Yongjun Signed-off-by: J.
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c5a305d93e6b8a4e0609d1af3e58c5da35c9c175 Author: Dai Ngo Date: Thu Jun 3 20:02:26 2021 -0400 nfsd: fix kernel test robot warning in SSC code [ Upstream commit f47dc2d3013c65631bf8903becc7d88dc9d9966e ] Fix by initializing pointer nfsd4_ssc_umount_item with NULL instead of 0. Replace return value of nfsd4_ssc_setup_dul with __be32 instead of int. Reported-by: kernel test robot Signed-off-by: Dai Ngo Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 22b7c93d967462cfcdfbe130327f2e4aa6ee8875 Author: Dave Wysochanski Date: Wed Jun 2 13:51:39 2021 -0400 nfsd4: Expose the callback address and state of each NFS4 client [ Upstream commit 3518c8666f15cdd5d38878005dab1d589add1c19 ] In addition to the client's address, display the callback channel state and address in the 'info' file. Signed-off-by: Dave Wysochanski Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit dbc0aa47959567b428a5a9fa8c9d421dd3c2678b Author: J. Bruce Fields Date: Tue May 25 14:53:44 2021 -0400 nfsd: move fsnotify on client creation outside spinlock [ Upstream commit 934bd07fae7e55232845f909f78873ab8678ca74 ] This was causing a "sleeping function called from invalid context" warning. I don't think we need the set_and_test_bit() here; clients move from unconfirmed to confirmed only once, under the client_lock. The (conf == unconf) is a way to check whether we're in that confirming case, hopefully that's not too obscure. Fixes: 472d155a0631 "nfsd: report client confirmation status in "info" file" Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a4bc287943f5695209ff36bdc89f17b48d68fae7 Author: Dai Ngo Date: Fri May 21 15:09:37 2021 -0400 NFSD: delay unmount source's export after inter-server copy completed. 
[ Upstream commit f4e44b393389c77958f7c58bf4415032b4cda15b ] Currently the source's export is mounted and unmounted on every inter-server copy operation. This patch is an enhancement to delay the unmount of the source export for a certain period of time to eliminate the mount and unmount overhead on subsequent copy operations. After a copy operation completes, a work entry is added to the delayed unmount list with an expiration time. This list is serviced by the laundromat thread to unmount the export of the expired entries. Each time the export is used again, its expiration time is extended and the entry is re-inserted at the tail of the list. The unmount task and the mount operation of the copy request are synced to make sure the export is not unmounted while it's being used. Signed-off-by: Dai Ngo Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 817c6eb975798647c4bbbfc9eb288aa413ec7f65 Author: Olga Kornievskaia Date: Wed May 19 14:48:27 2021 -0400 NFSD add vfs_fsync after async copy is done [ Upstream commit eac0b17a77fbd763d305a5eaa4fd1119e5a0fe0d ] Currently, the server does all copies as NFS_UNSTABLE. For synchronous copies the Linux client appends a COMMIT to the COPY compound, but for async copies it does not (because COMMIT needs to be done after all bytes are copied and not as a reply to the COPY operation). However, in order to save the client doing a COMMIT as a separate rpc, the server can reply back with NFS_FILE_SYNC copy. This patch adds a vfs_fsync() call at the end of the async copy. Signed-off-by: Olga Kornievskaia Signed-off-by: J. Bruce Fields [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 94a89247017396dab16d2adb1a6859e2b46c5039 Author: J.
Bruce Fields Date: Fri May 14 18:21:37 2021 -0400 nfsd: move some commit_metadata()s outside the inode lock [ Upstream commit eeeadbb9bd5652c47bb9b31aa9ad8b4f1b4aa8b3 ] The commit may be time-consuming and there's no need to hold the lock for it. More of these are possible, these were just some easy ones. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f666a75ccd9cd9cdb02d8a3f588fc40a6b32ec03 Author: Yu Hsiang Huang Date: Fri May 14 11:58:29 2021 +0800 nfsd: Prevent truncation of an unlinked inode from blocking access to its directory [ Upstream commit e5d74a2d0ee67ae00edad43c3d7811016e4d2e21 ] Truncation of an unlinked inode may take a long time for I/O waiting, and it doesn't have to prevent access to the directory. Thus, let truncation occur outside the directory's mutex, just like do_unlinkat() does. Signed-off-by: Yu Hsiang Huang Signed-off-by: Bing Jing Chang Signed-off-by: Robbie Ko Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e7bbdd7deeb2f1cc3985b2bfa1b0a36fe97fb899 Author: Chuck Lever Date: Fri May 14 15:57:39 2021 -0400 NFSD: Update nfsd_cb_args tracepoint [ Upstream commit d6cbe98ff32aef795462a309ef048cfb89d1a11d ] Clean-up: Re-order the display of IP address and client ID to be consistent with other _cb_ tracepoints. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3e8aeb13a730af10c8d36da7c18c889daf7a3ae7 Author: Chuck Lever Date: Fri May 14 15:57:32 2021 -0400 NFSD: Remove the nfsd_cb_work and nfsd_cb_done tracepoints [ Upstream commit 1d2bf65983a137121c165a7e69b2885572954915 ] Clean up: These are noise in properly working systems. If you really need to observe the operation of the callback mechanism, use the sunrpc:rpc\* tracepoints along with the workqueue tracepoints. Signed-off-by: Chuck Lever Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3a63aa2459dc8d4d1f79d52f9882f898c6028db7 Author: Chuck Lever Date: Fri May 14 15:57:26 2021 -0400 NFSD: Add an nfsd_cb_probe tracepoint [ Upstream commit 4ade892ae1c35527584decb7fa026553d53cd03f ] Record a tracepoint event when the server performs a callback probe. This event can be enabled as a group with other nfsd_cb tracepoints. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a577eb06dee49c22f92ea04289199b99c07ba324 Author: Chuck Lever Date: Fri May 14 15:57:20 2021 -0400 NFSD: Replace the nfsd_deleg_break tracepoint [ Upstream commit 17d76ddf76e4972411402743eea7243d9a46f4f9 ] Renamed so it can be enabled as a set with the other nfsd_cb_ tracepoints. And, consistent with those tracepoints, report the address of the client, the client ID the server has given it, and the state ID being recalled. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9f76187f0a46f23029f2c4ea5b96180cc284dc6d Author: Chuck Lever Date: Fri May 14 15:57:14 2021 -0400 NFSD: Add an nfsd_cb_offload tracepoint [ Upstream commit 87512386e951ee28ba2e7ef32b843ac97621d371 ] Record the arguments of CB_OFFLOAD callbacks so we can better observe asynchronous copy-offload behavior. For example: nfsd-995 [008] 7721.934222: nfsd_cb_offload: addr=192.168.2.51:0 client 6092a47c:35a43fc1 fh_hash=0x8739113a count=116528 status=0 Signed-off-by: Chuck Lever Cc: Olga Kornievskaia Cc: Dai Ngo Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 60aac215347ce083bfeaf71cecd69896ee9104d0 Author: Chuck Lever Date: Fri May 14 15:57:08 2021 -0400 NFSD: Add an nfsd_cb_lm_notify tracepoint [ Upstream commit 2cde7f8118f0fea29ad73ddcf28817f95adeffd5 ] When the server kicks off a CB_LM_NOTIFY callback, record its arguments so we can better observe asynchronous locking behavior. For example: nfsd-998 [002] 1471.705873: nfsd_cb_notify_lock: addr=192.168.2.51:0 client 6092a47c:35a43fc1 fh_hash=0x8950b23a Signed-off-by: Chuck Lever Cc: Jeff Layton Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 59ddc5a82bc300597210c7433b28e6152d860f21 Author: Chuck Lever Date: Fri May 14 15:57:02 2021 -0400 NFSD: Enhance the nfsd_cb_setup tracepoint [ Upstream commit 9f57c6062bf3ce2c6ab9ba60040b34e8134ef259 ] Display the transport protocol and authentication flavor so admins can see what they might be getting wrong. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit fc3b4f0188e99422fe8840b4f079ee3f4e397afe Author: Chuck Lever Date: Fri May 14 15:56:49 2021 -0400 NFSD: Adjust cb_shutdown tracepoint [ Upstream commit b200f0e35338b052976b6c5759e4f77a3013e6f6 ] Show when the upper layer requested a shutdown. RPC tracepoints can already show when rpc_shutdown_client() is called. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 634816f9d3de92ee2635158e84c4731074131799 Author: Chuck Lever Date: Fri May 14 15:56:43 2021 -0400 NFSD: Add cb_lost tracepoint [ Upstream commit 806d65b617d89be887fe68bfa051f78143669cd7 ] Provide more clarity about when the callback channel is in trouble. Signed-off-by: Chuck Lever Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3076ede3fc10c5b1f118173df499bbd46427f93a Author: Chuck Lever Date: Fri May 14 15:56:37 2021 -0400 NFSD: Drop TRACE_DEFINE_ENUM for NFSD4_CB_ macros [ Upstream commit 167145cc64ce4b4b177e636829909a6b14004f9e ] TRACE_DEFINE_ENUM() is necessary for enum {} but not for C macros. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2be1f2275193a3adf5cee89bdf4578487061adc6 Author: Chuck Lever Date: Fri May 14 15:56:31 2021 -0400 NFSD: Capture every CB state transition [ Upstream commit 8476c69a7fa0f1f9705ec0caa4e97c08b5045779 ] We were missing one. As a clean-up, add a helper that sets the new CB state and fires a tracepoint. The tracepoint fires only when the state changes, to help reduce trace log noise. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b6ba775ccc947731ddd8f2473e9e734c46374494 Author: Chuck Lever Date: Fri May 14 15:56:25 2021 -0400 NFSD: Constify @fh argument of knfsd_fh_hash() [ Upstream commit 1736aec82a15cb5d4b3bbe0b2fbae0ede66b1a1a ] Enable knfsd_fh_hash() to be invoked in functions where the filehandle pointer is a const. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 88b3cdfd487357b16c285d7a2f9517669a526fa7 Author: Chuck Lever Date: Fri May 14 15:56:19 2021 -0400 NFSD: Add tracepoints for EXCHANGEID edge cases [ Upstream commit e8f80c5545ec5794644b48537449e48b009d608d ] Some of the most common cases are traced. Enough infrastructure is now in place that more can be added later, as needed. Signed-off-by: Chuck Lever Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5070351cdcebecac057b6648672eb3308cabde59 Author: Chuck Lever Date: Fri May 14 15:56:13 2021 -0400 NFSD: Add tracepoints for SETCLIENTID edge cases [ Upstream commit 237f91c85acef206a33bc02f3c4e856128fd7994 ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 650530d52260f1853faa6c05649101182c935f7f Author: Chuck Lever Date: Fri May 14 15:56:06 2021 -0400 NFSD: Add a couple more nfsd_clid_expired call sites [ Upstream commit 2958d2ee71021b6c44212ec6c2a39cc71d9cd4a9 ] Improve observation of NFSv4 lease expiry. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 056332823cdc7fd3b3f78f87dbb80f90420c5011 Author: Chuck Lever Date: Fri May 14 15:56:00 2021 -0400 NFSD: Add nfsd_clid_destroyed tracepoint [ Upstream commit c41a9b7a906fb872f8b2b1a34d2a1d5ef7f94adb ] Record client-requested termination of client IDs. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 580ec8b6536a1cb35fbdab4b0a4a27c29e5fc0a5 Author: Chuck Lever Date: Fri May 14 15:55:54 2021 -0400 NFSD: Add nfsd_clid_reclaim_complete tracepoint [ Upstream commit cee8aa074281e5269d8404be2b6388bb29ea8efc ] Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3b6808c793f3dbcd5e46c74bfa7301a2bf6dbe5b Author: Chuck Lever Date: Fri May 14 15:55:48 2021 -0400 NFSD: Add nfsd_clid_confirmed tracepoint [ Upstream commit 7e3b32ace6094aadfa2e1e54ca4c6bbfd07646af ] This replaces a dprintk call site in order to get greater visibility on when client IDs are confirmed or re-used. 
Simple example: nfsd-995 [000] 126.622975: nfsd_compound: xid=0x3a34e2b1 opcnt=1 nfsd-995 [000] 126.623005: nfsd_cb_args: addr=192.168.2.51:45901 client 60958e3b:9213ef0e prog=1073741824 ident=1 nfsd-995 [000] 126.623007: nfsd_compound_status: op=1/1 OP_SETCLIENTID status=0 nfsd-996 [001] 126.623142: nfsd_compound: xid=0x3b34e2b1 opcnt=1 >>>> nfsd-996 [001] 126.623146: nfsd_clid_confirmed: client 60958e3b:9213ef0e nfsd-996 [001] 126.623148: nfsd_cb_probe: addr=192.168.2.51:45901 client 60958e3b:9213ef0e state=UNKNOWN nfsd-996 [001] 126.623154: nfsd_compound_status: op=1/1 OP_SETCLIENTID_CONFIRM status=0 Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c6889b75a6171888b142a38f7b50949ed5f4a906 Author: Chuck Lever Date: Fri May 14 15:55:42 2021 -0400 NFSD: Remove trace_nfsd_clid_inuse_err [ Upstream commit 0bfaacac57e64aa342f865b8ddcab06ca59a6f83 ] This tracepoint has been replaced by nfsd_clid_cred_mismatch and nfsd_clid_verf_mismatch, and can simply be removed. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8da1871206651d7b2a32d6d7fca10f24821f9fcb Author: Chuck Lever Date: Fri May 14 15:55:36 2021 -0400 NFSD: Add nfsd_clid_verf_mismatch tracepoint [ Upstream commit 744ea54c869cebe41fbad5f53f8a8ca5d93a5c97 ] Record when a client presents a different boot verifier than the one we know about. Typically this is a sign the client has rebooted, but sometimes it signals a conflicting client ID, which the client's administrator will need to address. Signed-off-by: Chuck Lever Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c8493d73083c37872bfa4959fec21ab431173356 Author: Chuck Lever Date: Fri May 14 15:55:29 2021 -0400 NFSD: Add nfsd_clid_cred_mismatch tracepoint [ Upstream commit 27787733ef44332fce749aa853f2749d141982b0 ] Record when a client tries to establish a lease record but uses an unexpected credential. This is often a sign of a configuration problem. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b00bb7dfe2594932bfcdf65955a4b87b5faa96d8 Author: Chuck Lever Date: Fri May 14 15:55:23 2021 -0400 NFSD: Add an RPC authflavor tracepoint display helper [ Upstream commit 87b2394d60c32c158ebb96ace4abee883baf1239 ] To be used in subsequent patches. Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a4d250f5107c6d270bcb1071c00977273068c0ea Author: Amir Goldstein Date: Mon May 24 16:53:21 2021 +0300 fanotify: fix permission model of unprivileged group [ Upstream commit a8b98c808eab3ec8f1b5a64be967b0f4af4cae43 ] Reporting event->pid should depend on the privileges of the user that initialized the group, not the privileges of the user reading the events. Use an internal group flag FANOTIFY_UNPRIV to record the fact that the group was initialized by an unprivileged user. To be on the safe side, the permissions to set up filesystem and mount marks now require that both the user that initialized the group and the user setting up the mark have CAP_SYS_ADMIN.
Link: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiA77_P5vtv7e83g0+9d7B5W9ZTE4GfQEYbWmfT1rA=VA@mail.gmail.com/ Fixes: 7cea2a3c505e ("fanotify: support limited functionality for unprivileged users") Cc: # v5.12+ Link: https://lore.kernel.org/r/20210524135321.2190062-1-amir73il@gmail.com Reviewed-by: Matthew Bobrowski Acked-by: Christian Brauner Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0245993ace73088472190161965fff25ba685f68 Author: Trond Myklebust Date: Wed Mar 24 15:32:21 2021 -0400 NFS: fix nfs_fetch_iversion() [ Upstream commit b876d708316bf9b6b9678eb2beb289b93cfe6369 ] The change attribute is always set by all NFS client versions so get rid of the open-coded version. Fixes: 3cc55f4434b4 ("nfs: use change attribute for NFS re-exports") Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b2c0c7cb7fe34d1f18f1292b39c7221dd9765ca7 Author: Dai Ngo Date: Thu Apr 22 03:37:49 2021 -0400 NFSv4.2: Remove ifdef CONFIG_NFSD from NFSv4.2 client SSC code. [ Upstream commit d9092b4bb2109502eb8972021a3f74febc931a63 ] The client SSC code should not depend on any of the CONFIG_NFSD config. This patch removes all CONFIG_NFSD from NFSv4.2 client SSC code and simplifies the config of CONFIG_NFS_V4_2_SSC_HELPER, NFSD_V4_2_INTER_SSC. Signed-off-by: Dai Ngo Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit 3793f28102f185969e2f7d4641160179b6240249 Author: Gustavo A. R. Silva Date: Fri Nov 20 12:26:40 2020 -0600 nfsd: Fix fall-through warnings for Clang [ Upstream commit 76c50eb70d8e1133eaada0013845619c36345fbc ] In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple warnings by explicitly adding a couple of break statements instead of just letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. 
Silva Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 39ab09108e2863392cac94be9f4116c2f0fed3ac Author: J. Bruce Fields Date: Fri Apr 16 14:00:18 2021 -0400 nfsd: grant read delegations to clients holding writes [ Upstream commit aba2072f452346d56a462718bcde93d697383148 ] It's OK to grant a read delegation to a client that holds a write, as long as it's the only client holding the write. We originally tried to do this in commit 94415b06eb8a ("nfsd4: a client's own opens needn't prevent delegations"), which had to be reverted in commit 6ee65a773096 ("Revert "nfsd4: a client's own opens needn't prevent delegations""). Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d2431cc9670a0b7dee8829d82ba1e960c0e9c6b7 Author: J. Bruce Fields Date: Fri Apr 16 14:00:17 2021 -0400 nfsd: reshuffle some code [ Upstream commit ebd9d2c2f5a7ebaaed2d7bb4dee148755f46033d ] No change in behavior, I'm just moving some code around to avoid forward references in a following patch. (To do someday: figure out how to split up nfs4state.c. It's big and disorganized.) Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ee548b1629909626e56d4b90b1affad3310bc380 Author: J. Bruce Fields Date: Fri Apr 16 14:00:16 2021 -0400 nfsd: track filehandle aliasing in nfs4_files [ Upstream commit a0ce48375a367222989c2618fe68bf34db8c7bb7 ] It's unusual but possible for multiple filehandles to point to the same file. In that case, we may end up with multiple nfs4_files referencing the same inode. For delegation purposes it will turn out to be useful to flag those cases. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit cc6d658669f8dc737affc88d4a06348f43627aa2 Author: J. 
Bruce Fields Date: Fri Apr 16 14:00:15 2021 -0400 nfsd: hash nfs4_files by inode number [ Upstream commit f9b60e2209213fdfcc504ba25a404977c5d08b77 ] The nfs4_file structure is per-filehandle, not per-inode, because the spec requires open and other state to be per filehandle. But it will turn out to be convenient for nfs4_files associated with the same inode to be hashed to the same bucket, so let's hash on the inode instead of the filehandle. Filehandle aliasing is rare, so that shouldn't have much performance impact. (If you have a ton of exported filesystems, though, and all of them have a root with inode number 2, could that get you an overlong hash chain? Perhaps this (and the v4 open file cache) should be hashed on the inode pointer instead.) Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e63b956b2da90c59a476dd743179377d28581552 Author: Vasily Averin Date: Thu Apr 15 15:00:58 2021 +0300 nfsd: removed unused argument in nfsd_startup_generic() [ Upstream commit 70c5307564035c160078401f541c397d77b95415 ] Since commit 501cb1849f86 ("nfsd: rip out the raparms cache") nrservs is not used in nfsd_startup_generic() Signed-off-by: Vasily Averin Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 856b0c4979c7d1f6a4c2608a2115fdf7edb81fff Author: Jiapeng Chong Date: Thu Apr 15 16:38:24 2021 +0800 nfsd: remove unused function [ Upstream commit 363f8dd5eecd6c67fe9840ef6065440f0ee7df3a ] Fix the following clang warning: fs/nfsd/nfs4state.c:6276:1: warning: unused function 'end_offset' [-Wunused-function]. Reported-by: Abaci Robot Signed-off-by: Jiapeng Chong Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bd373a90d048093030c5ce376a87340ac42c28e2 Author: Christian Brauner Date: Thu Mar 25 09:37:43 2021 +0100 fanotify_user: use upper_32_bits() to verify mask [ Upstream commit 22d483b99863202e3631ff66fa0f3c2302c0f96f ] I don't see an obvious reason why the upper 32 bit check needs to be open-coded this way. 
Switch to upper_32_bits() which is more idiomatic and should conceptually be the same check. Cc: Amir Goldstein Cc: Jan Kara Link: https://lore.kernel.org/r/20210325083742.2334933-1-brauner@kernel.org Signed-off-by: Christian Brauner Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4ac0ad23728ab4907f82636da867a7becb0a2b15 Author: Amir Goldstein Date: Thu Mar 4 13:29:21 2021 +0200 fanotify: support limited functionality for unprivileged users [ Upstream commit 7cea2a3c505e87a9d6afc78be4a7f7be636a73a7 ] Add limited support for unprivileged fanotify groups. An unprivileged user is not allowed to get an open file descriptor in the event, nor the process pid of another process. An unprivileged user cannot request permission events, cannot set mount/filesystem marks and cannot request unlimited queue/marks. This enables limited functionality similar to inotify when watching a set of files and directories for OPEN/ACCESS/MODIFY/CLOSE events, without requiring CAP_SYS_ADMIN privileges. The FAN_REPORT_DFID_NAME init flag provides a method for an unprivileged listener watching a set of directories (with FAN_EVENT_ON_CHILD) to monitor all changes inside those directories. This typically requires that the listener keeps a map of watched directory fid to dirfd (O_PATH), where fid is obtained with name_to_handle_at() before starting to watch for changes. When getting an event, the reported fid of the parent should be resolved to dirfd and fstatat(2) with dirfd and name should be used to query the state of the filesystem entry.
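The fid-to-dirfd lookup described above can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: the table layout, FID_MAX, and the names watched_dir, resolve_dirfd(), query_entry() and lookup_demo() are hypothetical, and the fid is treated as opaque bytes rather than a real handle obtained from name_to_handle_at(). Only open(), fstatat() and close() are real calls here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define FID_MAX 64  /* hypothetical upper bound on the opaque fid size */

/* One watched directory: its fid bytes and the dirfd kept open for it. */
struct watched_dir {
    unsigned char fid[FID_MAX];
    size_t fid_len;
    int dirfd;
};

/* Resolve a reported parent fid to the dirfd stored at watch time. */
int resolve_dirfd(const struct watched_dir *tbl, size_t n,
                  const void *fid, size_t fid_len)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].fid_len == fid_len &&
            memcmp(tbl[i].fid, fid, fid_len) == 0)
            return tbl[i].dirfd;
    }
    return -1;
}

/* Query the current state of "name" inside the directory matching fid. */
int query_entry(const struct watched_dir *tbl, size_t n,
                const void *fid, size_t fid_len,
                const char *name, struct stat *st)
{
    int dirfd = resolve_dirfd(tbl, n, fid, fid_len);

    if (dirfd < 0)
        return -1;
    return fstatat(dirfd, name, st, AT_SYMLINK_NOFOLLOW);
}

/* Self-check using a made-up fid for the current directory. */
int lookup_demo(void)
{
    struct watched_dir tbl[1];
    struct stat st;
    int found, missing;

    memset(tbl, 0, sizeof(tbl));
    memcpy(tbl[0].fid, "fid0", 4);
    tbl[0].fid_len = 4;
    tbl[0].dirfd = open(".", O_RDONLY | O_DIRECTORY);
    assert(tbl[0].dirfd >= 0);

    found = query_entry(tbl, 1, "fid0", 4, ".", &st);
    assert(found == 0 && S_ISDIR(st.st_mode));
    missing = query_entry(tbl, 1, "none", 4, ".", &st);
    assert(missing == -1);

    close(tbl[0].dirfd);
    return 1;
}
```

A real listener would fill the table with the handle bytes returned for each watched directory and would likely use a hash map instead of a linear scan; the linear scan is kept here only for brevity.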
Link: https://lore.kernel.org/r/20210304112921.3996419-3-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3e441a872a57003038b249f6260782ab629d3631 Author: Amir Goldstein Date: Thu Mar 4 13:29:20 2021 +0200 fanotify: configurable limits via sysfs [ Upstream commit 5b8fea65d197f408bb00b251c70d842826d6b70b ] fanotify has some hardcoded limits. The only APIs to escape those limits are FAN_UNLIMITED_QUEUE and FAN_UNLIMITED_MARKS. Allow finer grained tuning of the system limits via sysctl tunables under /proc/sys/fs/fanotify, similar to tunables under /proc/sys/fs/inotify, with some minor differences. - max_queued_events - global system tunable for group queue size limit. Like the inotify tunable with the same name, it defaults to 16384 and applies on initialization of a new group. - max_user_marks - user ns tunable for marks limit per user. Like the inotify tunable named max_user_watches, on a machine with sufficient RAM it defaults to 1048576 in init userns and can be further limited per containing user ns. - max_user_groups - user ns tunable for number of groups per user. Like the inotify tunable named max_user_instances, it defaults to 128 in init userns and can be further limited per containing user ns. The slightly different tunable names used for fanotify are derived from the "group" and "mark" terminology used in the fanotify man pages and throughout the code. Considering the fact that the default value for max_user_watches was increased in kernel v5.10 from 8192 to 1048576, leaving the legacy fanotify limit of 8192 marks per group in addition to the max_user_marks limit makes little sense, so the per group marks limit has been removed.
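As a rough illustration of how the tunables named above might be inspected from user space, the following sketch reads one integer from a file under /proc/sys/fs/fanotify. The helper names parse_tunable() and read_tunable() are hypothetical; the file names are the ones listed in the commit message, and whether they exist depends on the running kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a decimal tunable value such as "16384\n"; returns -1 on error. */
long long parse_tunable(const char *text)
{
    char *end;
    long long value;

    if (text[0] < '0' || text[0] > '9')
        return -1;
    errno = 0;
    value = strtoll(text, &end, 10);
    if (errno != 0 || (*end != '\0' && *end != '\n'))
        return -1;
    return value;
}

/*
 * Read a tunable such as /proc/sys/fs/fanotify/max_user_marks.
 * Returns -1 if the file is missing (for example on a kernel without
 * these tunables) or does not contain a plain decimal number.
 */
long long read_tunable(const char *path)
{
    char buf[64];
    char *line;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    line = fgets(buf, sizeof(buf), f);
    fclose(f);
    return line != NULL ? parse_tunable(buf) : -1;
}
```

For example, read_tunable("/proc/sys/fs/fanotify/max_queued_events") would be expected to return the configured queue limit on a kernel carrying this change, and -1 elsewhere.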
Note that when a group is initialized with FAN_UNLIMITED_MARKS, its own marks are not accounted in the per user marks account, so in effect the limit of max_user_marks is only for the collection of groups that are not initialized with FAN_UNLIMITED_MARKS. Link: https://lore.kernel.org/r/20210304112921.3996419-2-amir73il@gmail.com Suggested-by: Jan Kara Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7df80a90e1a115cfd0e684015ee3bc780999b8c6 Author: Amir Goldstein Date: Thu Mar 4 12:48:26 2021 +0200 fanotify: limit number of event merge attempts [ Upstream commit b8cd0ee8cda68a888a317991c1e918a8cba1a568 ] Event merges are expensive when event queue size is large, so limit the linear search to 128 merge tests. In combination with 128 size hash table, there is a potential to merge with up to 16K events in the hashed queue. Link: https://lore.kernel.org/r/20210304104826.3993892-6-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 40e1e98c1bb24318e2287bdb949fac45e746608b Author: Amir Goldstein Date: Thu Mar 4 12:48:25 2021 +0200 fsnotify: use hash table for faster events merge [ Upstream commit 94e00d28a680dff18805ca472b191364347d2234 ] In order to improve event merge performance, hash events in a 128 size hash table by the event merge key. The fanotify_event size grows by two pointers, but we just reduced its size by removing the objectid member, so overall its size is increased by one pointer. Permission events and overflow event are not merged so they are also not hashed. 
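The two commits above (bounded merge attempts and the hashed merge queue) can be illustrated with a small userspace sketch. This is not the kernel code: struct event, struct event_queue, bucket_of() and queue_or_merge() are hypothetical names, the multiplicative hash is an arbitrary stand-in, and OR-ing the masks is a simplified notion of "merge". Only the 128 bucket count and the 128 merge test limit come from the commit messages.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MERGE_HASH_BITS 7
#define MERGE_HASH_SIZE (1u << MERGE_HASH_BITS)  /* 128 buckets */
#define MAX_MERGE_TESTS 128                      /* cap on compares per insert */

/* Hypothetical event: a merge key plus an event mask. */
struct event {
	uint32_t key;
	uint32_t mask;
	struct event *next;  /* bucket chain, newest first */
};

struct event_queue {
	struct event *buckets[MERGE_HASH_SIZE];
};

/* Map a 32-bit merge key to one of the 128 buckets (assumed hash). */
static unsigned int bucket_of(uint32_t key)
{
	return (key * UINT32_C(0x61C88647)) >> (32 - MERGE_HASH_BITS);
}

/*
 * Try to merge ev into an already queued event with the same key,
 * looking at no more than MAX_MERGE_TESTS entries of its bucket.
 * Returns true if merged, false if ev was queued as a new entry.
 */
static bool queue_or_merge(struct event_queue *q, struct event *ev)
{
	unsigned int b = bucket_of(ev->key);
	int tests = 0;

	for (struct event *old = q->buckets[b]; old; old = old->next) {
		if (++tests > MAX_MERGE_TESTS)
			break;
		if (old->key == ev->key) {
			old->mask |= ev->mask;
			return true;
		}
	}
	ev->next = q->buckets[b];
	q->buckets[b] = ev;
	return false;
}
```

With 128 buckets and at most 128 compares per bucket, a new event can be tested against events spread over a queue of up to 128 * 128 = 16384 entries, which is where the "up to 16K events" figure comes from.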
Link: https://lore.kernel.org/r/20210304104826.3993892-5-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ae7fd89daeb695636f9453207c7a61f1b62ce38d Author: Amir Goldstein Date: Thu Mar 4 12:48:24 2021 +0200 fanotify: mix event info and pid into merge key hash [ Upstream commit 7e3e5c6943994943eb76cab2d3a1806bc10b9045 ] Improve the merge key hash by mixing more values relevant for merge. For example, all FAN_CREATE name events in the same dir used to have the same merge key based on the dir inode. With this change the created file name is mixed into the merge key. The object id that was used as merge key is redundant to the event info so it is no longer mixed into the hash. Permission events are not hashed, so no need to hash their info. Link: https://lore.kernel.org/r/20210304104826.3993892-4-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5b57a2b74d01c6f2c3698697f7beedd0ccf25f12 Author: Amir Goldstein Date: Thu Mar 4 12:48:23 2021 +0200 fanotify: reduce event objectid to 29-bit hash [ Upstream commit 8988f11abb820bacfcc53d498370bfb30f792ec4 ] objectid is only used by fanotify backend and it is just an optimization for event merge before comparing all fields in event. Move the objectid member from common struct fsnotify_event into struct fanotify_event and reduce it to 29-bit hash to cram it together with the 3-bit event type. Events of different types are never merged, so the combination of event type and hash form a 32-bit key for fast compare of events. This reduces the size of events by one pointer and paves the way for adding hashed queue support for fanotify. 
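A minimal sketch of the key layout described above, assuming the 3-bit event type occupies the top bits and the 29-bit hash the low bits of one 32-bit word. The macro and function names are hypothetical, and the kernel may lay the fields out differently (for example with bitfields); the point is only that one integer compare covers both type and hash.

```c
#include <stdint.h>

#define EVENT_TYPE_BITS 3
#define EVENT_HASH_BITS (32 - EVENT_TYPE_BITS)              /* 29 */
#define EVENT_TYPE_MASK ((UINT32_C(1) << EVENT_TYPE_BITS) - 1)
#define EVENT_HASH_MASK ((UINT32_C(1) << EVENT_HASH_BITS) - 1)

/* Pack a 3-bit type and a hash truncated to 29 bits into one key. */
static uint32_t make_merge_key(uint32_t type, uint32_t hash)
{
	return ((type & EVENT_TYPE_MASK) << EVENT_HASH_BITS) |
	       (hash & EVENT_HASH_MASK);
}

/* Recover the event type from the top 3 bits. */
static uint32_t key_type(uint32_t key)
{
	return key >> EVENT_HASH_BITS;
}

/* Recover the 29-bit hash from the low bits. */
static uint32_t key_hash(uint32_t key)
{
	return key & EVENT_HASH_MASK;
}
```

Two events whose keys differ can be rejected for merging without comparing any other field; equal keys still need the full field-by-field compare, since a 29-bit hash can collide.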
Link: https://lore.kernel.org/r/20210304104826.3993892-3-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4f14948942937eeb2f4c363f5912d143049cdf26 Author: Chuck Lever Date: Thu Mar 7 09:22:43 2024 -0500 Revert "fanotify: limit number of event merge attempts" Temporarily revert commit ad3ea16746cc ("fanotify: limit number of event merge attempts") to enable subsequent upstream commits to apply and build cleanly. Stable-dep-of: 8988f11abb82 ("fanotify: reduce event objectid to 29-bit hash") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 62b7f3847373a48568162419f3a2250bce048756 Author: Amir Goldstein Date: Thu Mar 4 12:48:22 2021 +0200 fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue [ Upstream commit 6f73171e192366ff7c98af9fb50615ef9615f8a7 ] Current code has an assumption that fsnotify_notify_queue_is_empty() is called to verify that the queue is not empty before trying to peek or remove an event from the queue. Remove this assumption by moving the fsnotify_notify_queue_is_empty() check into the functions, allowing them to return NULL, and checking the return value in all callers. This is a prep patch for multi event queues. Link: https://lore.kernel.org/r/20210304104826.3993892-2-amir73il@gmail.com Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d9168ab8d714e7931438088dba3eb79c5a04152d Author: Guobin Huang Date: Tue Apr 6 20:08:18 2021 +0800 NFSD: Use DEFINE_SPINLOCK() for spinlock [ Upstream commit b73ac6808b0f7994a05ebc38571e2e9eaf98a0f4 ] spinlock can be initialized automatically with DEFINE_SPINLOCK() rather than explicitly calling spin_lock_init(). Reported-by: Hulk Robot Signed-off-by: Guobin Huang Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b20d88bf1eab5f33103457f535dad686c849b325 Author: Gustavo A. R.
Silva Date: Tue Mar 23 17:48:58 2021 -0500 UAPI: nfsfh.h: Replace one-element array with flexible-array member [ Upstream commit c0a744dcaa29e9537e8607ae9c965ad936124a4d ] There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Use an anonymous union with a couple of anonymous structs in order to keep userspace unchanged: $ pahole -C nfs_fhbase_new fs/nfsd/nfsfh.o struct nfs_fhbase_new { union { struct { __u8 fb_version_aux; /* 0 1 */ __u8 fb_auth_type_aux; /* 1 1 */ __u8 fb_fsid_type_aux; /* 2 1 */ __u8 fb_fileid_type_aux; /* 3 1 */ __u32 fb_auth[1]; /* 4 4 */ }; /* 0 8 */ struct { __u8 fb_version; /* 0 1 */ __u8 fb_auth_type; /* 1 1 */ __u8 fb_fsid_type; /* 2 1 */ __u8 fb_fileid_type; /* 3 1 */ __u32 fb_auth_flex[0]; /* 4 0 */ }; /* 0 4 */ }; /* 0 8 */ /* size: 8, cachelines: 1, members: 1 */ /* last cacheline: 8 bytes */ }; Also, this helps with the ongoing efforts to enable -Warray-bounds by fixing the following warnings: fs/nfsd/nfsfh.c: In function ‘nfsd_set_fh_dentry’: fs/nfsd/nfsfh.c:191:41: warning: array subscript 1 is above array bounds of ‘__u32[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds] 191 | ntohl((__force __be32)fh->fh_fsid[1]))); | ~~~~~~~~~~~^~~ ./include/linux/kdev_t.h:12:46: note: in definition of macro ‘MKDEV’ 12 | #define MKDEV(ma,mi) (((ma) << MINORBITS) | (mi)) | ^~ ./include/uapi/linux/byteorder/little_endian.h:40:26: note: in expansion of macro ‘__swab32’ 40 | #define __be32_to_cpu(x) __swab32((__force __u32)(__be32)(x)) | ^~~~~~~~ ./include/linux/byteorder/generic.h:136:21: note: in expansion of macro ‘__be32_to_cpu’ 136 | #define ___ntohl(x) __be32_to_cpu(x) | ^~~~~~~~~~~~~ ./include/linux/byteorder/generic.h:140:18: note: in expansion of macro ‘___ntohl’ 140 | #define ntohl(x) ___ntohl(x) | ^~~~~~~~ 
fs/nfsd/nfsfh.c:191:8: note: in expansion of macro ‘ntohl’ 191 | ntohl((__force __be32)fh->fh_fsid[1]))); | ^~~~~ fs/nfsd/nfsfh.c:192:32: warning: array subscript 2 is above array bounds of ‘__u32[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds] 192 | fh->fh_fsid[1] = fh->fh_fsid[2]; | ~~~~~~~~~~~^~~ fs/nfsd/nfsfh.c:192:15: warning: array subscript 1 is above array bounds of ‘__u32[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds] 192 | fh->fh_fsid[1] = fh->fh_fsid[2]; | ~~~~~~~~~~~^~~ [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/79 Link: https://github.com/KSPP/linux/issues/109 Signed-off-by: Gustavo A. R. Silva Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 117dac268d805a7f3990caab43ab62529686865a Author: Chuck Lever Date: Fri Jan 29 13:04:04 2021 -0500 SUNRPC: Export svc_xprt_received() [ Upstream commit 7dcfbd86adc45f6d6b37278efd22530cf80ab474 ] Prepare svc_xprt_received() to be called from transport code instead of from generic RPC server code. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 289adc864d0a0e754d9773887ce3c1493b76132b Author: NeilBrown Date: Sat Mar 20 09:38:04 2021 +1100 nfsd: report client confirmation status in "info" file [ Upstream commit 472d155a0631bd1a09b5c0c275a254e65605d683 ] mountd can now monitor clients appearing and disappearing in /proc/fs/nfsd/clients, and will log these events, in lieu of the logging of mount/unmount events for NFSv3. Currently it cannot distinguish between unconfirmed clients (which might be transient and totally uninteresting) and confirmed clients. So add a "status: " line which reports either "confirmed" or "unconfirmed", and use fsnotify to report that the info file has been modified. This requires a bit of infrastructure to keep the dentry for the "info" file.
There is no need to take a counted reference as the dentry must remain around until the client is removed. Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 14b13e0603f83b724da37e7eac2462fcd19d3ada Author: J. Bruce Fields Date: Thu Mar 18 20:03:22 2021 -0400 nfsd: don't ignore high bits of copy count [ Upstream commit e7a833e9cc6c3b58fe94f049d2b40943cba07086 ] Note size_t is 32-bit on a 32-bit architecture, but cp_count is defined by the protocol to be 64 bit, so we could be turning a large copy into a 0-length copy here. Reported-by: Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1f76b1e65926a460cf782e187defed6f482b5f15 Author: J. Bruce Fields Date: Thu Mar 18 20:03:23 2021 -0400 nfsd: COPY with length 0 should copy to end of file [ Upstream commit 792a5112aa90e59c048b601c6382fe3498d75db7 ] From https://tools.ietf.org/html/rfc7862#page-65 A count of 0 (zero) requests that all bytes from ca_src_offset through EOF be copied to the destination. Reported-by: Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ed01819390647a4fed22b56611bf0e2da6d27c66 Author: Ricardo Ribalda Date: Thu Mar 18 21:22:21 2021 +0100 nfsd: Fix typo "accesible" [ Upstream commit 34a624931b8c12b435b5009edc5897e4630107bc ] Trivial fix. Cc: linux-nfs@vger.kernel.org Signed-off-by: Ricardo Ribalda Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2a5df97ba41c5b519b9d46b439b1a9927d59e9ff Author: Paul Menzel Date: Fri Mar 12 22:03:00 2021 +0100 nfsd: Log client tracking type log message as info instead of warning [ Upstream commit f988a7b71d1e66e63f79cd59c763875347943a7a ] `printk()`, by default, uses the log level warning, which leaves the user reading NFSD: Using UMH upcall client tracking operations. wondering what to do about it (`dmesg --level=warn`). Several client tracking methods are tried, and expected to fail. That’s why a message is printed only on success.
It might be interesting for users to know the chosen method, so use info-level instead of debug-level. Cc: linux-nfs@vger.kernel.org Signed-off-by: Paul Menzel Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0fa20162bfc7f3b72faed475d0c4a083d859d4e3 Author: J. Bruce Fields Date: Tue Mar 2 10:46:23 2021 -0500 nfsd: helper for laundromat expiry calculations [ Upstream commit 7f7e7a4006f74b031718055a0751c70c2e3d5e7e ] We do this same logic repeatedly, and it's easy to get the sense of the comparison wrong. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit aab7be2475d1e49c9e83e6b7cef3a0edfe7ce199 Author: Chuck Lever Date: Fri Mar 5 14:22:32 2021 -0500 NFSD: Clean up NFSDDBG_FACILITY macro [ Upstream commit 219a170502b3d597c52eeec088aee8fbf7b90da5 ] These are no longer needed because there are no dprintk() call sites in these files. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e7dac943b4d4ad8fd9a171bfbf73ca161fd75157 Author: Chuck Lever Date: Fri Mar 5 13:57:40 2021 -0500 NFSD: Add a tracepoint to record directory entry encoding [ Upstream commit 6019ce0742ca55d3e45279a19b07d1542747a098 ] Enable watching the progress of directory encoding to capture the timing of any issues with reading or encoding a directory. The new tracepoint captures dirent encoding for all NFS versions. For example, here's what a few NFSv4 directory entries might look like: nfsd-989 [002] 468.596265: nfsd_dirent: fh_hash=0x5d162594 ino=2 name=. nfsd-989 [002] 468.596267: nfsd_dirent: fh_hash=0x5d162594 ino=1 name=.. 
nfsd-989 [002] 468.596299: nfsd_dirent: fh_hash=0x5d162594 ino=3827 name=zlib.c nfsd-989 [002] 468.596325: nfsd_dirent: fh_hash=0x5d162594 ino=3811 name=xdiff nfsd-989 [002] 468.596351: nfsd_dirent: fh_hash=0x5d162594 ino=3810 name=xdiff-interface.h nfsd-989 [002] 468.596377: nfsd_dirent: fh_hash=0x5d162594 ino=3809 name=xdiff-interface.c Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a6d9f6f371cb016ccccbf4f04f47074ab507eada Author: Chuck Lever Date: Sun Nov 15 15:09:16 2020 -0500 NFSD: Clean up after updating NFSv3 ACL encoders [ Upstream commit 1416f435303d81070c6bcf5a4a9b4ed0f7a9f013 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 857a37235cf0d2c5ee999b2f5f31dffdbdb209ae Author: Chuck Lever Date: Wed Nov 18 16:21:24 2020 -0500 NFSD: Update the NFSv3 SETACL result encoder to use struct xdr_stream [ Upstream commit 15e432bf0cfd1e6aebfa9ffd4e0cc2ff4f3ae2db ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d505e66191072748620fc0af038cea4e4da0e3cd Author: Chuck Lever Date: Wed Nov 18 16:11:42 2020 -0500 NFSD: Update the NFSv3 GETACL result encoder to use struct xdr_stream [ Upstream commit 20798dfe249a01ad1b12eec7dbc572db5003244a ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 67d4f36707ade27001eb50412bab24422bcc0cc4 Author: Chuck Lever Date: Sun Nov 15 14:31:42 2020 -0500 NFSD: Clean up after updating NFSv2 ACL encoders [ Upstream commit 83d0b84572775a29f800de67a1b9b642a5376bc3 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3d2033a58c6c9de62d98dd7bf5b93acd40903b5f Author: Chuck Lever Date: Wed Nov 18 14:52:09 2020 -0500 NFSD: Update the NFSv2 ACL ACCESS result encoder to use struct xdr_stream [ Upstream commit 07f5c2963c04b11603e9667f89bb430c132e9cc1 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6ef7a56fd7fa845fa13278dbb385a5f191659884 Author: Chuck Lever Date: Wed Nov 18 14:49:57 2020 -0500 NFSD: Update the NFSv2 ACL GETATTR result encoder to use struct xdr_stream [ Upstream 
commit 8d2009a10b3abaa12a39deb4876b215714993fe8 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 82ac35b16710034ac2eaddf598a3190ad9675cb8 Author: Chuck Lever Date: Wed Nov 18 14:47:56 2020 -0500 NFSD: Update the NFSv2 SETACL result encoder to use struct xdr_stream [ Upstream commit 778f068fa0c0846b650ebdb8795fd51b5badc332 ] The SETACL result encoder is exactly the same as the NFSv2 attrstatres decoder. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6677b0d16abe77702040768c96e2ea17cd5b3f6e Author: Chuck Lever Date: Wed Nov 18 14:38:47 2020 -0500 NFSD: Update the NFSv2 GETACL result encoder to use struct xdr_stream [ Upstream commit f8cba47344f794b54373189bec23195b51020faf ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 89ac9a8101ad4aff456116d2dcc9a56df3d9052d Author: Chuck Lever Date: Wed Nov 18 14:55:05 2020 -0500 NFSD: Add an xdr_stream-based encoder for NFSv2/3 ACLs [ Upstream commit 8edc0648880a151026fe625fa1b76772b5766f68 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 93584780eb4d8f58303c4d828b18bf7b136b917c Author: Chuck Lever Date: Sun Nov 15 14:30:13 2020 -0500 NFSD: Remove unused NFSv2 directory entry encoders [ Upstream commit 8a2cf9f5709cc20a1114a7d22655928314fc86f8 ] Clean up. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b8658c947d540cf5d135176354da5c8f2811efa0 Author: Chuck Lever Date: Sat Nov 14 13:45:35 2020 -0500 NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream [ Upstream commit f5dcccd647da513a89f3b6ca392b0c1eb050b9fc ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 801e4d79b779ff63b8f42bd4eff7f7eda9d538af Author: Chuck Lever Date: Fri Oct 23 16:49:01 2020 -0400 NFSD: Update the NFSv2 READDIR result encoder to use struct xdr_stream [ Upstream commit 94c8f8c682a6497af7ea71351b18f637c6337d42 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bc17759a4e9928a869542a714d1e997073b24413 Author: Chuck Lever Date: Fri Nov 13 16:57:44 2020 -0500 NFSD: Count bytes instead of pages in the NFSv2 READDIR encoder [ Upstream commit 8141d6a2bb6c655ff0c0b81ced80d9025f03e926 ] Clean up: Counting the bytes used by each returned directory entry seems less brittle to me than trying to measure consumed pages after the fact. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c4e272758974cba11fd3611a8f39aa227cd130e0 Author: Chuck Lever Date: Fri Nov 13 16:53:17 2020 -0500 NFSD: Add a helper that encodes NFSv3 directory offset cookies [ Upstream commit d52532002ffa217ad3fa4c3ba86c95203d21dd21 ] Refactor: Add helper function similar to nfs3svc_encode_cookie3(). 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 60bc5af5b8dc11cc2198b132c52b5105dbc13904 Author: Chuck Lever Date: Fri Oct 23 19:01:38 2020 -0400 NFSD: Update the NFSv2 STATFS result encoder to use struct xdr_stream [ Upstream commit bf15229f2ced4f14946eef958336f764e30f8efb ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ad0614d3a857363b6e3978ce8adb7d8fedd4e18b Author: Chuck Lever Date: Fri Oct 23 16:40:11 2020 -0400 NFSD: Update the NFSv2 READ result encoder to use struct xdr_stream [ Upstream commit a6f8d9dc9e44b51303d9abde4643460137d19b28 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 27909a583cc399a0587ae676733d4b210b2f92c4 Author: Chuck Lever Date: Fri Oct 23 15:41:09 2020 -0400 NFSD: Update the NFSv2 READLINK result encoder to use struct xdr_stream [ Upstream commit d9014b0f8fae11f22a3d356553844e06ddcdce4a ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9aab4f03e8f222452f3d234b6d38998b25a0e291 Author: Chuck Lever Date: Fri Oct 23 16:44:16 2020 -0400 NFSD: Update the NFSv2 diropres encoder to use struct xdr_stream [ Upstream commit e3b4ef221ac57c08341c97a10c8a81c041f76716 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c64d5d0ca9f94594f37d3f2f575433674a44feb8 Author: Chuck Lever Date: Fri Oct 23 15:28:59 2020 -0400 NFSD: Update the NFSv2 attrstat encoder to use struct xdr_stream [ Upstream commit 92b54a4fa4224e6116eb0d87a39dd05af23fcdfa ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 816c23c911f61ba7df9e60a0411aae7a683b342c Author: Chuck Lever Date: Fri Oct 23 11:08:02 2020 -0400 NFSD: Update the NFSv2 stat encoder to use struct xdr_stream [ Upstream commit a887eaed2a964754334cd3f8c5fe87e413e68fef ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e4e6019ce5a24c9acc8bbd21e0df38e85b49c132 Author: Chuck Lever Date: Fri Jan 15 09:28:44 2021 -0500 NFSD: Reduce svc_rqst::rq_pages churn during READDIR operations [ Upstream commit 
76ed0dd96eeb2771b21bf5dcbd88326ef89ee0ed ] During NFSv2 and NFSv3 READDIR/PLUS operations, NFSD advances rq_next_page to the full size of the client-requested buffer, then releases all those pages at the end of the request. The next request to use that nfsd thread has to refill the pages. NFSD does this even when the dirlist in the reply is small. With NFSv3 clients that send READDIR operations with large buffer sizes, that can be 256 put_page/alloc_page pairs per READDIR request, even though those pages often remain unused. We can save some work by not releasing dirlist buffer pages that were not used to form the READDIR Reply. I've left the NFSv2 code alone since there are never more than three pages involved in an NFSv2 READDIR Reply. Eventually we should nail down why these pages need to be released at all in order to avoid allocating and releasing pages unnecessarily. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d8554802010d13c60cbf147deec7b91db89adbbc Author: Chuck Lever Date: Fri Nov 13 11:27:13 2020 -0500 NFSD: Remove unused NFSv3 directory entry encoders [ Upstream commit 1411934627f9fe31a36ac8c43179ce9b63edce5c ] Clean up. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 37aa5e64022243e721b8334122997881177a4cfc Author: Chuck Lever Date: Thu Oct 22 19:46:58 2020 -0400 NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream [ Upstream commit 7f87fc2d34d475225e78b7f5c4eabb121f4282b2 ] The benefit of the xdr_stream helpers is that they transparently handle encoding an XDR data item that crosses page boundaries. Most of the open-coded logic to do that here can be eliminated. A sub-buffer and sub-stream are set up as a sink buffer for the directory entry encoder. As an entry is encoded, it is added to the end of the content in this buffer/stream. The total length of the directory list is tracked in the buffer's @len field. 
When it comes time to encode the Reply, the sub-buffer is merged into rq_res's page array at the correct place using xdr_write_pages(). Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7cbec0dc097a803307bca96bad20e73e645f1539 Author: Chuck Lever Date: Thu Oct 22 19:31:48 2020 -0400 NFSD: Update the NFSv3 READDIR3res encoder to use struct xdr_stream [ Upstream commit e4ccfe3014de435984939a3d84b7f241d3b57b0d ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit cacfe8f6d809fef3f966c017f2e8c9efa287e53c Author: Chuck Lever Date: Mon Nov 9 13:13:21 2020 -0500 NFSD: Count bytes instead of pages in the NFSv3 READDIR encoder [ Upstream commit a1409e2de4f11034c8eb30775cc3e37039a4ef13 ] Clean up: Counting the bytes used by each returned directory entry seems less brittle to me than trying to measure consumed pages after the fact. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3b2fef48b77c611c19081805b9378e1fe80ad8a6 Author: Chuck Lever Date: Tue Nov 10 09:57:14 2020 -0500 NFSD: Add a helper that encodes NFSv3 directory offset cookies [ Upstream commit a161e6c76aeba835e475a2f27dbbe5c37e565e94 ] Refactor: De-duplicate identical code that handles encoding of directory offset cookies across page boundaries. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 30dabf1d4fd4b7a9af6237169d118a92ebfcb47b Author: Chuck Lever Date: Thu Oct 22 15:35:46 2020 -0400 NFSD: Update the NFSv3 COMMIT3res encoder to use struct xdr_stream [ Upstream commit 5ef2826c761079e27904c85034df34e601b82d94 ] As an additional clean up, encode_wcc_data() is removed because it is now no longer used. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 349d96b070de15376b3db8fb0c8da4e1f46a0eab Author: Chuck Lever Date: Fri Nov 6 13:15:09 2020 -0500 NFSD: Update the NFSv3 PATHCONF3res encoder to use struct xdr_stream [ Upstream commit ded04a587f6ceaaba3caefad4021f2212b46c9ff ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4c06f831d28b8ffb436818440202c5795ed242e1 Author: Chuck Lever Date: Thu Oct 22 13:42:13 2020 -0400 NFSD: Update the NFSv3 FSINFO3res encoder to use struct xdr_stream [ Upstream commit 0a139d1b7f327010acc36e8162936d3108c7addb ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f6908e2bcd84b0ac74bf04f41363b3085be63c8b Author: Chuck Lever Date: Fri Nov 6 13:08:45 2020 -0500 NFSD: Update the NFSv3 FSSTAT3res encoder to use struct xdr_stream [ Upstream commit 8b7044984fd6eeadf72285e3617116bd15e9e676 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 066dc317fa65e975dc17a0d4e4e8998122d627d2 Author: Chuck Lever Date: Thu Oct 22 15:08:29 2020 -0400 NFSD: Update the NFSv3 LINK3res encoder to use struct xdr_stream [ Upstream commit 4d74380a446f75eebb2171687d9b8baf0025bdf1 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0404cffec4136da3db5ccda30c47a06f33f63ec0 Author: Chuck Lever Date: Thu Oct 22 15:33:05 2020 -0400 NFSD: Update the NFSv3 RENAMEv3res encoder to use struct xdr_stream [ Upstream commit 89d79e9672dfa6d0cc416699c16f2d312da58ff2 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1863ca4c9e29227eab16587938ebf3dc0bcea920 Author: Chuck Lever Date: Thu Oct 22 15:27:23 2020 -0400 NFSD: Update the NFSv3 CREATE family of encoders to use struct xdr_stream [ Upstream commit 78315b36781d259dcbdc102ff22c3f2f25712223 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8737a75f265dd4e1b448338ec989fa1c551063ee Author: Chuck Lever Date: Thu Oct 22 15:26:31 2020 -0400 NFSD: Update the NFSv3 WRITE3res encoder to use struct xdr_stream [ Upstream commit 
ecb7a085ac15a8844ebf12fca6ae51ce71ac9b3b ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b241ab982373b0b16fd2145ef6a745222ad9bc54 Author: Chuck Lever Date: Thu Oct 22 15:23:50 2020 -0400 NFSD: Update the NFSv3 READ3res encode to use struct xdr_stream [ Upstream commit cc9bcdad7773c295375e66c892c7ac00524706f2 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 170e6bd25e691e2a9e2ed69437491354b46c9808 Author: Chuck Lever Date: Thu Oct 22 15:18:40 2020 -0400 NFSD: Update the NFSv3 READLINK3res encoder to use struct xdr_stream [ Upstream commit 9a9c8923b3efd593d0e6a405efef9d58c6e6804b ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c3995f8be13a4265223125131a9fd472634ea00f Author: Chuck Lever Date: Thu Oct 22 15:12:38 2020 -0400 NFSD: Update the NFSv3 wccstat result encoder to use struct xdr_stream [ Upstream commit 70f8e839859a994e324e1d18889f8319bbd5bff9 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f74e0652a60b0d98de5c9617ea5d80c940a22c7a Author: Chuck Lever Date: Thu Oct 22 14:46:58 2020 -0400 NFSD: Update the NFSv3 LOOKUP3res encoder to use struct xdr_stream [ Upstream commit 5cf353354af1a385f29dec4609a1532d32c83a25 ] Also, clean up: Rename the encoder function to match the name of the result structure in RFC 1813, consistent with other encoder function names in nfs3xdr.c. "diropres" is an NFSv2 thingie. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit fd9e183df6255038f78a40993ba694a6d3b50bdd Author: Chuck Lever Date: Thu Oct 22 13:56:58 2020 -0400 NFSD: Update the NFSv3 ACCESS3res encoder to use struct xdr_stream [ Upstream commit 907c38227fb57f5c537491ca76dd0b9636029393 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0ef12d755c4b0d43c72a3457aa5eb965a66ff722 Author: Chuck Lever Date: Wed Oct 21 11:58:41 2020 -0400 NFSD: Update the GETATTR3res encoder to use struct xdr_stream [ Upstream commit 2c42f804d30f6a8d86665eca84071b316821ea08 ] As an additional clean up, some renaming is done to more closely reflect the data type and variable names used in the NFSv3 XDR definition provided in RFC 1813. "attrstat" is an NFSv2 thingie. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 48aadfa75b619611206351754c318aaeb71aa45c Author: Chuck Lever Date: Tue Oct 27 15:53:42 2020 -0400 NFSD: Extract the svcxdr_init_encode() helper [ Upstream commit bddfdbcddbe267519cd36aeb115fdf8620980111 ] NFSD initializes an encode xdr_stream only after the RPC layer has already inserted the RPC Reply header. Thus it behaves differently than xdr_init_encode does, which assumes the passed-in xdr_buf is entirely devoid of content. nfs4proc.c has this server-side stream initialization helper, but it is visible only to the NFSv4 code. Move this helper to a place that can be accessed by NFSv2 and NFSv3 server XDR functions. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e864d4d834f820c8be01c9d048cdab6051fe3932 Author: Christian Brauner Date: Thu Jan 21 14:19:32 2021 +0100 namei: introduce struct renamedata [ Upstream commit 9fe61450972d3900bffb1dc26a17ebb9cdd92db2 ] In order to handle idmapped mounts we will extend the vfs rename helper to take two new arguments in follow up patches. Since this operation already takes a bunch of arguments, add a simple struct renamedata and make the current helper use it before we extend it.
Link: https://lore.kernel.org/r/20210121131959.646623-14-christian.brauner@ubuntu.com Cc: Christoph Hellwig Cc: David Howells Cc: Al Viro Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig [ cel: backported to 5.10.y, prior to idmapped mounts ] Signed-off-by: Christian Brauner Signed-off-by: Sasha Levin commit b0fa673c8c248ec2a0b6e563fb586df355b4f427 Author: Christian Brauner Date: Thu Jan 21 14:19:22 2021 +0100 fs: add file and path permissions helpers [ Upstream commit 02f92b3868a1b34ab98464e76b0e4e060474ba10 ] Add two simple helpers to check permissions on a file and path respectively and convert over some callers. It simplifies quite a few codepaths and also reduces the churn in later patches quite a bit. Christoph also correctly points out that this makes codepaths (e.g. ioctls) way easier to follow that would otherwise have to do more complex argument passing than necessary. Link: https://lore.kernel.org/r/20210121131959.646623-4-christian.brauner@ubuntu.com Cc: David Howells Cc: Al Viro Cc: linux-fsdevel@vger.kernel.org Suggested-by: Christoph Hellwig Reviewed-by: Christoph Hellwig Reviewed-by: James Morris Signed-off-by: Christian Brauner Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 666a413295929ca30703ac162886f205ff2ae6ad Author: Christoph Hellwig Date: Tue Feb 2 13:13:27 2021 +0100 kallsyms: only build {,module_}kallsyms_on_each_symbol when required [ Upstream commit 3e3552056ab42f883d7723eeb42fed712b66bacf ] kallsyms_on_each_symbol and module_kallsyms_on_each_symbol are only used by the livepatching code, so don't build them if livepatching is not enabled. 
Reviewed-by: Miroslav Benes Signed-off-by: Christoph Hellwig Signed-off-by: Jessica Yu Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f8d8568627241ac8501288206d1e620b250bac7c Author: Christoph Hellwig Date: Tue Feb 2 13:13:26 2021 +0100 kallsyms: refactor {,module_}kallsyms_on_each_symbol [ Upstream commit 013c1667cf78c1d847152f7116436d82dcab3db4 ] Require an explicit call to module_kallsyms_on_each_symbol to look for symbols in modules instead of the call from kallsyms_on_each_symbol, and acquire module_mutex inside of module_kallsyms_on_each_symbol instead of leaving that up to the caller. Note that this slightly changes the behavior for the livepatch code in that the symbols from vmlinux are not iterated anymore if objname is set, but that actually is the desired behavior in this case. Reviewed-by: Petr Mladek Acked-by: Miroslav Benes Signed-off-by: Christoph Hellwig Signed-off-by: Jessica Yu Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bef9d8b4f84b6e8482d72096a049139898a2213c Author: Christoph Hellwig Date: Tue Feb 2 13:13:25 2021 +0100 module: use RCU to synchronize find_module [ Upstream commit a006050575745ca2be25118b90f1c37f454ac542 ] Allow for a RCU-sched critical section around find_module, following the lower level find_module_all helper, and switch the two callers outside of module.c to use such a RCU-sched critical section instead of module_mutex. Reviewed-by: Petr Mladek Acked-by: Miroslav Benes Signed-off-by: Christoph Hellwig Signed-off-by: Jessica Yu Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 32edffff869a224f0c4705e4c46a0598e49cf7a6 Author: Christoph Hellwig Date: Tue Feb 2 13:13:24 2021 +0100 module: unexport find_module and module_mutex [ Upstream commit 089049f6c9956c5cf1fc89fe10229c76e99f4bef ] find_module is not used by modular code any more, and random driver code has no business calling it to start with. 
Reviewed-by: Miroslav Benes Signed-off-by: Christoph Hellwig Signed-off-by: Jessica Yu Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 51f620fcc419913315623e81b69a4f2bc59100a8 Author: Shakeel Butt Date: Sat Dec 19 20:46:08 2020 -0800 inotify, memcg: account inotify instances to kmemcg [ Upstream commit ac7b79fd190b02e7151bc7d2b9da692f537657f3 ] Currently the fs sysctl inotify/max_user_instances is used to limit the number of inotify instances on the system. For systems running multiple workloads, the per-user namespace sysctl max_inotify_instances can be used to further partition inotify instances. However there is no easy way to set a sensible system level max limit on inotify instances and further partition it between the workloads. It is much easier to charge the underlying resource (i.e. memory) behind the inotify instances to the memcg of the workload and let their memory limits limit the number of inotify instances they can create. With inotify instances charged to memcg, the admin can simply set max_user_instances to INT_MAX and let the memcg limits of the jobs limit their inotify instances. Link: https://lore.kernel.org/r/20201220044608.1258123-1-shakeelb@google.com Reviewed-by: Amir Goldstein Signed-off-by: Shakeel Butt Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c1fe2bb305a23b4b9060b44b44ac9d34b8750e3b Author: J. Bruce Fields Date: Fri Jan 29 14:27:01 2021 -0500 nfsd: skip some unnecessary stats in the v4 case [ Upstream commit 428a23d2bf0ca8fd4d364a464c3e468f0e81671e ] In the typical case of v4 and an i_version-supporting filesystem, we can skip a stat which is only required to fake up a change attribute from ctime. Signed-off-by: J. Bruce Fields Reviewed-by: Christoph Hellwig Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0220d51186482f54199dbb3349b951235c6bca69 Author: J. 
Bruce Fields Date: Fri Jan 29 14:26:29 2021 -0500 nfs: use change attribute for NFS re-exports [ Upstream commit 3cc55f4434b421d37300aa9a167ace7d60b45ccf ] When exporting NFS, we may as well use the real change attribute returned by the original server instead of faking up a change attribute from the ctime. Note we can't do that by setting I_VERSION--that would also turn on the logic in iversion.h which treats the lower bit specially, and that doesn't make sense for NFS. So instead we define a new export operation for filesystems like NFS that want to manage the change attribute themselves. Signed-off-by: J. Bruce Fields Reviewed-by: Christoph Hellwig Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5a0b45626fc199bc27858e9e4333a829c80fb875 Author: Dai Ngo Date: Thu Jan 28 01:42:26 2021 -0500 NFSv4_2: SSC helper should use its own config. [ Upstream commit 02591f9febd5f69bb4c266a4abf899c4cf21964f ] Currently NFSv4_2 SSC helper, nfs_ssc, incorrectly uses GRACE_PERIOD as its config. Fix by adding new config NFS_V4_2_SSC_HELPER which depends on NFS_V4_2 and is automatically selected when NFSD_V4 is enabled. Also removed the file name from a comment in nfs_ssc.c. Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b267f61182c1054e4b7e6ce91563432e464094e7 Author: J. Bruce Fields Date: Thu Jan 21 17:57:45 2021 -0500 nfsd: cstate->session->se_client -> cstate->clp [ Upstream commit ec59659b4972ec25851aa03b4b5baba6764a62e4 ] I'm not sure why we're writing this out the hard way in so many places. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bc6015541cda7063077201415b1b1f3310b67923 Author: J. Bruce Fields Date: Thu Jan 21 17:57:44 2021 -0500 nfsd: simplify nfsd4_check_open_reclaim [ Upstream commit 1722b04624806ced51693f546edb83e8b2297a77 ] The set_client() was already taken care of by process_open1(). The comments here are mostly redundant with the code. Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 25ac4fdbdce718edc7cbb5e71760eb746ac026c2 Author: J. Bruce Fields Date: Thu Jan 21 17:57:43 2021 -0500 nfsd: remove unused set_client argument [ Upstream commit f71475ba8c2a77fff8051903cf4b7d826c3d1693 ] Every caller is setting this argument to false, so we don't need it. Also cut this comment a bit and remove an unnecessary warning. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 87ab73c1cc75ec55b012552811c66443b2b71651 Author: J. Bruce Fields Date: Thu Jan 21 17:57:42 2021 -0500 nfsd: find_cpntf_state cleanup [ Upstream commit 47fdb22dacae78f37701d82a94c16a014186d34e ] I think this unusual use of struct compound_state could cause confusion. It's not that much more complicated just to open-code this stateid lookup. The only change in behavior should be a different error return in the case the copy is using a source stateid that is a revoked delegation, but I doubt that matters. Signed-off-by: J. Bruce Fields [ cel: squashed in fix reported by Coverity ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1d4ccfdc7d0eaf941fddce788f2abd781193e1e2 Author: J. Bruce Fields Date: Thu Jan 21 17:57:41 2021 -0500 nfsd: refactor set_client [ Upstream commit 7950b5316e40d99dcb85ab81a2d1dbb913d7c1c8 ] This'll be useful elsewhere. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 17006574683f3b4a62dbaa275260731cef369485 Author: J. Bruce Fields Date: Thu Jan 21 17:57:40 2021 -0500 nfsd: rename lookup_clientid->set_client [ Upstream commit 460d27091ae2c23e7ac959a61cd481c58832db58 ] I think this is a better name, and I'm going to reuse elsewhere the code that does the lookup itself. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ea92c0768f98c733280b6ab6326ea2f11deac502 Author: J. 
Bruce Fields Date: Thu Jan 21 17:57:39 2021 -0500 nfsd: simplify nfsd_renew [ Upstream commit b4587eb2cf4b6271f67fb93b75f7de2a2026e853 ] You can take the single-exit thing too far, I think. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 52923f25be3c3479a2af17f94c80b85b5bb4c383 Author: J. Bruce Fields Date: Thu Jan 21 17:57:38 2021 -0500 nfsd: simplify process_lock [ Upstream commit a9d53a75cf574d6aa41f3cb4968fffe4f64e0fad ] Similarly, this STALE_CLIENTID check is already handled by: nfs4_preprocess_confirmed_seqid_op()-> nfs4_preprocess_seqid_op()-> nfsd4_lookup_stateid()-> set_client()-> STALE_CLIENTID() (This may cause it to return a different error in some cases where there are multiple things wrong; pynfs test SEQ10 regressed on this commit because of that, but I think that's the test's fault, and I've fixed it separately.) Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4f26b1747a2ec827045dc1d8f04b641c76e1d5c9 Author: J. Bruce Fields Date: Thu Jan 21 17:57:37 2021 -0500 nfsd4: simplify process_lookup1 [ Upstream commit 33311873adb0d55c287b164117b5b4bb7b1bdc40 ] This STALE_CLIENTID check is redundant with the one in lookup_clientid(). There's a difference in behavior in case of memory allocation failure, which I think isn't a big deal. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 42cf742d8626ccb051042de7c9d991b271c9d32c Author: Amir Goldstein Date: Wed Jan 6 09:52:36 2021 +0200 nfsd: report per-export stats [ Upstream commit 20ad856e47323e208ae8d6a9ecfe5bf0be6f505e ] Collect some nfsd stats per export in addition to the global stats. A new nfsdfs export_stats file is created. It uses the same ops as the exports file to iterate the export entries and we use the file's name to determine the reported info per export.
For example:
  $ cat /proc/fs/nfsd/export_stats
  # Version 1.1
  # Path Client Start-time
  # Stats
  /test localhost 92
  fh_stale: 0
  io_read: 9
  io_write: 1
Every export entry reports the start time when stats collection started, so stats collecting scripts can know if stats were reset between samples. Signed-off-by: Amir Goldstein Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 65b1df1358842854ee4b9830b41ccfdf7ae57246 Author: Amir Goldstein Date: Wed Jan 6 09:52:35 2021 +0200 nfsd: protect concurrent access to nfsd stats counters [ Upstream commit e567b98ce9a4b35b63c364d24828a9e5cd7a8179 ] nfsd stats counters can be updated by concurrent nfsd threads without any protection. Convert some nfsd_stats and nfsd_net struct members to use percpu counters. The longest_chain* members of struct nfsd_net remain unprotected. Signed-off-by: Amir Goldstein Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d1344de0d66da89f8df00969a17cdb3f76a95800 Author: Amir Goldstein Date: Wed Jan 6 09:52:34 2021 +0200 nfsd: remove unused stats counters [ Upstream commit 1b76d1df1a3683b6b23cd1c813d13c5e6a9d35e5 ] Commit 501cb1849f86 ("nfsd: rip out the raparms cache") removed the code that updates read-ahead cache stats counters, commit 8bbfa9f3889b ("knfsd: remove the nfsd thread busy histogram") removed code that updates the thread busy stats counters back in 2009 and code that updated filehandle cache stats was removed back in 2002. Remove the unused stats counters from nfsd_stats struct and print hardcoded zeros in /proc/net/rpc/nfsd.
Signed-off-by: Amir Goldstein Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0a13baa6ab5a9e035e8e78618c8253e25299b5b2 Author: Chuck Lever Date: Tue Oct 20 09:56:52 2020 -0400 NFSD: Clean up after updating NFSv3 ACL decoders [ Upstream commit 9cee763ee654ce8622d673b8e32687d738e24ace ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 22af3dfbe657a1548779e1dab5bad0a61418d29d Author: Chuck Lever Date: Tue Nov 17 11:56:26 2020 -0500 NFSD: Update the NFSv2 SETACL argument decoder to use struct xdr_stream [ Upstream commit 68519ff2a1c72c67fcdc4b81671acda59f420af9 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f89e3fa89e465bee9a569d5b792b943c16e06b80 Author: Chuck Lever Date: Tue Nov 17 11:52:04 2020 -0500 NFSD: Update the NFSv3 GETACL argument decoder to use struct xdr_stream [ Upstream commit 05027eafc266487c6e056d10ab352861df95b5d4 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5ea5e56cfb57797b907d810dcd222735c886a873 Author: Chuck Lever Date: Mon Oct 19 17:49:16 2020 -0400 NFSD: Clean up after updating NFSv2 ACL decoders [ Upstream commit baadce65d6ee3032b921d9c043ba808bc69d6b13 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 81f79eb2237b76ed5faf762239beaede7d2b821f Author: Chuck Lever Date: Tue Nov 17 11:49:29 2020 -0500 NFSD: Update the NFSv2 ACL ACCESS argument decoder to use struct xdr_stream [ Upstream commit 64063892efc1daa3a48882673811ff327ba75ed5 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9eea3915dd811b941cdb8b5114c4e384cfaa60b2 Author: Chuck Lever Date: Tue Nov 17 11:46:50 2020 -0500 NFSD: Update the NFSv2 ACL GETATTR argument decoder to use struct xdr_stream [ Upstream commit 571d31f37a57729c9d3463b5a692a84e619b408a ] Since the ACL GETATTR procedure is the same as the normal GETATTR procedure, simply re-use nfssvc_decode_fhandleargs. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 508a791fbe87bdbeb88c1d29eaa487ae823cb413 Author: Chuck Lever Date: Tue Nov 17 11:37:35 2020 -0500 NFSD: Update the NFSv2 SETACL argument decoder to use struct xdr_stream [ Upstream commit 427eab3ba22891845265f9a3846de6ac152ec836 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e077857ef0f84ba0ea99b2be796e7264a136c90a Author: Chuck Lever Date: Tue Nov 17 10:38:46 2020 -0500 NFSD: Add an xdr_stream-based decoder for NFSv2/3 ACLs [ Upstream commit 6bb844b4eb6e3b109a2fdaffb60e6da722dc4356 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ea6b0e02dcacac18adb300636ce30e88f64dc912 Author: Chuck Lever Date: Tue Nov 17 11:32:04 2020 -0500 NFSD: Update the NFSv2 GETACL argument decoder to use struct xdr_stream [ Upstream commit 635a45d34706400c59c3b18ca9fccba195147bda ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e44061388635614756ede3f21369c3a43ef17ee5 Author: Chuck Lever Date: Tue Oct 20 10:08:19 2020 -0400 NFSD: Remove argument length checking in nfsd_dispatch() [ Upstream commit 5650682e16f41722f735b7beeb2dbc3411dfbeb6 ] Now that the argument decoders for NFSv2 and NFSv3 use the xdr_stream mechanism, the version-specific length checking logic in nfsd_dispatch() is no longer necessary. 
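A minimal userspace sketch of why a stream-style decoder makes a separate up-front length check redundant: every read goes through one bounds-checked helper, so a short message fails at the point of decoding. The names `demo_xdr`, `demo_inline_decode`, and `demo_decode_u32` are hypothetical and only mimic the general shape of the kernel's xdr_stream helpers; this is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a received message (not the kernel's struct xdr_stream). */
struct demo_xdr {
	const unsigned char *p;   /* next undecoded byte */
	const unsigned char *end; /* one past the last byte of the message */
};

/*
 * Return a pointer to the next nbytes of the message and advance the cursor,
 * or return NULL (leaving the cursor untouched) if the message is too short.
 */
static const unsigned char *demo_inline_decode(struct demo_xdr *x, size_t nbytes)
{
	const unsigned char *cur = x->p;

	assert(x->p <= x->end);
	if ((size_t)(x->end - x->p) < nbytes)
		return NULL;
	x->p += nbytes;
	return cur;
}

/* Decode one big-endian 32-bit value; returns 0 on success, -1 on a short message. */
static int demo_decode_u32(struct demo_xdr *x, uint32_t *val)
{
	const unsigned char *b = demo_inline_decode(x, 4);

	if (!b)
		return -1;
	*val = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
	return 0;
}
```

Because each field decoder reports truncation itself, a dispatcher built this way has no need to know the expected argument size for each procedure in advance.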
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7e6746027b058a0f01a73d3ed2ed352c7cdaf28f Author: Chuck Lever Date: Wed Oct 21 12:46:03 2020 -0400 NFSD: Update the NFSv2 SYMLINK argument decoder to use struct xdr_stream [ Upstream commit 09f75a5375ac61f4adb94da0accc1cfc60eb4f2b ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1db54ce543bc184d36a623e7e7c31a393287a364 Author: Chuck Lever Date: Wed Oct 21 12:43:58 2020 -0400 NFSD: Update the NFSv2 CREATE argument decoder to use struct xdr_stream [ Upstream commit 7dcf65b91ecaf60ce593e7859ae2b29b7c46ccbd ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 40de4113f801f3e63e0150acbe1cf77298e6a27e Author: Chuck Lever Date: Wed Oct 21 12:39:06 2020 -0400 NFSD: Update the NFSv2 SETATTR argument decoder to use struct xdr_stream [ Upstream commit 2fdd6bd293b9e7dda61220538b2759fbf06f5af0 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ebfb21605f1af172fe47710c41e10fd778f75ff5 Author: Chuck Lever Date: Wed Oct 21 12:34:24 2020 -0400 NFSD: Update the NFSv2 LINK argument decoder to use struct xdr_stream [ Upstream commit 77edcdf91f6245a9881b84e4e101738148bd039a ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a362dd478be0769426baceba48e196024e648d57 Author: Chuck Lever Date: Wed Oct 21 12:35:41 2020 -0400 NFSD: Update the NFSv2 RENAME argument decoder to use struct xdr_stream [ Upstream commit 62aa557efb81ea3339fabe7f5b1a343e742bbbdf ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0047abd4c411d8976a98ddf9ba1b8a5737dff9bb Author: Chuck Lever Date: Mon Oct 19 14:33:24 2020 -0400 NFSD: Update NFSv2 diropargs decoding to use struct xdr_stream [ Upstream commit 6d742c1864c18f143ea2031f1ed66bcd8f4812de ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7d9ab8ee576f034bdfaf792704b095ec7c750684 Author: Chuck Lever Date: Mon Oct 19 14:15:51 2020 -0400 NFSD: Update the NFSv2 READDIR argument decoder to use struct xdr_stream [ Upstream commit 
8688361ae2edb8f7e61d926dc5000c9a44f29370 ] As an additional clean up, move code not related to XDR decoding into readdir's .pc_func call out. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 672111a408726d0a0a24782ded902b0323160440 Author: Chuck Lever Date: Fri Nov 13 17:03:49 2020 -0500 NFSD: Add helper to set up the pages where the dirlist is encoded [ Upstream commit 788cd46ecf83ee2d561cb4e754e276dc8089b787 ] Add a helper similar to nfsd3_init_dirlist_pages(). Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 365835d2ff67e1a1e78dfd23b315a9ed0f8185bc Author: Chuck Lever Date: Wed Oct 21 12:21:25 2020 -0400 NFSD: Update the NFSv2 READLINK argument decoder to use struct xdr_stream [ Upstream commit 1fcbd1c9456ba129d38420e345e91c4b6363db47 ] If the code that sets up the sink buffer for nfsd_readlink() is moved adjacent to the nfsd_readlink() call site that uses it, then the only argument is a file handle, and the fhandle decoder can be used instead. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ecee6ba5920cfbd63646750b87db445583fd9c7c Author: Chuck Lever Date: Wed Oct 21 12:18:36 2020 -0400 NFSD: Update the NFSv2 WRITE argument decoder to use struct xdr_stream [ Upstream commit a51b5b737a0be93fae6ea2a18df03ab2359a3f4b ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6e88b7ec6cd52d0ccf2a1961f1550093f0f2c581 Author: Chuck Lever Date: Wed Oct 21 12:15:51 2020 -0400 NFSD: Update the NFSv2 READ argument decoder to use struct xdr_stream [ Upstream commit 8c293ef993c8df0b1bea9ecb0de6eb96dec3ac9d ] The code that sets up rq_vec is refactored so that it is now adjacent to the nfsd_read() call site where it is used. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ba7e0412fb5a7786de292357cfa94dc35ae5778a Author: Chuck Lever Date: Wed Oct 21 12:14:23 2020 -0400 NFSD: Update the NFSv2 GETATTR argument decoder to use struct xdr_stream [ Upstream commit ebcd8e8b28535b643a4c06685bd363b3b73a96af ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9ceeee0ec8877bb93e1514353cfe86b8602f67d8 Author: Chuck Lever Date: Tue Oct 20 17:04:03 2020 -0400 NFSD: Update the MKNOD3args decoder to use struct xdr_stream [ Upstream commit f8a38e2d6c885f9d7cd03febc515d36293de4a5b ] This commit removes the last usage of the original decode_sattr3(), so it is removed as a clean-up. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8841760f685bb50c67f8529bf4b821389d0ec002 Author: Chuck Lever Date: Tue Oct 20 16:01:16 2020 -0400 NFSD: Update the SYMLINK3args decoder to use struct xdr_stream [ Upstream commit da39201637297460c13134c29286a00f3a1c92fe ] Similar to the WRITE decoder, code that checks the sanity of the payload size is re-wired to work with xdr_stream infrastructure. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b5d1ae6cc4c2d6677ed115f376188b3050499dca Author: Chuck Lever Date: Tue Oct 20 17:02:16 2020 -0400 NFSD: Update the MKDIR3args decoder to use struct xdr_stream [ Upstream commit 83374c278db193f3e8b2608b45da1132b867a760 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bd54084b587f833bff139784ec6259d0cc55bcef Author: Chuck Lever Date: Tue Oct 20 15:56:11 2020 -0400 NFSD: Update the CREATE3args decoder to use struct xdr_stream [ Upstream commit 6b3a11960d898b25a30103cc6a2ff0b24b90a83b ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 48ea0cb79b45e1f139a42257cb46ed8fc6aaca1c Author: Chuck Lever Date: Tue Oct 20 15:48:22 2020 -0400 NFSD: Update the SETATTR3args decoder to use struct xdr_stream [ Upstream commit 9cde9360d18d8b352b737d10f90f2aecccf93dbe ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 71d7e7c6a6f4a4853612afe058e3fc923a4045fb Author: Chuck Lever Date: Mon Oct 19 13:26:32 2020 -0400 NFSD: Update the LINK3args decoder to use struct xdr_stream [ Upstream commit efaa1e7c2c7475f0a9bbeb904d9aba09b73dd52a ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e84af2339181503d34edd5fce1e1af8c02cceaa6 Author: Chuck Lever Date: Tue Oct 20 15:44:12 2020 -0400 NFSD: Update the RENAME3args decoder to use struct xdr_stream [ Upstream commit d181e0a4bef36ee74d1338e5b5c2561d7463a5d0 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 69e54a4470a43190ac5e61103c00903847099e21 Author: Chuck Lever Date: Tue Oct 20 15:42:33 2020 -0400 NFSD: Update the NFSv3 DIROPargs decoder to use struct xdr_stream [ Upstream commit 54d1d43dc709f58be38d278bfc38e9bfb38d35fc ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 47614a374e652b250355c7fa55352ac94cc5b56a Author: Chuck Lever Date: Tue Oct 20 14:41:56 2020 -0400 NFSD: Update COMMIT3arg decoder to use struct xdr_stream [ Upstream commit c8d26a0acfe77f0880e0acfe77e4209cf8f3a38b ] Signed-off-by: Chuck Lever 
Signed-off-by: Sasha Levin commit fbcd668016107be7e0007b7b5fcc4bf7e63c5bcf Author: Chuck Lever Date: Mon Oct 19 13:23:52 2020 -0400 NFSD: Update READDIR3args decoders to use struct xdr_stream [ Upstream commit 9cedc2e64c296efb3bebe93a0ceeb5e71e8d722d ] As an additional clean up, neither nfsd3_proc_readdir() nor nfsd3_proc_readdirplus() make use of the dircount argument, so remove it from struct nfsd3_readdirargs. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e0ddafcc25e5a1f1686769fa14e2c7ffa97266a8 Author: Chuck Lever Date: Tue Nov 17 09:50:23 2020 -0500 NFSD: Add helper to set up the pages where the dirlist is encoded [ Upstream commit 40116ebd0934cca7e46423bdb3397d3d27eb9fb9 ] De-duplicate some code that is used by both READDIR and READDIRPLUS to build the dirlist in the Reply. Because this code is not related to decoding READ arguments, it is moved to a more appropriate spot. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 29270d477fff9be7d068a15bfdf005658cbf119e Author: Chuck Lever Date: Tue Nov 10 10:24:39 2020 -0500 NFSD: Fix returned READDIR offset cookie [ Upstream commit 0a8f37fb34a96267c656f7254e69bb9a2fc89fe4 ] Code inspection shows that the server's NFSv3 READDIR implementation handles offset cookies slightly differently than the NFSv2 READDIR, NFSv3 READDIRPLUS, and NFSv4 READDIR implementations, and there doesn't seem to be any need for this difference. As a clean up, I copied the logic from nfsd3_proc_readdirplus(). Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 19285d319f7c143f8cc0d461c7b626c08ff8f046 Author: Chuck Lever Date: Sat Oct 24 12:51:18 2020 -0400 NFSD: Update READLINK3arg decoder to use struct xdr_stream [ Upstream commit 224c1c894e48cd72e4dd9fb6311be80cbe1369b0 ] The NFSv3 READLINK request takes a single filehandle, so it can re-use GETATTR's decoder. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5f36ae59d6cc55437474fc02a450b62b885bd9be Author: Chuck Lever Date: Thu Oct 22 11:14:55 2020 -0400 NFSD: Update WRITE3arg decoder to use struct xdr_stream [ Upstream commit c43b2f229a01969a7ccf94b033c5085e0ec2040c ] As part of the update, open code that sanity-checks the size of the data payload against the length of the RPC Call message has to be re-implemented to use xdr_stream infrastructure. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b77a4a968d1d2ab9f36556efc8e73772e28f8c90 Author: Chuck Lever Date: Tue Oct 20 14:34:40 2020 -0400 NFSD: Update READ3arg decoder to use struct xdr_stream [ Upstream commit be63bd2ac6bbf8c065a0ef6dfbea76934326c352 ] The code that sets up rq_vec is refactored so that it is now adjacent to the nfsd_read() call site where it is used. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7bb23be4501bb0c760ae182b0943836fe5e12e97 Author: Chuck Lever Date: Tue Oct 20 14:32:04 2020 -0400 NFSD: Update ACCESS3arg decoder to use struct xdr_stream [ Upstream commit 3b921a2b14251e9e203f1e8af76e8ade79f50e50 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d668aa92a6240269e25eb53343215f55245245ec Author: Chuck Lever Date: Tue Oct 20 14:30:02 2020 -0400 NFSD: Update GETATTR3args decoder to use struct xdr_stream [ Upstream commit 9575363a9e4c8d7e2f9ba5e79884d623fff0be6f ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 22b19656eaacd32c7b679d853c68b098337c52c6 Author: Chuck Lever Date: Fri Nov 27 17:37:02 2020 -0500 SUNRPC: Move definition of XDR_UNIT [ Upstream commit 81d217474326b25d7f14274b02fe3da1e85ad934 ] Clean up: The unit of XDR alignment is defined by RFC 4506, not as part of the RPC message header. Thus it belongs in include/linux/sunrpc/xdr.h. 
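To illustrate what the XDR unit means in practice: RFC 4506 encodes every item in a multiple of four bytes, so variable-length data is zero-padded up to the next four-byte boundary. The sketch below is userspace C with a literal 4 standing in for the kernel macro; the helper names `xdr_padded_len` and `xdr_pad_bytes` are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 4506: the basic block size of XDR is four bytes (literal used here for illustration). */
#define XDR_UNIT 4u

/* Hypothetical helper: bytes an opaque item of `len` bytes occupies on the wire after padding. */
static size_t xdr_padded_len(size_t len)
{
	assert(len <= SIZE_MAX - (XDR_UNIT - 1)); /* guard against overflow when rounding up */
	return (len + XDR_UNIT - 1) & ~(size_t)(XDR_UNIT - 1);
}

/* Hypothetical helper: number of zero bytes appended to reach the next XDR unit boundary. */
static size_t xdr_pad_bytes(size_t len)
{
	return xdr_padded_len(len) - len;
}
```

For instance, a 5-byte opaque value occupies 8 bytes on the wire, 3 of which are padding, while an 8-byte value needs no padding.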
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 97d254cba30d04c968080ed5018e0b2bba7ec430 Author: Chuck Lever Date: Thu Dec 3 10:22:09 2020 -0500 SUNRPC: Display RPC procedure names instead of proc numbers [ Upstream commit 89ff87494c6e4b32ea7960d0c644efdbb2fe6ef5 ] Make the sunrpc trace subsystem trace events easier to use. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c336597d03ec4f3ce270312a064309bc9dc316ac Author: Chuck Lever Date: Thu Sep 17 17:22:49 2020 -0400 SUNRPC: Make trace_svc_process() display the RPC procedure symbolically [ Upstream commit 2289e87b5951f97783f07fc895e6c5e804b53668 ] The next few patches will employ these strings to help make server- side trace logs more human-readable. A similar technique is already in use in kernel RPC client code. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5b82798f78f9861b01c6c725c062de91f80ad489 Author: Chuck Lever Date: Fri Dec 18 12:28:58 2020 -0500 NFSD: Restore NFSv4 decoding's SAVEMEM functionality [ Upstream commit 7b723008f9c95624c848fad661c01b06e47b20da ] While converting the NFSv4 decoder to use xdr_stream-based XDR processing, I removed the old SAVEMEM() macro. This macro wrapped a bit of logic that avoided a memory allocation by recognizing when the decoded item resides in a linear section of the Receive buffer. In that case, it returned a pointer into that buffer instead of allocating a bounce buffer. The bounce buffer is necessary only when xdr_inline_decode() has placed the decoded item in the xdr_stream's scratch buffer, which disappears the next time xdr_inline_decode() is called with that xdr_stream. That happens only if the data item crosses a page boundary in the receive buffer, an exceedingly rare occurrence. Allocating a bounce buffer every time results in a minor performance regression that was introduced by the recent NFSv4 decoder overhaul. Let's restore the previous behavior. On average, it saves about 1.5 kmalloc() calls per COMPOUND. 
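The idea described above can be sketched in userspace C: copy a decoded item to a bounce buffer only when it was placed in the transient scratch area, and otherwise return the pointer into the receive buffer as-is. The names `demo_stream` and `demo_savemem` are hypothetical, and `malloc()` stands in for whatever allocator the real code uses; this is an illustration under those assumptions, not the nfsd code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a decode stream; only the scratch buffer matters here. */
struct demo_stream {
	unsigned char *scratch; /* transient buffer, overwritten by the next decode */
	size_t scratch_len;
};

/*
 * Return a pointer that stays valid after further decoding.
 * If @p points at the scratch buffer, copy the item to a freshly allocated
 * bounce buffer and set *allocated; otherwise @p already points into the
 * stable receive buffer and is returned unchanged (the common case).
 */
static void *demo_savemem(const struct demo_stream *s, void *p, size_t len,
			  bool *allocated)
{
	void *copy;

	assert(p != NULL);
	*allocated = false;
	if (p != s->scratch)
		return p;

	copy = malloc(len ? len : 1);
	if (!copy)
		return NULL;
	memcpy(copy, p, len);
	*allocated = true;
	return copy;
}
```

A caller would free the result only when `*allocated` is true, so the common path performs no allocation at all.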
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bca0057f686bff5999896e5780dfcbce4b9fbc45 Author: Chuck Lever Date: Fri Dec 18 12:28:23 2020 -0500 NFSD: Fix sparse warning in nfssvc.c [ Upstream commit d6c9e4368cc6a61bf25c9c72437ced509c854563 ] fs/nfsd/nfssvc.c:36:6: warning: symbol 'inter_copy_offload_enable' was not declared. Should it be static? The parameter was added by commit ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy"). Relocate it into the source file that uses it, and make it static. This approach is similar to the nfs4_disable_idmapping, cltrack_prog, and cltrack_legacy_disable module parameters. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 131676b8240fdded867167b3501b4e53897cc9f4 Author: Zheng Yongjun Date: Fri Dec 11 16:41:58 2020 +0800 fs/lockd: convert comma to semicolon [ Upstream commit 3316fb80a0b4c1fef03a3eb1a7f0651e2133c429 ] Replace a comma between expression statements by a semicolon. Signed-off-by: Zheng Yongjun Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 185e81a977d1ca0adc83aa9a1b8c6564772a791c Author: Waiman Long Date: Sun Nov 8 22:59:31 2020 -0500 inotify: Increase default inotify.max_user_watches limit to 1048576 [ Upstream commit 92890123749bafc317bbfacbe0a62ce08d78efb7 ] The default value of inotify.max_user_watches sysctl parameter was set to 8192 since the introduction of the inotify feature in 2005 by commit 0eeca28300df ("[PATCH] inotify"). Today this value is just too small for many modern usage. As a result, users have to explicitly set it to a larger value to make it work. After some searching around the web, these are the inotify.max_user_watches values used by some projects: - vscode: 524288 - dropbox support: 100000 - users on stackexchange: 12228 - lsyncd user: 2000000 - code42 support: 1048576 - monodevelop: 16384 - tectonic: 524288 - openshift origin: 65536 Each watch point adds an inotify_inode_mark structure to an inode to be watched. 
It also pins the watched inode. Modeled after the epoll.max_user_watches behavior to adjust the default value according to the amount of addressable memory available, make inotify.max_user_watches behave in a similar way to make it use no more than 1% of addressable memory within the range [8192, 1048576]. We estimate the amount of memory used by inotify mark to size of inotify_inode_mark plus two times the size of struct inode (we double the inode size to cover the additional filesystem private inode part). That means that a 64-bit system with 128GB or more memory will likely have the maximum value of 1048576 for inotify.max_user_watches. This default should be big enough for most use cases. Link: https://lore.kernel.org/r/20201109035931.4740-1-longman@redhat.com Reviewed-by: Amir Goldstein Signed-off-by: Waiman Long Signed-off-by: Jan Kara Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1aecdaa7e2c6619a7d2c0a81c8f5c06e52f870f3 Author: Eric W. Biederman Date: Fri Nov 20 17:14:39 2020 -0600 file: Replace ksys_close with close_fd [ Upstream commit 1572bfdf21d4d50e51941498ffe0b56c2289f783 ] Now that ksys_close is exactly identical to close_fd replace the one caller of ksys_close with close_fd. [1] https://lkml.kernel.org/r/20200818112020.GA17080@infradead.org Suggested-by: Christoph Hellwig Link: https://lkml.kernel.org/r/20201120231441.29911-22-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6d256a904cd700fb374f689697901686bf3e9bdd Author: Eric W. Biederman Date: Fri Nov 20 17:14:38 2020 -0600 file: Rename __close_fd to close_fd and remove the files parameter [ Upstream commit 8760c909f54a82aaa6e76da19afe798a0c77c3c3 ] The function __close_fd was added to support binder[1]. Now that binder has been fixed to no longer need __close_fd[2] all calls to __close_fd pass current->files. 
Therefore transform the files parameter into a local variable initialized to current->files, and rename __close_fd to close_fd to reflect this change, and keep it in sync with the similar changes to __alloc_fd, and __fd_install. This removes the need for callers to care about the extra care that needs to be taken if anything except current->files is passed, by limiting the callers to only operating on current->files. [1] 483ce1d4b8c3 ("take descriptor-related part of close() to file.c") [2] 44d8047f1d87 ("binder: use standard functions to allocate fds") Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-17-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-21-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7458c5ae465ef8f052e5bb5a7c4755e9c755d752 Author: Eric W. Biederman Date: Fri Nov 20 17:14:37 2020 -0600 file: Merge __alloc_fd into alloc_fd [ Upstream commit aa384d10f3d06d4b85597ff5df41551262220e16 ] The function __alloc_fd was added to support binder[1]. With binder fixed[2] there are no more users. As alloc_fd just calls __alloc_fd with "files=current->files", merge them together by transforming the files parameter into a local variable initialized to current->files. [1] dcfadfa4ec5a ("new helper: __alloc_fd()") [2] 44d8047f1d87 ("binder: use standard functions to allocate fds") Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-16-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-20-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9e8ef54ca890eda6f1993405684a68cc8dff06e7 Author: Eric W. Biederman Date: Fri Nov 20 17:14:36 2020 -0600 file: In f_dupfd read RLIMIT_NOFILE once. Simplify the code, and remove the chance of races by reading RLIMIT_NOFILE only once in f_dupfd.
Pass the read value of RLIMIT_NOFILE into alloc_fd which is the other location the rlimit was read in f_dupfd. As f_dupfd is the only caller of alloc_fd this changing alloc_fd is trivially safe. Further this causes alloc_fd to take all of the same arguments as __alloc_fd except for the files_struct argument. Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-15-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-19-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 89f9e529643ab5b1a98b98e30ac4177d0f7ba1cc Author: Eric W. Biederman Date: Fri Nov 20 17:14:35 2020 -0600 file: Merge __fd_install into fd_install [ Upstream commit d74ba04d919ebe30bf47406819c18c6b50003d92 ] The function __fd_install was added to support binder[1]. With binder fixed[2] there are no more users. As fd_install just calls __fd_install with "files=current->files", merge them together by transforming the files parameter into a local variable initialized to current->files. [1] f869e8a7f753 ("expose a low-level variant of fd_install() for binder") [2] 44d8047f1d87 ("binder: use standard functions to allocate fds") Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-14-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-18-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b4b827da9096162b3cd4e3d7cee6da087c47e6f0 Author: Eric W. 
Biederman Date: Fri Nov 20 17:14:34 2020 -0600 proc/fd: In fdinfo seq_show don't use get_files_struct [ Upstream commit 775e0656b27210ae668e33af00bece858f44576f ] When discussing[1] exec and posix file locks it was realized that none of the callers of get_files_struct fundamentally needed to call get_files_struct, and that by switching them to helper functions instead it will both simplify their code and remove unnecessary increments of files_struct.count. Those unnecessary increments can result in exec unnecessarily unsharing files_struct, which breaks posix locks, and it can result in fget_light having to fall back to fget, reducing system performance. Instead hold task_lock for the duration that task->files needs to be stable in seq_show. The task_lock was already taken in get_files_struct, and so skipping get_files_struct performs less work overall, and avoids the problems with the files_struct reference count. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-12-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-17-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c0e3f6df04ce00260f1a108d69f6928816d43681 Author: Eric W. Biederman Date: Fri Nov 20 17:14:32 2020 -0600 proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu [ Upstream commit 5b17b61870e2f4b0a4fdc5c6039fbdb4ffb796df ] When discussing[1] exec and posix file locks it was realized that none of the callers of get_files_struct fundamentally needed to call get_files_struct, and that by switching them to helper functions instead it will both simplify their code and remove unnecessary increments of files_struct.count.
Those unnecessary increments can result in exec unnecessarily unsharing files_struct, which breaks posix locks, and it can result in fget_light having to fall back to fget, reducing system performance. Using task_lookup_next_fd_rcu simplifies proc_readfd_common, by moving the checking for the maximum file descriptor into the generic code, and by removing the need for capturing and releasing a reference on files_struct. As task_lookup_fd_rcu may update the fd, ctx->pos has been changed to be the fd +2 after task_lookup_fd_rcu returns. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov Tested-by: Andy Lavr v1: https://lkml.kernel.org/r/20200817220425.9389-10-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-15-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a6da7536e488ae64726350fcd50d14f7309ffeaa Author: Eric W. Biederman Date: Fri Nov 20 17:14:31 2020 -0600 file: Implement task_lookup_next_fd_rcu [ Upstream commit e9a53aeb5e0a838f10fcea74235664e7ad5e6e1a ] As a companion to fget_task and task_lookup_fd_rcu implement task_lookup_next_fd_rcu that will return the struct file for the first file descriptor number that is equal to or greater than the fd argument value, or NULL if there is no such struct file. This allows file descriptors of foreign processes to be iterated through safely, without needing to increment the count on files_struct. Some concern[1] has been expressed that this function takes the task_lock for each iteration and thus for each file descriptor. The place where this function will be called in a commonly used code path is for listing /proc/<pid>/fd. I did some small benchmarks and did not see any measurable performance differences. For ordinary users ls is likely to stat each of the directory entries and tid_fd_mode called from tid_fd_revalidate has always taken the task lock for each file descriptor.
So this does not look like it will be a big change in practice. At some point is will probably be worth changing put_files_struct to free files_struct after an rcu grace period so that task_lock won't be needed at all. [1] https://lkml.kernel.org/r/20200817220425.9389-10-ebiederm@xmission.com v1: https://lkml.kernel.org/r/20200817220425.9389-9-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-14-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6007aeeaefb31b22b66e1f7c4dc5dc49c5ab6e98 Author: Eric W. Biederman Date: Fri Nov 20 17:14:30 2020 -0600 kcmp: In get_file_raw_ptr use task_lookup_fd_rcu [ Upstream commit ed77e80e14a3cd55c73848b9e8043020e717ce12 ] Modify get_file_raw_ptr to use task_lookup_fd_rcu. The helper task_lookup_fd_rcu does the work of taking the task lock and verifying that task->files != NULL and then calls files_lookup_fd_rcu. So let use the helper to make a simpler implementation of get_file_raw_ptr. Acked-by: Cyrill Gorcunov Link: https://lkml.kernel.org/r/20201120231441.29911-13-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c2291f7bdf25d655e80da628a2512f3ffb44db9b Author: Eric W. Biederman Date: Fri Nov 20 17:14:29 2020 -0600 proc/fd: In tid_fd_mode use task_lookup_fd_rcu [ Upstream commit 64eb661fda0269276b4c46965832938e3f268268 ] When discussing[1] exec and posix file locks it was realized that none of the callers of get_files_struct fundamentally needed to call get_files_struct, and that by switching them to helper functions instead it will both simplify their code and remove unnecessary increments of files_struct.count. Those unnecessary increments can result in exec unnecessarily unsharing files_struct which breaking posix locks, and it can result in fget_light having to fallback to fget reducing system performance. 
Instead of manually coding finding the files struct for a task and then calling files_lookup_fd_rcu, use the helper task_lookup_fd_rcu that combines those two steps, making the code simpler and removing the need to get a reference on a files_struct. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-7-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-12-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 32ac87287d0b50eb7d6933e58580fc78198f1f35 Author: Eric W. Biederman Date: Fri Nov 20 17:14:28 2020 -0600 file: Implement task_lookup_fd_rcu [ Upstream commit 3a879fb38082125cc0d8aa89b70c7f3a7cdf584b ] As a companion to lookup_fd_rcu implement task_lookup_fd_rcu for querying an arbitrary process about a specific file. Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200818103713.aw46m7vprsy4vlve@wittgenstein Link: https://lkml.kernel.org/r/20201120231441.29911-11-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c4716bb296504cbc64aeefb370df44e821214c44 Author: Eric W. Biederman Date: Fri Nov 20 17:14:27 2020 -0600 file: Rename fcheck lookup_fd_rcu [ Upstream commit 460b4f812a9d473d4b39d87d37844f9fc30a9eb3 ] Also remove the confusing comment about checking if a fd exists. I could not find one instance in the entire kernel that still matches the description or the reason for the name fcheck. The need for better names became apparent in the last round of discussion of this set of changes[1]. [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com Link: https://lkml.kernel.org/r/20201120231441.29911-10-ebiederm@xmission.com Signed-off-by: Eric W.
Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 23f55649921b78ddc3b24a443e755e546a574288 Author: Eric W. Biederman Date: Fri Nov 20 17:14:26 2020 -0600 file: Replace fcheck_files with files_lookup_fd_rcu [ Upstream commit f36c2943274199cb8aef32ac96531ffb7c4b43d0 ] This change renames fcheck_files to files_lookup_fd_rcu. All of the remaining callers take the rcu_read_lock before calling this function so the _rcu suffix is appropriate. This change also tightens up the debug check to verify that all callers hold the rcu_read_lock. All callers that used to call files_check with the files->file_lock held have now been changed to call files_lookup_fd_locked. This change of name has helped remind me of which locks and which guarantees are in place helping me to catch bugs later in the patchset. The need for better names became apparent in the last round of discussion of this set of changes[1]. [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com Link: https://lkml.kernel.org/r/20201120231441.29911-9-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9080557c56cd673941675f38805356f7f72949fa Author: Eric W. Biederman Date: Fri Nov 20 17:14:25 2020 -0600 file: Factor files_lookup_fd_locked out of fcheck_files [ Upstream commit 120ce2b0cd52abe73e8b16c23461eb14df5a87d8 ] To make it easy to tell where files->file_lock protection is being used when looking up a file create files_lookup_fd_locked. Only allow this function to be called with the file_lock held. Update the callers of fcheck and fcheck_files that are called with the files->file_lock held to call files_lookup_fd_locked instead. Hopefully this makes it easier to quickly understand what is going on. The need for better names became apparent in the last round of discussion of this set of changes[1]. 
[1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com Link: https://lkml.kernel.org/r/20201120231441.29911-8-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ddb21f9984209b2c502ed28698918975528721f5 Author: Eric W. Biederman Date: Thu Dec 10 12:39:54 2020 -0600 file: Rename __fcheck_files to files_lookup_fd_raw [ Upstream commit bebf684bf330915e6c96313ad7db89a5480fc9c2 ] The function fcheck, despite its comment, is poorly named as it has no callers that only check its return value. All of fcheck's callers use the returned file descriptor. The same is true for fcheck_files and __fcheck_files. A new less confusing name is needed. In addition the names of these functions are confusing as they do not report the kind of locks that are needed to be held when these functions are called, making it error prone to use them. To remedy this I am making the base function name lookup_fd and will add prefixes and suffixes to indicate the rest of the context. Name the function (previously called __fcheck_files) that proceeds from a struct files_struct, looks up the struct file of a file descriptor, and requires its callers to verify all of the appropriate locks are held files_lookup_fd_raw. The need for better names became apparent in the last round of discussion of this set of changes[1]. [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com Link: https://lkml.kernel.org/r/20201120231441.29911-7-ebiederm@xmission.com Signed-off-by: Eric W.
Biederman [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e6f42bc11a60571768c96f74a8bdf4fac5624801 Author: Chuck Lever Date: Thu Feb 29 18:19:36 2024 -0500 Revert "fget: clarify and improve __fget_files() implementation" Temporarily revert commit 0849f83e4782 ("fget: clarify and improve __fget_files() implementation") to enable subsequent upstream commits to apply and build cleanly. Stable-dep-of: bebf684bf330 ("file: Rename __fcheck_files to files_lookup_fd_raw") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4d037e1173b5d26e3f9382c6c917f8447bc37543 Author: Eric W. Biederman Date: Fri Nov 20 17:14:23 2020 -0600 proc/fd: In proc_fd_link use fget_task [ Upstream commit 439be32656035d3239fd56f9b83353ec06cb3b45 ] When discussing[1] exec and posix file locks it was realized that none of the callers of get_files_struct fundamentally needed to call get_files_struct, and that by switching them to helper functions instead it will both simplify their code and remove unnecessary increments of files_struct.count. Those unnecessary increments can result in exec unnecessarily unsharing files_struct which breaking posix locks, and it can result in fget_light having to fallback to fget reducing system performance. Simplifying proc_fd_link is a little bit tricky. It is necessary to know that there is a reference to fd_f ile while path_get is running. This reference can either be guaranteed to exist either by locking the fdtable as the code currently does or by taking a reference on the file in question. Use fget_task to remove the need for get_files_struct and to take a reference to file in question. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov v1: https://lkml.kernel.org/r/20200817220425.9389-8-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-6-ebiederm@xmission.com Signed-off-by: Eric W. 
Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c874ec02cb8ac5edeb1b01b1df41ed23bbbea562 Author: Eric W. Biederman Date: Fri Nov 20 17:14:22 2020 -0600 bpf: In bpf_task_fd_query use fget_task [ Upstream commit b48845af0152d790a54b8ab78cc2b7c07485fc98 ] Use the helper fget_task to simplify bpf_task_fd_query. As well as simplifying the code this removes one unnecessary increment of struct files_struct. This unnecessary increment of files_struct.count can result in exec unnecessarily unsharing files_struct and breaking posix locks, and it can result in fget_light having to fallback to fget reducing performance. This simplification comes from the observation that none of the callers of get_files_struct actually need to call get_files_struct that was made when discussing[1] exec and posix file locks. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov v1: https://lkml.kernel.org/r/20200817220425.9389-5-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-5-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit fe1722255ebd9b7fa3602ea1e86cf79f8844d43e Author: Eric W. Biederman Date: Fri Nov 20 17:14:21 2020 -0600 kcmp: In kcmp_epoll_target use fget_task [ Upstream commit f43c283a89a7dc531a47d4b1e001503cf3dc3234 ] Use the helper fget_task and simplify the code. As well as simplifying the code this removes one unnecessary increment of struct files_struct. This unnecessary increment of files_struct.count can result in exec unnecessarily unsharing files_struct and breaking posix locks, and it can result in fget_light having to fallback to fget reducing performance. Suggested-by: Oleg Nesterov Reviewed-by: Cyrill Gorcunov v1: https://lkml.kernel.org/r/20200817220425.9389-4-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-4-ebiederm@xmission.com Signed-off-by: Eric W. 
Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ba7aac19b4be916b68067bfe09b1b648dd0bdd8f Author: Eric W. Biederman Date: Fri Nov 20 17:14:20 2020 -0600 exec: Remove reset_files_struct [ Upstream commit 950db38ff2c01b7aabbd7ab4a50b7992750fa63d ] Now that exec no longer needs to restore the previous value of current->files on error there are no more callers of reset_files_struct so remove it. Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-3-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-3-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 44f79df28b47749be944e0bea86ccd7ad9772636 Author: Eric W. Biederman Date: Fri Nov 20 17:14:19 2020 -0600 exec: Simplify unshare_files [ Upstream commit 1f702603e7125a390b5cdf5ce00539781cfcc86a ] Now that exec no longer needs to return the unshared files to their previous value there is no reason to return displaced. Instead when unshare_fd creates a copy of the file table, call put_files_struct before returning from unshare_files. Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-2-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-2-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5091d051c51db83c15f3e0b493f1c0da30316a36 Author: Eric W. Biederman Date: Fri Nov 20 17:14:18 2020 -0600 exec: Move unshare_files to fix posix file locking during exec [ Upstream commit b6043501289ebf169ae19b810a882d517377302f ] Many moons ago the binfmts were doing some very questionable things with file descriptors and an unsharing of the file descriptor table was added to make things better[1][2]. The helper steal_lockss was added to avoid breaking the userspace programs[3][4][6]. 
Unfortunately it turned out that steal_locks did not work for network file systems[5], so it was removed to see if anyone would complain[7][8]. It was thought at the time that NPTL would not be affected as the unshare_files happened after the other threads were killed[8]. Unfortunately because there was an unshare_files in binfmt_elf.c before the threads were killed this analysis was incorrect. This unshare_files in binfmt_elf.c resulted in the unshares_files happening whenever threads were present. Which led to unshare_files being moved to the start of do_execve[9]. Later the problems were rediscovered and the suggested approach was to readd steal_locks under a different name[10]. I happened to be reviewing patches and I noticed that this approach was a step backwards[11]. I proposed simply moving unshare_files[12] and it was pointed out that moving unshare_files without auditing the code was also unsafe[13]. There were then several attempts to solve this[14][15][16] and I even posted this set of changes[17]. Unfortunately because auditing all of execve is time consuming this change did not make it in at the time. Well now that I am cleaning up exec I have made the time to read through all of the binfmts and the only playing with file descriptors is either the security modules closing them in security_bprm_committing_creds or is in the generic code in fs/exec.c. None of it happens before begin_new_exec is called. So move unshare_files into begin_new_exec, after the point of no return. If memory is very very very low and the application calling exec is sharing file descriptor tables between processes we might fail past the point of no return. Which is unfortunate but no different than any of the other places where we allocate memory after the point of no return. 
This movement allows another process that shares the file table, or another thread of the same process, that closes files or changes their close-on-exec behavior and races with execve, to cause some unexpected things to happen. There is only one time-of-check to time-of-use race and it is just there so that execve fails instead of an interpreter failing when it tries to open the file it is supposed to be interpreting. Failing later if userspace is being silly is not a problem. With this change the following description from the removal of steal_locks[8] finally becomes true. Apps using NPTL are not affected, since all other threads are killed before execve. Apps using LinuxThreads are only affected if they - have multiple threads during exec (LinuxThreads doesn't kill other threads, the app may do it with pthread_kill_other_threads_np()) - rely on POSIX locks being inherited across exec Both conditions are documented, but not their interaction. Apps using clone() natively are affected if they - use clone(CLONE_FILES) - rely on POSIX locks being inherited across exec I have investigated some paths to make it possible to solve this without moving unshare_files but they all look more complicated[18]. Reported-by: Daniel P.
Berrangé Reported-by: Jeff Layton History-tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git [1] 02cda956de0b ("[PATCH] unshare_files" [2] 04e9bcb4d106 ("[PATCH] use new unshare_files helper") [3] 088f5d7244de ("[PATCH] add steal_locks helper") [4] 02c541ec8ffa ("[PATCH] use new steal_locks helper") [5] https://lkml.kernel.org/r/E1FLIlF-0007zR-00@dorka.pomaz.szeredi.hu [6] https://lkml.kernel.org/r/0060321191605.GB15997@sorel.sous-sol.org [7] https://lkml.kernel.org/r/E1FLwjC-0000kJ-00@dorka.pomaz.szeredi.hu [8] c89681ed7d0e ("[PATCH] remove steal_locks()") [9] fd8328be874f ("[PATCH] sanitize handling of shared descriptor tables in failing execve()") [10] https://lkml.kernel.org/r/20180317142520.30520-1-jlayton@kernel.org [11] https://lkml.kernel.org/r/87r2nwqk73.fsf@xmission.com [12] https://lkml.kernel.org/r/87bmfgvg8w.fsf@xmission.com [13] https://lkml.kernel.org/r/20180322111424.GE30522@ZenIV.linux.org.uk [14] https://lkml.kernel.org/r/20180827174722.3723-1-jlayton@kernel.org [15] https://lkml.kernel.org/r/20180830172423.21964-1-jlayton@kernel.org [16] https://lkml.kernel.org/r/20180914105310.6454-1-jlayton@kernel.org [17] https://lkml.kernel.org/r/87a7ohs5ow.fsf@xmission.com [18] https://lkml.kernel.org/r/87pn8c1uj6.fsf_-_@x220.int.ebiederm.org Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-1-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-1-ebiederm@xmission.com Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 527c9b6eb18dc49da73468cb2808c201f6ef64a1 Author: Eric W. Biederman Date: Wed Dec 9 15:42:57 2020 -0600 exec: Don't open code get_close_on_exec [ Upstream commit 878f12dbb8f514799d126544d59be4d2675caac3 ] Al Viro pointed out that using the phrase "close_on_exec(fd, rcu_dereference_raw(current->files->fdt))" instead of wrapping it in rcu_read_lock(), rcu_read_unlock() is a very questionable optimization[1]. 
Once wrapped with rcu_read_lock()/rcu_read_unlock() that phrase becomes equivalent to the helper function get_close_on_exec, so simplify the code and make it more robust by simply using get_close_on_exec. [1] https://lkml.kernel.org/r/20201207222214.GA4115853@ZenIV.linux.org.uk Suggested-by: Al Viro Link: https://lkml.kernel.org/r/87k0tqr6zi.fsf_-_@x220.int.ebiederm.org Signed-off-by: Eric W. Biederman Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8f1df3d0c1469d4b4020e078efb724e6e535cd4c Author: Trond Myklebust Date: Mon Nov 30 23:14:27 2020 -0500 nfsd: Record NFSv4 pre/post-op attributes as non-atomic [ Upstream commit 716a8bc7f706eeef80ab42c99d9f210eda845c81 ] For the case of NFSv4, specify to the client that the pre/post-op attributes were not recorded atomically with the main operation. Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0750e494c75e48eaf49460c330585e0e1e01b47e Author: Trond Myklebust Date: Mon Nov 30 17:03:19 2020 -0500 nfsd: Set PF_LOCAL_THROTTLE on local filesystems only [ Upstream commit 01cbf3853959feec40ec9b9a399e12a021cd4d81 ] Don't set PF_LOCAL_THROTTLE on remote filesystems like NFS, since they aren't expected to ever be subject to double buffering. Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f3056a0ac2c5ced2a9d9f9f7d639094fc3df922a Author: Trond Myklebust Date: Mon Nov 30 17:03:18 2020 -0500 nfsd: Fix up nfsd to ensure that timeout errors don't result in ESTALE [ Upstream commit 2e19d10c1438241de32467637a2a411971547991 ] If the underlying filesystem times out, then we want knfsd to return NFSERR_JUKEBOX/DELAY rather than NFSERR_STALE.
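As a rough illustration of the error mapping described above, the user-space sketch below keeps a timeout distinct from other lookup failures so it can be reported as a retryable status, and collapses everything else to a stale-handle status. The enum, the function name map_lookup_error, and the decision to treat ENOMEM like a timeout are assumptions made for illustration; this is not the nfsd code. The numeric values are the NFSv3 protocol codes for NFS3ERR_STALE and NFS3ERR_JUKEBOX.

```c
#include <errno.h>

/* Illustrative status codes; the values follow the NFSv3 protocol. */
enum sketch_nfs_status {
	SKETCH_NFS_OK = 0,
	SKETCH_NFSERR_STALE = 70,
	SKETCH_NFSERR_JUKEBOX = 10008,	/* NFS4ERR_DELAY shares this value */
};

/*
 * Hypothetical helper: map a negative errno from a filehandle lookup
 * to a wire status. Transient failures ask the client to retry later;
 * any other failure is reported as a stale handle.
 */
static enum sketch_nfs_status map_lookup_error(int err)
{
	switch (err) {
	case 0:
		return SKETCH_NFS_OK;
	case -ETIMEDOUT:
	case -ENOMEM:		/* assumption: treated as transient too */
		return SKETCH_NFSERR_JUKEBOX;
	default:
		return SKETCH_NFSERR_STALE;
	}
}
```

With this shape, a caller that sees map_lookup_error(-ETIMEDOUT) gets the retryable status, so a client would retry instead of discarding its filehandle.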
Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 38e213c1e41eabc1f5aba5d7e5bc237b62f6658c Author: Trond Myklebust Date: Mon Nov 30 17:03:17 2020 -0500 exportfs: Add a function to return the raw output from fh_to_dentry() [ Upstream commit d045465fc6cbfa4acfb5a7d817a7c1a57a078109 ] In order to allow nfsd to accept return values that are not acceptable to overlayfs and others, add a new function. Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 93f7d515d873e37be0b9e4712e9171a3443f2d6a Author: Jeff Layton Date: Mon Nov 30 17:03:16 2020 -0500 nfsd: close cached files prior to a REMOVE or RENAME that would replace target [ Upstream commit 7f84b488f9add1d5cca3e6197c95914c7bd3c1cf ] It's not uncommon for some workloads to do a bunch of I/O to a file and delete it just afterward. If knfsd has a cached open file however, then the file may still be open when the dentry is unlinked. If the underlying filesystem is nfs, then that could trigger it to do a sillyrename. On a REMOVE or RENAME scan the nfsd_file cache for open files that correspond to the inode, and proactively unhash and put their references. This should prevent any delete-on-last-close activity from occurring, solely due to knfsd's open file cache. This must be done synchronously though so we use the variants that call flush_delayed_fput. There are deadlock possibilities if you call flush_delayed_fput while holding locks, however. In the case of nfsd_rename, we don't even do the lookups of the dentries to be renamed until we've locked for rename. Once we've figured out what the target dentry is for a rename, check to see whether there are cached open files associated with it. If there are, then unwind all of the locking, close them all, and then reattempt the rename. None of this is really necessary for "typical" filesystems though. 
It's mostly of use for NFS, so declare a new export op flag and use that to determine whether to close the files beforehand. Signed-off-by: Jeff Layton Signed-off-by: Lance Shelton Signed-off-by: Trond Myklebust [ cel: adjusted to apply to 5.10.y ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 203ca3253b34a1786c885491f0643d4d95a1d690 Author: Jeff Layton Date: Mon Nov 30 17:03:15 2020 -0500 nfsd: allow filesystems to opt out of subtree checking [ Upstream commit ba5e8187c55555519ae0b63c0fb681391bc42af9 ] When we start allowing NFS to be reexported, then we have some problems when it comes to subtree checking. In principle, we could allow it, but it would mean encoding parent info in the filehandles and there may not be enough space for that in a NFSv3 filehandle. To enforce this at export upcall time, we add a new export_ops flag that declares the filesystem ineligible for subtree checking. Signed-off-by: Jeff Layton Signed-off-by: Lance Shelton Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d5314c9bb7f521714ff0e0bcd24ca2eaf00f622b Author: Jeff Layton Date: Mon Nov 30 17:03:14 2020 -0500 nfsd: add a new EXPORT_OP_NOWCC flag to struct export_operations [ Upstream commit daab110e47f8d7aa6da66923e3ac1a8dbd2b2a72 ] With NFSv3 nfsd will always attempt to send along WCC data to the client. This generally involves saving off the in-core inode information prior to doing the operation on the given filehandle, and then issuing a vfs_getattr to it after the op. Some filesystems (particularly clustered or networked ones) have an expensive ->getattr inode operation. Atomicity is also often difficult or impossible to guarantee on such filesystems. For those, we're best off not trying to provide WCC information to the client at all, and to simply allow it to poll for that information as needed with a GETATTR RPC. 
This patch adds a new flags field to struct export_operations, and defines a new EXPORT_OP_NOWCC flag that filesystems can use to indicate that nfsd should not attempt to provide WCC info in NFSv3 replies. It also adds a blurb about the new flags field and flag to the exporting documentation. The server will also now skip collecting this information for NFSv2 as well, since that info is never used there anyway. Note that this patch does not add this flag to any filesystem export_operations structures. This was originally developed to allow reexporting nfs via nfsd. Other filesystems may want to consider enabling this flag too. It's hard to tell however which ones have export operations to enable export via knfsd and which ones mostly rely on them for open-by-filehandle support, so I'm leaving that up to the individual maintainers to decide. I am cc'ing the relevant lists for those filesystems that I think may want to consider adding this though. Cc: HPDD-discuss@lists.01.org Cc: ceph-devel@vger.kernel.org Cc: cluster-devel@redhat.com Cc: fuse-devel@lists.sourceforge.net Cc: ocfs2-devel@oss.oracle.com Signed-off-by: Jeff Layton Signed-off-by: Lance Shelton Signed-off-by: Trond Myklebust Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 34de27ed8447d9cffe0d6bedceb1238c50e8c78b Author: J. Bruce Fields Date: Mon Nov 30 17:46:18 2020 -0500 Revert "nfsd4: support change_attr_type attribute" This reverts commit a85857633b04d57f4524cca0a2bfaf87b2543f9f. We're still factoring ctime into our change attribute even in the IS_I_VERSION case. If someone sets the system time backwards, a client could see the change attribute go backwards. Maybe we can just say "well, don't do that", but there's some question whether that's good enough, or whether we need a better guarantee. Also, the client still isn't actually using the attribute. While we're still figuring this out, let's just stop returning this attribute. Signed-off-by: J. 
Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b720ceec88a75fc6ea171519c3d30854c57d0e41 Author: J. Bruce Fields Date: Mon Nov 30 17:46:17 2020 -0500 nfsd4: don't query change attribute in v2/v3 case [ Upstream commit 942b20dc245590327ee0187c15c78174cd96dd52 ] inode_query_iversion() has side effects, and there's no point calling it when we're not even going to use it. We check whether we're currently processing a v4 request by checking fh_maxsize, which is arguably a little hacky; we could add a flag to svc_fh instead. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 796785a79b4aefecab31bada4e7bc46e25fd91aa Author: J. Bruce Fields Date: Mon Nov 30 17:46:16 2020 -0500 nfsd: minor nfsd4_change_attribute cleanup [ Upstream commit 4b03d99794eeed27650597a886247c6427ce1055 ] Minor cleanup, no change in behavior. Also pull out a common helper that'll be useful elsewhere. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 88dea0f92b20593a5bd48c1256c862794fc99f97 Author: J. Bruce Fields Date: Mon Nov 30 17:46:15 2020 -0500 nfsd: simplify nfsd4_change_info [ Upstream commit b2140338d8dca827ad9e83f3e026e9d51748b265 ] It doesn't make sense to carry all these extra fields around. Just make everything into change attribute from the start. This is just cleanup, there should be no change in behavior. Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f8032b859df63e8fee31d74b3f738ed8a862e107 Author: J. Bruce Fields Date: Mon Nov 30 17:46:14 2020 -0500 nfsd: only call inode_query_iversion in the I_VERSION case [ Upstream commit 70b87f77294d16d3e567056ba4c9ee2b091a5b50 ] inode_query_iversion() can modify i_version. Depending on the exported filesystem, that may not be safe. For example, if you're re-exporting NFS, NFS stores the server's change attribute in i_version and does not expect it to be modified locally. 
This has been observed causing unnecessary cache invalidations. The way a filesystem indicates that it's OK to call inode_query_iversion() is by setting SB_I_VERSION. So, move the I_VERSION check out of encode_change(), where it's used only in GETATTR responses, to nfsd4_change_attribute(), which is also called for pre- and post- operation attributes. (Note we could also pull the NFSEXP_V4ROOT case into nfsd4_change_attribute() as well. That would actually be a no-op, since pre/post attrs are only used for metadata-modifying operations, and V4ROOT exports are read-only. But we might make the change in the future just for simplicity.) Reported-by: Daire Byrne Signed-off-by: J. Bruce Fields Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3aea16e6b70b04db380a11e9aac53ade07a23788 Author: Chuck Lever Date: Wed Nov 4 11:12:18 2020 -0500 NFSD: Remove macros that are no longer used [ Upstream commit 5cfc822f3e77b0477e6602d399116130317f537a ] Now that all the NFSv4 decoder functions have been converted to make direct calls to the xdr helpers, remove the unused C macros. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b24e6a40eebae079882e2d4e776a3b5950359cf7 Author: Chuck Lever Date: Wed Nov 4 11:07:06 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_compound() [ Upstream commit d9b74bdac6f24afc3101b6a5b6f59842610c9c94 ] And clean-up: Now that we have removed the DECODE_TAIL macro from nfsd4_decode_compound(), we observe that there's no benefit for nfsd4_decode_compound() to return nfs_ok or nfserr_bad_xdr only to have its sole caller convert those values to one or zero, respectively. Have nfsd4_decode_compound() return 1/0 instead.
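The return-convention change described above can be pictured with a small user-space sketch: the decoder reports success as 1 and failure as 0 directly, so its caller needs no status conversion. All names here (sketch_stream, sketch_decode_u32, sketch_decode_compound_hdr) and the two-field header layout are hypothetical and are not the nfsd implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over an encoded request buffer. */
struct sketch_stream {
	const uint8_t *p;	/* next unread byte */
	size_t remaining;	/* bytes left in the buffer */
};

/* Decode one big-endian 32-bit word; returns 1 on success, 0 if short. */
static int sketch_decode_u32(struct sketch_stream *s, uint32_t *out)
{
	if (s->remaining < 4)
		return 0;
	*out = ((uint32_t)s->p[0] << 24) | ((uint32_t)s->p[1] << 16) |
	       ((uint32_t)s->p[2] << 8) | (uint32_t)s->p[3];
	s->p += 4;
	s->remaining -= 4;
	return 1;
}

/*
 * Decode an illustrative two-word header. The function returns 1/0
 * itself rather than a protocol status that the caller would then
 * have to translate into 1/0.
 */
static int sketch_decode_compound_hdr(struct sketch_stream *s,
				      uint32_t *minorversion, uint32_t *opcnt)
{
	if (!sketch_decode_u32(s, minorversion))
		return 0;
	if (!sketch_decode_u32(s, opcnt))
		return 0;
	return 1;
}
```

A caller can then use the result directly in a condition, for example `if (!sketch_decode_compound_hdr(&s, &minor, &ops))`, without mapping status codes first.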
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6b48808835a2b92caeb90d8840b8dc068488b61c Author: Chuck Lever Date: Sun Nov 22 12:49:52 2020 -0500 NFSD: Make nfsd4_ops::opnum a u32 [ Upstream commit 3a237b4af5b7b0e77588e120554077cab3341943 ] Avoid passing a "pointer to int" argument to xdr_stream_decode_u32. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c2d0c16990b97a598a0f9740baefe357f2230d73 Author: Chuck Lever Date: Wed Nov 4 11:04:02 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_listxattrs() [ Upstream commit 2212036cadf4da3c4b0e4bd2a9a8c3d78617ab4f ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8e1b8a78a9291533a3658872c7988e7647a62a1b Author: Chuck Lever Date: Wed Nov 4 10:59:57 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_setxattr() [ Upstream commit 403366a7e8e2930002157525cd44add7fa01bca9 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9bc67df0f9a2819c77e1d7e0c0aed76a0d46974d Author: Chuck Lever Date: Wed Nov 4 10:56:52 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_xattr_name() [ Upstream commit 830c71502ae0ae1677ac6c08ffbcf85a6e7b2937 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b719fc9375ccc54613818e220262625e08fab813 Author: Chuck Lever Date: Wed Nov 4 10:46:46 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_clone() [ Upstream commit 3dfd0b0e15671e2b4047ccb9222432f0b2d930be ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a2f6c16ad1383130300162ee548aeeac1f8da461 Author: Chuck Lever Date: Wed Nov 4 10:54:47 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_seek() [ Upstream commit 9d32b412fe0a6186cc57789d218e8f8299454ae2 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f8eb5424e3184908f9cd61fdb7fe48de1b0cef52 Author: Chuck Lever Date: Sat Nov 21 14:21:25 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_offload_status() [ Upstream commit 2846bb0525a73e00b3566fda535ea6a5879e2971 ] Signed-off-by: Chuck Lever 
Signed-off-by: Sasha Levin commit c2d2a919b2f249c48eaad76f7c7a8bf3f9f46479 Author: Chuck Lever Date: Sat Nov 21 14:19:24 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_copy_notify() [ Upstream commit f9a953fb369bbd2135ccead3393ec1ef66544471 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8604d294c1286267e5f1355c002a223723ffafea Author: Chuck Lever Date: Wed Nov 4 10:49:37 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_copy() [ Upstream commit e8febea7190bcbd1e608093acb67f2a5009556aa ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit dc1a31ca8e96f4c0362c8d7da10c98f757061289 Author: Chuck Lever Date: Mon Nov 16 18:05:06 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_nl4_server() [ Upstream commit f49e4b4d58cc835d8bd0cc9663f7b9c5497e0e7e ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a0b8dabc5906e92ab24221ceffb2653569532c0d Author: Chuck Lever Date: Wed Nov 4 10:44:05 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_fallocate() [ Upstream commit 6aef27aaeae7611f98af08205acc79f5a8f3aa59 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit de0dc37a791e92f945cf62c6dafd19aa830192bb Author: Chuck Lever Date: Tue Nov 3 15:02:11 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_reclaim_complete() [ Upstream commit 0d6467844d437e07db1e76d96176b1a55401018c ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 093f9d2c8f4cd4b42d3725a4e69af72762f54b05 Author: Chuck Lever Date: Wed Nov 4 15:15:09 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_destroy_clientid() [ Upstream commit c95f2ec3490586cbb33badc8f4c82d6aa4955078 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7675420fdebea540b5b7672733c7630876fef382 Author: Chuck Lever Date: Tue Nov 3 14:57:44 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_test_stateid() [ Upstream commit b7a0c8f6e741bf9dee0d24e69d3ce51fa4ccce78 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 
f0de0b6895490b15121b12dcadc92310139c75f8 Author: Chuck Lever Date: Tue Nov 3 14:55:19 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_sequence() [ Upstream commit cf907b11326d9360877d6c6ea8f75e1b29f39f2f ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 1ea743dc481f14c0b08bafff83596f4ad6127a8c Author: Chuck Lever Date: Tue Nov 3 14:33:12 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_secinfo_no_name() [ Upstream commit 53d70873e37c09a582167ed73d1858e3a2af0157 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b63e313dce04349876c577bc40fd705bbaf0b523 Author: Chuck Lever Date: Wed Nov 4 10:42:25 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_layoutreturn() [ Upstream commit 645fcad371420913c30e9aca80fc0a38f3acf432 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 40e627c502da12cd5b670a92606bfbb2693602e0 Author: Chuck Lever Date: Tue Nov 3 15:06:04 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_layoutget() [ Upstream commit c8e88e3aa73889421461f878cd569ef84f231ceb ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 40770a0f8ef6fcb5c3baf91e9e8fbd11a9b1390b Author: Chuck Lever Date: Wed Nov 4 10:40:07 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_layoutcommit() [ Upstream commit 5185980d8a23001c2317c290129ab7ab20067e20 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c0a4c4e46b8ab28ec7b33fa82e9d775d6e1e8f80 Author: Chuck Lever Date: Tue Nov 3 15:03:50 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_getdeviceinfo() [ Upstream commit 044959715f370b24870c95df3940add8710c5a29 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5f892c11787e12d1fcf13ceb45d66059cd056724 Author: Chuck Lever Date: Sun Nov 1 13:38:27 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_free_stateid() [ Upstream commit aec387d5909304810d899f7d90ae57df33f3a75c ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 92ae309a9908b63976be92ec62fe3051abbb39d4 Author: Chuck Lever 
Date: Wed Nov 4 13:50:55 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_destroy_session() [ Upstream commit 94e254af1f873b4b551db4c4549294f2c4d385ef ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 73684a8118f35eead4ceae01feace36a41a34a11 Author: Chuck Lever Date: Tue Nov 3 14:52:44 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_create_session() [ Upstream commit 81243e3fe37ed547fc4ed8aab1cec2865540bb18 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2bd9ef494a2c5fa0a5ec6fae657c844fcc7885c1 Author: Chuck Lever Date: Mon Nov 16 15:35:05 2020 -0500 NFSD: Add a helper to decode channel_attrs4 [ Upstream commit 3a3f1fbacb0960b628e5a9f07c78287312f7a99d ] De-duplicate some code. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d01f41320d2ae7bdbb24fa935dfe5f6887abfa17 Author: Chuck Lever Date: Mon Nov 16 15:21:55 2020 -0500 NFSD: Add a helper to decode nfs_impl_id4 [ Upstream commit 10ff84228197f47401833495ba19a50131323b4a ] Refactor for clarity. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d50a76f1f3fcd3407d135eeaefdf7e572abbeb0c Author: Chuck Lever Date: Mon Nov 2 15:19:12 2020 -0500 NFSD: Add a helper to decode state_protect4_a [ Upstream commit 523ec6ed6fb80fd1537d748a06bffd060a8b3235 ] Refactor for clarity. Also, remove a stale comment. Commit ed94164398c9 ("nfsd: implement machine credential support for some operations") added support for SP4_MACH_CRED, so state_protect_a is no longer completely ignored. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0c935af3cfb73065308c62a03118726c92567d05 Author: Chuck Lever Date: Tue Nov 3 11:17:50 2020 -0500 NFSD: Add a separate decoder for ssv_sp_parms [ Upstream commit 547bfeb4cd8d491aabbd656d5a6f410cb4249b4e ] Refactor for clarity. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit cb568dbdef68e036ff271f9aaecb29824d4133f8 Author: Chuck Lever Date: Tue Nov 3 11:13:00 2020 -0500 NFSD: Add a separate decoder to handle state_protect_ops [ Upstream commit 2548aa784d760567c2a77cbd8b7c55b211167c37 ] Refactor for clarity and de-duplication of code. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b73633804246be23ef596abab24598e86995ccfd Author: Chuck Lever Date: Tue Nov 3 13:16:23 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_bind_conn_to_session() [ Upstream commit 571e0451c4de0a545960ffaea16d969931afc563 ] A dedicated sessionid4 decoder is introduced that will be used by other operation decoders in subsequent patches. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7d2108407466c644ad43836d44179b4f81a6b840 Author: Chuck Lever Date: Tue Nov 3 13:14:35 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_backchannel_ctl() [ Upstream commit 0f81d96098f8eb707afe2f8d5c3fe0f9316ef5ce ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5658ca0651e6fbcf5b91dd649e248502fb8b6b35 Author: Chuck Lever Date: Tue Nov 3 13:09:34 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_cb_sec() [ Upstream commit 1a99440807bfc66597aaa2e0f0213c319b023e34 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 79f1a8323a3432ca600bdc973cde4242ffde9e9b Author: Chuck Lever Date: Wed Nov 4 13:42:25 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_release_lockowner() [ Upstream commit a4a80c15ca4dd998ab5cbe87bd856c626a318a80 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit eeab2f3bf284ac8e9da8c8807ffdf2a8a4276a02 Author: Chuck Lever Date: Tue Nov 3 14:44:28 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_write() [ Upstream commit 244e2befcba80f42c65293b6c56282bb78f9f417 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit b1af8f131eb8d84f50f8778c538ae1937bc51fe8 Author: Chuck Lever Date: Tue Nov 3 14:40:32 2020 -0500 NFSD: Replace 
READ* macros in nfsd4_decode_verify() [ Upstream commit 67cd453eeda86be90f83a0f4798f33832cf2d98c ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 19a4c05e816776c8176cf4ac10870c4c303eed43 Author: Chuck Lever Date: Wed Nov 4 15:12:33 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_setclientid_confirm() [ Upstream commit d1ca55149d67e5896f89a30053f5d83c002ac10e ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2503dcf0f68a40078eb6b231037988929eefd8b9 Author: Chuck Lever Date: Tue Nov 3 14:35:02 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_setclientid() [ Upstream commit 92fa6c08c251d52d0d7b46066ecf87b96a0c4b8f ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7c06ba5c8bf4e0edbb113b0fa97af31ac7cdc662 Author: Chuck Lever Date: Sat Nov 21 14:14:59 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_setattr() [ Upstream commit 44592fe9479d8d4b88594365ab825f7b07afdf7c ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 5277d6034642fa9621ff67242fef30c86a28fc52 Author: Chuck Lever Date: Wed Nov 4 15:09:42 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_secinfo() [ Upstream commit d0abdae5191a916d767164f6fc6c0e2e814a20a7 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2cef1009f8e730e9dd5824d2ba3b5f984554c922 Author: Chuck Lever Date: Wed Nov 4 15:08:50 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_renew() [ Upstream commit d12f90458dc8c11734ba44ec88f109bf8de86ff0 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e2b287a53ccaf284d1b010f87e64682bd82771bc Author: Chuck Lever Date: Wed Nov 4 15:05:58 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_rename() [ Upstream commit ba881a0a5342b3aaf83958901ebe3fe752eaab46 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 274b8f0597cf56043bac0c7e1188c38fbeddbeb4 Author: Chuck Lever Date: Wed Nov 4 15:04:36 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_remove() [ Upstream commit 
b7f5fbf219aecda98e32de305551e445f9438899 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c24e2a4943abdddae26c5e4760ef871ad30223a4 Author: Chuck Lever Date: Tue Nov 3 14:30:59 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_readdir() [ Upstream commit 0dfaf2a371436860ace6af889e6cd8410ee63164 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d0a0219a35fced96522aac4a2bb1f142e55c90e2 Author: Chuck Lever Date: Tue Nov 3 14:28:24 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_read() [ Upstream commit 3909c3bc604688503e31ddceb429dc156c4720c1 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4b28cd7e832213a82667eb604100b4eeda6fea6a Author: Chuck Lever Date: Tue Nov 3 14:23:02 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_putfh() [ Upstream commit a73bed98413b1d9eb4466f776a56d2fde8b3b2c9 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ad1ea32c9732e40d0eabbd0c7a25e16a74765d83 Author: Chuck Lever Date: Tue Nov 3 14:21:01 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_open_downgrade() [ Upstream commit dca71651f097ea608945d7a66bf62761a630de9a ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e557a2eabb350c03a8f3e546530d9760161d2648 Author: Chuck Lever Date: Tue Nov 3 14:18:57 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_open_confirm() [ Upstream commit 06bee693a1f1cb774b91000f05a6e183c257d8e9 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f6eb911d790bff3ea2f5324c3128e3fa080b0d60 Author: Chuck Lever Date: Sun Nov 1 12:04:06 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_open() [ Upstream commit 61e5e0b3ec713d1365008c8af3fe5fdd262e2a60 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 4507c23e4204aadc7d968273ac4c47171b3fd55b Author: Chuck Lever Date: Mon Nov 16 17:45:04 2020 -0500 NFSD: Add helper to decode OPEN's open_claim4 argument [ Upstream commit 1708e50b0145f393acbec9e319bdf0e33f765d25 ] Refactor for clarity. 
Note that op_fname is the only instance of an NFSv4 filename stored in a struct xdr_netobj. Convert it to a u32/char * pair so that the new nfsd4_decode_filename() helper can be used. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 48385b58bcf637a5e20d1c062d08471fa481a387 Author: Chuck Lever Date: Mon Nov 16 17:56:17 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_share_deny() [ Upstream commit b07bebd9eb9842e2d0dea87efeb92884556e55b0 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit fa60cc6971fbd58bb00f38072ef21f1931ad7d85 Author: Chuck Lever Date: Mon Nov 16 17:54:48 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_share_access() [ Upstream commit 9aa62f5199749b274454b6d7d914c9b2a5e77031 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 070df4a4e9868485245e33823d7a1a366382008f Author: Chuck Lever Date: Mon Nov 16 17:41:21 2020 -0500 NFSD: Add helper to decode OPEN's openflag4 argument [ Upstream commit e6ec04b27bfb4869c0e35fbcf24333d379f101d5 ] Refactor for clarity. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c1ea8812d4210fd77ff8490b3eb75e7be688ec08 Author: Chuck Lever Date: Mon Nov 16 17:37:42 2020 -0500 NFSD: Add helper to decode OPEN's createhow4 argument [ Upstream commit bf33bab3c4182cdd795983f14de5606e82fab377 ] Refactor for clarity. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 11ea3e65f070e389234ed98cdd2b670cb206faa6 Author: Chuck Lever Date: Mon Nov 16 17:34:01 2020 -0500 NFSD: Add helper to decode NFSv4 verifiers [ Upstream commit 796dd1c6b680959ac968b52aa507911b288b1749 ] This helper will be used to simplify decoders in subsequent patches. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit def95074db3c1e955f24b7d3b3001b1e2c4cea18 Author: Chuck Lever Date: Wed Nov 4 15:02:40 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_lookup() [ Upstream commit 3d5877e8e03f60d7cc804d7b230ff9c00c9c07bd ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 340878b2e0a5f82238e75aad678309e90381b9d8 Author: Chuck Lever Date: Tue Nov 3 13:33:28 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_locku() [ Upstream commit ca9cf9fc27f8f722e9eb2763173ba01f6ac3dad1 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3625de1522fa14d85546139f0e77f1622785c7e6 Author: Chuck Lever Date: Tue Nov 3 13:31:44 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_lockt() [ Upstream commit 0a146f04aa0fa7a57aaed3913d1c2732b3853f31 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8357985d2185e00fad7227f3834e0cd1985636cf Author: Chuck Lever Date: Tue Nov 3 13:29:27 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_lock() [ Upstream commit 7c59deed5cd2e1cfc6cbecf06f4584ac53755f53 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit d27f2dcedae2ab3afe6a82f0f449abbe081c4d01 Author: Chuck Lever Date: Mon Nov 16 17:16:52 2020 -0500 NFSD: Add helper for decoding locker4 [ Upstream commit 8918cc0d2b72db9997390626010b182c4500d749 ] Refactor for clarity. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 0c281b7083f23ad0c5612f9dc6110388dadadf0e Author: Chuck Lever Date: Mon Nov 16 17:25:02 2020 -0500 NFSD: Add helpers to decode a clientid4 and an NFSv4 state owner [ Upstream commit 144e82694092ff80b5e64749d6822cd8947587f2 ] These helpers will also be used to simplify decoders in subsequent patches. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 753bb6b0e78854756a55f88b9fe6e0a6a811e5a8 Author: Chuck Lever Date: Wed Nov 4 11:41:55 2020 -0500 NFSD: Relocate nfsd4_decode_opaque() [ Upstream commit 5dcbfabb676b2b6d97767209cf707eb463ca232a ] Enable nfsd4_decode_opaque() to be used in more decoders, and replace the READ* macros in nfsd4_decode_opaque(). Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 84bc365eee7f55c32b07c0f7b904aea2da5cc4e3 Author: Chuck Lever Date: Wed Nov 4 15:01:24 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_link() [ Upstream commit 5c505d128691c70991b766dd6a3faf49fa59ecfb ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 42c4437d78e62ddd6a7e4394cd75274c40a074a3 Author: Chuck Lever Date: Thu Nov 19 14:40:20 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_getattr() [ Upstream commit f759eff260f1f0b0f56531517762f27ee3233506 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 42e319695efc041e6f085a2d89ea10ab5c2e7d7b Author: Chuck Lever Date: Sat Nov 21 14:11:58 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_delegreturn() [ Upstream commit 95e6482cedfc0785b85db49b72a05323bbf41750 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3012fe5fea5512f680ed1f68ef5810506092b74f Author: Chuck Lever Date: Tue Nov 3 13:24:10 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_create() [ Upstream commit 000dfa18b3df9c62df5f768f9187cf1a94ded71d ] A dedicated decoder for component4 is introduced here, which will be used by other operation decoders in subsequent patches. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 57516a96cae8c0b5ee53e25d141a4f96cbb1000e Author: Chuck Lever Date: Tue Nov 3 12:56:05 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_fattr() [ Upstream commit d1c263a031e876ac3ca5223c728e4d98ed50b3c0 ] Let's be more careful to avoid overrunning the memory that backs the bitmap array. This requires updating the synopsis of nfsd4_decode_fattr(). 
Bruce points out that a server needs to be careful to return nfs_ok when a client presents bitmap bits the server doesn't support. This includes bits in bitmap words the server might not yet support. The current READ* based implementation is good about that, but that requirement hasn't been documented. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9737a9a8f923ec7621ea6f9d180e176666b1f435 Author: Chuck Lever Date: Thu Nov 19 14:07:43 2020 -0500 NFSD: Replace READ* macros that decode the fattr4 umask attribute [ Upstream commit 66f0476c704c86d44aa9da19d4753df66f2dbc96 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 91a6752daddd88d6e0da2e809b407d8a33770cf8 Author: Chuck Lever Date: Thu Nov 19 14:05:51 2020 -0500 NFSD: Replace READ* macros that decode the fattr4 security label attribute [ Upstream commit dabe91828f92cd493e9e75efbc10f9878d2a73fe ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 064e439befc906970645f04d01ec7e2ce832f27b Author: Chuck Lever Date: Thu Nov 19 14:01:08 2020 -0500 NFSD: Replace READ* macros that decode the fattr4 time_set attributes [ Upstream commit 1c3eff7ea4a98c642134ee493001ae13b79ff38c ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit df42ebb61bbe9436028bfc0e938287ac1f20a7ee Author: Chuck Lever Date: Thu Nov 19 13:58:18 2020 -0500 NFSD: Replace READ* macros that decode the fattr4 owner_group attribute [ Upstream commit 393c31dd27f83adb06b07a1b5f0a5b8966a0f01e ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit dec78fb66dd6c57a39998022c3c6c8c34b5c11cf Author: Chuck Lever Date: Thu Nov 19 13:56:42 2020 -0500 NFSD: Replace READ* macros that decode the fattr4 owner attribute [ Upstream commit 9853a5ac9be381917e9be0b4133cd4ac5a7ad875 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 8801b0c2842173291701ef6290df973f4969275d Author: Chuck Lever Date: Thu Nov 19 13:54:26 2020 -0500 NFSD: Replace READ* macros that decode the fattr4 mode attribute [ Upstream commit 
1c8f0ad7dd35fd12307904036c7c839f77b6e3f9 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 3d3690b6620e6de4ff4ea48b99b7df7a234ac4af Author: Chuck Lever Date: Thu Nov 19 13:02:54 2020 -0500 NFSD: Replace READ* macros that decode the fattr4 acl attribute [ Upstream commit c941a96823cf52e742606b486b81ab346bf111c9 ] Refactor for clarity and to move infrequently-used code out of line. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit ee02662724e31d372c4cd755039a8511b57db4a1 Author: Chuck Lever Date: Thu Nov 19 13:47:16 2020 -0500 NFSD: Replace READ* macros that decode the fattr4 size attribute [ Upstream commit 2ac1b9b2afbbacf597dbec722b23b6be62e4e41e ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2a8ae039571cec9c3e554615daaa02f06a68429f Author: Chuck Lever Date: Thu Nov 19 13:09:13 2020 -0500 NFSD: Change the way the expected length of a fattr4 is checked [ Upstream commit 081d53fe0b43c47c36d1832b759bf14edde9cdbb ] Because the fattr4 is now managed in an xdr_stream, all that is needed is to store the initial position of the stream before decoding the attribute list. Then the actual length of the list is computed using the final stream position, after decoding is complete. No behavior change is expected. 
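The position-based length check described above can be sketched in plain C. This is only an illustration under stated assumptions: struct demo_stream, demo_decode_u32() and demo_decode_attrlist() are hypothetical stand-ins invented for this sketch, not the kernel's struct xdr_stream or any nfsd function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decode cursor; a stand-in, not the kernel's struct xdr_stream. */
struct demo_stream {
	const uint8_t *buf;	/* start of the encoded data */
	size_t len;		/* total bytes available */
	size_t pos;		/* current read offset */
};

/* Read one big-endian 32-bit word; 0 on success, -1 if too few bytes remain. */
static int demo_decode_u32(struct demo_stream *s, uint32_t *val)
{
	if (s->len - s->pos < 4)
		return -1;
	*val = ((uint32_t)s->buf[s->pos] << 24) |
	       ((uint32_t)s->buf[s->pos + 1] << 16) |
	       ((uint32_t)s->buf[s->pos + 2] << 8) |
	       (uint32_t)s->buf[s->pos + 3];
	s->pos += 4;
	return 0;
}

/*
 * Decode a length-prefixed list of nwords 32-bit values, then check that
 * the number of bytes actually consumed equals the declared length.
 */
static int demo_decode_attrlist(struct demo_stream *s, unsigned int nwords)
{
	uint32_t declared, word;
	size_t start;
	unsigned int i;

	if (demo_decode_u32(s, &declared) < 0)
		return -1;
	start = s->pos;		/* save the position before decoding the list */
	for (i = 0; i < nwords; i++)
		if (demo_decode_u32(s, &word) < 0)
			return -1;
	/* actual length = final position minus saved position */
	return (s->pos - start == declared) ? 0 : -1;
}
```

The sketch needs no running byte counter: one saved offset and one subtraction at the end are enough, which is the simplification the commit message attributes to keeping the data in a stream.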
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f82c6ad7e2fb55743ef28bfa6466387146b5adb9 Author: Chuck Lever Date: Tue Nov 3 13:19:51 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_commit() [ Upstream commit cbd9abb3706e96563b36af67595707a7054ab693 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c701c0e5a95675b03313487d7d75a5205292db7d Author: Chuck Lever Date: Tue Nov 3 13:18:23 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_close() [ Upstream commit d3d2f38154571e70d5806b5c5264bf61c101ea15 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9921353a52a7d621eb64ccaabaf28cc29be1c765 Author: Chuck Lever Date: Tue Nov 3 13:12:27 2020 -0500 NFSD: Replace READ* macros in nfsd4_decode_access() [ Upstream commit d169a6a9e5fd7f9e4b74e5e5d2e5a4fd0f84ef05 ] Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit bbb0a710a2c7a0b171e7a0d6fbdd6d5a2acd003b Author: Chuck Lever Date: Tue Nov 3 11:54:23 2020 -0500 NFSD: Replace the internals of the READ_BUF() macro [ Upstream commit c1346a1216ab5cb04a265380ac9035d91b16b6d5 ] Convert the READ_BUF macro in nfs4xdr.c from open code to instead use the new xdr_stream-style decoders already in use by the encode side (and by the in-kernel NFS client implementation). Once this conversion is done, each individual NFSv4 argument decoder can be independently cleaned up to replace these macros with C code. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2994c8888472ac1fe1ce17895786eb76f2483d9b Author: Chuck Lever Date: Sat Nov 21 11:36:42 2020 -0500 NFSD: Add tracepoints in nfsd4_decode/encode_compound() [ Upstream commit 08281341be8ebc97ee47999812bcf411942baa1e ] For troubleshooting purposes, record failures to decode NFSv4 operation arguments and encode operation results. trace_nfsd_compound_decode_err() replaces the dprintk() call sites that are embedded in READ_* macros that are about to be removed. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 568f9ca73d6ee19c2602dab10f5ff623c3ae7fda Author: Chuck Lever Date: Mon Oct 19 13:00:29 2020 -0400 NFSD: Add tracepoints in nfsd_dispatch() [ Upstream commit 0dfdad1c1d1b77b9b085f4da390464dd0ac5647a ] For troubleshooting purposes, record GARBAGE_ARGS and CANT_ENCODE failures. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit fbffaddb766b9fe4ab461c751e79b06334ddac22 Author: Chuck Lever Date: Thu Nov 5 14:48:29 2020 -0500 NFSD: Add common helpers to decode void args and encode void results [ Upstream commit 788f7183fba86b46074c16e7d57ea09302badff4 ] Start off the conversion to xdr_stream by de-duplicating the functions that decode void arguments and encode void results. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 79e4e0d489c8e72b9efa388e504a036eec1550c6 Author: Chuck Lever Date: Thu Nov 5 11:19:42 2020 -0500 SUNRPC: Prepare for xdr_stream-style decoding on the server-side [ Upstream commit 5191955d6fc65e6d4efe8f4f10a6028298f57281 ] A "permanent" struct xdr_stream is allocated in struct svc_rqst so that it is usable by all server-side decoders. A per-rqst scratch buffer is also allocated to handle decoding XDR data items that cross page boundaries. To demonstrate how it will be used, add the first call site for the new svcxdr_init_decode() API. As an additional part of the overall conversion, add symbolic constants for successful and failed XDR operations. Returning "0" is overloaded. Sometimes it means something failed, but sometimes it means success. To make it more clear when XDR decoding functions succeed or fail, introduce symbolic constants. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 2f46cc814106962a5b2a674cd9ff576b7dd00460 Author: Chuck Lever Date: Wed Nov 11 15:52:47 2020 -0500 SUNRPC: Add xdr_set_scratch_page() and xdr_reset_scratch_buffer() [ Upstream commit 0ae4c3e8a64ace1b8d7de033b0751afe43024416 ] Clean up: De-duplicate some frequently-used code. 
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 164937edca64cb63600b6bc452bf9673359fee4e Author: Huang Guobin Date: Wed Nov 25 03:39:33 2020 -0500 nfsd: Fix error return code in nfsd_file_cache_init() [ Upstream commit 231307df246eb29f30092836524ebb1fcb8f5b25 ] Return the PTR_ERR() error code from the error-handling path of nfsd_file_cache_init() instead of 0, as is done elsewhere in this function. Fixes: 65294c1f2c5e7 ("nfsd: add a new struct file caching facility to nfsd") Signed-off-by: Huang Guobin Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9393f1628f9abdf6c78f85cf8c1a8fcc9361e6ef Author: Chuck Lever Date: Thu Aug 27 16:09:53 2020 -0400 NFSD: Add SPDX header for fs/nfsd/trace.c [ Upstream commit f45a444cfe582b85af937a30d35d68d9a84399dd ] Clean up. The file was contributed in 2014 by Christoph Hellwig in commit 31ef83dc0538 ("nfsd: add trace events"). Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a7b8e883cef762e79baa3c407a3005148e28ab5b Author: Chuck Lever Date: Fri Sep 4 15:06:26 2020 -0400 NFSD: Remove extra "0x" in tracepoint format specifier [ Upstream commit 3a90e1dff16afdae6e1c918bfaff24f4d0f84869 ] Clean up: %p adds its own 0x already. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9f8405182bdd90be2d67f9b7684a9bf0b29c3678 Author: Chuck Lever Date: Wed Aug 19 12:56:40 2020 -0400 NFSD: Clean up the show_nf_may macro [ Upstream commit b76278ae68848cea13b325d247aa5cf31c87edac ] Display all currently possible NFSD_MAY permission flags. Move show_nf_may and give it a more generic name, because the NFSD_MAY permission flags are used in other places besides the file cache.
Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e51368510170946af965a1face59ca2c3b9550c1 Author: Alex Shi Date: Fri Nov 6 13:40:57 2020 +0800 nfsd/nfs3: remove unused macro nfsd3_fhandleres [ Upstream commit 71fd721839a74d945c242299f6be29a246fc2131 ] The macro is unused; remove it to silence a gcc warning: fs/nfsd/nfs3proc.c:702:0: warning: macro "nfsd3_fhandleres" is not used [-Wunused-macros] Signed-off-by: Alex Shi Cc: "J. Bruce Fields" Cc: Chuck Lever Cc: linux-nfs@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 92f59545b914b761879e82a3dc514a377b2b2b7d Author: Tom Rix Date: Sun Nov 1 07:32:34 2020 -0800 NFSD: A semicolon is not needed after a switch statement. [ Upstream commit 25fef48bdbe7cac5ba5577eab6a750e1caea43bc ] Signed-off-by: Tom Rix Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit a2f25c3208d18c1bf6084d2bf290cda2ce4befda Author: Chuck Lever Date: Thu Nov 5 10:24:19 2020 -0500 NFSD: Invoke svc_encode_result_payload() in "read" NFSD encoders [ Upstream commit 76e5492b161f555c0fb69cad9eb39a7d8467f5fe ] Have the NFSD encoders annotate the boundaries of every direct-data-placement eligible result data payload. Then change svcrdma to use that annotation instead of the xdr->page_len when handling Write chunks. For NFSv4 on RDMA, that makes it possible to recognize multiple result payloads per compound. This is a prerequisite for supporting multiple Write chunks per RPC transaction. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9aa0a43a55ff181abd771035f23941402db96829 Author: Chuck Lever Date: Wed Jun 10 10:36:42 2020 -0400 SUNRPC: Rename svc_encode_read_payload() [ Upstream commit 03493bca084fdca48abc59b00e06ce733aa9eb7d ] Clean up: "result payload" is a less confusing name for these payloads. "READ payload" reflects only the NFS usage. Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin