aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/filesystems
diff options
context:
space:
mode:
authorMatthew Wilcox (Oracle) <willy@infradead.org>2023-08-18 21:23:34 +0100
committerAndrew Morton <akpm@linux-foundation.org>2023-08-24 16:20:29 -0700
commit40d49a3c9e4a0e5cf7a6fcebc8d4d7d63d1f3f1b (patch)
tree3bd5bfde4e95a2bb85c40cf795a8418eef084760 /Documentation/filesystems
parent051ddcfeb1bdbae45e660c0db2468d29ca15c6c2 (diff)
downloadlinux-40d49a3c9e4a0e5cf7a6fcebc8d4d7d63d1f3f1b.tar.gz
mm: allow ->huge_fault() to be called without the mmap_lock held
Remove the checks for the VMA lock being held, allowing the page fault path to call into the filesystem instead of retrying with the mmap_lock held. This will improve scalability for DAX page faults. Also update the documentation to match (and fix some other changes that have happened recently). Link: https://lkml.kernel.org/r/20230818202335.2739663-3-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r--Documentation/filesystems/locking.rst36
-rw-r--r--Documentation/filesystems/porting.rst11
2 files changed, 34 insertions, 13 deletions
diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 211a0305399223..1a2cb60b24992c 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -628,26 +628,29 @@ vm_operations_struct
prototypes::
- void (*open)(struct vm_area_struct*);
- void (*close)(struct vm_area_struct*);
- vm_fault_t (*fault)(struct vm_area_struct*, struct vm_fault *);
+ void (*open)(struct vm_area_struct *);
+ void (*close)(struct vm_area_struct *);
+ vm_fault_t (*fault)(struct vm_fault *);
+ vm_fault_t (*huge_fault)(struct vm_fault *, unsigned int order);
+ vm_fault_t (*map_pages)(struct vm_fault *, pgoff_t start, pgoff_t end);
vm_fault_t (*page_mkwrite)(struct vm_area_struct *, struct vm_fault *);
vm_fault_t (*pfn_mkwrite)(struct vm_area_struct *, struct vm_fault *);
int (*access)(struct vm_area_struct *, unsigned long, void*, int, int);
locking rules:
-============= ========= ===========================
+============= ========== ===========================
ops mmap_lock PageLocked(page)
-============= ========= ===========================
-open: yes
-close: yes
-fault: yes can return with page locked
-map_pages: read
-page_mkwrite: yes can return with page locked
-pfn_mkwrite: yes
-access: yes
-============= ========= ===========================
+============= ========== ===========================
+open: write
+close: read/write
+fault: read can return with page locked
+huge_fault: maybe-read
+map_pages: maybe-read
+page_mkwrite: read can return with page locked
+pfn_mkwrite: read
+access: read
+============= ========== ===========================
->fault() is called when a previously not present pte is about to be faulted
in. The filesystem must find and return the page associated with the passed in
@@ -657,6 +660,13 @@ then ensure the page is not already truncated (invalidate_lock will block
subsequent truncate), and then return with VM_FAULT_LOCKED, and the page
locked. The VM will unlock the page.
+->huge_fault() is called when there is no PUD or PMD entry present. This
+gives the filesystem the opportunity to install a PUD or PMD sized page.
+Filesystems can also use the ->fault method to return a PMD sized page,
+so implementing this function may not be necessary. In particular,
+filesystems should not call filemap_fault() from ->huge_fault().
+The mmap_lock may not be held when this method is called.
+
->map_pages() is called when VM asks to map easy accessible pages.
Filesystem should find and map pages associated with offsets from "start_pgoff"
till "end_pgoff". ->map_pages() is called with the RCU lock held and must
diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index d2d684ae77984f..7ce352265de103 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -943,3 +943,14 @@ file pointer instead of struct dentry pointer. d_tmpfile() is similarly
changed to simplify callers. The passed file is in a non-open state and on
success must be opened before returning (e.g. by calling
finish_open_simple()).
+
+---
+
+**mandatory**
+
+Calling convention for ->huge_fault has changed. It now takes a page
+order instead of an enum page_entry_size, and it may be called without the
+mmap_lock held. All in-tree users have been audited and do not seem to
+depend on the mmap_lock being held, but out of tree users should verify
+for themselves. If they do need it, they can return VM_FAULT_RETRY to
+be called with the mmap_lock held.