path: root/fs
Age | Commit message | Author | Files | Lines
2006-09-30 | [PATCH] ext3: make meta data reads use READ_META | Jens Axboe | 2 | -3/+5
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-09-30 | [PATCH] cfq-iosched: kill cfq_exit_lock | Jens Axboe | 1 | -2/+2
cfq_exit_lock is protecting two things now:
- The per-ioc rbtree of cfq_io_contexts
- The per-cfqd linked list of cfq_io_contexts
The per-cfqd linked list can be protected by the queue lock, as it is (by definition) per cfqd as the queue lock is. The per-ioc rbtree is mainly used and updated by the process itself only. The only outside use is the io priority changing. If we move the priority changing to not browsing the rbtree, we can remove any locking from the rbtree updates and lookup completely. Let the sys_ioprio syscall just mark processes as having the iopriority changed and lazily update the private cfq io contexts the next time io is queued, and we can remove this locking as well.
Signed-off-by: Jens Axboe <axboe@suse.de>
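The "mark now, update lazily" idea in this message can be sketched in a few lines of Python. This is only an illustration of the pattern, not the cfq code: the class `IoContext`, the functions `set_ioprio` and `queue_io`, and the `cic_prio` dictionary are hypothetical names, and the sketch ignores the memory-ordering questions a real kernel implementation has to answer.

```python
class IoContext:
    """Hypothetical stand-in for a per-process io context."""

    def __init__(self, ioprio):
        self.ioprio = ioprio          # priority last requested from outside
        self.ioprio_changed = False   # raised by the setter, cleared by the owner
        self.cic_prio = {}            # private per-queue state, touched by the owner only


def set_ioprio(ioc, prio):
    # Outside caller: record the new value and raise the flag.
    # It never walks the owner's private per-queue contexts.
    ioc.ioprio = prio
    ioc.ioprio_changed = True


def queue_io(ioc, queue_id):
    # Owner: refresh the private contexts lazily, just before queuing io.
    if ioc.ioprio_changed:
        for q in ioc.cic_prio:
            ioc.cic_prio[q] = ioc.ioprio
        ioc.ioprio_changed = False
    ioc.cic_prio.setdefault(queue_id, ioc.ioprio)
    return ioc.cic_prio[queue_id]
```

Because only the owning process ever reads or writes `cic_prio`, the outside caller needs no lock on that structure; it only touches the flag and the requested value.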
2006-09-29 | Merge git://oss.sgi.com:8090/xfs/xfs-2.6 | Linus Torvalds | 64 | -1037/+1060
* git://oss.sgi.com:8090/xfs/xfs-2.6: (49 commits)
  [XFS] Remove v1 dir trace macro - missed in a past commit.
  [XFS] 955947: Infinite loop in xfs_bulkstat() on formatter() error
  [XFS] pv 956241, author: nathans, rv: vapo - make ino validation checks
  [XFS] pv 956240, author: nathans, rv: vapo - Minor fixes in
  [XFS] Really fix use after free in xfs_iunpin.
  [XFS] Collapse sv_init and init_sv into just the one interface.
  [XFS] standardize on one sema init macro
  [XFS] Reduce endian flipping in alloc_btree, same as was done for
  [XFS] Minor cleanup from dio locking fix, remove an extra conditional.
  [XFS] Fix kmem_zalloc_greedy warnings on 64 bit platforms.
  [XFS] pv 955157, rv bnaujok - break the loop on EFAULT formatter() error
  [XFS] pv 955157, rv bnaujok - break the loop on formatter() error
  [XFS] Fixes the leak in reservation space because we weren't ungranting
  [XFS] Add lock annotations to xfs_trans_update_ail and
  [XFS] Fix a porting botch on the realtime subvol growfs code path.
  [XFS] Minor code rearranging and cleanup to prevent some coverity false
  [XFS] Remove a no-longer-correct debug assert from dio completion
  [XFS] Add a greedy allocation interface, allocating within a min/max size
  [XFS] Improve error handling for the zero-fsblock extent detection code.
  [XFS] Be more defensive with page flags (error/private) for metadata
  ...
2006-09-29 | [PATCH] Kcore elf note namesz field fix | Vivek Goyal | 1 | -2/+2
o As per ELF specifications, it looks like the elf note "namesz" field contains the length of "name" including the size of the null character. Currently we are filling "namesz" without taking the null character size into consideration. o Kexec-tools performs this check diligently, hence I ran into the issue while trying to open /proc/kcore in kexec-tools for some info. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
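To make the "namesz" rule concrete, here is a minimal Python sketch that packs one ELF note the way the message describes, with the terminating NUL counted in the name size. The helper `build_elf_note` is a hypothetical name, and the little-endian 32-bit header layout and 4-byte padding are assumptions chosen for the illustration; this is not the /proc/kcore code.

```python
import struct


def build_elf_note(name: bytes, note_type: int, desc: bytes) -> bytes:
    """Pack one ELF note: header, NUL-terminated name, descriptor.

    Assumes little-endian 32-bit header words and 4-byte alignment.
    """
    name_z = name + b"\0"
    namesz = len(name_z)  # counts the terminating NUL, per the ELF spec

    def pad4(b: bytes) -> bytes:
        # Pad to the next multiple of 4 bytes.
        return b + b"\0" * (-len(b) % 4)

    header = struct.pack("<III", namesz, len(desc), note_type)
    return header + pad4(name_z) + pad4(desc)


note = build_elf_note(b"CORE", 1, b"\x00" * 8)
namesz, descsz, ntype = struct.unpack_from("<III", note)
```

For the name "CORE" a strict reader expects `namesz` to be 5, not 4, which is the off-by-one the patch addresses.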
2006-09-29 | [PATCH] expand_fdtable(): remove pointless unlock+lock | Andrew Morton | 1 | -2/+0
This unlock/lock on a super-unlikely path isn't worth the kernel text. Cc: Vadim Lobanov <vlobanov@speakeasy.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] Clean up expand_fdtable() and expand_files() | Vadim Lobanov | 1 | -41/+35
Perform a code cleanup against the expand_fdtable() and expand_files() functions inside fs/file.c. It aims to make the flow of code within these functions simpler and easier to understand, via added comments and modest refactoring. Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] Access Control Lists for tmpfs | Andreas Gruenbacher | 1 | -0/+13
Add access control lists for tmpfs. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] Generic infrastructure for acls | Andreas Gruenbacher | 3 | -0/+202
The patches solve the following problem: We want to grant access to devices based on who is logged in from where, etc. This includes switching back and forth between multiple user sessions, etc. Using ACLs to define device access for logged-in users gives us all the flexibility we need in order to fully solve the problem. Device special files nowadays usually live on tmpfs, hence tmpfs ACLs. Different distros have come up with solutions that solve the problem to different degrees: SUSE uses a resource manager which tracks login sessions and sets ACLs on device inodes as appropriate. RedHat uses pam_console, which changes the primary file ownership to the logged-in user. Others use a set of groups that users must be in in order to be granted the appropriate accesses. The freedesktop.org project plans to implement a combination of a console-tracker and a HAL-device-list based solution to grant access to devices to users, and more distros will likely follow this approach. These patches have first been posted here on 2 February 2005, and again on 8 January 2006. We have been shipping them in SLES9 and SLES10 with no problems reported. The previous submission is archived here: http://lkml.org/lkml/2006/1/8/229 http://lkml.org/lkml/2006/1/8/230 http://lkml.org/lkml/2006/1/8/231 This patch: Add some infrastructure for access control lists on in-memory filesystems such as tmpfs. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] enforce RLIMIT_NOFILE in poll() | Chris Snook | 1 | -7/+1
POSIX states that poll() shall fail with EINVAL if nfds > OPEN_MAX. In this context, POSIX is referring to sysconf(OPEN_MAX), which is the value of current->signal->rlim[RLIMIT_NOFILE].rlim_cur in the linux kernel, not the compile-time constant which happens to also be named OPEN_MAX. In the current code, an application may poll up to max_fdset file descriptors, even if this exceeds RLIMIT_NOFILE. The current code also breaks applications which poll more than max_fdset descriptors, which worked circa 2.4.18 when the check was against NR_OPEN, which is 1024*1024. This patch enforces the limit precisely as POSIX defines, even if RLIMIT_NOFILE has been changed at run time with ulimit -n.
To elaborate on the rationale for this, there are three cases:
1) RLIMIT_NOFILE is at the default value of 1024
In this (default) case, the patch changes nothing. Calls with nfds > 1024 fail with EINVAL both before and after the patch, and calls with nfds <= 1024 pass the check both before and after the patch, since 1024 is the initial value of max_fdset.
2) RLIMIT_NOFILE has been raised above the default
In this case, poll() becomes more permissive, allowing polling up to RLIMIT_NOFILE file descriptors even if less than 1024 have been opened. The patch won't introduce new errors here. If an application somehow depends on poll() failing when it polls with duplicate or invalid file descriptors, it's already broken, since this is already allowed below 1024, and will also work above 1024 if enough file descriptors have been open at some point to cause max_fdset to have been increased above nfds.
3) RLIMIT_NOFILE has been lowered below the default
In this case, the system administrator or the user has gone out of their way to protect the system from inefficient (or malicious) applications wasting kernel memory. The current code allows polling up to 1024 file descriptors even if RLIMIT_NOFILE is much lower, which is not what the user or administrator intended. Well-written applications which only poll valid, unique file descriptors will never notice the difference, because they'll hit the limit on open() first. If an application gets broken because of the patch in this case, then it was already poorly/maliciously designed, and allowing it to work in the past was a violation of POSIX and a DoS risk on low-resource systems.
With this patch, poll() will permit exactly what POSIX suggests, no more, no less, and for any run-time value set with ulimit -n, not just 256 or 1024. There are existing apps which poll a large number of file descriptors, some of which may be invalid, and if those numbers straddle 1024, they currently fail with or without the patch in -mm, though they worked fine under 2.4.18.
Signed-off-by: Chris Snook <csnook@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
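The check described above reduces to a single comparison against the soft limit. The following Python sketch models it under that reading; `check_poll_nfds` is a hypothetical helper that returns a kernel-style negative errno, and it is not the kernel's sys_poll() code.

```python
import errno


def check_poll_nfds(nfds: int, rlimit_nofile_cur: int) -> int:
    """Return 0 if nfds passes the limit check, else -EINVAL."""
    # The limit is the run-time soft RLIMIT_NOFILE, not a compile-time constant.
    if nfds > rlimit_nofile_cur:
        return -errno.EINVAL
    return 0
```

With the default limit of 1024, `check_poll_nfds(1024, 1024)` passes and `check_poll_nfds(1025, 1024)` is rejected; raising the limit to 4096 lets a 2000-entry poll through (case 2), and lowering it to 256 rejects a 300-entry poll (case 3).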
2006-09-29 | [PATCH] fs/namei.c: replace multiple current->fs by shortcut variable | Andreas Mohr | 1 | -39/+47
Replace current->fs by fs helper variable to reduce some indirection overhead and (at least at the moment, before the current_thread_info() %gs PDA improvement is available) get rid of more costly current references. Reduces fs/namei.o from 37786 to 37082 Bytes (704 Bytes saved). [akpm@osdl.org: cleanup] Signed-off-by: Andreas Mohr <andi@lisas.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] Ban register_filesystem(NULL); | Alexey Dobriyan | 1 | -2/+0
Everyone passes valid pointer there. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] 9p: fix leak on error path | Alexey Dobriyan | 1 | -2/+4
If register_filesystem() fails mux workqueue must be killed. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@lanl.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] cramfs: make cramfs_uncompress_exit() return void | Alexey Dobriyan | 1 | -2/+1
It always returns 0, so relying on it is useless. The only caller isn't checking return value. In general, un-, de-, -free functions should return void. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] freevxfs: fix leak on error path | Alexey Dobriyan | 1 | -3/+8
If register_filesystem() fails, vxfs_inode cache must be destroyed. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] cramfs: rewrite init_cramfs_fs() | Alexey Dobriyan | 1 | -2/+9
Two lines -- two bugs. :-( Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] fix mem_write() return value | Frederik Deweerdt | 1 | -1/+2
At the beginning of the routine, "copied" is set to 0, but it is no good because in lines 805 and 812 it is set to other values. Finally, the routine returns as if it copied 12 (=ENOMEM) bytes less than it actually did. Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Acked-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
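The arithmetic of this bug can be reproduced with a small Python model. It assumes, based only on the wording of the message, that the byte counter doubles as the error return and still holds -ENOMEM when the copy loop starts; `mem_write_sketch` and its parameters are hypothetical, and the real routine is not reproduced here.

```python
ENOMEM = 12  # value quoted in the commit message


def mem_write_sketch(chunks, reset_before_loop):
    """Model a byte counter that is also used as the error return value."""
    copied = -ENOMEM  # preloaded for the allocation-failure path
    # The allocation is assumed to succeed in this sketch.
    if reset_before_loop:
        copied = 0    # the kind of reset the fix implies
    for n in chunks:
        copied += n
    return copied
```

Writing 128 bytes in two chunks reports 116 without the reset and 128 with it, matching the "12 bytes less" symptom.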
2006-09-29 | [PATCH] block_dev.c mutex_lock_nested() fix | Jason Baron | 1 | -1/+1
In the case below we are locking the whole disk, not a partition. This change simply brings the code in line with the piece above, where we are the 'first' opener and we are a partition. Signed-off-by: Jason Baron <jbaron@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] autofs4: pending flag not cleared on mount fail | Ian Kent | 1 | -3/+3
During testing I've found that the mount pending flag can be left set at exit from autofs4_lookup after a failed mount request. This shouldn't be allowed to happen and causes incorrect error returns. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] autofs4: autofs4_follow_link false negative fix | Ian Kent | 1 | -1/+1
The check for an empty directory in the autofs4_follow_link method fails occasionally due to old dentries. We had the same problem in autofs4_revalidate ages ago. I thought we wouldn't need this in autofs4_follow_link, silly me. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] cdev documentation | Jonathan Corbet | 1 | -0/+59
Add some documentation comments for the cdev interface. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Cc: Rolf Eike Beer <eike-kernel@sf-tec.de> Acked-by: "Randy.Dunlap" <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] tty: stop the tty vanishing under procfs access | Alan Cox | 1 | -0/+3
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] I/O Error attempting to read last partial block of a file in an ISO9660 file system | Joel & Rebecca VanderZee | 1 | -24/+24
There was an I/O error that prevented reading the last partial block of large files in an ISO9660 filesystem. The error was generated when a file comprised more than one section and had a size that was not an exact multiple of the filesystem block size. This patch removes the check (and failure) for reading into the last partial block (and possibly beyond) for multiple-section files. It worked in my testing to prevent reading beyond the end of the section; my first patch just incremented the sect_size block count for a partial block and continued doing the check. But there is a comment in the source code about reading beyond the end of the file to fill a page cache. Failing to access beyond the section would prevent reading beyond the end of the file. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] dquot: add proper locking when using current->signal->tty | Jan Kara | 1 | -0/+5
Dquot passes the tty to tty_write_message without locking Signed-off-by: Jan Kara <jack@suse.cz> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] elf_fdpic_core_dump: don't take tasklist_lock | Oleg Nesterov | 1 | -4/+3
do_each_thread() is rcu-safe, and all tasks which use this ->mm must sleep in wait_for_completion(&mm->core_done) at this point, so we can use RCU locks. Also, remove unneeded INIT_LIST_HEAD(new) before list_add(new, head). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] elf_core_dump: don't take tasklist_lock | Oleg Nesterov | 1 | -4/+3
do_each_thread() is rcu-safe, and all tasks which use this ->mm must sleep in wait_for_completion(&mm->core_done) at this point, so we can use RCU locks. Also, remove unneeded INIT_LIST_HEAD(new) before list_add(new, head). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] fix wrong error code on interrupted close syscalls | Ernie Petrides | 1 | -1/+11
The problem is that close() syscalls can call a file system's flush handler, which in turn might sleep interruptibly and ultimately pass back an -ERESTARTSYS return value. This happens for files backed by an interruptible NFS mount under nfs_file_flush() when a large file has just been written and nfs_wait_bit_interruptible() detects that there is a signal pending. I have a test case where the "strace" command is used to attach to a process sleeping in such a close(). Since the SIGSTOP is forced onto the victim process (removing it from the thread's "blocked" mask in force_sig_info()), the RPC wait is interrupted and the close() is terminated early. But the file table entry has already been cleared before the flush handler was called. Thus, when the syscall is restarted, the file descriptor appears closed and an EBADF error is returned (which is wrong). What's worse, there is the hypothetical case where another thread of a multi-threaded application might have reused the file descriptor, in which case that file would be mistakenly closed. The bottom line is that close() syscalls are not restartable, and thus -ERESTARTSYS return values should be mapped to -EINTR. This is consistent with the close(2) manual page. The fix is below. Signed-off-by: Ernie Petrides <petrides@redhat.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
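The bottom line of this message, mapping a restart request to -EINTR because close() cannot be restarted, can be sketched as follows. The function `close_return_value` is hypothetical; `ERESTARTSYS = 512` is the kernel-internal code as far as I know, and the sketch covers only the one code named in the message, so the actual patch may handle more.

```python
import errno

ERESTARTSYS = 512  # kernel-internal restart code, never meant to reach user space


def close_return_value(flush_result: int) -> int:
    """Translate a flush handler's result into what close() should return."""
    # The file table entry is already cleared, so restarting the syscall
    # would act on a stale (or reused) descriptor. Report an interruption.
    if flush_result == -ERESTARTSYS:
        return -errno.EINTR
    return flush_result
```

Any other result, success or an ordinary error such as -EIO, passes through unchanged.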
2006-09-29 | [PATCH] Chardev checking of overlapping ranges | Amos Waterland | 1 | -5/+23
The code in __register_chrdev_region checks that if the driver wishing to register has the same major as an existing driver the new minor range is strictly less than the existing minor range. However, it does not also check that the new minor range is strictly greater than the existing minor range. That is, if driver X has registered with major=x and minor=0-3, __register_chrdev_region will allow driver Y to register with major=x and minor=1-4. Signed-off-by: Amos Waterland <apw@us.ibm.com> Cc: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
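A two-sided overlap test for minor ranges under the same major can be written as a standard interval check. The Python below is a sketch of that idea only; `ranges_overlap` and its (first minor, count) parameters are hypothetical and do not mirror the kernel's data structures.

```python
def ranges_overlap(new_first, new_count, old_first, old_count):
    """True if two inclusive minor ranges share at least one minor number."""
    new_last = new_first + new_count - 1
    old_last = old_first + old_count - 1
    # Two intervals overlap unless one ends before the other begins.
    return new_first <= old_last and old_first <= new_last
```

For the example in the message, driver X holds minors 0-3 and driver Y asks for 1-4: `ranges_overlap(1, 4, 0, 4)` is true, so the registration should be refused, while a request for 4-7 does not overlap.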
2006-09-29 | [PATCH] Fix unserialized task->files changing | Kirill Korotaev | 3 | -10/+5
Fixed race on put_files_struct on exec with proc. Restoring files on current on error path may lead to proc having a pointer to already kfree-d files_struct. ->files changes in exit.c and kthread.c are safe, as exit_files() does everything under the lock. Found during OpenVZ stress testing. [akpm@osdl.org: add export] Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] add -o flush for fat | Chris Mason | 3 | -3/+80
Fat is commonly used on removable media. Mounting with -o flush tells the FS to write things to disk as quickly as possible. It is like -o sync, but much faster (and not as safe). Signed-off-by: Chris Mason <mason@suse.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] fs.h: ifdef security fields | Alexey Dobriyan | 1 | -1/+0
[assuming BSD security levels are deleted]
The only user of i_security, f_security, s_security fields is SELinux, however, quite a few security modules are trying to get into kernel. So, wrap them under CONFIG_SECURITY. Adding config option for each security field is likely an overkill. Following Stephen Smalley's suggestion, i_security initialization is moved to security_inode_alloc() to not clutter core code with ifdefs and make alloc_inode() codepath tiny little bit smaller and faster. The user of (highly greppable) struct fown_struct::security field is still to be found. I've checked every "fown_struct" and every "f_owner" occurrence. Additionally its removal doesn't break i386 allmodconfig build. struct inode, struct file, struct super_block, struct fown_struct become smaller.
P.S. Combined with two reiserfs inode shrinking patches sent to linux-fsdevel, I can finally suck 12 reiserfs inodes into one page.
/proc/slabinfo
-ext2_inode_cache 388 10
+ext2_inode_cache 384 10
-inode_cache 280 14
+inode_cache 276 14
-proc_inode_cache 296 13
+proc_inode_cache 292 13
-reiser_inode_cache 336 11
+reiser_inode_cache 332 12 <=
-shmem_inode_cache 372 10
+shmem_inode_cache 368 10
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] reiserfs: ifdef ACL stuff from inode | Alexey Dobriyan | 2 | -4/+12
Shrink reiserfs inode more (by 8 bytes) for ACL non-users: -reiser_inode_cache 344 11 +reiser_inode_cache 336 11 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] reiserfs: ifdef xattr_sem | Alexey Dobriyan | 1 | -2/+2
Shrink reiserfs inode by 12 bytes for xattr non-users (me). -reiser_inode_cache 356 11 +reiser_inode_cache 344 11 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] Fix reiserfs latencies caused by data=ordered | Chris Mason | 1 | -11/+43
ReiserFS does periodic cleanup of old transactions in order to limit the length of time a journal replay may take after a crash. Sometimes, writing metadata from an old (already committed) transaction may require committing a newer transaction, which also requires writing all data=ordered buffers. This can cause very long stalls on journal_begin. This patch makes sure new transactions will not need to be committed before trying a periodic reclaim of an old transaction. It is low risk because if a bad decision is made, it just means a slightly longer journal replay after a crash. Signed-off-by: Chris Mason <mason@suse.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] reiserfs_fsync should only use barriers when they are enabled | Chris Mason | 1 | -1/+1
make sure that reiserfs_fsync only triggers barriers when mounted with -o barrier=flush Signed-off-by: Chris Mason <mason@suse.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] mount udf UDF_PART_FLAG_READ_ONLY partitions with MS_RDONLY | Eric Sandeen | 1 | -0/+4
There's a bug where a UDF_PART_FLAG_READ_ONLY udf partition gets mounted read-write, then subsequent problems happen; files seem to be able to be removed, but file creation results in EIO or worse, oops. EIO is coming from udf_new_block(), which returns EIO if the right flags aren't set; only UDF_PART_FLAG_READ_ONLY is set in this case. We probably should not have gotten this far... Attached patch seems to fix it - and includes a printk to alert the user that their "rw" mount request has been converted to "ro."
Here's the testcase I used:
[root@magnesium ~]# mkisofs -R -J -udf -o testiso /tmp/
...
Total translation table size: 0
Total rockridge attributes bytes: 342923
Total directory bytes: 382312
Path table size(bytes): 104
Max brk space used 103000
105059 extents written (205 MB)
[root@magnesium ~]# mount -o loop testiso /mnt/test/
[root@magnesium ~]# ls /mnt/test/fsfile
/mnt/test/fsfile
[root@magnesium ~]# rm /mnt/test/fsfile
[root@magnesium ~]# ls /mnt/test/fsfile
ls: /mnt/test/fsfile: No such file or directory
[root@magnesium ~]# touch /mnt/test/fsfile
touch: cannot touch `/mnt/test/fsfile': Input/output error
[root@magnesium tmp]# grep udf /proc/mounts
/dev/loop1 /mnt/test udf rw 0 0
Force readonly mounts of UDF partitions marked as read-only.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
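The behaviour the patch aims for, turning a "rw" request into "ro" with a warning when the partition is flagged read-only, can be modelled in a few lines of Python. `effective_mount_flags` is a hypothetical helper, the warning text is invented for the sketch, and the `UDF_PART_FLAG_READ_ONLY` value shown is illustrative and should be checked against the UDF headers; `MS_RDONLY = 1` is the usual Linux mount flag value.

```python
MS_RDONLY = 1                      # Linux mount flag
UDF_PART_FLAG_READ_ONLY = 0x0010   # illustrative value, an assumption of this sketch


def effective_mount_flags(mount_flags, partition_flags, log=print):
    """Return the mount flags to use, forcing read-only when required."""
    if (partition_flags & UDF_PART_FLAG_READ_ONLY) and not (mount_flags & MS_RDONLY):
        # Hypothetical message; the real printk wording is not quoted in the log.
        log("partition marked read-only; forcing readonly mount")
        mount_flags |= MS_RDONLY
    return mount_flags
```

With this in place, the test case above would end with /proc/mounts showing "ro" rather than "rw", and the rm and touch attempts would fail cleanly as read-only filesystem errors instead of reaching udf_new_block().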
2006-09-29 | [PATCH] ignore partition table on disks with AIX label | Olaf Hering | 1 | -0/+31
The on-disk data structures from AIX are not known, also the filesystem layout is not known. There is a msdos partition signature at the end of the first block, and the kernel recognizes 3 small (and overlapping) partitions. But they are not usable. Maybe the firmware uses it to find the bootloader for AIX, but AIX boots also if the first block is cleared.
This is the content of the partition table:
# dd if=/dev/sdb count=$(( 4 * 16 )) bs=1 skip=$(( 0x1be )) | xxd
0000000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0000010: 80ff ffff 41ff ffff 1b11 0000 381b 0000 ....A.......8...
0000020: 00ff ffff 41ff ffff 0211 0000 1900 0000 ....A...........
0000030: 80ff ffff 41ff ffff 1b11 0000 381b 0000 ....A.......8...
Handle the whole disk as an empty disk. This also fixes YaST, which compares the output from parted (and formerly fdisk) with /proc/partitions. fdisk has recognized the AIX label for a long time; SuSE has a patch for parted to handle the disk label as unknown.
dmesg will look like this:
sda: [AIX] unknown partition table
Tested on an IBM B50 with AIX V4.3.3.
Signed-off-by: Olaf Hering <olh@suse.de> Cc: Albert Cahalan <acahalan@gmail.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] vfs: define new lookup flag for chdir | Miklos Szeredi | 2 | -2/+3
In the "operation does permission checking" model used by fuse, chdir permission is not checked, since there's no chdir method. For this case set a lookup flag, which will be passed to ->permission(), so fuse can distinguish it from permission checks for other operations. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] fuse: use dentry in statfs | Miklos Szeredi | 1 | -0/+1
Some filesystems may want to report different values depending on the path within the filesystem, i.e. one mount is actually several filesystems. This can be the case for a network filesystem exported by an unprivileged server (e.g. sshfs). This is now possible, thanks to David Howells "VFS: Permit filesystem to perform statfs with a known root dentry" patch. This change is backward compatible, so no need to change interface version. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] Require mmap handler for a.out executables | Eugene Teo | 1 | -0/+14
Files supported by fs/proc/base.c, i.e. /proc/<pid>/*, are not capable of meeting the validity checks in ELF load_elf_*() handling because they have no mmap handler which is required by ELF. In order to stop a.out executables being used as part of an exploit attack against /proc-related vulnerabilities, we make a.out executables depend on ->mmap() existing. Signed-off-by: Eugene Teo <eteo@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] fs: add lock annotation to grab_super | Josh Triplett | 1 | -1/+1
grab_super gets called with sb_lock held, and releases it. Add a lock annotation to this function so that sparse can check callers for lock pairing, and so that sparse will not complain about this function since it intentionally uses the lock in this manner. Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] hugetlbfs: add lock annotation to hugetlbfs_forget_inode() | Josh Triplett | 1 | -1/+1
hugetlbfs_forget_inode releases inode_lock. Add a lock annotation to this function so that sparse can check callers for lock pairing, and so that sparse will not complain about this function since it intentionally uses the lock in this manner. Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] fuse: add lock annotations to request_end and fuse_read_interrupt | Josh Triplett | 1 | -0/+2
request_end and fuse_read_interrupt release fc->lock. Add lock annotations to these two functions so that sparse can check callers for lock pairing, and so that sparse will not complain about these functions since they intentionally use locks in this manner. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] afs: add lock annotations to afs_proc_cell_servers_{start,stop} | Josh Triplett | 1 | -0/+2
afs_proc_cell_servers_start acquires a lock, and afs_proc_cell_servers_stop releases that lock. Add lock annotations to these two functions so that sparse can check callers for lock pairing, and so that sparse will not complain about these functions since they intentionally use locks in this manner. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] mbcache: add lock annotation for __mb_cache_entry_release_unlock() | Josh Triplett | 1 | -0/+1
__mb_cache_entry_release_unlock releases mb_cache_spinlock, so annotate it accordingly. Signed-off-by: Josh Triplett <josh@freedesktop.org> Cc: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] use gcc -O1 in fs/reiserfs only for ancient gcc versions | Olaf Hering | 1 | -1/+1
Only compile with -O1 if the (very old) compiler is broken. We have used reiserfs a lot since SLES9 on ppc64, and the problem was never seen with gcc33. Assume the broken gcc is gcc-3.4 or older. Signed-off-by: Olaf Hering <olh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] libfs: remove page up-to-date check from simple_readpage | Pekka J Enberg | 1 | -9/+1
Remove the unnecessary PageUptodate check from simple_readpage. The only two callers for ->readpage that don't have explicit PageUptodate check are read_cache_pages and page_cache_read which operate on newly allocated pages which don't have the flag set. [akpm: use the allegedly-faster clear_page(), too] Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] fs/namespace: handle init/registration errors | Randy Dunlap | 1 | -2/+10
Check and handle init errors. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 | [PATCH] blockdev.c: check driver layer errors | Andrew Morton | 1 | -12/+22
Check driver layer errors.
Fix from: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
In blockdevc-check-errors.patch, add_bd_holder() is modified to return error values when some of its operations fail. Among them, it returns -EEXIST when a given bd_holder object already exists in the list. However, in this case, the function completed its work successfully and needs no action by its caller other than freeing the unused bd_holder object. So I think it's better to return success after freeing it by itself. Otherwise, bd_claim-ing with the same claim pointer will fail. Typically, lvresize will fail with the following message:
device-mapper: reload ioctl failed: Invalid argument
and you'll see messages like below in the kernel log:
device-mapper: table: 254:13: linear: dm-linear: Device lookup failed
device-mapper: ioctl: error adding target to table
Similarly, it should not add bd_holder to the list if either one of the symlinking operations fails. I don't have a test case for this to happen but it should cause dereference of a freed pointer.
If a matching bd_holder is found in bd_holder_list, add_bd_holder() completes its job by just incrementing the reference count. In this case, it should be considered as success but it used to return 'fail' to let the caller free the temporary bd_holder. Fixed it to return success and free the given object by itself. Also, if either one of the symlinking operations fails, the bd_holder should not be added to the list so that it can be discarded later. Otherwise, the caller will free a bd_holder which is in the list.
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
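The ownership rules described here (a duplicate is consumed and counted as success, a failed symlink leaves the object unlisted) can be illustrated with a small Python model. `Holder`, `add_holder` and the `make_links` callback are hypothetical names for this sketch, and the `freed` attribute merely stands in for the kernel freeing the object; it is not the block_dev.c implementation.

```python
import errno


class Holder:
    """Hypothetical stand-in for a bd_holder entry."""

    def __init__(self, key):
        self.key = key
        self.count = 0
        self.freed = False


def add_holder(holders, new, make_links=lambda h: 0):
    """Add `new` to `holders`, or fold it into a matching entry.

    Returns 0 on success or a negative errno from make_links.
    """
    for h in holders:
        if h.key == new.key:
            h.count += 1
            new.freed = True   # the duplicate is released here, not by the caller
            return 0           # a match counts as success
    err = make_links(new)
    if err:
        return err             # never listed, so the caller may discard it safely
    new.count = 1
    holders.append(new)
    return 0
```

The point of the design is that after any return, exactly one party owns the object: the list on a fresh add, nobody on a duplicate (it is already freed), and the caller on a link failure.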
2006-09-29[PATCH] JBD: memory leak in "journal_init_dev()"Zoltan Menyhart1-11/+10
We leak a bh ref in "journal_init_dev()" in case of failure. Signed-off-by: Zoltan Menyhart <Zoltan.Menyhart@bull.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29[PATCH] JBD: Make journal_brelse_array() staticDave Kleikamp1-1/+1
It's always good to make symbols static when we can, and this also eliminates the need to rename the function in jbd2. Suggested by Eric Sandeen. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29[XFS] Remove v1 dir trace macro - missed in a past commit.Tim Shimmin1-1/+0
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] 955947: Infinite loop in xfs_bulkstat() on formatter() errorVlad Apostolov1-0/+2
SGI-PV: 955947 SGI-Modid: xfs-linux-melb:xfs-kern:26986a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] pv 956241, author: nathans, rv: vapo - make ino validation checksVlad Apostolov2-10/+19
consistent in bulkstat SGI-PV: 956241 SGI-Modid: xfs-linux-melb:xfs-kern:26984a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] pv 956240, author: nathans, rv: vapo - Minor fixes inVlad Apostolov1-6/+11
kmem_zalloc_greedy() SGI-PV: 956240 SGI-Modid: xfs-linux-melb:xfs-kern:26983a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Really fix use after free in xfs_iunpin.David Chinner6-3/+35
The previous attempts to fix the linux inode use-after-free in xfs_iunpin simply made the problem harder to hit. We actually need complete exclusion between xfs_reclaim and xfs_iunpin, as well as ensuring that the i_flags are consistent during both of these functions. Introduce a new spinlock for exclusion and the i_flags, and fix up xfs_iunpin to use igrab before marking the inode dirty. SGI-PV: 952967 SGI-Modid: xfs-linux-melb:xfs-kern:26964a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Collapse sv_init and init_sv into just the one interface.Eric Sandeen1-2/+0
SGI-PV: 907752 SGI-Modid: xfs-linux-melb:xfs-kern:26925a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] standardize on one sema init macroEric Sandeen2-3/+1
One sema to rule them all, one sema to find them... SGI-PV: 907752 SGI-Modid: xfs-linux-melb:xfs-kern:26911a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Reduce endian flipping in alloc_btree, same as was done forEric Sandeen1-51/+57
ialloc_btree. SGI-PV: 955302 SGI-Modid: xfs-linux-melb:xfs-kern:26910a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Minor cleanup from dio locking fix, remove an extra conditional.Nathan Scott1-5/+5
SGI-PV: 955696 SGI-Modid: xfs-linux-melb:xfs-kern:26908a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Fix kmem_zalloc_greedy warnings on 64 bit platforms.Nathan Scott4-4/+5
SGI-PV: 955302 SGI-Modid: xfs-linux-melb:xfs-kern:26907a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] pv 955157, rv bnaujok - break the loop on EFAULT formatter() errorVlad Apostolov1-3/+3
SGI-PV: 955157 SGI-Modid: xfs-linux-melb:xfs-kern:26869a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] pv 955157, rv bnaujok - break the loop on formatter() errorVlad Apostolov1-0/+5
SGI-PV: 955157 SGI-Modid: xfs-linux-melb:xfs-kern:26866a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Fixes the leak in reservation space because we weren't ungrantingTim Shimmin2-2/+12
space for the unmount record - which becomes a problem in the freeze/thaw scenario. SGI-PV: 942533 SGI-Modid: xfs-linux-melb:xfs-kern:26815a Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Add lock annotations to xfs_trans_update_ail andJosh Triplett2-7/+9
xfs_trans_delete_ail xfs_trans_update_ail and xfs_trans_delete_ail get called with the AIL lock held, and release it. Add lock annotations to these two functions so that sparse can check callers for lock pairing, and so that sparse will not complain about these functions since they intentionally use locks in this manner. SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26807a Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Fix a porting botch on the realtime subvol growfs code path.Nathan Scott2-1/+8
SGI-PV: 955515 SGI-Modid: xfs-linux-melb:xfs-kern:26806a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Minor code rearranging and cleanup to prevent some coverity falseNathan Scott3-37/+36
positives. SGI-PV: 955502 SGI-Modid: xfs-linux-melb:xfs-kern:26805a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Remove a no-longer-correct debug assert from dio completionNathan Scott1-1/+0
handling. SGI-PV: 955302 SGI-Modid: xfs-linux-melb:xfs-kern:26804a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Add a greedy allocation interface, allocating within a min/max sizeNathan Scott5-31/+34
range. SGI-PV: 955302 SGI-Modid: xfs-linux-melb:xfs-kern:26803a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Improve error handling for the zero-fsblock extent detection code.Nathan Scott4-68/+51
SGI-PV: 955302 SGI-Modid: xfs-linux-melb:xfs-kern:26802a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Be more defensive with page flags (error/private) for metadataNathan Scott1-3/+8
buffers. SGI-PV: 955302 SGI-Modid: xfs-linux-melb:xfs-kern:26801a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Add a debug flag for allocations which are known to be larger thanNathan Scott7-9/+18
one page. SGI-PV: 955302 SGI-Modid: xfs-linux-melb:xfs-kern:26800a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Remove several macros that are no longer used anywhereEric Sandeen10-68/+1
SGI-PV: 955302 SGI-Modid: xfs-linux-melb:xfs-kern:26749a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Remove unused iop_abort log item operationEric Sandeen5-131/+2
SGI-PV: 955302 SGI-Modid: xfs-linux-melb:xfs-kern:26747a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Remove a couple of unused BUF macrosEric Sandeen2-7/+0
SGI-PV: 955302 SGI-Modid: xfs-linux-melb:xfs-kern:26746a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] pass file mode on DMAPI remove eventsVlad Apostolov1-3/+13
SGI-PV: 953687 SGI-Modid: xfs-linux-melb:xfs-kern:26639a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Remove last bulkstat false-positives with debug kernels.Nathan Scott5-17/+22
SGI-PV: 953819 SGI-Modid: xfs-linux-melb:xfs-kern:26628a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Ensure xlog_state_do_callback does not report spurious warnings onNathan Scott1-2/+7
ramdisks. SGI-PV: 954802 SGI-Modid: xfs-linux-melb:xfs-kern:26627a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Increase the size of the buffer holding the local inode clusterNathan Scott1-6/+16
list, to increase our potential readahead window and in turn improve bulkstat performance. SGI-PV: 944409 SGI-Modid: xfs-linux-melb:xfs-kern:26607a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Drop unneeded endian conversion in bulkstat and start readahead forNathan Scott1-37/+31
batches of inode cluster buffers at once, before any blocking reads are issued. SGI-PV: 944409 SGI-Modid: xfs-linux-melb:xfs-kern:26606a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] When issuing metadata readahead, submit bio with READA not READ.Nathan Scott1-7/+7
SGI-PV: 944409 SGI-Modid: xfs-linux-melb:xfs-kern:26603a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Rework DMAPI bulkstat calls in such a way that we can directlyNathan Scott2-15/+61
extract inline attributes out of the bulkstat buffer (for that case), rather than using an (extremely expensive for large icount filesystems) iget for fetching attrs. SGI-PV: 944409 SGI-Modid: xfs-linux-melb:xfs-kern:26602a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Add EA list callbacks for xfs kernel use. Cleanup some namespaceTim Shimmin4-247/+334
code. SGI-PV: 954372 SGI-Modid: xfs-linux-melb:xfs-kern:26583a Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Update XFS for i_blksize removal from generic inode structureNathan Scott1-5/+20
SGI-PV: 954366 SGI-Modid: xfs-linux-melb:xfs-kern:26565a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] remove accidentally reintroduced vfs unmount flag, unneeded inNathan Scott1-1/+1
current kernels SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26564a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] remove bhv_lookup, _range version works as well and has more usefulChristoph Hellwig4-25/+2
semantics. SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26563a Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] use NULL for pointer initialisation instead of zero-cast-to-ptrNathan Scott4-15/+15
SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26562a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] endianess annotations for xfs_bmbt_key Trivial as there are noChristoph Hellwig4-33/+25
incore users. SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26561a Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] endianess annotate XFS_BMAP_BROOT_PTR_ADDR Make sure it returns aChristoph Hellwig3-50/+47
__be64 and let the callers use the proper macros. SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26560a Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] endianess annotations for xfs_bmbt_ptr_t/xfs_bmdr_ptr_tChristoph Hellwig1-2/+4
SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26559a Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] add xfs_btree_check_lptr_disk variant which handles endianChristoph Hellwig3-13/+16
conversion SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26558a Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] remove left over INT_ comments in *alloc*.c We can verify endianessChristoph Hellwig2-16/+16
handling with sparse now, no need for comments. SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26557a Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] endianess annotations for xfs_inobt_rec_t / xfs_inobt_key_tChristoph Hellwig6-46/+51
SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26556a Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] endianess annotation for xfs_agfl_t. Trivial, xfs_agfl_t is alwaysChristoph Hellwig2-4/+4
used for ondisk values. SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26553a Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Fix sparse warning found when page tracing enabled, due toNathan Scott1-4/+4
overloaded gfp_t param. SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26552a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Fix rounding bug in xfs_free_file_space found by sparse checking.Nathan Scott1-3/+2
SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26551a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] move XFS_IOC_GETVERSION to main multiplexerAlexey Dobriyan1-7/+2
Avoids doing an unnecessary inode to vnode conversion and avoids a memory allocation. SGI-PV: 904196 SGI-Modid: xfs-linux-melb:xfs-kern:26492a Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] cleanup the field types of some item format structuresTim Shimmin2-54/+62
SGI-PV: 954365 SGI-Modid: xfs-linux-melb:xfs-kern:26406a Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] Improve xfsbufd delayed write submission patterns, after blktraceNathan Scott1-10/+14
analysis. Under a sequential create+allocate workload, blktrace reported backward writes being issued by xfsbufd, and frequent inappropriate queue unplugs. We now insert at the tail when moving from the delwri lists to the temp lists, which maintains correct ordering, and we avoid unplugging queues deep in the submit paths when we'd shortly do it at a higher level anyway. blktrace now reports much healthier write patterns from xfsbufd for this workload (and likely many others). SGI-PV: 954310 SGI-Modid: xfs-linux-melb:xfs-kern:26396a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28[XFS] pass inode to xfs_ioc_space(), simplify some code. There is trivialAlexey Dobriyan1-5/+5
"inode => vnode => inode" conversion, but only flags and mode of final inode are looked at. Pass original inode instead. SGI-PV: 904196 SGI-Modid: xfs-linux-melb:xfs-kern:26395a Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-27[PATCH] de_thread: Use tsk not currentEric W. Biederman1-23/+23
Ingo Oeser pointed out that because current expands to an inline function it is more space efficient and somewhat faster to simply keep a cached copy of current in another variable. This patch implements that for the de_thread function. (akpm: saves nearly 100 bytes of text on x86) Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] fs/nfs/: make code staticAdrian Bunk1-3/+9
Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] add newline to nfs dprintkMartin Bligh1-1/+1
Add missing \n to dprintk Signed-off-by: Martin Bligh <mbligh@google.com> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] pid: Implement transfer_pid and use it to simplify de_threadEric W. Biederman1-7/+4
In de_thread we move pids from one process to another, a rather ugly case. The function transfer_pid makes it clear what we are doing, and makes the action atomic. This is useful if we ever want to atomically traverse the process group and session lists in an RCU-safe manner. Even without the atomic properties this change should be a win, as transfer_pid should be less code to execute than executing both attach_pid and detach_pid, and this should make de_thread slightly smaller as only a single function call needs to be emitted. The only downside is that the code might be slower to execute as the odds are against transfer_pid being in cache. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] sysctl: Allow /proc/sys without sys_sysctlEric W. Biederman1-0/+19
Since sys_sysctl is deprecated, start allowing it to be compiled out. This should catch any remaining user space code that cares, and paves the way for further sysctl cleanups. [akpm@osdl.org: If sys_sysctl() is not compiled-in, emit a warning] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] alloc_fdtable() cleanupAndrew Morton1-4/+2
free_fdset(NULL, ...) is legal. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] reiserfs: warn about the useless nolargeio optionAdrian Bunk1-19/+2
Since the nolargeio option no longer has any effect, print a warning instead of setting a write-only variable. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Chris Mason <mason@suse.com> Cc: Hans Reiser <reiser@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] inode-diet: Eliminate i_blksize from the inode structureTheodore Ts'o62-98/+14
This eliminates the i_blksize field from struct inode. Filesystems that want to provide a per-inode st_blksize can do so by providing their own getattr routine instead of using the generic_fillattr() function. Note that some filesystems were providing pretty much random (and incorrect) values for i_blksize. [bunk@stusta.de: cleanup] [akpm@osdl.org: generic_fillattr() fix] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] inode-diet: Move i_cdev into a unionTheodore Ts'o2-2/+2
Move the i_cdev pointer in struct inode into a union. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] inode-diet: Move i_bdev into a unionTheodore Ts'o1-1/+1
Move the i_bdev pointer in struct inode into a union. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_privateTheodore Ts'o13-38/+38
The following patches reduce the size of the VFS inode structure by 28 bytes on a UP x86. (It would be more on an x86_64 system.) This is a 10% reduction in the inode size on a UP kernel that is configured in a production mode (i.e., with no spinlock or other debugging functions enabled; if you want to save memory taken up by in-core inodes, the first thing you should do is disable the debugging options; they are responsible for a huge amount of bloat in the VFS inode structure). This patch: The filesystem or device-specific pointer in the inode is inside a union, which is pretty pointless given that all 30+ users of this field have been using the void pointer. Get rid of the union and rename it to i_private, with a comment to explain who is allowed to use the void pointer. This is just a cleanup, but it allows us to reuse the union 'u' for something where the union will actually be used. [judith@osdl.org: powerpc build fix] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Judith Lebzelter <judith@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] fat: cleanup fat_get_block(s)OGAWA Hirofumi1-17/+12
get_blocks() was removed. So, this removes it on fat, and will take advantage of the multi block mapping. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] autofs4 needs to force fail return revalidateIan Kent2-23/+65
For a long time now I have had a problem with not being able to return a lookup failure on an existing directory. In autofs this corresponds to a mount failure on an autofs-managed mount entry that is browsable (and so the mount point directory exists). While this problem has been present for a long time I've avoided resolving it because it was not very visible. But now that autofs v5 has "mount and expire on demand" of nested multiple mounts, such as is found when mounting an export list from a server, solving the problem cannot be avoided any longer. I've tried very hard to find a way to do this entirely within the autofs4 module but have not been able to find a satisfactory way to achieve it. So, I need to propose a change to the VFS. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] NOMMU: move the fallback arch_vma_name() to a sensible placeDavid Howells1-5/+0
Move the fallback arch_vma_name() to a sensible place (kernel/signal.c). Currently it's in fs/proc/task_mmu.c, a file that is dependent on both CONFIG_PROC_FS and CONFIG_MMU being enabled, but it's used from kernel/signal.c from where it is called unconditionally. [akpm@osdl.org: build fix] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] NOMMU: Implement /proc/pid/maps for NOMMUDavid Howells3-20/+75
Implement /proc/pid/maps for NOMMU by reading the vm_area_list attached to current->mm->context.vmlist. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] NOMMU: Set BDI capabilities for /dev/mem and /dev/kmemDavid Howells1-0/+20
Set the backing device info capabilities for /dev/mem and /dev/kmem to permit direct sharing under no-MMU conditions and full mapping capabilities under MMU conditions. Make the BDI used by these available to all directly mappable character devices. Also comment the capabilities for /dev/zero. [akpm@osdl.org: ifdef reductions] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] Really ignore kmem_cache_destroy return valueAlexey Dobriyan33-107/+42
* Roughly half of the callers already do it by not checking the return value * Code in drivers/acpi/osl.c does the following to be sure: (void)kmem_cache_destroy(cache); * Those who check it printk something; however, slab_error has already printed the name of the failed cache. * XFS BUGs on a failed kmem_cache_destroy, which is not a decision a low-level filesystem driver should make. Converted to ignore. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] fs: Removing useless castsPanagiotis Issaris21-48/+37
* Removing useless casts * Removing useless wrapper * Conversion from kmalloc+memset to kzalloc Signed-off-by: Panagiotis Issaris <takis@issaris.org> Acked-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] fs: Conversions from kmalloc+memset to k(z|c)allocPanagiotis Issaris35-84/+43
Conversions from kmalloc+memset to kzalloc. Signed-off-by: Panagiotis Issaris <takis@issaris.org> Jffs2-bit-acked-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
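The shape of this conversion can be shown with a userspace analogue. In the kernel, kzalloc(size, flags) allocates zeroed memory in one call and kcalloc(n, size, flags) does the same for arrays; the sketch below uses malloc/memset versus calloc as stand-ins, and struct item and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object type used only for this illustration. */
struct item {
    int id;
    char name[16];
};

/* Before: allocate, then clear in a second step (kmalloc + memset pattern). */
static struct item *item_alloc_two_step(void)
{
    struct item *it = malloc(sizeof(*it));

    if (it == NULL)
        return NULL;
    memset(it, 0, sizeof(*it));
    return it;
}

/* After: one zeroing allocation (the kzalloc pattern, via calloc here). */
static struct item *item_alloc_zeroed(void)
{
    return calloc(1, sizeof(struct item));
}
```

Both helpers return an identically zeroed object; the second form is shorter and cannot forget the memset or pass it the wrong size.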
2006-09-27[PATCH] more ext3 16T overflow fixesEric Sandeen2-6/+23
Some of the changes in balloc.c are just cosmetic, as Andreas pointed out - if they overflow they'll then underflow and things are fine. 5th hunk actually fixes an overflow problem. Also check for potential overflows in inode & block counts when resizing. Signed-off-by: Eric Sandeen <esandeen@redhat.com> Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] ext3: Fix sparse warningsDave Kleikamp3-14/+13
Fixing up some endian-ness warnings in preparation to clone ext4 from ext3. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] ext3: More whitespace cleanupsDave Kleikamp11-41/+41
More white space cleanups in preparation of cloning ext4 from ext3. Removing spaces that precede a tab. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] ext3: wrong error behaviorVasily Averin1-5/+6
The SWsoft Virtuozzo/OpenVZ Linux kernel team has discovered that ext3 error behavior has been broken in Linux kernels since the 2.5.x versions by the following patch: 2002/10/31 02:15:26-05:00 tytso@snap.thunk.org Default mount options from superblock for ext2/3 filesystems http://linux.bkbits.net:8080/linux-2.6/gnupatch@3dc0d88eKbV9ivV4ptRNM8fBuA3JBQ When an ext3 file system is mounted with errors=continue (EXT3_ERRORS_CONTINUE), errors should be ignored when possible. However, at present the kernel aborts the journal and remounts the filesystem read-only on any error. This behavior was hit a number of times and noted to differ from that of 2.4.x kernels. This patch fixes this: - do nothing in case of EXT3_ERRORS_CONTINUE, - set EXT3_MOUNT_ABORT and call journal_abort() in all other cases, - panic() should be called after ext3_commit_super() to save sb marked as EXT3_ERROR_FS. Signed-off-by: Vasily Averin <vvs@sw.ru> Acked-by: Kirill Korotaev <dev@sw.ru> Cc: Theodore Ts'o <tytso@mit.edu> Cc: "Stephen C. Tweedie" <sct@redhat.com> Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] ext3: more comments about block allocation/reservation codeMingming Cao1-45/+247
Signed-off-by: Mingming Cao <cmm@us.ibm.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] ext3: turn on reservation dump on block allocation errorsMingming Cao1-4/+8
In the past there were a few kernel panics related to failures of block reservation tree operations (insert/remove etc). It would be very useful to get the block allocation reservation map info when such an error happens. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] JBD: 16T fixesEric Sandeen1-3/+3
These are a few places I've found in jbd that look like they may not be 16T-safe, or consistent with the use of unsigned longs for block containers. Problems here would be somewhat hard to hit, as they would require journal blocks past the 8T boundary, which would not be terribly common. Still, should fix. (some of these have come from the ext4 work on jbd as well). I think there's one more possibility that the wrap() function may not be safe IF your last block in the journal butts right up against the 2^32 block boundary, but that seems like a VERY remote possibility, and I'm not worrying about it at this point. Signed-off-by: Eric Sandeen <esandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] ext3: inode numbers are unsigned longEric Sandeen5-26/+27
This is primarily format string fixes, with changes to ialloc.c where large inode counts could overflow, and also pass around journal_inum as an unsigned long, just to be pedantic about it.... Signed-off-by: Eric Sandeen <esandeen@redhat.com> Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] ext2: fix mounts at 16TEric Sandeen1-13/+19
Signed-off-by: Eric Sandeen <esandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] fix ext3 mounts at 16TEric Sandeen1-15/+19
I need to do some actual IO testing now, but this gets things mounting for a 16T ext3 filesystem. (patched up e2fsprogs is needed too, I'll send that off the kernel list) This patch fixes these issues in the kernel: o sbi->s_groups_count overflows in ext3_fill_super() sbi->s_groups_count = (le32_to_cpu(es->s_blocks_count) - le32_to_cpu(es->s_first_data_block) + EXT3_BLOCKS_PER_GROUP(sb) - 1) / EXT3_BLOCKS_PER_GROUP(sb); at 16T, s_blocks_count is already maxed out; adding EXT3_BLOCKS_PER_GROUP(sb) overflows it and groups_count comes out to 0. Not really what we want, and causes a failed mount. Feel free to check my math (actually, please do!), but changing it this way should work & avoid the overflow: (A + B - 1)/B changed to: ((A - 1)/B) + 1 o ext3_check_descriptors() overflows range checks ext3_check_descriptors() iterates over all block groups making sure that various bits are within the right block ranges... on the last pass through, it is checking the error case [item] >= block + EXT3_BLOCKS_PER_GROUP(sb) where "block" is the first block in the last block group. The last block in this group (and the last one that will fit in 32 bits) is block + EXT3_BLOCKS_PER_GROUP(sb)- 1. block + EXT3_BLOCKS_PER_GROUP(sb) wraps back around to 0. so, make things clearer with "first_block" and "last_block" where those are first and last, inclusive, and use <, > rather than <, >=. Finally, the last block group may be smaller than the rest, so account for this on the last pass through: last_block = sb->s_blocks_count - 1; (a similar patch could be done for ext2; does anyone in their right mind use ext2 at 16T? I'll send an ext2 patch doing the same thing if that's warranted) Signed-off-by: Eric Sandeen <esandeen@redhat.com> Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
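The group-count overflow described above can be reproduced with plain 32-bit arithmetic. The following is a minimal sketch, not the ext3 code: groups_naive() and groups_safe() are hypothetical helper names, and 32768 blocks per group is an assumed value (what a 4K block size would give).

```c
#include <assert.h>
#include <stdint.h>

/* Ceiling division as (a + b - 1) / b; the sum can wrap in 32 bits. */
static uint32_t groups_naive(uint32_t a, uint32_t b)
{
    assert(b != 0);
    return (uint32_t)(a + b - 1) / b;
}

/* Rearranged form from the commit message: ((a - 1) / b) + 1, for a >= 1. */
static uint32_t groups_safe(uint32_t a, uint32_t b)
{
    assert(a != 0 && b != 0);
    return ((a - 1) / b) + 1;
}
```

With a = 0xFFFFFFFF blocks and b = 32768, the naive form wraps and yields 0 groups, while the rearranged form yields 131072. For small inputs such as a = 100, b = 10 both forms agree on 10.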
2006-09-27[PATCH] jbd: use BUILD_BUG_ON in journal initAlexey Dobriyan1-7/+1
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Stephen Tweedie <sct@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] ext3 and jbd cleanup: remove whitespaceMingming Cao15-274/+274
Remove whitespace from ext3 and jbd, before we clone ext4. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] jbd: add lock annotation to jbd_sync_bhJosh Triplett1-0/+1
jbd_sync_bh releases journal->j_list_lock. Add a lock annotation to this function so that sparse can check callers for lock pairing, and so that sparse will not complain about this function since it intentionally uses the lock in this manner. Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6Linus Torvalds2-3/+5
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits) [PATCH] Don't set calgary iommu as default y [PATCH] i386/x86-64: New Intel feature flags [PATCH] x86: Add a cumulative thermal throttle event counter. [PATCH] i386: Make the jiffies compares use the 64bit safe macros. [PATCH] x86: Refactor thermal throttle processing [PATCH] Add 64bit jiffies compares (for use with get_jiffies_64) [PATCH] Fix unwinder warning in traps.c [PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1 [PATCH] x86: Move direct PCI scanning functions out of line [PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI [PATCH] Don't leak NT bit into next task [PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder [PATCH] Fix some broken white space in ia32_signal.c [PATCH] Initialize argument registers for 32bit signal handlers. [PATCH] Remove all traces of signal number conversion [PATCH] Don't synchronize time reading on single core AMD systems [PATCH] Remove outdated comment in x86-64 mmconfig code [PATCH] Use string instructions for Core2 copy/clear [PATCH] x86: - restore i8259A eoi status on resume [PATCH] i386: Split multi-line printk in oops output. ...
2006-09-26Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6Linus Torvalds8-61/+62
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (47 commits) Driver core: Don't call put methods while holding a spinlock Driver core: Remove unneeded routines from driver core Driver core: Fix potential deadlock in driver core PCI: enable driver multi-threaded probe Driver Core: add ability for drivers to do a threaded probe sysfs: add proper sysfs_init() prototype drivers/base: check errors drivers/base: Platform notify needs to occur before drivers attach to the device v4l-dev2: handle __must_check add CONFIG_ENABLE_MUST_CHECK add __must_check to device management code Driver core: fixed add_bind_files() definition Driver core: fix comments in drivers/base/power/resume.c sysfs_remove_bin_file: no return value, dump_stack on error kobject: must_check fixes Driver core: add ability for devices to create and remove bin files Class: add support for class interfaces for devices Driver core: create devices/virtual/ tree Driver core: add device_rename function Driver core: add ability for classes to handle devices properly ...
2006-09-26[PATCH] binfmt_elf: consistently use loff_tAndrew Morton1-5/+5
As David Howells <dhowells@redhat.com> points out, binfmt_elf sometimes uses off_t, sometimes uses loff_t. Use loff_t throughout. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] ZVC: Support NR_SLAB_RECLAIMABLE / NR_SLAB_UNRECLAIMABLEChristoph Lameter1-1/+6
Remove the atomic counter for slab_reclaim_pages and replace the counter and NR_SLAB with two ZVC counters that account for unreclaimable and reclaimable slab pages: NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE. Change the check in vmscan.c to refer to NR_SLAB_RECLAIMABLE. The intent seems to be to check for slab pages that could be freed. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] reduce MAX_NR_ZONES: make display of highmem counters conditional on CONFIG_HIGHMEMChristoph Lameter1-0/+4
Do not display HIGHMEM memory sizes if CONFIG_HIGHMEM is not set. Make HIGHMEM-dependent texts and the display of highmem counters optional. Some texts depend on CONFIG_HIGHMEM. Remove those strings and remove the display of highmem counter values if CONFIG_HIGHMEM is not set. [akpm@osdl.org: remove some ifdefs] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] mm: tracking shared dirty pagesPeter Zijlstra1-1/+1
Tracking of dirty pages in shared writeable mmap()s. The idea is simple: write protect clean shared writeable pages, catch the write-fault, make writeable and set dirty. On page write-back clean all the PTE dirty bits and write protect them once again. The implementation is a tad harder, mainly because the default backing_dev_info capabilities were too loosely maintained. Hence it is not enough to test the backing_dev_info for cap_account_dirty. The current heuristic is as follows, a VMA is eligible when: - it is shared writeable (vm_flags & (VM_WRITE|VM_SHARED)) == (VM_WRITE|VM_SHARED) - it is not a 'special' mapping (vm_flags & (VM_PFNMAP|VM_INSERTPAGE)) == 0 - the backing_dev_info is cap_account_dirty mapping_cap_account_dirty(vma->vm_file->f_mapping) - f_op->mmap() didn't change the default page protection Pages from remap_pfn_range() are explicitly excluded because their COW semantics are already horrid enough (see vm_normal_page() in do_wp_page()) and because they don't have a backing store anyway. mprotect() is taught about the new behaviour as well. However it overrides the last condition. Cleaning the pages on write-back is done with page_mkclean(), a new rmap call. It can be called on any page, but is currently only implemented for mapped pages; if the page is found to be of a VMA that accounts dirty pages it will also wrprotect the PTE. Finally, in fs/buffer.c:try_to_free_buffers(); remove clear_page_dirty() from under ->private_lock. This seems to be safe, since ->private_lock is used to serialize access to the buffers, not the page itself. This is needed because clear_page_dirty() will call into page_mkclean() and would thereby violate locking order. [dhowells@redhat.com: Provide a page_mkclean() implementation for NOMMU] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
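The four-part eligibility heuristic in this message can be read as a single predicate. The following is a userspace sketch only: the flag values, the struct, and the two boolean fields are hypothetical stand-ins for vm_flags, mapping_cap_account_dirty() and the page-protection comparison, and do not reproduce the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag bits for illustration; not the kernel's values. */
#define SK_VM_WRITE      0x1ul
#define SK_VM_SHARED     0x2ul
#define SK_VM_PFNMAP     0x4ul
#define SK_VM_INSERTPAGE 0x8ul

/* Hypothetical, reduced view of a VMA. */
struct vma_sketch {
    unsigned long vm_flags;
    bool backing_accounts_dirty; /* stands in for mapping_cap_account_dirty() */
    bool mmap_changed_prot;      /* f_op->mmap() changed the default protection */
};

/* Returns true when all four conditions listed in the message hold. */
static bool vma_eligible_for_dirty_tracking(const struct vma_sketch *vma)
{
    const unsigned long shared_rw = SK_VM_WRITE | SK_VM_SHARED;

    assert(vma != NULL);
    if ((vma->vm_flags & shared_rw) != shared_rw)
        return false; /* not a shared writeable mapping */
    if (vma->vm_flags & (SK_VM_PFNMAP | SK_VM_INSERTPAGE))
        return false; /* 'special' mapping */
    if (!vma->backing_accounts_dirty)
        return false; /* backing device does not account dirty pages */
    if (vma->mmap_changed_prot)
        return false; /* driver altered the default page protection */
    return true;
}
```

Per the message, mprotect() would skip the last of these checks; the sketch does not model that case.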
2006-09-26[PATCH] jbd: fix commit of ordered data buffersJan Kara1-69/+113
The original commit code assumes that when a buffer on the BJ_SyncData list is locked, it is being written to disk. But this is not true and hence it can lead to a potential data loss on crash. Also the code did not account for the fact that journal_dirty_data() can steal buffers from the committing transaction and hence could write buffers that no longer belong to the committing transaction. Finally it could possibly happen that we tried writing out one buffer several times. The patch below tries to solve these problems by a complete rewrite of the data commit code. We go through buffers on t_sync_datalist, lock buffers needing write out and store them in an array. Buffers are also immediately refiled to BJ_Locked list or unfiled (if the write out is completed). When the array is full or we have to block on buffer lock, we submit all accumulated buffers for IO. [suitable for 2.6.18.x around the 2.6.19-rc2 timeframe] Signed-off-by: Jan Kara <jack@suse.cz> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
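The "collect into an array, submit when full" pattern described here can be illustrated with a toy model. This is a sketch under loud assumptions: the struct, the batch size, and the function names are all hypothetical, buffers are reduced to a dirty flag, and locking, refiling and the block-on-lock case are not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define SK_BATCH_MAX 4 /* hypothetical array capacity */

/* Toy accumulator: counts how many batches were submitted and how many
 * items they contained in total. */
struct write_batch {
    size_t count;       /* items currently accumulated */
    size_t submissions; /* number of non-empty batches submitted */
    size_t submitted;   /* total items submitted so far */
};

/* Submit everything accumulated so far; a no-op on an empty batch. */
static void batch_submit(struct write_batch *b)
{
    if (b->count == 0)
        return;
    b->submissions++;
    b->submitted += b->count;
    b->count = 0;
}

/* Add one item and flush as soon as the array is full. */
static void batch_add(struct write_batch *b)
{
    assert(b->count < SK_BATCH_MAX);
    b->count++;
    if (b->count == SK_BATCH_MAX)
        batch_submit(b);
}

/* Walk the list once, batching only the entries that need write-out,
 * then flush whatever remains. */
static void write_out_dirty(struct write_batch *b, const int *dirty, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (dirty[i])
            batch_add(b);
    }
    batch_submit(b);
}
```

Each entry is visited once and ends up in exactly one batch, which is the property the message contrasts with the old code's risk of writing a buffer several times.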
2006-09-26[PATCH] Check return value of copy_to_user in compat_sys_pselect7Andi Kleen1-2/+3
Fix linux/fs/compat.c: In function compat_sys_pselect7 linux/fs/compat.c:1869: warning: ignoring return value of copy_to_user, declared with attribute warn_unused_result To make it easier to handle I changed the semantics to not try to write out a timespec if an error occurred. I hope that's ok. Cc: dwmw2@infradead.org Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26[PATCH] i386/x86-64: Don't randomize stack top when no randomization personality is setAndi Kleen1-1/+2
Based on patch from Frank van Maarseveen <frankvm@frankvm.com>, but extended. Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-25sysfs: add proper sysfs_init() prototypeAndrew Morton1-9/+1
Don't be crufty. Mark it __must_check too. Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25sysfs_remove_bin_file: no return value, dump_stack on errorRandy.Dunlap3-9/+17
Make sysfs_remove_bin_file() void. If it detects an error, printk the file name and call dump_stack(). sysfs_hash_and_remove() now returns an error code indicating its success or failure so that sysfs_remove_bin_file() can know success/failure. Convert the only driver that checked the return value of sysfs_remove_bin_file(). Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25SYSFS: allow sysfs_create_link to create symlinks in the root of sysfsGreg Kroah-Hartman1-2/+12
This is needed to make the compatible link for /sys/block in the future. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25Debugfs: kernel-doc fixes for debugfsRandy Dunlap2-40/+31
Fix kernel-doc and typos/spellos in fs/debugfs/. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25sysfs: Make poll behaviour consistentJuha Yrjölä1-1/+1
When no events have been reported by sysfs_notify(), sd->s_events was previously set to zero. The initial value for new readers is also zero, so poll was blocking, regardless of whether the attribute was read by the process or not. Make poll behave consistently by setting the initial value of sd->s_events to non-zero. Signed-off-by: Juha Yrjola <juha.yrjola@solidboot.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25[PATCH] autofs4: zero timeout prevents shutdownIan Kent1-3/+3
If the timeout of an autofs mount is set to zero then umounts are disabled. This works fine, however the kernel module checks the expire timeout and goes no further if it is zero. This is not the right thing to do at shutdown as the module is passed an option to expire mounts regardless of their timeout setting. This patch allows autofs to honor the force expire option. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-24ocfs2: Teach ocfs2_drop_lock() to use ->set_lvb() callbackMark Fasheh1-31/+28
With this, we don't need to pass an additional struct with function pointer. Now that the callbacks are fully used, comment the remaining API. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Remove ->unblock lockres operationMark Fasheh1-146/+6
Have ocfs2_process_blocked_lock() call ocfs2_generic_unblock_lock(), which gets to be ocfs2_unblock_lock() now that it's the only possible unblock function. Remove the ->unblock() callback from the structure, and all lock type specific unblock functions. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: move downconvert worker to lockres opsMark Fasheh1-18/+32
This way lock types don't have to manually pass it to ocfs2_generic_unblock_lock(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Remove unused dlmglue functionsMark Fasheh1-103/+0
The meta data unblocking code no longer needs ocfs2_do_unblock_meta() or ocfs2_can_downconvert_meta_lock(), so remove them. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Have the metadata lock use generic dlmglue functionsMark Fasheh1-1/+31
Fill in the ->check_downconvert and ->set_lvb callbacks with meta data specific operations and switch ocfs2_unblock_meta() to call ocfs2_generic_unblock_lock() Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Add ->set_lvb callback in dlmglueMark Fasheh1-2/+29
This allows a lock type to set the value block before downconvert. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Add ->check_downconvert callback in dlmglueMark Fasheh1-1/+18
This will allow lock types to force a requeue of a lock downconvert. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Check for refreshing locks in generic unblock functionMark Fasheh1-12/+19
Tidy up the exit path a bit too. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: don't unconditionally pass LVB flagsMark Fasheh1-3/+15
Allow a lock type to specify whether it makes use of the LVB. The only type which does this right now is the meta data lock. This should save us some space on network messages since they won't have to needlessly transmit value blocks. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: combine inode and generic blocking AST functionsMark Fasheh1-112/+11
There is extremely little difference between the two now. We can remove the callback from ocfs2_lock_res_ops as well. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Add ->get_osb() dlmglue locking operationMark Fasheh1-0/+33
Will be used to find the ocfs2_super structure from a given lockres. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: remove ->unlock_ast() callback from ocfs2_lock_res_opsMark Fasheh1-13/+3
This was always defined to the same function in all locks, so clean things up by removing and passing ocfs2_unlock_ast() directly to the DLM. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: combine inode and generic AST functionsMark Fasheh1-110/+10
There is extremely little difference between the two now. We can remove the callback from ocfs2_lock_res_ops as well. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Clean up lock resource refresh flagsMark Fasheh1-14/+35
Use of the refresh mechanism is lock-type wide, so move knowledge of that to the ocfs2_lock_res_ops structure. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Remove i_generation from inode lock namesMark Fasheh10-53/+170
OCFS2 puts inode meta data in the "lock value block" provided by the DLM. Typically, i_generation is encoded in the lock name so that a deleted inode and a new one in the same block don't share the same lvb. Unfortunately, that scheme means that the read in ocfs2_read_locked_inode() is potentially thrown away as soon as the meta data lock is taken - we cannot encode the lock name without first knowing i_generation, which requires a disk read. This patch encodes i_generation in the inode meta data lvb, and removes the value from the inode meta data lock name. This way, the read can be covered by a lock, and at the same time we can distinguish between an up to date and a stale LVB. This will help cold-cache stat(2) performance in particular. Since this patch changes the protocol version, we take the opportunity to do a minor re-organization of two of the LVB fields. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Encode i_generation in the meta data lvbMark Fasheh2-7/+12
When i_generation is removed from the lockname, this will help us determine whether a meta data lvb has information that is in sync with the local struct inode. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Free up some space in the lvbMark Fasheh2-4/+6
lvb_version doesn't need to be a whole 32 bits. Make it an 8 bit field to free up some space. This should be backwards compatible until we use one of the fields, in which case we'd bump the lvb version anyway. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Remove special casing for inode creation in ocfs2_dentry_attach_lock()Mark Fasheh3-37/+14
We can't use LKM_LOCAL for new dentry locks because an unlink and subsequent re-create of a name/inode pair may result in the lock still being mastered somewhere in the cluster. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: manually d_move() during ocfs2_rename()Mark Fasheh2-2/+5
Make use of FS_RENAME_DOES_D_MOVE to avoid a race condition that can occur during ->rename() if we d_move() outside of the parent directory cluster locks, and another node discovers the new name (created during the rename) and unlinks it. d_move() will unconditionally rehash a dentry - which will leave stale data in the system. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24[PATCH] Allow file systems to manually d_move() inside of ->rename()Mark Fasheh3-10/+9
Some file systems want to manually d_move() the dentries involved in a rename. We can do this by making use of the FS_ODD_RENAME flag if we just have nfs_rename() unconditionally do the d_move(). While there, we rename the flag to be more descriptive. OCFS2 uses this to protect that part of the rename operation with a cluster lock. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-24ocfs2: Remove the dentry voteMark Fasheh2-183/+2
This is unused now. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Hook rest of the file system into dentry locking APIMark Fasheh4-41/+94
Actually replace the vote calls with the new dentry operations. Make any necessary adjustments to get the scheme to work. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Add dentry tracking APIMark Fasheh3-32/+369
Replace the dentry vote mechanism with a cluster lock which covers a set of dentries. This allows us to force d_delete() only on nodes which actually care about an unlink. Every node that does a ->lookup() gets a read-only lock on the dentry until an unlink, during which the unlinking node will request an exclusive lock, forcing the other nodes that care about that dentry to d_delete() it. The effect is that we retain a very lightweight ->d_revalidate(), and at the same time get to make large improvements to the average case performance of the ocfs2 unlink and rename operations. This patch adds the higher level API and the dentry manipulation code. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Add new cluster lock typeMark Fasheh5-104/+436
Replace the dentry vote mechanism with a cluster lock which covers a set of dentries. This allows us to force d_delete() only on nodes which actually care about an unlink. Every node that does a ->lookup() gets a read-only lock on the dentry until an unlink, during which the unlinking node will request an exclusive lock, forcing the other nodes that care about that dentry to d_delete() it. The effect is that we retain a very lightweight ->d_revalidate(), and at the same time get to make large improvements to the average case performance of the ocfs2 unlink and rename operations. This patch adds the cluster lock type which OCFS2 can attach to dentries. A small number of fs/ocfs2/dcache.c functions are stubbed out so that this change can compile. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Update dlmglue for new dlmlock() APIMark Fasheh1-0/+3
File system lock names are very regular right now, so we really only need to pass an extra parameter to dlmlock(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Update dlmfs for new dlmlock() APIMark Fasheh2-51/+31
We just need to add a namelen field to the user_lock_res structure, and update a few debug prints. Instead of updating all debug prints, I took the opportunity to remove a few that are likely unnecessary these days. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24ocfs2: Allow binary names in the DLMMark Fasheh5-8/+11
The OCFS2 DLM uses strlen() to determine lock name length, which excludes the possibility of putting binary values in the name string. Fix this by requiring that string length be passed in as a parameter. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
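Why strlen() rules out binary lock names can be shown in a few lines. This is a userspace sketch with hypothetical helper names, not the DLM's own comparison code: two names that differ only after an embedded zero byte look identical to string functions, but not to a length-aware comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compares lock names by explicit length, so bytes
 * after an embedded NUL take part in the comparison. */
static int lock_name_equal(const char *a, size_t a_len,
                           const char *b, size_t b_len)
{
    assert(a != NULL && b != NULL);
    return a_len == b_len && memcmp(a, b, a_len) == 0;
}

/* What a string-based comparison sees: only the bytes before the first NUL. */
static int lock_name_equal_str(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

For example, the 4-byte names { 'M', 0, 0, 1 } and { 'M', 0, 0, 2 } compare equal under lock_name_equal_str() but differ under lock_name_equal(), which is the case that passing the length as a parameter addresses.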
2006-09-24ocfs2: Silence dlm error printMark Fasheh1-3/+3
An AST can be delivered via the network after a lock has been removed, so no need to print an error when we see that. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24Move several *_SUPER_MAGIC symbols to include/linux/magic.h.Jeff Garzik9-7/+5
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-09-22NFS: unmark NFS direct I/O as experimentalChuck Lever1-2/+2
Remove the EXPERIMENTAL flag from the NFS_DIRECTIO option. Test plan: Unset the EXPERIMENTAL kernel build option and check to see that the NFS direct I/O option is still available. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFS: add comments clarifying the use of nfs_post_op_update()Chuck Lever2-1/+13
Comments-only change to clarify a detail of the NFS protocol and how it is implemented in Linux. Test plan: None. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFS: Use SEEK_END instead of hardcoded valueJosef 'Jeff' Sipek1-1/+1
Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFSv4: When mounting with a port=0 argument, substitute port=2049Trond Myklebust1-0/+3
RFC3530 states that the registered port 2049 for the NFS protocol should be the default configuration in order to allow clients not to use the RPC binding protocols. If the mount program sends us a port=0, we therefore substitute port=2049. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
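The substitution amounts to a one-line default. A minimal sketch, assuming a hypothetical helper name and constant (only the value 2049 comes from the message):

```c
#include <stdint.h>

#define SK_NFS_DEFAULT_PORT 2049 /* registered NFS port named in the message */

/* Hypothetical helper: treat port 0 from the mount program as
 * "use the registered default", and pass any other port through. */
static uint16_t nfs4_effective_port(uint16_t requested)
{
    return requested == 0 ? SK_NFS_DEFAULT_PORT : requested;
}
```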
2006-09-22NFSv4: Poll more aggressively when handling NFS4ERR_DELAYTrond Myklebust1-1/+1
Change the initial retry delay from 1s to 0.1s (and then back off exponentially). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
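The retry schedule can be sketched as a function that produces the next delay from the current one. The 100 ms starting point is from the message; the function name, the millisecond units, and especially the 15 s upper bound are assumptions made for illustration.

```c
#define SK_RETRY_MIN_MS 100u   /* 0.1s initial delay, per the message */
#define SK_RETRY_MAX_MS 15000u /* assumed cap; not stated in the message */

/* Hypothetical helper: returns the delay to use for the next retry.
 * A current delay of 0 means "no retry yet", so start at the minimum;
 * afterwards double the delay, saturating at the cap. */
static unsigned int next_retry_delay_ms(unsigned int current_ms)
{
    if (current_ms == 0)
        return SK_RETRY_MIN_MS;
    if (current_ms >= SK_RETRY_MAX_MS / 2)
        return SK_RETRY_MAX_MS;
    return current_ms * 2;
}
```

Starting from 0, successive calls give 100, 200, 400, 800 ms and so on until the cap is reached, so a short-lived NFS4ERR_DELAY is retried quickly while a long one is not polled aggressively.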
2006-09-22NFSv4: Handle the condition NFS4ERR_FILE_OPENTrond Myklebust1-0/+1
Retry a few times before we give up: the error is usually due to ordering issues with asynchronous RPC calls. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFSv4: Retry lease recovery if it failed during a synchronous operation.Trond Myklebust1-2/+9
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFS: Don't invalidate the symlink we just stuffed into the cacheTrond Myklebust1-7/+5
And slight optimisation of nfs_end_data_update(): directories never have delegations anyway. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFS: Make read() return an ESTALE if the file has been deletedTrond Myklebust1-3/+16
Currently, a read() request will return EIO even if the file has been deleted on the server, simply because that is what the VM will return if the call to readpage() fails to update the page. Ensure that readpage() marks the inode as stale if it receives an ESTALE. Then return that error to userland. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFSv4: It's perfectly legal for clp to be NULL here....J. Bruce Fields1-1/+1
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFS: nfs_lookup - don't hash dentry when optimising away the lookupTrond Myklebust1-3/+11
If the open intents tell us that a given lookup is going to result in an exclusive create, we currently optimize away the lookup call itself. The reason is that the lookup would not be atomic with the create RPC call, so why do it in the first place? A problem occurs, however, if the VFS aborts the exclusive create operation after the lookup, but before the call to create the file/directory: in this case we will end up with a hashed negative dentry in the dcache that has never been looked up. Fix this by only actually hashing the dentry once the create operation has been successfully completed. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22Fix a referral error Oopsandros@citi.umich.edu1-0/+2
Fix an oops when the referral server is not responding. Check the error return from nfs4_set_client() in nfs4_create_referral_server. Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFS: NFS_ROOT should use the new rpc_create APIChuck Lever1-16/+13
Teach NFS_ROOT to use the new rpc_create API instead of the old two-call API for creating an RPC transport. Test plan: Compile the kernel with the NFS client built in, and set CONFIG_NFS_ROOT. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFS: Fix up compiler warnings on 64-bit platforms in client.cDavid Howells1-6/+14
Fix up warnings from compiling on ppc64. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22SUNRPC: Make rpc_mkpipe() take the parent dentry as an argumentTrond Myklebust1-5/+1
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFSv4: Fix a use-after-free issue with the nfs server.Trond Myklebust3-18/+27
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22Add a real API for dealing with blk_congestion_wait()Trond Myklebust1-0/+1
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFS: Use cached page as buffer for NFS symlink requestsChuck Lever7-35/+49
Now that we have a copy of the symlink path in the page cache, we can pass a struct page down to the XDR routines instead of a string buffer. Test plan: Connectathon, all NFS versions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFS: copy symlinks into page cache before sending NFS SYMLINK requestChuck Lever1-18/+68
Currently the NFS client does not cache symlinks it creates. They get cached only when the NFS client reads them back from the server. Copy the symlink into the page cache before sending it. Test plan: Connectathon, all NFS versions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFS: Fix double d_drop in nfs_instantiate() error pathChuck Lever4-45/+57
If the LOOKUP or GETATTR in nfs_instantiate fails, nfs_instantiate will do a d_drop before returning. But some callers already do a d_drop in the case of an error return. Make certain we do only one d_drop in all error paths. This issue was introduced because over time, the symlink proc API diverged slightly from the create/mkdir/mknod proc API. To prevent other coding mistakes of this type, change the symlink proc API to be more like create/mkdir/mknod and move the nfs_instantiate call into the symlink proc routines so it is used in exactly the same way for create, mkdir, mknod, and symlink. Test plan: Connectathon, all versions of NFS. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFS: remove a no-longer-needed error check in nfs_symlink()Chuck Lever1-6/+2
In the early days of NFS, there was no duplicate reply cache on the server. Thus retransmitted non-idempotent requests often found that the request had already completed on the server. To avoid passing an unanticipated return code to unsuspecting applications, NFS clients would often shunt error codes that implied the request had been retried but already completed. Thanks to NFS over TCP, duplicate reply caches on the server, and network performance and reliability improvements, it is safe to remove such checks. Test plan: None. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFSD: Convert NFS server callback logic to use new rpc_create APIChuck Lever1-39/+27
Replace xprt_create_proto/rpc_create_client call in NFS server callback functions to use new rpc_create() API. Test plan: NFSv4 delegation functionality tests. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22NFS: Convert NFS client to use new rpc_create() APIChuck Lever1-16/+11
Convert NFS client mount logic to use rpc_create() instead of the old xprt_create_proto/rpc_create_client API. Test plan: Mount stress tests. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22LOCKD: Convert to use new rpc_create() APIChuck Lever2-47/+44
Replace xprt_create_proto/rpc_create_client with new rpc_create() interface in the Network Lock Manager. Note that the semantics of NLM transports is now "hard" instead of "soft" to provide a better guarantee that lock requests will get to the server. Test plan: Repeated runs of Connectathon locking suite. Check network trace to ensure NLM requests are working correctly. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22SUNRPC: Clean-up after previous patches.Chuck Lever1-1/+0
Remove some unused macros related to accessing an RPC peer address. Test plan: Compile kernel with CONFIG_NFS option enabled. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>