path: root/fs
Age | Commit message | Author | Files | Lines
2006-09-19 | Merge branch 'fixes' of git://git.linux-nfs.org/pub/linux/nfs-2.6 | Linus Torvalds | 3 | -7/+9
* 'fixes' of git://git.linux-nfs.org/pub/linux/nfs-2.6:
  NFS: Fix nfs_page use after free issues in fs/nfs/write.c
  NFSv4: Fix incorrect semaphore release in _nfs4_do_open()
  NFS: Fix Oopsable condition in nfs_readpage_sync()
2006-09-19 | NFS: Fix nfs_page use after free issues in fs/nfs/write.c | Trond Myklebust | 1 | -2/+2
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-19 | NFSv4: Fix incorrect semaphore release in _nfs4_do_open() | Trond Myklebust | 1 | -3/+3
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-19 | NFS: Fix Oopsable condition in nfs_readpage_sync() | Trond Myklebust | 1 | -2/+4
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-19 | Merge git://git.infradead.org/mtd-2.6 | Linus Torvalds | 3 | -3/+5
* git://git.infradead.org/mtd-2.6:
  [MTD] Use SEEK_{SET,CUR,END} instead of hardcoded values in mtdchar lseek()
  MTD: Fix bug in fixup_convert_atmel_pri
  [JFFS2][SUMMARY] Fix a summary collecting bug.
  [PATCH] [MTD] DEVICES: Fill more device IDs in the structure of m25p80
  MTD: Add lock/unlock operations for Atmel AT49BV6416
  MTD: Convert Atmel PRI information to AMD format
  fs/jffs2/xattr.c: remove dead code
  [PATCH] [MTD] Maps: Add dependency on alternate probe methods to physmap
  [PATCH] MTD: Add Macronix MX29F040 to JEDEC
  [MTD] Fixes of performance and stability issues in CFI driver.
  block2mtd.c: Make kernel boot command line arguments work (try 4)
  [MTD NAND] Fix lookup error in nand_get_flash_type()
  remove #error on !PCI from pmc551.c
  MTD: [NAND] Fix the sharpsl driver after breakage from a core conversion
  [MTD] NAND: OOB buffer offset fixups
  make fs/jffs2/nodelist.c:jffs2_obsolete_node_frag() static
  [PATCH] [MTD] NAND: fix dead URL in Kconfig
2006-09-19 | [PATCH] EXT2: Remove superblock lock contention in ext2_statfs | Dave Kleikamp | 3 | -4/+0
Fix a performance degradation introduced in 2.6.17. (30% degradation running dbench with 16 threads) Commit 21730eed11de42f22afcbd43f450a1872a0b5ea1, which claims to make EXT2_DEBUG work again, moves the taking of the kernel lock out of debug-only code in ext2_count_free_inodes and ext2_count_free_blocks and into ext2_statfs. The same problem was fixed in ext3 by removing the lock completely (commit 5b11687924e40790deb0d5f959247ade82196665) Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-16 | [PATCH] JFFS2: SUMMARY: fix a summary collecting bug | Zoltan Sogor | 1 | -0/+5
In some special cases (padding because of sync or umount), the summary information may not fit at the end of the erase block. In these cases, summary collection is disabled for that erase block. The problem was that jffs2_sum_add_kvec() did not respect this. This patch fixes the bug. Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu> Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-16 | [PATCH] ext3 sequential read regression fix | Suparna Bhattacharya | 1 | -1/+1
ext3-get-blocks support caused a ~20% degradation in sequential read performance (tiobench). The problem is with marking the buffer boundary so that IO can be submitted right away. Here is the patch to fix it.
2.6.18-rc6:
-----------
# ./iotest
1048576+0 records in
1048576+0 records out
4294967296 bytes (4.3 GB) copied, 75.2726 seconds, 57.1 MB/s
real 1m15.285s
user 0m0.276s
sys 0m3.884s
2.6.18-rc6 + fix:
-----------------
[root@elm3a241 ~]# ./iotest
1048576+0 records in
1048576+0 records out
4294967296 bytes (4.3 GB) copied, 62.9356 seconds, 68.2 MB/s
The boundary block check in ext3_get_blocks_handle needs to be adjusted against the count of blocks mapped in this call, now that it can map more than one block.
Signed-off-by: Suparna Bhattacharya <suparna@in.ibm.com> Tested-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-16 | [PATCH] knfsd: Make ext3 reject filehandles referring to invalid inode number | NeilBrown | 1 | -0/+42
Inodes earlier than the 'first' inode (e.g. journal, resize) should be rejected early - except the root inode. Also inode numbers that are too big should be rejected early. [akpm@osdl.org: cleanup] Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
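The early rejection described above can be sketched in user-space C roughly as follows. This is an illustration only, not the code the patch adds: the function name is hypothetical, `first_ino` and `inodes_count` stand in for values that would come from the superblock, and the root inode number 2 follows the usual ext2/ext3 convention.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ROOT_INO 2u  /* conventional ext2/ext3 root inode number */

/*
 * Return true if an inode number taken from a file handle is worth
 * looking up. first_ino and inodes_count are stand-ins for superblock
 * fields; the name and signature are hypothetical.
 */
static bool sketch_fh_ino_valid(uint32_t ino, uint32_t first_ino,
                                uint32_t inodes_count)
{
    if (ino > inodes_count)
        return false;   /* beyond the end of the inode table */
    if (ino < first_ino && ino != SKETCH_ROOT_INO)
        return false;   /* reserved inode (journal, resize, ...) */
    return true;
}
```

Doing this check before any inode lookup means a stale or forged file handle is turned away without touching on-disk inode structures.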
2006-09-16 | [PATCH] knfsd: Have ext2 reject file handles with bad inode numbers early | NeilBrown | 1 | -0/+39
This prevents bad inode numbers from triggering errors in ext2_get_inode. [akpm@osdl.org: speedup, cleanup] Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-16 | [JFFS2][SUMMARY] Fix a summary collecting bug. | Havasi Ferenc | 1 | -0/+5
In some special cases (padding because of sync or umount), the summary information may not fit at the end of the erase block. In these cases, summary collection is disabled for that erase block. The problem was that jffs2_sum_add_kvec() did not respect this. This patch fixes the bug. From: Zoltan Sogor <weth@inf.u-szeged.hu> Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-09-12 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 | Linus Torvalds | 1 | -4/+7
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] Fix CIFS readdir access denied when SE Linux enabled
2006-09-12 | Merge git://oss.sgi.com:8090/xfs/xfs-2.6 | Linus Torvalds | 7 | -42/+76
* git://oss.sgi.com:8090/xfs/xfs-2.6:
  [XFS] Fix a bad pointer dereference in the quota statvfs handling.
  [XFS] Fix xfs_splice_write() so appended data gets to disk.
  [XFS] Fix ABBA deadlock between i_mutex and iolock. Avoid calling
  [XFS] Prevent free space oversubscription and xfssyncd looping.
2006-09-08 | [PATCH] NFS: large non-page-aligned direct I/O clobbers memory | Trond Myklebust | 3 | -69/+42
The logic in nfs_direct_read_schedule and nfs_direct_write_schedule can allow data->npages to be one larger than rpages. This causes a page pointer to be written beyond the end of the pagevec in nfs_read_data (or nfs_write_data). Fix this by making nfs_(read|write)_alloc() calculate the size of the pagevec array, and initialise data->npages. Also get rid of the redundant argument to nfs_commit_alloc(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
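One way such an off-by-one can arise is that a transfer starting part-way into a page touches one more page than its length alone suggests. The sketch below shows the usual round-up arithmetic for sizing a page vector; the page size, names and signature are assumptions for illustration and are not taken from the NFS code.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for the example */

/*
 * Number of pages touched by len bytes that start pgbase bytes into
 * the first page (pgbase < SKETCH_PAGE_SIZE). Hypothetical helper.
 */
static size_t sketch_page_count(size_t pgbase, size_t len)
{
    return (pgbase + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

For example, 8192 bytes starting at offset 0 span two pages, but the same 8192 bytes starting at offset 1 span three, so an array sized from the length alone would be one slot short.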
2006-09-08 | [PATCH] ext3_getblk() should handle HOLE correctly | Badari Pulavarty | 1 | -4/+7
It has been reported that ext3_getblk() is not doing the right thing and is triggering the following WARN():
BUG: warning at fs/ext3/inode.c:1016/ext3_getblk()
 <c01c5140> ext3_getblk+0x98/0x2a6
 <c03b2806> md_wakeup_thread+0x26/0x2a
 <c01c536d> ext3_bread+0x1f/0x88
 <c01cedf9> ext3_quota_read+0x136/0x1ae
 <c018b683> v1_read_dqblk+0x61/0xac
 <c0188f32> dquot_acquire+0xf6/0x107
 <c01ceaba> ext3_acquire_dquot+0x46/0x68
 <c01897d4> dqget+0x155/0x1e7
 <c018a97b> dquot_transfer+0x3e0/0x3e9
 <c016fe52> dput+0x23/0x13e
 <c01c7986> ext3_setattr+0xc3/0x240
 <c0120f66> current_fs_time+0x52/0x6a
 <c017320e> notify_change+0x2bd/0x30d
 <c0159246> chown_common+0x9c/0xc5
 <c02a222c> strncpy_from_user+0x3b/0x68
 <c0167fe6> do_path_lookup+0xdf/0x266
 <c016841b> __user_walk_fd+0x44/0x5a
 <c01592b9> sys_chown+0x4a/0x55
 <c015a43c> vfs_write+0xe7/0x13c
 <c01695d4> sys_mkdir+0x1f/0x23
 <c0102a97> syscall_call+0x7/0xb
Looking at the code, it looks like it does not handle HOLE correctly. It ends up returning -EIO. Here is the patch to fix it. If we really want to be paranoid, we can allow return values 0 (HOLE), 1 (we asked for one block) and return -EIO for more than 1 block. But I really don't see a reason for doing it - all we need is the block# here (it doesn't matter how many blocks are mapped). ext3_get_blocks_handle() returns the number of blocks it mapped. It returns 0 in case of HOLE. ext3_getblk() should handle HOLE properly (currently it's dumping a warning stack and returning -EIO).
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-07 | [XFS] Fix a bad pointer dereference in the quota statvfs handling. | Nathan Scott | 1 | -1/+1
SGI-PV: 955993 SGI-Modid: xfs-linux-melb:xfs-kern:26934a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: David Chatterton <chatz@sgi.com>
2006-09-07 | [XFS] Fix xfs_splice_write() so appended data gets to disk. | David Chinner | 1 | -0/+16
xfs_splice_write() failed to update the on disk inode size when extending the file, so when the file was closed the range extended by splice was truncated off. Hence any region of a file written to by splice would end up as a hole full of zeros. SGI-PV: 955939 SGI-Modid: xfs-linux-melb:xfs-kern:26920a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: David Chatterton <chatz@sgi.com>
2006-09-07 | [XFS] Fix ABBA deadlock between i_mutex and iolock. Avoid calling | Lachlan McIlroy | 2 | -10/+19
__blockdev_direct_IO for the DIO_OWN_LOCKING case for direct I/O reads since it drops and reacquires the i_mutex while holding the iolock and this violates the locking order. SGI-PV: 955696 SGI-Modid: xfs-linux-melb:xfs-kern:26898a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chatterton <chatz@sgi.com>
2006-09-07 | [XFS] Prevent free space oversubscription and xfssyncd looping. | David Chinner | 4 | -31/+40
The fix for recent ENOSPC deadlocks introduced certain limitations on allocations. The fix could cause xfssyncd to loop endlessly if we did not leave some space free for the allocator to work correctly. Basically, we needed to ensure that we had at least 4 blocks free for an AG free list and a block for the inode bmap btree at all times. However, this did not take into account the fact that each AG has a free list that needs 4 blocks. Hence any filesystem with more than one AG could cause oversubscription of free space and make xfssyncd spin forever trying to allocate space needed for AG freelists that was not available in the AG. The following patch reserves space for the free lists in all AGs plus the inode bmap btree which prevents oversubscription. It also prevents those blocks from being reported as free space (as they can never be used) and makes the SMP in-core superblock accounting code and the reserved block ioctl respect this requirement. SGI-PV: 955674 SGI-Modid: xfs-linux-melb:xfs-kern:26894a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: David Chatterton <chatz@sgi.com>
2006-09-06 | [CIFS] Fix CIFS readdir access denied when SE Linux enabled | Steve French | 1 | -4/+7
CIFS had one path in which dentry was instantiated before the corresponding inode metadata was filled in. Fixes Redhat bugzilla bug #163493 Signed-off-by: Steve French <sfrench@us.ibm.com> Acked-by: Eric Paris <eparis@redhat.com> Acked-by: Dave Kleikamp <shaggy@austin.ibm.com>
2006-09-06 | [PATCH] add missing desctiption in super.c | Henrik Kretzschmar | 1 | -0/+1
Adds kernel-doc for alloc_super() type in fs/super.c. Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-01 | [PATCH] manage-jbd-its-own-slab fix | Badari Pulavarty | 1 | -1/+1
Missed a place where I forgot to convert kfree() to kmem_cache_free() as part of jbd-manage-its-own-slab changes. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 | Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 | David Woodhouse | 112 | -1294/+3018
2006-08-30 | [XFS] Fix char size overflow in bmap_alloc call for unwritten extent | Adrian Bunk | 1 | -1/+1
conversion. Since bma.conv is a char and XFS_BMAPI_CONVERT is 0x1000, bma.conv was always assigned zero. Spotted by the GNU C compiler (SVN version). SGI-PV: 947312 SGI-Modid: xfs-linux-melb:xfs-kern:26887a Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Nathan Scott <nathans@sgi.com>
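The truncation described above is easy to reproduce in isolation. The sketch below uses the flag value 0x1000 quoted in the message; the function names are hypothetical, `unsigned char` is used instead of plain `char` so that the narrowing is well defined in standard C, and the "normalised" variant is only one possible way to fix such a bug, not necessarily what the patch does.

```c
#define SKETCH_CONVERT_FLAG 0x1000  /* value quoted for XFS_BMAPI_CONVERT */

/* Buggy pattern: narrowing keeps only the low 8 bits, which are all 0. */
static unsigned char sketch_conv_truncated(int flags)
{
    return (unsigned char)(flags & SKETCH_CONVERT_FLAG);
}

/* One possible fix: reduce the test to 0 or 1 before narrowing. */
static unsigned char sketch_conv_normalised(int flags)
{
    return (flags & SKETCH_CONVERT_FLAG) != 0;
}
```

With the flag set, the first function still returns 0 while the second returns 1, which is the difference between the conversion never being requested and being requested correctly.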
2006-08-29 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 | Linus Torvalds | 17 | -260/+576
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] Do not send Query All EAs SMB when mount option nouser_xattr
  [CIFS] endian errors in lanman protocol support
  [CIFS] Fix oops in cifs_close due to unitialized lock sem and list in
  [CIFS] Fix oops when negotiating lanman and no password specified
  [CIFS]
  [CIFS] Allow cifsd to suspend if connection is lost
  [CIFS] Make midState usage more consistent
  [CIFS] spinlock protect read of last srv response time in timeout path
  [CIFS] Do not time out posix brl requests when using new posix setfileinfo
2006-08-27 | [PATCH] /proc/meminfo: don't put spaces in names | Andrew Morton | 1 | -1/+1
None of the other /proc/meminfo lines have a space in the identifier. This post-2.6.17 addition has the potential to break existing parsers, so use an underscore instead (like Committed_AS). Cc: Christoph Lameter <clameter@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 | [PATCH] fix up lockdep trace in fs/exec.c | Dave Jones | 1 | -1/+1
This fixes the locking error noticed by lockdep:
=============================================
[ INFO: possible recursive locking detected ]
---------------------------------------------
init/1 is trying to acquire lock:
 (&sighand->siglock){....}, at: [<c047a78a>] flush_old_exec+0x3ae/0x859
but task is already holding lock:
 (&sighand->siglock){....}, at: [<c047a77a>] flush_old_exec+0x39e/0x859
other info that might help us debug this:
2 locks held by init/1:
 #0: (tasklist_lock){..--}, at: [<c047a76a>] flush_old_exec+0x38e/0x859
 #1: (&sighand->siglock){....}, at: [<c047a77a>] flush_old_exec+0x39e/0x859
stack backtrace:
 [<c04051e1>] show_trace_log_lvl+0x54/0xfd
 [<c040579d>] show_trace+0xd/0x10
 [<c04058b6>] dump_stack+0x19/0x1b
 [<c043b33a>] __lock_acquire+0x773/0x997
 [<c043bacf>] lock_acquire+0x4b/0x6c
 [<c060630b>] _spin_lock+0x19/0x28
 [<c047a78a>] flush_old_exec+0x3ae/0x859
 [<c0498053>] load_elf_binary+0x4aa/0x1628
 [<c0479cab>] search_binary_handler+0xa7/0x24e
 [<c047b577>] do_execve+0x15b/0x1f9
 [<c04022b4>] sys_execve+0x29/0x4d
 [<c0403faf>] syscall_call+0x7/0xb
Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 | [PATCH] lockdep: annotate reiserfs | Ingo Molnar | 1 | -1/+1
reiserfs seems to have another locking level layer for the i_mutex due to the xattrs-are-a-directory thing. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 | [PATCH] Manage jbd allocations from its own slabs | Badari Pulavarty | 3 | -13/+94
JBD currently allocates commit and frozen buffers from slabs. With CONFIG_SLAB_DEBUG, it's possible for an allocation to cross the page boundary, causing IO problems. https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=200127 So, instead of allocating these from regular slabs, manage allocation from its own slabs and disable slab debug for these slabs. [akpm@osdl.org: cleanups] Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 | [PATCH] eventpoll.c compile fix | Masoud Asgharifard Sharbiani | 1 | -2/+2
Fix two compile failures in eventpoll.c code which would happen if DEBUG_EPOLL is bigger than zero. Signed-off-by: Masoud Sharbiani <masouds@google.com> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 | [PATCH] ufs: truncate correction | Evgeniy Dushistov | 1 | -52/+25
1) When we allocate the last fragment in ufs_truncate, we read the page, check whether the block is mapped to an address, and if not, try to allocate it. This is wrong: a fragment may be mapped but NOT allocated. This happens because the "block map" function does not check whether the fragment is allocated; it just takes the address of the first fragment in the block, adds the offset of the fragment and returns the result. This is correct in almost all situations except the call from ufs_truncate.
2) Almost all UFS implementations that I could investigate have this "defect": if the disk is full and you try to truncate a file, for example from 3GB to 2MB, and there is a hole in this region, truncate returns -ENOSPC. I tried to avoid this problem, but the "block allocation" algorithm is tied to the right value of i_lastfrag, and fixing this corner case may slow down ordinary scenarios, so this patch makes the behavior of "truncate" operations similar to what other UFS implementations do.
Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 | [PATCH] ufs: write to hole in big file | Evgeniy Dushistov | 1 | -14/+21
On UFS, this scenario:
  open(O_TRUNC)
  lseek(1024 * 1024 * 80)
  write("A")
  lseek(1024 * 2)
  write("A")
may cause access to an invalid address. This happens because "goal" is calculated in the wrong way in the block allocation path; as far as I can see, this problem also exists in 2.4. We use a construction like i_data[lastfrag], where i_data is an array of pointers to direct blocks, indirect blocks and so on. It has a certain size (~20 elements), while lastfrag may have a value of, for example, 40000. This patch also fixes issues related to handling this scenario: wrong zeroing of metadata in the case of block (not fragment) allocation, and wrong goal calculation when we allocate a block.
Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 | [PATCH] ext3 filesystem bogus ENOSPC with reservation fix | Mingming Cao | 1 | -3/+3
To handle the earlier bogus ENOSPC error caused by filesystem full of block reservation, current code falls back to non block reservation, starts to allocate block(s) from the goal allocation block group as if there is no block reservation. Current code needs to re-load the corresponding block group descriptor for the initial goal block group in this case. The patch fixes this. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 | [PATCH] ext2: prevent div-by-zero on corrupted fs | Andries Brouwer | 1 | -1/+1
Mounting an ext2 filesystem with zero s_inodes_per_group will cause a divide error. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
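The general defence against this class of bug is to validate any on-disk value before using it as a divisor. The following is a minimal user-space sketch under that assumption; the helper name and return convention are hypothetical and do not mirror the ext2 mount code.

```c
#include <stdint.h>

/*
 * Map an inode number (>= 1) to its block group, refusing to divide
 * when the superblock field is zero. Returns -1 for a corrupted value.
 * Hypothetical helper for illustration only.
 */
static int64_t sketch_group_of_inode(uint32_t ino, uint32_t inodes_per_group)
{
    if (inodes_per_group == 0)
        return -1;      /* corrupted superblock: refuse, do not divide */
    return (int64_t)((ino - 1) / inodes_per_group);
}
```

In a real filesystem the check would more naturally sit at mount time, so that a bad superblock is rejected once rather than at every use.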
2006-08-27 | [PATCH] Fix for minix crash | Andries Brouwer | 1 | -3/+10
Mounting a (corrupt) minix filesystem with zero s_zmap_blocks gives a spectacular crash on my 2.6.17.8 system, no doubt because minix/inode.c does an unconditional minix_set_bit(0,sbi->s_zmap[0]->b_data); [akpm@osdl.org: make labels consistent while we're there] Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 | [PATCH] lockdep: fix blkdev_open() warning | Peter Zijlstra | 1 | -58/+56
On Wed, 2006-08-09 at 07:57 +0200, Rolf Eike Beer wrote:
> =============================================
> [ INFO: possible recursive locking detected ]
> ---------------------------------------------
> parted/7929 is trying to acquire lock:
> (&bdev->bd_mutex){--..}, at: [<c105eb8d>] __blkdev_put+0x1e/0x13c
>
> but task is already holding lock:
> (&bdev->bd_mutex){--..}, at: [<c105eec6>] do_open+0x72/0x3a8
>
> other info that might help us debug this:
> 1 lock held by parted/7929:
> #0: (&bdev->bd_mutex){--..}, at: [<c105eec6>] do_open+0x72/0x3a8
> stack backtrace:
> [<c1003aad>] show_trace_log_lvl+0x58/0x15b
> [<c100495f>] show_trace+0xd/0x10
> [<c1004979>] dump_stack+0x17/0x1a
> [<c102dee5>] __lock_acquire+0x753/0x99c
> [<c102e3b0>] lock_acquire+0x4a/0x6a
> [<c1204501>] mutex_lock_nested+0xc8/0x20c
> [<c105eb8d>] __blkdev_put+0x1e/0x13c
> [<c105ecc4>] blkdev_put+0xa/0xc
> [<c105f18a>] do_open+0x336/0x3a8
> [<c105f21b>] blkdev_open+0x1f/0x4c
> [<c1057b40>] __dentry_open+0xc7/0x1aa
> [<c1057c91>] nameidata_to_filp+0x1c/0x2e
> [<c1057cd1>] do_filp_open+0x2e/0x35
> [<c1057dd7>] do_sys_open+0x38/0x68
> [<c1057e33>] sys_open+0x16/0x18
> [<c1002845>] sysenter_past_esp+0x56/0x8d
OK, I'm having a look here; it's all new to me so bear with me.
blkdev_open() calls do_open(bdev, ...,BD_MUTEX_NORMAL) and takes mutex_lock_nested(&bdev->bd_mutex, BD_MUTEX_NORMAL)
then something fails, and we're thrown to:
out_first:
where
if (bdev != bdev->bd_contains) blkdev_put(bdev->bd_contains)
which is __blkdev_put(bdev->bd_contains, BD_MUTEX_NORMAL)
which does mutex_lock_nested(&bdev->bd_contains->bd_mutex, BD_MUTEX_NORMAL) <--- lockdep trigger
When going to out_first, bdev->bd_contains is either bdev or whole, and since we take the branch it must be whole.
So it seems to me the following patch would be the right one:
[akpm@osdl.org: compile fix]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arjan van de Ven <arjan@linux.intel.com> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-26 | [DISKLABEL] SUN: Fix signed int usage for sector count | Jeff Mahoney | 1 | -1/+1
The current sun disklabel code uses a signed int for the sector count. When partitions larger than 1 TB are used, the cast to a sector_t causes the partition sizes to be invalid:
# cat /proc/partitions | grep sdan
  66 112 2146435072 sdan
  66 115 9223372036853660736 sdan3
  66 120 9223372036853660736 sdan8
This patch switches the sector count to an unsigned int to fix this.
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
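The huge sizes in the listing are what sign extension produces when a negative 32-bit count is widened to a 64-bit type. The sketch below illustrates that effect only; `sketch_sector_t` is a stand-in for a 64-bit sector_t, and the sample value -2230144 is an inference, chosen because halving the widened result reproduces the figure in the listing if /proc/partitions is taken to report 1 KiB blocks (two 512-byte sectors each).

```c
#include <stdint.h>

typedef uint64_t sketch_sector_t;   /* stand-in for a 64-bit sector_t */

/* Buggy pattern: a negative int is sign-extended when widened. */
static sketch_sector_t sketch_widen_signed(int32_t nsect)
{
    return (sketch_sector_t)nsect;
}

/* Fixed pattern: an unsigned count is zero-extended. */
static sketch_sector_t sketch_widen_unsigned(uint32_t nsect)
{
    return (sketch_sector_t)nsect;
}
```

The same 32 bits read as unsigned give 4292737152 sectors, roughly 2 TB, which is a plausible partition size on the disk shown.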
2006-08-24 | Merge branch 'fixes' of git://git.linux-nfs.org/pub/linux/nfs-2.6 | Greg Kroah-Hartman | 8 | -39/+80
2006-08-24 | VFS: Remove redundant open-coded mode bit checks in open_exec(). | Trond Myklebust | 1 | -2/+0
The check in open_exec() for inode->i_mode & 0111 has been made redundant by the fix to permission(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 1d3741c5d991686699f100b65b9956f7ee7ae0ae commit)
2006-08-24 | VFS: Remove redundant open-coded mode bit check in prepare_binfmt(). | Trond Myklebust | 1 | -6/+0
The check in prepare_binfmt() for inode->i_mode & 0111 is redundant, since open_exec() will already have done that. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 822dec482ced07af32c378cd936d77345786572b commit)
2006-08-24 | VFS: Fix access("file", X_OK) in the presence of ACLs | Trond Myklebust | 1 | -1/+8
Currently, the access() call will return incorrect information on NFS if there exists an ACL that grants execute access to the user on a regular file. The reason the information is incorrect is that the VFS overrides this execute access in open_exec() by checking (inode->i_mode & 0111). This patch propagates the VFS execute bit check back into the generic permission() call. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 64cbae98848c4c99851cb0a405f0b4982cd76c1e commit)
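As a rough illustration of the rule being moved into the generic permission path, the sketch below denies execute access to a regular file that has no execute bit set at all, regardless of what an ACL granted. The constants are stand-ins (the mode values follow the common Unix layout, and `SKETCH_MAY_EXEC` is a placeholder for the kernel's MAY_EXEC), and the function is hypothetical rather than the patched permission() itself.

```c
#include <errno.h>

#define SKETCH_MAY_EXEC 0x1     /* placeholder for the kernel's MAY_EXEC */
#define SKETCH_S_IFMT   0170000 /* file type mask (common Unix layout) */
#define SKETCH_S_IFREG  0100000 /* regular file */
#define SKETCH_S_IFDIR  0040000 /* directory */

/*
 * Extra check applied after the filesystem has already granted access:
 * a regular file with none of the 0111 bits set is not executable.
 */
static int sketch_exec_bit_check(unsigned int mode, int mask)
{
    int is_regular = (mode & SKETCH_S_IFMT) == SKETCH_S_IFREG;

    if ((mask & SKETCH_MAY_EXEC) && is_regular && !(mode & 0111))
        return -EACCES;
    return 0;
}
```

Placing the test in one generic function means access() and exec agree on the answer, which is the inconsistency the message describes.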
2006-08-24 | NFSv4: Add v4 exception handling for the ACL functions. | Trond Myklebust | 1 | -2/+27
This is needed in order to handle any NFS4ERR_DELAY errors that might be returned by the server. It also ensures that we map the NFSv4 errors before they are returned to userland. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 71c12b3f0abc7501f6ed231a6d17bc9c05a238dc commit)
2006-08-24 | NFS: Check lengths more thoroughly in NFS4 readdir XDR decode | David Howells | 1 | -10/+11
Check the bounds of length specifiers more thoroughly in the XDR decoding of NFS4 readdir reply data. Currently, if the server returns a bitmap or attr length that causes the current decode point pointer to wrap, this could go undetected (consider a small "negative" length on a 32-bit machine). Also add a check into the main XDR decode handler to make sure that the amount of data is a multiple of four bytes (as specified by RFC-1014). This makes sure that we can do u32* pointer subtraction in the NFS client without risking an undefined result (the result is undefined if the pointers are not correctly aligned with respect to one another). Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 5861fddd64a7eaf7e8b1a9997455a24e7f688092 commit)
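Both checks mentioned above can be sketched in portable C. The helpers below are hypothetical and are not the NFS decoder: the first verifies that a byte count is a whole number of 4-byte XDR units, and the second compares a server-supplied word count against the space remaining instead of computing `p + nwords`, which is the step that could wrap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Small buffer used only to demonstrate the bounds check. */
static const uint32_t sketch_buf[4] = {0};

/* The reply length must be a multiple of four bytes. */
static bool sketch_xdr_len_ok(size_t nbytes)
{
    return (nbytes & 3) == 0;
}

/*
 * True if nwords 32-bit words are available between p and end.
 * Counts are compared; no pointer past the buffer is ever formed.
 */
static bool sketch_xdr_has_words(const uint32_t *p, const uint32_t *end,
                                 uint32_t nwords)
{
    return p <= end && nwords <= (size_t)(end - p);
}
```

A length such as 0xfffffffe, the small "negative" value the message warns about, fails the second check immediately rather than moving the decode pointer backwards.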
2006-08-24 | NFS: Fix issue with EIO on NFS read | Trond Myklebust | 1 | -8/+15
The problem is that we may be caching writes that would extend the file and create a hole in the region that we are reading. In this case, we need to detect the eof from the server, ensure that we zero out the pages that are part of the hole and mark them as up to date. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 856b603b01b99146918c093969b6cb1b1b0f1c01 commit)
2006-08-24 | LOCKD: Fix a deadlock in nlm_traverse_files() | Trond Myklebust | 1 | -6/+9
nlm_traverse_files() is not allowed to hold the nlm_file_mutex while calling nlm_inspect_file(), since it may end up calling nlm_release_file() when releasing the blocks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from e558d3cde986e04f68afe8c790ad68ef4b94587a commit)
2006-08-24 | SUNRPC: Fix dentry refcounting issues with users of rpc_pipefs | Trond Myklebust | 1 | -1/+0
rpc_unlink() and rpc_rmdir() will dput the dentry reference for you. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from a05a57effa71a1f67ccbfc52335c10c8b85f3f6a commit)
2006-08-24 | SUNRPC: make rpc_unlink() take a dentry argument instead of a path | Trond Myklebust | 1 | -2/+1
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 88bf6d811b01a4be7fd507d18bf5f1c527989089 commit)
2006-08-24 | VFS: add lookup hint for network file systems | ASANO Masahiro | 1 | -0/+2
I'm trying to speed up mkdir(2) for network file systems. A typical mkdir(2) calls two inode_operations: lookup and mkdir. The lookup operation would fail with ENOENT in the common case. I think it is unnecessary because the subsequent mkdir operation can check it. In the case of creat(2), the lookup operation is called with the LOOKUP_CREATE flag, so an individual filesystem can omit the real lookup, e.g. nfs_lookup(). Here is a sample patch which uses LOOKUP_CREATE and O_EXCL on mkdir, symlink and mknod. This uses the gadget for creat(2). And here is the result of a benchmark on NFSv3.
mkdir(2) 10,000 times:
  original 50.5 sec
  patched 29.0 sec
Signed-off-by: ASANO Masahiro <masano@tnes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from fab7bf44449b29f9d5572a5dd8adcf7c91d5bf0f commit)
2006-08-24 | NFS: Fix a potential deadlock in nfs_release_page | Nikita Danilov | 1 | -1/+7
nfs_wb_page() waits on request completion and, as a result, is not safe to be called from nfs_release_page() invoked by VM scanner as part of GFP_NOFS allocation. Fix possible deadlock by analyzing gfp mask and refusing to release page if __GFP_FS is not set. Signed-off-by: Nikita Danilov <danilov@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 374d969debfb290bafcb41d28918dc6f7e43ce31 commit)
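The decision described above comes down to testing one bit of the allocation mask before doing anything that could wait on filesystem I/O. The sketch below shows only that test; `SKETCH_GFP_FS` is a stand-in bit (the kernel defines the real __GFP_FS value) and the function name is hypothetical.

```c
#include <stdbool.h>

#define SKETCH_GFP_FS 0x1u  /* stand-in bit; not the kernel's __GFP_FS value */

/*
 * Decide whether a release-page hook may wait for outstanding requests.
 * Without the FS bit the caller has asked not to re-enter filesystem
 * code, so waiting here could deadlock and the page is left alone.
 */
static bool sketch_can_release(unsigned int gfp_mask)
{
    return (gfp_mask & SKETCH_GFP_FS) != 0;
}
```

Refusing to release the page is always safe from the caller's point of view: the VM scanner simply moves on to another page.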
2006-08-22 | Fix possible UDF deadlock and memory corruption (CVE-2006-4145) | Jan Kara | 2 | -26/+40
UDF code is not really ready to handle extents larger than 1GB. This is the easy way to forbid creating those. Also, the truncation code did not account for the case when there are no extents in the file and we are extending the file. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-21 | [PATCH] uninline ioprio_best() | Oleg Nesterov | 1 | -0/+23
Saves 376 bytes (5 callers) for me. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <axboe@suse.de>
2006-08-21 | [PATCH] Fix current_io_context() vs set_task_ioprio() race | Oleg Nesterov | 1 | -0/+3
I know nothing about io scheduler, but I suspect set_task_ioprio() is not safe. current_io_context() initializes "struct io_context", then sets ->io_context. set_task_ioprio() running on another cpu may see the changes out of order, so ->set_ioprio(ioc) may use io_context which was not initialized properly. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <axboe@suse.de>
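The hazard described above is the classic "initialise, then publish a pointer" ordering problem. The kernel addresses it with its own memory-barrier primitives; the sketch below is a user-space analogue using C11 atomics, with hypothetical names, and is not the kernel patch. The writer stores the pointer with release ordering so that the initialised fields are visible to any reader that loads the pointer with acquire ordering.

```c
#include <stdatomic.h>
#include <stddef.h>

struct sketch_ioc {
    int ioprio;
};

static struct sketch_ioc sketch_storage;
static _Atomic(struct sketch_ioc *) sketch_published;

/* Writer: initialise the object fully, then publish the pointer. */
static void sketch_publish(int ioprio)
{
    sketch_storage.ioprio = ioprio;
    atomic_store_explicit(&sketch_published, &sketch_storage,
                          memory_order_release);
}

/* Reader: the acquire load pairs with the release store above. */
static int sketch_read_ioprio(void)
{
    struct sketch_ioc *ioc =
        atomic_load_explicit(&sketch_published, memory_order_acquire);

    return ioc ? ioc->ioprio : -1;   /* -1: nothing published yet */
}
```

Without the ordering, another CPU could observe the non-NULL pointer before the stores that initialised the structure, which is the out-of-order visibility the message worries about.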
2006-08-21 | [PATCH] sys_ioprio_set: minor do_each_thread+break fix | Oleg Nesterov | 1 | -2/+2
From include/linux/sched.h:
 * Careful: do_each_thread/while_each_thread is a double loop so
 * 'break' will not work as expected - use goto instead.
 */
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <axboe@suse.de>
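The pitfall quoted above applies to any nested loop, not only the kernel macro pair. The sketch below uses plain `for` loops over a small grid (all names are illustrative) and counts how many cells are visited before the search stops: with `break` only the inner loop ends and the outer loop carries on, while `goto` leaves both.

```c
#define SKETCH_ROWS 3
#define SKETCH_COLS 3

static const int sketch_grid[SKETCH_ROWS][SKETCH_COLS] = {
    {1, 2, 3},
    {4, 5, 6},
    {7, 8, 9},
};

/* 'break' exits only the inner loop; the outer loop keeps running. */
static int sketch_visits_with_break(int target)
{
    int visits = 0;

    for (int r = 0; r < SKETCH_ROWS; r++) {
        for (int c = 0; c < SKETCH_COLS; c++) {
            visits++;
            if (sketch_grid[r][c] == target)
                break;
        }
    }
    return visits;
}

/* 'goto' leaves both loops as soon as the target is found. */
static int sketch_visits_with_goto(int target)
{
    int visits = 0;

    for (int r = 0; r < SKETCH_ROWS; r++) {
        for (int c = 0; c < SKETCH_COLS; c++) {
            visits++;
            if (sketch_grid[r][c] == target)
                goto out;
        }
    }
out:
    return visits;
}
```

Searching for 2, the `break` version still visits 8 cells, whereas the `goto` version stops after 2.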
2006-08-16 | [CIFS] Do not send Query All EAs SMB when mount option nouser_xattr | Steve French | 3 | -3/+8
specified Pointed out by Bjoern Jacke Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-08-15 | fs/jffs2/xattr.c: remove dead code | Adrian Bunk | 1 | -1/+0
This patch removes some obvious dead code spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Josh Boyer <jwboyer@gmail.com>
2006-08-15 | Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 | Greg Kroah-Hartman | 7 | -62/+263
2006-08-15 | [CIFS] endian errors in lanman protocol support | Steve French | 3 | -3/+3
le16 compared to host-endian constant
u8 fed to le32_to_cpu()
le16 compared to host-endian constant
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-08-15 | [CIFS] Fix oops in cifs_close due to unitialized lock sem and list in | Steve French | 3 | -7/+16
new POSIX locking code Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-08-14 | [PATCH] fcntl(F_SETSIG) fix | Trond Myklebust | 1 | -2/+4
fcntl(F_SETSIG) no longer works on leases because lease_release_private_callback() gets called as the lease is copied in order to initialise it. The problem is that lease_alloc() performs an unnecessary initialisation, which sets the lease_manager_ops. Avoid the problem by allocating the target lease structure using locks_alloc_lock(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-14 | [PATCH] fuse: fix error case in fuse_readpages | Alexander Zarochentsev | 1 | -2/+8
Don't let fuse_readpages leave the @pages list not empty when exiting on error. [akpm@osdl.org: kernel-doc fixes] Signed-off-by: Alexander Zarochentsev <zam@namesys.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-14 | [PATCH] initialize parts of udf inode earlier in create | Dan Bastone | 1 | -0/+7
Eric says:
> I saw an oops down this path when trying to create a new file on a UDF
> filesystem which was internally marked as readonly, but mounted rw:
>
> udf_create
> udf_new_inode
> new_inode
> alloc_inode
> udf_alloc_inode
> udf_new_block
> returns EIO due to readonlyness
> iput (on error)
I ran into the same issue today, but when listing a directory with invalid/corrupt entries:
udf_lookup
udf_iget
get_new_inode_fast
alloc_inode
udf_alloc_inode
__udf_read_inode
fails for any reason
iput (on error)
...
The following patch to udf_alloc_inode() should take care of both (and other similar) cases, but I've only tested it with udf_lookup().
Signed-off-by: Dan Bastone <dan@pwienterprises.com> Cc: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-14 | [PATCH] adfs error message fix | Andrew Morton | 1 | -1/+1
Don't use NULL as a printf control string. Fixes bug #6889. Cc: Ralph Corderoy <ralph@inputplus.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-11[CIFS] Fix oops when negotiating lanman and no password specifiedSteve French1-1/+2
Pointed out by Guenter Kukkukk Signed-off-by: Steve French <sfrench@us.ibm.com> (cherry picked from bbf33d512da608c7221fec42b56b9ef89c25a5ee commit)
2006-08-11[CIFS]Jeremy Allison7-296/+512
Allow Windows blocking locks to be cancelled via a CANCEL_LOCK call. TODO - restrict this to servers that support NT_STATUS codes (Win9x will probably not support this call). Signed-off-by: Jeremy Allison <jra@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com> (cherry picked from 570d4d2d895569825d0d017d4e76b51138f68864 commit)
2006-08-11[CIFS] Allow cifsd to suspend if connection is lostSteve French1-0/+1
Make cifsd allow us to suspend if it has lost the connection with a server Ref: http://bugzilla.kernel.org/show_bug.cgi?id=6811 Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Steve French <sfrench@us.ibm.com> (cherry picked from 27bd6cd87b0ada66515ad49bc346d77d1e9d3e05 commit)
2006-08-11[CIFS] Make midState usage more consistentSteve French1-6/+6
Although harmless, we were sometimes treating midState like it contained flags but they are exclusive states, and this makes that more clear. Signed-off-by: Jeremy Allison <jra@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com> (cherry picked from 586c057c3a68dd6ae0f3ba94fbf76798b1558074 commit)
2006-08-11[CIFS] spinlock protect read of last srv response time in timeout pathSteve French1-23/+76
Signed-off-by: Jeremy Allison <jra@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com> (cherry picked from b33a3f55e54fd210fc043eafcf83728b03bc9e02 commit)
2006-08-11[CIFS] Do not time out posix brl requests when using new posix setfileinfoSteve French8-18/+49
request and do not time out slow requests to a server that is still responding well to other threads Suggested by jra of Samba team Signed-off-by: Steve French <sfrench@us.ibm.com> (cherry picked from 89b57148115479eef074b8d3f86c4c86c96ac969 commit)
2006-08-09Merge git://oss.sgi.com:8090/nathans/xfs-rc-2.6Greg Kroah-Hartman1-49/+54
2006-08-10[XFS] Fix xfs_free_extent related NULL pointer dereference.Nathan Scott1-49/+54
We recently fixed an out-of-space deadlock in XFS, and part of that fix involved the addition of the XFS_ALLOC_FLAG_FREEING flag to some of the space allocator calls to indicate they're freeing space, not allocating it. There was a missed xfs_alloc_fix_freelist condition test that did not correctly test "flags". The same test would also test an uninitialised structure field (args->userdata) and depending on its value either would or would not return early with a critical buffer pointer set to NULL. This fixes that up, adds asserts to several places to catch future botches of this nature, and skips sections of xfs_alloc_fix_freelist that are irrelevant for the space-freeing case. SGI-PV: 955303 SGI-Modid: xfs-linux-melb:xfs-kern:26743a Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-08-07ocfs2: allocation hintsMark Fasheh4-13/+145
Record the most recently used allocation group on the allocation context, so that subsequent allocations can attempt to optimize for contiguousness. Local alloc especially should benefit from this as the current chain search tends to let it spew across the disk. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-08-07ocfs2: better group descriptor consistency checksMark Fasheh1-20/+94
Try to catch corrupted group descriptors with some stronger checks placed in a couple of strategic locations. Detect a failed resizefs and refuse to allocate past what bitmap i_clusters allows. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-08-07ocfs2: limit cluster bitmap information saved at mountMark Fasheh2-3/+6
We were storing cluster count on the ocfs2_super structure, but never actually using it so remove that. Also, we don't want to populate the uptodate cache with the unlocked block read - it is technically safe as is, but we should change it for correctness. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-08-07[PATCH] fs/ocfs2/dlm/dlmmaster.c: unexport dlm_migrate_lockresAdrian Bunk1-1/+0
This patch removes the unused EXPORT_SYMBOL_GPL(dlm_migrate_lockres). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-08-07ocfs2: fix check for locally granted state during dlmunlock()Kurt Hackel1-1/+1
If a process requests a lock cancel but the lock has been remotely granted already then there is no need to send the cancel message. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-08-07ocfs2: do not modify lksb->status in the unlock astKurt Hackel1-25/+12
This can race with other ast notification, which can cause bad status values to propagate into the unlock ast. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-08-07ocfs2: Fix lvb corruptionKurt Hackel1-0/+6
Properly ignore LVB flags during a PR downconvert. This avoids an illegal lvb update. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-08-06Merge branch 'master' of /home/greg/linux/git/torvalds-2.6/Greg Kroah-Hartman7-30/+41
2006-08-06[PATCH] udf: initialize parts of inode earlier in createEric Sandeen1-5/+6
I saw an oops down this path when trying to create a new file on a UDF filesystem which was internally marked as readonly, but mounted rw: udf_create udf_new_inode new_inode alloc_inode udf_alloc_inode udf_new_block returns EIO due to readonlyness iput (on error) udf_put_inode udf_discard_prealloc udf_next_aext udf_current_aext udf_get_fileshortad OOPS the udf_discard_prealloc() path was examining uninitialized fields of the udf inode. udf_discard_prealloc() already has this code to short-circuit the discard path if no extents are preallocated: if (UDF_I_ALLOCTYPE(inode) == ICBTAG_FLAG_AD_IN_ICB || inode->i_size == UDF_I_LENEXTENTS(inode)) { return; } so if we initialize UDF_I_LENEXTENTS(inode) = 0 earlier in udf_new_inode, we won't try to free the (not) preallocated blocks, since this will match the i_size = 0 set when the inode was initialized. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-06[PATCH] reiserfs_write_full_page() should not get_block past eofChris Mason1-2/+12
reiserfs_write_full_page does zero bytes in the file past eof, but it may call get_block on those buffers as well. On machines where the page size is larger than the blocksize, this can result in mmaped files incorrectly growing up to a block boundary during writepage. The fix is to avoid calling get_block for any blocks that are entirely past eof Signed-off-by: Chris Mason <mason@suse.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
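The rule "skip blocks that are entirely past eof" can be illustrated with plain offset arithmetic. This is a sketch under simplifying assumptions, not the reiserfs code: `blocks_to_map()` is a hypothetical helper, and a block counts as entirely past EOF when its first byte is at or beyond `i_size`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count the blocks of one page that start before EOF and therefore still
 * need a block mapping. Blocks that begin at or beyond i_size lie entirely
 * past EOF and are skipped.
 */
static unsigned blocks_to_map(uint64_t page_start, unsigned page_size,
                              unsigned block_size, uint64_t i_size)
{
    unsigned count = 0;

    assert(block_size > 0 && page_size % block_size == 0);
    for (uint64_t off = page_start; off < page_start + page_size; off += block_size)
        if (off < i_size)
            count++;
    return count;
}
```

For a 4096-byte page with 1024-byte blocks and a 1500-byte file, only the first two blocks are mapped, so the file is not grown to the end of the page.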
2006-08-06[PATCH] fix reiserfs lock inversion of bkl vs inode semaphoreChris Mason2-2/+2
The correct lock ordering is inode lock -> BKL Signed-off-by: Chris Mason <mason@suse.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-06[PATCH] Fix BeFS slab corruptionDiego Calleja1-2/+9
In bugzilla #6941, Jens Kilian reported: "The function befs_utf2nls (in fs/befs/linuxvfs.c) writes a 0 byte past the end of a block of memory allocated via kmalloc(), leading to memory corruption. This happens only for filenames which are pure ASCII and a multiple of 4 bytes in length. [...] Without DEBUG_SLAB, this leads to further corruption and hard lockups; I believe this is the bug which has made kernels later than 2.6.8 unusable for me. (This must be due to changes in memory management, the bug has been in the BeFS driver since the time it was introduced (AFAICT).) Steps to reproduce: Create a directory (in BeOS, naturally :-) with files named, e.g., "1", "22", "333", "4444", ... Mount it in Linux and do an "ls" or "find"" This patch implements the suggested fix. Credits to Jens Kilian for debugging the problem and finding the right fix. Signed-off-by: Diego Calleja <diegocg@gmail.com> Cc: Jens Kilian <jjk@acm.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
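The class of bug described here, a terminating 0 byte written one position past an allocation, is usually avoided by reserving the extra byte up front. The patch itself is not shown in this log, so the following is only a generic user-space sketch of that pattern; `copy_name()` is a hypothetical helper, not the befs_utf2nls() fix.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy len bytes of src into a new NUL-terminated buffer; caller frees it. */
static char *copy_name(const char *src, size_t len)
{
    char *out;

    assert(src != NULL);
    out = malloc(len + 1);      /* + 1 reserves room for the terminator */
    if (out == NULL)
        return NULL;
    memcpy(out, src, len);
    out[len] = '\0';            /* index len is still inside the buffer */
    return out;
}
```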
2006-08-06[PATCH] i_mutex does not need to be locked in reiserfs_delete_inode()Alexander Zarochentsev1-10/+2
Fixes an i_mutex-inside-i_mutex lockdep nasty. Signed-off-by: Alexander Zarochentsev <zam@namesys.com> Cc: <reiserfs-dev@namesys.com> Cc: Hans Reiser <reiser@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-06[PATCH] ufs: handle truncated pagesEvgeniy Dushistov2-3/+3
ufs_get_locked_page is called twice in ufs code, one time in ufs_truncate path(we allocated last block), and another time when fragments are reallocated. In ideal world in the second case on allocation/free block layer we should not know that things like `truncate' exists, but now with such crutch like ufs_get_locked_page we can (or should?) skip truncated pages. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-06[PATCH] ufs: ufs_get_locked_page() race fixEvgeniy Dushistov1-7/+8
As discussed earlier (http://lkml.org/lkml/2006/6/28/136), this patch fixes the following race: `ufs_get_locked_page' takes a page from the cache; `vmtruncate' then takes the page and deletes it from the cache; `ufs_get_locked_page' locks the page and reports an EIO error. Also, because find_lock_page always returns either a valid page or NULL, there is no need for a further check when the page is not NULL. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-05Merge ../torvalds-2.6/Greg Kroah-Hartman4-12/+8
2006-08-03Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6Greg Kroah-Hartman3-15/+120
2006-08-03NLM/lockd: remove b_doneJ. Bruce Fields1-9/+3
We never actually set the b_done field any more; it's always zero. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from af8412d4283ef91356e65e0ed9b025b376aebded commit)
2006-08-03NFS: make 2 functions staticAdrian Bunk2-2/+2
nfs_writedata_free() and nfs_readdata_free() can now become static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 5e1ce40f0c3c8f67591aff17756930d7a18ceb1a commit)
2006-08-03NFS: Release dcache_lock in an error path of nfs_pathJosh Triplett1-1/+3
In one of the error paths of nfs_path, it may return with dcache_lock still held; fix this by adding and using a new error path Elong_unlock which unlocks dcache_lock. Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from f4b90b43677fb23297c56802c3056fc304f988d9 commit)
2006-08-03[PATCH] don't bother with aux entries for dummy contextAl Viro1-2/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-08-03[PATCH] fix missed create event for directory auditAmy Griffis1-1/+1
When an object is created via a symlink into an audited directory, audit misses the event due to not having collected the inode data for the directory. Modify __audit_inode_child() to copy the parent inode data if a parent wasn't found in audit_names[]. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-08-03[PATCH] fix faulty inode data collection for open() with O_CREATAmy Griffis1-0/+2
When the specified path is an existing file or when it is a symlink, audit collects the wrong inode number, which causes it to miss the open() event. Adding a second hook to the open() path fixes this. Also add audit_copy_inode() to consolidate some code. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-07-31[PATCH] 9p: fix fid behavior on failed removeEric Van Hensbergen1-3/+3
Based on a bug report from Russ Ross <russruss@gmail.com> According to the spec: "The remove request asks the file server both to remove the file represented by fid and to clunk the fid, even if the remove fails." but the Linux client seems to expect the fid to be valid after a failed remove attempt. Specifically, I'm getting this behavior when attempting to remove a non-empty directory. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] 9p: fix marshalling bug in tcreate with empty extension fieldRuss Ross1-2/+4
Signed-off-by: Russ Ross <russross@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] ext3 -nobh option causes oopsBadari Pulavarty1-3/+3
For files other than IFREG, nobh option doesn't make sense. Modifications to them are journalled and need buffer heads to do that. Without this patch, we get a kernel oops in page_buffers(). Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] freevxfs: Add missing lock_kernel() to vxfs_readdirJosh Triplett1-0/+2
Commit 7b2fd697427e73c81d5fa659efd91bd07d303b0e in the historical GIT tree stopped calling the readdir member of a file_operations struct with the big kernel lock held, and fixed up all the readdir functions to do their own locking. However, that change added calls to unlock_kernel() in vxfs_readdir, but no call to lock_kernel(). Fix this by adding a call to lock_kernel(). Signed-off-by: Josh Triplett <josh@freedesktop.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] fuse: fix typoMiklos Szeredi1-2/+2
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] fuse: use jiffies_64Miklos Szeredi3-10/+38
It is entirely possible (though rare) that jiffies half-wraps around, while a dentry/inode remains in the cache. This could mean that the dentry/inode is not invalidated for another half wraparound-time. To get around this problem, use 64-bit jiffies. The only problem with this is that dentry->d_time is 32 bits on 32-bit archs. So use d_fsdata as the high 32 bits. This is an ugly hack, but far simpler, than having to allocate private data just for this purpose. Since 64-bit jiffies can be assumed never to wrap around, simple comparison can be used, and a zero time value can represent "invalid". Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] fuse: fix zero timeoutMiklos Szeredi1-2/+5
An attribute and entry timeout of zero should mean, that the entity is invalidated immediately after the operation. Previously invalidation only happened at the next clock tick. Reported and tested by Craig Davies. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] ufs: remove incorrect unlock_kernel from failure path in ufs_symlink()Josh Triplett1-1/+2
ufs_symlink, in one of its error paths, calls unlock_kernel without ever having called lock_kernel(); fix this by creating and jumping to a new label out_notlocked rather than the out label used after calling lock_kernel(). Signed-off-by: Josh Triplett <josh@freedesktop.org> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
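The two-label exit pattern used by this fix (and by the efs fix that follows) can be shown with a counter standing in for the big kernel lock. This is a user-space sketch with hypothetical names (`sketch_lock`, `sketch_unlock`, `sketch_symlink`), not the ufs code.

```c
#include <assert.h>
#include <errno.h>

static int lock_depth;   /* stand-in for the big kernel lock */

static void sketch_lock(void)   { lock_depth++; }
static void sketch_unlock(void) { assert(lock_depth > 0); lock_depth--; }

/*
 * fail_early models an error detected before the lock is taken;
 * fail_late models one detected while the lock is held.
 */
static int sketch_symlink(int fail_early, int fail_late)
{
    int err = 0;

    if (fail_early) {
        err = -ENAMETOOLONG;
        goto out_notlocked;     /* lock not held: skip the unlock */
    }

    sketch_lock();
    if (fail_late) {
        err = -EIO;
        goto out;
    }
    /* ... work done under the lock ... */
out:
    sketch_unlock();
out_notlocked:
    return err;
}
```

Every path leaves `lock_depth` at 0. Jumping to `out` from the early failure would instead trip the assertion in `sketch_unlock()`.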
2006-07-31[PATCH] efs: Remove incorrect unlock_kernel from failure path in efs_symlink_readpage()Josh Triplett1-1/+2
If efs_symlink_readpage hits the -ENAMETOOLONG error path, it will call unlock_kernel without ever having called lock_kernel(); fix this by creating and jumping to a new label fail_notlocked rather than the fail label used after calling lock_kernel(). Signed-off-by: Josh Triplett <josh@freedesktop.org> Cc: Marcelo Tosatti <marcelo.tosatti@cyclades.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] Remove incorrect unlock_kernel from allocation failure path in coda_open()Josh Triplett1-3/+1
Commit 398c53a757702e1e3a7a2c24860c7ad26acb53ed (in the historical GIT tree) moved the lock_kernel() in coda_open after the allocation of a coda_file_info struct, but left an unlock_kernel() in the allocation failure error path; remove it. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] inotify: fix deadlock found by lockdepArjan van de Ven1-1/+1
This is a real deadlock, a nice complex one: (warning: long explanation follows so that Andrew can have a complete patch description) it's an ABCDA deadlock: A iprune_mutex B inode->inotify_mutex C ih->mutex D dev->ev_mutex The AB relationship comes straight from invalidate_inodes() int invalidate_inodes(struct super_block * sb) { int busy; LIST_HEAD(throw_away); mutex_lock(&iprune_mutex); spin_lock(&inode_lock); inotify_unmount_inodes(&sb->s_inodes); where inotify_unmount_inodes() takes the mutex_lock(&inode->inotify_mutex); The BC relationship comes directly from inotify_find_update_watch(): s32 inotify_find_update_watch(struct inotify_handle *ih, struct inode *inode, u32 mask) { ... mutex_lock(&inode->inotify_mutex); mutex_lock(&ih->mutex); The CD relationship comes from inotify_rm_wd: inotify_rm_wd does mutex_lock(&inode->inotify_mutex); mutex_lock(&ih->mutex) and then calls inotify_remove_watch_locked() which calls inotify_dev_queue_event() which does mutex_lock(&dev->ev_mutex); (this strictly is a BCD relationship) The DA relationship comes from the most interesting part: [<ffffffff8022d9f2>] shrink_icache_memory+0x42/0x270 [<ffffffff80240dc4>] shrink_slab+0x11d/0x1c9 [<ffffffff802b5104>] try_to_free_pages+0x187/0x244 [<ffffffff8020efed>] __alloc_pages+0x1cd/0x2e0 [<ffffffff8025e1f8>] cache_alloc_refill+0x3f8/0x821 [<ffffffff8020a5e5>] kmem_cache_alloc+0x85/0xcb [<ffffffff802db027>] kernel_event+0x2e/0x122 [<ffffffff8021d61c>] inotify_dev_queue_event+0xcc/0x140 inotify_dev_queue_event schedules a kernel_event which does a kmem_cache_alloc( , GFP_KERNEL) which may try to shrink slabs, including the inode cache .. which then takes iprune_mutex. And voila, there is an AB, a BC, a CD relationship (even a direct BCD), and also now a DA relationship -> a circular type AB-BA deadlock but involving 4 locks. The solution is simple: kernel_event() is NOT allowed to use GFP_KERNEL, but must use GFP_NOFS to not cause recursion into the VFS.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Robert Love <rml@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
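The ABCDA chain in this message is a cycle in a lock-ordering graph, where an edge X -> Y means "Y was taken while X was held". The sketch below finds such a cycle with a depth-first search. It only illustrates the idea and is not how lockdep is implemented; the enum names and both functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NLOCKS 4
enum { IPRUNE_MUTEX, INOTIFY_MUTEX, IH_MUTEX, EV_MUTEX };   /* A, B, C, D */

/* edge[x][y] is true when lock y has been taken while lock x was held. */
static bool visit(bool edge[NLOCKS][NLOCKS], int node, int state[NLOCKS])
{
    assert(node >= 0 && node < NLOCKS);
    state[node] = 1;                        /* on the current DFS path */
    for (int next = 0; next < NLOCKS; next++) {
        if (!edge[node][next])
            continue;
        if (state[next] == 1)
            return true;                    /* back edge: ordering cycle */
        if (state[next] == 0 && visit(edge, next, state))
            return true;
    }
    state[node] = 2;                        /* fully explored, no cycle here */
    return false;
}

static bool has_lock_cycle(bool edge[NLOCKS][NLOCKS])
{
    int state[NLOCKS] = { 0 };

    for (int i = 0; i < NLOCKS; i++)
        if (state[i] == 0 && visit(edge, i, state))
            return true;
    return false;
}
```

With only the AB, BC and CD edges the graph is acyclic. Adding the DA edge, which the GFP_KERNEL allocation introduces, closes the cycle, and switching to GFP_NOFS corresponds to removing that edge again.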
2006-07-31[PATCH] enable mac partition label per default on pmacOlaf Hering1-1/+1
Enable mac partition table support per default also for a powermac config. Signed-off-by: Olaf Hering <olh@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] invalidate_bdev() speedupAndrew Morton1-1/+6
We can immediately bail from invalidate_bdev() if the blockdev has no pagecache. This solves the huge IPI storms which hald is causing on the big ia64 machines when it polls CDROM drives. Acked-by: Jes Sorensen <jes@sgi.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] knfsd: Fix stale file handle problem with subtree_checking.NeilBrown1-7/+13
A recent commit (7fc90ec93a5eb71f4b08403baf5ba7176b3ec6b1) moved the call to nfsd_setuser out of the 'find a dentry for a filehandle' branch of fh_verify so that it would always be called. This had the unfortunate side-effect of moving it *after* the call to decode_fh, so the proper fsuid was not set when nfsd_acceptable was called, and the 'permission' check did the wrong thing. This patch moves the nfsd_setuser call back where it was, and adds a call in the other branch of the if. Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] ext3: avoid triggering ext3_error on bad NFS file handleNeil Brown2-8/+20
The inode number out of an NFS file handle gets passed eventually to ext3_get_inode_block() without any checking. If ext3_get_inode_block() allows it to trigger an error, then bad filehandles can have unpleasant effect - ext3_error() will usually cause a forced read-only remount, or a panic if `errors=panic' was used. So remove the call to ext3_error there and put a matching check in ext3/namei.c where inode numbers are read off storage. [akpm@osdl.org: fix off-by-one error] Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jan Kara <jack@suse.cz> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: <stable@kernel.org> Cc: "Stephen C. Tweedie" <sct@redhat.com> Cc: Eric Sandeen <esandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-28JFS: Fix bug in quota code. tmp_bh.b_size must be initializedDave Kleikamp1-0/+1
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2006-07-28[XFS] Ensure bulkstat from an invalid inode number gets caught always withNathan Scott1-7/+10
EINVAL. SGI-PV: 953819 SGI-Modid: xfs-linux-melb:xfs-kern:26629a Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-07-28[XFS] Fix a barrier related forced shutdown on mounts with quota enabled.Nathan Scott2-1/+8
SGI-PV: 912426 SGI-Modid: xfs-linux-melb:xfs-kern:26622a Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-07-28[XFS] Fix remount vs no/barrier options by ensuring we clear unwantedNathan Scott2-8/+8
flags from iclog buffers before submitting them for writing. SGI-PV: 954772 SGI-Modid: xfs-linux-melb:xfs-kern:26605a Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-07-28[XFS] All xfs_disk_dquot_t values are (as the name says) disk endian.Christoph Hellwig1-6/+13
Before putting them into struct statfs they should be endian-swapped. SGI-PV: 954580 SGI-Modid: xfs-linux-melb:xfs-kern:26550a Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-07-26JFS: Quota support broken, no quota_read and quota_writeDave Kleikamp3-15/+119
jfs_quota_read/write are very near duplicates of ext2_quota_read/write. Cleaned up jfs_get_block as long as I had to change it to be non-static. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2006-07-15Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6Linus Torvalds2-18/+17
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6: JFS: commit_mutex cleanups
2006-07-15Don't allow chmod() on the /proc/<pid>/ filesLinus Torvalds1-1/+30
This just turns off chmod() on the /proc/<pid>/ files, since there is no good reason to allow it, and had we disallowed it originally, the nasty /proc race exploit wouldn't have been possible. The other patches already fixed the problem chmod() could cause, so this is really just some final mop-up.. This particular version is based off a patch by Eugene and Marcel which had much better naming than my original equivalent one. Signed-off-by: Eugene Teo <eteo@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-15Mark /proc MS_NOSUID and MS_NOEXECLinus Torvalds1-1/+1
Not that we really need this any more, but at the same time there's no reason not to do this. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] per-task-delay-accounting: /proc export of aggregated block I/O delaysShailabh Nagar1-2/+4
Export I/O delays seen by a task through /proc/<tgid>/stats for use in top etc. Note that delays for I/O done for swapping in pages (swapin I/O) is clubbed together with all other I/O here (this is not the case in the netlink interface where the swapin I/O is kept distinct) [akpm@osdl.org: printk warning fix] Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] add function documentation for register_chrdev()Rolf Eike Beer1-0/+22
Documentation for register_chrdev() was missing completely. [akpm@osdl.org: kerneldocification] Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] reiserfs: fix handling of device names with /'s in themJeff Mahoney1-4/+21
On systems with block devices containing a slash (virtual dasd, cciss, etc), reiserfs will fail to initialize /proc/fs/reiserfs/<dev> due to it being interpreted as a subdirectory. The generic block device code changes the / to ! for use in the sysfs tree. This patch uses that convention. Tested by making dm devices use dm/<number> rather than dm-<number> [akpm@osdl.org: name variables consistently] Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
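The '/' to '!' convention mentioned above is a simple in-place character substitution. A minimal sketch, with `sanitize_dev_name()` as a hypothetical helper name rather than the function the patch adds:

```c
#include <assert.h>
#include <stddef.h>

/* Replace every '/' in a device name with '!', modifying it in place. */
static void sanitize_dev_name(char *name)
{
    assert(name != NULL);
    for (; *name != '\0'; name++)
        if (*name == '/')
            *name = '!';
}
```

A name such as `cciss/c0d0` becomes `cciss!c0d0`, which can then be used as a single directory entry.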
2006-07-14[PATCH] struct file leakageKirill Korotaev1-1/+7
2.6.16 leaks like hell. While testing, I found massive leakage (reproduced in openvz) in: *filp *size-4096 And 1 object leaks in *size-32 *size-64 *size-128 It is the fix for the first one. filp leaks in the bowels of namei.c. Seems, size-4096 is file table leaking in expand_fdtables. I have no idea what are the rest and why they show only accompanying another leaks. Some debugging structs? [akpm@osdl.org, Trond: remove the IS_ERR() check] Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Kirill Korotaev <dev@openvz.org> Cc: <stable@kernel.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14Relax /proc fix a bitLinus Torvalds1-1/+2
Clearing all of i_mode was a bit draconian. We only really care about S_ISUID/ISGID, after all. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14Fix nasty /proc vulnerabilityLinus Torvalds1-0/+1
We have a bad interaction with both the kernel and user space being able to change some of the /proc file status. This fixes the most obvious part of it, but I expect we'll also make it harder for users to modify even their "own" files in /proc. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6Linus Torvalds1-0/+1
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] CIFS_DEBUG2 depends on CIFS
2006-07-12[PATCH] alloc_fdtable() expansion fixAndrew Morton1-1/+1
We're supposed to go to the next power of two if nfds==nr. Of `nr', not of `nfds'. Spotted by Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-12[PATCH] /fs/proc/: 'larger than buffer size' memory accessed by clear_user()Adam B. Jerome1-1/+1
Address a potential 'larger than buffer size' memory access by clear_user(). Without this patch, this call to clear_user() can attempt to clear too many (tsz) bytes resulting in a wrong (-EFAULT) return code by read_kcore(). Signed-off-by: Adam B. Jerome <abj@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-12[PATCH] lockdep: annotate the sysfs i_mutex to be a separate classArjan van de Ven1-0/+12
sysfs has a different i_mutex lock order behavior for i_mutex than the other filesystems; sysfs i_mutex is called in many places with subsystem locks held. At the same time, many of the VFS locking rules do not apply to sysfs at all (cross directory rename for example). To untangle this mess (which gives false positives in lockdep), we're giving sysfs inodes their own class for i_mutex. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-12[PATCH] fix fdset leakageKirill Korotaev1-1/+3
When found, it is obvious. nfds calculated when allocating fdsets is rewritten by calculation of size of fdtable, and when we are unlucky, we try to free fdsets of wrong size. Found due to OpenVZ resource management (User Beancounters). Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Kirill Korotaev <dev@openvz.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-12Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-blockLinus Torvalds1-105/+133
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] splice: fix problems with sys_tee()
2006-07-10[PATCH] knfsd: nfsd4: add per-operation server statsShankar Anand2-0/+18
Add an nfs4 operations count array to nfsd_stats structure. The count is incremented in nfsd4_proc_compound() where all the operations are handled by the nfsv4 server. This count of individual nfsv4 operations is also entered into /proc filesystem. Signed-off-by: Shankar Anand <shanand@novell.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10[PATCH] Remove leftover ext3 acl declarationsAndreas Gruenbacher1-3/+0
These functions no longer exist; remove their declarations. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10[PATCH] fix weird logic in alloc_fdtable()Andrew Morton1-7/+3
There's a fairly obvious infinite loop in there. Also, use roundup_pow_of_two() rather than open-coding stuff. Cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
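Rounding up to a power of two needs a loop that is guaranteed to terminate. The sketch below is a user-space stand-in for the kernel's roundup_pow_of_two(), written under the assumption that the argument is small enough for the result to fit; `roundup_pow2()` is a hypothetical name and this is not the alloc_fdtable() code.

```c
#include <assert.h>
#include <limits.h>

/* Smallest power of two that is >= n (returns 1 for n == 0). */
static unsigned long roundup_pow2(unsigned long n)
{
    unsigned long p = 1;

    assert(n <= (ULONG_MAX >> 1) + 1);   /* otherwise p would overflow */
    while (p < n)
        p <<= 1;                         /* p strictly grows, so the loop ends */
    return p;
}
```

An exact power of two maps to itself, so a caller that needs a size strictly larger than `nr` has to round `nr + 1` instead: `roundup_pow2(8)` is 8, while `roundup_pow2(8 + 1)` is 16.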
2006-07-10[PATCH] FDPIC: Add coredump capability for the ELF-FDPIC binfmtDavid Howells1-2/+674
Add coredump capability for the ELF-FDPIC binfmt. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10[PATCH] FDPIC: Move roundup() into linux/kernel.hDavid Howells3-5/+0
Move the roundup() macro from binfmt_elf.c into linux/kernel.h as it's generally useful. [akpm@osdl.org: nuke all the other implementations] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10[PATCH] FDPIC: Adjust the ELF-FDPIC driver to conform more to the CodingStyleDavid Howells1-137/+168
Adjust the ELF-FDPIC binfmt driver to conform much more to the CodingStyle, silly though it may be. Further changes: (*) Drop the casts to long for addresses in kdebug() statements (they're unsigned long already). (*) Use extra variables to avoid expressions longer than 80 chars by splitting the statement into multiple statements and letting the compiler optimise them back together. (*) Eliminate duplicate call of ksize() when working out how much space was actually allocated for the stack. (*) Discard the commented-out load_shlib prototype and op pointer as this will not be supported in ELF-FDPIC for the foreseeable future. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10[PATCH] NOMMU: Fix execution off of ramfs with mmap()David Howells1-2/+2
Fix execution through the FDPIC binfmt of programs stored on ramfs by preventing the ramfs mmap() returning successfully on a private mapping of a ramfs file. This causes NOMMU mmap to make a copy of the mapped portion of the file and map that instead. This could be improved by granting direct mapping access to read-only private mappings for which the data is stored on a contiguous run of pages. However, this is only likely to be the case if the file was extended with truncate before being written. ramfs is left to map the file directly for shared mappings so that SYSV IPC and POSIX shared memory both still work. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10[PATCH] FDPIC: Fix FDPIC compile errorsDavid Howells1-0/+1
Fix FDPIC compile errors. (akpm: we suspect it fixes a warning) Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10[PATCH] mmap zero-length hugetlb file with PROT_NONE to protect a hugetlb virtual areaZhang, Yanmin1-3/+1
Sometimes, applications need the call below to succeed even though "/mnt/hugepages/file1" doesn't exist. fd = open("/mnt/hugepages/file1", O_CREAT|O_RDWR, 0755); *addr = mmap(NULL, 0x1024*1024*256, PROT_NONE, 0, fd, 0); For regular pages (or files), the above call works, but for huge pages it fails, because hugetlbfs_file_mmap fails if (!(vma->vm_flags & VM_WRITE) && len > inode->i_size). This capability on huge pages is useful on ia64 when the process wants to protect one area on region 4, so that other threads cannot read/write this area. A famous JVM (Java Virtual Machine) implementation on IA64 needs the capability. Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Hugh Dickins <hugh@veritas.com> [ Expand-on-mmap semantics again... this time matching normal fs's. wli ] Acked-by: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10[PATCH] fs/read_write.c: EXPORT_UNUSED_SYMBOLAdrian Bunk1-1/+1
This patch marks an unused export as EXPORT_UNUSED_SYMBOL. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10[PATCH] partitions: let partitions inherit policy from diskPeter Oberparleiter1-0/+1
Change the partition code in fs/partitions/check.c to initialize a newly detected partition's policy field with that of the containing block device (see patch below). My reasoning is that function set_disk_ro() in block/genhd.c modifies the policy field (read-only indicator) of a disk and all contained partitions. When a partition is detected after the call to set_disk_ro(), the policy field of this partition will currently not inherit the disk's policy field. This behavior poses a problem in cases where a block device can be 'logically de- and reactivated' like e.g. the s390 DASD driver because partition detection may run after the policy field has been modified. Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Acked-by: Al Viro <viro@ftp.linux.org.uk> Makes-sense-to: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
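The ordering problem described above can be modelled in a few lines. This is a simplified userspace sketch with hypothetical structures (disk_sketch, part_sketch); it does not reproduce the real gendisk or hd_struct layouts.

```c
#include <assert.h>

#define MAX_PARTS_SKETCH 4

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct part_sketch {
	int policy;			/* non-zero means read-only */
};

struct disk_sketch {
	int policy;
	int nparts;
	struct part_sketch parts[MAX_PARTS_SKETCH];
};

/* Model of set_disk_ro(): updates the disk and every partition known so far. */
static void set_disk_ro_sketch(struct disk_sketch *d, int ro)
{
	int i;

	d->policy = ro;
	for (i = 0; i < d->nparts; i++)
		d->parts[i].policy = ro;
}

/* Model of partition detection with the fix applied. */
static struct part_sketch *add_partition_sketch(struct disk_sketch *d)
{
	struct part_sketch *p;

	assert(d->nparts < MAX_PARTS_SKETCH);
	p = &d->parts[d->nparts++];
	p->policy = d->policy;		/* inherit the disk's policy */
	return p;
}
```

Without the assignment in add_partition_sketch(), a partition detected after set_disk_ro_sketch() would keep a stale policy, which is the case the patch addresses.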
2006-07-10[PATCH] reiserfs: fix journaling issue regarding fsync()Hisashi Hifumi1-1/+5
When write() extends a file (i_size is increased) and fsync() is called, the inode change must be written to the journaling area through fsync(). But currently the i_trans_id is not correctly updated when i_size is increased, so fsync() does not kick the journal writer. Reiserfs_file_write() already updates the transaction when blocks are allocated, but the case where i_size increases and no new blocks are added is not handled correctly. The following patch fixes this bug. Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Chris Mason <mason@suse.com> Cc: Hans Reiser <reiser@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10[PATCH] splice: fix problems with sys_tee()Jens Axboe1-105/+133
Several issues noticed/fixed: - We cannot reliably block in link_pipe() while holding both input and output mutexes. So do preparatory checks before locking down both mutexes and doing the link. - The ipipe->nrbufs vs i check was bad, because we could have dropped the ipipe lock in-between. This causes us to potentially look at unknown buffers if we were racing with someone else reading this pipe. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-07-08[CIFS] CIFS_DEBUG2 depends on CIFSSteve French1-0/+1
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-07-07make fs/jffs2/nodelist.c:jffs2_obsolete_node_frag() staticAdrian Bunk2-2/+5
This patch makes the needlessly global jffs2_obsolete_node_frag() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-07-05Merge branch 'locks'Trond Myklebust3-44/+79
2006-07-05NFS: Optimise away an excessive GETATTR call when a file is symlinkedTrond Myklebust1-1/+3
In the case when compiling via a symlink tree, we want to ensure that the close-to-open GETATTR call is applied only to the final file, and not to the symlink. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05NFS: Fix NFS page_state usageTrond Myklebust1-3/+17
The introduction of the FLUSH_INVALIDATE argument to nfs_sync_inode_wait() does not clear the nr_unstable page state counter for pages that are being released. Also fix a longstanding similar bug when nfs_commit_list() fails. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05NLM,NFSv4: Wait on local locks before we put RPC calls on the wireTrond Myklebust2-12/+31
Use FL_ACCESS flag to test and/or wait for local locks before we try requesting a lock from the server Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05VFS: Add support for the FL_ACCESS flag to flock_lock_file()Trond Myklebust1-0/+5
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05NFSv4: Ensure nfs4_lock_expired() caches delegated locksTrond Myklebust1-3/+5
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05NLM,NFSv4: Don't put UNLOCK requests on the wire unless we hold a lockTrond Myklebust2-27/+22
Use the new behaviour of {flock,posix}_file_lock(F_UNLCK) to determine if we held a lock, and only send the RPC request to the server if this was the case. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05VFS: Allow caller to determine if BSD or posix locks were actually freedTrond Myklebust1-2/+16
Change posix_lock_file_conf(), and flock_lock_file() so that if called with an F_UNLCK argument, and the FL_EXISTS flag they will indicate whether or not any locks were actually freed by returning 0 or -ENOENT. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
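The return-value convention described above might be modelled as follows. This is a userspace sketch only: lock_table_sketch, unlock_sketch() and the flag value FL_EXISTS_SKETCH are hypothetical and stand in for the real file_lock machinery.

```c
#include <assert.h>
#include <errno.h>

#define FL_EXISTS_SKETCH 0x1	/* assumed flag value, for this model only */

/* Hypothetical model: how many locks the caller currently holds on a file. */
struct lock_table_sketch {
	int held;
};

/*
 * Model of an F_UNLCK request: release everything the caller holds.
 * With FL_EXISTS_SKETCH set, report -ENOENT when nothing was freed.
 */
static int unlock_sketch(struct lock_table_sketch *t, unsigned int flags)
{
	int freed = t->held;

	assert(t->held >= 0);
	t->held = 0;
	if ((flags & FL_EXISTS_SKETCH) && freed == 0)
		return -ENOENT;
	return 0;
}

/* Caller side: an UNLOCK request is only worth sending if a lock was held. */
static int should_send_unlock_rpc(struct lock_table_sketch *t)
{
	return unlock_sketch(t, FL_EXISTS_SKETCH) != -ENOENT;
}
```

A caller that does not pass the flag keeps the old behaviour in this model: unlocking with nothing held still returns 0.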
2006-07-05Merge branch 'master' of /home/trondmy/kernel/linux-2.6/Trond Myklebust21-80/+220
2006-07-03[PATCH] uclinux: fix proc_task()/get_proc-task() namingGreg Ungerer1-1/+1
Fix changed name of proc_task() to get_proc_task(). Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03Merge git://git.infradead.org/mtd-2.6Linus Torvalds7-33/+29
* git://git.infradead.org/mtd-2.6: [JFFS2][XATTR] Fix memory leak in POSIX-ACL support fs/jffs2/: make 2 functions static [MTD] NAND: Fix broken sharpsl driver [JFFS2][XATTR] Fix xd->refcnt race condition MTD: kernel-doc fixes + additions MTD: fix all kernel-doc warnings [MTD] DOC: Fixup read functions and do a little cleanup
2006-07-03[PATCH] sched: cleanup, remove task_t, convert to struct task_structIngo Molnar1-2/+2
cleanup: remove task_t and convert all the uses to struct task_struct. I introduced it for the scheduler anno and it was a mistake. Conversion was mostly scripted, the result was reviewed and all secondary whitespace and style impact (if any) was fixed up by hand. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate blkdev nestingIngo Molnar1-15/+87
Teach special (recursive) locking code to the lock validator. Effects on non-lockdep kernels: - the introduction of the following function variants: extern struct block_device *open_partition_by_devnum(dev_t, unsigned); extern int blkdev_put_partition(struct block_device *); static int blkdev_get_whole(struct block_device *bdev, mode_t mode, unsigned flags); which on non-lockdep are the same as open_by_devnum(), blkdev_put() and blkdev_get(). - a subclass parameter to do_open(). [unused on non-lockdep] - a subclass parameter to __blkdev_put(), which is a new internal function for the main blkdev_put*() functions. [parameter unused on non-lockdep kernels, except for two sanity check WARN_ON()s] these functions carry no semantical difference - they only express object dependencies towards the lockdep subsystem. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate sb ->s_umountArjan van de Ven1-0/+1
The s_umount rwsem needs to be classified as per-superblock since it's perfectly legit to keep multiple of those recursively in the VFS locking rules. Has no effect on non-lockdep kernels. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate ->s_lockIngo Molnar1-2/+8
Teach special (per-filesystem) locking code to the lock validator. Minimal effect on non-lockdep kernels: one extra parameter to alloc_super(). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate the quota codeArjan van de Ven4-4/+4
The quota code plays interesting games with the lock ordering; to quote Jan: | i_mutex of inode containing quota file is acquired after all other | quota locks. i_mutex of all other inodes is acquired before quota | locks. Quota code makes sure (by resetting inode operations and | setting special flag on inode) that no one tries to enter quota code | while holding i_mutex on a quota file... The good news is that all of this special case i_mutex grabbing happens in the (per filesystem) low level quota write function. For this special case we need a new I_MUTEX_* nesting level, since this is just entirely outside any of the regular VFS locking rules for i_mutex. I trust Jan on his blue eyes that this is not ever going to deadlock; and based on that the patch below is what it takes to inform lockdep of these very interesting new locking rules. The new locking rule for the I_MUTEX_QUOTA nesting level is that this is the deepest possible level of nesting for i_mutex, and that this only should be used in quota write (and possibly read) function of filesystems. This makes the lock ordering of the I_MUTEX_* levels: I_MUTEX_PARENT -> I_MUTEX_CHILD -> I_MUTEX_NORMAL -> I_MUTEX_QUOTA Has no effect on non-lockdep kernels. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
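The ordering stated in this message (PARENT, then CHILD, then NORMAL, then QUOTA) can be written down as a small check. This is only a model of the rule: the enum below is hypothetical, its numeric values are chosen to encode the order, and it says nothing about how lockdep represents subclasses internally.

```c
#include <assert.h>

/* Hypothetical levels, numbered so that a larger value means deeper nesting. */
enum i_mutex_level_sketch {
	LEVEL_PARENT,
	LEVEL_CHILD,
	LEVEL_NORMAL,
	LEVEL_QUOTA,		/* deepest level: quota file i_mutex */
	NR_LEVELS_SKETCH
};

/*
 * Return non-zero if an i_mutex at level 'next' may be taken while one at
 * level 'held' is already held, according to the ordering quoted above.
 */
static int nesting_allowed(enum i_mutex_level_sketch held,
			   enum i_mutex_level_sketch next)
{
	assert(held < NR_LEVELS_SKETCH && next < NR_LEVELS_SKETCH);
	return next > held;
}
```

In this model nothing may be acquired after a LEVEL_QUOTA lock, which matches the description of it as the deepest possible level.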
2006-07-03[PATCH] lockdep: annotate NTFS locking rulesIngo Molnar2-0/+64
NTFS uses lots of type-opaque objects which acquire their true identity runtime - so the lock validator needs to be helped in a couple of places to figure out object types. Many thanks to Anton Altaparmakov for giving lots of explanations about NTFS locking rules. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate i_mutexIngo Molnar1-10/+10
Teach special (recursive) locking code to the lock validator. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate dcacheIngo Molnar1-2/+2
Teach special (recursive) locking code to the lock validator. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate direct ioIngo Molnar1-2/+4
Teach special (rwsem-in-irq) locking code to the lock validator. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: locking init debugging improvementIngo Molnar1-1/+1
Locking init improvement: - introduce and use __SPIN_LOCK_UNLOCKED for array initializations, to pass in the name string of locks, used by debugging Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] binfmt_elf: fix checks for bad addressChuck Ebbert1-8/+7
Fix check for bad address; use macro instead of open-coding two checks. Taken from RHEL4 kernel update. From: Ernie Petrides <petrides@redhat.com> For background, the BAD_ADDR() macro should return TRUE if the address is TASK_SIZE, because that's the lowest address that is *not* valid for user-space mappings. The macro was correct in binfmt_aout.c but was wrong for the "equal to" case in binfmt_elf.c. There were two in-line validations of user-space addresses in binfmt_elf.c, which have been appropriately converted to use the corrected BAD_ADDR() macro in the patch you posted yesterday. Note that the size checks against TASK_SIZE are okay as coded. The additional changes that I propose are below. These are in the error paths for bad ELF entry addresses once load_elf_binary() has already committed to exec'ing the new image (following the tearing down of the task's original address space). The 1st hunk deals with the interp-side of the outer "if". There were two problems here. The printk() should be removed because this path can be triggered at will by a bogus interpreter image created and used by a malicious user. Further, the error code should not be ENOEXEC, because that causes the loop in search_binary_handler() to continue trying other exec handlers (twice, in fact). But it's too late for this to work correctly, because the user address space has already been torn down, and an exec() failure cannot be returned to the user code because the code no longer exists. The only recovery is to force a SIGSEGV, but it's best to terminate the search loop immediately. I somewhat arbitrarily chose EINVAL as a fallback error code, but any error returned by load_elf_interp() will override that (but this value will never be seen by user-space). The 2nd hunk deals with the non-interp-side of the outer "if". There were two problems here as well. The SIGSEGV needs to be forced, because a prior sigaction() syscall might have set the associated disposition to SIG_IGN. 
And the ENOEXEC should be changed to EINVAL as described above. Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Ernie Petrides <petrides@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
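The "equal to" boundary case discussed above can be illustrated with a small userspace sketch. TASK_SIZE_SKETCH is an assumed value picked for the example, and the macro and function names are hypothetical; only the comparison itself follows the description in the message.

```c
#include <assert.h>

/* Assumed value for illustration; the real TASK_SIZE is per-architecture. */
#define TASK_SIZE_SKETCH 0xC0000000UL

/* An address is bad if it is at or above the first invalid user address. */
#define BAD_ADDR_SKETCH(x) ((unsigned long)(x) >= TASK_SIZE_SKETCH)

/* Hypothetical helper: is this ELF entry point usable for user space? */
static int entry_is_valid_sketch(unsigned long entry)
{
	return !BAD_ADDR_SKETCH(entry);
}

/* Boundary cases: TASK_SIZE itself must already count as bad. */
static void bad_addr_demo(void)
{
	assert(BAD_ADDR_SKETCH(TASK_SIZE_SKETCH));
	assert(!BAD_ADDR_SKETCH(TASK_SIZE_SKETCH - 1));
}
```

A check written with ">" instead of ">=" would accept TASK_SIZE_SKETCH itself, which is the off-by-one the message describes for binfmt_elf.c.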
2006-07-03Merge branch 'master' of /home/trondmy/kernel/linux-2.6/Trond Myklebust171-3505/+451
2006-07-02[PATCH] nfs: non-procfs build fixDominik Hackl1-2/+2
This fixes a bug in fs/nfs which makes it impossible to build nfs without having procfs enabled. Signed-off-by: Dominik Hackl <dominik@hackl.dhs.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-02[JFFS2][XATTR] Fix memory leak in POSIX-ACL supportKaiGai Kohei3-5/+4
jffs2_clear_acl(), which releases ACL caches allocated by kmalloc(), was defined but never called, so there was a risk of leaking memory. This patch plugs jffs2_clear_acl() into jffs2_do_clear_inode(), which ensures that the ACL cache is released when the inode is cleared. Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-07-01[PATCH] reiserfs: update ctime and mtime on expanding truncateVladimir Saveliev1-0/+5
Reiserfs does not update ctime and mtime on expanding truncate via truncate(). This patch fixes it. Signed-off-by: Vladimir Saveliev <vs@namesys.com> Cc: Hans Reiser <reiser@namesys.com> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-01[PATCH] ufs: truncate should allocate block for last byteEvgeniy Dushistov6-65/+204
This patch fixes buggy behaviour of UFS in the following scenario: open(, O_TRUNC...) ftruncate(, 1024) ftruncate(, 0) Such a scenario causes ufs_panic and a read-only remount. This happens because, according to the specification, UFS should always allocate a block for the last byte, and many parts of our implementation rely on this, but `ufs_truncate' doesn't take it into account. To make it possible to return an error code and to know the old size, this patch removes `truncate' from the ufs inode_operations and uses the `setattr' method to call ufs_truncate. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivialLinus Torvalds129-147/+21
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: Remove obsolete #include <linux/config.h> remove obsolete swsusp_encrypt arch/arm26/Kconfig typos Documentation/IPMI typos Kconfig: Typos in net/sched/Kconfig v9fs: do not include linux/version.h Documentation/DocBook/mtdnand.tmpl: typo fixes typo fixes: specfic -> specific typo fixes in Documentation/networking/pktgen.txt typo fixes: occuring -> occurring typo fixes: infomation -> information typo fixes: disadvantadge -> disadvantage typo fixes: aquire -> acquire typo fixes: mecanism -> mechanism typo fixes: bandwith -> bandwidth fix a typo in the RTC_CLASS help text smb is no longer maintained Manually merged trivial conflict in arch/um/kernel/vmlinux.lds.S
2006-06-30[PATCH] knfsd: nfsd: mark rqstp to prevent use of sendfile in privacy caseJ. Bruce Fields1-1/+1
Add a rq_sendfile_ok flag to svc_rqst which will be cleared in the privacy case so that the wrapping code will get copies of the read data instead of real page cache pages. This makes life simpler when we encrypt the response. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd4: fix open flag passingJ. Bruce Fields2-4/+7
Since nfsv4 actually keeps around the file descriptors it gets from open (instead of just using them for a single read or write operation), we need to make sure that we can do RDWR opens and not just RDONLY/WRONLY. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd4: fix some open argument testsJ. Bruce Fields1-4/+13
These tests always returned true; clearly that wasn't what was intended. In keeping with kernel style, make them functions instead of macros while we're at it. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd: fix misplaced fh_unlock() in nfsd_link()David M. Richter1-2/+3
In the event that lookup_one_len() fails in nfsd_link(), fh_unlock() is skipped and locks are held overlong. Patch was tested on 2.6.17-rc2 by causing lookup_one_len() to fail and verifying that fh_unlock() gets called appropriately. Signed-off-by: David M. Richter <richterd@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd4: remove superfluous grace period checksJ. Bruce Fields1-4/+0
We're checking nfs_in_grace here a few times when there isn't really any reason to--bad_stateid is probably the more sensible return value anyway. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd: call nfsd_setuser() on fh_compose(), fix nfsd4 permissions problemJ. Bruce Fields1-7/+8
In the typical v2/v3 case the only new filehandles used as arguments to operations are filehandles taken directly off the wire, which don't get dentries until fh_verify() is called. But in v4 the filehandles that are arguments to operations were often created by previous operations (putrootfh, lookup, etc.) using fh_compose, which sets the dentry in the filehandle without calling nfsd_setuser(). This also means that, for example, if filesystem B is mounted on filesystem A, and filesystem A is exported without root-squashing, then a client can bypass the rootsquashing on B using a compound that starts at a filehandle in A, crosses into B using lookups, and then does stuff in B. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd4: fix open_confirm lockingJ. Bruce Fields1-2/+3
Fix an improper unlock in an error path. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: ignore ref_fh when crossing a mountpointNeilBrown1-3/+3
nfsd tries to return to a client the same sort of filehandle as was used by the client. This removes some filehandle aliasing issues and means that a server upgrade followed by a downgrade will not confuse clients that were not restarted during that time. However when crossing a mountpoint, the filehandle used for one filesystem doesn't provide any useful information on what sort of filehandle should be used on the other, and can provide misleading information. So if the reference filehandle is on a different filesystem to the one being generated, ignore it. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: remove noise about filehandle being uptodateNeilBrown1-5/+1
There is a perfectly valid situation where fh_update gets called on an already uptodate filehandle - in nfsd_create_v3 where a CREATE_UNCHECKED finds an existing file and wants to just set the size. We could possibly optimise out the call in that case, but the only harm involved is that fh_update prints a warning, so it is easier to remove the warning. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: fixing missing 'expkey' support for fsid type 3Frank Filz1-1/+1
Type '3' is used for the fsid in filehandles when the device number of the device holding the filesystem has more than 8 bits in either major or minor. Unfortunately expkey_parse doesn't recognise type 3. Fix this. (Slightly modified from Frank's original) Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: improve the test for cross-device-rename in nfsdNeilBrown1-1/+1
Just testing the i_sb isn't really enough, at least the vfsmnt must be the same. Thanks Al. Cc: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] SELinux: Add security hook definition for getioprio and insert hooksDavid Quigley1-5/+24
Add a new security hook definition for the sys_ioprio_get operation. At present, the SELinux hook function implementation for this hook is identical to the getscheduler implementation but a separate hook is introduced to allow this check to be specialized in the future if necessary. This patch also creates a helper function get_task_ioprio which handles the access check in addition to retrieving the ioprio value for the task. Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] Light weight event countersChristoph Lameter2-6/+5
The remaining counters in page_state after the zoned VM counter patches have been applied are all just for show in /proc/vmstat. They have no essential function for the VM. We use a simple increment of per cpu variables. In order to avoid the most severe races we disable preempt. Preempt does not prevent the race between an increment and an interrupt handler incrementing the same statistics counter. However, that race is exceedingly rare, we may only lose one increment or so and there is no requirement (at least not in kernel) that the vm event counters have to be accurate. In the non preempt case this results in a simple increment for each counter. For many architectures this will be reduced by the compiler to a single instruction. This single instruction is atomic for i386 and x86_64. And therefore even the rare race condition in an interrupt is avoided for both architectures in most cases. The patchset also adds an off switch for embedded systems that allows building Linux kernels without these counters. The implementation of these counters is through inline code that hopefully results in only a single increment instruction being emitted (i386, x86_64) or in the increment being hidden through instruction concurrency (EPIC architectures such as ia64 can get that done). Benefits: - VM event counter operations usually reduce to a single inline instruction on i386 and x86_64. - No interrupt disable, only preempt disable for the preempt case. Preempt disable can also be avoided by moving the counter into a spinlock. - Handling is similar to zoned VM counters. - Simple and easily extendable. - Can be omitted to reduce memory use for embedded use. 
References: RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=113512330605497&w=2 RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=114988082814934&w=2 local_t http://marc.theaimsgroup.com/?l=linux-kernel&m=114991748606690&w=2 V2 http://marc.theaimsgroup.com/?t=115014808400007&r=1&w=2 V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767022346&w=2 V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115047968808926&w=2 Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
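The per-CPU counting scheme described in this message can be sketched in userspace as below. This is an illustration under stated assumptions, not the kernel code: NR_CPUS_SKETCH, the EV_* event names and both functions are hypothetical, and the "current CPU" is passed in explicitly because a userspace program has no per-CPU variables.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4	/* assumed CPU count for the model */

/* Hypothetical event identifiers. */
enum vm_event_sketch {
	EV_PGFAULT,
	EV_PGFREE,
	NR_EVENTS_SKETCH
};

/* One row of counters per CPU; each CPU only ever touches its own row. */
static unsigned long event_counts[NR_CPUS_SKETCH][NR_EVENTS_SKETCH];

/* Fast path: a plain increment of this CPU's slot, with no lock or atomic op. */
static void count_event_sketch(int cpu, enum vm_event_sketch ev)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	assert(ev < NR_EVENTS_SKETCH);
	event_counts[cpu][ev]++;
}

/* Slow path: a reader folds all per-CPU slots into one total. */
static unsigned long sum_event_sketch(enum vm_event_sketch ev)
{
	unsigned long total = 0;
	int cpu;

	assert(ev < NR_EVENTS_SKETCH);
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		total += event_counts[cpu][ev];
	return total;
}
```

The design choice this models is that writers stay cheap and readers pay for the summation. The total may be slightly stale or miss an increment under a race, which the message accepts for counters that are only displayed.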
2006-06-30[PATCH] zoned vm counters: conversion of nr_bounce to per zone counterChristoph Lameter1-0/+2
Conversion of nr_bounce to a per zone counter. nr_bounce is only used for proc output. So it could be left as an event counter. However, the event counters may not be accurate and nr_bounce is categorizing types of pages in a zone. So we really need this to also be a per zone counter. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: conversion of nr_unstable to per zone counterChristoph Lameter3-7/+5
Conversion of nr_unstable to a per zone counter. We need to do some special modifications to the nfs code since there are multiple cases of disposition and we need to have a page ref for proper accounting. This converts the last critical page state of the VM and therefore we need to remove several functions that were depending on GET_PAGE_STATE_LAST in order to make the kernel compile again. We are only left with event type counters in page state. [akpm@osdl.org: bugfixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: conversion of nr_writeback to per zone counterChristoph Lameter1-1/+1
Conversion of nr_writeback to per zone counter. This removes the last page_state counter from arch/i386/mm/pgtable.c so we drop the page_state from there. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: conversion of nr_dirty to per zone counterChristoph Lameter5-5/+5
This makes nr_dirty a per zone counter. Looping over all processors is avoided during writeback state determination. The counter aggregation for nr_dirty had to be undone in the NFS layer since we summed up the page counts from multiple zones. Someone more familiar with NFS should probably review what I have done. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: conversion of nr_pagetables to per zone counterChristoph Lameter1-2/+2
Conversion of nr_page_table_pages to a per zone counter [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: conversion of nr_slab to per zone counterChristoph Lameter1-1/+1
- Allows reclaim to access counter without looping over processor counts. - Allows accurate statistics on how many pages are used in a zone by the slab. This may become useful to balance slab allocations over various zones. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: split NR_ANON_PAGES off from NR_FILE_MAPPEDChristoph Lameter1-0/+2
The current NR_FILE_MAPPED is used by zone reclaim and the dirty load calculation as the number of mapped pagecache pages. However, that is not true. NR_FILE_MAPPED includes the mapped anonymous pages. This patch separates those and therefore allows an accurate tracking of the anonymous pages per zone. It then becomes possible to determine the number of unmapped pages per zone and we can avoid scanning for unmapped pages if there are none. Also it may now be possible to determine the mapped/unmapped ratio in get_dirty_limit. Isn't the number of anonymous pages irrelevant in that calculation? Note that this will change the meaning of the number of mapped pages reported in /proc/vmstat, /proc/meminfo and in the per node statistics. This may affect user space tools that monitor these counters! NR_FILE_MAPPED works like NR_FILE_DIRTY. It is only valid for pagecache pages. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30 | [PATCH] zoned vm counters: conversion of nr_pagecache to per zone counter | Christoph Lameter | 1 | -1/+2
Currently a single atomic variable is used to establish the size of the page cache in the whole machine. The zoned VM counters have the same method of implementation as the nr_pagecache code but also allow the determination of the pagecache size per zone.

Remove the special implementation for nr_pagecache and make it a zoned counter named NR_FILE_PAGES.

Updates of the page cache counters are always performed with interrupts off. We can therefore use the __ variant here.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
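One way to picture a counter that yields both the machine-wide and the per-zone page cache size is to keep a zone-local total and a global total and update both together. The sketch below is a userspace model using C11 atomics. All names (`zone_model`, `global_vm_stat`, `mod_zone_stat`, and so on) are hypothetical, and it does not model interrupt disabling or the `__` variant mentioned in the message. The kernel's actual implementation may differ in detail.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical counter indices for this model. */
enum { NR_FILE_PAGES_MODEL, NR_STAT_ITEMS_MODEL };

/* Hypothetical zone: one atomic counter per item. */
struct zone_model {
    atomic_long vm_stat[NR_STAT_ITEMS_MODEL];
};

/* Machine-wide totals, zero-initialized as a static object. */
static atomic_long global_vm_stat[NR_STAT_ITEMS_MODEL];

static void zone_model_init(struct zone_model *z)
{
    for (int i = 0; i < NR_STAT_ITEMS_MODEL; i++)
        atomic_init(&z->vm_stat[i], 0);
}

/* Apply a delta to the zone-local counter and to the global total. */
static void mod_zone_stat(struct zone_model *z, int item, long delta)
{
    assert(item >= 0 && item < NR_STAT_ITEMS_MODEL);
    atomic_fetch_add(&z->vm_stat[item], delta);
    atomic_fetch_add(&global_vm_stat[item], delta);
}

/* Page cache size of one zone. */
static long zone_stat(struct zone_model *z, int item)
{
    return atomic_load(&z->vm_stat[item]);
}

/* Page cache size of the whole machine, without visiting any zone. */
static long global_stat(int item)
{
    return atomic_load(&global_vm_stat[item]);
}
```

The global read stays as cheap as the old single atomic variable, while the per-zone read becomes available at the cost of one extra add per update.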
2006-06-30 | [PATCH] zoned vm counters: convert nr_mapped to per zone counter | Christoph Lameter | 1 | -1/+1
nr_mapped is important because it allows a determination of how many pages of a zone are not mapped, which would allow a more efficient means of determining when we need to reclaim memory in a zone.

We take the nr_mapped field out of the page state structure and define a new per zone counter named NR_FILE_MAPPED (the anonymous pages will be split off from NR_MAPPED in the next patch).

We replace the use of nr_mapped in various kernel locations. This avoids the looping over all processors in try_to_free_pages(), writeback, reclaim (swap + zone reclaim).

[akpm@osdl.org: bugfix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
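The cost difference this message refers to, summing a per-processor field on every read versus reading one per-zone counter, can be illustrated with a small userspace model. The types and functions below (`page_state_model`, `zone_model`, `read_nr_mapped_percpu`, `zone_stat_read`, `zone_stat_mod`) are hypothetical and only mimic the shape of the two schemes. They are not the kernel's data structures.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4   /* hypothetical CPU count for this model */

/* Old scheme (modelled): one nr_mapped slot per CPU, summed on every read. */
struct page_state_model {
    long nr_mapped[NR_CPUS_MODEL];
};

/* New scheme (modelled): the zone carries its own counters. */
enum zone_stat_item_model { NR_FILE_MAPPED_MODEL, NR_STAT_ITEMS_MODEL };

struct zone_model {
    long vm_stat[NR_STAT_ITEMS_MODEL];
};

/* Reading the old counter walks every CPU: O(number of CPUs). */
static long read_nr_mapped_percpu(const struct page_state_model *ps)
{
    long sum = 0;

    for (size_t cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
        sum += ps->nr_mapped[cpu];
    return sum;
}

/* Reading the per-zone counter is a single load: O(1). */
static long zone_stat_read(const struct zone_model *z,
                           enum zone_stat_item_model item)
{
    assert(item < NR_STAT_ITEMS_MODEL);
    return z->vm_stat[item];
}

/* Updates adjust the zone's counter directly. */
static void zone_stat_mod(struct zone_model *z,
                          enum zone_stat_item_model item, long delta)
{
    assert(item < NR_STAT_ITEMS_MODEL);
    z->vm_stat[item] += delta;
}
```

Hot paths such as reclaim read these values often, so replacing the loop with a single load matters more as the processor count grows. The per-zone value is also something the per-CPU sum could not provide at all.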
2006-06-30 | Remove obsolete #include <linux/config.h> | Jörn Engel | 124 | -124/+0
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 | v9fs: do not include linux/version.h | Paul Collins | 2 | -2/+0
I noticed that part of v9fs was being rebuilt when version.h changed.

Signed-off-by: Paul Collins <paul@ondioline.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 | typo fixes: aquire -> acquire | Adrian Bunk | 3 | -21/+21
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-29 | Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 | Linus Torvalds | 20 | -93/+116
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  ocfs2: remove redundant NULL checks in ocfs2_direct_IO_get_blocks()
  ocfs2: clean up some osb fields
  ocfs2: fix init of uuid_net_key
  ocfs2: silence a debug print
  ocfs2: silence ENOENT during lookup of broken links
  ocfs2: Cleanup message prints
  ocfs2: silence -EEXIST from ocfs2_extent_map_insert/lookup
  [PATCH] fs/ocfs2/dlm/dlmrecovery.c: make dlm_lockres_master_requery() static
  ocfs2: warn the user on a dead timeout mismatch
  ocfs2: OCFS2_FS must depend on SYSFS
  ocfs2: Compile-time disabling of ocfs2 debugging output.
  configfs: Clear up a few extra spaces where there should be TABs.
  configfs: Release memory in configfs_example.
2006-06-29 | ocfs2: remove redundant NULL checks in ocfs2_direct_IO_get_blocks() | Florin Malita | 1 | -8/+1
Signed-off-by: Florin Malita <fmalita@gmail.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-29 | ocfs2: clean up some osb fields | Mark Fasheh | 4 | -42/+4
Get rid of osb->uuid, osb->proc_sub_dir, and osb->osb_id. Those fields were unused, or could easily be removed. As a result, we also no longer need MAX_OSB_ID or ocfs2_globals_lock.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>