kernel/git/torvalds/linux.git - Linux kernel source tree

Age	Commit message (Collapse)	Author	Files	Lines
2006-06-30	Pull kmalloc into release branch	Len Brown	2	-3/+1

2006-06-30	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6	Linus Torvalds	1	-17/+0
	* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Kill sun4v virtual device layer. [SERIAL] sunhv: Convert to of_driver layer. [SPARC64]: Mask out top 8-bits in physical address when building resources. [SERIAL] sunsu: Missing return statement in su_probe().
2006-06-30	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds	7	-15/+35
	* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [IPV6]: Added GSO support for TCPv6 [NET]: Generalise TSO-specific bits from skb_setup_caps [IPV6]: Added GSO support for TCPv6 [IPV6]: Remove redundant length check on input [NETFILTER]: SCTP conntrack: fix crash triggered by packet without chunks [TG3]: Update version and reldate [TG3]: Add TSO workaround using GSO [TG3]: Turn on hw fix for ASF problems [TG3]: Add rx BD workaround [TG3]: Add tg3_netif_stop() in vlan functions [TCP]: Reset gso_segs if packet is dodgy
2006-06-30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial	Linus Torvalds	13	-15/+13
	* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: Remove obsolete #include <linux/config.h> remove obsolete swsusp_encrypt arch/arm26/Kconfig typos Documentation/IPMI typos Kconfig: Typos in net/sched/Kconfig v9fs: do not include linux/version.h Documentation/DocBook/mtdnand.tmpl: typo fixes typo fixes: specfic -> specific typo fixes in Documentation/networking/pktgen.txt typo fixes: occuring -> occurring typo fixes: infomation -> information typo fixes: disadvantadge -> disadvantage typo fixes: aquire -> acquire typo fixes: mecanism -> mechanism typo fixes: bandwith -> bandwidth fix a typo in the RTC_CLASS help text smb is no longer maintained Manually merged trivial conflict in arch/um/kernel/vmlinux.lds.S
2006-06-30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6/	Linus Torvalds	1	-0/+1
	* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6/: [PATCH] pcmcia: fix deadlock in pcmcia_parse_events [PATCH] com20020_cs: more device support [PATCH] au1xxx: pcmcia: fix __init called from non-init [PATCH] kill open-coded offsetof in cm4000_cs.c ZERO_DEV() [PATCH] pcmcia: convert pcmcia_cs to kthread [PATCH] pcmcia: fix kernel-doc function name [PATCH] pcmcia: hostap_cs.c - 0xc00f,0x0000 conflicts with pcnet_cs [PATCH] pcmcia: at91_cf suspend/resume/wakeup [PATCH] pcmcia: Make ide_cs work with the memory space of CF-Cards if IO space is not available [PATCH] pcmcia: TI PCIxx12 CardBus controller support [PATCH] pcmcia: warn if driver requests exclusive, but gets a shared IRQ [PATCH] pcmcia: expose tool in pcmcia/Documentation/pcmcia/ [PATCH] pcmcia: another ID for serial_cs.c [PATCH] yenta: fix hidden PCI bus numbers [PATCH] yenta: do power-up only after socket is configured
2006-06-30	Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb	Linus Torvalds	1	-0/+55
	* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: V4L/DVB (4290): Add support for the TCL M2523_3DB_E tuner. V4L/DVB (4289): Missing statement in drivers/media/dvb/frontends/cx22700.c V4L/DVB (4288): Clean out a zillion sparse warnings in pvrusb2 V4L/DVB (4287): Pvrusb2/: possible cleanups V4L/DVB (4285): Cx88: add support for Geniatech Digistar / Digiwave 103g V4L/DVB (4284): Cx24123: fix set_voltage function according to the specs V4L/DVB (4282): Fix: use swzigzag for swalgo V4L/DVB (4281): TDA9887_SET_CONFIG should only be handled by the tda9887. V4L/DVB (4277): Fix CI interface on PRO KNC1 cards V4L/DVB (4276): Fix CI on old KNC1 DVBC cards V4L/DVB (4275): The FE_SET_FRONTEND_TUNE_MODE ioctl always returns EOPNOTSUPP V4L/DVB (4274): Eliminate use of tda9887 from pvrusb2 driver V4L/DVB (4273): Always log pvrusb2 device register / unregister events V4L/DVB (4272): Fix tveeprom supported standards V4L/DVB (4270): Add tda9887-specific tuner configuration V4L/DVB (4269): Subject: videocodec: make 1-bit fields unsigned V4L/DVB (4267): Remove all instances of request_module("tda9887") V4L/DVB (4264): Cx88-blackbird: implement VIDIOC_QUERYCTRL and VIDIOC_QUERYMENU
2006-06-30	Merge branch 'release' of ↵	Linus Torvalds	15	-62/+138
	git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (25 commits) ACPI: Kconfig: ACPI_SRAT depends on ACPI ACPI: drivers/acpi/scan.c: make acpi_bus_type static ACPI: fixup memhotplug debug message ACPI: ACPICA 20060623 ACPI: C-States: only demote on current bus mastering activity ACPI: C-States: bm_activity improvements ACPI: C-States: accounting of sleep states ACPI: additional blacklist entry for ThinkPad R40e ACPI: restore comment justifying 'extra' P_LVLx access ACPI: fix battery on HP NX6125 ACPIPHP: prevent duplicate slot numbers when no _SUN ACPI: static-ize handle_hotplug_event_func() ACPIPHP: use ACPI dock driver ACPI: dock driver KEVENT: add new uevent for dock ACPI: asus_acpi_init: propagate correct return value [ACPI] Print error message if remove/install notify handler fails ACPI: delete tracing macros from drivers/acpi/*.c ACPI: HW P-state coordination support ACPI: un-export ACPI_ERROR() -- use printk(KERN_ERR...) ...
2006-06-30	[SPARC64]: Kill sun4v virtual device layer.	David S. Miller	1	-17/+0
	Replace with a simple IRQ translater in the PROM device tree builder. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30	[IPV6]: Added GSO support for TCPv6	Herbert Xu	4	-8/+9
	This patch adds GSO support for IPv6 and TCPv6. This is based on a patch by Ananda Raju <Ananda.Raju@neterion.com>. His original description is: This patch enables TSO over IPv6. Currently Linux network stacks restricts TSO over IPv6 by clearing of the NETIF_F_TSO bit from "dev->features". This patch will remove this restriction. This patch will introduce a new flag NETIF_F_TSO6 which will be used to check whether device supports TSO over IPv6. If device support TSO over IPv6 then we don't clear of NETIF_F_TSO and which will make the TCP layer to create TSO packets. Any device supporting TSO over IPv6 will set NETIF_F_TSO6 flag in "dev->features" along with NETIF_F_TSO. In case when user disables TSO using ethtool, NETIF_F_TSO will get cleared from "dev->features". So even if we have NETIF_F_TSO6 we don't get TSO packets created by TCP layer. SKB_GSO_TCPV4 renamed to SKB_GSO_TCP to make it generic GSO packet. SKB_GSO_UDPV4 renamed to SKB_GSO_UDP as UFO is not a IPv4 feature. UFO is supported over IPv6 also The following table shows there is significant improvement in throughput with normal frames and CPU usage for both normal and jumbo. -------------------------------------------------- \| \| 1500 \| 9600 \| \| ------------------\|-------------------\| \| \| thru CPU \| thru CPU \| -------------------------------------------------- \| TSO OFF \| 2.00 5.5% id \| 5.66 20.0% id \| -------------------------------------------------- \| TSO ON \| 2.63 78.0 id \| 5.67 39.0% id \| -------------------------------------------------- Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30	[NET]: Generalise TSO-specific bits from skb_setup_caps	Herbert Xu	3	-7/+20
	This patch generalises the TSO-specific bits from sk_setup_caps by adding the sk_gso_type member to struct sock. This makes sk_setup_caps generic so that it can be used by TCPv6 or UFO. The only catch is that whoever uses this must provide a GSO implementation for their protocol which I think is a fair deal :) For now UFO continues to live without a GSO implementation which is OK since it doesn't use the sock caps field at the moment. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30	[IPV6]: Added GSO support for TCPv6	Herbert Xu	1	-0/+6
	This patch adds GSO support for IPv6 and TCPv6. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30	[PATCH] pcmcia: TI PCIxx12 CardBus controller support	Alex Williamson	1	-0/+1
	The patch below adds support for the TI PCIxx12 CardBus controllers. This seems to be sufficient to detect the cardbus bridge on an HP nc6320 and works with an orinoco wifi card. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2006-06-30	V4L/DVB (4270): Add tda9887-specific tuner configuration	Hans Verkuil	1	-0/+55
	Many tda9887 settings depend on the chosen tuner. Expand the tuner parameters to include these tda9887 settings. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-30	[PATCH] knfsd: nfsd: mark rqstp to prevent use of sendfile in privacy case	J. Bruce Fields	1	-1/+3
	Add a rq_sendfile_ok flag to svc_rqst which will be cleared in the privacy case so that the wrapping code will get copies of the read data instead of real page cache pages. This makes life simpler when we encrypt the response. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] rcu: Add lock annotations to RCU locking primitives	Josh Triplett	1	-4/+20
	Add __acquire annotations to rcu_read_lock and rcu_read_lock_bh, and add __release annotations to rcu_read_unlock and rcu_read_unlock_bh. This allows sparse to detect improperly paired calls to these functions. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] Correct rtc_wkalrm comments	Andrew Victor	1	-2/+2
	This corrects the comments describing the 'enabled' and 'pending' flags in struct rtc_wkalrm of include/linux/rtc.h. Signed-off-by: Andrew Victor <andrew@sanpeople.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] uml: add __raw_writeq definition	Jeff Dike	1	-0/+5
	The x86_64 build requires a definition for __raw_writeq. Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] add smp_setup_processor_id()	Andrew Morton	1	-0/+2
	Presently, smp_processor_id() isn't necessarily set up until setup_arch(). But it's used in boot_cpu_init() and printk() and perhaps in other places, prior to setup_arch() being called. So provide a new smp_setup_processor_id() which is called before anything else, wire it up for Voyager (which boots on a CPU other than #0, and broke). Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] SELinux: Add security hook definition for getioprio and insert hooks	David Quigley	1	-0/+15
	Add a new security hook definition for the sys_ioprio_get operation. At present, the SELinux hook function implementation for this hook is identical to the getscheduler implementation but a separate hook is introduced to allow this check to be specialized in the future if necessary. This patch also creates a helper function get_task_ioprio which handles the access check in addition to retrieving the ioprio value for the task. Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] SELinux: add security hook call to kill_proc_info_as_uid	David Quigley	1	-1/+1
	This patch adds a call to the extended security_task_kill hook introduced by the prior patch to the kill_proc_info_as_uid function so that these signals can be properly mediated by security modules. It also updates the existing hook call in check_kill_permission. Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] SELinux: extend task_kill hook to handle signals sent by AIO completion	David Quigley	1	-4/+19
	This patch extends the security_task_kill hook to handle signals sent by AIO completion. In this case, the secid of the task responsible for the signal needs to be obtained and saved earlier, so a security_task_getsecid() hook is added, and then this saved value is passed subsequently to the extended task_kill hook for use in checking. Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] Light weight event counters	Christoph Lameter	1	-104/+66
	The remaining counters in page_state after the zoned VM counter patches have been applied are all just for show in /proc/vmstat. They have no essential function for the VM. We use a simple increment of per cpu variables. In order to avoid the most severe races we disable preempt. Preempt does not prevent the race between an increment and an interrupt handler incrementing the same statistics counter. However, that race is exceedingly rare, we may only loose one increment or so and there is no requirement (at least not in kernel) that the vm event counters have to be accurate. In the non preempt case this results in a simple increment for each counter. For many architectures this will be reduced by the compiler to a single instruction. This single instruction is atomic for i386 and x86_64. And therefore even the rare race condition in an interrupt is avoided for both architectures in most cases. The patchset also adds an off switch for embedded systems that allows a building of linux kernels without these counters. The implementation of these counters is through inline code that hopefully results in only a single instruction increment instruction being emitted (i386, x86_64) or in the increment being hidden though instruction concurrency (EPIC architectures such as ia64 can get that done). Benefits: - VM event counter operations usually reduce to a single inline instruction on i386 and x86_64. - No interrupt disable, only preempt disable for the preempt case. Preempt disable can also be avoided by moving the counter into a spinlock. - Handling is similar to zoned VM counters. - Simple and easily extendable. - Can be omitted to reduce memory use for embedded use. References: RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=113512330605497&w=2 RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=114988082814934&w=2 local_t http://marc.theaimsgroup.com/?l=linux-kernel&m=114991748606690&w=2 V2 http://marc.theaimsgroup.com/?t=115014808400007&r=1&w=2 V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767022346&w=2 V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115047968808926&w=2 Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] Use Zoned VM Counters for NUMA statistics	Christoph Lameter	2	-10/+17
	The numa statistics are really event counters. But they are per node and so we have had special treatment for these counters through additional fields on the pcp structure. We can now use the per zone nature of the zoned VM counters to realize these. This will shrink the size of the pcp structure on NUMA systems. We will have some room to add additional per zone counters that will all still fit in the same cacheline. Bits Prior pcp size Size after patch We can add ------------------------------------------------------------------ 64 128 bytes (16 words) 80 bytes (10 words) 48 32 76 bytes (19 words) 56 bytes (14 words) 8 (64 byte cacheline) 72 (128 byte) Remove the special statistics for numa and replace them with zoned vm counters. This has the side effect that global sums of these events now show up in /proc/vmstat. Also take the opportunity to move the zone_statistics() function from page_alloc.c into vmstat.c. Discussions: V2 http://marc.theaimsgroup.com/?t=115048227000002&r=1&w=2 Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned-vm-counters: remove read_page_state()	Andrew Morton	1	-4/+0
	No callers. Cc: Christoph Lameter <clameter@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_bounce to per zone counter	Christoph Lameter	2	-1/+1
	Conversion of nr_bounce to a per zone counter nr_bounce is only used for proc output. So it could be left as an event counter. However, the event counters may not be accurate and nr_bounce is categorizing types of pages in a zone. So we really need this to also be a per zone counter. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_unstable to per zone counter	Christoph Lameter	2	-9/+1
	Conversion of nr_unstable to a per zone counter We need to do some special modifications to the nfs code since there are multiple cases of disposition and we need to have a page ref for proper accounting. This converts the last critical page state of the VM and therefore we need to remove several functions that were depending on GET_PAGE_STATE_LAST in order to make the kernel compile again. We are only left with event type counters in page state. [akpm@osdl.org: bugfixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_writeback to per zone counter	Christoph Lameter	3	-5/+5
	Conversion of nr_writeback to per zone counter. This removes the last page_state counter from arch/i386/mm/pgtable.c so we drop the page_state from there. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_dirty to per zone counter	Christoph Lameter	2	-1/+1
	This makes nr_dirty a per zone counter. Looping over all processors is avoided during writeback state determination. The counter aggregation for nr_dirty had to be undone in the NFS layer since we summed up the page counts from multiple zones. Someone more familiar with NFS should probably review what I have done. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_pagetables to per zone counter	Christoph Lameter	2	-2/+2
	Conversion of nr_page_table_pages to a per zone counter [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_slab to per zone counter	Christoph Lameter	2	-2/+2
	- Allows reclaim to access counter without looping over processor counts. - Allows accurate statistics on how many pages are used in a zone by the slab. This may become useful to balance slab allocations over various zones. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: zone_reclaim: remove ↵	Christoph Lameter	2	-7/+0
	/proc/sys/vm/zone_reclaim_interval The zone_reclaim_interval was necessary because we were not able to determine how many unmapped pages exist in a zone. Therefore we had to scan in intervals to figure out if any pages were unmapped. With the zoned counters and NR_ANON_PAGES we now know the number of pagecache pages and the number of mapped pages in a zone. So we can simply skip the reclaim if there is an insufficient number of unmapped pages. We use SWAP_CLUSTER_MAX as the boundary. Drop all support for /proc/sys/vm/zone_reclaim_interval. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: split NR_ANON_PAGES off from NR_FILE_MAPPED	Christoph Lameter	1	-1/+2
	The current NR_FILE_MAPPED is used by zone reclaim and the dirty load calculation as the number of mapped pagecache pages. However, that is not true. NR_FILE_MAPPED includes the mapped anonymous pages. This patch separates those and therefore allows an accurate tracking of the anonymous pages per zone. It then becomes possible to determine the number of unmapped pages per zone and we can avoid scanning for unmapped pages if there are none. Also it may now be possible to determine the mapped/unmapped ratio in get_dirty_limit. Isnt the number of anonymous pages irrelevant in that calculation? Note that this will change the meaning of the number of mapped pages reported in /proc/vmstat /proc/meminfo and in the per node statistics. This may affect user space tools that monitor these counters! NR_FILE_MAPPED works like NR_FILE_DIRTY. It is only valid for pagecache pages. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_pagecache to per zone counter	Christoph Lameter	2	-46/+1
	Currently a single atomic variable is used to establish the size of the page cache in the whole machine. The zoned VM counters have the same method of implementation as the nr_pagecache code but also allow the determination of the pagecache size per zone. Remove the special implementation for nr_pagecache and make it a zoned counter named NR_FILE_PAGES. Updates of the page cache counters are always performed with interrupts off. We can therefore use the __ variant here. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: convert nr_mapped to per zone counter	Christoph Lameter	2	-2/+3
	nr_mapped is important because it allows a determination of how many pages of a zone are not mapped, which would allow a more efficient means of determining when we need to reclaim memory in a zone. We take the nr_mapped field out of the page state structure and define a new per zone counter named NR_FILE_MAPPED (the anonymous pages will be split off from NR_MAPPED in the next patch). We replace the use of nr_mapped in various kernel locations. This avoids the looping over all processors in try_to_free_pages(), writeback, reclaim (swap + zone reclaim). [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: basic ZVC (zoned vm counter) implementation	Christoph Lameter	2	-1/+137
	Per zone counter infrastructure The counters that we currently have for the VM are split per processor. The processor however has not much to do with the zone these pages belong to. We cannot tell f.e. how many ZONE_DMA pages are dirty. So we are blind to potentially inbalances in the usage of memory in various zones. F.e. in a NUMA system we cannot tell how many pages are dirty on a particular node. If we knew then we could put measures into the VM to balance the use of memory between different zones and different nodes in a NUMA system. For example it would be possible to limit the dirty pages per node so that fast local memory is kept available even if a process is dirtying huge amounts of pages. Another example is zone reclaim. We do not know how many unmapped pages exist per zone. So we just have to try to reclaim. If it is not working then we pause and try again later. It would be better if we knew when it makes sense to reclaim unmapped pages from a zone. This patchset allows the determination of the number of unmapped pages per zone. We can remove the zone reclaim interval with the counters introduced here. Futhermore the ability to have various usage statistics available will allow the development of new NUMA balancing algorithms that may be able to improve the decision making in the scheduler of when to move a process to another node and hopefully will also enable automatic page migration through a user space program that can analyse the memory load distribution and then rebalance memory use in order to increase performance. The counter framework here implements differential counters for each processor in struct zone. The differential counters are consolidated when a threshold is exceeded (like done in the current implementation for nr_pageache), when slab reaping occurs or when a consolidation function is called. Consolidation uses atomic operations and accumulates counters per zone in the zone structure and also globally in the vm_stat array. VM functions can access the counts by simply indexing a global or zone specific array. The arrangement of counters in an array also simplifies processing when output has to be generated for /proc/. Counters can be updated by calling inc/dec_zone_page_state or _inc/dec_zone_page_state analogous to _page_state. The second group of functions can be called if it is known that interrupts are disabled. Special optimized increment and decrement functions are provided. These can avoid certain checks and use increment or decrement instructions that an architecture may provide. We also add a new CONFIG_DMA_IS_NORMAL that signifies that an architecture can do DMA to all memory and therefore ZONE_NORMAL will not be populated. This is only currently set for IA64 SGI SN2 and currently only affects node_page_state(). In the best case node_page_state can be reduced to retrieving a single counter for the one zone on the node. [akpm@osdl.org: cleanups] [akpm@osdl.org: export vm_stat[] for filesystems] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: create vmstat.c/.h from page_alloc.c/.h	Christoph Lameter	4	-141/+151
	NOTE: ZVC are not the lightweight event counters. ZVCs are reliable whereas event counters do not need to be. Zone based VM statistics are necessary to be able to determine what the state of memory in one zone is. In a NUMA system this can be helpful for local reclaim and other memory optimizations that may be able to shift VM load in order to get more balanced memory use. It is also useful to know how the computing load affects the memory allocations on various zones. This patchset allows the retrieval of that data from userspace. The patchset introduces a framework for counters that is a cross between the existing page_stats --which are simply global counters split per cpu-- and the approach of deferred incremental updates implemented for nr_pagecache. Small per cpu 8 bit counters are added to struct zone. If the counter exceeds certain thresholds then the counters are accumulated in an array of atomic_long in the zone and in a global array that sums up all zone values. The small 8 bit counters are next to the per cpu page pointers and so they will be in high in the cpu cache when pages are allocated and freed. Access to VM counter information for a zone and for the whole machine is then possible by simply indexing an array (Thanks to Nick Piggin for pointing out that approach). The access to the total number of pages of various types does no longer require the summing up of all per cpu counters. Benefits of this patchset right now: - Ability for UP and SMP configuration to determine how memory is balanced between the DMA, NORMAL and HIGHMEM zones. - loops over all processors are avoided in writeback and reclaim paths. We can avoid caching the writeback information because the needed information is directly accessible. - Special handling for nr_pagecache removed. - zone_reclaim_interval vanishes since VM stats can now determine when it is worth to do local reclaim. - Fast inline per node page state determination. - Accurate counters in /sys/devices/system/node/node*/meminfo. Current counters are counting simply which processor allocated a page somewhere and guestimate based on that. So the counters were not useful to show the actual distribution of page use on a specific zone. - The swap_prefetch patch requires per node statistics in order to figure out when processors of a node can prefetch. This patch provides some of the needed numbers. - Detailed VM counters available in more /proc and /sys status files. References to earlier discussions: V1 http://marc.theaimsgroup.com/?l=linux-kernel&m=113511649910826&w=2 V2 http://marc.theaimsgroup.com/?l=linux-kernel&m=114980851924230&w=2 V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115014697910351&w=2 V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767318740&w=2 Performance tests with AIM7 did not show any regressions. Seems to be a tad faster even. Tested on ia64/NUMA. Builds fine on i386, SMP / UP. Includes fixes for s390/arm/uml arch code. This patch: Move counter code from page_alloc.c/page-flags.h to vmstat.c/h. Create vmstat.c/vmstat.h by separating the counter code and the proc functions. Move the vm_stat_text array before zoneinfo_show. [akpm@osdl.org: s390 build fix] [akpm@osdl.org: HOTPLUG_CPU build fix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	Remove obsolete #include <linux/config.h>	Jörn Engel	2	-2/+0
	Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30	typo fixes: specfic -> specific	Adrian Bunk	1	-1/+1
	Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30	typo fixes: occuring -> occurring	Adrian Bunk	2	-2/+2
	Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30	typo fixes: infomation -> information	Adrian Bunk	2	-3/+3
	Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30	typo fixes: disadvantadge -> disadvantage	Adrian Bunk	1	-1/+1
	Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30	typo fixes: mecanism -> mechanism	Adrian Bunk	4	-4/+4
	Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30	typo fixes: bandwith -> bandwidth	Adrian Bunk	1	-2/+2
	Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30	ACPI: delete acpi_os_free(), use kfree() directly	Len Brown	2	-3/+1
	Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-29	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds	32	-26/+94
	* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits) [TIPC]: Initial activation message now includes TIPC version number [TIPC]: Improve response to requests for node/link information [TIPC]: Fixed skb_under_panic caused by tipc_link_bundle_buf [IrDA]: Fix the AU1000 FIR dependencies [IrDA]: Fix RCU lock pairing on error path [XFRM]: unexport xfrm_state_mtu [NET]: make skb_release_data() static [NETFILTE] ipv4: Fix typo (Bugzilla #6753) [IrDA]: MCS7780 usb_driver struct should be static [BNX2]: Turn off link during shutdown [BNX2]: Use dev_kfree_skb() instead of the _irq version [ATM]: basic sysfs support for ATM devices [ATM]: [suni] change suni_init to __devinit [ATM]: [iphase] should be __devinit not __init [ATM]: [idt77105] should be __devinit not __init [BNX2]: Add NETIF_F_TSO_ECN [NET]: Add ECN support for TSO [AF_UNIX]: Datagram getpeersec [NET]: Fix logical error in skb_gso_ok [PKT_SCHED]: PSCHED_TADD() and PSCHED_TADD2() can result,tv_usec >= 1000000 ...
2006-06-29	[NET]: make skb_release_data() static	Adrian Bunk	1	-1/+0
	skb_release_data() no longer has any users in other files. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[ATM]: basic sysfs support for ATM devices	Roman Kagan	1	-1/+3
	Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[NET]: Add ECN support for TSO	Michael Chan	4	-4/+9
	In the current TSO implementation, NETIF_F_TSO and ECN cannot be turned on together in a TCP connection. The problem is that most hardware that supports TSO does not handle CWR correctly if it is set in the TSO packet. Correct handling requires CWR to be set in the first packet only if it is set in the TSO header. This patch adds the ability to turn on NETIF_F_TSO and ECN using GSO if necessary to handle TSO packets with CWR set. Hardware that handles CWR correctly can turn on NETIF_F_TSO_ECN in the dev-> features flag. All TSO packets with CWR set will have the SKB_GSO_TCPV4_ECN set. If the output device does not have the NETIF_F_TSO_ECN feature set, GSO will split the packet up correctly with CWR only set in the first segment. With help from Herbert Xu <herbert@gondor.apana.org.au>. Since ECN can always be enabled with TSO, the SOCK_NO_LARGESEND sock flag is completely removed. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[AF_UNIX]: Datagram getpeersec	Catherine Zhang	23	-0/+44
	This patch implements an API whereby an application can determine the label of its peer's Unix datagram sockets via the auxiliary data mechanism of recvmsg. Patch purpose: This patch enables a security-aware application to retrieve the security context of the peer of a Unix datagram socket. The application can then use this security context to determine the security context for processing on behalf of the peer who sent the packet. Patch design and implementation: The design and implementation is very similar to the UDP case for INET sockets. Basically we build upon the existing Unix domain socket API for retrieving user credentials. Linux offers the API for obtaining user credentials via ancillary messages (i.e., out of band/control messages that are bundled together with a normal message). To retrieve the security context, the application first indicates to the kernel such desire by setting the SO_PASSSEC option via getsockopt. Then the application retrieves the security context using the auxiliary data mechanism. An example server application for Unix datagram socket should look like this: toggle = 1; toggle_len = sizeof(toggle); setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, &toggle_len); recvmsg(sockfd, &msg_hdr, 0); if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) { cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr); if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) && cmsg_hdr->cmsg_level == SOL_SOCKET && cmsg_hdr->cmsg_type == SCM_SECURITY) { memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext)); } } sock_setsockopt is enhanced with a new socket option SOCK_PASSSEC to allow a server socket to receive security context of the peer. Testing: We have tested the patch by setting up Unix datagram client and server applications. We verified that the server can retrieve the security context using the auxiliary data mechanism of recvmsg. Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com> Acked-by: Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[NET]: Fix logical error in skb_gso_ok	Herbert Xu	1	-2/+2
	The test in skb_gso_ok is backwards. Noticed by Michael Chan <mchan@broadcom.com>. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[PKT_SCHED]: PSCHED_TADD() and PSCHED_TADD2() can result,tv_usec >= 1000000	Shuya MAEDA	1	-6/+12
	Signed-off-by: Shuya MAEDA <maeda-sxb@necst.nec.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[NETLINK]: Encapsulate eff_cap usage within security framework.	Darrel Goeddel	1	-6/+7
	This patch encapsulates the usage of eff_cap (in netlink_skb_params) within the security framework by extending security_netlink_recv to include a required capability parameter and converting all direct usage of eff_caps outside of the lsm modules to use the interface. It also updates the SELinux implementation of the security_netlink_send and security_netlink_recv hooks to take advantage of the sid in the netlink_skb_params struct. This also enables SELinux to perform auditing of netlink capability checks. Please apply, for 2.6.18 if possible. Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[NET]: Added GSO header verification	Herbert Xu	4	-8/+19
	When GSO packets come from an untrusted source (e.g., a Xen guest domain), we need to verify the header integrity before passing it to the hardware. Since the first step in GSO is to verify the header, we can reuse that code by adding a new bit to gso_type: SKB_GSO_DODGY. Packets with this bit set can only be fed directly to devices with the corresponding bit NETIF_F_GSO_ROBUST. If the device doesn't have that bit, then the skb is fed to the GSO engine which will allow the packet to be sent to the hardware if it passes the header check. This patch changes the sg flag to a full features flag. The same method can be used to implement TSO ECN support. We simply have to mark packets with CWR set with SKB_GSO_ECN so that only hardware with a corresponding NETIF_F_TSO_ECN can accept them. The GSO engine can either fully segment the packet, or segment the first MTU and pass the rest to the hardware for further segmentation. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	merge linus into release branch	Len Brown	170	-845/+2537
	Conflicts: drivers/acpi/acpi_memhotplug.c
2006-06-29	Pull acpica into release branch	Len Brown	10	-59/+113

2006-06-29	[SPARC]: sparc32 side of of_device layer IRQ resolution.	David S. Miller	1	-1/+4
	Happily, life is much simpler on 32-bit sparc systems. The "intr" property, preferred over the "interrupts" property is used-as. Some minor translations of this value happen on sun4d systems. The stage is now set to rewrite the sparc serial driver probing to use the of_driver framework, and then to convert all SBUS, EBUS, and ISA drivers in-kind so that we can nuke all those special bus frameworks. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[SPARC64]: of_device layer IRQ resolution	David S. Miller	3	-2/+14
	Do IRQ determination generically by parsing the PROM properties, and using IRQ controller drivers for final resolution. One immediate positive effect is that all of the IRQ frobbing in the EBUS, ISA, and PCI controller layers has been eliminated. We just look up the of_device and use the properly computed value. The PCI controller irq_build() routines are gone and no longer used. Unfortunately sbus_build_irq() has to remain as there is a direct reference to this in the sunzilog driver. That can be killed off once the sparc32 side of this is written and the sunzilog driver is transformed into an "of" bus driver. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[SPARC]: Kill interrupt stuff and linux_phandle from device_node.	David S. Miller	2	-16/+0
	Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[SPARC]: Add of_io{remap,unmap}().	David S. Miller	2	-0/+6
	Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[SPARC]: Beginnings of generic of_device framework.	David S. Miller	4	-4/+28
	The idea is to fully construct the device register and interrupt values into these of_device objects, and convert all of SBUS, EBUS, ISA drivers to use this new stuff. Much ideas and code taken from Ben H.'s powerpc work. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[SPARC]: Add of_n_{addr,size}_cells().	David S. Miller	2	-0/+4
	Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	[SPARC64]: Kill starfire_cookie from SBUS/PCI.	David S. Miller	3	-4/+1
	Totally unused. We need to traverse the list of global IRQ translaters, so storing it in the per-bus structures was useless. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/devfs-2.6	Linus Torvalds	11	-115/+10
	* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/devfs-2.6: (22 commits) [PATCH] devfs: Remove it from the feature_removal.txt file [PATCH] devfs: Last little devfs cleanups throughout the kernel tree. [PATCH] devfs: Rename TTY_DRIVER_NO_DEVFS to TTY_DRIVER_DYNAMIC_DEV [PATCH] devfs: Remove the tty_driver devfs_name field as it's no longer needed [PATCH] devfs: Remove the line_driver devfs_name field as it's no longer needed [PATCH] devfs: Remove the videodevice devfs_name field as it's no longer needed [PATCH] devfs: Remove the gendisk devfs_name field as it's no longer needed [PATCH] devfs: Remove the miscdevice devfs_name field as it's no longer needed [PATCH] devfs: Remove the devfs_fs_kernel.h file from the tree [PATCH] devfs: Remove devfs_remove() function from the kernel tree [PATCH] devfs: Remove devfs_mk_cdev() function from the kernel tree [PATCH] devfs: Remove devfs_mk_bdev() function from the kernel tree [PATCH] devfs: Remove devfs_mk_symlink() function from the kernel tree [PATCH] devfs: Remove devfs_mk_dir() function from the kernel tree [PATCH] devfs: Remove devfs_*_tape() functions from the kernel tree [PATCH] devfs: Remove devfs support from the sound subsystem [PATCH] devfs: Remove devfs support from the ide subsystem. [PATCH] devfs: Remove devfs support from the serial subsystem [PATCH] devfs: Remove devfs from the init code [PATCH] devfs: Remove devfs from the partition code ...
2006-06-29	Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus	Linus Torvalds	12	-37/+52
	* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (33 commits) [MIPS] Add missing backslashes to macro definitions. [MIPS] Death list of board support to be removed after 2.6.18. [MIPS] Remove BSD and Sys V compat data types. [MIPS] ioc3.h: Uses u8, so include <linux/types.h>. [MIPS] 74K: Assume it will also have an AR bit in config7 [MIPS] Treat CPUs with AR bit as physically indexed. [MIPS] Oprofile: Support VSMP on 34K. [MIPS] MIPS32/MIPS64 S-cache fix and cleanup [MIPS] excite: PCI makefile needs to use += if it wants a chance to work. [MIPS] excite: plat_setup -> plat_mem_setup. [MIPS] au1xxx: export dbdma functions [MIPS] au1xxx: dbdma, no sleeping under spin_lock [MIPS] au1xxx: fix PSC_SMBTXRX_RSR. [MIPS] Early printk for IP27. [MIPS] Fix handling of 0 length I & D caches. [MIPS] Typo fixes. [MIPS] MIPS32/MIPS64 secondary cache management [MIPS] Fix FIXADDR_TOP for TX39/TX49. [MIPS] Remove first timer interrupt setup in wrppmc_timer_setup() [MIPS] Fix configuration of R2 CPU features and multithreading. ...
2006-06-29	[MIPS] Add missing backslashes to macro definitions.	Ralf Baechle	1	-2/+2
	Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-06-29	[MIPS] Remove BSD and Sys V compat data types.	Ralf Baechle	1	-5/+5
	Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-06-29	[MIPS] ioc3.h: Uses u8, so include <linux/types.h>.	Ralf Baechle	1	-0/+2
	Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-06-29	[MIPS] au1xxx: fix PSC_SMBTXRX_RSR.	Domen Puncer	1	-1/+1
	Signed-off-by: Domen Puncer <domen.puncer@ultra.si> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-06-29	[MIPS] Fix FIXADDR_TOP for TX39/TX49.	Atsushi Nemoto	1	-0/+4
	FIXADDR_TOP is used for HIGHMEM and for upper limit of vmalloc area on 32bit kernel. TX39XX and TX49XX have "reserved" segment in CKSEG3 area. 0xff000000-0xff3fffff on TX49XX and 0xff000000-0xfffeffff on TX39XX are reserved (unmapped, uncached) therefore can not be used as mapped area. Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-06-29	[MIPS] Fix configuration of R2 CPU features and multithreading.	Ralf Baechle	1	-12/+8
	Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-06-29	elf-em.h: Define and explain both EM_MIPS_RS3_LE and EM_MIPS_RS4_BE.	Ralf Baechle	1	-0/+5
	They have been obsoleted by the ELF header EI_CLASS and EI_DATA fields in combination with e_flags. Afaics EM_MIPS_RS3_LE and EM_MIPS_RS4_BE never had any practical relevance. Binutils will not produce such binaries and the kernel will not accept them as MIPS binaries. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-06-29	[MIPS] Fix use of ehb instruction for non-R2 configurations.	Ralf Baechle	3	-11/+12
	Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-06-29	[MIPS] Define ARCH_HAS_IRQ_PER_CPU for all SMP systems.	Ralf Baechle	2	-6/+4
	Without SMTC on non-Malta will blow up. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-06-29	[MIPS] Wire up tee(2).	Ralf Baechle	1	-6/+9
	Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-06-29	Merge master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa	Linus Torvalds	3	-13/+26
	* master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa: [ALSA] echoaudio - Remove kfree_nocheck() [ALSA] echoaudio - Fix Makefile [ALSA] Add Intel D965 board support [ALSA] Fix/add support of Realtek ALC883 / ALC888 and ALC861 codecs [ALSA] Fix a typo in echoaudio/midi.c [ALSA] snd-aoa: enable dual-edge in GPIOs [ALSA] snd-aoa: support iMac G5 iSight [ALSA] snd-aoa: not experimental [ALSA] Add echoaudio sound drivers [ALSA] ak4xxx-adda - Code clean-up [ALSA] Remove CONFIG_EXPERIMENTAL from intel8x0m driver [ALSA] Stereo controls for M-Audio Revolution cards [ALSA] Fix misuse of __list_add() in seq_ports.c [ALSA] hda-codec - Add model entry for Samsung X60 Chane [ALSA] make CONFIG_SND_DYNAMIC_MINORS non-experimental [ALSA] Fix wrong dependencies of snd-aoa driver [ALSA] fix build failure due to snd-aoa [ALSA] AD1888 mixer controls for DC mode [ALSA] Suppress irq handler mismatch messages in ALSA ISA drivers [ALSA] usb-audio support for Turtle Beach Roadie
2006-06-29	Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc	Linus Torvalds	13	-59/+663
	* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (43 commits) [POWERPC] Use little-endian bit from firmware ibm,pa-features property [POWERPC] Make sure smp_processor_id works very early in boot [POWERPC] U4 DART improvements [POWERPC] todc: add support for Time-Of-Day-Clock [POWERPC] Make lparcfg.c work when both iseries and pseries are selected [POWERPC] Fix idr locking in init_new_context [POWERPC] mpc7448hpc2 (taiga) board config file [POWERPC] Add tsi108 pci and platform device data register function [POWERPC] Add general support for mpc7448hpc2 (Taiga) platform [POWERPC] Correct the MAX_CONTEXT definition powerpc: minor cleanups for mpc86xx [POWERPC] Make sure we select CONFIG_NEW_LEDS if ADB_PMU_LED is set [POWERPC] Simplify the code defining the 64-bit CPU features [POWERPC] powerpc: kconfig warning fix [POWERPC] Consolidate some of kernel/misc*.S [POWERPC] Remove unused function call_with_mmu_off [POWERPC] update asm-powerpc/time.h [POWERPC] Clean up it_lp_queue.h [POWERPC] Skip the "copy down" of the kernel if it is already at zero. [POWERPC] Add the use of the firmware soft-reset-nmi to kdump. ...
2006-06-29	Merge master.kernel.org:/pub/scm/linux/kernel/git/kyle/parisc-2.6	Linus Torvalds	8	-35/+61
	* master.kernel.org:/pub/scm/linux/kernel/git/kyle/parisc-2.6: (23 commits) [PARISC] Move os_id_to_string() inside #ifndef __ASSEMBLY__ [PARISC] Fix do_gettimeofday() hang [PARISC] Fix PCREL22F relocation problem for most modules [PARISC] Refactor show_regs in traps.c [PARISC] Add os_id_to_string helper [PARISC] OS_ID_LINUX == 0x0006 [PARISC] Ensure Space ID hashing is turned off [PARISC] Match show_cache_info with reality [PARISC] Remove unused macro fixup_branch in syscall.S [PARISC] Add is_compat_task() helper [PARISC] Update Thibaut Varene's CREDITS entry [PARISC] Reduce data footprint in pdc_stable.c [PARISC] pdc_stable version 0.30 [PARISC] Work around machines which do not support chassis warnings [PARISC] PDC_CHASSIS is implemented on all machines [PARISC] Remove unconditional #define PIC in syscall macros [PARISC] Use MFIA in current_text_addr on pa2.0 processors [PARISC] Remove dead function pc_in_user_space [PARISC] Test ioc_needs_fdc variable instead of open coding [PARISC] Fix gcc 4.1 warnings in sba_iommu.c ...
2006-06-29	Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6	Linus Torvalds	6	-49/+12
	* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (28 commits) [S390] rework of channel measurement facility. [S390] appldata enhancements. [S390] Add vmpanic parameter. [S390] add PAV support to the dasd driver. [S390] remove export of sys_call_table [S390] remove unused macros from binfmt_elf32.c [S390] fix duplicate export of overflow{ug}id [S390] cio chpid offline. [S390] avenrun export in appdata_base.c Convert s390_collect_crw_info() in s390mach.c from being started [S390] dasd eer data format. [S390] preempt_count initialization. [S390] head.S code moving. [S390] dasd whitespace and other cosmetics. [S390] virtual cpu accounting vs. machine checks. [S390] add __cpuinit to appldata cpu hotplug notifier. [S390] dasd_eckd_dump_sense bug. [S390] missing check in dasd_eer_open. [S390] modular 3270 driver. [S390] console_unblank woes. ...
2006-06-29	Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6	Linus Torvalds	7	-22/+38
	* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: [PATCH] i386: export memory more than 4G through /proc/iomem [PATCH] 64bit Resource: finally enable 64bit resource sizes [PATCH] 64bit Resource: convert a few remaining drivers to use resource_size_t where needed [PATCH] 64bit resource: change pnp core to use resource_size_t [PATCH] 64bit resource: change pci core and arch code to use resource_size_t [PATCH] 64bit resource: change resource core to use resource_size_t [PATCH] 64bit resource: introduce resource_size_t for the start and end of struct resource [PATCH] 64bit resource: fix up printks for resources in misc drivers [PATCH] 64bit resource: fix up printks for resources in arch and core code [PATCH] 64bit resource: fix up printks for resources in pcmcia drivers [PATCH] 64bit resource: fix up printks for resources in video drivers [PATCH] 64bit resource: fix up printks for resources in ide drivers [PATCH] 64bit resource: fix up printks for resources in mtd drivers [PATCH] 64bit resource: fix up printks for resources in pci core and hotplug drivers [PATCH] 64bit resource: fix up printks for resources in networks drivers [PATCH] 64bit resource: fix up printks for resources in sound drivers [PATCH] 64bit resource: C99 changes for struct resource declarations Fixed up trivial conflict in drivers/ide/pci/cmd64x.c (the printk that was changed by the 64-bit resources had been deleted in the meantime ;)
2006-06-29	[PATCH] genirq: add chip->eoi(), fastack -> fasteoi	Ingo Molnar	1	-2/+4
	Clean up the fastack concept by turning it into fasteoi and introducing the ->eoi() method for chips. This also allows the cleanup of an i386 EOI quirk - now the quirk is cleanly separated from the pure ACK implementation. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: add IRQ_TYPE_SENSE_MASK	Benjamin Herrenschmidt	1	-0/+1
	Add a #define for the mask of the part of IRQ_TYPE that represents the trigger type. I use that in my in-progress work as I've standardized the way the irq description in the firmware device-tree get translated to linux useable things by using those constants. Having this mask to isolate the "trigger type" part of the flags is useful in a few places. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: add irq-wake (power-management) support	Thomas Gleixner	1	-0/+14
	Enable platforms to set the irq-wake (power-management) properties of an IRQ. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: add irq-chip support	Thomas Gleixner	1	-0/+11
	Enable platforms to use the irq-chip and irq-flow abstractions: allow setting of the chip, the type and provide highlevel handlers for common irq-flows. [rostedt@goodmis.org: misroute-irq: Don't call desc->chip->end because of edge interrupts] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq MSI fixes	Ingo Molnar	1	-11/+16
	This is a fixed up and cleaned up replacement for genirq-msi-fixes.patch, which should solve the i386 4KSTACKS problem. I also added Ben's idea of pushing the __do_IRQ() check into generic_handle_irq(). I booted this with MSI enabled, but i only have MSI devices, not MSI-X devices. I'd still expect MSI-X to work now. irqchip migration helper: call __do_IRQ() if a descriptor is attached to an irqtype-style controller. This also fixes MSI-X IRQ handling on i386 and x86_64. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: core	Thomas Gleixner	1	-19/+141
	Core genirq support: add the irq-chip and irq-flow abstractions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: add IRQ_NOAUTOEN support	Thomas Gleixner	1	-0/+1
	Enable platforms to disable the automatic enabling of freshly set up irqs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: add IRQ_NOREQUEST support	Thomas Gleixner	1	-0/+1
	Enable platforms to disable request_irq() for certain interrupts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: add IRQ_NOPROBE support	Thomas Gleixner	1	-0/+1
	Introduce IRQ_NOPROBE: enables platforms to control chip-probing. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: add genirq sw IRQ-retrigger	Thomas Gleixner	1	-0/+3
	Enable platforms that do not have a hardware-assisted hardirq-resend mechanism to resend them via a softirq-driven IRQ emulation mechanism. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: doc: comment include/linux/irq.h structures	Ingo Molnar	1	-7/+37
	Better document the hw_interrupt_type and irq_desc structures. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: add ->retrigger() irq op to consolidate hw_irq_resend()	Ingo Molnar	15	-63/+10
	Add ->retrigger() irq op to consolidate hw_irq_resend() implementations. (Most architectures had it defined to NOP anyway.) NOTE: ia64 needs testing. i386 and x86_64 tested. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: cleanup: turn ARCH_HAS_IRQ_PER_CPU into CONFIG_IRQ_PER_CPU	Ingo Molnar	6	-27/+1
	Cleanup: change ARCH_HAS_IRQ_PER_CPU into a Kconfig method. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: cleanup: merge pending_irq_cpumask[] into irq_desc[]	Ingo Molnar	1	-1/+1
	Consolidation: remove the pending_irq_cpumask[NR_IRQS] array and move it into the irq_desc[NR_IRQS].pending_mask field. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: cleanup: merge irq_dir[], smp_affinity_entry[] into irq_desc[]	Ingo Molnar	1	-0/+5
	Consolidation: remove the irq_dir[NR_IRQS] and the smp_affinity_entry[NR_IRQS] arrays and move them into the irq_desc[] array. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: cleanup: include/linux/irq.h	Ingo Molnar	1	-28/+28
	Small cleanups in include/linux/irq.h. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: cleanup: reduce irq_desc_t use, mark it obsolete	Ingo Molnar	1	-5/+13
	Cleanup: remove irq_desc_t use from the generic IRQ code, and mark it obsolete. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: cleanup: misc code cleanups	Ingo Molnar	1	-22/+32
	Assorted code cleanups to the generic IRQ code. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: cleanup: remove fastcall	Ingo Molnar	1	-3/+7
	Now that i386 defaults to regparm, explicit uses of fastcall are not needed anymore. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: cleanup: remove irq_descp()	Ingo Molnar	1	-7/+0
	Cleanup: remove irq_descp() - explicit use of irq_desc[] is shorter and more readable. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: cleanup: merge irq_affinity[] into irq_desc[]	Ingo Molnar	1	-3/+4
	Consolidation: remove the irq_affinity[NR_IRQS] array and move it into the irq_desc[NR_IRQS].affinity field. [akpm@osdl.org: sparc64 build fix] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] genirq: rename desc->handler to desc->chip	Ingo Molnar	2	-8/+8
	This patch-queue improves the generic IRQ layer to be truly generic, by adding various abstractions and features to it, without impacting existing functionality. While the queue can be best described as "fix and improve everything in the generic IRQ layer that we could think of", and thus it consists of many smaller features and lots of cleanups, the one feature that stands out most is the new 'irq chip' abstraction. The irq-chip abstraction is about describing and coding and IRQ controller driver by mapping its raw hardware capabilities [and quirks, if needed] in a straightforward way, without having to think about "IRQ flow" (level/edge/etc.) type of details. This stands in contrast with the current 'irq-type' model of genirq architectures, which 'mixes' raw hardware capabilities with 'flow' details. The patchset supports both types of irq controller designs at once, and converts i386 and x86_64 to the new irq-chip design. As a bonus side-effect of the irq-chip approach, chained interrupt controllers (master/slave PIC constructs, etc.) are now supported by design as well. The end result of this patchset intends to be simpler architecture-level code and more consolidation between architectures. We reused many bits of code and many concepts from Russell King's ARM IRQ layer, the merging of which was one of the motivations for this patchset. This patch: rename desc->handler to desc->chip. Originally i did not want to do this, because it's a big patch. But having both "desc->handler", "desc->handle_irq" and "action->handler" caused a large degree of confusion and made the code appear alot less clean than it truly is. I have also attempted a dual approach as well by introducing a desc->chip alias - but that just wasnt robust enough and broke frequently. So lets get over with this quickly. The conversion was done automatically via scripts and converts all the code in the kernel. This renaming patch is the first one amongst the patches, so that the remaining patches can stay flexible and can be merged and split up without having some big monolithic patch act as a merge barrier. [akpm@osdl.org: build fix] [akpm@osdl.org: another build fix] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] i4l: remove unneeded include/linux/isdn/tpam.h	Karsten Keil	1	-55/+0
	The TPAM isdn driver was removed in 2.6.12, but include/linux/isdn/tpam.h was missed. Signed-off-by: Karsten Keil <kkeil@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] Keys: Allow in-kernel key requestor to pass auxiliary data to upcaller	David Howells	1	-1/+7
	The proposed NFS key type uses its own method of passing key requests to userspace (upcalling) rather than invoking /sbin/request-key. This is because the responsible userspace daemon should already be running and will be contacted through rpc_pipefs. This patch permits the NFS filesystem to pass auxiliary data to the upcall operation (struct key_type::request_key) so that the upcaller can use a pre-existing communications channel more easily. Signed-off-by: David Howells <dhowells@redhat.com> Acked-By: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[PATCH] fix sgivwfb compile	Adrian Bunk	1	-0/+3
	drivers/built-in.o: In function `sgivwfb_set_par': sgivwfb.c:(.text+0x88583): undefined reference to `sgivwfb_mem_phys' sgivwfb.c:(.text+0x88596): undefined reference to `sgivwfb_mem_phys' sgivwfb.c:(.text+0x885a8): undefined reference to `sgivwfb_mem_phys' drivers/built-in.o: In function `sgivwfb_check_var': sgivwfb.c:(.text+0x88ad0): undefined reference to `sgivwfb_mem_size' drivers/built-in.o: In function `sgivwfb_mmap': sgivwfb.c:(.text+0x88c75): undefined reference to `sgivwfb_mem_size' sgivwfb.c:(.text+0x88c7f): undefined reference to `sgivwfb_mem_phys' drivers/built-in.o: In function `sgivwfb_probe': sgivwfb.c:(.init.text+0x4060): undefined reference to `sgivwfb_mem_size' sgivwfb.c:(.init.text+0x4065): undefined reference to `sgivwfb_mem_phys' sgivwfb.c:(.init.text+0x4076): undefined reference to `sgivwfb_mem_phys' sgivwfb.c:(.init.text+0x409c): undefined reference to `sgivwfb_mem_size' sgivwfb.c:(.init.text+0x410e): undefined reference to `sgivwfb_mem_size' sgivwfb.c:(.init.text+0x4113): undefined reference to `sgivwfb_mem_phys' sgivwfb.c:(.init.text+0x4162): undefined reference to `sgivwfb_mem_size' sgivwfb.c:(.init.text+0x4168): undefined reference to `sgivwfb_mem_phys' make: *** [.tmp_vmlinux1] Error 1 Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29	[S390] rework of channel measurement facility.	Cornelia Huck	1	-4/+0
	Fixes for several channel measurement facility bugs: * Blocks copied from the hardware might not be consistent. Solve this by moving the copying into idle state and repeating the copying. * avg_sample_interval changed with every read, even though no new block was available. Solve this by storing a timestamp when the last new block was received. * Several locking issues. * Measurements were not reenabled after a disconnected device became available again. * Remove #defines for ioctls that were never implemented. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-06-29	[S390] add PAV support to the dasd driver.	Horst Hummel	1	-3/+5
	Add support for parallel-access-volumes to the dasd driver. This allows concurrent access to dasd devices with multiple channel programs. Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-06-29	[S390] preempt_count initialization.	Heiko Carstens	1	-0/+1
	The preempt_count in the thread_info structure must be initialized to 1. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-06-29	[S390] __syscall_return error check.	Heiko Carstens	1	-3/+1
	Fix __syscall_return macro: valid error numbers are in the range of -1..-4095. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-06-29	[S390] cio async subchannel reprobe.	Peter Oberparleiter	1	-0/+2
	Changes in the DASD driver require an asynchronous implementation of the subchannel reprobe loop. This loop was so far only used by the blacklisting mechanism but is now available to all CCW device drivers. Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-06-29	[S390] cleanup bitops.h.	Heiko Carstens	1	-39/+3
	Encapsulate complete bitops.h with #ifdef __KERNEL__ and remove the now superfluous ALIGN_CS define and its users. This patch is needed for compiling klibc. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-06-29	[POWERPC] todc: add support for Time-Of-Day-Clock	Mark A. Greer	1	-0/+487
	This is a resubmit with a proper subject and with all comments addressed. Applies cleanly to powerpc.git 649e85797259162f7fdc696420e7492f20226f2d Mark -- The todc code from arch/ppc supports many todc/rtc chips and is needed in arch/powerpc. This patch adds the todc code to arch/powerpc. Signed-off-by: Mark A. Greer <mgreer@mvista.com> -- arch/powerpc/Kconfig \| 7 arch/powerpc/sysdev/Makefile \| 1 arch/powerpc/sysdev/todc.c \| 392 ++++++++++++++++++++++++++++++++++ include/asm-powerpc/todc.h \| 487 +++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 887 insertions(+) -- Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-29	[POWERPC] Add tsi108 pci and platform device data register function	Zang Roy-r61911	1	-0/+109
	Add Tundra Semiconductor tsi108 pci and platform device data register function support. Signed-off-by: Alexandre Bounine <alexandreb@tundra.com> Signed-off-by: Roy Zang <tie-fei.zang@freescale.com> --- Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-29	[POWERPC] Correct the MAX_CONTEXT definition	Paul Mackerras	1	-1/+6
	When we increased the address space per process to 2^44 bytes, the number of contexts that we could actually use reduced, but we forgot to decrease the MAX_CONTEXT definition. (Fortunately this would only cause problems if we actually had more than 512k user processes running.) This patch corrects the definition. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-28	Merge branch 'nommu' of master.kernel.org:/home/rmk/linux-2.6-arm	Linus Torvalds	13	-78/+378
	* 'nommu' of master.kernel.org:/home/rmk/linux-2.6-arm: [ARM] nommu: backtrace code must not reference a discarded section [ARM] nommu: Initial uCLinux support for MMU-based CPUs [ARM] nommu: prevent Xscale-based machines being selected [ARM] nommu: export flush_dcache_page() [ARM] nommu: remove fault-armv, mmap and mm-armv files from nommu build [ARM] Remove TABLE_SIZE, and several unused function prototypes [ARM] nommu: Provide a simple flush_dcache_page implementation [ARM] nommu: add arch/arm/Kconfig-nommu to Kconfig files [ARM] nommu: add stubs for ioremap and friends [ARM] nommu: avoid selecting TLB and CPU specific copy code [ARM] nommu: uaccess tweaks [ARM] nommu: adjust headers for !MMU ARM systems [ARM] nommu: we need the TLS register emulation for nommu mode
2006-06-28	Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm	Linus Torvalds	12	-61/+66
	* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: [ARM] 3672/1: PXA: don't probe output GPIOs for interrupt [ARM] 3671/1: ep93xx: add cirrus logic edb9315 support [ARM] 3370/2: ep93xx: add crunch support [ARM] 3665/1: crunch: add ptrace support [ARM] 3664/1: crunch: add signal frame save/restore [ARM] 3663/1: fix resource->end off-by-one thinko during physmap conversion [ARM] 3662/1: ixp23xx: don't include asm/hardware.h in uncompress.h [ARM] 3660/1: Remove legacy defines [ARM] 3661/1: S3C2412: Fix compilation if CPU_S3C2410 only [ARM] 3658/1: S3C244X: Change usb-gadget name to s3c2440-usbgadget [ARM] Remove the __arch_* layer from uaccess.h
2006-06-28	Merge master.kernel.org:/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog	Linus Torvalds	1	-3/+7
	* master.kernel.org:/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog: [WATCHDOG] Documentation/watchdog update [WATCHDOG] convert AT91RM9200 watchdog to platform driver [WATCHDOG] add WDIOC_GETTIMELEFT ioctl [WATCHDOG] Pre-Timeout flags
2006-06-28	[PATCH] Fix plist include dependency	Thomas Gleixner	1	-0/+1
	plist.h uses container_of, which is defined in kernel.h. Include kernel.h in plist.h as the kernel.h include does not longer happen automatically on all architectures. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28	[PATCH] SPI: infrastructure to initialize spi_device.mode early	David Brownell	1	-1/+5
	This patch adds earlier initialization of spi_device.mode, as needed on boards using nondefault chipselect polarity. An example would be ones using the RS5C348 RTC without an external signal inverter between the RTC chipselect and the SPI controller. Without this mechanism, the first setup() call for that chip would wrongly enable chips, corrupting transfers to/from other chips sharing that SPI bus. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28	[PATCH] m68knommu: fix clobber list in uCdimm/uCsimm helper asm	Greg Ungerer	1	-6/+6
	Fix clobber list in uCsimm/uCdimm boot load helper asm. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28	Merge branch 'release' of ↵	Linus Torvalds	1	-0/+10
	git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64-SGI] fix prom revision checks in SN kernel [IA64] tiger_defconfig s/NR_CPUS=4/NR_CPUS=16/ [IA64-SGI] - Pass OS logical cpu number to the SN prom (bios) [IA64] palinfo.c: s/register_cpu_notifier/register_hotcpu_notifier/
2006-06-28	[PATCH] ide: fix error handling for drives which clear the FIFO on error	Alan Cox	1	-0/+1
	If the controller FIFO cleared automatically on error we must not try and drain it as this will hang some chips. Based in concept on a broken patch from -mm some while back Signed-off-by: Alan Cox <alan@redhat.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28	[PATCH] ac97_codec: make bitfield unsigned	Randy Dunlap	1	-1/+1
	Make a 1-bit bitfield unsigned (no space for sign bit). Removes 24 sparse warnings from this one file: include/linux/ac97_codec.h:262:13: error: dubious one-bit signed bitfield Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28	[PATCH] remove active field from tty buffer structure	Paul Fulghum	3	-5/+2
	Remove 'active' field from tty buffer structure. This was added in 2.6.16 as part of a patch to make the new tty buffering SMP safe. This field is unnecessary with the more intelligently written flush_to_ldisc that adds receive_room handling. Removing this field reverts to simpler logic where the tail buffer is always the 'active' buffer, which should not be freed by flush_to_ldisc. (active == buffer being filled with new data) The result is simpler, smaller, and faster tty buffer code. Signed-off-by: Paul Fulghum <paulkf@microgate.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28	[PATCH] remove TTY_DONT_FLIP	Paul Fulghum	1	-1/+0
	Remove TTY_DONT_FLIP tty flag. This flag was introduced in 2.1.X kernels to prevent the N_TTY line discipline functions read_chan() and n_tty_receive_buf() from running at the same time. 2.2.15 introduced tty->read_lock to protect access to the N_TTY read buffer, which is the only state requiring protection between these two functions. The current TTY_DONT_FLIP implementation is broken for SMP, and is not universally honored by drivers that send data directly to the line discipline receive_buf function. Because TTY_DONT_FLIP is not necessary, is broken in implementation, and is not universally honored, it is removed. Signed-off-by: Paul Fulghum <paulkf@microgate.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28	[PATCH] Add EXPORT_UNUSED_SYMBOL and EXPORT_UNUSED_SYMBOL_GPL	Arjan van de Ven	2	-0/+48
	Temporarily add EXPORT_UNUSED_SYMBOL and EXPORT_UNUSED_SYMBOL_GPL. These will be used as a transition measure for symbols that aren't used in the kernel and are on the way out. When a module uses such a symbol, a warning is printk'd at modprobe time. The main reason for removing unused exports is size: eacho export takes roughly between 100 and 150 bytes of kernel space in the binary. This patch gives users the option to immediately get this size gain via a config option. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28	[PATCH] mark address_space_operations const	Christoph Hellwig	6	-8/+8
	Same as with already do with the file operations: keep them in .rodata and prevents people from doing runtime patching. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28	[ALSA] ak4xxx-adda - Code clean-up	Takashi Iwai	1	-14/+23
	Fix spaces, fold lines to fit 80 columns in ak4xxx-adda driver codes. Split a long reset function to each codec routine just for better readability. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jaroslav Kysela <perex@suse.cz>
2006-06-28	[ALSA] Stereo controls for M-Audio Revolution cards	Jani Alinikula	1	-0/+2
	This patch adds stereo controls to revo cards by making the ak4xxx driver mixers configurable from the card driver. Signed-off-by: Jani Alinikula <janialinikula@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jaroslav Kysela <perex@suse.cz>
2006-06-28	[ALSA] AD1888 mixer controls for DC mode	Jaya Kumar	1	-0/+1
	This patch adds two mixer controls. The V_REFOUT enable is a documented register that couples the microphone input lines to the V_REFOUT DC source. The High Pass Filter enable in the AC97_AD_TEST2 (0x5c) is an undocumented register provided by Miller Puckette via Analog Devices that enables the AD codec to apply a high pass filter to the input. Signed-off-by: Jaya Kumar <jayakumar.alsa@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jaroslav Kysela <perex@suse.cz>
2006-06-28	[ALSA] Suppress irq handler mismatch messages in ALSA ISA drivers	Takashi Iwai	1	-1/+2
	Suppress 'irq handler mismatch' messages at auto-probing of irqs in ALSA ISA drivers. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jaroslav Kysela <perex@suse.cz>
2006-06-28	[ARM] nommu: remove fault-armv, mmap and mm-armv files from nommu build	Russell King	1	-0/+4
	Remove fault-armv.o, mmap.o and mm-armv.o from uclinux builds - these are concerned with MMU-ful operations, and as such are redundant for uclinux. Since this also removes iotable_init() and iotable_init() is used extensively in the platform support files, just make it a no-op. Based upon a couple of patches by Hyok. Signed-off-by: Hyok S. Choi <hyok.choi@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-28	[ARM] Remove TABLE_SIZE, and several unused function prototypes	Russell King	1	-5/+0
	TABLE_SIZE is never used in arch/arm/mm/init.c. create_memmap_holes(), memtable_init, and setup_io_desc() no longer exist in the kernel. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-28	[ARM] nommu: uaccess tweaks	Russell King	1	-52/+87
	MMUless systems have only one address space for all threads, so both the usual access_ok() checks, and the exception handling do not make much sense. Hence, discard the fixup and exception tables at link time, use memcpy/memset for the user copy/clearing functions, and define the permission check macros to be constants. Some of this patch was derived from the equivalent patch by Hyok S. Choi. Signed-off-by: Hyok S. Choi <hyok.choi@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-28	[ARM] nommu: adjust headers for !MMU ARM systems	Russell King	11	-21/+287
	Majorily based on Hyok Choi's patches, this fixes up the asm-arm header files for mmuless systems. Over and above Hyok's patches: - nommu.h merged into mmu.h (it's only a structure) - nommu_context.h is essentially the same as mmu_context.h, but without the MM switching code. so there's no point having separate files. Also, in memory.h, there's no point #ifndef'ing PHYS_OFFSET and END_MEM - both CONFIG_DRAM_BASE and CONFIG_DRAM_SIZE will always be set by the configuration scripts. Other files have minor formatting changes, but are essentially the same. Hyok's original patches were signed off thusly: Signed-off-by: Hyok S. Choi <hyok.choi@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-28	[IA64-SGI] - Pass OS logical cpu number to the SN prom (bios)	Jack Steiner	1	-0/+10
	Pass the OS logical cpu number to the PROM. This allows PROM to log the OS logical cpu number in error records viewed thru POD. Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-06-28	[ARM] 3370/2: ep93xx: add crunch support	Lennert Buytenhek	2	-0/+14
	Patch from Lennert Buytenhek Add the necessary kernel bits for crunch task switching. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-28	[ARM] 3665/1: crunch: add ptrace support	Lennert Buytenhek	1	-0/+5
	Patch from Lennert Buytenhek This patch makes it possible to get/set a task's Crunch state via the ptrace(2) system call. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-28	[ARM] 3664/1: crunch: add signal frame save/restore	Lennert Buytenhek	1	-0/+14
	Patch from Lennert Buytenhek This patch makes the kernel save Crunch state in userland signal frames, so that any userland signal handler can safely use the Crunch coprocessor without corrupting the Crunch state of the code it preempted. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-28	[ARM] 3662/1: ixp23xx: don't include asm/hardware.h in uncompress.h	Lennert Buytenhek	3	-12/+11
	Patch from Lennert Buytenhek ixp23xx was including asm/hardware.h in its version of uncompress.h, to get at the physical address of the debug UART, but this include was causing various inline functions that are totally unrelated to the decompressor, defined in headers in include/asm-arm/arch-ixp23xx, to be included in the decompressor image. Include asm/arch/ixp23xx.h instead, and move the sole inline function in ixp23xx.h to another header. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-28	[ARM] 3660/1: Remove legacy defines	Andrew Victor	3	-8/+2
	Patch from Andrew Victor Remove the remaining legacy __virt_to_bus__is_a_macro and __bus_to_virt__is_a_macro defines in some ARM platforms. Signed-off-by: Andrew Victor <andrew@sanpeople.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-28	[ARM] 3661/1: S3C2412: Fix compilation if CPU_S3C2410 only	Ben Dooks	1	-9/+7
	Patch from Ben Dooks If only the S3C2412 based machines are selected, then the regs-dsc.h does not export the S3C2412_DSC registers as it is wrapped in CONFIG_CPU_S3C2440. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-28	[ARM] Remove the __arch_* layer from uaccess.h	Russell King	1	-32/+13
	Back in the days when we had armo (26-bit) and armv (32-bit) combined, we had an additional layer to the uaccess macros to ensure correct typing. Since we no longer have 26-bit in this tree, we no longer need this layer, so eliminate it. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-28	ACPI: C-States: accounting of sleep states	Dominik Brodowski	1	-0/+1
	Track the actual time spent in C-States (C2 upwards, we can't determine this for C1), not only the number of invocations. This is especially useful for dynamic ticks / "tickless systems", but is also of interest on normal systems, as any interrupt activity leads to C-States being exited, not only the timer interrupt. The time is being measured in PM timer ticks, so an increase by one equals 279 nanoseconds. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-28	ACPI: ACPICA 20060623	Bob Moore	10	-59/+113
	Implemented a new acpi_spinlock type for the OSL lock interfaces. This allows the type to be customized to the host OS for improved efficiency (since a spinlock is usually a very small object.) Implemented support for "ignored" bits in the ACPI registers. According to the ACPI specification, these bits should be preserved when writing the registers via a read/modify/write cycle. There are 3 bits preserved in this manner: PM1_CONTROL[0] (SCI_EN), PM1_CONTROL[9], and PM1_STATUS[11]. http://bugzilla.kernel.org/show_bug.cgi?id=3691 Implemented the initial deployment of new OSL mutex interfaces. Since some host operating systems have separate mutex and semaphore objects, this feature was requested. The base code now uses mutexes (and the new mutex interfaces) wherever a binary semaphore was used previously. However, for the current release, the mutex interfaces are defined as macros to map them to the existing semaphore interfaces. Fixed several problems with the support for the control method SyncLevel parameter. The SyncLevel now works according to the ACPI specification and in concert with the Mutex SyncLevel parameter, since the current SyncLevel is a property of the executing thread. Mutual exclusion for control methods is now implemented with a mutex instead of a semaphore. Fixed three instances of the use of the C shift operator in the bitfield support code (exfldio.c) to avoid the use of a shift value larger than the target data width. The behavior of C compilers is undefined in this case and can cause unpredictable results, and therefore the case must be detected and avoided. (Fiodor Suietov) Added an info message whenever an SSDT or OEM table is loaded dynamically via the Load() or LoadTable() ASL operators. This should improve debugging capability since it will show exactly what tables have been loaded (beyond the tables present in the RSDT/XSDT.) Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-28	ACPI: dock driver	Kristen Accardi	2	-1/+18
	Create a driver which lives in the acpi subsystem to handle dock events. This driver is not an "ACPI" driver, because acpi drivers require that the object be present when the driver is loaded. Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-28	Merge branch 'for_paulus' of ↵	Paul Mackerras	1	-4/+0
	master.kernel.org:/pub/scm/linux/kernel/git/galak/powerpc
2006-06-28	powerpc: minor cleanups for mpc86xx	Kumar Gala	1	-4/+0
	* Remove duplicated cputable entry for 8641 (matches w/7448) * Removed __init from function prototypes in mpc86xx.h * Moved pci fixups into board specific code * Moved mpc86xx_exclude_device to generic mpc86xx pci code * Fixed sparse warnings in mpc86xx_smp.c * Removed board specific header include from asm-powerpc/mpc86xx.h Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2006-06-28	[POWERPC] Simplify the code defining the 64-bit CPU features	Paul Mackerras	1	-28/+20
	Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-28	[POWERPC] update asm-powerpc/time.h	Stephen Rothwell	1	-2/+4
	If we ever build a combined kernel including iSeries, then this will be needed. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-28	[POWERPC] Clean up it_lp_queue.h	Stephen Rothwell	1	-20/+20
	No more StudlyCaps. Remove from a couple of places it is no longer needed. Use C style comments. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-28	[POWERPC] Add the use of the firmware soft-reset-nmi to kdump.	David Wilder	1	-0/+8
	With this patch, kdump uses the firmware soft-reset NMI for two purposes: 1) Initiate the kdump (take a crash dump) by issuing a soft-reset. 2) Break a CPU out of a deadlock condition that is detected during kdump processing. When a soft-reset is initiated each CPU will enter system_reset_exception() and set its corresponding bit in the global bit-array cpus_in_sr then call die(). When die() finds the CPU's bit set in cpu_in_sr crash_kexec() is called to initiate a crash dump. The first CPU to enter crash_kexec() is called the "crashing CPU". All other CPUs are "secondary CPUs". The secondary CPU's pass through to crash_kexec_secondary() and sleep. The crashing CPU waits for all CPUs to enter via soft-reset then boots the kdump kernel (see crash_soft_reset_check()) When the system crashes due to a panic or exception, crash_kexec() is called by panic() or die(). The crashing CPU sends an IPI to all other CPUs to notify them of the pending shutdown. If a CPU is in a deadlock or hung state with interrupts disabled, the IPI will not be delivered. The result being, that the kdump kernel is not booted. This problem is solved with the use of a firmware generated soft-reset. After the crashing_cpu has issued the IPI, it waits for 10 sec for all CPUs to enter crash_ipi_callback(). A CPU signifies its entry to crash_ipi_callback() by setting its corresponding bit in the cpus_in_crash bit array. After 10 sec, if one or more CPUs have not set their bit in cpus_in_crash we assume that the CPU(s) is deadlocked. The operator is then prompted to generate a soft-reset to break the deadlock. Each CPU enters the soft reset handler as described above. Two conditions must be handled at this point: 1) The system crashed because the operator generated a soft-reset. See 2) The system had crashed before the soft-reset was generated ( in the case of a Panic or oops). The first CPU to enter crash_kexec() uses the state of the kexec_lock to determine this state. If kexec_lock is already held then condition 2 is true and crash_kexec_secondary() is called, else; this CPU is flagged as the crashing CPU, the kexec_lock is acquired and crash_kexec() proceeds as described above. Each additional CPUs responding to the soft-reset will pass through crash_kexec() to kexec_secondary(). All secondary CPUs call crash_ipi_callback() readying them self's for the shutdown. When ready they clear their bit in cpus_in_sr. The crashing CPU waits in kexec_secondary() until all other CPUs have cleared their bits in cpus_in_sr. The kexec kernel boot is then started. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: David Wilder <dwilder@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-27	Merge git://git.infradead.org/mtd-2.6	Linus Torvalds	2	-1/+48
	* git://git.infradead.org/mtd-2.6: [MTD] NAND: Select chip before checking write protect status [MTD] CORE mtdchar.c: fix off-by-one error in lseek() [MTD] NAND: Fix typo in mtd/nand/ts7250.c [JFFS2][XATTR] coexistence between xattr and write buffering support. [JFFS2][XATTR] Fix wrong copyright [JFFS2][XATTR] Re-define xd->refcnt as atomic_t [JFFS2][XATTR] Fix memory leak with jffs2_xattr_ref [JFFS2][XATTR] rid unnecessary writing of delete marker. [JFFS2][XATTR] Fix ACL bug when updating null xattr by null ACL. [JFFS2][XATTR] using 'delete marker' for xdatum/xref deletion [MTD] Fix off-by-one error in physmap.c [MTD] Remove unused 'nr_banks' variable from ixp2000 map driver [MTD NAND] s3c2412 support in s3c2410.c [MTD] Initialize 'writesize' [MTD] NAND: ndfc fix address offset thinko [MTD] NAND: S3C2410 convert prinks to dev_*()s [MTD] NAND: Missing fixups
2006-06-27	Merge branch 'upstream-linus' of ↵	Linus Torvalds	2	-1/+16
	master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [PATCH] ata_piix: add ICH6/7/8 to Kconfig [PATCH] sata_sil: disable hotplug interrupts on two ATI IXPs [PATCH] libata: cosmetic updates [PATCH] ata: add some NVIDIA chipset IDs [PATCH] libata reduce timeouts [PATCH] libata: implement ata_port_max_devices() [PATCH] libata: make two functions global [PATCH] libata: update ata_do_simple_cmd() [PATCH] libata: move ata_do_simple_cmd() below ata_exec_internal() [PATCH] libata: clear EH action on device detach [PATCH] libata: implement and use ata_deh_dev_action() [PATCH] libata: move ata_eh_clear_action() upward [PATCH] libata.h needs scatterlist.h [libata] sata_vsc: partially revert a PCI ID-related commit [libata] Bump versions
2006-06-28	[POWERPC] Add udbg support for RTAS console	Michael Ellerman	1	-1/+2
	Add udbg hooks for the RTAS console, based on the RTAS put-term-char and get-term-char calls. Along with my previous patches, this should enable debugging as soon as early_init_dt_scan_rtas() is called. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-28	[POWERPC] Setup RTAS values earlier, to enable rtas_call() earlier	Michael Ellerman	1	-0/+3
	Althought RTAS is instantiated when we enter the kernel, we can't actually call into it until we know its entry point address. Currently we grab that in rtas_initialize(), however that's quite late in the boot sequence. To enable rtas_call() earlier, we can grab the RTAS entry etc. values while we're scanning the flattened device tree. There's existing code to retrieve the values from /chosen, however we don't store them there anymore, so remove that code. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-28	[POWERPC] Make kexec_setup() a regular initcall	Michael Ellerman	1	-1/+0
	There's no reason kexec_setup() needs to be called explicitly from setup_system(), it can just be a regular initcall. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-28	[POWERPC] powerpc: Initialise ppc_md htab pointers earlier	Michael Ellerman	1	-1/+0
	Initialise the ppc_md htab callbacks earlier, in the probe routines. This allows us to call htab_finish_init() from htab_initialize(), and makes it private to hash_utils_64.c. Move htab_finish_init() and make_bl() above htab_initialize() to avoid forward declarations. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-28	[POWERPC] kdump: Reserve the existing TCE mappings left by the first kernel	Haren Myneni	2	-0/+4
	During kdump boot, noticed some machines checkstop on dma protection fault for ongoing DMA left in the first kernel. Instead of initializing TCE entries in iommu_init() for the kdump boot, this patch fixes this issue by walking through the each TCE table and checks whether the entries are in use by the first kernel. If so, reserve those entries by setting the corresponding bit in tbl->it_map such that these entries will not be available for the kdump boot. However it could be possible that all TCE entries might be used up due to the driver bug that does continuous mapping. My observation is around 1700 TCE entries are used on some systems (Ex: P4) at some point of time during kdump boot and saving dump (either write into the disk or sending to remote machine). Hence, this patch will make sure that minimum of 2048 entries will be available such that kdump boot could be successful in some cases. Signed-off-by: Haren Myneni <haren@us.ibm.com> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-28	[POWERPC] Remove obsolete #include <linux/config.h>.	Jon Loeliger	1	-1/+0
	Signed-off-by: Jon Loeliger <jdl@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-27	[PATCH] m68knommu: remove NO_FORMAT_VECi from ptrace.h header	Greg Ungerer	1	-2/+0
	Remove NO_FORMAT_VEC conditional check. It is not used or defined anywhere. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb	Linus Torvalds	1	-2/+2
	* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (26 commits) V4L/DVB (4263): Fix warning when compiling on 64 bit machines V4L/DVB (4261): Included required header for in-kernel compilation V4L/DVB (4260): Stradis.c: make 2 functions static V4L/DVB (4259): Pass an explicit log prefix to cx2341x_log_status V4L/DVB (4257): Fix 64-bit compile warnings. V4L/DVB (4255): Tda9887 default TOP value is 0x10 V4L/DVB (4254): Remove obsoleted tuner_debug option. V4L/DVB (4253): IVTV VBI format description too long. V4L/DVB (4252): Remove duplicate 'tda9887' in info messages. V4L/DVB (4245): Reduce the amount of pvrusb2-sourced noise going into the system log V4L/DVB (4244): Implement use of cx2341x module in pvrusb2 driver V4L/DVB (4243): Exploit new V4L control features in pvrusb2 V4L/DVB (4242): Don't suspend encoder when changing its attributes (in pvrusb2) V4L/DVB (4241): Fix faulty encoder error recovery in pvrusb2 V4L/DVB (4240): Various V4L control enhancements in pvrusb2 V4L/DVB (4239): Handle boolean controls in pvrusb2 V4L/DVB (4238): Make sure flags field is initialized when quering a control in pvrusb2 V4L/DVB (4237): Move LOG_STATUS bracketing to a different part of the pvrusb2 driver V4L/DVB (4236): Rearrange things in pvrusb2 driver in preparation for using cx2341x module V4L/DVB (4235): Increase the maximum number of controls that pvrusb2-sysfs.c can handle. ...
2006-06-27	[PATCH] drivers/char/ipmi/ipmi_msghandler.c: make proc_ipmi_root static	Adrian Bunk	1	-4/+0
	Make struct proc_ipmi_root static. Besides this, tremove removes an unused #ifdef CONFIG_PROC_FS from include/linux/ipmi.h. Acked-by: Corey Minyard <minyard@acm.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] rtmutex: Propagate priority settings into PI lock chains	Thomas Gleixner	1	-0/+2
	When the priority of a task, which is blocked on a lock, changes we must propagate this change into the PI lock chain. Therefor the chain walk code is changed to get rid of the references to current to avoid false positives in the deadlock detector, as setscheduler might be called by a task which holds the lock on which the task whose priority is changed is blocked. Also add some comments about the get/put_task_struct usage to avoid confusion. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pi-futex: futex_lock_pi/futex_unlock_pi support	Ingo Molnar	2	-0/+10
	This adds the actual pi-futex implementation, based on rt-mutexes. [dino@in.ibm.com: fix an oops-causing race] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Dinakar Guniguntala <dino@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pi-futex: rt mutex tester	Thomas Gleixner	1	-0/+1
	RT-mutex tester: scriptable tester for rt mutexes, which allows userspace scripting of mutex unit-tests (and dynamic tests as well), using the actual rt-mutex implementation of the kernel. [akpm@osdl.org: fixlet] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pi-futex: rt mutex debug	Ingo Molnar	2	-1/+15
	Runtime debugging functionality for rt-mutexes. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pi-futex: rt mutex core	Ingo Molnar	4	-0/+118
	Core functions for the rt-mutex subsystem. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pi-futex: scheduler support for pi	Ingo Molnar	2	-2/+21
	Add framework to boost/unboost the priority of RT tasks. This consists of: - caching the 'normal' priority in ->normal_prio - providing a functions to set/get the priority of the task - make sched_setscheduler() aware of boosting The effective_prio() cleanups also fix a priority-calculation bug pointed out by Andrey Gelman, in set_user_nice(). has_rt_policy() fix: Peter Williams <pwil3058@bigpond.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Andrey Gelman <agelman@012.net.il> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pi-futex: add plist implementation	Ingo Molnar	1	-0/+247
	Add the priority-sorted list (plist) implementation. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pi-futex: introduce WARN_ON_SMP	Ingo Molnar	1	-0/+6
	Introduce a new WARN_ON variant: WARN_ON_SMP(cond). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pi-futex: introduce debug_check_no_locks_freed()	Ingo Molnar	1	-2/+8
	Add debug_check_no_locks_freed(), as a central inline to add bad-lock-free-debugging functionality to. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pi-futex: futex code cleanups	Ingo Molnar	2	-5/+4
	We are pleased to announce "lightweight userspace priority inheritance" (PI) support for futexes. The following patchset and glibc patch implements it, ontop of the robust-futexes patchset which is included in 2.6.16-mm1. We are calling it lightweight for 3 reasons: - in the user-space fastpath a PI-enabled futex involves no kernel work (or any other PI complexity) at all. No registration, no extra kernel calls - just pure fast atomic ops in userspace. - in the slowpath (in the lock-contention case), the system call and scheduling pattern is in fact better than that of normal futexes, due to the 'integrated' nature of FUTEX_LOCK_PI. [more about that further down] - the in-kernel PI implementation is streamlined around the mutex abstraction, with strict rules that keep the implementation relatively simple: only a single owner may own a lock (i.e. no read-write lock support), only the owner may unlock a lock, no recursive locking, etc. Priority Inheritance - why, oh why??? ------------------------------------- Many of you heard the horror stories about the evil PI code circling Linux for years, which makes no real sense at all and is only used by buggy applications and which has horrible overhead. Some of you have dreaded this very moment, when someone actually submits working PI code ;-) So why would we like to see PI support for futexes? We'd like to see it done purely for technological reasons. We dont think it's a buggy concept, we think it's useful functionality to offer to applications, which functionality cannot be achieved in other ways. We also think it's the right thing to do, and we think we've got the right arguments and the right numbers to prove that. We also believe that we can address all the counter-arguments as well. For these reasons (and the reasons outlined below) we are submitting this patch-set for upstream kernel inclusion. What are the benefits of PI? The short reply: ---------------- User-space PI helps achieving/improving determinism for user-space applications. In the best-case, it can help achieve determinism and well-bound latencies. Even in the worst-case, PI will improve the statistical distribution of locking related application delays. The longer reply: ----------------- Firstly, sharing locks between multiple tasks is a common programming technique that often cannot be replaced with lockless algorithms. As we can see it in the kernel [which is a quite complex program in itself], lockless structures are rather the exception than the norm - the current ratio of lockless vs. locky code for shared data structures is somewhere between 1:10 and 1:100. Lockless is hard, and the complexity of lockless algorithms often endangers to ability to do robust reviews of said code. I.e. critical RT apps often choose lock structures to protect critical data structures, instead of lockless algorithms. Furthermore, there are cases (like shared hardware, or other resource limits) where lockless access is mathematically impossible. Media players (such as Jack) are an example of reasonable application design with multiple tasks (with multiple priority levels) sharing short-held locks: for example, a highprio audio playback thread is combined with medium-prio construct-audio-data threads and low-prio display-colory-stuff threads. Add video and decoding to the mix and we've got even more priority levels. So once we accept that synchronization objects (locks) are an unavoidable fact of life, and once we accept that multi-task userspace apps have a very fair expectation of being able to use locks, we've got to think about how to offer the option of a deterministic locking implementation to user-space. Most of the technical counter-arguments against doing priority inheritance only apply to kernel-space locks. But user-space locks are different, there we cannot disable interrupts or make the task non-preemptible in a critical section, so the 'use spinlocks' argument does not apply (user-space spinlocks have the same priority inversion problems as other user-space locking constructs). Fact is, pretty much the only technique that currently enables good determinism for userspace locks (such as futex-based pthread mutexes) is priority inheritance: Currently (without PI), if a high-prio and a low-prio task shares a lock [this is a quite common scenario for most non-trivial RT applications], even if all critical sections are coded carefully to be deterministic (i.e. all critical sections are short in duration and only execute a limited number of instructions), the kernel cannot guarantee any deterministic execution of the high-prio task: any medium-priority task could preempt the low-prio task while it holds the shared lock and executes the critical section, and could delay it indefinitely. Implementation: --------------- As mentioned before, the userspace fastpath of PI-enabled pthread mutexes involves no kernel work at all - they behave quite similarly to normal futex-based locks: a 0 value means unlocked, and a value==TID means locked. (This is the same method as used by list-based robust futexes.) Userspace uses atomic ops to lock/unlock these mutexes without entering the kernel. To handle the slowpath, we have added two new futex ops: FUTEX_LOCK_PI FUTEX_UNLOCK_PI If the lock-acquire fastpath fails, [i.e. an atomic transition from 0 to TID fails], then FUTEX_LOCK_PI is called. The kernel does all the remaining work: if there is no futex-queue attached to the futex address yet then the code looks up the task that owns the futex [it has put its own TID into the futex value], and attaches a 'PI state' structure to the futex-queue. The pi_state includes an rt-mutex, which is a PI-aware, kernel-based synchronization object. The 'other' task is made the owner of the rt-mutex, and the FUTEX_WAITERS bit is atomically set in the futex value. Then this task tries to lock the rt-mutex, on which it blocks. Once it returns, it has the mutex acquired, and it sets the futex value to its own TID and returns. Userspace has no other work to perform - it now owns the lock, and futex value contains FUTEX_WAITERS\|TID. If the unlock side fastpath succeeds, [i.e. userspace manages to do a TID -> 0 atomic transition of the futex value], then no kernel work is triggered. If the unlock fastpath fails (because the FUTEX_WAITERS bit is set), then FUTEX_UNLOCK_PI is called, and the kernel unlocks the futex on the behalf of userspace - and it also unlocks the attached pi_state->rt_mutex and thus wakes up any potential waiters. Note that under this approach, contrary to other PI-futex approaches, there is no prior 'registration' of a PI-futex. [which is not quite possible anyway, due to existing ABI properties of pthread mutexes.] Also, under this scheme, 'robustness' and 'PI' are two orthogonal properties of futexes, and all four combinations are possible: futex, robust-futex, PI-futex, robust+PI-futex. glibc support: -------------- Ulrich Drepper and Jakub Jelinek have written glibc support for PI-futexes (and robust futexes), enabling robust and PI (PTHREAD_PRIO_INHERIT) POSIX mutexes. (PTHREAD_PRIO_PROTECT support will be added later on too, no additional kernel changes are needed for that). [NOTE: The glibc patch is obviously inofficial and unsupported without matching upstream kernel functionality.] the patch-queue and the glibc patch can also be downloaded from: http://redhat.com/~mingo/PI-futex-patches/ Many thanks go to the people who helped us create this kernel feature: Steven Rostedt, Esben Nielsen, Benedikt Spranger, Daniel Walker, John Cooper, Arjan van de Ven, Oleg Nesterov and others. Credits for related prior projects goes to Dirk Grambow, Inaky Perez-Gonzalez, Bill Huey and many others. Clean up the futex code, before adding more features to it: - use u32 as the futex field type - that's the ABI - use __user and pointers to u32 instead of unsigned long - code style / comment style cleanups - rename hash-bucket name from 'bh' to 'hb'. I checked the pre and post futex.o object files to make sure this patch has no code effects. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Jakub Jelinek <jakub@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] sched: mc/smt power savings sched policy	Siddha, Suresh B	7	-1/+28
	sysfs entries 'sched_mc_power_savings' and 'sched_smt_power_savings' in /sys/devices/system/cpu/ control the MC/SMT power savings policy for the scheduler. Based on the values (1-enable, 0-disable) for these controls, sched groups cpu power will be determined for different domains. When power savings policy is enabled and under light load conditions, scheduler will minimize the physical packages/cpu cores carrying the load and thus conserving power(with a perf impact based on the workload characteristics... see OLS 2005 CMP kernel scheduler paper for more details..) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Con Kolivas <kernel@kolivas.org> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] sched_domain: handle kmalloc failure	Srivatsa Vaddagiri	1	-1/+1
	Try to handle mem allocation failures in build_sched_domains by bailing out and cleaning up thus-far allocated memory. The patch has a direct consequence that we disable load balancing completely (even at sibling level) upon any memory allocation failure. [Lee.Schermerhorn@hp.com: bugfix] Signed-off-by: Srivatsa Vaddagir <vatsa@in.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] sched: implement smpnice	Peter Williams	1	-2/+6
	Problem: The introduction of separate run queues per CPU has brought with it "nice" enforcement problems that are best described by a simple example. For the sake of argument suppose that on a single CPU machine with a nice==19 hard spinner and a nice==0 hard spinner running that the nice==0 task gets 95% of the CPU and the nice==19 task gets 5% of the CPU. Now suppose that there is a system with 2 CPUs and 2 nice==19 hard spinners and 2 nice==0 hard spinners running. The user of this system would be entitled to expect that the nice==0 tasks each get 95% of a CPU and the nice==19 tasks only get 5% each. However, whether this expectation is met is pretty much down to luck as there are four equally likely distributions of the tasks to the CPUs that the load balancing code will consider to be balanced with loads of 2.0 for each CPU. Two of these distributions involve one nice==0 and one nice==19 task per CPU and in these circumstances the users expectations will be met. The other two distributions both involve both nice==0 tasks being on one CPU and both nice==19 being on the other CPU and each task will get 50% of a CPU and the user's expectations will not be met. Solution: The solution to this problem that is implemented in the attached patch is to use weighted loads when determining if the system is balanced and, when an imbalance is detected, to move an amount of weighted load between run queues (as opposed to a number of tasks) to restore the balance. Once again, the easiest way to explain why both of these measures are necessary is to use a simple example. Suppose that (in a slight variation of the above example) that we have a two CPU system with 4 nice==0 and 4 nice=19 hard spinning tasks running and that the 4 nice==0 tasks are on one CPU and the 4 nice==19 tasks are on the other CPU. The weighted loads for the two CPUs would be 4.0 and 0.2 respectively and the load balancing code would move 2 tasks resulting in one CPU with a load of 2.0 and the other with load of 2.2. If this was considered to be a big enough imbalance to justify moving a task and that task was moved using the current move_tasks() then it would move the highest priority task that it found and this would result in one CPU with a load of 3.0 and the other with a load of 1.2 which would result in the movement of a task in the opposite direction and so on -- infinite loop. If, on the other hand, an amount of load to be moved is calculated from the imbalance (in this case 0.1) and move_tasks() skips tasks until it find ones whose contributions to the weighted load are less than this amount it would move two of the nice==19 tasks resulting in a system with 2 nice==0 and 2 nice=19 on each CPU with loads of 2.1 for each CPU. One of the advantages of this mechanism is that on a system where all tasks have nice==0 the load balancing calculations would be mathematically identical to the current load balancing code. Notes: struct task_struct: has a new field load_weight which (in a trade off of space for speed) stores the contribution that this task makes to a CPU's weighted load when it is runnable. struct runqueue: has a new field raw_weighted_load which is the sum of the load_weight values for the currently runnable tasks on this run queue. This field always needs to be updated when nr_running is updated so two new inline functions inc_nr_running() and dec_nr_running() have been created to make sure that this happens. This also offers a convenient way to optimize away this part of the smpnice mechanism when CONFIG_SMP is not defined. int try_to_wake_up(): in this function the value SCHED_LOAD_BALANCE is used to represent the load contribution of a single task in various calculations in the code that decides which CPU to put the waking task on. While this would be a valid on a system where the nice values for the runnable tasks were distributed evenly around zero it will lead to anomalous load balancing if the distribution is skewed in either direction. To overcome this problem SCHED_LOAD_SCALE has been replaced by the load_weight for the relevant task or by the average load_weight per task for the queue in question (as appropriate). int move_tasks(): The modifications to this function were complicated by the fact that active_load_balance() uses it to move exactly one task without checking whether an imbalance actually exists. This precluded the simple overloading of max_nr_move with max_load_move and necessitated the addition of the latter as an extra argument to the function. The internal implementation is then modified to move up to max_nr_move tasks and max_load_move of weighted load. This slightly complicates the code where move_tasks() is called and if ever active_load_balance() is changed to not use move_tasks() the implementation of move_tasks() should be simplified accordingly. struct sched_group find_busiest_group(): Similar to try_to_wake_up(), there are places in this function where SCHED_LOAD_SCALE is used to represent the load contribution of a single task and the same issues are created. A similar solution is adopted except that it is now the average per task contribution to a group's load (as opposed to a run queue) that is required. As this value is not directly available from the group it is calculated on the fly as the queues in the groups are visited when determining the busiest group. A key change to this function is that it is no longer to scale down imbalance on exit as move_tasks() uses the load in its scaled form. void set_user_nice(): has been modified to update the task's load_weight field when it's nice value and also to ensure that its run queue's raw_weighted_load field is updated if it was runnable. From: "Siddha, Suresh B" <suresh.b.siddha@intel.com> With smpnice, sched groups with highest priority tasks can mask the imbalance between the other sched groups with in the same domain. This patch fixes some of the listed down scenarios by not considering the sched groups which are lightly loaded. a) on a simple 4-way MP system, if we have one high priority and 4 normal priority tasks, with smpnice we would like to see the high priority task scheduled on one cpu, two other cpus getting one normal task each and the fourth cpu getting the remaining two normal tasks. but with current smpnice extra normal priority task keeps jumping from one cpu to another cpu having the normal priority task. This is because of the busiest_has_loaded_cpus, nr_loaded_cpus logic.. We are not including the cpu with high priority task in max_load calculations but including that in total and avg_load calcuations.. leading to max_load < avg_load and load balance between cpus running normal priority tasks(2 Vs 1) will always show imbalanace as one normal priority and the extra normal priority task will keep moving from one cpu to another cpu having normal priority task.. b) 4-way system with HT (8 logical processors). Package-P0 T0 has a highest priority task, T1 is idle. Package-P1 Both T0 and T1 have 1 normal priority task each.. P2 and P3 are idle. With this patch, one of the normal priority tasks on P1 will be moved to P2 or P3.. c) With the current weighted smp nice calculations, it doesn't always make sense to look at the highest weighted runqueue in the busy group.. Consider a load balance scenario on a DP with HT system, with Package-0 containing one high priority and one low priority, Package-1 containing one low priority(with other thread being idle).. Package-1 thinks that it need to take the low priority thread from Package-0. And find_busiest_queue() returns the cpu thread with highest priority task.. And ultimately(with help of active load balance) we move high priority task to Package-1. And same continues with Package-0 now, moving high priority task from package-1 to package-0.. Even without the presence of active load balance, load balance will fail to balance the above scenario.. Fix find_busiest_queue to use "imbalance" when it is lightly loaded. [kernel@kolivas.org: sched: store weighted load on up] [kernel@kolivas.org: sched: add discrete weighted cpu load function] [suresh.b.siddha@intel.com: sched: remove dead code] Signed-off-by: Peter Williams <pwil3058@bigpond.com.au> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Con Kolivas <kernel@kolivas.org> Cc: John Hawkes <hawkes@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] chardev: GPIO for SCx200 & PC-8736x: use dev_dbg in common module	Jim Cromie	1	-1/+5
	Use of dev_dbg() and friends is considered good practice. dev_dbg() needs a struct device *devp, but nsc_gpio is only a helper module, so it doesnt have/need its own. To provide devp to the user-modules (scx200 & pc8736x _gpio), we add it to the vtable, and set it during init. Also squeeze nsc_gpio_dump()'s format a little. [ 199.259879] pc8736x_gpio.0: io09: 0x0044 TS OD PUE EDGE LO DEBOUNCE Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] chardev: GPIO for SCx200 & PC-8736x: migrate file-ops to common module	Jim Cromie	1	-0/+5
	Now that the read(), write() file-ops are dispatching gpio-ops via the vtable, they are generic, and can be moved 'verbatim' to the nsc_gpio common-support module. After the move, various symbols are renamed to update 'scx200_' to 'nsc_', and headers are adjusted accordingly. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] chardev: GPIO for SCx200 & PC-8736x: add gpio-ops vtable	Jim Cromie	1	-0/+33
	Abstract the gpio operations into a new nsc_gpio_ops vtable. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] chardev: GPIO for SCx200 & PC-8736x: device minor numbers are ↵	Jim Cromie	1	-7/+7
	unsigned ints Per kernel headers, device minor numbers are unsigned ints. Do the same in this driver. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] chardev: GPIO for SCx200 & PC-8736x: whitespace pre-clean	Jim Cromie	2	-14/+0
	GPIO SUPPORT FOR SCx200 & PC8736x The patch-set reworks the 2.4 vintage scx200_gpio driver for modern 2.6, and refactors GPIO support to reuse it in a new driver for the GPIO on PC-8736x chips. Its handy for the Soekris.com net-4801, which has both chips. These patches have been seen recently on Kernel-Mentors, and then Kernel-Newbies ML, where Jesper Juhl kindly reviewed it. His feedback has been incorporated. Thanks Jesper ! Its also gone to soekris-tech@soekris.com for possible testing by linux folks, I've gotten 1 promise so far. Theyre mostly BSD folk over there, but we'll see.. Device-file & Sysfs The driver preserves the existing device-file interface, including the write/cmd set, but adds v to 'view' the pin-settings & configs by inducing, via gpio_dump(), a dev_info() call. Its a fairly crappy way to get status, but it sticks to the syslog approach, conservatively. Allowing users to voluntarily trigger logging is good, it gives them a familiar way to confirm their app's control & use of the pins, and I've thus reduced the pin-mode-updates from dev_info to dev_dbg. I've recently bolted on a proto sysfs interface for both new drivers. Im not including those patches here; they (the patch + doc-pre-patch) are still quite raw (and unreviewed on KNML), and since they 'invent' a convention for GPIO, a proper vetting is needed. Since this patchset is much bigger than my previous ones, Id like to keep things simpler, and address it 1st, before bolting on more stuff. The driver-split The Geode CPU and the PC-87366 Super-IO chip have GPIO units which share a common pin-architecture (same pin features, with same bits controlling), but with different addressing mechanics and port organizations. The vintage driver expresses the pin capabilities with pin-mode commands [OoPpTt],etc that change the pin configurations, and since the 2 chips share pin-arch, we can reuse the read(), write() commands, once the implementation is suitably adjusted. The patchset adds a vtable: struct nsc_gpio_ops, to abstract the existing gpio operations, then adjusts fileops.write() code to invoke operations via that vtable. Driver specific open()s set private_data to the vtable so its available for use by write(). The vtable gets the gpio_dump() too, since its user-friendly, and (could be construed as) part of the current device-file interface. To support use of dev_dbg() in write() & _dump(), the vtable gets a dev ptr too, set by both scx200 & pc8736x _gpio drivers. heres how the pins are presented in syslog: [ 1890.176223] scx200_gpio.0: io00: 0x0044 TS OD PUE EDGE LO DEBOUNCE [ 1890.287223] scx200_gpio.0: io01: 0x0003 OE PP PUD EDGE LO nsc_gpio.c: new file is new home of several file-ops methods, which are modified to get their vtable from filp->private_data, and use it where needed. scx200_gpio.c: keeps some of its existing gpio routines, but now wires them up via the vtable (they're invoked by nsc_gpio.c:nsc_gpio_write() thru this vtable). A driver-spcific open() initializes filp->private_data with the vtable. Once the split is clean, and the scx200_gpio driver is working, we copy and modify the function and variable names, and rework the access-method bodies for the different addressing scheme. Heres a working overview of the patchset: # series file for GPIO # Spring Cleaning gpio-scx/patch.preclean # scripts/Lindent fixes, editor-ctrl comments # API Modernization gpio-scx/patch.api26 # what I learned from LDD3 gpio-scx/patch.platform-dev-2 # get pdev, support for dev_dbg() gpio-scx/patch.unsigned-minor # fix to match std practice # Debuggability gpio-scx/patch.dump-diet # shrink gpio_dump() gpio-scx/patch.viewpins # add new 'command' to call dump() gpio-scx/patch.init-refactor # pull shadow-register init to sub # Access-Abstraction (add vtable) gpio-scx/patch.access-vtable # introduce nsg_gpio_ops vtable, w dump gpio-scx/patch.vtable-calls # add & use the vtable in scx200_gpio gpio-scx/patch.nscgpio-shell # add empty driver for common-fops # move code under abstraction gpio-scx/patch.migrate-fops # move file-ops methods from scx200_gpio gpio-scx/patch.common-dump # mv scx200.c:scx200_gpio_dump() to nsc_gpio.c gpio-scx/patch.add-pc8736x-gpio # add new driver, like old, w chip adapt # gpio-scx/patch.add-DEBUG # enable all dev_dbg()s # Cleanups # finish printk -> dev_dbg() etc gpio-scx/patch.pdev-pc8736x # new drvr needs pdev too, gpio-scx/patch.devdbg-nscgpio # add device to 'vtable', use in dev_dbg() # gpio-scx/patch.pin-config-view # another 'c' 'command' # gpio-scx/quiet-getset # take out excess dbg stuff (pretty quiet now) gpio-scx/patch.shadow-current # imitate scx200_gpio's shadow regs in pc87* # post KMentors-post patches .. gpio-scx/patch.mutexes # use mutexes for config-locks gpio-scx/patch.viewpins-values # extend dump to obsolete separate 'c' cmd gpio-scx/patch.kconfig # add stuff for kbuild # TBC # combine api26 with pdev, which is just one step. # merge c&v commands to single do-all-fn # delay viewpins, dump-diet should also un-ifdef it too. diff.sys-gpio-rollup-1 This patch: Removed editor format-control comments, and used scripts/Lindent to clean up whitespace, then deleted the bogus chunks :-( Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] cpu hotplug: add hotplug versions of cpu_notifier	Chandra Seetharaman	1	-0/+4
	Define new macros register_hotcpu_notifier() and unregister_hotcpu_notifier() that redefines register_cpu_notifier() and unregister_cpu_notifier() for use only when HOTPLUG_CPU is defined. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] cpu hotplug: make [un]register_cpu_notifier init time only	Chandra Seetharaman	1	-0/+6
	CPUs come online only at init time (unless CONFIG_HOTPLUG_CPU is defined). So, cpu_notifier functionality need to be available only at init time. This patch makes register_cpu_notifier() available only at init time, unless CONFIG_HOTPLUG_CPU is defined. This patch exports register_cpu_notifier() and unregister_cpu_notifier() only if CONFIG_HOTPLUG_CPU is defined. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] rcutorture: add call_rcu_bh() operations	Paul E. McKenney	1	-0/+1
	Add operations for the call_rcu_bh() variant of RCU. Also add an rcu_batches_completed_bh() function, which is needed by rcutorture. Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] Remove gratuitous inclusion of <linux/config.h> from <linux/dmaengine.h>	David Woodhouse	1	-1/+1
	We include config.h on the compiler command line. There's no need for it to be included again. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] spin/rwlock init cleanups	Ingo Molnar	1	-1/+1
	locking init cleanups: - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK() - convert rwlocks in a similar manner this patch was generated automatically. Motivation: - cleanliness - lockdep needs control of lock initialization, which the open-coded variants do not give - it's also useful for -rt and for lock debugging in general Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] fs/buffer.c: cleanups	Adrian Bunk	1	-1/+1
	- add a proper prototype for the following global function: - buffer_init() - make the following needlessly global function static: - end_buffer_async_write() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] poison: add & use more constants	Randy Dunlap	1	-0/+7
	Add more poison values to include/linux/poison.h. It's not clear to me whether some others should be added or not, so I haven't added any of these: ./include/linux/libata.h:#define ATA_TAG_POISON 0xfafbfcfdU ./arch/ppc/8260_io/fcc_enet.c:1918: memset((char )(&(immap->im_dprambase[(mem_addr+64)])), 0x88, 32); ./drivers/usb/mon/mon_text.c:429: memset(mem, 0xe5, sizeof(struct mon_event_text)); ./drivers/char/ftape/lowlevel/ftape-ctl.c:738: memset(ft_buffer[i]->address, 0xAA, FT_BUFF_SIZE); ./drivers/block/sx8.c:/ 0xf is just arbitrary, non-zero noise; this is sorta like poisoning */ Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] update two drivers for poison.h	Randy Dunlap	1	-0/+6
	Update two drivers to use poison.h. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] add poison.h and patch primary users	Randy Dunlap	2	-8/+46
	Localize poison values into one header file for better documentation and easier/quicker debugging and so that the same values won't be used for multiple purposes. Use these constants in core arch., mm, driver, and fs code. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] vdso: randomize the i386 vDSO by moving it into a vma	Ingo Molnar	8	-24/+51
	Move the i386 VDSO down into a vma and thus randomize it. Besides the security implications, this feature also helps debuggers, which can COW a vma-backed VDSO just like a normal DSO and can thus do single-stepping and other debugging features. It's good for hypervisors (Xen, VMWare) too, which typically live in the same high-mapped address space as the VDSO, hence whenever the VDSO is used, they get lots of guest pagefaults and have to fix such guest accesses up - which slows things down instead of speeding things up (the primary purpose of the VDSO). There's a new CONFIG_COMPAT_VDSO (default=y) option, which provides support for older glibcs that still rely on a prelinked high-mapped VDSO. Newer distributions (using glibc 2.3.3 or later) can turn this option off. Turning it off is also recommended for security reasons: attackers cannot use the predictable high-mapped VDSO page as syscall trampoline anymore. There is a new vdso=[0\|1] boot option as well, and a runtime /proc/sys/vm/vdso_enabled sysctl switch, that allows the VDSO to be turned on/off. (This version of the VDSO-randomization patch also has working ELF coredumping, the previous patch crashed in the coredumping code.) This code is a combined work of the exec-shield VDSO randomization code and Gerd Hoffmann's hypervisor-centric VDSO patch. Rusty Russell started this patch and i completed it. [akpm@osdl.org: cleanups] [akpm@osdl.org: compile fix] [akpm@osdl.org: compile fix 2] [akpm@osdl.org: compile fix 3] [akpm@osdl.org: revernt MAXMEM change] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Cc: Gerd Hoffmann <kraxel@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Zachary Amsden <zach@vmware.com> Cc: Andi Kleen <ak@muc.de> Cc: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] i386: use C code for current_thread_info()	Chuck Ebbert	1	-6/+4
	Using C code for current_thread_info() lets the compiler optimize it. With gcc 4.0.2, kernel is smaller: text data bss dec hex filename 3645212 555556 312024 4512792 44dc18 2.6.17-rc6-nb-post/vmlinux 3647276 555556 312024 4514856 44e428 2.6.17-rc6-nb/vmlinux ------- -2064 Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] i386: move phys_proc_id and cpu_core_id to cpuinfo_x86	Rohit Seth	2	-7/+7
	Move the phys_core_id and cpu_core_id to cpuinfo_x86 structure. Similar patch for x86_64 is already accepted by Andi earlier this week. [akpm@osdl.org: fix warning] Signed-off-by: Rohit Seth <rohitseth@google.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] x86: increase interrupt vector range	Rusty Russell	1	-1/+1
	Remove the limit of 256 interrupt vectors by changing the value stored in orig_{e,r}ax to be the complemented interrupt vector. The orig_{e,r}ax needs to be < 0 to allow the signal code to distinguish between return from interrupt and return from syscall. With this change applied, NR_IRQS can be > 256. Xen extends the IRQ numbering space to include room for dynamically allocated virtual interrupts (in the range 256-511), which requires a more permissive interface to do_IRQ. Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] node hotplug: register cpu: remove node struct	KAMEZAWA Hiroyuki	2	-2/+15
	With Goto-san's patch, we can add new pgdat/node at runtime. I'm now considering node-hot-add with cpu + memory on ACPI. I found acpi container, which describes node, could evaluate cpu before memory. This means cpu-hot-add occurs before memory hot add. In most part, cpu-hot-add doesn't depend on node hot add. But register_cpu(), which creates symbolic link from node to cpu, requires that node should be onlined before register_cpu(). When a node is onlined, its pgdat should be there. This patch-set holds off creating symbolic link from node to cpu until node is onlined. This removes node arguments from register_cpu(). Now, register_cpu() requires 'struct node' as its argument. But the array of struct node is now unified in driver/base/node.c now (By Goto's node hotplug patch). We can get struct node in generic way. So, this argument is not necessary now. This patch also guarantees add cpu under node only when node is onlined. It is necessary for node-hot-add vs. cpu-hot-add patch following this. Moreover, register_cpu calculates cpu->node_id by cpu_to_node() without regard to its 'struct node *root' argument. This patch removes it. Also modify callers of register_cpu()/unregister_cpu, whose args are changed by register-cpu-remove-node-struct patch. [Brice.Goglin@ens-lyon.org: fix it] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pgdat allocation and update for ia64 of memory hotplug: allocate ↵	Yasunori Goto	1	-7/+2
	pgdat and per node data This is a patch to allocate pgdat and per node data area for ia64. The size for them can be calculated by compute_pernodesize(). Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pgdat allocation and update for ia64 of memory hotplug: update pgdat ↵	Yasunori Goto	2	-3/+13
	address array This is to refresh node_data[] array for ia64. As I mentioned previous patches, ia64 has copies of information of pgdat address array on each node as per node data. At v2 of node_add, this function used stop_machine_run() to update them. (I wished that they were copied safety as much as possible.) But, in this patch, this arrays are just copied simply, and set node_online_map bit after completion of pgdat initialization. So, kernel must touch NODE_DATA() macro after checking node_online_map(). (Current code has already done it.) This is more simple way for just hot-add..... Note : It will be problem when hot-remove will occur, because, even if online_map bit is set, kernel may touch NODE_DATA() due to race condition. :-( Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] Register sysfs file for hotplugged new node	Yasunori Goto	3	-31/+4
	When new node becomes enable by hot-add, new sysfs file must be created for new node. So, if new node is enabled by add_memory(), register_one_node() is called to create it. In addition, I386's arch_register_node() and a part of register_nodes() of powerpc are consolidated to register_one_node() as a generic_code(). This is tested by Tiger4(IPF) with node hot-plug emulation. Signed-off-by: Keiichiro Tokunaga <tokuanga.keiich@jp.fujitsu.com> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] catch valid mem range at onlining memory	KAMEZAWA Hiroyuki	1	-0/+3
	This patch allows hot-add memory which is not aligned to section. Now, hot-added memory has to be aligned to section size. Considering big section sized archs, this is not useful. When hot-added memory is registerd as iomem resoruce by iomem resource patch, we can make use of that information to detect valid memory range. Note: With this, not-aligned memory can be registerd. To allow hot-add memory with holes, we have to do more work around add_memory(). (It doesn't allows add memory to already existing mem section.) Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pgdat allocation for new node add (export kswapd start func)	Yasunori Goto	1	-0/+2
	When node is hot-added, kswapd for the node should start. This export kswapd start function as kswapd_run() to use at add_memory(). [akpm@osdl.org: daemonize() isn't needed when using the kthread API] Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27	[PATCH] pgdat allocation for new node add (refresh node_data[])	Yasunori Goto	1	-0/+12
	Refresh NODE_DATA() for generic archs. In this case, NODE_DATA(nid) == node_data[nid]. node_data[] is array of address of pgdat. So, refresh is quite simple. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>