aboutsummaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2006-08-17[BRIDGE]: Disable SG/GSO if TX checksum is offHerbert Xu1-1/+6
When the bridge recomputes features, it does not maintain the constraint that SG/GSO must be off if TX checksum is off. This patch adds that constraint. On a completely unrelated note, I've also added TSO6 and TSO_ECN feature bits if GSO is enabled on the underlying device through the new NETIF_F_GSO_SOFTWARE macro. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[NETFILTER]: ip_tables: fix table locking in ipt_do_tablePatrick McHardy1-1/+2
table->private might change because of ruleset changes, don't use it without holding the lock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[NETFILTER]: ctnetlink: fix deadlock in table dumpingPatrick McHardy2-20/+14
ip_conntrack_put must not be called while holding ip_conntrack_lock since destroy_conntrack takes it again. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[IPV4]: severe locking bug in fib_semantics.cAlexey Kuznetsov1-6/+6
Found in 2.4 by Yixin Pan <yxpan@hotmail.com>. > When I read fib_semantics.c of Linux-2.4.32, write_lock(&fib_info_lock) = > is used in fib_release_info() instead of write_lock_bh(&fib_info_lock). = > Is the following case possible: a BH interrupts fib_release_info() while = > holding the write lock, and calls ip_check_fib_default() which calls = > read_lock(&fib_info_lock), and spin forever. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[MCAST]: Fix filter leak on device removal.David L Stevens2-17/+25
This fixes source filter leakage when a device is removed and a process leaves the group thereafter. This also includes corresponding fixes for IPv6 multicast source filters on device removal. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[NET]: Disallow whitespace in network device names.David S. Miller1-5/+14
It causes way too much trouble and confusion in userspace. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[PKT_SCHED] cls_u32: Fix typo.Ralf Hildebrandt1-1/+1
Signed-off-by: Ralf Hildebrandt <Ralf.Hildebrandt@charite.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[ATM]: Compile error on ARMKevin Hilman1-1/+1
atm_proc_exit() is declared as __exit, and thus in .exit.text. On some architectures (ARM) .exit.text is discarded at compile time, and since atm_proc_exit() is called by some other __init functions, it results in a link error. Signed-off-by: Kevin Hilman <khilman@mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[IPV4]: Possible leak of multicast source filter sctructureMichal Ruzicka1-3/+3
There is a leak of a socket's multicast source filter list structure on closing a socket with a multicast source filter set on an interface that does not exist any more. Signed-off-by: Michal Ruzicka <michal.ruzicka@comstar.cz> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[IPV6] lockdep: annotate __icmpv6_socketIngo Molnar1-0/+13
Split off __icmpv6_socket's sk->sk_dst_lock class, because it gets used from softirqs, which is safe for __icmpv6_sockets (because they never get directly used via userspace syscalls), but unsafe for normal sockets. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[NETFILTER]: xt_physdev build fixAndrew Morton1-0/+1
It needs netfilter_bridge.h for brnf_deferred_hooks Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[NET]: Fix potential stack overflow in net/core/utils.cSuresh Siddha1-3/+4
On High end systems (1024 or so cpus) this can potentially cause stack overflow. Fix the stack usage. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[VLAN]: Make sure bonding packet drop checks get done in hwaccel RX path.David S. Miller1-17/+1
Since __vlan_hwaccel_rx() is essentially bypassing the netif_receive_skb() call that would have occurred if we did the VLAN decapsulation in software, we are missing the skb_bond() call and the assosciated checks it does. Export those checks via an inline function, skb_bond_should_drop(), and use this in __vlan_hwaccel_rx(). Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[INET]: Use pskb_trim_unique when trimming paged unique skbsHerbert Xu2-3/+3
The IPv4/IPv6 datagram output path was using skb_trim to trim paged packets because they know that the packet has not been cloned yet (since the packet hasn't been given to anything else in the system). This broke because skb_trim no longer allows paged packets to be trimmed. Paged packets must be given to one of the pskb_trim functions instead. This patch adds a new pskb_trim_unique function to cover the IPv4/IPv6 datagram output path scenario and replaces the corresponding skb_trim calls with it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[NETFILTER]: ulog: fix panic on SMP kernelsMark Huang3-0/+11
Fix kernel panic on various SMP machines. The culprit is a null ub->skb in ulog_send(). If ulog_timer() has already been scheduled on one CPU and is spinning on the lock, and ipt_ulog_packet() flushes the queue on another CPU by calling ulog_send() right before it exits, there will be no skbuff when ulog_timer() acquires the lock and calls ulog_send(). Cancelling the timer in ulog_send() doesn't help because it has already been scheduled and is running on the first CPU. Similar problem exists in ebt_ulog.c and nfnetlink_log.c. Signed-off-by: Mark Huang <mlhuang@cs.princeton.edu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[NETFILTER]: {arp,ip,ip6}_tables: proper error recovery in init pathPatrick McHardy3-24/+70
Neither of {arp,ip,ip6}_tables cleans up behind itself when something goes wrong during initialization. Noticed by Rennie deGraaf <degraaf@cpsc.ucalgary.ca> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[LLC]: multicast receive device matchStephen Hemminger1-0/+3
Fix from Aji_Srinivas@emc.com, STP packets are incorrectly received on all LLC datagram sockets, whichever interface they are bound to. The llc_sap datagram receive logic sends packets with a unicast destination MAC to one socket bound to that SAP and MAC, and multicast packets to all sockets bound to that SAP. STP packets are multicast, and we do need to know on which interface they were received. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[IPSEC]: Validate properly in xfrm_dst_check()David S. Miller1-3/+24
If dst->obsolete is -1, this is a signal from the bundle creator that we want the XFRM dst and the dsts that it references to be validated on every use. I misunderstood this intention when I changed xfrm_dst_check() to always return NULL. Now, when we purge a dst entry, by running dst_free() on it. This will set the dst->obsolete to a positive integer, and we want to return NULL in that case so that the socket does a relookup for the route. Thus, if dst->obsolete<0, let stale_bundle() validate the state, else always return NULL. In general, we need to do things more intelligently here because we flush too much state during rule changes. Herbert Xu has some ideas wherein the key manager gives us some help in this area. We can also use smarter state management algorithms inside of the kernel as well. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[NETFILTER]: xt_hashlimit: fix limit off-by-onePatrick McHardy1-7/+4
Hashlimit doesn't account for the first packet, which is inconsistent with the limit match. Reported by ryan.castellucci@gmail.com, netfilter bugzilla #500. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[NETFILTER]: xt_string: fix negationPhil Oester1-1/+1
The xt_string match is broken with ! negation. This resolves a portion of netfilter bugzilla #497. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[TCP]: Fix botched memory leak fix to tcpprobe_read().David S. Miller1-1/+2
Somehow I clobbered James's original fix and only my subsequent compiler warning change went in for that changeset. Get the real fix in there. Noticed by Jesper Juhl. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-09[IPX]: Fix typo, ipxhdr() --> ipx_hdr()David S. Miller1-1/+1
Noticed by Dave Jones. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-09[IPV6]: The ifa lock is a BH lockHerbert Xu1-2/+2
The ifa lock is expected to be taken in BH context (by addrconf timers) so we must disable BH when accessing it from user context. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-09Merge gregkh@master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Greg Kroah-Hartman7-11/+39
2006-08-09[NET]: add_timer -> mod_timer() in dst_run_gc()Dmitry Mishin1-2/+1
Patch from Dmitry Mishin <dim@openvz.org>: Replace add_timer() by mod_timer() in dst_run_gc in order to avoid BUG message. CPU1 CPU2 dst_run_gc() entered dst_run_gc() entered spin_lock(&dst_lock) ..... del_timer(&dst_gc_timer) fail to get lock .... mod_timer() <--- puts timer back to the list add_timer(&dst_gc_timer) <--- BUG because timer is in list already. Found during OpenVZ internal testing. At first we thought that it is OpenVZ specific as we added dst_run_gc(0) call in dst_dev_event(), but as Alexey pointed to me it is possible to trigger this condition in mainstream kernel. F.e. timer has fired on CPU2, but the handler was preeempted by an irq before dst_lock is tried. Meanwhile, someone on CPU1 adds an entry to gc list and starts the timer. If CPU2 was preempted long enough, this timer can expire simultaneously with resuming timer handler on CPU1, arriving exactly to the situation described. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-08Merge branch 'upstream-fixes' of ↵Jeff Garzik1-1/+23
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream-fixes
2006-08-08[IPX]: Another nonlinear receive fixStephen Hemminger1-2/+5
Need to check some more cases in IPX receive. If the skb is purely fragments, the IPX header needs to be extracted. The function pskb_may_pull() may in theory invalidate all the pointers in the skb, so references to ipx header must be refreshed. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-08[RTNETLINK]: Fix IFLA_ADDRESS handling.David S. Miller1-1/+14
The ->set_mac_address handlers expect a pointer to a sockaddr which contains the MAC address, whereas IFLA_ADDRESS provides just the MAC address itself. So whip up a sockaddr to wrap around the netlink attribute for the ->set_mac_address call. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-07[TCP]: SNMPv2 tcpOutSegs counter errorWei Yongjun1-3/+9
Do not count retransmitted segments. Signed-off-by: Wei Yongjun <yjwei@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-07[PKTGEN]: Make sure skb->{nh,h} are initialized in fill_packet_ipv6() too.David S. Miller1-0/+2
Mirror the bug fix from fill_packet_ipv4() Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-07[PKTGEN]: Fix oops when used with balance-tlb bondingChen-Li Tien1-0/+2
Signed-off-by: Chen-Li Tien <cltien@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-07[IPV4]: Limit rt cache size properly.Kirill Korotaev1-1/+1
From: Kirill Korotaev <dev@sw.ru> During OpenVZ stress testing we found that UDP traffic with random src can generate too much excessive rt hash growing leading finally to OOM and kernel panics. It was found that for 4GB i686 system (having 1048576 total pages and 225280 normal zone pages) kernel allocates the following route hash: syslog: IP route cache hash table entries: 262144 (order: 8, 1048576 bytes) => ip_rt_max_size = 4194304 entries, i.e. max rt size is 4194304 * 256b = 1Gb of RAM > normal_zone Attached the patch which removes HASH_HIGHMEM flag from alloc_large_system_hash() call. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-07[IPX]: Header length validation neededStephen Hemminger1-1/+2
This patch will linearize and check there is enough data. It handles the pprop case as well as avoiding a whole audit of the routing code. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-07[NET]: Assign skb->dev in netdev_alloc_skbChristoph Hellwig1-1/+3
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-06Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds6-22/+21
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [LAPB]: Fix windowsize check [TCP]: Fixes IW > 2 cases when TCP is application limited [PKT_SCHED] RED: Fix overflow in calculation of queue average [LLX]: SOCK_DGRAM interface fixes [PKT_SCHED]: Return ENOENT if qdisc module is unavailable [BRIDGE]: netlink status fix
2006-08-06[PATCH] knfsd: fix race related problem when adding items to and svcrpc auth ↵Neil Brown1-1/+5
cache If we don't find the item we are lookng for, we allocate a new one, and then grab the lock again and search to see if it has been added while we did the alloc. If it had been added we need to 'cache_put' the newly created item that we are never going to use. But as it hasn't been initialised properly, putting it can cause an oops. So move the ->init call earlier to that it will always be fully initilised if we have to put it. Thanks to Philipp Matthias Hahn <pmhahn@svs.Informatik.Uni-Oldenburg.de> for reporting the problem. Signed-off-by: Neil Brown <neilb@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-05[LAPB]: Fix windowsize checkDiego Calleja1-5/+7
In bug #6954, Norbert Reinartz reported the following issue: "Function lapb_setparms() in file net/lapb/lapb_iface.c checks if the given parameters are valid. If the given window size is in the range of 8 .. 127, lapb_setparms() fails and returns an error value of LAPB_INVALUE, even if bit LAPB_EXTENDED in parms->mode is set. If bit LAPB_EXTENDED in parms->mode is set and the window size is in the range of 8 .. 127, the first check "(parms->mode & LAPB_EXTENDED)" results true and the second check "(parms->window < 1 || parms->window > 127)" results false. Both checks in conjunction result to false, thus the third check "(parms->window < 1 || parms->window > 7)" is done by fault. This third check results true, so that we leave lapb_setparms() by 'goto out_put'. Seems that this bug doesn't cause any problems, because lapb_setparms() isn't used to change the default values of LAPB. We are using kernel lapb in our software project and also change the default parameters of lapb, so we found this bug" He also pasted a fix, that I've transformated into a patch: Signed-off-by: Diego Calleja <diegocg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-04[TCP]: Fixes IW > 2 cases when TCP is application limitedIlpo Järvinen1-1/+2
Whenever a transfer is application limited, we are allowed at least initial window worth of data per window unless cwnd is previously less than that. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-04[LLX]: SOCK_DGRAM interface fixesStephen Hemminger2-14/+10
The datagram interface of LLC is broken in a couple of ways. These were discovered when trying to use it to build an out-of-kernel version of STP. First it didn't pass the source address of the received packet in recvfrom(). It needs to copy the source address of received LLC packets into the socket control block. At the same time fix a security issue because there was uninitialized data leakage. Every recvfrom call was just copying out old data. Second, LLC should not merge multiple packets in one receive call on datagram sockets. LLC should preserve packet boundaries on SOCK_DGRAM. This fix goes against the old historical comments about UNIX98 semantics but without this fix SOCK_DGRAM is broken and useless. So either ANK's interpretation was incorect or UNIX98 standard was wrong. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Acked-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-04[PKT_SCHED]: Return ENOENT if qdisc module is unavailableJamal Hadi Salim1-1/+1
Return ENOENT if qdisc module is unavailable Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-04[BRIDGE]: netlink status fixStephen Hemminger1-1/+1
Fix code that passes back netlink status messages about bridge changes. Submitted by Aji_Srinivas@emc.com Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-04[PATCH] Send wireless netlink events with a clean slateHerbert Xu1-1/+23
Drivers expect to be able to call wireless_send_event in arbitrary contexts. On the other hand, netlink really doesn't like being invoked in an IRQ context. So we need to postpone the sending of netlink skb's to a tasklet. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-08-03SUNRPC: Fix obvious refcounting bugs in rpc_pipefs.Trond Myklebust1-2/+4
Doh! Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 496f408f2f0e7ee5481a7c2222189be6c4f5aa6c commit)
2006-08-03RPC: Ensure that we disconnect TCP socket when client requests error outTrond Myklebust3-42/+60
If we're part way through transmitting a TCP request, and the client errors, then we need to disconnect and reconnect the TCP socket in order to avoid confusing the server. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 031a50c8b9ea82616abd4a4e18021a25848941ce commit)
2006-08-02[NET]: Fix more per-cpu typosAlexey Dobriyan2-3/+3
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[I/OAT]: Remove CPU hotplug lock from net_dma_rebalanceChris Leech1-5/+0
Remove the lock_cpu_hotplug()/unlock_cpu_hotplug() calls from net_dma_rebalance The lock_cpu_hotplug()/unlock_cpu_hotplug() sequence in net_dma_rebalance is both incorrect (as pointed out by David Miller) because lock_cpu_hotplug() may sleep while the net_dma_event_lock spinlock is held, and unnecessary (as pointed out by Andrew Morton) as spin_lock() disables preemption which protects from CPU hotplug events. Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[DECNET]: Fix for routing bugPatrick Caulfield1-2/+7
This patch fixes a bug in the DECnet routing code where we were selecting a loopback device in preference to an outward facing device even when the destination was known non-local. This patch should fix the problem. Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com> Signed-off-by: Steven Whitehouse <steve@chygwyn.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[AF_UNIX]: Kernel memory leak fix for af_unix datagram getpeersec patchCatherine Zhang2-14/+12
From: Catherine Zhang <cxzhang@watson.ibm.com> This patch implements a cleaner fix for the memory leak problem of the original unix datagram getpeersec patch. Instead of creating a security context each time a unix datagram is sent, we only create the security context when the receiver requests it. This new design requires modification of the current unix_getsecpeer_dgram LSM hook and addition of two new hooks, namely, secid_to_secctx and release_secctx. The former retrieves the security context and the latter releases it. A hook is required for releasing the security context because it is up to the security module to decide how that's done. In the case of Selinux, it's a simple kfree operation. Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[NET]: skb_queue_lock_key() is no longer used.Adrian Bunk1-7/+0
Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[IPV6]: SNMPv2 "ipv6IfStatsOutFragCreates" counter errorWei Dong2-6/+9
When I tested linux kernel 2.6.71.7 about statistics "ipv6IfStatsOutFragCreates", and found that it couldn't increase correctly. The criteria is RFC 2465: ipv6IfStatsOutFragCreates OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of output datagram fragments that have been generated as a result of fragmentation at this output interface." ::= { ipv6IfStatsEntry 15 } I think there are two issues in Linux kernel. 1st: RFC2465 specifies the counter is "The number of output datagram fragments...". I think increasing this counter after output a fragment successfully is better. And it should not be increased even though a fragment is created but failed to output. 2nd: If we send a big ICMP/ICMPv6 echo request to a host, and receive ICMP/ICMPv6 echo reply consisted of some fragments. As we know that in Linux kernel first fragmentation occurs in ICMP layer(maybe saying transport layer is better), but this is not the "real" fragmentation,just do some "pre-fragment" -- allocate space for date, and form a frag_list, etc. The "real" fragmentation happens in IP layer -- set offset and MF flag and so on. So I think in "fast path" for ip_fragment/ip6_fragment, if we send a fragment which "pre-fragment" by upper layer we should also increase "ipv6IfStatsOutFragCreates". Signed-off-by: Wei Dong <weid@nanjing-fnst.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[IPV6]: SNMPv2 "ipv6IfStatsInHdrErrors" counter errorWei Dong1-0/+1
When I tested Linux kernel 2.6.17.7 about statistics "ipv6IfStatsInHdrErrors", found that this counter couldn't increase correctly. The criteria is RFC2465: ipv6IfStatsInHdrErrors OBJECT-TYPE SYNTAX Counter3 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of input datagrams discarded due to errors in their IPv6 headers, including version number mismatch, other format errors, hop count exceeded, errors discovered in processing their IPv6 options, etc." ::= { ipv6IfStatsEntry 2 } When I send TTL=0 and TTL=1 a packet to a router which need to be forwarded, router just sends an ICMPv6 message to tell the sender that TIME_EXCEED and HOPLIMITS, but no increments for this counter(in the function ip6_forward). Signed-off-by: Wei Dong <weid@nanjing-fnst.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[NET]: Kill the WARN_ON() calls for checksum fixups.David S. Miller1-10/+0
We have a more complete solution in the works, involving the seperation of CHECKSUM_HW on input vs. output, and having netfilter properly do incremental checksums. But that is a very involved patch and is thus 2.6.19 material. What we have now is infinitely better than the past, wherein all TSO packets were dropped due to corrupt checksums as soon at the NAT module was loaded. At least now, the checksums do get fixed up, it just isn't the cleanest nor most optimal solution. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[NETFILTER]: xt_hashlimit/xt_string: missing string validationPatrick McHardy2-1/+7
The hashlimit table name and the textsearch algorithm need to be terminated, the textsearch pattern length must not exceed the maximum size. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[NETFILTER]: SIP helper: expect RTP streams in both directionsPatrick McHardy1-1/+1
Since we don't know in which direction the first packet will arrive, we need to create one expectation for each direction, which is currently prevented by max_expected beeing set to 1. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[NET]: Add netdev_alloc_skb().Christoph Hellwig1-0/+24
Add a dev_alloc_skb variant that takes a struct net_device * paramater. For now that paramater is unused, but I'll use it to allocate the skb from node-local memory in a follow-up patch. Also there have been some other plans mentioned on the list that can use it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[TCP]: Process linger2 timeout consistently.David S. Miller1-1/+2
Based upon guidance from Alexey Kuznetsov. When linger2 is active, we check to see if the fin_wait2 timeout is longer than the timewait. If it is, we schedule the keepalive timer for the difference between the timewait timeout and the fin_wait2 timeout. When this orphan socket is seen by tcp_keepalive_timer() it will try to transform this fin_wait2 socket into a fin_wait2 mini-socket, again if linger2 is active. Not all paths were setting this initial keepalive timer correctly. The tcp input path was doing it correctly, but tcp_close() wasn't, potentially making the socket linger longer than it really needs to. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[SECURITY] secmark: nul-terminate secdataJames Morris1-0/+2
The patch below fixes a problem in the iptables SECMARK target, where the user-supplied 'selctx' string may not be nul-terminated. From initial analysis, it seems that the strlen() called from selinux_string_to_sid() could run until it arbitrarily finds a zero, and possibly cause a kernel oops before then. The impact of this appears limited because the operation requires CAP_NET_ADMIN, which is essentially always root. Also, the module is not yet in wide use. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[NET]: Core net changes to generate neteventsTom Tucker4-7/+24
Generate netevents for: - neighbour changes - routing redirects - pmtu changes Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[NET]: Network Event Notifier Mechanism.Tom Tucker1-0/+69
This patch uses notifier blocks to implement a network event notifier mechanism. Clients register their callback function by calling register_netevent_notifier() like this: static struct notifier_block nb = { .notifier_call = my_callback_func }; ... register_netevent_notifier(&nb); Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[TCP]: SNMPv2 tcpAttemptFails counter errorWei Yongjun3-5/+3
Refer to RFC2012, tcpAttemptFails is defined as following: tcpAttemptFails OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of times TCP connections have made a direct transition to the CLOSED state from either the SYN-SENT state or the SYN-RCVD state, plus the number of times TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state." ::= { tcp 7 } When I lookup into RFC793, I found that the state change should occured under following condition: 1. SYN-SENT -> CLOSED a) Received ACK,RST segment when SYN-SENT state. 2. SYN-RCVD -> CLOSED b) Received SYN segment when SYN-RCVD state(came from LISTEN). c) Received RST segment when SYN-RCVD state(came from SYN-SENT). d) Received SYN segment when SYN-RCVD state(came from SYN-SENT). 3. SYN-RCVD -> LISTEN e) Received RST segment when SYN-RCVD state(came from LISTEN). In my test, those direct state transition can not be counted to tcpAttemptFails. Signed-off-by: Wei Yongjun <yjwei@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[TCP]: fix memory leak in net/ipv4/tcp_probe.c::tcpprobe_read()James Morris1-1/+1
Based upon a patch by Jesper Juhl. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Stephen Hemminger <shemminger@osdl.org> Acked-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[NET]: Fix ___pskb_trim when entire frag_list needs droppingHerbert Xu1-4/+10
When the trim point is within the head and there is no paged data, ___pskb_trim fails to drop the first element in the frag_list. This patch fixes this by moving the len <= offset case out of the page data loop. This patch also adds a missing kfree_skb on the frag that we just cloned. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[IPV6]: Audit all ip6_dst_lookup/ip6_dst_store callsHerbert Xu6-46/+88
The current users of ip6_dst_lookup can be divided into two classes: 1) The caller holds no locks and is in user-context (UDP). 2) The caller does not want to lookup the dst cache at all. The second class covers everyone except UDP because most people do the cache lookup directly before calling ip6_dst_lookup. This patch adds ip6_sk_dst_lookup for the first class. Similarly ip6_dst_store users can be divded into those that need to take the socket dst lock and those that don't. This patch adds __ip6_dst_store for those (everyone except UDP/datagram) that don't need an extra lock. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[XFRM]: Fix protocol field value for outgoing IPv6 GSO packetsPatrick McHardy1-1/+1
Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[IPV6] ADDRCONF: NLM_F_REPLACE support for RTM_NEWADDRNoriaki TAKAMIYA1-0/+57
Based on MIPL2 kernel patch. Signed-off-by: Noriaki YAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2006-08-02[IPV6] ADDRCONF: Support get operation of single addressNoriaki TAKAMIYA1-1/+58
Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2006-08-02[IPV6] ADDRCONF: Do not verify an address with infinity lifetimeYOSHIFUJI Hideaki1-1/+5
We also do not try regenarating new temporary address corresponding to an address with infinite preferred lifetime. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2006-08-02[IPV6] ADDRCONF: Allow user-space to specify address lifetimeNoriaki TAKAMIYA1-4/+42
Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2006-08-02[IPV6] ADDRCONF: Check payload length for IFA_LOCAL attribute in ↵YOSHIFUJI Hideaki1-2/+4
RTM_{ADD,DEL}MSG message Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2006-07-27[PATCH] ieee80211: TKIP requires CRC32Chuck Ebbert1-0/+1
ieee80211_crypt_tkip will not work without CRC32. LD .tmp_vmlinux1 net/built-in.o: In function `ieee80211_tkip_encrypt': net/ieee80211/ieee80211_crypt_tkip.c:349: undefined reference to `crc32_le' Reported by Toralf Foerster <toralf.foerster@gmx.de> Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-27[PATCH] softmac: do shared key auth in workqueueDaniel Drake1-6/+22
Johann Uhrmann reported a bcm43xx crash and Michael Buesch tracked it down to a problem with the new shared key auth code (recursive calls into the driver) This patch (effectively Michael's patch with a couple of small modifications) solves the problem by sending the authentication challenge response frame from a workqueue entry. I also removed a lone \n from the bcm43xx messages relating to authentication mode - this small change was previously discussed but not patched in. Signed-off-by: Daniel Drake <dsd@gentoo.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-25[IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg().Tetsuo Handa2-0/+2
From: Tetsuo Handa from-linux-kernel@i-love.sakura.ne.jp The recvmsg() for raw socket seems to return random u16 value from the kernel stack memory since port field is not initialized. But I'm not sure this patch is correct. Does raw socket return any information stored in port field? [ BSD defines RAW IP recvmsg to return a sin_port value of zero. This is described in Steven's TCP/IP Illustrated Volume 2 on page 1055, which is discussing the BSD rip_input() implementation. ] Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-25[IPV4] ipmr: ip multicast route bug fix.Alexey Kuznetsov1-6/+13
IP multicast route code was reusing an skb which causes use after free and double free. From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Note, it is real skb_clone(), not alloc_skb(). Equeued skb contains the whole half-prepared netlink message plus room for the rest. It could be also skb_copy(), if we want to be puristic about mangling cloned data, but original copy is really not going to be used. Acked-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[IPV4]: Clear the whole IPCB, this clears also IPCB(skb)->flags.Guillaume Chazarain1-1/+1
Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[IPV6]: Clean skb cb on IPv6 input.Guillaume Chazarain1-0/+2
Clear the accumulated junk in IP6CB when starting to handle an IPV6 packet. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[NETFILTER]: Demote xt_sctp to EXPERIMENTALPatrick McHardy1-2/+2
After the recent problems with all the SCTP stuff it seems reasonable to mark this as experimental. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[NETFILTER]: bridge netfilter: add deferred output hooks to ↵Patrick McHardy2-0/+20
feature-removal-schedule Add bridge netfilter deferred output hooks to feature-removal-schedule and disable them by default. Until their removal they will be activated by the physdev match when needed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[NETFILTER]: xt_pkttype: fix mismatches on locally generated packetsPhil Oester1-1/+11
Locally generated broadcast and multicast packets have pkttype set to PACKET_LOOPBACK instead of PACKET_BROADCAST or PACKET_MULTICAST. This causes the pkttype match to fail to match packets of either type. The below patch remedies this by using the daddr as a hint as to broadcast|multicast. While not pretty, this seems like the only way to solve the problem short of just noting this as a limitation of the match. This resolves netfilter bugzilla #484 Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[NETFILTER]: SNMP NAT: fix byteorder confusionPatrick McHardy1-2/+2
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[NETFILTER]: conntrack: fix SYSCTL=n compileAdrian Bunk2-4/+4
Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[NETFILTER]: nf_queue: handle NF_STOP and unknown verdicts in nf_reinjectPatrick McHardy1-5/+4
In case of an unknown verdict or NF_STOP the packet leaks. Unknown verdicts can happen when userspace is buggy. Reinject the packet in case of NF_STOP, drop on unknown verdicts. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[NETFILTER]: H.323 helper: fix possible NULL-ptr dereferencePatrick McHardy1-1/+1
An RCF message containing a timeout results in a NULL-ptr dereference if no RRQ has been seen before. Noticed by the "SATURN tool", reported by Thomas Dillig <tdillig@stanford.edu> and Isil Dillig <isil@stanford.edu>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[VLAN]: Fix link state propagationStefan Rompf1-5/+3
When the queue of the underlying device is stopped at initialization time or the device is marked "not present", the state will be propagated to the vlan device and never change. Based on an analysis by Patrick McHardy. Signed-off-by: Stefan Rompf <stefan@loplof.de> ACKed-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[IPV6] xfrm6_tunnel: Delete debugging code.David S. Miller1-130/+10
It doesn't compile, and it's dubious in several regards: 1) is enabled by non-Kconfig controlled CONFIG_* value (noted by Randy Dunlap) 2) XFRM6_TUNNEL_SPI_MAGIC is defined after it's first use 3) the debugging messages print object pointer addresses which have no meaning without context So let's just get rid of it. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[Bluetooth] Correct RFCOMM channel MTU for broken implementationsMarcel Holtmann1-2/+17
Some Bluetooth RFCOMM implementations try to negotiate a bigger channel MTU than we can support for a particular session. The maximum MTU for a RFCOMM session is limited through the L2CAP layer. So if the other side proposes a channel MTU that is bigger than the underlying L2CAP MTU, we should reduce it to the L2CAP MTU of the session minus five bytes for the RFCOMM headers. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-24[DCCP]: Fix default sequence window sizeIan McDonald4-4/+7
When using the default sequence window size (100) I got the following in my logs: Jun 22 14:24:09 localhost kernel: [ 1492.114775] DCCP: Step 6 failed for DATA packet, (LSWL(6279674225) <= P.seqno(6279674749) <= S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <= P.ackno(1125899906842620) <= S.AWH(18798206548), sending SYNC... Jun 22 14:24:09 localhost kernel: [ 1492.115147] DCCP: Step 6 failed for DATA packet, (LSWL(6279674225) <= P.seqno(6279674750) <= S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <= P.ackno(1125899906842620) <= S.AWH(18798206549), sending SYNC... I went to alter the default sysctl and it didn't take for new sockets. Below patch fixes this. I think the default is too low but it is what the DCCP spec specifies. As a side effect of this my rx speed using iperf goes from about 2.8 Mbits/sec to 3.5. This is still far too slow but it is a step in the right direction. Compile tested only for IPv6 but not particularly complex change. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[TIPC]: Removing useless castsPanagiotis Issaris2-2/+2
Removing useless casts Signed-off-by: Panagiotis Issaris <takis@issaris.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[IPV4]: Fix nexthop realm dumping for multipath routesPatrick McHardy1-4/+8
Routing realms exist per nexthop, but are only returned to userspace for the first nexthop. This is due to the fact that iproute2 only allows to set the realm for the first nexthop and the kernel refuses multipath routes where only a single realm is present. Dump all realms for multipath routes to enable iproute to correctly display them. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[NET]: Conversions from kmalloc+memset to k(z|c)alloc.Panagiotis Issaris94-334/+154
Signed-off-by: Panagiotis Issaris <takis@issaris.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[IrDA]: Use alloc_skb() in IrDA TX pathSamuel Ortiz12-35/+38
As pointed out by Christoph Hellwig, dev_alloc_skb() is not intended to be used for allocating TX sk_buff. The IrDA stack was exclusively calling dev_alloc_skb() on the TX path, and this patch fixes that. Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[I/OAT]: net/core/user_dma.c should #include <net/netdma.h>Adrian Bunk1-0/+1
Every file should #include the headers containing the prototypes for its global functions. Especially in cases like this one where gcc can tell us through a compile error that the prototype was wrong... Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[SCTP]: ADDIP: Don't use an address as source until it is ASCONF-ACKedSridhar Samudrala5-22/+78
This implements Rules D1 and D4 of Sec 4.3 in the ADDIP draft. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[SCTP]: Set chunk->data_accepted only if we are going to accept it.Sridhar Samudrala1-1/+2
Currently there is a code path in sctp_eat_data() where it is possible to set this flag even when we are dropping this chunk. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[SCTP]: Verify all the paths to a peer via heartbeat before using them.Sridhar Samudrala6-19/+47
This patch implements Path Initialization procedure as described in Sec 2.36 of RFC4460. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[SCTP]: Unhash the endpoint in sctp_endpoint_free().Vlad Yasevich1-5/+6
This prevents a race between the close of a socket and receive of an incoming packet. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[SCTP]: Check for NULL arg to sctp_bucket_destroy().Sridhar Samudrala1-1/+1
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[PKT_SCHED] netem: Fix slab corruption with netem (2nd try)Guillaume Chazarain1-1/+3
CONFIG_DEBUG_SLAB found the following bug: netem_enqueue() in sch_netem.c gets a pointer inside a slab object: struct netem_skb_cb *cb = (struct netem_skb_cb *)skb->cb; But then, the slab object may be freed: skb = skb_unshare(skb, GFP_ATOMIC) cb is still pointing inside the freed skb, so here is a patch to initialize cb later, and make it clear that initializing it sooner is a bad idea. [From Stephen Hemminger: leave cb unitialized in order to let gcc complain in case of use before initialization] Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[IPV4]: Get rid of redundant IPCB->opts initialisationHerbert Xu6-7/+0
Now that we always zero the IPCB->opts in ip_rcv, it is no longer necessary to do so before calling netif_rx for tunneled packets. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-17[NET] ethtool: fix oops by testing correct struct memberJeff Garzik1-1/+1
Noticed by Willy Tarreau. Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-07-15[PATCH] sch_htb compile fix.Dave Jones1-1/+1
net/sched/sch_htb.c: In function 'htb_change_class': net/sched/sch_htb.c:1605: error: expected ';' before 'do_gettimeofday' Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PKT_SCHED] HTB: initialize upper bound properlyStephen Hemminger1-2/+2
The upper bound for HTB time diff needs to be scaled to PSCHED units rather than just assuming usecs. The field mbuffer is used in TDIFF_SAFE(), as an upper bound. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-14[IPV4]: Clear skb cb on IP inputStephen Hemminger1-0/+3
when data arrives at IP through loopback (and possibly other devices). So the field needs to be cleared before it confuses the route code. This was seen when running netem over loopback, but there are probably other device cases. Maybe this should go into stable? Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-13[NET]: Update frag_list in pskb_trimHerbert Xu1-26/+65
When pskb_trim has to defer to ___pksb_trim to trim the frag_list part of the packet, the frag_list is not updated to reflect the trimming. This will usually work fine until you hit something that uses the packet length or tail from the frag_list. Examples include esp_output and ip_fragment. Another problem caused by this is that you can end up with a linear packet with a frag_list attached. It is possible to get away with this if we audit everything to make sure that they always consult skb->len before going down onto frag_list. In fact we can do the samething for the paged part as well to avoid copying the data area of the skb. For now though, let's do the conservative fix and update frag_list. Many thanks to Marco Berizzi for helping me to track down this bug. This 4-year old bug took 3 months to track down. Marco was very patient indeed :) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[NET]: fix __sk_stream_mem_reclaimIan McDonald1-9/+7
__sk_stream_mem_reclaim is only called by sk_stream_mem_reclaim. As such the check on sk->sk_forward_alloc is not needed and can be removed. Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[Bluetooth] Fix deadlock in the L2CAP layerMarcel Holtmann1-9/+9
The Bluetooth L2CAP layer has 2 locks that are used in softirq context, (one spinlock and one rwlock, where the softirq usage is readlock) but where not all usages of the lock were _bh safe. The patch below corrects this. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-12[Bluetooth] Let BT_HIDP depend on INPUTMarcel Holtmann1-2/+1
This patch lets BT_HIDP depend on instead of select INPUT. This fixes the following warning during an s390 build: net/bluetooth/hidp/Kconfig:4:warning: 'select' used by config symbol 'BT_HIDP' refer to undefined symbol 'INPUT' A dependency on INPUT also implies !S390 (and therefore makes the explicit dependency obsolete) since INPUT is not available on s390. The practical difference should be nearly zero, since INPUT is always set to y unless EMBEDDED=y (or S390=y). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-12[Bluetooth] Remaining transitions to use kzalloc()Marcel Holtmann7-25/+16
This patch makes the remaining transitions to use kzalloc(). Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-12[IPV4]: Fix error handling for fib_insert_node callHerbert Xu1-1/+1
The error handling around fib_insert_node was broken because we always zeroed the error before checking it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[NETROM] lockdep: fix false positiveRalf Baechle1-0/+9
NETROM network devices are virtual network devices encapsulating NETROM frames into AX.25 which will be sent through an AX.25 device, so form a special "super class" of normal net devices; split their locks off into a separate class since they always nest. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[ROSE] lockdep: fix false positiveRalf Baechle1-0/+9
ROSE network devices are virtual network devices encapsulating ROSE frames into AX.25 which will be sent through an AX.25 device, so form a special "super class" of normal net devices; split their locks off into a separate class since they always nest. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[AX.25]: Optimize AX.25 socket list lockRalf Baechle3-13/+13
Right now all uses of the ax25_list_lock lock are _bh locks but knowing some code is only ever getting invoked from _bh context we can better. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[IPCOMP]: Fix truesize after decompressionHerbert Xu2-2/+4
The truesize check has uncovered the fact that we forgot to update truesize after pskb_expand_head. Unfortunately pskb_expand_head can't update it for us because it's used in all sorts of different contexts, some of which would not allow truesize to be updated by itself. So the solution for now is to simply update it in IPComp. This patch also changes skb_put to __skb_put since we've just expanded tailroom by exactly that amount so we know it's there (but gcc does not). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[IPV6]: Use ipv6_addr_src_scope for link address sorting.YOSHIFUJI Hideaki1-1/+2
In the source address selection, the address must be sorted from global to node-local. But, ifp->scope is different from the scope for source address selection. 2001::1 fe80::1 ::1 ifp->scope 0 0x02 0x01 ipv6_addr_src_scope(&ifp->addr) 0x0e 0x02 0x01 So, we need to use ipv6_addr_src_scope(&ifp->addr) for sorting. And, for backward compatibility, addresses should be sorted from new one to old one. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[TCP] tcp_highspeed: Fix AI updates.Xiaoliang (David) Wei1-4/+9
I think there is still a problem with the AIMD parameter update in HighSpeed TCP code. Line 125~138 of the code (net/ipv4/tcp_highspeed.c): /* Update AIMD parameters */ if (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd) { while (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd && ca->ai < HSTCP_AIMD_MAX - 1) ca->ai++; } else if (tp->snd_cwnd < hstcp_aimd_vals[ca->ai].cwnd) { while (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd && ca->ai > 0) ca->ai--; In fact, the second part (decreasing ca->ai) never decreases since the while loop's inequality is in the reverse direction. This leads to unfairness with multiple flows (once a flow happens to enjoy a higher ca->ai, it keeps enjoying that even its cwnd decreases) Here is a tentative fix (I also added a comment, trying to keep the change clear): Acked-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[MAINTAINERS]: Add proper entry for TC classifierStephen Hemminger1-2/+0
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[NETROM]: Drop lock before calling nr_destroy_socketRalf Baechle1-1/+1
nr_destroy_socket takes the socket lock itself so it should better be called with the socket unlocked. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[NETROM]: Fix locking order when establishing a NETROM circuit.Ralf Baechle1-6/+6
When establishing a new circuit in nr_rx_frame the locks are taken in a different order than in the rest of the stack. This should be harmless but triggers lockdep. Either way, reordering the code a little solves the issue. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[AX.25]: Fix locking of ax25 protocol function list.Ralf Baechle1-9/+9
Delivery of AX.25 frame to the layer 3 protocols happens in softirq context so locking needs to be bh-proof. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-12[IPV6]: order addresses by scopeBrian Haley1-3/+21
If IPv6 addresses are ordered by scope, then ipv6_dev_get_saddr() can break-out of the device addr_list for() loop when the candidate source address scope is less than the destination address scope. Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-10[DCCP]: Fix sparse warnings.Alan Cox1-2/+2
No actual bugs that I can see just a couple of unmarked casts getting annoying in my debug log files. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-10[TCP]: Remove TCP CompoundDavid S. Miller3-459/+0
This reverts: f890f921040fef6a35e39d15b729af1fd1a35f29 The inclusion of TCP Compound needs to be reverted at this time because it is not 100% certain that this code conforms to the requirements of Developer's Certificate of Origin 1.1 paragraph (b). Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-10[IPV4] inetpeer: Get rid of volatile from peer_totalHerbert Xu1-1/+1
The variable peer_total is protected by a lock. The volatile marker makes no sense. This shaves off 20 bytes on i386. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-10[AX.25]: Get rid of the last volatile.Ralf Baechle1-1/+1
This volatile makes no sense - not even wearing pink shades ... Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-09[AX.25]: Use kzallocRalf Baechle4-10/+4
Replace kzalloc instead of kmalloc + memset. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-09[ATM] net/atm/clip.c: fix PROC_FS=n compileAdrian Bunk1-4/+9
This patch fixes the following compile error with CONFIG_PROC_FS=n by reverting commit dcdb02752ff13a64433c36f2937a58d93ae7a19e: <-- snip --> ... CC net/atm/clip.o net/atm/clip.c: In function ‘atm_clip_init’: net/atm/clip.c:975: error: ‘atm_proc_root’ undeclared (first use in this function) net/atm/clip.c:975: error: (Each undeclared identifier is reported only once net/atm/clip.c:975: error: for each function it appears in.) net/atm/clip.c:977: error: ‘arp_seq_fops’ undeclared (first use in this function) make[2]: *** [net/atm/clip.o] Error 1 <-- snip --> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-09[PKT_SCHED]: act_api: Fix module leak while flushing actionsThomas Graf1-1/+1
Module reference needs to be given back if message header construction fails. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-08[NET]: Fix IPv4/DECnet routing rule dumpingPatrick McHardy2-3/+4
When more rules are present than fit in a single skb, the remaining rules are incorrectly skipped. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-08[NET] gso: Fix up GSO packets with broken checksumsHerbert Xu5-32/+166
Certain subsystems in the stack (e.g., netfilter) can break the partial checksum on GSO packets. Until they're fixed, this patch allows this to work by recomputing the partial checksums through the GSO mechanism. Once they've all been converted to update the partial checksum instead of clearing it, this workaround can be removed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-08[NET] gso: Add skb_is_gsoHerbert Xu6-8/+8
This patch adds the wrapper function skb_is_gso which can be used instead of directly testing skb_shinfo(skb)->gso_size. This makes things a little nicer and allows us to change the primary key for indicating whether an skb is GSO (if we ever want to do that). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-08[ATM]: fix possible recursive locking in skb_migrate()Arjan van de Ven1-6/+11
ok this is a real potential deadlock in a way, it takes two locks of 2 skbuffs without doing any kind of lock ordering; I think the following patch should fix it. Just sort the lock taking order by address of the skb.. it's not pretty but it's the best this can do in a minimally invasive way. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-07[NET]: Fix network device interface printk message priorityStephen Hemminger1-3/+3
The printk's in the network device interface code should all be tagged with severity. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-05Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds1-8/+10
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [PKT_SCHED]: Fix error handling while dumping actions [PKT_SCHED]: Return ENOENT if action module is unavailable [PKT_SCHED]: Fix illegal memory dereferences when dumping actions
2006-07-05[PKT_SCHED]: Fix error handling while dumping actionsThomas Graf1-2/+4
"return -err" and blindly inheriting the error code in the netlink failure exception handler causes errors codes to be returned as positive value therefore making them being ignored by the caller. May lead to sending out incomplete netlink messages. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-05[PKT_SCHED]: Return ENOENT if action module is unavailableThomas Graf1-0/+1
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-05[PKT_SCHED]: Fix illegal memory dereferences when dumping actionsThomas Graf1-6/+5
The TCA_ACT_KIND attribute is used without checking its availability when dumping actions therefore leading to a value of 0x4 being dereferenced. The use of strcmp() in tc_lookup_action_n() isn't safe when fed with string from an attribute without enforcing proper NUL termination. Both bugs can be triggered with malformed netlink message and don't require any privileges. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-05Merge git://git.linux-nfs.org/pub/linux/nfs-2.6Linus Torvalds1-2/+1
* git://git.linux-nfs.org/pub/linux/nfs-2.6: NLM,NFSv4: Wait on local locks before we put RPC calls on the wire VFS: Add support for the FL_ACCESS flag to flock_lock_file() NFSv4: Ensure nfs4_lock_expired() caches delegated locks NLM,NFSv4: Don't put UNLOCK requests on the wire unless we hold a lock VFS: Allow caller to determine if BSD or posix locks were actually freed NFS: Optimise away an excessive GETATTR call when a file is symlinked This fixes a panic doing the first READDIR or READDIRPLUS call when: NFS: Fix NFS page_state usage Revert "Merge branch 'odirect'"
2006-07-05[PATCH] softmac: fix build-break from 881ee6999d66c8fc903b429b73bbe6045b38c549John W. Linville1-2/+2
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-05[PATCH] CONFIG_WIRELESS_EXT is neccessary after allHorms1-0/+2
WARNING: /lib/modules/2.6.17-mm2/kernel/net/ieee80211/ieee80211.ko needs unknown symbol wireless_spy_update Someone removed the `#ifdef CONFIG_WIRELESS_EXT' from around the callsite in net/ieee80211/ieee80211_rx.c and didn't update Kconfig appropriately. The offending patchset seems to be 35c14b855f52c49e4f3d078b9532b056005ed321 which is tittled [PATCH] ieee80211: remove unnecessary CONFIG_WIRELESS_EXT checking After a quick look it seems that wireless_spy_update() lives in net/core/wirless.c, and that file is only compiled if CONFIG_WIRELESS_EXT is set. Perhaps this is Kconig work, but in the mean time here is a reversal of the recent change. Signed-Off-By: Horms <horms@verge.net.au> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-05[PATCH] SoftMAC: Add network to ieee80211softmac_call_events when associate ↵Joseph Jezak1-2/+4
times out The ieee80211softmac_call_events function, when called with event type IEEE80211SOFTMAC_EVENT_ASSOCIATE_TIMEOUT should pass the network as the third parameter. This patch does that. Signed-off-by: Joseph Jezak <josejx@gentoo.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-05[PATCH] SoftMAC: Prevent multiple authentication attempts on the same networkJoseph Jezak3-9/+56
This patch addresses the "No queue exists" messages commonly seen during authentication and associating. These appear due to scheduling multiple authentication attempts on the same network. To prevent this, I added a flag to stop multiple authentication attempts by the association layer. I also added a check to the wx handler to see if we're connecting to a different network than the one already in progress. This scenario was causing multiple requests on the same network because the network BSSID was not being updated despite the fact that the ESSID changed. Signed-off-by: Joseph Jezak <josejx@gentoo.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-05[PATCH] 2.6.17 missing a call to ieee80211softmac_capabilities from ↵Larry Finger1-0/+3
ieee80211softmac_assoc_req In commit ba9b28d19a3251bb1dfe6a6f8cc89b96fb85f683, routine ieee80211softmac_capabilities was added to ieee80211softmac_io.c. As denoted by its name, it completes the capabilities IE that is needed in the associate and reassociate requests sent to the AP. For at least one AP, the Linksys WRT54G V5, the capabilities field must set the 'short preamble' bit or the AP refuses to associate. In the commit noted above, there is a call to the new routine from ieee80211softmac_reassoc_req, but not from ieee80211softmac_assoc_req. This patch fixes that oversight. As noted in the subject, v2.6.17 is affected. My bcm43xx card had been unable to associate since I was forced to buy a new AP. I finally was able to get a packet dump and traced the problem to the capabilities info. Although I had heard that a patch was "floating around", I had not seen it before 2.6.17 was released. As this bug does not affect security and I seem to have the only AP affected by it, there should be no problem in leaving it for 2.6.18. Signed-Off-By: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-05[PATCH] skb used after passing to netif_rx in net/ieee80211/ieee80211_rx.cEric Sesterhenn1-1/+1
this patch fixes coverity id #913. ieee80211_monitor_rx() passes the skb to netif_rx() and we should not reference it any longer. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-05[PATCH] ieee80211: fix not allocating IV+ICV space when usingencryption in ↵Hong Liu1-4/+11
ieee80211_tx_frame We should preallocate IV+ICV space when encrypting the frame. Currently no problem shows up just because dev_alloc_skb aligns the data len to SMP_CACHE_BYTES which can be used for ICV. Signed-off-by: Hong Liu <hong.liu@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-05This fixes a panic doing the first READDIR or READDIRPLUS call when:Trond Myklebust1-2/+1
* the client is ia64 or any platform that actually implements flush_dcache_page(), and * the server returns fsinfo.dtpref >= client's PAGE_SIZE, and * the server does *not* return post-op attributes for the directory in the READDIR reply. Problem diagnosed by Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-03[Bluetooth] Add RFCOMM role switch supportMarcel Holtmann1-0/+5
This patch adds the support for RFCOMM role switching before the connection is fully established. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-03[Bluetooth] Allow disabling of credit based flow controlMarcel Holtmann1-5/+13
This patch adds the module parameter disable_cfc which can be used to disable the credit based flow control. The credit based flow control was introduced with the Bluetooth 1.1 specification and devices can negotiate its support, but for testing purpose it is helpful to allow disabling of it. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-03[Bluetooth] Small cleanup of the L2CAP source codeMarcel Holtmann1-180/+177
This patch is a small cleanup of the L2CAP source code. It makes some coding style changes and moves some functions around to avoid forward declarations. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-03[Bluetooth] Use real devices for host controllersMarcel Holtmann6-97/+75
This patch converts the Bluetooth class devices into real devices. The Bluetooth class is kept and the driver core provides the appropriate symlinks for backward compatibility. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-03[Bluetooth] Add platform device for virtual and serial devicesMarcel Holtmann2-8/+51
This patch adds a generic Bluetooth platform device that can be used as parent device by virtual and serial devices. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-03[Bluetooth] Add automatic sniff mode supportMarcel Holtmann5-48/+368
This patch introduces the automatic sniff mode feature. This allows the host to switch idle connections into sniff mode to safe power. Signed-off-by: Ulisses Furquim <ulissesf@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-03[Bluetooth] Correct SCO buffer size on requestMarcel Holtmann1-3/+11
This patch introduces a quirk that allows the drivers to tell the host to correct the SCO buffer size values. Signed-off-by: Olivier Galibert <galibert@pobox.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-03[BRIDGE]: br_dump_ifinfo index fixAndrey Savochkin1-1/+2
Fix for inability of br_dump_ifinfo to handle non-zero start index: loop index never increases when entered with non-zero start. Spotted by Kirill Korotaev. Signed-off-by: Andrey Savochkin <saw@swsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-03[NET]: add+use poison definesRandy Dunlap2-2/+4
Add and use poison defines in net/. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-03[IPV6]: Fix ipv6 GSO payload lengthMichael Chan1-1/+2
Fix ipv6 GSO payload length calculation. The ipv6 payload length excludes the ipv6 base header length and so must be subtracted. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-03[TIPC] Fixed sk_buff panic caused by tipc_link_bundle_buf (REVISED)Allan Stephens2-1/+6
The recent change to direct inspection of bundle buffer tailroom did not account for the possiblity of unrequested tailroom added by skb_alloc(), thereby allowing a bundle to be created that exceeds the current link MTU. An additional check now ensures that bundling works correctly no matter if the bundle buffer is smaller, larger, or equal to the link MTU. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-03[NET]: Verify gso_type too in gso_segmentHerbert Xu3-4/+31
We don't want nasty Xen guests to pass a TCPv6 packet in with gso_type set to TCPv4 or even UDP (or a packet that's both TCP and UDP). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-03[ROSE]: Try all routes when establishing a ROSE connections.Ralf Baechle1-1/+6
From Jean-Paul F6FBB ROSE will only try to establish a route using the first route in its routing table. Fix to iterate through all additional routes if a connection attempt has failed. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-03[NETROM]: Use socket helpers instead of direct fiddling with struct sockRalf Baechle DL5RB1-2/+2
Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-03[AX.25]: Reference counting for AX.25 routes.Ralf Baechle DL5RB2-49/+23
In the past routes could be freed even though the were possibly in use ... Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-03[ROSE]: Fix dereference of skb pointer after free.Ralf Baechle1-1/+4
If rose_route_frame return success we'll dereference a stale pointer. Likely this is only going to result in bad statistics for the ROSE interface. This fixes coverity 946. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-03[AF_UNIX]: datagram getpeersec fixAndrew Morton1-1/+1
The unix_get_peersec_dgram() stub should have been inlined so that it disappears. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-03[PATCH] lockdep: annotate sk_locksIngo Molnar1-9/+88
Teach sk_lock semantics to the lock validator. In the softirq path the slock has mutex_trylock()+mutex_unlock() semantics, in the process context sock_lock() case it has mutex_lock()/mutex_unlock() semantics. Thus we treat sock_owned_by_user() flagged areas as an exclusion area too, not just those areas covered by a held sk_lock.slock. Effect on non-lockdep kernels: minimal, sk_lock_sock_init() has been turned into an inline function. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate vlan net device as being a special classArjan van de Ven1-0/+11
vlan network devices have devices nesting below it, and are a special "super class" of normal network devices; split their locks off into a separate class since they always nest. [deweerdt@free.fr: fix possible null-pointer deref] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate sunrpc codeArjan van de Ven1-4/+4
Add i_mutex ordering annotations to the sunrpc rpc_pipe code. This code has 3 levels of i_mutex hierarchy in some cases: parent dir, client dir and file inside client dir; the i_mutex ordering is I_MUTEX_PARENT -> I_MUTEX_CHILD -> I_MUTEX_NORMAL This patch applies this ordering annotation to the various functions. This is in line with the VFS expected ordering where it is always OK to lock a child after locking a parent; the sunrpc code is very diligent in doing this correctly. Has no effect on non-lockdep kernels. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate bh_lock_sock()Ingo Molnar1-1/+1
Teach special (recursive) locking code to the lock validator. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate af_unix lockingIngo Molnar1-1/+11
Teach special (recursive) locking code to the lock validator. Also splits af_unix's sk_receive_queue.lock class from the other networking skb-queue locks. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate sock_lock_init()Ingo Molnar1-0/+16
Teach special (multi-initialized, per-address-family) locking code to the lock validator. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate skb_queue_head_initIngo Molnar1-0/+7
Teach special (multi-initialized) locking code to the lock validator. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: fix RT_HASH_LOCK_SZIngo Molnar1-9/+14
On lockdep we have a quite big spinlock_t, so keep the size down. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: prove spinlock rwlock locking correctnessIngo Molnar1-1/+2
Use the lock validator framework to prove spinlock and rwlock locking correctness. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: locking init debugging improvementIngo Molnar2-2/+2
Locking init improvement: - introduce and use __SPIN_LOCK_UNLOCKED for array initializations, to pass in the name string of locks, used by debugging Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] bcm43xx: netlink deadlock fixArjan van de Ven1-4/+4
reported by Jure Repinc: > > http://bugzilla.kernel.org/show_bug.cgi?id=6773 > > checked out dmesg output and found the message > > > > ====================================================== > > [ BUG: hard-safe -> hard-unsafe lock order detected! ] > > ------------------------------------------------------ > > > > starting at line 660 of the dmesg.txt that I will attach. The patch below should fix the deadlock, albeit I suspect it's not the "right" fix; the right fix may well be to move the rx processing in bcm43xx to softirq context. [it's debatable, ipw2200 hit this exact same bug; at some point it's better to bite the bullet and move this to the common layer as my patch below does] Make the nl_table_lock irq-safe; it's taken for read in various netlink functions, including functions that several wireless drivers (ipw2200, bcm43xx) want to call from hardirq context. The deadlock was found by the lock validator. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michael Buesch <mb@bu3sch.de> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: Jeff Garzik <jeff@garzik.org> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: jamal <hadi@cyberus.ca> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds13-33/+91
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [IPV6]: Added GSO support for TCPv6 [NET]: Generalise TSO-specific bits from skb_setup_caps [IPV6]: Added GSO support for TCPv6 [IPV6]: Remove redundant length check on input [NETFILTER]: SCTP conntrack: fix crash triggered by packet without chunks [TG3]: Update version and reldate [TG3]: Add TSO workaround using GSO [TG3]: Turn on hw fix for ASF problems [TG3]: Add rx BD workaround [TG3]: Add tg3_netif_stop() in vlan functions [TCP]: Reset gso_segs if packet is dodgy
2006-06-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivialLinus Torvalds304-307/+4
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: Remove obsolete #include <linux/config.h> remove obsolete swsusp_encrypt arch/arm26/Kconfig typos Documentation/IPMI typos Kconfig: Typos in net/sched/Kconfig v9fs: do not include linux/version.h Documentation/DocBook/mtdnand.tmpl: typo fixes typo fixes: specfic -> specific typo fixes in Documentation/networking/pktgen.txt typo fixes: occuring -> occurring typo fixes: infomation -> information typo fixes: disadvantadge -> disadvantage typo fixes: aquire -> acquire typo fixes: mecanism -> mechanism typo fixes: bandwith -> bandwidth fix a typo in the RTC_CLASS help text smb is no longer maintained Manually merged trivial conflict in arch/um/kernel/vmlinux.lds.S
2006-06-30[IPV6]: Added GSO support for TCPv6Herbert Xu5-12/+6
This patch adds GSO support for IPv6 and TCPv6. This is based on a patch by Ananda Raju <Ananda.Raju@neterion.com>. His original description is: This patch enables TSO over IPv6. Currently Linux network stacks restricts TSO over IPv6 by clearing of the NETIF_F_TSO bit from "dev->features". This patch will remove this restriction. This patch will introduce a new flag NETIF_F_TSO6 which will be used to check whether device supports TSO over IPv6. If device support TSO over IPv6 then we don't clear of NETIF_F_TSO and which will make the TCP layer to create TSO packets. Any device supporting TSO over IPv6 will set NETIF_F_TSO6 flag in "dev->features" along with NETIF_F_TSO. In case when user disables TSO using ethtool, NETIF_F_TSO will get cleared from "dev->features". So even if we have NETIF_F_TSO6 we don't get TSO packets created by TCP layer. SKB_GSO_TCPV4 renamed to SKB_GSO_TCP to make it generic GSO packet. SKB_GSO_UDPV4 renamed to SKB_GSO_UDP as UFO is not a IPv4 feature. UFO is supported over IPv6 also The following table shows there is significant improvement in throughput with normal frames and CPU usage for both normal and jumbo. -------------------------------------------------- | | 1500 | 9600 | | ------------------|-------------------| | | thru CPU | thru CPU | -------------------------------------------------- | TSO OFF | 2.00 5.5% id | 5.66 20.0% id | -------------------------------------------------- | TSO ON | 2.63 78.0 id | 5.67 39.0% id | -------------------------------------------------- Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30[NET]: Generalise TSO-specific bits from skb_setup_capsHerbert Xu3-7/+6
This patch generalises the TSO-specific bits from sk_setup_caps by adding the sk_gso_type member to struct sock. This makes sk_setup_caps generic so that it can be used by TCPv6 or UFO. The only catch is that whoever uses this must provide a GSO implementation for their protocol which I think is a fair deal :) For now UFO continues to live without a GSO implementation which is OK since it doesn't use the sock caps field at the moment. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30[IPV6]: Added GSO support for TCPv6Herbert Xu4-2/+66
This patch adds GSO support for IPv6 and TCPv6. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30[IPV6]: Remove redundant length check on inputHerbert Xu1-6/+1
We don't need to check skb->len when we're just about to call pskb_may_pull since that checks it for us. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30[NETFILTER]: SCTP conntrack: fix crash triggered by packet without chunksPatrick McHardy2-2/+2
When a packet without any chunks is received, the newconntrack variable in sctp_packet contains an out of bounds value that is used to look up an pointer from the array of timeouts, which is then dereferenced, resulting in a crash. Make sure at least a single chunk is present. Problem noticed by George A. Theall <theall@tenablesecurity.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30[TCP]: Reset gso_segs if packet is dodgyHerbert Xu1-4/+10
I wasn't paranoid enough in verifying GSO information. A bogus gso_segs could upset drivers as much as a bogus header would. Let's reset it in the per-protocol gso_segment functions. I didn't verify gso_size because that can be verified by the source of the dodgy packets. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30[PATCH] knfsd: svcrpc: gss: server-side implementation of rpcsec_gss privacyJ. Bruce Fields2-7/+148
Server-side implementation of rpcsec_gss privacy, which enables encryption of the payload of every rpc request and response. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd: mark rqstp to prevent use of sendfile in privacy caseJ. Bruce Fields1-0/+2
Add a rq_sendfile_ok flag to svc_rqst which will be cleared in the privacy case so that the wrapping code will get copies of the read data instead of real page cache pages. This makes life simpler when we encrypt the response. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: svcrpc: Simplify nfsd rpcsec_gss integrity codeJ. Bruce Fields1-51/+64
Pull out some of the integrity code into its own function, otherwise svcauth_gss_release() is going to become very ungainly after the addition of privacy code. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: svcrpc: gss: simplify rsc_parse()J. Bruce Fields2-11/+7
Adopt a simpler convention for gss_mech_put(), to simplify rsc_parse(). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30Remove obsolete #include <linux/config.h>Jörn Engel303-303/+0
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30Kconfig: Typos in net/sched/KconfigMatt LaPlante1-4/+4
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-29[TIPC]: Initial activation message now includes TIPC version numberAllan Stephens1-1/+2
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29[TIPC]: Improve response to requests for node/link informationAllan Stephens2-11/+19
Now allocates reply space for "get links" request based on number of actual links, not number of potential links. Also, limits reply to "get links" and "get nodes" requests to 32KB to match capabilities of tipc-config utility that issued request. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com>
2006-06-29[TIPC]: Fixed skb_under_panic caused by tipc_link_bundle_bufAllan Stephens1-5/+6
Now determines tailroom of bundle buffer by directly inspection of buffer. Previously, buffer was assumed to have a max capacity equal to the link MTU, but the addition of link MTU negotiation means that the link MTU can increase after the bundle buffer is allocated. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29[IrDA]: Fix RCU lock pairing on error pathJosh Triplett1-1/+2
irlan_client_discovery_indication calls rcu_read_lock and rcu_read_unlock, but returns without unlocking in an error case. Fix that by replacing the return with a goto so that the rcu_read_unlock always gets executed. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Samuel Ortiz samuel@sortiz.org <> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29[XFRM]: unexport xfrm_state_mtuAdrian Bunk1-2/+0
This patch removes the unused EXPORT_SYMBOL(xfrm_state_mtu). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29[NET]: make skb_release_data() staticAdrian Bunk1-1/+1
skb_release_data() no longer has any users in other files. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29[NETFILTE] ipv4: Fix typo (Bugzilla #6753)Matt LaPlante1-1/+1
This patch fixes bugzilla #6753, a typo in the netfilter Kconfig Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29[ATM]: basic sysfs support for ATM devicesRoman Kagan6-5/+206
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29[NET]: Add ECN support for TSOMichael Chan3-8/+0
In the current TSO implementation, NETIF_F_TSO and ECN cannot be turned on together in a TCP connection. The problem is that most hardware that supports TSO does not handle CWR correctly if it is set in the TSO packet. Correct handling requires CWR to be set in the first packet only if it is set in the TSO header. This patch adds the ability to turn on NETIF_F_TSO and ECN using GSO if necessary to handle TSO packets with CWR set. Hardware that handles CWR correctly can turn on NETIF_F_TSO_ECN in the dev-> features flag. All TSO packets with CWR set will have the SKB_GSO_TCPV4_ECN set. If the output device does not have the NETIF_F_TSO_ECN feature set, GSO will split the packet up correctly with CWR only set in the first segment. With help from Herbert Xu <herbert@gondor.apana.org.au>. Since ECN can always be enabled with TSO, the SOCK_NO_LARGESEND sock flag is completely removed. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29[AF_UNIX]: Datagram getpeersecCatherine Zhang2-0/+38
This patch implements an API whereby an application can determine the label of its peer's Unix datagram sockets via the auxiliary data mechanism of recvmsg. Patch purpose: This patch enables a security-aware application to retrieve the security context of the peer of a Unix datagram socket. The application can then use this security context to determine the security context for processing on behalf of the peer who sent the packet. Patch design and implementation: The design and implementation is very similar to the UDP case for INET sockets. Basically we build upon the existing Unix domain socket API for retrieving user credentials. Linux offers the API for obtaining user credentials via ancillary messages (i.e., out of band/control messages that are bundled together with a normal message). To retrieve the security context, the application first indicates to the kernel such desire by setting the SO_PASSSEC option via getsockopt. Then the application retrieves the security context using the auxiliary data mechanism. An example server application for Unix datagram socket should look like this: toggle = 1; toggle_len = sizeof(toggle); setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, &toggle_len); recvmsg(sockfd, &msg_hdr, 0); if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) { cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr); if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) && cmsg_hdr->cmsg_level == SOL_SOCKET && cmsg_hdr->cmsg_type == SCM_SECURITY) { memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext)); } } sock_setsockopt is enhanced with a new socket option SOCK_PASSSEC to allow a server socket to receive security context of the peer. Testing: We have tested the patch by setting up Unix datagram client and server applications. We verified that the server can retrieve the security context using the auxiliary data mechanism of recvmsg. Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com> Acked-by: Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29[NET]: Make illegal_highdma more analHerbert Xu1-4/+2
Rather than having illegal_highdma as a macro when HIGHMEM is off, we can turn it into an inline function that returns zero. This will catch callers that give it bad arguments. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29[TCP]: Export accept queue len of a TCP listening socket via rx_queueSridhar Samudrala3-3/+8
While debugging a TCP server hang issue, we noticed that currently there is no way for a user to get the acceptq backlog value for a TCP listen socket. All the standard networking utilities that display socket info like netstat, ss and /proc/net/tcp have 2 fields called rx_queue and tx_queue. These fields do not mean much for listening sockets. This patch uses one of these unused fields(rx_queue) to export the accept queue len for listening sockets. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29[NETLINK]: Encapsulate eff_cap usage within security framework.Darrel Goeddel7-7/+7
This patch encapsulates the usage of eff_cap (in netlink_skb_params) within the security framework by extending security_netlink_recv to include a required capability parameter and converting all direct usage of eff_caps outside of the lsm modules to use the interface. It also updates the SELinux implementation of the security_netlink_send and security_netlink_recv hooks to take advantage of the sid in the netlink_skb_params struct. This also enables SELinux to perform auditing of netlink capability checks. Please apply, for 2.6.18 if possible. Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29[NET]: Added GSO header verificationHerbert Xu6-19/+40
When GSO packets come from an untrusted source (e.g., a Xen guest domain), we need to verify the header integrity before passing it to the hardware. Since the first step in GSO is to verify the header, we can reuse that code by adding a new bit to gso_type: SKB_GSO_DODGY. Packets with this bit set can only be fed directly to devices with the corresponding bit NETIF_F_GSO_ROBUST. If the device doesn't have that bit, then the skb is fed to the GSO engine which will allow the packet to be sent to the hardware if it passes the header check. This patch changes the sg flag to a full features flag. The same method can be used to implement TSO ECN support. We simply have to mark packets with CWR set with SKB_GSO_ECN so that only hardware with a corresponding NETIF_F_TSO_ECN can accept them. The GSO engine can either fully segment the packet, or segment the first MTU and pass the rest to the hardware for further segmentation. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>