kernel/git/leon/iproute2.git - Leon Romanovsky's fork of iproute2.git

Age	Commit message	Author	Files	Lines
2022-08-24	Merge branch 'devlink-rm-dl_argv_parse_put' into next (HEAD, master)	David Ahern	1	-258/+329
	Jacob Keller says: ==================== This series removes the dl_argv_parse_put function which both parses the command line arguments and places them into the netlink header. This was originally sent as an RFC at https://lore.kernel.org/netdev/20220805234155.2878160-1-jacob.e.keller@intel.com/ Since there is some ongoing work around policy code being generated from YAML, I thought it best to wait on the devlink policy portion of this series for now. Jiri mentioned he wanted to base some work on top of this, so I am sending just the cleanup patches. The primary motivation for this is due to the fact that dl_argv_parse_put requires a netlink header, meaning a command must have already been prepared. This prevents addition of a different netlink command to get the policy data, and thus prevents us from using this variant while checking netlink policy. ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-08-24	devlink: remove dl_argv_parse_put	Jacob Keller	1	-258/+329
	The dl_argv_parse_put function is used to extract arguments from the command line and convert them to the appropriate netlink attributes. This function is a combination of calling dl_argv_parse and dl_put_opts. A future change is going to refactor dl_argv_parse to check the kernel's netlink policy for the command. This requires issuing another netlink message which requires calling dl_argv_parse before mnlu_gen_socket_cmd_prepare. Otherwise, the get policy command issued in dl_argv_parse would overwrite the prepared buffer. This conflicts with dl_argv_parse_put which requires being called after mnlu_gen_socket_cmd_prepare. Remove dl_argv_parse_put and replace it with appropriate calls to dl_argv_parse and dl_put_opts. This allows us to ensure dl_argv_parse is called before mnlu_gen_socket_cmd_prepare while dl_put_opts is called afterwards. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-08-24	devlink: use dl_no_arg instead of checking dl_argc == 0	Jacob Keller	1	-16/+16
	Use the helper dl_no_arg function to check whether the command has any arguments. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-08-14	devlink: expose nested devlink for a line card object	Jiri Pirko	1	-0/+23
	If line card object contains a nested devlink, expose it. Example: $ devlink lc show pci/0000:01:00.0 lc 1 pci/0000:01:00.0: lc 1 state active type 16x100G nested_devlink auxiliary/mlxsw_core.lc.0 supported_types: 16x100G $ devlink dev show auxiliary/mlxsw_core.lc.0 auxiliary/mlxsw_core.lc.0 Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-08-14	Merge branch 'main' into next	David Ahern	2	-0/+6
	Signed-off-by: David Ahern <dsahern@kernel.org>
2022-08-12	configure: Define _GNU_SOURCE when checking for setns	Khem Raj	1	-0/+1
	glibc defines this function only as a GNU extension. Signed-off-by: Khem Raj <raj.khem@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-08-09	ipstats: add missing headers	Stephen Hemminger	1	-0/+4
	IWYU reports several headers are not explicitly included by ipstats. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-08-09	ipstats: Add param.h for musl	Changhyeok Bae	1	-0/+1
	Fix build error for musl | /usr/src/debug/iproute2/5.19.0-r0/iproute2-5.19.0/ip/ipstats.c:231: undefined reference to `MIN' Signed-off-by: Changhyeok Bae <changhyeok.bae@gmail.com>
2022-08-04	Merge branch 'main' into next	David Ahern	1	-1/+1
	Signed-off-by: David Ahern <dsahern@kernel.org>
2022-08-04	devlink: add support for running selftests	Vikas Gupta	3	-1/+403
	Add commands and helper APIs to run selftests. Include a selftest id for a non volatile memory i.e. flash. Also, update the man page and bash-completion for selftests commands. Examples: $ devlink dev selftests run pci/0000:03:00.0 id flash pci/0000:03:00.0: flash: status passed $ devlink dev selftests show pci/0000:03:00.0 pci/0000:03:00.0 flash $ devlink dev selftests show pci/0000:03:00.0 -j {"selftests":{"pci/0000:03:00.0":["flash"]}} $ devlink dev selftests run pci/0000:03:00.0 id flash -j {"selftests":{"pci/0000:03:00.0":{"flash":{"status":"passed"}}}} Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-08-02	v5.19.0	Stephen Hemminger	1	-1/+1

2022-08-01	Merge branch 'main' into next	David Ahern	8	-12/+309
	Conflicts: vdpa/include/uapi/linux/vdpa.h Signed-off-by: David Ahern <dsahern@kernel.org>
2022-08-01	seg6: add support for SRv6 Headend Reduced Encapsulation	Paolo Lungaroni	3	-2/+16
	This patch adds support for the reduced version of the H.Encaps and H.L2Encaps behaviors as defined in RFC 8986 [1]. H.Encaps.Red and H.L2Encaps.Red SRv6 behaviors are an optimization of the H.Encaps and H.L2Encaps aiming to reduce the length of the SID List carried in the pushed SRH. Specifically, the reduced version of the behaviors removes the first SID contained in the SID List (i.e. SRv6 Policy) by storing it into the IPv6 Destination Address. When the SRv6 Policy is made of only one SID, the reduced version of the behaviors omits the SRH altogether and pushes that SID directly into the IPv6 DA. Some examples: ip -6 route add 2001:db8::1 encap seg6 mode encap.red segs fcf0:1::e,fcf0:2::d6 dev eth0 ip -6 route add 2001:db8::2 encap seg6 mode l2encap.red segs fcf0:1::d2 dev eth0 Standard Output: ip -6 route show 2001:db8::1 2001:db8::1 encap seg6 mode encap.red segs 2 [ fcf0:1::e fcf0:2::d6 ] dev eth0 metric 1024 pref medium JSON Output: ip -6 -j -p route show 2001:db8::1 [ { "dst": "2001:db8::1", "encap": "seg6", "mode": "encap.red", "segs": [ "fcf0:1::e","fcf0:2::d6" ], "dev": "eth0", "metric": 1024, "flags": [ ], "pref": "medium" } ] [1] - https://datatracker.ietf.org/doc/html/rfc8986 Signed-off-by: Paolo Lungaroni <paolo.lungaroni@uniroma2.it> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-30	Update kernel headers	David Ahern	5	-8/+42
	Update kernel headers to commit 63757225a933 ("Merge tag 'mlx5-updates-2022-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux") Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-29	Merge branch 'pppoe-in-flower' into next	David Ahern	9	-27/+185
	Wojciech Drewek says: ==================== This patchset implements support for matching on PPPoE specific fields using tc-flower. First patch introduces small refactor which allows to use same mechanism of finding protocol for both ppp and ether protocols. Second patch adds support for parsing ppp protocols. Last patch is about parsing PPPoE fields. Kernel changes (merged): https://lore.kernel.org/netdev/20220726203133.2171332-1-anthony.l.nguyen@intel.com/T/#t ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-29	f_flower: Introduce PPPoE support	Wojciech Drewek	3	-1/+77
	Introduce PPPoE specific fields in tc-flower: - session id (16 bits) - ppp protocol (16 bits) Those fields can be provided only when protocol was set to ETH_P_PPP_SES. ppp_proto works similarly to vlan_ethtype, i.e. ppp_proto overwrites eth_type. Thanks to that, fields from encapsulated protocols (such as src_ip) can be specified. e.g. # tc filter add dev ens6f0 ingress prio 1 protocol ppp_ses \ flower \ pppoe_sid 1234 \ ppp_proto ip \ dst_ip 127.0.0.1 \ src_ip 127.0.0.2 \ action drop VLAN and CVLAN are also supported; in this case cvlan_ethtype or vlan_ethtype has to be set to ETH_P_PPP_SES. e.g. # tc filter add dev ens6f0 ingress prio 1 protocol 802.1Q \ flower \ vlan_id 2 \ vlan_ethtype ppp_ses \ pppoe_sid 1234 \ ppp_proto ip \ dst_ip 127.0.0.1 \ src_ip 127.0.0.2 \ action drop Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-29	lib: Introduce ppp protocols	Wojciech Drewek	3	-1/+56
	PPP protocol field uses different values than ethertype. Introduce utilities for translating PPP protocols from strings to values and vice versa. Use generic API from utils in order to get proto id and name. Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-29	lib: refactor ll_proto functions	Wojciech Drewek	3	-25/+52
	Move the core logic of ll_proto_n2a and ll_proto_a2n to utils.c and make it more generic by allowing a table of protocols to be passed as an argument (proto_tb). Introduce struct proto with protocol ID and name to allow this. This will allow those functions to be used by other use cases. Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-29	Import posix_types.h uapi file from point of last kernel headers sync	David Ahern	1	-0/+99
	__kernel_old_time_t definition is needed for pppoe-in-flower patches. Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-28	Import ppp_defs.h uapi file from point of last kernel headers sync	David Ahern	1	-0/+165
	ppp_defs header file is needed by PPPoE in flower support. Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-25	bpf_glue: include errno.h	Juhee Kang	1	-0/+1
	If __NR_bpf is not enabled, the bpf() function sets errno and returns -1. Thus, this patch includes the header. Fixes: ac4e0913beb1 ("bpf: Export bpf syscall wrapper") Signed-off-by: Juhee Kang <claudiajkang@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-21	devlink: add support for linecard show and type set	Jiri Pirko	3	-3/+378
	Introduce a new object "lc" to add devlink support for line cards with two commands: show - to get the info about the line card state, list of supported types as reported by kernel/driver. set - to set/clear the line card type. Example: $ devlink lc pci/0000:01:00.0: lc 1 state unprovisioned supported_types: 16x100G lc 2 state unprovisioned supported_types: 16x100G lc 3 state unprovisioned supported_types: 16x100G lc 4 state unprovisioned supported_types: 16x100G lc 5 state unprovisioned supported_types: 16x100G lc 6 state unprovisioned supported_types: 16x100G lc 7 state unprovisioned supported_types: 16x100G lc 8 state unprovisioned supported_types: 16x100G To provision the slot #8: $ devlink lc set pci/0000:01:00.0 lc 8 type 16x100G $ devlink lc show pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 state active type 16x100G supported_types: 16x100G To unprovision the slot #8: $ devlink lc set pci/0000:01:00.0 lc 8 notype Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-21	Update kernel headers	David Ahern	5	-4/+18
	Update kernel headers to commit: 5588d6280270 ("net/cdc_ncm: Increase NTB max RX/TX values to 64kb") Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-21	vdpa: Update man page to include vdpa statistics	Eli Cohen	1	-0/+22
	Update the man page to include vdpa statistics information introduced in 6f97e9c9337b ("vdpa: Add support for reading vdpa device statistics") Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-18	rdma: update uapi/ib_user_verbs.h	Stephen Hemminger	1	-0/+42
	Update from 5.19-rc7 Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-18	vdpa: update uapi headers from 5.19-rc7	Stephen Hemminger	3	-7/+13
	Keep VDPA sanitized headers up to current kernel. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-18	Revert "uapi: add vdpa.h"	Stephen Hemminger	1	-59/+0
	This reverts commit 291898c5ff881d0dc5d947031def0528101476cb.
2022-07-18	ip neigh: Fix memory leak when doing 'get'	Benjamin Poirier	1	-0/+2
	With the following command sequence: ip link add dummy0 type dummy ip neigh add 192.168.0.1 dev dummy0 ip neigh get 192.168.0.1 dev dummy0 when running the last command under valgrind, it reports 32,768 bytes in 1 blocks are definitely lost in loss record 2 of 2 at 0x483F7B5: malloc (vg_replace_malloc.c:381) by 0x17A0EC: rtnl_recvmsg (libnetlink.c:838) by 0x17A3D1: __rtnl_talk_iov.constprop.0 (libnetlink.c:1040) by 0x17B894: __rtnl_talk (libnetlink.c:1141) by 0x17B894: rtnl_talk (libnetlink.c:1147) by 0x12E49B: ipneigh_get (ipneigh.c:728) by 0x1174CB: do_cmd (ip.c:136) by 0x116F7C: main (ip.c:324) Free the answer obtained from rtnl_talk(). Fixes: 62842362370b ("ipneigh: neigh get support") Suggested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-18	mptcp: Fix memory leak when getting limits	Benjamin Poirier	1	-3/+7
	When running the command `ip mptcp limits` under valgrind, it reports 32,768 bytes in 1 blocks are definitely lost in loss record 1 of 1 at 0x483F7B5: malloc (vg_replace_malloc.c:381) by 0x17A0BC: rtnl_recvmsg (libnetlink.c:838) by 0x17A3A1: __rtnl_talk_iov.constprop.0 (libnetlink.c:1040) by 0x17B864: __rtnl_talk (libnetlink.c:1141) by 0x17B864: rtnl_talk (libnetlink.c:1147) by 0x16837D: mptcp_limit_get_set (ipmptcp.c:436) by 0x1174CB: do_cmd (ip.c:136) by 0x116F7C: main (ip.c:324) Free the answer obtained from rtnl_talk(). Fixes: 7e0767cd862b ("add support for mptcp netlink interface") Suggested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-18	mptcp: Fix memory leak when doing 'endpoint show'	Benjamin Poirier	1	-0/+1
	With the following command sequence: ip mptcp endpoint add 127.0.0.1 id 1 ip mptcp endpoint show id 1 when running the last command under valgrind, it reports 32,768 bytes in 1 blocks are definitely lost in loss record 2 of 2 at 0x483F7B5: malloc (vg_replace_malloc.c:381) by 0x17A0AC: rtnl_recvmsg (libnetlink.c:838) by 0x17A391: __rtnl_talk_iov.constprop.0 (libnetlink.c:1040) by 0x17B854: __rtnl_talk (libnetlink.c:1141) by 0x17B854: rtnl_talk (libnetlink.c:1147) by 0x168A56: mptcp_addr_show (ipmptcp.c:334) by 0x1174CB: do_cmd (ip.c:136) by 0x116F7C: main (ip.c:324) Free the answer obtained from rtnl_talk(). Fixes: 7e0767cd862b ("add support for mptcp netlink interface") Suggested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-18	bridge: Fix memory leak when doing 'fdb get'	Benjamin Poirier	1	-2/+5
	With the following command sequence: ip link add br0 up type bridge ip link add dummy0 up address 02:00:00:00:00:01 master br0 type dummy bridge fdb get 02:00:00:00:00:01 br br0 when running the last command under valgrind, it reports 32,768 bytes in 1 blocks are definitely lost in loss record 2 of 2 at 0x483F7B5: malloc (vg_replace_malloc.c:381) by 0x11C1EC: rtnl_recvmsg (libnetlink.c:838) by 0x11C4D1: __rtnl_talk_iov.constprop.0 (libnetlink.c:1040) by 0x11D994: __rtnl_talk (libnetlink.c:1141) by 0x11D994: rtnl_talk (libnetlink.c:1147) by 0x10D336: fdb_get (fdb.c:652) by 0x48907FC: (below main) (libc-start.c:332) Free the answer obtained from rtnl_talk(). Fixes: 4ed5ad7bd3c6 ("bridge: fdb get support") Reported-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-18	ip address: Fix memory leak when specifying device	Benjamin Poirier	1	-0/+2
	Running a command like `ip addr show dev lo` under valgrind informs us that 32,768 bytes in 1 blocks are definitely lost in loss record 4 of 4 at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) by 0x16CBE2: rtnl_recvmsg (libnetlink.c:775) by 0x16CF04: __rtnl_talk_iov (libnetlink.c:954) by 0x16E257: __rtnl_talk (libnetlink.c:1059) by 0x16E257: rtnl_talk (libnetlink.c:1065) by 0x115CB1: ipaddr_link_get (ipaddress.c:1833) by 0x11A0D1: ipaddr_list_flush_or_save (ipaddress.c:2030) by 0x1152EB: do_cmd (ip.c:115) by 0x114D6F: main (ip.c:321) After calling store_nlmsg(), the original buffer should be freed. That is the pattern used elsewhere through the rtnl_dump_filter() call chain. Fixes: 884709785057 ("ip address: Set device index in dump request") Reported-by: Binu Gopalakrishnapillai <binug@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-18	uapi: add virtio_ring.h	Stephen Hemminger	1	-0/+242
	When vdpa was updated, it included linux/virtio_ring.h but that sanitized header file was not added. Fixes: bd91c7647189 ("vdpa: Allow for printing negotiated features of a device") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-18	uapi: add vdpa.h	Stephen Hemminger	1	-0/+59
	Iproute2 depends on kernel headers and all necessary kernel headers should be in iproute tree. Fixes: c2ecc82b9d4c ("vdpa: Add vdpa tool") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-16	uapi: update bpf.h	Stephen Hemminger	1	-4/+7
	Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-08	libbpf: add xdp program name support	Hangbin Liu	5	-6/+45
	In a bpf program, only the program name is unique. Before this patch, if there are multiple programs with the same section name, only the first program will be attached. With program name support, users can specify the exact program they want to attach. Note this feature is only supported when iproute2 is built with libbpf. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-06	ip: Fix rx_otherhost_dropped support	Petr Machata	1	-3/+5
	The commit cited below added a new column to print_stats64(). However it then updated only one size_columns() call site, neglecting to update the remaining three. As a result, in those not-updated invocations, size_columns() now accesses a vararg argument that is not being passed, which is undefined behavior. Fixes: cebf67a35d8a ("show rx_otherehost_dropped stat in ip link show") CC: Tariq Toukan <tariqt@nvidia.com> CC: Itay Aveksis <itayav@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-06	Merge branch 'main' into next	David Ahern	3	-6/+16
	Signed-off-by: David Ahern <dsahern@kernel.org>
2022-07-05	ip: Fix size_columns() invocation that passes a 32-bit quantity	Petr Machata	1	-4/+6
	In print_stats64(), the last size_columns() invocation passes number of carrier changes as one of the arguments. The value is decoded as a 32-bit quantity, but size_columns() expects a 64-bit one. This is undefined behavior. The reason valgrind does not cite this is that the previous size_columns() invocations prime the ABI area used for the value transfer. When these other invocations are commented away, valgrind does complain that "conditional jump or move depends on uninitialised value", as would be expected. Fixes: 49437375b6c1 ("ip: dynamically size columns when printing stats") Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-05	man: tc-fq_codel: add drop_batch	Yuki Inoguchi	1	-0/+7
	Let's describe the drop_batch parameter added to tc command by Commit 7868f802e2d9 ("tc: fq_codel: add drop_batch parameter") Signed-off-by: Yuki Inoguchi <inoguchi.yuki@fujitsu.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-01	uapi: update mptcp.h	Stephen Hemminger	1	-2/+3
	Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-07-01	Merge branch 'main' into next	David Ahern	12	-33/+34
	Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-27	ip: Fix size_columns() for very large values	Petr Machata	1	-2/+2
	For values near the 64-bit boundary, the iterative application of powi *= 10 causes powi to overflow without the termination condition of powi >= val having ever been satisfied. Instead, when determining the length of the number, iterate val /= 10 and terminate when it's a single digit. Fixes: 49437375b6c1 ("ip: dynamically size columns when printing stats") CC: Tariq Toukan <tariqt@nvidia.com> CC: Itay Aveksis <itayav@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-06-26	iplink: bond_slave: add per port prio support	Hangbin Liu	2	-1/+19
	Add per port priority support for active slave re-selection during bonding failover. A higher number means higher priority. This option is only valid for active-backup(1), balance-tlb (5) and balance-alb (6) mode. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-26	Update kernel headers	David Ahern	3	-7/+99
	Update kernel headers to commit: ebeae54d3a77 ("net: pcs: xpcs: depends on PHYLINK in Kconfig") Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-21	man: tc-ct.8: fix example	Andrea Claudi	1	-2/+2
	The tc-ct manpage provides a wrong command to add an ingress qdisc to an interface: $ tc qdisc add dev eth0 handle ingress Error: argument "ingress" is wrong: invalid qdisc ID Fix it by removing the useless "handle" keyword. Fixes: 924c43778a84 ("man: tc-ct.8: Add manual page for ct tc action") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-06-21	l2tp: fix typo in AF_INET6 checksum JSON print	Andrea Claudi	1	-1/+1
	In print_tunnel json output, a typo makes it impossible to know the value of udp6_csum_rx, printing instead udp6_csum_tx two times. Fix it by getting rid of the typo. Fixes: 98453b65800f ("ip/l2tp: add JSON support") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-06-21	Merge branch 'lgtm'	Stephen Hemminger	8	-27/+27

2022-06-21	man: tc-fq_codel: Fix a typo.	Yuki Inoguchi	1	-1/+2
	In tc-fq_codel man page, "length .B interval" should be "length interval." Signed-off-by: Yuki Inoguchi <inoguchi.yuki@fujitsu.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-06-19	vdpa: Add support for reading vdpa device statistics	Eli Cohen	2	-0/+169
	Read statistics of a vdpa device. The specific data is received as a pair of attribute name and attribute value. Examples: 1. Read statistics for the virtqueue at index 1 $ vdpa dev vstats show vdpa-a qidx 1 vdpa-a: vdpa-a: queue_type tx received_desc 321812 completed_desc 321812 2. Read statistics for the virtqueue at index 16 $ vdpa dev vstats show vdpa-a qidx 16 vdpa-a: queue_type control_vq received_desc 17 completed_desc 17 3. Read statistics for the virtqueue at index 0 with json output $ vdpa -j dev vstats show vdpa-a qidx 0 {"vstats":{"vdpa-a":{"queue_type":"rx","received_desc":114855,"completed_desc":114617}}} 4. Read statistics for the virtqueue at index 0 with pretty json output $ vdpa -jp dev vstats show vdpa-a qidx 0 { "vstats": { "vdpa-a": { "queue_type": "rx", "received_desc": 114855, "completed_desc": 114617 } } } Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-17	tc: declaration hides parameter	Stephen Hemminger	7	-25/+25
	In several places (code reuse?), the variable handle is a parameter to the function, but is then declared again inside a basic block for classid. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-06-17	genl: fix duplicate include guard	Stephen Hemminger	1	-2/+2
	Found by LGTM. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-06-17	uapi: change name for zerocopy sendfile in tls	Stephen Hemminger	1	-2/+2
	Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-06-17	uapi: update socket.h	Stephen Hemminger	2	-1/+2
	From 5.19-rc0 Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-06-16	man: tc-fw: Document masked handle usage	Ido Schimmel	1	-3/+41
	The tc-fw filter can be used to match on the packet's fwmark by adding a filter with a matching handle. It also supports matching on specific bits of the fwmark by specifying the handle together with a mask. This is documented in the usage message below, but not in the man page. Document it in the man page together with an example. $ tc filter add fw help Usage: ... fw [ classid CLASSID ] [ indev DEV ] [ action ACTION_SPEC ] CLASSID := Push matching packets to the class identified by CLASSID with format X:Y CLASSID is parsed as hexadecimal input. DEV := specify device for incoming device classification. ACTION_SPEC := Apply an action on matching packets. NOTE: handle is represented as HANDLE[/FWMASK]. FWMASK is 0xffffffff by default. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-12	show rx_otherehost_dropped stat in ip link show	Jeffrey Ji	1	-3/+14
	This stat was added in commit 794c24e9921f ("net-core: rx_otherhost_dropped to core_stats") Tested: sent packet with wrong MAC address from 1 network namespace to another, verified that counter showed "1" in `ip -s -s link sh` and `ip -s -s -j link sh` Signed-off-by: Jeffrey Ji <jeffreyji@google.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-12	ss: Shorter display format for TLS zerocopy sendfile	Maxim Mikityanskiy	1	-2/+3
	Commit 21c07b45688f ("ss: Show zerocopy sendfile status of TLS sockets") started displaying the activation status of zerocopy sendfile on TLS sockets, exposed via sock_diag. This commit makes the format more compact: the flag's name is shorter and is printed only when the feature is active, similar to other flag options. The flag's name is also generalized ("sendfile" -> "tx") to embrace possible future optimizations, and includes an explicit indication that the underlying data must not be modified during transfer ("ro"). Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-12	Update kernel headers	David Ahern	3	-3/+4
	Update kernel headers to commit: 27f2533bcc6e ("nfp: flower: support to offload pedit of IPv6 flowinto fields") Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-10	Merge branch 'bridge-fdb-flush' into next	David Ahern	2	-1/+224
	Nikolay Aleksandrov says: ==================== Hi, This set adds support for the new bulk delete flag to allow fdb flushing for specific entries which are matched based on the supplied options. The new bridge fdb subcommand is "flush", and as can be seen from the commits it allows to delete entries based on many different criteria: - matching vlan - matching port - matching all sorts of flags (combinations are allowed) There are also examples for each option in the respective commit messages. Examples: $ bridge fdb flush dev swp2 master vlan 100 dynamic [ delete all dynamic entries with port swp2 and vlan 100 ] $ bridge fdb flush dev br0 vlan 1 static [ delete all static entries in br0's fdb table ] $ bridge fdb flush dev swp2 master extern_learn nosticky [ delete all entries with port swp2 which have extern_learn set and don't have the sticky flag set ] $ bridge fdb flush dev br0 brport br0 vlan 100 permanent [ delete all entries pointing to the bridge itself with vlan 100 ] $ bridge fdb flush dev swp2 master nostatic nooffloaded [ delete all entries with port swp2 which are not static and not offloaded ] If keyword is specified and after that nokeyword is specified obviously the nokeyword would override keyword. ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-10	bridge: fdb: add flush [no]offloaded entry matching	Nikolay Aleksandrov	2	-2/+14
	Add flush support to match entries with or without (if "no" is prepended) offloaded flag. Examples: $ bridge fdb flush dev br0 offloaded This will delete all offloaded entries in br0's fdb table. $ bridge fdb flush dev br0 nooffloaded This will delete all entries except the ones with offloaded flag in br0's fdb table. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-10	bridge: fdb: add flush [no]sticky entry matching	Nikolay Aleksandrov	2	-2/+14
	Add flush support to match entries with or without (if "no" is prepended) sticky flag. Examples: $ bridge fdb flush dev br0 sticky This will delete all sticky entries in br0's fdb table. $ bridge fdb flush dev br0 nosticky This will delete all entries except the ones with sticky flag in br0's fdb table. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-10	bridge: fdb: add flush [no]extern_learn entry matching	Nikolay Aleksandrov	2	-2/+13
	Add flush support to match entries with or without (if "no" is prepended) extern_learn flag. Examples: $ bridge fdb flush dev br0 extern_learn This will delete all extern_learn entries in br0's fdb table. $ bridge fdb flush dev br0 noextern_learn This will delete all entries except the ones with extern_learn flag in br0's fdb table. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-10	bridge: fdb: add flush [no]added_by_user entry matching	Nikolay Aleksandrov	2	-2/+19
	Add flush support to match entries with or without (if "no" is prepended) added_by_user flag. Note that NTF_USE is used internally because there is no NTF_ flag that describes such entries. Examples: $ bridge fdb flush dev br0 added_by_user This will delete all added_by_user entries in br0's fdb table. $ bridge fdb flush dev br0 noadded_by_user This will delete all entries except the ones with added_by_user flag in br0's fdb table. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-10	bridge: fdb: add flush [no]dynamic entry matching	Nikolay Aleksandrov	2	-2/+13
	Add flush support to match dynamic or non-dynamic (static or permanent) entries if "no" is prepended respectively. Note that dynamic entries are defined as fdbs without NUD_NOARP and NUD_PERMANENT set, and non-dynamic entries are fdbs with NUD_NOARP set (that matches both static and permanent entries). Examples: $ bridge fdb flush dev br0 dynamic This will delete all dynamic entries in br0's fdb table. $ bridge fdb flush dev br0 nodynamic This will delete all entries except the dynamic ones in br0's fdb table. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-10	bridge: fdb: add flush [no]static entry matching	Nikolay Aleksandrov	2	-2/+13
	Add flush support to match static or non-static entries if "no" is prepended respectively. Note that static entries are only NUD_NOARP ones without NUD_PERMANENT, also when matching non-static entries exclude permanent entries as well (permanent entries by definition are also static). Examples: $ bridge fdb flush dev br0 static This will delete all static entries in br0's fdb table. $ bridge fdb flush dev br0 nostatic This will delete all entries except the static ones in br0's fdb table. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-10	bridge: fdb: add flush [no]permanent entry matching	Nikolay Aleksandrov	2	-3/+22
	Add flush support to match permanent or non-permanent entries if "no" is prepended respectively. Examples: $ bridge fdb flush dev br0 permanent This will delete all permanent entries in br0's fdb table. $ bridge fdb flush dev br0 nopermanent This will delete all entries except the permanent ones in br0's fdb table. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-10	bridge: fdb: add flush port matching	Nikolay Aleksandrov	2	-3/+28
	Usually we match on the device specified after "dev" but there are special cases where we need an additional device attribute for matching such as when matching entries specifically pointing to the bridge device itself. We use NDA_IFINDEX for that purpose. Example: $ bridge fdb flush dev br0 brport br0 This will flush only entries pointing to the bridge itself. $ bridge fdb flush dev swp1 brport swp2 master Note this will flush entries pointing to swp2 only. The NDA_IFINDEX attribute overrides the dev argument. This is documented in the man page. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-10	bridge: fdb: add flush vlan matching	Nikolay Aleksandrov	2	-1/+21
	Add flush support to match fdb entries in a specific vlan. Example: $ bridge fdb flush dev swp1 vlan 10 master This will flush all fdb entries with port swp1 and vlan 10. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-10	bridge: fdb: add new flush command	Nikolay Aleksandrov	2	-1/+86
	Add support for fdb bulk delete (aka flush) command. Currently it only supports the self and master flags with the same semantics as fdb add/del. The device is a mandatory argument. Example: $ bridge fdb flush dev br0 This will delete all fdb entries in br0's fdb table. $ bridge fdb flush dev swp1 master This will delete all fdb entries pointing to swp1. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-09	Merge branch 'main' into next	David Ahern	3	-27/+13
	Signed-off-by: David Ahern <dsahern@kernel.org>
2022-06-01	ip: Convert non-constant initializers to macros	Petr Machata	3	-27/+13
	As per the C standard, "expressions in an initializer for an object that has static or thread storage duration shall be constant expressions". Aggregate objects are not constant expressions. Newer GCC doesn't mind, but older GCC and LLVM do. Therefore convert to a macro. And since all these macros will look very similar, extract a generic helper, IPSTATS_STAT_DESC_XSTATS_LEAF, which takes the leaf name as an argument and initializes the rest as appropriate for an xstats descriptor. Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
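The macro approach described above can be illustrated with a small sketch. The descriptor type, its fields, and the macro name STAT_DESC_LEAF are hypothetical and are not the iproute2 definitions; the point is only that a macro expands to a brace-enclosed initializer, which is acceptable for objects with static storage duration, whereas initializing from another object is what stricter compilers reject.

```c
#include <stddef.h>

/* Hypothetical descriptor type, for illustration only. */
struct stat_desc {
	const char *name;
	int kind;
	size_t payload_size;
};

/* Expands to a brace-enclosed initializer made of constant expressions.
 * Writing "static const struct stat_desc d = other_desc;" instead would
 * use a non-constant initializer, which older GCC and LLVM refuse. */
#define STAT_DESC_LEAF(NAME, SIZE) \
	{ .name = (NAME), .kind = 1, .payload_size = (SIZE) }

static const struct stat_desc stp_desc = STAT_DESC_LEAF("stp", 24);
static const struct stat_desc mcast_desc = STAT_DESC_LEAF("mcast", 96);

/* Accessors so callers outside this file can reach the descriptors. */
const struct stat_desc *stat_desc_stp(void)
{
	return &stp_desc;
}

const struct stat_desc *stat_desc_mcast(void)
{
	return &mcast_desc;
}
```

Each new descriptor then needs only its leaf-specific arguments, with the shared fields filled in by the macro.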
2022-05-30	Merge branch 'ss-threads' into next	David Ahern	2	-96/+142
	Peilin Ye says: ==================== From: Peilin Ye <peilin.ye@bytedance.com> This patchset adds a new ss option, -T (--threads), to show thread information. It extends the -p (--processes) option, and should be useful for debugging, monitoring multi-threaded applications. Example output: $ ss -ltT "sport = 1234" State Recv-Q Send-Q Local Address:Port Peer Address:Port Process LISTEN 0 100 0.0.0.0:1234 0.0.0.0:* users:(("test",pid=2932547,tid=2932548,fd=3),("test",pid=2932547,tid=2932547,fd=3)) It implies -p i.e. it outputs all threads in the thread group, including the thread group leader. When -T is used, -Z and -z also show SELinux contexts for threads. [1-5/7] are small clean-ups for the user_ent_hash_build() function. [6/7] factors out logic iterating $PROC_ROOT/$PID/fd/ from user_ent_hash_build() to make [7/7] easier. [7/7] actually implements the feature. ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-30	ss: Introduce -T, --threads option	Peilin Ye	2	-30/+72
	The -p, -Z and -z options only show process (thread group leader) information. For example, if the thread group leader has exited, but another thread in the group is still using a socket, ss -[pZz] does not show it. Add a new option, -T (--threads), to show thread information. It implies the -p option. For example, imagine process A and thread B (in the same group) using the same socket. ss -p only shows A: $ ss -ltp "sport = 1234" State Recv-Q Send-Q Local Address:Port Peer Address:Port Process LISTEN 0 100 0.0.0.0:1234 0.0.0.0:* users:(("test",pid=2932547,fd=3)) ss -T shows A and B: $ ss -ltT "sport = 1234" State Recv-Q Send-Q Local Address:Port Peer Address:Port Process LISTEN 0 100 0.0.0.0:1234 0.0.0.0:* users:(("test",pid=2932547,tid=2932548,fd=3),("test",pid=2932547,tid=2932547,fd=3)) If -T is used, -Z and -z also show SELinux contexts for threads. Rename some variables (from "process" to "task", for example) since we use them for both processes and threads. Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-30	ss: Factor out fd iterating logic from user_ent_hash_build()	Peilin Ye	1	-65/+79
	We are planning to add a thread version of the -p, --process option. Move the logic iterating $PROC_ROOT/$PID/fd/ into a new function, user_ent_hash_build_task(), to make it easier. Since we will use this function for both processes and threads, rename local variables as such (e.g. from "process" to "task"). Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-30	ss: Fix coding style issues in user_ent_hash_build()	Peilin Ye	1	-10/+12
	Make checkpatch.pl --strict happy about user_ent_hash_build(). Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-30	ss: Delete unnecessary call to snprintf() in user_ent_hash_build()	Peilin Ye	1	-4/+1
	'name' is already $PROC_ROOT/$PID/fd/$FD there, no need to rebuild the string. Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-30	ss: Do not call user_ent_hash_build() more than once	Peilin Ye	1	-9/+3
	Call user_ent_hash_build() once after the getopt_long() loop if -p, -z or -Z is used. Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-30	ss: Remove unnecessary stack variable 'p' in user_ent_hash_build()	Peilin Ye	1	-5/+3
	Commit 116ac9270b6d ("ss: Add support for retrieving SELinux contexts") added an unnecessary stack variable, 'char *p', in user_ent_hash_build(). Delete it for readability. Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-30	ss: Use assignment-suppression character in sscanf()	Peilin Ye	1	-3/+2
	Use the '*' assignment-suppression character, instead of an inappropriately named temporary variable. Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David Ahern <dsahern@kernel.org>
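As a rough illustration of assignment suppression, the sketch below parses a process name out of a line shaped like "PID (name) state". The helper name parse_comm and the input format are assumptions made for this example and are not taken from the ss source.

```c
#include <stdio.h>

/* Hypothetical helper: extract the name from "PID (name) ...".
 * "%*d" consumes the leading integer without storing it, so no
 * throwaway variable is needed; "%15[^)]" reads at most 15
 * characters up to the closing parenthesis. */
int parse_comm(const char *line, char comm[16])
{
	/* Suppressed conversions are not counted in the return value,
	 * so a successful parse returns exactly 1. */
	return sscanf(line, "%*d (%15[^)])", comm) == 1 ? 0 : -1;
}
```

A call such as parse_comm("2932547 (test) S", buf) stores "test" in buf and returns 0; input that does not start with an integer returns -1.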
2022-05-30	ss: Show zerocopy sendfile status of TLS sockets	Maxim Mikityanskiy	1	-0/+6
	Print the activation status of zerocopy sendfile on TLS sockets. Zerocopy sendfile was recently added to Linux and exposed via sock_diag. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-30	iplink: report tso_max_size and tso_max_segs	Eric Dumazet	1	-0/+12
	New netlink attributes IFLA_TSO_MAX_SIZE and IFLA_TSO_MAX_SEGS are used to report device TSO limits to user-space. ip -d link sh dev eth0 ... tso_max_size 65536 tso_max_segs 65535 ip -d link sh dev lo ... tso_max_size 524280 tso_max_segs 65535 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-30	Merge branch 'main' into next	David Ahern	21	-28/+155
	Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-26	Merge git://git.kernel.org/pub/scm/network/iproute2/iproute2-next	Stephen Hemminger	41	-275/+2984

2022-05-26	v5.18.0	Stephen Hemminger	1	-1/+1

2022-05-17	tipc: fix keylen check	Andrea Claudi	1	-3/+2
	Key length check in str2key() is wrong for hex. Fix this using the proper hex key length. Fixes: 28ee49e5153b ("tipc: bail out if key is abnormally long") Suggested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-05-16	iplink: remove GSO_MAX_SIZE definition	Eric Dumazet	1	-3/+0
	David removed the check using GSO_MAX_SIZE in commit f1d18e2e6ec5 ("Update kernel headers"). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-05-13	doc: fix 'infact' --> 'in fact' typo	Andrea Claudi	1	-1/+1
	Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-05-13	man: fix some typos	Andrea Claudi	3	-3/+3
	In dcb-app man page, 'direcly' should be 'directly' In dcb-dcbx man page, 'respecively' should be 'respectively' In devlink-dev man page, 'unspecificed' should be 'unspecified' Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-05-13	man: devlink-region: fix typo in example	Andrea Claudi	1	-1/+1
	devlink-region does not accept the legth param, but the length one. Fixes: 8b4fbf0bed8e ("devlink: Add support for devlink-region access") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-05-13	tc: em_u32: fix offset parsing	Andrea Claudi	1	-1/+1
	tc u32 ematch offset parsing might fail even if nexthdr offset is aligned to 4. The issue can be reproduced with the following script: tc qdisc del dev dummy0 root tc qdisc add dev dummy0 root handle 1: htb r2q 1 default 1 tc class add dev dummy0 parent 1:1 classid 1:108 htb quantum 1000000 \ rate 1.00mbit ceil 10.00mbit burst 6k while true; do if ! tc filter add dev dummy0 protocol all parent 1: prio 1 basic match \ "meta(vlan mask 0xfff eq 1)" and "u32(u32 0x20011002 0xffffffff \ at nexthdr+8)" flowid 1:108; then exit 0 fi done which we expect to produce an endless loop. With the current code, instead, this ends with: u32: invalid offset alignment, must be aligned to 4. ... meta(vlan mask 0xfff eq 1) and >>u32(u32 0x20011002 0xffffffff at nexthdr+8)<< ... ... u32(u32 0x20011002 0xffffffff at >>nexthdr+8<<)... Usage: u32(ALIGN VALUE MASK at [ nexthdr+ ] OFFSET) where: ALIGN := { u8 \| u16 \| u32 } Example: u32(u16 0x1122 0xffff at nexthdr+4) Illegal "ematch" This is caused by memcpy copying into buf an unterminated string. Fix it using strncpy instead of memcpy. Fixes: commit 311b41454dc4 ("Add new extended match files.") Reported-by: Alfred Yang <alf.redyoung@gmail.com> Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
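The failure mode described above, parsing a number out of a buffer that was filled without a terminating NUL, can be sketched as follows. This is not the actual patch (which, per the message, switches to strncpy); it is a hypothetical helper, parse_offset, that shows an equivalent guard by terminating the copy explicitly before calling strtoul().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a numeric offset from a slice of text that
 * is NOT NUL-terminated, e.g. the "8" in "nexthdr+8)". */
int parse_offset(const char *data, size_t len, unsigned long *out)
{
	char buf[32];
	char *end;

	if (len == 0 || len >= sizeof(buf))
		return -1;

	memcpy(buf, data, len);
	/* Without this terminator strtoul() keeps reading whatever bytes
	 * happen to follow in buf, so the result depends on stack contents. */
	buf[len] = '\0';

	*out = strtoul(buf, &end, 0);
	return *end == '\0' ? 0 : -1;
}
```

With the terminator in place the parsed value depends only on the len bytes that were copied, which is what makes the alignment check deterministic.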
2022-05-13	uapi: update of virtio_ids	Stephen Hemminger	1	-7/+7
	Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-05-13	Update kernel headers	David Ahern	4	-12/+27
	Update kernel headers to commit: a65cc8435540 ("Merge branch 'bnxt_en-next'") Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-12	Merge branch 'support-xstats-afstats' into next	David Ahern	6	-140/+522
	Petr Machata says: ==================== The RTM_GETSTATS response attributes IFLA_STATS_LINK_XSTATS and IFLA_STATS_LINK_XSTATS_SLAVE are used to carry statistics related to, respectively, netdevices of a certain type, and netdevices enslaved to netdevices of a certain type. IFLA_STATS_AF_SPEC are similarly used to carry statistics specific to a certain address family. In this patch set, add support for three new stats groups that cover the above attributes: xstats, xstats_slave and afstats. Add bridge and bond subgroups to the former two groups, and mpls subgroup to the latter one. Now "group" is used for selecting the top-level attribute, and subgroup for the link-type or address-family nest below it (bridge, bond, mpls in this patchset). But xstats (both master and slave) are further subdivided. E.g. in the case of bridge statistics, the two subdivisions are called "stp" and "mcast". To make it possible to pick these sets, add to the two selector levels of group and subgroup a third level, suite, which is filtered in the userspace. ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-12	man: ip-stats.8: Describe groups xstats, xstats_slave and afstats	Petr Machata	1	-0/+47
	Add description of the newly-added statistics groups. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-12	ipstats: Expose bond stats in ipstats	Petr Machata	3	-2/+58
	Describe xstats and xstats_slave subgroups for bond netdevices. For example: # ip stats show dev swp1 group xstats_slave subgroup bond 56: swp1: group xstats_slave subgroup bond suite 802.3ad LACPDU Rx 0 LACPDU Tx 0 LACPDU Unknown type Rx 0 LACPDU Illegal Rx 0 Marker Rx 0 Marker Tx 0 Marker response Rx 0 Marker response Tx 0 Marker unknown type Rx 0 # ip -j stats show dev swp1 group xstats_slave subgroup bond \| jq [ { "ifindex": 56, "ifname": "swp1", "group": "xstats_slave", "subgroup": "bond", "suite": "802.3ad", "802.3ad": { "lacpdu_rx": 0, "lacpdu_tx": 0, "lacpdu_unknown_rx": 0, "lacpdu_illegal_rx": 0, "marker_rx": 0, "marker_tx": 0, "marker_response_rx": 0, "marker_response_tx": 0, "marker_unknown_rx": 0 } } ] Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-12	ipstats: Expose bridge stats in ipstats	Petr Machata	3	-0/+84
	Bridge supports two suites, STP and IGMP, carried by attributes BRIDGE_XSTATS_STP and BRIDGE_XSTATS_MCAST. Expose them as suites "stp" and "mcast" (to correspond to the attribute name). For example: # ip stats show dev swp1 group xstats_slave subgroup bridge 56: swp1: group xstats_slave subgroup bridge suite mcast IGMP queries: RX: v1 0 v2 0 v3 0 TX: v1 0 v2 0 v3 0 IGMP reports: RX: v1 0 v2 0 v3 0 TX: v1 0 v2 0 v3 0 IGMP leaves: RX: 0 TX: 0 IGMP parse errors: 0 MLD queries: RX: v1 0 v2 0 TX: v1 0 v2 0 MLD reports: RX: v1 0 v2 0 TX: v1 0 v2 0 MLD leaves: RX: 0 TX: 0 MLD parse errors: 0 56: swp1: group xstats_slave subgroup bridge suite stp STP BPDU: RX: 0 TX: 0 STP TCN: RX: 0 TX: 0 STP Transitions: Blocked: 1 Forwarding: 0 # ip -j stats show dev swp1 group xstats_slave subgroup bridge \| jq [ { "ifindex": 56, "ifname": "swp1", "group": "xstats_slave", "subgroup": "bridge", "suite": "mcast", "multicast": { "igmp_queries": { "rx_v1": 0, "rx_v2": 0, "rx_v3": 0, "tx_v1": 0, "tx_v2": 0, "tx_v3": 0 }, "igmp_reports": { "rx_v1": 0, "rx_v2": 0, "rx_v3": 0, "tx_v1": 0, "tx_v2": 0, "tx_v3": 0 }, "igmp_leaves": { "rx": 0, "tx": 0 }, "igmp_parse_errors": 0, "mld_queries": { "rx_v1": 0, "rx_v2": 0, "tx_v1": 0, "tx_v2": 0 }, "mld_reports": { "rx_v1": 0, "rx_v2": 0, "tx_v1": 0, "tx_v2": 0 }, "mld_leaves": { "rx": 0, "tx": 0 }, "mld_parse_errors": 0 } }, { "ifindex": 56, "ifname": "swp1", "group": "xstats_slave", "subgroup": "bridge", "suite": "stp", "stp": { "rx_bpdu": 0, "tx_bpdu": 0, "rx_tcn": 0, "tx_tcn": 0, "transition_blk": 1, "transition_fwd": 0 } } ] Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-12	iplink_bridge: Split bridge_print_stats_attr()	Petr Machata	1	-121/+133
	Extract from bridge_print_stats_attr() two helpers, one for dumping the multicast attribute, one for dumping the STP attribute. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-12	ipstats: Add groups "xstats", "xstats_slave"	Petr Machata	2	-0/+73
	The RTM_GETSTATS response attributes IFLA_STATS_LINK_XSTATS and IFLA_STATS_LINK_XSTATS_SLAVE are used to carry statistics related to, respectively, netdevices of a certain type, and netdevices enslaved to netdevices of a certain type. Inside the nest is then a link-type specific attribute (e.g. LINK_XSTATS_TYPE_BRIDGE), and inside that nest further attributes for individual type-specific statistical suites. Under the "ip stats" model, that corresponds to groups "xstats" and "xstats_slave", link-type specific subgroup, e.g. "bridge", and one or more link-type specific suites, such as "stp". Link-type specific stats are currently supported through struct link_util and in particular the callbacks parse_ifla_xstats and print_ifla_xstats. The role of parse_ifla_xstats is to establish which statistical suite to display, and on which device. "ip stats" has a framework for both of these tasks, which obviates the need for custom parsing. Therefore the module should instead provide a subgroup descriptor, which "ip stats" will then use as any other. The second link_util callback, print_ifla_xstats, is for response dissection. In the "ip stats" model, this belongs to leaf descriptors. Eventually, the link-specific leaf descriptors will be similar to each other: either master or slave top-level nest needs to be parsed, and link-type attribute underneath that, and suite attribute underneath that. To support this commonality, add struct ipstats_stat_desc_xstats to describe the xstats suites. Further, expose ipstats_stat_desc_pack_xstats() and ipstats_stat_desc_show_xstats(), which can be used at leaf descriptors and do the appropriate thing according to the configuration in ipstats_stat_desc_xstats. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-12	ipstats: Add a third level of stats hierarchy, a "suite"	Petr Machata	2	-2/+10
	To show statistics nested under IFLA_STATS_LINK_XSTATS_SLAVE or IFLA_STATS_LINK_XSTATS, one would use "group" to select the top-level attribute, then "subgroup" to select the link type, which is itself a nest, and then would lack a way to denote which attribute to select out of the link-type nest. To that end, add the selector level "suite", which is filtered in the userspace. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
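The three selector levels can be pictured with a small sketch. The struct and function below are hypothetical and do not reflect the ipstats data structures; they only illustrate that group and subgroup choose what is requested, while the suite is compared in userspace when the response is walked.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical selector; field names are illustrative only. */
struct stats_selector {
	const char *group;	/* top-level attribute, e.g. "xstats_slave" */
	const char *subgroup;	/* link-type nest, e.g. "bridge" */
	const char *suite;	/* e.g. "stp"; NULL means every suite */
};

/* Userspace filter: should a suite found in the response be displayed? */
bool suite_selected(const struct stats_selector *sel, const char *suite)
{
	return sel->suite == NULL || strcmp(sel->suite, suite) == 0;
}
```

With no suite given, every suite in the link-type nest is shown; with suite "stp", only the STP counters pass the filter.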
2022-05-12	iplink: Add JSON support to MPLS stats formatter	Petr Machata	1	-27/+46
	MPLS stats currently do not support dumping in JSON format. Recognize when JSON is requested and dump in an obvious manner: # ip -n ns0-2G8Ozd9z -j stats show dev veth01 group afstats | jq [ { "ifindex": 3, "ifname": "veth01", "group": "afstats", "subgroup": "mpls", "mpls_stats": { "rx": { "bytes": 0, "packets": 0, "errors": 0, "dropped": 0, "noroute": 0 }, "tx": { "bytes": 216, "packets": 2, "errors": 0, "dropped": 0 } } } ] Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-12	ipstats: Add a group "afstats", subgroup "mpls"	Petr Machata	2	-1/+55
	Add a new group, "afstats", for showing counters from the IFLA_STATS_AF_SPEC nest, and a subgroup "mpls" for the AF_MPLS specifically. For example: # ip -n ns0-NrdgY9sx stats show dev veth01 group afstats 3: veth01: group afstats subgroup mpls RX: bytes packets errors dropped noroute 0 0 0 0 0 TX: bytes packets errors dropped 108 1 0 0 Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-12	iplink: Publish a function to format MPLS stats	Petr Machata	2	-15/+24
	Extract from print_mpls_stats() a new function, print_mpls_link_stats(), make it non-static and publish in the header file. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-12	iplink: Fix formatting of MPLS stats	Petr Machata	1	-12/+32
	Currently, MPLS stats are formatted this way: # ip -n ns0-DBZpxj8I link afstats dev veth01 3: veth01 mpls: RX: bytes packets errors dropped noroute 0 0 0 0 0 TX: bytes packets errors dropped 216 2 0 0 Note how most numbers are not aligned properly under their column headers. Fix by converting the code to use size_columns() to dynamically determine the necessary width of individual columns, which also takes care of formatting the table properly in case the counter values are high. After the fix, the formatting looks as follows: # ip -n ns0-Y1PyEc55 link afstats dev veth01 3: veth01 mpls: RX: bytes packets errors dropped noroute 0 0 0 0 0 TX: bytes packets errors dropped 108 1 0 0 Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
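The idea of sizing columns dynamically can be sketched as below. This is not the size_columns() helper named in the message, whose signature is not shown here; column_width is a hypothetical stand-in that picks a width wide enough for both the header and the largest counter.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the width needed so that the header and every
 * value in the column fit without breaking alignment. */
int column_width(const char *header, const unsigned long long *vals, size_t n)
{
	int width = (int)strlen(header);
	size_t i;

	for (i = 0; i < n; i++) {
		/* snprintf with a NULL buffer and size 0 returns the
		 * number of characters the value would occupy. */
		int len = snprintf(NULL, 0, "%llu", vals[i]);

		if (len > width)
			width = len;
	}
	return width;
}
```

The returned width can then be passed to printf through the "%*s" and "%*llu" conversions, so headers and values share the same column width even when counters grow large.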
2022-05-08	ip: ipstats: Do not assume length of response attribute payload	Petr Machata	1	-23/+14
	In Linux kernel commit 794c24e9921f ("net-core: rx_otherhost_dropped to core_stats"), struct rtnl_link_stats64 got a new member. This change got to iproute2 through commit bba95837524d ("Update kernel headers"). "ip stats" makes the assumption that the payload of attributes that carry structures is at least as long as the size of the given structure as iproute2 knows it. But that will not hold when a newer iproute2 is used against an older kernel: since such kernel misses some fields on the tail end of the structure, "ip stats" bails out: # ip stats show group link 1: lo: group link Error: attribute payload too short Dump terminated Instead, be tolerant of responses that are both longer and shorter than what is expected. Instead of forming a pointer directly into the payload, allocate the stats structure on the stack, zero it, and then copy over the portion from the response. Reported-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
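The tolerant copy described above (zero the structure, then copy only as much as both sides agree on) can be sketched like this. The struct layout and the function name copy_stats are hypothetical; the real code operates on struct rtnl_link_stats64 and netlink attribute payloads.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stats layout; the last member stands in for a field that
 * an older kernel would not include in its response. */
struct link_stats {
	uint64_t rx_packets;
	uint64_t tx_packets;
	uint64_t rx_otherhost_dropped;
};

/* Copy a payload of arbitrary length into a zeroed structure.
 * Shorter payloads leave trailing members at zero; longer payloads
 * are truncated to the size this build knows about. */
void copy_stats(struct link_stats *dst, const void *payload, size_t len)
{
	size_t n = len < sizeof(*dst) ? len : sizeof(*dst);

	memset(dst, 0, sizeof(*dst));
	memcpy(dst, payload, n);
}
```

A payload from an older kernel simply yields zero for the members it does not carry, instead of an error.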
2022-05-08	Merge branch 'bridge-vxlan-vni-filtering' into next	David Ahern	10	-4/+606
	Roopa Prabhu says: ==================== This series adds bridge command to manage recently added vnifilter on a collect metadata vxlan (external) device. Also includes per vni stats support. examples: $bridge vni add dev vxlan0 vni 400 $bridge vni add dev vxlan0 vni 200 group 239.1.1.101 $bridge vni del dev vxlan0 vni 400 $bridge vni show $bridge -s vni show ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-08	bridge: vni: add support for stats dumping	Nikolay Aleksandrov	3	-19/+81
	Add support for "-s" option which causes bridge vni to dump per-vni statistics. Note that it disables vni range compression. Example: $ bridge -s vni | more dev vni group/remote vxlan0 1024 239.1.1.1 RX: bytes 0 pkts 0 drops 0 errors 0 TX: bytes 0 pkts 0 drops 0 errors 0 1025 239.1.1.1 RX: bytes 0 pkts 0 drops 0 errors 0 TX: bytes 0 pkts 0 drops 0 errors 0 Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-08	ip: iplink_vxlan: add support to set vnifiltering flag on vxlan device	Roopa Prabhu	2	-1/+31
	This patch adds option to set vnifilter flag on a vxlan device. vnifilter is only supported on a collect metadata device. example: set vnifilter flag $ ip link add vxlan0 type vxlan external vnifilter local 172.16.0.1 Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-08	bridge: vxlan device vnifilter support	Roopa Prabhu	8	-3/+513
	This patch adds bridge command to manage recently added vnifilter on a collect metadata vxlan device. examples: $bridge vni add dev vxlan0 vni 400 $bridge vni add dev vxlan0 vni 200 group 239.1.1.101 $bridge vni del dev vxlan0 vni 400 $bridge vni show $bridge -s vni show Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-05-04	libbpf: Remove use of bpf_map_is_offload_neutral	David Ahern	1	-1/+6
	bpf_map_is_offload_neutral is deprecated as of v0.8+; import definition to maintain backwards compatibility. Signed-off-by: David Ahern <dsahern@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-05-04	libbpf: Remove use of bpf_program__set_priv and bpf_program__priv	David Ahern	1	-6/+10
	bpf_program__set_priv and bpf_program__priv are deprecated as of libbpf v0.7+. Rather than store the map as priv on the program, change find_legacy_tail_calls to take an argument to return a reference to the map. find_legacy_tail_calls is invoked twice from load_bpf_object - the first time to check for programs that should be loaded. In this case a reference to the map is not needed, but it does validate the map exists. The second is invoked from update_legacy_tail_call_maps where the map pointer is needed. Signed-off-by: David Ahern <dsahern@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-05-04	libbpf: Use bpf_object__load instead of bpf_object__load_xattr	David Ahern	1	-6/+1
	bpf_object__load_xattr is deprecated as of v0.8+; remove it in favor of bpf_object__load. Signed-off-by: David Ahern <dsahern@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-05-02	libbpf: Remove use of bpf_map_is_offload_neutral	David Ahern	1	-1/+6
	bpf_map_is_offload_neutral is deprecated as of v0.8+; import definition to maintain backwards compatibility. Signed-off-by: David Ahern <dsahern@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
2022-05-02	libbpf: Remove use of bpf_program__set_priv and bpf_program__priv	David Ahern	1	-6/+10
	bpf_program__set_priv and bpf_program__priv are deprecated as of libbpf v0.7+. Rather than store the map as priv on the program, change find_legacy_tail_calls to take an argument to return a reference to the map. find_legacy_tail_calls is invoked twice from load_bpf_object - the first time to check for programs that should be loaded. In this case a reference to the map is not needed, but it does validate the map exists. The second is invoked from update_legacy_tail_call_maps where the map pointer is needed. Signed-off-by: David Ahern <dsahern@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
2022-05-02	libbpf: Use bpf_object__load instead of bpf_object__load_xattr	David Ahern	1	-6/+1
	bpf_object__load_xattr is deprecated as of v0.8+; remove it in favor of bpf_object__load. Signed-off-by: David Ahern <dsahern@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
2022-04-29	f_flower: add number of vlans man entry	Boris Sukholitko	1	-0/+5
	The documentation was missing in the number of vlans commit. Fixes: 5ba31bcf (f_flower: Add num of vlans parameter) Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	Merge branch 'flower-vlans' into next	David Ahern	1	-16/+41
	Boris Sukholitko says: ==================== Our customers in the fiber telecom world have network configurations where they would like to control their traffic according to the number of tags appearing in the packet. For example, TR247 GPON conformance test suite specification mostly talks about untagged, single, double tagged packets and gives lax guidelines on the vlan protocol vs. number of vlan tags. This is different from the common IT networks where 802.1Q and 802.1ad protocols usually describe single and double tagged packets. GPON configurations that we work with have an arbitrary mix of the above protocols and number of vlan tags in the packet. The following patch series implements a number-of-vlans flower filter. It adds the num_of_vlans flower filter as an alternative to vlan ethtype protocol matching. The end result is that the following command becomes possible: tc filter add dev eth1 ingress flower \ num_of_vlans 1 vlan_prio 5 action drop Also, from our logs, we have redirect rules such that: tc filter add dev $GPON ingress flower num_of_vlans $N \ action mirred egress redirect dev $DEV where N can range from 0 to 3 and $DEV is a function of $N. Also there are rules setting skb mark based on the number of vlans: tc filter add dev $GPON ingress flower num_of_vlans $N vlan_prio \ $P action skbedit mark $M ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	f_flower: Check args with num_of_vlans	Boris Sukholitko	1	-18/+23
	Having more than one vlan allows matching on the vlan tag parameters. This patch changes vlan key validation to take number of vlan tags into account. Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	f_flower: Add num of vlans parameter	Boris Sukholitko	1	-0/+20
	Our customers in the fiber telecom world have network configurations where they would like to control their traffic according to the number of tags appearing in the packet. For example, TR247 GPON conformance test suite specification mostly talks about untagged, single, double tagged packets and gives lax guidelines on the vlan protocol vs. number of vlan tags. This is different from the common IT networks where 802.1Q and 802.1ad protocols usually describe single and double tagged packets. GPON configurations that we work with have an arbitrary mix of the above protocols and number of vlan tags in the packet. This patch adds the num_of_vlans flower key and associated print and parse routines. The following command becomes possible: tc filter add dev eth1 ingress flower num_of_vlans 1 action drop Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	Merge branch 'ip-stats' into next	David Ahern	15	-26/+1512
	Petr Machata says: ==================== A new rtnetlink message, RTM_SETSTATS, has been added recently in kernel commit ca0a53dcec94 ("Merge branch 'net-hw-counters-for-soft-devices'"). At the same time, RTM_GETSTATS has been around for a while. The users of this API are spread in a couple different places: "ip link xstats" reads stats from the IFLA_STATS_LINK_XSTATS and _XSTATS_SLAVE subgroups, "ip link afstats" then reads IFLA_STATS_AF_SPEC. Finally, to read IFLA_STATS_LINK_OFFLOAD_XSTATS, one would use ifstats. This does not seem to be a good fit for IFLA_OFFLOAD_XSTATS_HW_S_INFO in particular. The obvious place to expose all these offload stats suites would be under a new link subcommand "ip link offload_xstats", or similar, which would then have syntax for both showing stats and setting them. However, this looks like a good opportunity to introduce a new top-level command, "ip stats", that would be the go-to place to access anything backed by RTM_GETSTATS and RTM_SETSTATS. This patchset therefore does the following: - It adds the new "stats" infrastructure - It adds specifically the ability to toggle and show the suites that were recently added to Linux, IFLA_OFFLOAD_XSTATS_HW_S_INFO and IFLA_OFFLOAD_XSTATS_L3_STATS. - It adds support to dump IFLA_OFFLOAD_XSTATS_CPU_HIT, which was not available under "ip" at all. - Does all this in a way that is easy to extend for new stats suites. The patchset proceeds as follows: - Patches #1 and #2 lay some groundwork and tweak existing code. - Patch #3 adds the shell of the new "ip stats" command. - Patch #4 adds "ip stats set" and the ability to toggle l3_stats in particular. - Patch #5 adds "ip stats show", but no actual stats suites. - Patches #6-#9 add support for showing individual stats suites: respectively, IFLA_STATS_LINK_64, IFLA_OFFLOAD_XSTATS_CPU_HIT, IFLA_OFFLOAD_XSTATS_HW_S_INFO and IFLA_OFFLOAD_XSTATS_L3_STATS. - Patch #10 adds support for monitoring stats events to "ip monitor". - Patch #11 adds man page verbiage for the above. The plan is to contribute support for afstats and xstats in a follow-up patch set. ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	man: Add man pages for the "stats" functions	Petr Machata	3	-2/+167
	Add a man page for the new "stats" command. Also mention the new "stats" group in ip-monitor. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	ipmonitor: Add monitoring support for stats events	Petr Machata	3	-1/+36
	Toggles and offloads of HW statistics cause emission of an RTM_NEWSTATS event. Add support to "ip monitor" for these events. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	ipstats: Add offload subgroup "l3_stats"	Petr Machata	1	-0/+158
	Add into the group "offload" a subgroup "l3_stats" for showing L3 statistics. For example: # ip stats show dev swp2.200 group offload subgroup l3_stats 4212: swp2.200: group offload subgroup l3_stats on used on RX: bytes packets errors dropped mcast 1920 21 1 0 0 TX: bytes packets errors dropped 756 9 0 0 # ip -j stats show dev swp2.200 group offload subgroup l3_stats | jq [ { "ifindex": 4212, "ifname": "swp2.200", "group": "offload", "subgroup": "l3_stats", "info": { "request": true, "used": true }, "stats64": { "rx": { "bytes": 1920, "packets": 21, "errors": 1, "dropped": 0, "multicast": 0 }, "tx": { "bytes": 756, "packets": 9, "errors": 0, "dropped": 0 } } } ] Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	ipstats: Add offload subgroup "hw_stats_info"	Petr Machata	1	-0/+217
	Add into the group "offload" a subgroup "hw_stats_info" for showing information about HW statistics counters. For example: # ip stats show dev swp1 group offload subgroup hw_stats_info 4178: swp1: group offload subgroup hw_stats_info l3_stats on used off # ip -j stats show dev swp1 group offload subgroup hw_stats_info | jq [ { "ifindex": 4178, "ifname": "swp1", "group": "offload", "subgroup": "hw_stats_info", "info": { "l3_stats": { "request": true, "used": false } } } ] Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
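	The JSON form of the hw_stats_info output lends itself to scripting. The following is a minimal sketch, not part of the commit, that reads the sample JSON from the message above and lists suites that were requested but are not in use. The function name unused_suites is hypothetical, and the layout of the "info" object is assumed to be exactly as in the sample.

```python
import json

# Sample copied from the commit message above (output of
# "ip -j stats show dev swp1 group offload subgroup hw_stats_info").
SAMPLE = '''[ { "ifindex": 4178, "ifname": "swp1", "group": "offload",
  "subgroup": "hw_stats_info",
  "info": { "l3_stats": { "request": true, "used": false } } } ]'''


def unused_suites(ip_json):
    """Return (ifname, suite) pairs that are requested but not used.

    Hypothetical helper: assumes every entry has an "info" object
    keyed by suite name, as in the sample above.
    """
    pairs = []
    for entry in json.loads(ip_json):
        for suite, state in entry.get("info", {}).items():
            if state.get("request") and not state.get("used"):
                pairs.append((entry["ifname"], suite))
    return pairs


if __name__ == "__main__":
    print(unused_suites(SAMPLE))
```

	In practice the input would presumably come from running the "ip -j stats show" command itself rather than from a string constant.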
2022-04-27	ipstats: Add a group "offload", subgroup "cpu_hit"	Petr Machata	1	-0/+37
	Add a new group, "offload", for showing counters from the IFLA_STATS_LINK_OFFLOAD_XSTATS nest, and a subgroup "cpu_hit" for the IFLA_OFFLOAD_XSTATS_CPU_HIT stats suite. For example: # ip stats show dev swp1 group offload subgroup cpu_hit 4178: swp1: group offload subgroup cpu_hit RX: bytes packets errors dropped missed mcast 45522 353 0 0 0 0 TX: bytes packets errors dropped carrier collsns 46054 355 0 0 0 0 # ip -j stats show dev swp1 group offload subgroup cpu_hit | jq [ { "ifindex": 4178, "ifname": "swp1", "group": "offload", "subgroup": "cpu_hit", "stats64": { "rx": { "bytes": 45522, "packets": 353, "errors": 0, "dropped": 0, "over_errors": 0, "multicast": 0 }, "tx": { "bytes": 46054, "packets": 355, "errors": 0, "dropped": 0, "carrier_errors": 0, "collisions": 0 } } } ] Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	ipstats: Add a group "link"	Petr Machata	1	-0/+90
	Add the "link" top-level group for showing IFLA_STATS_LINK_64 statistics. For example: # ip stats show dev swp1 group link 4178: swp1: group link RX: bytes packets errors dropped missed mcast 47048 354 0 0 0 64 TX: bytes packets errors dropped carrier collsns 47474 355 0 0 0 0 # ip -j stats show dev swp1 group link | jq [ { "ifindex": 4178, "ifname": "swp1", "group": "link", "stats64": { "rx": { "bytes": 47048, "packets": 354, "errors": 0, "dropped": 0, "over_errors": 0, "multicast": 64 }, "tx": { "bytes": 47474, "packets": 355, "errors": 0, "dropped": 0, "carrier_errors": 0, "collisions": 0 } } } ] Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
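	The nested "stats64" object in the JSON output above can be flattened into one row of counters per interface, which is convenient for feeding a monitoring tool. This is only an illustrative sketch: flatten_stats64 and the dotted key format ("rx.bytes") are hypothetical choices, and the input is the sample from the commit message.

```python
import json

# Sample copied from the commit message above (output of
# "ip -j stats show dev swp1 group link").
SAMPLE = '''[ { "ifindex": 4178, "ifname": "swp1", "group": "link",
  "stats64": {
    "rx": { "bytes": 47048, "packets": 354, "errors": 0, "dropped": 0,
            "over_errors": 0, "multicast": 64 },
    "tx": { "bytes": 47474, "packets": 355, "errors": 0, "dropped": 0,
            "carrier_errors": 0, "collisions": 0 } } } ]'''


def flatten_stats64(ip_json):
    """Map each ifname to a flat dict such as {"rx.bytes": 47048, ...}.

    Hypothetical helper: assumes "stats64" holds one object per
    direction, each mapping counter names to integers.
    """
    rows = {}
    for entry in json.loads(ip_json):
        flat = {}
        for direction, counters in entry.get("stats64", {}).items():
            for name, value in counters.items():
                flat[f"{direction}.{name}"] = value
        rows[entry["ifname"]] = flat
    return rows


if __name__ == "__main__":
    for ifname, counters in flatten_stats64(SAMPLE).items():
        print(ifname, counters["rx.bytes"], counters["tx.bytes"])
```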
2022-04-27	ipstats: Add a shell of "show" command	Petr Machata	2	-2/+638
	Add the scaffolding necessary for adding individual stats suites to show. Expose some ipstats artifacts in ip_common.h. These will be used to support "xstats" in a follow-up patchset. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	ipstats: Add a "set" command	Petr Machata	1	-0/+78
	Add a command to allow toggling HW stats. An example usage: # ip stats set dev swp1 l3_stats on Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	ip: Add a new family of commands, "stats"	Petr Machata	4	-1/+35
	Add a core of a new frontend tool for interfacing with the RTM_*STATS family of messages. The following patches will add subcommands for showing and setting individual statistics suites. Note that in this patch, "ip stats" is made to be an invalid command line. This will be changed in later patches to default to "show" when that is introduced. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	ip: Publish functions for stats formatting	Petr Machata	2	-11/+25
	Formatting struct rtnl_link_stats64 will be useful outside of iplink.c as well. Extract from __print_link_stats() a new function, print_stats64(), make it non-static and publish in the header file. Additionally, publish the helper size_columns(), which will be useful for formatting the new struct rtnl_hw_stats64. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	libnetlink: Add filtering to rtnl_statsdump_req_filter()	Petr Machata	6	-11/+33
	A number of functions in the rtnl_*_req family accept a caller-provided callback to set up arbitrary filtering. rtnl_statsdump_req_filter() currently only allows setting a field in the IFSM header, not custom attributes. So far these were not necessary, but with the introduction of more detailed filtering settings, the callback becomes necessary. To that end, add filter_fn and filter_data arguments to the function. Unlike the other filters, this one is typed to expect an IFSM pointer, to permit tweaking the header itself as well. Pass NULLs in the existing callers. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-27	devlink: introduce -[he]x cmdline option to allow dumping numbers in hex format	Jiri Pirko	2	-6/+17
	For health reporter dumps it is quite convenient to have the numbers in hexadecimal format. Introduce a command line option that allows the user to get that output. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-25	Update kernel headers	David Ahern	11	-30/+110
	Update kernel headers to commit: cc271ab86606 ("wwan_hwsim: Avoid flush_scheduled_work() usage") Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-25	devlink: fix "devlink health dump" command without arg	Jiri Pirko	1	-7/+18
	Fix bug when user calls "devlink health dump" without "show" or "clear": $ devlink health dump Command "(null)" not found Put the dump command into a separate helper as it is usual in the rest of the code. Also, treat no cmd as "show", as it is common for other devlink objects. Fixes: 041e6e651a8e ("devlink: Add devlink health dump show command") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-04-22	ip-link: put types on man page in alphabetic order	Stephen Hemminger	1	-86/+89
	Let's try to keep man pages in alphabetical order; it looks like it started that way, then drifted. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-19	ip/iplink_virt_wifi: add support for virt_wifi	Baligh Gasmi	4	-3/+32
	Add support for creating devices of type virt_wifi. Syntax: $ ip link add link eth0 name wlan0 type virt_wifi Signed-off-by: Baligh Gasmi <gasmibal@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-04-18	man: use quote instead of acute accent	Luca Boccassi	1	-1/+1
	Lintian complains: I: iproute2: acute-accent-in-manual-page usr/share/man/man8/tc-bpf.8.gz:220 "This manual page uses the \' groff sequence. Usually, the intent is to generate an apostrophe, but that sequence actually renders as an acute accent. For an apostrophe or a single closing quote, use plain '. For single opening quote, i.e. a straight downward line ' like the one used in shell commands, use '\(aq'." Before: ´s,c t f k,c t f k,c t f k,...´ After: 's,c t f k,c t f k,c t f k,...' Signed-off-by: Luca Boccassi <bluca@debian.org>
2022-04-18	man: 'allow to' -> 'allow one to'	Luca Boccassi	6	-7/+7
	Lintian warnings: I: iproute2: typo-in-manual-page usr/share/man/man8/tc-ctinfo.8.gz line 61 "allow to" "allow one to" I: iproute2: typo-in-manual-page usr/share/man/man8/tc-netem.8.gz line 70 "allow to" "allow one to" I: iproute2: typo-in-manual-page usr/share/man/man8/tc-netem.8.gz line 90 "allow to" "allow one to" I: iproute2: typo-in-manual-page usr/share/man/man8/tc-pedit.8.gz line 307 "allow to" "allow one to" I: iproute2: typo-in-manual-page usr/share/man/man8/tc-skbmod.8.gz line 66 "allow to" "allow one to" I: iproute2: typo-in-manual-page usr/share/man/man8/tc-vlan.8.gz line 48 "allow to" "allow one to" I: iproute2: typo-in-manual-page usr/share/man/man8/tc.8.gz line 346 "allow to" "allow one to" Signed-off-by: Luca Boccassi <bluca@debian.org>
2022-04-18	uapi: upstream update to stddef.h	Stephen Hemminger	1	-0/+4
	Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-04-06	uapi: update from 5.18-rc1	Stephen Hemminger	3	-28/+64
	Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-03-24	Merge branch 'main' into next	David Ahern	3	-17/+7
	Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-23	Merge branch 'ss-rpcinfo' into next	David Ahern	2	-20/+89
	Andrea Claudi says: ==================== ss uses rpcinfo to get info about rpc service sockets. However, rpcinfo is not part of iproute2 and it's an implicit dependency for ss. This series uses the libtirpc[1] API to implement the same feature of rpcinfo for ss. This makes it possible to get info about rpc sockets, provided ss is compiled with libtirpc support. As a nice byproduct, this makes ss provide info about some ipv6 rpc sockets that are not displayed using 'rpcinfo -p'. - patch 1 adds a configure function to check for libtirpc; - patch 2 actually reworks ss to use libtirpc. [1] https://git.linux-nfs.org/?p=steved/libtirpc.git ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-23	ss: remove an implicit dependency on rpcinfo	Andrea Claudi	1	-20/+73
	ss uses rpcinfo to get info about rpc service sockets. This makes it dependent on a tool not included in iproute2, and makes it impossible to get info on rpc sockets if rpcinfo is not installed. This reworks init_service_resolver() to use libtirpc, thus avoiding the implicit dependency on rpcinfo. Moreover, this also makes it possible to display info about ipv6 rpc sockets that are not included in the rpcinfo -p output. For example, before this patch: $ ss -rtap LISTEN 0 5 localhost:ipp [::]:* users:(("cupsd",pid=1600,fd=9)) LISTEN 0 64 [::]:34265 [::]:* LISTEN 0 64 [::]:rpc.nfs_acl [::]:* LISTEN 0 128 [::]:42253 [::]:* users:(("rpc.statd",pid=146164,fd=12)) After this patch: $ ss -rtap LISTEN 0 5 localhost:ipp [::]:* users:(("cupsd",pid=1600,fd=9)) LISTEN 0 64 [::]:rpc.nlockmgr [::]:* LISTEN 0 64 [::]:rpc.nfs_acl [::]:* LISTEN 0 128 [::]:rpc.status [::]:* users:(("rpc.statd",pid=146164,fd=12)) Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-23	configure: add check_libtirpc()	Andrea Claudi	1	-0/+16
	This patch adds a configure function to check if libtirpc is installed on the build system. If it is, iproute2 is compiled with libtirpc support. Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-22	v5.17.0	Stephen Hemminger	1	-1/+1

2022-03-20	ip/geneve: add support for IFLA_GENEVE_INNER_PROTO_INHERIT	Eyal Birger	2	-0/+19
	Add support for creating devices with this property. Since it cannot be changed, not adding a [no] option. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-20	Merge branch 'gtp-netdev' into next	David Ahern	6	-5/+301
	Wojciech Drewek says: ==================== This patch series introduces GTP support to iproute2. With this patch series it is possible to create net devices of GTP type. Those devices can then be used in tc in order to offload GTP packets. A new field in tc flower (gtp_opts) can be used to match on QFI and PDU type. Kernel changes (merged): https://lore.kernel.org/netdev/164708701228.11169.15700740251869229843.git-patchwork-notify@kernel.org/ ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-20	f_flower: Implement gtp options support	Wojciech Drewek	2	-2/+131
	Add support for parsing TCA_FLOWER_KEY_ENC_OPTS_GTP. Options are as follows: PDU_TYPE:QFI where each option is represented as 8-bit hexadecimal value. e.g. # ip link add gtp_dev type gtp role sgsn # tc qdisc add dev gtp_dev ingress # tc filter add dev gtp_dev protocol ip parent ffff: \ flower \ enc_key_id 11 \ gtp_opts 1:8/ff:ff \ action mirred egress redirect dev eth0 Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David Ahern <dsahern@kernel.org>
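	To make the PDU_TYPE:QFI syntax above concrete, here is a small sketch of how such a string could be split into 8-bit values and masks. It is not the f_flower.c parser: the function name parse_gtp_opts is hypothetical, and treating an omitted mask as ff:ff (match all bits) is an assumption made for this illustration.

```python
def _parse_pair(part):
    """Parse "A:B" where A and B are 8-bit hexadecimal values."""
    fields = part.split(":")
    if len(fields) != 2:
        raise ValueError(f"expected PDU_TYPE:QFI, got {part!r}")
    values = tuple(int(field, 16) for field in fields)
    for value in values:
        if not 0 <= value <= 0xFF:
            raise ValueError(f"value out of 8-bit range in {part!r}")
    return values


def parse_gtp_opts(text):
    """Split "PDU_TYPE:QFI[/PDU_MASK:QFI_MASK]" into integers.

    Assumption for this sketch: a missing mask means match all bits.
    """
    value_part, _, mask_part = text.partition("/")
    pdu_type, qfi = _parse_pair(value_part)
    pdu_mask, qfi_mask = _parse_pair(mask_part) if mask_part else (0xFF, 0xFF)
    return {
        "pdu_type": pdu_type,
        "qfi": qfi,
        "pdu_type_mask": pdu_mask,
        "qfi_mask": qfi_mask,
    }


if __name__ == "__main__":
    # The option string used in the tc filter example above.
    print(parse_gtp_opts("1:8/ff:ff"))
```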
2022-03-20	ip: GTP support in ip link	Wojciech Drewek	4	-3/+170
	Support for creating GTP devices through ip link. Two arguments can be specified by the user when adding a device of the GTP type. - role (sgsn or ggsn) - indicates whether we are on the GGSN or SGSN - hsize - indicates the size of the hash table where PDP sessions are stored. The IFLA_GTP_FD0 and IFLA_GTP_FD1 arguments are not provided. Those are file descriptors for the sockets created in userspace. Since we are not going to create sockets in ip link, we don't have to provide them. Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Harald Welte <laforge@gnumonks.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-20	man: bridge: document per-port mcast_router settings	Joachim Wiberg	1	-0/+15
	Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
2022-03-20	bridge: support for controlling mcast_router per port	Joachim Wiberg	1	-0/+11
	The bridge vlan command supports setting mcast_router per-port and per-vlan, what's however missing is the ability to set the per-port mcast_router options, e.g. when VLAN filtering is disabled. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
2022-03-20	Update kernel headers	David Ahern	7	-3/+42
	Update kernel headers to commit: 092d992b76ed ("Merge tag 'mlx5-updates-2022-03-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux") Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-16	vdpa: Update man page with added support to configure max vq pair	Eli Cohen	1	-0/+6
	Update man page to include information on how to configure the max virtqueue pairs for a vdpa device when creating one. Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-15	link_xfrm: if_id must be non zero	Antony Antony	1	-1/+6
	Since kernel upstream commit 8dce43919566 ("xfrm: interface with if_id 0 should return error") if_id must be non zero. Fix the usage and return error for if_id 0. Signed-off-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-03-15	testsuite: link xfrm delete no if_id test	Antony Antony	1	-15/+0
	Since kernel commit 8dce43919566 ("xfrm: interface with if_id 0 should return error") if_id should be non zero. Delete the test without if_id, which defaulted if_id to zero. Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-03-14	vdpa: Support reading device features	Eli Cohen	1	-2/+13
	When showing the available management devices, check if VDPA_ATTR_DEV_SUPPORTED_FEATURES feature is available and print the supported features for a management device. Examples: $ vdpa mgmtdev show auxiliary/mlx5_core.sf.1: supported_classes net max_supported_vqs 257 dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ \ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM $ vdpa -jp mgmtdev show { "mgmtdev": { "auxiliary/mlx5_core.sf.1": { "supported_classes": [ "net" ], "max_supported_vqs": 257, "dev_features": [ "CSUM","GUEST_CSUM","MTU","HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ",\ "CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM" ] } } } Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-14	vdpa: Support for configuring max VQ pairs for a device	Eli Cohen	1	-1/+24
	Use VDPA_ATTR_DEV_MGMTDEV_MAX_VQS to specify max number of virtqueue pairs to configure for a vdpa device when adding a device. Examples: 1. Create a device with 3 virtqueue pairs: $ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 max_vqp 3 2. Read the configuration of a vdpa device $ vdpa dev config show vdpa-a vdpa-a: mac 00:00:00:00:88:88 link up link_announce false max_vq_pairs 3 \ mtu 1500 negotiated_features CSUM GUEST_CSUM MTU MAC HOST_TSO4 HOST_TSO6 STATUS \ CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-14	vdpa: Allow for printing negotiated features of a device	Eli Cohen	1	-2/+103
	When reading the configuration of a vdpa device, check whether the VDPA_ATTR_DEV_NEGOTIATED_FEATURES attribute is available. If it is, parse the feature bits and print a string representation of each of them. We keep the strings in two different arrays: one for net device feature bits and one for generic feature bits. In this patch we parse only net device specific features. Support for other devices can be added later. If the device queried is not a net device, we print only the bit numbers. Examples: 1. Standard presentation $ vdpa dev config show vdpa-a vdpa-a: mac 00:00:00:00:88:88 link up link_announce false max_vq_pairs 2 mtu 9000 negotiated_features CSUM GUEST_CSUM MTU MAC HOST_TSO4 HOST_TSO6 STATUS \ CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM 2. json output $ vdpa -j dev config show vdpa-a {"config":{"vdpa-a":{"mac":"00:00:00:00:88:88","link":"up","link_announce":false,\ "max_vq_pairs":2,"mtu":9000,"negotiated_features":["CSUM","GUEST_CSUM",\ "MTU","MAC","HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ","CTRL_MAC_ADDR",\ "VERSION_1","ACCESS_PLATFORM"]}}} 3. Pretty json $ vdpa -jp dev config show vdpa-a { "config": { "vdpa-a": { "mac": "00:00:00:00:88:88", "link ": "up", "link_announce ": false, "max_vq_pairs": 2, "mtu": 9000, "negotiated_features": [ "CSUM","GUEST_CSUM","MTU","MAC","HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ",\ "MQ","CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM" ] } } } Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
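	The two-table lookup described above can be sketched as follows. This is an illustration in Python, not the C code in vdpa.c: the tables list only the features that appear in the example output, the bit positions are the ones recalled from the virtio specification and should be checked against it, and the "bit_N" fallback label and the function name feature_names are assumptions of this sketch.

```python
# Net-device feature bits (subset; positions assumed from the virtio spec).
NET_FEATURES = {
    0: "CSUM", 1: "GUEST_CSUM", 3: "MTU", 5: "MAC",
    11: "HOST_TSO4", 12: "HOST_TSO6", 16: "STATUS",
    17: "CTRL_VQ", 22: "MQ", 23: "CTRL_MAC_ADDR",
}

# Device-independent feature bits (subset).
GENERIC_FEATURES = {32: "VERSION_1", 33: "ACCESS_PLATFORM"}


def feature_names(features, is_net=True):
    """Return a name for every set bit of a 64-bit feature mask.

    Generic bits are always named; device-specific bits are named
    only for net devices. Anything else falls back to "bit_N".
    """
    names = []
    for bit in range(64):
        if not features & (1 << bit):
            continue
        name = GENERIC_FEATURES.get(bit)
        if name is None and is_net:
            name = NET_FEATURES.get(bit)
        names.append(name if name is not None else f"bit_{bit}")
    return names


if __name__ == "__main__":
    # Build a mask containing exactly the features from the example output.
    mask = sum(1 << bit for bit in list(NET_FEATURES) + list(GENERIC_FEATURES))
    print(" ".join(feature_names(mask)))
```

	Iterating over bit positions in ascending order keeps the names in the same order as the example output in the commit message.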
2022-03-14	vdpa: Remove unsupported command line option	Eli Cohen	1	-1/+1
	"-v[erbose]" option is not supported. Remove it. Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-12	Makefile: move HAVE_MNL check to top-level Makefile	Andrea Claudi	6	-29/+4
	dcb, devlink, rdma, tipc and vdpa rely on libmnl to compile, so each of them checks in its Makefile that libmnl is installed. This moves the HAVE_MNL check from the tools to the top-level Makefile, so that their Makefiles are not invoked if libmnl is not present. Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-12	Merge branch 'bridge-broadcast-flooding' into next	David Ahern	4	-8/+42
	Joachim Wiberg says: ==================== This patch set addresses a slight omission in controlling broadcast flooding per bridge port, which the bridge has had support for a good while now. v3: - Move bcast_flood option in manual files to before the mcast_flood option, instead of breaking the two mcast options. Unfortunately the other options are not alphabetically sorted, so this was the least worst option. (Stephen) - Add missing closing " for 'bridge mdb show' in bridge(8) SYNOPSIS v2: - Add bcast_flood also to ip/iplink_bridge_slave.c (Nik) - Update man page for ip-link(8) with new bcast_flood flag - Update mcast_flood in same man page slightly - Fix minor weird whitespace issues causing sudden line breaks v1: - Add bcast_flood to bridge/link.c - Update man page for bridge(8) with bcast_flood for brports ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-12	Merge branch 'main' into next	David Ahern	18	-130/+142
	Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-12	man: ip-link: whitespace fixes to odd line breaks mid sentence	Joachim Wiberg	1	-7/+7
	Some options, spread across the man page, were accidentally (?) space indented (possible bullet list auto-indent in editors), causing odd line breaks in presentation mode (emacs, nroff, etc.). This patch aligns the multi-line descriptions to column zero, in line with other such option descriptions. Signed-off-by: Joachim Wiberg <troglobit@gmail.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-12	man: ip-link: mention bridge port's default mcast_flood state	Joachim Wiberg	1	-1/+1
	Signed-off-by: Joachim Wiberg <troglobit@gmail.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-12	man: ip-link: document new bcast_flood flag on bridge ports	Joachim Wiberg	1	-0/+6
	The options are not alphabetically sorted, so placing bcast_flood right before mcast_flood for now. Signed-off-by: Joachim Wiberg <troglobit@gmail.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-12	ip: iplink_bridge_slave: support for broadcast flooding	Joachim Wiberg	1	-0/+9
	Add per-port support for controlling flooding of broadcast traffic, similar to the unicast and multicast flooding controls that already exist. Signed-off-by: Joachim Wiberg <troglobit@gmail.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-12	man: bridge: add missing closing " in bridge show mdb	Joachim Wiberg	1	-1/+1
	Signed-off-by: Joachim Wiberg <troglobit@gmail.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-12	man: bridge: document new bcast_flood flag for bridge ports	Joachim Wiberg	1	-0/+6
	The bridge link options are not alphabetically sorted, so placing bcast_flood right before mcast_flood for now. Signed-off-by: Joachim Wiberg <troglobit@gmail.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-12	bridge: support for controlling flooding of broadcast per port	Joachim Wiberg	1	-0/+13
	Add per-port support for controlling flooding of broadcast traffic, similar to the unicast and multicast flooding controls that already exist. Signed-off-by: Joachim Wiberg <troglobit@gmail.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-11	rdma: make RES_PID and RES_KERN_NAME alternative to each other	Andrea Claudi	8	-39/+34
	RDMA_NLDEV_ATTR_RES_PID and RDMA_NLDEV_ATTR_RES_KERN_NAME cannot be set together, as is evident from the fill_res_name_pid() function in the kernel infiniband driver. This commit makes this clear at first glance, using an else branch for the RDMA_NLDEV_ATTR_RES_KERN_NAME case. This also helps coverity to better understand this code and avoid producing a bogus warning complaining about mnl_attr_get_str overwriting comm, and thus leaking the storage that comm points to. Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-03-11	uapi: update vdpa.h	Stephen Hemminger	1	-0/+6
	Update header from upstream. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-03-11	ipaddress: remove 'label' compatibility with Linux-2.0 net aliases	Maxime de Roucy	2	-19/+0
	As Linux-2.0 is getting old and systemd allows non Linux-2.0 compatible aliases to be set, I think iproute2 should be able to manage such aliases. Signed-off-by: Maxime de Roucy <maxime.deroucy@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-03-11	lib/fs: fix memory leak in get_task_name()	Andrea Claudi	11	-39/+57
	asprintf() allocates memory which is not freed on the error path of get_task_name(), thus potentially leading to memory leaks. The %m specifier of fscanf() allocates memory, too, which needs to be freed by the caller. This reworks get_task_name() to avoid memory allocation. - Pass a buffer and its length to the function, similarly to what get_command_name() does, so that no memory has to be allocated for the returned string; - Use snprintf() instead of asprintf(); - Use fgets() instead of fscanf() to limit string length. Fixes: 81bfd01a4c9e ("lib: move get_task_name() from rdma") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-03-11	ip/batadv: allow to specify RA when creating link	Nicolas Escande	2	-1/+65
	This patch adds the possibility to specify batadv specific options when creating a new batman link. The only option available on link creation is IFLA_BATADV_ALGO_NAME which specifies the routing algorithm. Note there is no batadv specific attr to be handled on link dump. Signed-off-by: Nicolas Escande <nico.escande@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-11	Import batman_adv.h header from last kernel sync point	David Ahern	1	-0/+704
	Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-10	uapi: update magic.h	Stephen Hemminger	1	-0/+1
	Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-03-08	Revert "configure: Allow command line override of toolchain"	David Ahern	1	-4/+4
	This reverts commit 386ae64c8312dd27b09508993a7c8386aff8b1d3. Ido reported compile breakage on Fedora with this patch, so reverting. Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-07	tc: separate action print for filter and action dump	Baowen Zheng	1	-17/+36
	We need to separate action printing for filter dump and action dump: in an action dump we need to print the hardware status and flags, but in a filter dump we do not need to print the action's hardware status or hardware-related flags. In a filter dump, the action's hardware status should be the same as the filter's, so we do not print it in that case. Action print for action dump: action order 0: police 0xff000100 rate 0bit burst 0b mtu 64Kb pkts_rate 50000 pkts_burst 10000 action drop/pipe overhead 0b linklayer unspec ref 4 bind 3 installed 666 sec used 0 sec firstused 106 sec Action statistics: Sent 7634140154 bytes 5109889 pkt (dropped 0, overlimits 0 requeues 0) Sent software 84 bytes 3 pkt Sent hardware 7634140070 bytes 5109886 pkt backlog 0b 0p requeues 0 in_hw in_hw_count 1 used_hw_stats delayed Action print for filter dump: action order 1: police 0xff000100 rate 0bit burst 0b mtu 64Kb pkts_rate 50000 pkts_burst 10000 action drop/pipe overhead 0b linklayer unspec ref 4 bind 3 installed 680 sec used 0 sec firstused 119 sec Action statistics: Sent 8627975846 bytes 5775107 pkt (dropped 0, overlimits 0 requeues 0) Sent software 84 bytes 3 pkt Sent hardware 8627975762 bytes 5775104 pkt backlog 0b 0p requeues 0 used_hw_stats delayed Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-07	rdma: Fix the logic to print unsigned int.	Shangyan Zhou	11	-33/+44
	Use the corresponding function and fmt string to print unsigned int32 and int64. Signed-off-by: Shangyan Zhou <sy.zhou@hotmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-03-07	Revert "rdma: Fix res_print_uint() and add res_print_u64()"	Stephen Hemminger	6	-20/+9
	This reverts commit 9d0badecea4c5e85345577984a328f38c75685c3.
2022-03-07	Merge branch 'libbpf-fixups' into next	David Ahern	4	-31/+29
	Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-07	bpf: Remove use of bpf_create_map_xattr	David Ahern	1	-12/+12
	bpf_create_map_xattr is deprecated in v0.7 in favor of bpf_map_create. bpf_map_create and its bpf_map_create_opts are not available across the range of v0.1 and up versions of libbpf, so change create_map to use the bpf syscall directly. Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-07	bpf: Export bpf syscall wrapper	David Ahern	3	-12/+15
	Move bpf syscall wrapper to bpf_glue to make it available to libbpf based functions. Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-07	bpf_glue: Remove use of bpf_load_program from libbpf	David Ahern	2	-12/+7
	bpf_load_program is deprecated starting in v0.7. The preferred bpf_prog_load requires bpf_prog_load_opts from v0.6. This creates an ugly scenario for iproute2 to work across libbpf versions from v0.1 and up. Since bpf_load_program is only used to load the builtin vrf program, just remove the libbpf call and use the legacy code. Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-04	rdma: Fix res_print_uint() and add res_print_u64()	Shangyan Zhou	6	-9/+20
	Use the corresponding function and fmt string to print unsigned int32 and int64. Signed-off-by: Shangyan Zhou <sy.zhou@hotmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-03-04	uapi: update to xfrm.h	Stephen Hemminger	1	-0/+6
	Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-03-04	ss: display advertised TCP receive window and out-of-order counter	Davide Caratti	1	-0/+8
	These members of TCP_INFO have been included since v5.4. Tested with: # ss -nti Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-04	Merge branch 'link-layer-protocols' into next	David Ahern	2	-4/+6
	Daniel Braunwarth says: ==================== Update the llproto_names array to allow users to reference the PROFINET and EtherCAT protocols with the names 'profinet' and 'ethercat'. These patches depend on the patch referenced below, which extends if_ether.h with the ETH_P_xxx defines used here. Link: https://lore.kernel.org/netdev/20220228133029.100913-1-daniel@braunwarth.dev/ ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-04	tc: bash-completion: Add profinet and ethercat to protocol completion list	Daniel Braunwarth	1	-4/+4
	Add the 'profinet' and 'ethercat' protocols to bash completion. Signed-off-by: Daniel Braunwarth <daniel@braunwarth.dev>
2022-03-04	lib: add profinet and ethercat as link layer protocol names	Daniel Braunwarth	1	-0/+2
	Update the llproto_names array to allow users to reference the PROFINET and EtherCAT protocols with the names 'profinet' and 'ethercat'. Signed-off-by: Daniel Braunwarth <daniel@braunwarth.dev>
2022-03-04	Merge branch '802.1X-locked-bridge-ports' into next	David Ahern	4	-0/+39
	Hans Schultz says: ==================== This patch set is to complement the kernel locked port patches, such that iproute2 can be used to lock/unlock a port and check if a port is locked or not. To lock or unlock a port use the command: bridge link set dev DEV locked {on | off} To show the detailed setting of a port, including if the locked flag is enabled for the port(s), use the command: bridge -d link show [dev DEV] ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-04	man8/ip-link.8: add locked port feature description and cmd syntax	Hans Schultz	1	-0/+6
	Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-04	man8/bridge.8: add locked port feature description and cmd syntax	Hans Schultz	1	-0/+11
	Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-04	ip: iplink_bridge_slave: add locked port flag support	Hans Schultz	1	-0/+9
	Syntax: ip link set dev DEV type bridge_slave locked {on | off} Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-04	bridge: link: add command to set port in locked mode	Hans Schultz	1	-0/+13
	Add support for setting a bridge port in locked mode to use with 802.1X, so that only authorized clients are allowed access through the port. Syntax: bridge link set dev DEV locked {on, off} Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-03-04	Update kernel headers	David Ahern	5	-1/+148
	Update kernel headers to commit: 1039135aedfc ("net: ethernet: sun: Remove redundant code") Signed-off-by: David Ahern <dsahern@kernel.org>
2022-02-28	configure: Allow command line override of toolchain	David Ahern	1	-4/+4
	Easy way to build for both gcc and clang. Signed-off-by: David Ahern <dsahern@kernel.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2022-02-27	mptcp: add port support for setting flags	Geliang Tang	2	-6/+12
	This patch updates the port keyword check for setting flags: the port keyword is now allowed with the non-signal flags, but it is not allowed together with the id number. With this patch, flags can be set in two forms, using either the address and port number directly or the id number of the address: ip mptcp endpoint change id 1 fullmesh ip mptcp endpoint change 10.0.2.1 fullmesh ip mptcp endpoint change 10.0.2.1 port 10100 fullmesh Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: David Ahern <dsahern@kernel.org>
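	The rule about combining port with id can be sketched as a small argument check. This is a hypothetical Python model of the behaviour described in the message, not the C code in the patch; the function name change_form and its parameters are assumptions made for illustration.

```python
def change_form(endpoint_id=None, address=None, port=None):
    """Classify an 'endpoint change' request as the 'id' or 'address' form.

    Hypothetical model: the only rule enforced is the one stated in the
    commit message, namely that port must not be combined with an id.
    """
    if endpoint_id is not None and port is not None:
        raise ValueError("port cannot be used together with id")
    return "id" if endpoint_id is not None else "address"
```

	In this model, change_form(address="10.0.2.1", port=10100) is accepted as the address form, while passing both endpoint_id and port raises an error.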
2022-02-27	mptcp: add fullmesh support for setting flags	Geliang Tang	2	-10/+18
	A pair of new flags, fullmesh and nofullmesh, was recently added to the flag-setting interface of MPTCP PM netlink in kernel space by commit 73c762c1f07d ("mptcp: set fullmesh flag in pm_netlink"). This patch adds the corresponding user-space logic to pass these two flags over netlink. These new flags can be used like this: ip mptcp endpoint change id 1 fullmesh ip mptcp endpoint change id 1 nofullmesh ip mptcp endpoint change id 1 backup fullmesh ip mptcp endpoint change id 1 nobackup nofullmesh Here's an example of setting fullmesh flags: > sudo ip mptcp endpoint add 10.0.2.1 subflow > sudo ip mptcp endpoint show 10.0.2.1 id 1 subflow > sudo ip mptcp endpoint change id 1 fullmesh > sudo ip mptcp endpoint show 10.0.2.1 id 1 subflow fullmesh > sudo ip mptcp endpoint change id 1 nofullmesh > sudo ip mptcp endpoint show 10.0.2.1 id 1 subflow It can be seen that 'ip mptcp endpoint show' already supports showing the fullmesh flag. Acked-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2022-02-27	mptcp: add fullmesh check for adding address	Geliang Tang	1	-0/+5
	The fullmesh flag mustn't be used with the signal flag when adding an address. Commands like this should be treated as invalid commands: ip mptcp endpoint add 10.0.2.1 signal fullmesh This patch added the necessary flags check for this case. Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: David Ahern <dsahern@kernel.org>
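	The flags check described above amounts to rejecting one combination. A minimal Python sketch of that rule follows; it is not the patch's C code, and the function name check_add_flags is hypothetical.

```python
def check_add_flags(flags):
    """Validate the flags given to a modelled 'endpoint add' request.

    Only the rule from the commit message is enforced: 'fullmesh' must not
    be combined with 'signal'. Returns the flags as a set when accepted.
    """
    flags = set(flags)
    if {"signal", "fullmesh"} <= flags:
        raise ValueError("fullmesh cannot be used with signal")
    return flags
```

	Under this model, the flags of 'ip mptcp endpoint add 10.0.2.1 signal fullmesh' are rejected, while 'subflow fullmesh' passes the check.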
2022-02-27	bond: add ns_ip6_target option	Hangbin Liu	1	-1/+52
	Similar to arp_ip_target, this option adds bond IPv6 NS/NA monitor support. When an IPv6 target is set, the ARP target is disabled. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>