aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
29 hoursBluetooth: btintel: Refactor btintel_set_ppag()HEADmasterKiran K1-85/+34
Current flow iterates the ACPI table associated with Bluetooth controller looking for PPAG method. Method name can be directly passed to acpi_evaluate_object function instead of iterating the table. Fixes: c585a92b2f9c ("Bluetooth: btintel: Set Per Platform Antenna Gain(PPAG)") Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
29 hoursBluetooth: btnxpuart: Shutdown timer and prevent rearming when driver unloadingLuke Wang1-1/+1
When unload the btnxpuart driver, its associated timer will be deleted. If the timer happens to be modified at this moment, it leads to the kernel call this timer even after the driver unloaded, resulting in kernel panic. Use timer_shutdown_sync() instead of del_timer_sync() to prevent rearming. panic log: Internal error: Oops: 0000000086000007 [#1] PREEMPT SMP Modules linked in: algif_hash algif_skcipher af_alg moal(O) mlan(O) crct10dif_ce polyval_ce polyval_generic snd_soc_imx_card snd_soc_fsl_asoc_card snd_soc_imx_audmux mxc_jpeg_encdec v4l2_jpeg snd_soc_wm8962 snd_soc_fsl_micfil snd_soc_fsl_sai flexcan snd_soc_fsl_utils ap130x rpmsg_ctrl imx_pcm_dma can_dev rpmsg_char pwm_fan fuse [last unloaded: btnxpuart] CPU: 5 PID: 723 Comm: memtester Tainted: G O 6.6.23-lts-next-06207-g4aef2658ac28 #1 Hardware name: NXP i.MX95 19X19 board (DT) pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : 0xffff80007a2cf464 lr : call_timer_fn.isra.0+0x24/0x80 ... Call trace: 0xffff80007a2cf464 __run_timers+0x234/0x280 run_timer_softirq+0x20/0x40 __do_softirq+0x100/0x26c ____do_softirq+0x10/0x1c call_on_irq_stack+0x24/0x4c do_softirq_own_stack+0x1c/0x2c irq_exit_rcu+0xc0/0xdc el0_interrupt+0x54/0xd8 __el0_irq_handler_common+0x18/0x24 el0t_64_irq_handler+0x10/0x1c el0t_64_irq+0x190/0x194 Code: ???????? ???????? ???????? ???????? (????????) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Oops: Fatal exception in interrupt SMP: stopping secondary CPUs Kernel Offset: disabled CPU features: 0x0,c0000000,40028143,1000721b Memory Limit: none ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]--- Signed-off-by: Luke Wang <ziniu.wang_1@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 daysBluetooth: btnxpuart: Handle FW Download Abort scenarioNeeraj Sanjay Kale1-14/+33
This adds a new flag BTNXPUART_FW_DOWNLOAD_ABORT which handles the situation where driver is removed while firmware download is in progress. logs: modprobe btnxpuart [65239.230431] Bluetooth: hci0: ChipID: 7601, Version: 0 [65239.236670] Bluetooth: hci0: Request Firmware: nxp/uartspi_n61x_v1.bin.se rmmod btnxpuart [65241.425300] Bluetooth: hci0: FW Download Aborted Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com> Tested-by: Guillaume Legoupil <guillaume.legoupil@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 daysBluetooth: btnxpuart: Enable status prints for firmware downloadNeeraj Sanjay Kale1-4/+5
This enables prints for firmware download which can help automation tests to verify firmware download functionality. dmesg logs before: modprobe btnxpuart [ 1999.187264] Bluetooth: MGMT ver 1.22 dmesg logs with this patch: modprobe btnxpuart [16179.758515] Bluetooth: hci0: ChipID: 7601, Version: 0 [16179.764748] Bluetooth: hci0: Request Firmware: nxp/uartspi_n61x_v1.bin.se [16181.217490] Bluetooth: hci0: FW Download Complete: 372696 bytes [16182.701398] Bluetooth: MGMT ver 1.22 Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com> Tested-by: Guillaume Legoupil <guillaume.legoupil@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 daysBluetooth: btnxpuart: Fix Null pointer dereference in btnxpuart_flush()Neeraj Sanjay Kale1-4/+8
This adds a check before freeing the rx->skb in flush and close functions to handle the kernel crash seen while removing driver after FW download fails or before FW download completes. dmesg log: [ 54.634586] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080 [ 54.643398] Mem abort info: [ 54.646204] ESR = 0x0000000096000004 [ 54.649964] EC = 0x25: DABT (current EL), IL = 32 bits [ 54.655286] SET = 0, FnV = 0 [ 54.658348] EA = 0, S1PTW = 0 [ 54.661498] FSC = 0x04: level 0 translation fault [ 54.666391] Data abort info: [ 54.669273] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 54.674768] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 54.674771] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 54.674775] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000048860000 [ 54.674780] [0000000000000080] pgd=0000000000000000, p4d=0000000000000000 [ 54.703880] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 54.710152] Modules linked in: btnxpuart(-) overlay fsl_jr_uio caam_jr caamkeyblob_desc caamhash_desc caamalg_desc crypto_engine authenc libdes crct10dif_ce polyval_ce polyval_generic snd_soc_imx_spdif snd_soc_imx_card snd_soc_ak5558 snd_soc_ak4458 caam secvio error snd_soc_fsl_micfil snd_soc_fsl_spdif snd_soc_fsl_sai snd_soc_fsl_utils imx_pcm_dma gpio_ir_recv rc_core sch_fq_codel fuse [ 54.744357] CPU: 3 PID: 72 Comm: kworker/u9:0 Not tainted 6.6.3-otbr-g128004619037 #2 [ 54.744364] Hardware name: FSL i.MX8MM EVK board (DT) [ 54.744368] Workqueue: hci0 hci_power_on [ 54.757244] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 54.757249] pc : kfree_skb_reason+0x18/0xb0 [ 54.772299] lr : btnxpuart_flush+0x40/0x58 [btnxpuart] [ 54.782921] sp : ffff8000805ebca0 [ 54.782923] x29: ffff8000805ebca0 x28: ffffa5c6cf1869c0 x27: ffffa5c6cf186000 [ 54.782931] x26: ffff377b84852400 x25: ffff377b848523c0 x24: ffff377b845e7230 [ 54.782938] x23: ffffa5c6ce8dbe08 x22: ffffa5c6ceb65410 x21: 00000000ffffff92 [ 54.782945] x20: ffffa5c6ce8dbe98 x19: ffffffffffffffac x18: ffffffffffffffff [ 54.807651] x17: 0000000000000000 x16: ffffa5c6ce2824ec x15: ffff8001005eb857 [ 54.821917] x14: 0000000000000000 x13: ffffa5c6cf1a02e0 x12: 0000000000000642 [ 54.821924] x11: 0000000000000040 x10: ffffa5c6cf19d690 x9 : ffffa5c6cf19d688 [ 54.821931] x8 : ffff377b86000028 x7 : 0000000000000000 x6 : 0000000000000000 [ 54.821938] x5 : ffff377b86000000 x4 : 0000000000000000 x3 : 0000000000000000 [ 54.843331] x2 : 0000000000000000 x1 : 0000000000000002 x0 : ffffffffffffffac [ 54.857599] Call trace: [ 54.857601] kfree_skb_reason+0x18/0xb0 [ 54.863878] btnxpuart_flush+0x40/0x58 [btnxpuart] [ 54.863888] hci_dev_open_sync+0x3a8/0xa04 [ 54.872773] hci_power_on+0x54/0x2e4 [ 54.881832] process_one_work+0x138/0x260 [ 54.881842] worker_thread+0x32c/0x438 [ 54.881847] kthread+0x118/0x11c [ 54.881853] ret_from_fork+0x10/0x20 [ 54.896406] Code: a9be7bfd 910003fd f9000bf3 aa0003f3 (b940d400) [ 54.896410] ---[ end trace 0000000000000000 ]--- Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com> Tested-by: Guillaume Legoupil <guillaume.legoupil@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 daysBluetooth: hci_bcm4377: Fix msgid releaseHector Martin1-1/+1
We are releasing a single msgid, so the order argument to bitmap_release_region must be zero. Fixes: 8a06127602de ("Bluetooth: hci_bcm4377: Add new driver for BCM4377 PCIe boards") Cc: stable@vger.kernel.org Signed-off-by: Hector Martin <marcan@marcan.st> Reviewed-by: Sven Peter <sven@svenpeter.dev> Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 daysBluetooth: hci_bcm4377: Increase boot timeoutHector Martin1-1/+2
BCM4388 takes over 2 seconds to boot, so increase the timeout. Signed-off-by: Hector Martin <marcan@marcan.st> Reviewed-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 daysBluetooth: hci_bcm4377: Use correct unit for timeoutsSven Peter1-1/+1
BCM4377_TIMEOUT is always used to wait for completitions and their API expects a timeout in jiffies instead of msecs. Fixes: 8a06127602de ("Bluetooth: hci_bcm4377: Add new driver for BCM4377 PCIe boards") Signed-off-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 daysBluetooth: Add quirk to ignore reserved PHY bits in LE Extended Adv ReportSven Peter3-0/+26
Some Broadcom controllers found on Apple Silicon machines abuse the reserved bits inside the PHY fields of LE Extended Advertising Report events for additional flags. Add a quirk to drop these and correctly extract the Primary/Secondary_PHY field. The following excerpt from a btmon trace shows a report received with "Reserved" for "Primary PHY" on a 4388 controller: > HCI Event: LE Meta Event (0x3e) plen 26 LE Extended Advertising Report (0x0d) Num reports: 1 Entry 0 Event type: 0x2515 Props: 0x0015 Connectable Directed Use legacy advertising PDUs Data status: Complete Reserved (0x2500) Legacy PDU Type: Reserved (0x2515) Address type: Random (0x01) Address: 00:00:00:00:00:00 (Static) Primary PHY: Reserved Secondary PHY: No packets SID: no ADI field (0xff) TX power: 127 dBm RSSI: -60 dBm (0xc4) Periodic advertising interval: 0.00 msec (0x0000) Direct address type: Public (0x00) Direct address: 00:00:00:00:00:00 (Apple, Inc.) Data length: 0x00 Cc: stable@vger.kernel.org Fixes: 2e7ed5f5e69b ("Bluetooth: hci_sync: Use advertised PHYs on hci_le_ext_create_conn_sync") Reported-by: Janne Grunau <j@jannau.net> Closes: https://lore.kernel.org/all/Zjz0atzRhFykROM9@robin Tested-by: Janne Grunau <j@jannau.net> Signed-off-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
3 daysBluetooth: hci_sync: Fix not using correct handleLuiz Augusto von Dentz1-1/+1
When setting up an advertisement the code shall always attempt to use the handle set by the instance since it may not be equal to the instance ID. Fixes: e77f43d531af ("Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: MGMT: Make MGMT_OP_LOAD_CONN_PARAM update existing connectionLuiz Augusto von Dentz3-2/+69
This makes MGMT_OP_LOAD_CONN_PARAM update existing connection by dectecting the request is just for one connection, parameters already exists and there is a connection. Since this is a new behavior the revision is also updated to enable userspace to detect it. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btintel_pcie: Refactor and code cleanupfor-net-next-2024-05-14Kiran K2-5/+4
Minor refactor and s/TX_WAIT_TIMEOUT_MS/BTINTEL_PCIE_TX_WAIT_TIMEOUT_MS/g. Fixes: 6e65a09f9275 ("Bluetooth: btintel_pcie: Add *setup* function to download firmware") Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btintel_pcie: Fix warning reported by sparseKiran K1-1/+1
Fix sparse error. Fixes: c2b636b3f788 ("Bluetooth: btintel_pcie: Add support for PCIe transport") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405100654.0djvoryZ-lkp@intel.com/ Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1Luiz Augusto von Dentz3-9/+18
If hdev->le_num_of_adv_sets is set to 1 it means that only handle 0x00 can be used, but since the MGMT interface instances start from 1 (instance 0 means all instances in case of MGMT_OP_REMOVE_ADVERTISING) the code needs to map the instance to handle otherwise users will not be able to advertise as instance 1 would attempt to use handle 0x01. Fixes: 1d0fac2c38ed ("Bluetooth: Use controller sets when available") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btintel: Fix compiler warning for multi_v7_defconfig configKiran K1-1/+1
Fix the following compiler warning reported for ARCH=arm multi_v7_defconfig. In file included from drivers/bluetooth/hci_ldisc.c:34: drivers/bluetooth/btintel.h:373:13: warning: 'btintel_hw_error' defined but not used [-Wunused-function] 373 | static void btintel_hw_error(struct hci_dev *hdev, u8 code) | ^~~~~~~~~~~~~~~~ cc: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 67d4dbac3b8c ("Bluetooth: btintel: Export few static functions") Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btintel_pcie: Fix compiler warningsKiran K1-7/+3
Fix compiler warnings reported by kernel bot. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405080647.VRBej6fA-lkp@intel.com/ Fixes: c2b636b3f788 ("Bluetooth: btintel_pcie: Add support for PCIe transport") Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btintel_pcie: Add *setup* function to download firmwareKiran K2-5/+315
Add support to download firmware. dmesg: [4.407464] Bluetooth: Core ver 2.22 [4.407467] Bluetooth: Starting self testing [4.409093] Bluetooth: ECDH test passed in 1587 usecs [4.420737] Bluetooth: SMP test passed in 526 usecs [4.420745] Bluetooth: Finished self testing [4.420760] Bluetooth: HCI device and connection manager initialized [4.420764] Bluetooth: HCI socket layer initialized [4.420766] Bluetooth: L2CAP socket layer initialized [4.420769] Bluetooth: SCO socket layer initialized [4.437976] Bluetooth: hci0: Device revision is 0 [4.437979] Bluetooth: hci0: Secure boot is disabled [4.437980] Bluetooth: hci0: OTP lock is disabled [4.437980] Bluetooth: hci0: API lock is disabled [4.437981] Bluetooth: hci0: Debug lock is disabled [4.437981] Bluetooth: hci0: Minimum firmware build 0 week 0 2000 [4.437982] Bluetooth: hci0: Bootloader timestamp 2023.33 buildtype 1 build 45995 [4.439461] Bluetooth: hci0: Found device firmware: intel/ibt-0190-0291-iml.sfi [4.439467] Bluetooth: hci0: Boot Address: 0x30099000 [4.439468] Bluetooth: hci0: Firmware Version: 92-19.24 [4.486773] Bluetooth: hci0: Waiting for firmware download to complete [4.486784] Bluetooth: hci0: Firmware loaded in 46209 usecs [4.486845] Bluetooth: hci0: Waiting for device to boot [4.491984] Bluetooth: hci0: Malformed MSFT vendor event: 0x02 [4.491987] Bluetooth: hci0: Device booted in 5074 usecs [4.496657] Bluetooth: hci0: Found device firmware: intel/ibt-0190-0291.sfi [4.496703] Bluetooth: hci0: Boot Address: 0x10000800 [4.496704] Bluetooth: hci0: Firmware Version: 92-19.24 [4.687338] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 [4.687342] Bluetooth: BNEP filters: protocol multicast [4.687345] Bluetooth: BNEP socket layer initialized [4.922589] Bluetooth: hci0: Waiting for firmware download to complete [4.922608] Bluetooth: hci0: Firmware loaded in 415962 usecs [4.922664] Bluetooth: hci0: Waiting for device to boot [4.956185] Bluetooth: hci0: Malformed MSFT vendor event: 0x02 [4.956188] Bluetooth: hci0: Device booted in 32770 usecs [4.963167] Bluetooth: hci0: Found Intel DDC parameters: intel/ibt-0190-0291.ddc [4.963440] Bluetooth: hci0: Applying Intel DDC parameters completed [4.963684] Bluetooth: hci0: Firmware timestamp 2024.18 buildtype 3 build 62300 [4.963687] Bluetooth: hci0: Firmware SHA1: 0x8201a4cd [5.003020] Bluetooth: MGMT ver 1.22 [5.003084] Bluetooth: ISO socket layer initialized [5.057844] Bluetooth: RFCOMM TTY layer initialized [5.057858] Bluetooth: RFCOMM socket layer initialized [5.057865] Bluetooth: RFCOMM ver 1.11 hciconfig -a: hci0: Type: Primary Bus: PCI BD Address: A0:D3:65:48:F5:7F ACL MTU: 1021:5 SCO MTU: 240:8 UP RUNNING PSCAN RX bytes:23603 acl:0 sco:0 events:3792 errors:0 TX bytes:949804 acl:0 sco:0 commands:3788 errors:0 Features: 0xbf 0xfe 0x0f 0xfe 0xdb 0xff 0x7b 0x87 Packet type: DM1 DM3 DM5 DH1 DH3 DH5 HV1 HV2 HV3 Link policy: RSWITCH SNIFF Link mode: PERIPHERAL ACCEPT Name: 'LNLM620' Class: 0x20010c Service Classes: Audio Device Class: Computer, Laptop HCI Version: 5.4 (0xd) Revision: 0x4b5c LMP Version: 5.4 (0xd) Subversion: 0x4b5c Manufacturer: Intel Corp. (2) Signed-off-by: Chandrashekar <chandrashekar.devegowda@intel.com> Suggested-by: Bjorn Helgaas <helgaas@kernel.org> Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btintel_pcie: Add support for PCIe transportTedd Ho-Jeong An5-1/+1495
Add initial code to support Intel bluetooth devices based on PCIe transport. Allocate memory for TX & RX buffers, internal structures, initialize interrupts for TX & RX and PCIe device. Signed-off-by: Tedd Ho-Jeong An <tedd.an@intel.com> Suggested-by: Bjorn Helgaas <helgaas@kernel.org> Suggested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btintel: Export few static functionsKiran K2-10/+59
Some of the functions used in btintel.c is made global so that they can be reused in other transport drivers apart from USB. Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: HCI: Remove HCI_AMP supportLuiz Augusto von Dentz20-664/+49
Since BT_HS has been remove HCI_AMP controllers no longer has any use so remove it along with the capability of creating AMP controllers. Since we no longer need to differentiate between AMP and Primary controllers, as only HCI_PRIMARY is left, this also remove hdev->dev_type altogether. Fixes: e7b02296fb40 ("Bluetooth: Remove BT_HS") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()Sungwoo Kim7-53/+88
l2cap_le_flowctl_init() can cause both div-by-zero and an integer overflow since hdev->le_mtu may not fall in the valid range. Move MTU from hci_dev to hci_conn to validate MTU and stop the connection process earlier if MTU is invalid. Also, add a missing validation in read_buffer_size() and make it return an error value if the validation fails. Now hci_conn_add() returns ERR_PTR() as it can fail due to the both a kzalloc failure and invalid MTU value. divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 0 PID: 67 Comm: kworker/u5:0 Tainted: G W 6.9.0-rc5+ #20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: hci0 hci_rx_work RIP: 0010:l2cap_le_flowctl_init+0x19e/0x3f0 net/bluetooth/l2cap_core.c:547 Code: e8 17 17 0c 00 66 41 89 9f 84 00 00 00 bf 01 00 00 00 41 b8 02 00 00 00 4c 89 fe 4c 89 e2 89 d9 e8 27 17 0c 00 44 89 f0 31 d2 <66> f7 f3 89 c3 ff c3 4d 8d b7 88 00 00 00 4c 89 f0 48 c1 e8 03 42 RSP: 0018:ffff88810bc0f858 EFLAGS: 00010246 RAX: 00000000000002a0 RBX: 0000000000000000 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: ffff88810bc0f7c0 RDI: ffffc90002dcb66f RBP: ffff88810bc0f880 R08: aa69db2dda70ff01 R09: 0000ffaaaaaaaaaa R10: 0084000000ffaaaa R11: 0000000000000000 R12: ffff88810d65a084 R13: dffffc0000000000 R14: 00000000000002a0 R15: ffff88810d65a000 FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000100 CR3: 0000000103268003 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> l2cap_le_connect_req net/bluetooth/l2cap_core.c:4902 [inline] l2cap_le_sig_cmd net/bluetooth/l2cap_core.c:5420 [inline] l2cap_le_sig_channel net/bluetooth/l2cap_core.c:5486 [inline] l2cap_recv_frame+0xe59d/0x11710 net/bluetooth/l2cap_core.c:6809 l2cap_recv_acldata+0x544/0x10a0 net/bluetooth/l2cap_core.c:7506 hci_acldata_packet net/bluetooth/hci_core.c:3939 [inline] hci_rx_work+0x5e5/0xb20 net/bluetooth/hci_core.c:4176 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0x90f/0x1530 kernel/workqueue.c:3335 worker_thread+0x926/0xe70 kernel/workqueue.c:3416 kthread+0x2e3/0x380 kernel/kthread.c:388 ret_from_fork+0x5c/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- Fixes: 6ed58ec520ad ("Bluetooth: Use LE buffers for LE traffic") Suggested-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Signed-off-by: Sungwoo Kim <iam@sung-woo.kim> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: qca: Fix error code in qca_read_fw_build_info()Dan Carpenter1-1/+3
Return -ENOMEM on allocation failure. Don't return success. Fixes: cda0d6a198e2 ("Bluetooth: qca: fix info leak when fetching fw build id") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_conn: Use __counted_by() and avoid -Wfamnae warningGustavo A. R. Silva2-23/+17
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). Also, -Wflex-array-member-not-at-end is coming in GCC-14, and we are getting ready to enable it globally. So, use the `DEFINE_FLEX()` helper for an on-stack definition of a flexible structure where the size of the flexible-array member is known at compile-time, and refactor the rest of the code, accordingly. With these changes, fix the following warning: net/bluetooth/hci_conn.c:669:41: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Link: https://github.com/KSPP/linux/issues/202 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btintel: Add support for Filmore Peak2 (BE201)Kiran K1-0/+1
Add VID/PID for Intel Filmore Peak2 (BE201) Device from /sys/kernel/debug/usb/devices: T: Bus=09 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 2.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=8087 ProdID=0037 Rev= 0.00 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btintel: Add support for BlazarIKiran K1-0/+3
Add support for BlazarI (cnvi) bluetooth core. Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysLE Create Connection command timeout increased to 20 secsMahesh Talewad2-2/+1
On our DUT, we can see that the host issues create connection cancel command after 4-sec if there is no connection complete event for LE create connection cmd. As per core spec v5.3 section 7.8.5, advertisement interval range is- Advertising_Interval_Min Default : 0x0800(1.28s) Time Range: 20ms to 10.24s Advertising_Interval_Max Default : 0x0800(1.28s) Time Range: 20ms to 10.24s If the remote device is using adv interval of > 4 sec, it is difficult to make a connection with the current timeout value. Also, with the default interval of 1.28 sec, we will get only 3 chances to capture the adv packets with the 4 sec window. Hence we want to increase this timeout to 20sec. Signed-off-by: Mahesh Talewad <mahesh.talewad@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysdt-bindings: net: bluetooth: Add MediaTek MT7921S SDIO BluetoothChen-Yu Tsai2-0/+56
The MediaTek MT7921S is a WiFi/Bluetooth combo chip that works over SDIO. WiFi and Bluetooth are separate SDIO functions within the chip. While the Bluetooth SDIO function is fully discoverable, the chip has a pin that can reset just the Bluetooth core, as opposed to the full chip. This should be described in the device tree. Add a device tree binding for the Bluetooth SDIO function of the MT7921S specifically to document the reset line. This binding is based on the MMC controller binding, which specifies one device node per SDIO function. Cc: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: compute LE flow credits based on recvbuf spaceSebastian Urban3-26/+132
Previously LE flow credits were returned to the sender even if the socket's receive buffer was full. This meant that no back-pressure was applied to the sender, thus it continued to send data, resulting in data loss without any error being reported. Furthermore, the amount of credits was essentially fixed to a small amount, leading to reduced performance. This is fixed by computing the number of returned LE flow credits based on the estimated available space in the receive buffer of an L2CAP socket. Consequently, if the receive buffer is full, no credits are returned until the buffer is read and thus cleared by user-space. Since the computation of available receive buffer space can only be performed approximately (due to sk_buff overhead) and the receive buffer size may be changed by user-space after flow credits have been sent, superfluous received data is temporary stored within l2cap_pinfo. This is necessary because Bluetooth LE provides no retransmission mechanism once the data has been acked by the physical layer. If receive buffer space estimation is not possible at the moment, we fall back to providing credits for one full packet as before. This is currently the case during connection setup, when MPS is not yet available. Fixes: b1c325c23d75 ("Bluetooth: Implement returning of LE L2CAP credits") Signed-off-by: Sebastian Urban <surban@surban.net> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_sync: Use cmd->num_cis instead of magic numberGustavo A. R. Silva1-1/+1
At the moment of the check, `cmd->num_cis` holds the value of 0x1f, which is the max number of elements in the `cmd->cis[]` array at declaration, which is 0x1f. So, avoid using 0x1f directly, and instead use `cmd->num_cis`. Similarly to this other patch[1]. Link: https://lore.kernel.org/linux-hardening/ZivaHUQyDDK9fXEk@neat/ [1] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_conn: Use struct_size() in hci_le_big_create_sync()Gustavo A. R. Silva1-1/+1
Use struct_size() instead of the open-coded version. Similarly to this other patch[1]. Link: https://lore.kernel.org/linux-hardening/ZiwwPmCvU25YzWek@neat/ [1] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: qca: clean up definesJohan Hovold1-30/+30
Clean up the QCA driver defines by dropping redundant parentheses around values and making sure they are aligned (using tabs only). Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: qca: drop bogus module versionJohan Hovold1-4/+1
Random module versions serves no purpose, what matters is the kernel version. Drop the bogus module version which has never been updated. Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: qca: drop bogus edl header checksJohan Hovold1-20/+0
The skb->data pointer is never NULL so drop the bogus sanity checks when initialising the EDL header pointer. Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysdt-bindings: net: broadcom-bluetooth: Add CYW43439 DT bindingMarek Vasut1-14/+19
CYW43439 is a Wi-Fi + Bluetooth combo device from Infineon. The Bluetooth part is capable of Bluetooth 5.2 BR/EDR/LE . This chip is present e.g. on muRata 1YN module. Extend the binding with its DT compatible using fallback compatible string to "brcm,bcm4329-bt" which seems to be the oldest compatible device. This should also prevent the growth of compatible string tables in drivers. The existing block of compatible strings is retained. Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_conn: Use __counted_by() to avoid -Wfamnae warningGustavo A. R. Silva2-16/+12
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). Also, -Wflex-array-member-not-at-end is coming in GCC-14, and we are getting ready to enable it globally. So, use the `DEFINE_FLEX()` helper for an on-stack definition of a flexible structure where the size of the flexible-array member is known at compile-time, and refactor the rest of the code, accordingly. With these changes, fix the following warning: net/bluetooth/hci_conn.c:2116:50: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Link: https://github.com/KSPP/linux/issues/202 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_conn, hci_sync: Use __counted_by() to avoid -Wfamnae warningsGustavo A. R. Silva3-55/+40
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). Also, -Wflex-array-member-not-at-end is coming in GCC-14, and we are getting ready to enable it globally. So, use the `DEFINE_FLEX()` helper for multiple on-stack definitions of a flexible structure where the size of the flexible-array member is known at compile-time, and refactor the rest of the code, accordingly. Notice that, due to the use of `__counted_by()` in `struct hci_cp_le_create_cis`, the for loop in function `hci_cs_le_create_cis()` had to be modified. Once the index `i`, through which `cp->cis[i]` is accessed, falls in the interval [0, cp->num_cis), `cp->num_cis` cannot be decremented all the way down to zero while accessing `cp->cis[]`: net/bluetooth/hci_event.c:4310: 4310 for (i = 0; cp->num_cis; cp->num_cis--, i++) { ... 4314 handle = __le16_to_cpu(cp->cis[i].cis_handle); otherwise, only half (one iteration before `cp->num_cis == i`) or half plus one (one iteration before `cp->num_cis < i`) of the items in the array will be accessed before running into an out-of-bounds issue. So, in order to avoid this, set `cp->num_cis` to zero just after the for loop. Also, make use of `aux_num_cis` variable to update `cmd->num_cis` after a `list_for_each_entry_rcu()` loop. With these changes, fix the following warnings: net/bluetooth/hci_sync.c:1239:56: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] net/bluetooth/hci_sync.c:1415:51: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] net/bluetooth/hci_sync.c:1731:51: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] net/bluetooth/hci_sync.c:6497:45: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Link: https://github.com/KSPP/linux/issues/202 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btusb: Sort usb_device_id table by the IDJiande Lu1-18/+15
Sort usb device id table for enhanced readability. Signed-off-by: Jiande Lu <jiande.lu@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btusb: Add USB HW IDs for MT7921/MT7922/MT7925Jiande Lu1-0/+24
Add HW IDs for wireless module specific to Acer/ASUS notebook models to ensure proper recognition and functionality. These HW IDs are extracted from Windows driver inf file. Note some HW IDs without official drivers, still in testing phase. Thus, we update module HW ID and test ensure consistent boot success. Signed-off-by: Jiande Lu <jiande.lu@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: qca: Support downloading board id specific NVM for WCN7850Zijun Hu1-3/+15
Download board id specific NVM instead of default for WCN7850 if board id is available. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: Populate hci_set_hw_info for Intel and RealtekArchie Pusaka2-0/+16
The hardware information surfaced via debugfs might be usable by the userspace to set some configuration knobs. This patch sets the hw_info for Intel and Realtek chipsets. Below are some possible output of the hardware_info debugfs file. INTEL platform=55 variant=24 RTL lmp_subver=34898 hci_rev=10 hci_ver=11 hci_bus=1 Signed-off-by: Archie Pusaka <apusaka@chromium.org> Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: Remove 3 repeated macro definitionsZijun Hu1-4/+0
Macros HCI_REQ_DONE, HCI_REQ_PEND and HCI_REQ_CANCELED are repeatedly defined twice with hci_request.h, so remove a copy of definition. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_conn: Remove a redundant check for HFP offloadZijun Hu1-4/+4
Remove a redundant check !hdev->get_codec_config_data. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btusb: Correct timeout macro argument used to receive control messageZijun Hu1-3/+3
USB driver defines macro @USB_CTRL_SET_TIMEOUT for sending control message timeout and @USB_CTRL_GET_TIMEOUT for receiving, but usb_control_msg() uses wrong macro @USB_CTRL_SET_TIMEOUT as argument to receive control message, fixed by using @USB_CTRL_GET_TIMEOUT to receive message. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btusb: Fix the patch for MT7920 the affected to MT7921Peter Tsao1-0/+1
Because both MT7920 and MT7921 use the same chip ID. We use the 8th bit of fw_flavor to distingush MT7920. The original patch made a mistake to check whole fw_flavor, that makes the condition both true (dev_id == 0x7961 && fw_flavor), and makes MT7921 flow wrong. In this patch, we correct the flow to get the 8th bit value for MT7920. And the patch is verified pass with both MT7920 and MT7921. Signed-off-by: Peter Tsao <peter.tsao@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: ath3k: Fix multiple issues reported by checkpatch.plUri Arev1-14/+11
This fixes some CHECKs reported by the checkpatch script. Issues reported in ath3k.c: ------- ath3k.c ------- CHECK: Please don't use multiple blank lines + + CHECK: Blank lines aren't necessary after an open brace '{' +static const struct usb_device_id ath3k_blist_tbl[] = { + CHECK: Alignment should match open parenthesis +static int ath3k_load_firmware(struct usb_device *udev, + const struct firmware *firmware) CHECK: Alignment should match open parenthesis + err = usb_bulk_msg(udev, pipe, send_buf, size, + &len, 3000); CHECK: Unnecessary parentheses around 'len != size' + if (err || (len != size)) { CHECK: Alignment should match open parenthesis +static int ath3k_get_version(struct usb_device *udev, + struct ath3k_version *version) CHECK: Alignment should match open parenthesis +static int ath3k_load_fwfile(struct usb_device *udev, + const struct firmware *firmware) CHECK: Alignment should match open parenthesis + err = usb_bulk_msg(udev, pipe, send_buf, size, + &len, 3000); CHECK: Unnecessary parentheses around 'len != size' + if (err || (len != size)) { CHECK: Blank lines aren't necessary after an open brace '{' + switch (fw_version.ref_clock) { + CHECK: Alignment should match open parenthesis + snprintf(filename, ATH3K_NAME_LEN, "ar3k/ramps_0x%08x_%d%s", + le32_to_cpu(fw_version.rom_version), clk_value, ".dfu"); CHECK: Alignment should match open parenthesis +static int ath3k_probe(struct usb_interface *intf, + const struct usb_device_id *id) CHECK: Alignment should match open parenthesis + BT_ERR("Firmware file \"%s\" not found", + ATH3K_FIRMWARE); CHECK: Alignment should match open parenthesis + BT_ERR("Firmware file \"%s\" request failed (err=%d)", + ATH3K_FIRMWARE, ret); total: 0 errors, 0 warnings, 14 checks, 540 lines checked Signed-off-by: Uri Arev <me@wantyapps.xyz> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_bcm: Limit bcm43455 baudrate to 2000000Hans de Goede1-1/+1
Like the bcm43430a0 the bcm43455 BT does not support the 0xfc45 command to set the UART clock to 48 MHz and because of this it does not work at 4000000 baud. These chips are found on ACPI/x86 devices where the operating baudrate does not come from the firmware but is hardcoded at 4000000, which does not work. Make the driver_data for the "BCM2EA4" ACPI HID which is used for the bcm43455 BT point to bcm43430_device_data which limits the baudrate to 2000000. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: L2CAP: Avoid -Wflex-array-member-not-at-end warningsGustavo A. R. Silva2-33/+35
-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting ready to enable it globally. There are currently a couple of objects (`req` and `rsp`), in a couple of structures, that contain flexible structures (`struct l2cap_ecred_conn_req` and `struct l2cap_ecred_conn_rsp`), for example: struct l2cap_ecred_rsp_data { struct { struct l2cap_ecred_conn_rsp rsp; __le16 scid[L2CAP_ECRED_MAX_CID]; } __packed pdu; int count; }; in the struct above, `struct l2cap_ecred_conn_rsp` is a flexible structure: struct l2cap_ecred_conn_rsp { __le16 mtu; __le16 mps; __le16 credits; __le16 result; __le16 dcid[]; }; So, in order to avoid ending up with a flexible-array member in the middle of another structure, we use the `struct_group_tagged()` (and `__struct_group()` when the flexible structure is `__packed`) helper to separate the flexible array from the rest of the members in the flexible structure: struct l2cap_ecred_conn_rsp { struct_group_tagged(l2cap_ecred_conn_rsp_hdr, hdr, ... the rest of members ); __le16 dcid[]; }; With the change described above, we now declare objects of the type of the tagged struct, in this example `struct l2cap_ecred_conn_rsp_hdr`, without embedding flexible arrays in the middle of other structures: struct l2cap_ecred_rsp_data { struct { struct l2cap_ecred_conn_rsp_hdr rsp; __le16 scid[L2CAP_ECRED_MAX_CID]; } __packed pdu; int count; }; Also, when the flexible-array member needs to be accessed, we use `container_of()` to retrieve a pointer to the flexible structure. We also use the `DEFINE_RAW_FLEX()` helper for a couple of on-stack definitions of a flexible structure where the size of the flexible-array member is known at compile-time. So, with these changes, fix the following warnings: net/bluetooth/l2cap_core.c:1260:45: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] net/bluetooth/l2cap_core.c:3740:45: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] net/bluetooth/l2cap_core.c:4999:45: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] net/bluetooth/l2cap_core.c:7116:47: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Link: https://github.com/KSPP/linux/issues/202 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_intel: Fix multiple issues reported by checkpatch.plUri Arev1-9/+10
This fixes the following CHECKs, WARNINGs, and ERRORs reported in hci_intel.c Reported by checkpatch.pl: ----------- hci_intel.c ----------- WARNING: Prefer using '"%s...", __func__' to using 'intel_setup', this function's name, in a string + bt_dev_dbg(hdev, "start intel_setup"); ERROR: code indent should use tabs where possible + /* Check for supported iBT hardware variants of this firmware$ ERROR: code indent should use tabs where possible + * loading method.$ ERROR: code indent should use tabs where possible + *$ ERROR: code indent should use tabs where possible + * This check has been put in place to ensure correct forward$ ERROR: code indent should use tabs where possible + * compatibility options when newer hardware variants come along.$ ERROR: code indent should use tabs where possible + */$ CHECK: No space is necessary after a cast + duration = (unsigned long long) ktime_to_ns(delta) >> 10; CHECK: No space is necessary after a cast + duration = (unsigned long long) ktime_to_ns(delta) >> 10; WARNING: Missing a blank line after declarations + int err = PTR_ERR(intel->rx_skb); + bt_dev_err(hu->hdev, "Frame reassembly failed (%d)", err); Signed-off-by: Uri Arev <me@wantyapps.xyz> Suggested-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: ISO: Handle PA sync when no BIGInfo reports are generatedIulia Tanasescu3-81/+74
In case of a Broadcast Source that has PA enabled but no active BIG, a Broadcast Sink needs to establish PA sync and parse BASE from PA reports. This commit moves the allocation of a PA sync hcon from the BIGInfo advertising report event to the PA sync established event. After the first complete PA report, the hcon is notified to the ISO layer. A child socket is allocated and enqueued in the parent's accept queue. BIGInfo reports also need to be processed, to extract the encryption field and inform userspace. After the first BIGInfo report is received, the PA sync hcon is notified again to the ISO layer. Since a socket will be found this time, the socket state will transition to BT_CONNECTED and the userspace will be woken up using sk_state_change. Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: ISO: Make iso_get_sock_listen genericIulia Tanasescu2-34/+43
This makes iso_get_sock_listen more generic, to return matching socket in the state provided as argument. Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_event: Set DISCOVERY_FINDING on SCAN_ENABLEDLuiz Augusto von Dentz2-4/+10
This makes sure that discovery state is properly synchronized otherwise reports may not generate MGMT DeviceFound events as it would be assumed that it was not initiated by a discovery session. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: Add proper definitions for scan interval and windowLuiz Augusto von Dentz2-10/+24
This adds proper definitions for scan interval and window and then make use of them instead their values. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_intel: Convert to platform remove callback returning voidUwe Kleine-König1-4/+2
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_bcm: Convert to platform remove callback returning voidUwe Kleine-König1-4/+2
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btqcomsmd: Convert to platform remove callback returning voidUwe Kleine-König1-4/+2
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: Add support for MediaTek MT7922 deviceIan W MORRISON1-0/+5
This patch adds support for the MediaTek MT7922 Bluetooth device. The information in /sys/kernel/debug/usb/devices about the MT7922 is as follows: T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3585 Rev= 1.00 S: Manufacturer=MediaTek Inc. S: Product=Wireless_Device S: SerialNumber=000000000 C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA A: FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01 I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=125us E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us I: If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 512 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 512 Ivl=125us Signed-off-by: Ian W MORRISON <ianwmorrison@live.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btintel: Add support to download intermediate loaderKiran K2-1/+40
Some variants of Intel controllers like BlazarI supports downloading of Intermediate bootloader (IML) image. IML gives flexibility to fix issues as its not possible to fix issue in Primary bootloader once flashed to ROM. This patch adds the support to download IML before downloading operational firmware image. dmesg logs: [13.399003] Bluetooth: Core ver 2.22 [13.399006] Bluetooth: Starting self testing [13.401194] Bluetooth: ECDH test passed in 2135 usecs [13.421175] Bluetooth: SMP test passed in 597 usecs [13.421184] Bluetooth: Finished self testing [13.422919] Bluetooth: HCI device and connection manager initialized [13.422923] Bluetooth: HCI socket layer initialized [13.422925] Bluetooth: L2CAP socket layer initialized [13.422930] Bluetooth: SCO socket layer initialized [13.458065] Bluetooth: hci0: Device revision is 0 [13.458071] Bluetooth: hci0: Secure boot is disabled [13.458072] Bluetooth: hci0: OTP lock is disabled [13.458072] Bluetooth: hci0: API lock is enabled [13.458073] Bluetooth: hci0: Debug lock is disabled [13.458073] Bluetooth: hci0: Minimum firmware build 1 week 10 2014 [13.458075] Bluetooth: hci0: Bootloader timestamp 2022.46 buildtype 1 build 26590 [13.458324] Bluetooth: hci0: DSM reset method type: 0x00 [13.460678] Bluetooth: hci0: Found device firmware: intel/ibt-0090-0291-iml.sfi [13.460684] Bluetooth: hci0: Boot Address: 0x30099000 [13.460685] Bluetooth: hci0: Firmware Version: 227-11.24 [13.562554] Bluetooth: hci0: Waiting for firmware download to complete [13.563023] Bluetooth: hci0: Firmware loaded in 99941 usecs [13.563057] Bluetooth: hci0: Waiting for device to boot [13.565029] Bluetooth: hci0: Malformed MSFT vendor event: 0x02 [13.565148] Bluetooth: hci0: Device booted in 2064 usecs [13.567065] Bluetooth: hci0: No device address configured [13.569010] Bluetooth: hci0: Found device firmware: intel/ibt-0090-0291.sfi [13.569061] Bluetooth: hci0: Boot Address: 0x10000800 [13.569062] Bluetooth: hci0: Firmware Version: 227-11.24 [13.788891] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 [13.788897] Bluetooth: BNEP filters: protocol multicast [13.788902] Bluetooth: BNEP socket layer initialized [15.435905] Bluetooth: hci0: Waiting for firmware download to complete [15.436016] Bluetooth: hci0: Firmware loaded in 1823233 usecs [15.436258] Bluetooth: hci0: Waiting for device to boot [15.471140] Bluetooth: hci0: Device booted in 34277 usecs [15.471201] Bluetooth: hci0: Malformed MSFT vendor event: 0x02 [15.471487] Bluetooth: hci0: Found Intel DDC parameters: intel/ibt-0090-0291.ddc [15.474353] Bluetooth: hci0: Applying Intel DDC parameters completed [15.474486] Bluetooth: hci0: Found Intel DDC parameters: intel/bdaddress.cfg [15.475299] Bluetooth: hci0: Applying Intel DDC parameters completed [15.479381] Bluetooth: hci0: Firmware timestamp 2024.10 buildtype 3 build 58595 [15.479385] Bluetooth: hci0: Firmware SHA1: 0xb4f3cc46 [15.483243] Bluetooth: hci0: Fseq status: Success (0x00) [15.483246] Bluetooth: hci0: Fseq executed: 00.00.00.00 [15.483247] Bluetooth: hci0: Fseq BT Top: 00.00.00.00 [15.578712] Bluetooth: MGMT ver 1.22 [15.822682] Bluetooth: RFCOMM TTY layer initialized [15.822690] Bluetooth: RFCOMM socket layer initialized [15.822695] Bluetooth: RFCOMM ver 1.11 Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: btintel: Define macros for image typesKiran K2-6/+9
Use macro for image type instead of using hard code number. Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
5 daysnet: revert partially applied PHY topology seriesJakub Kicinski22-385/+8
The series is causing issues with PHY drivers built as modules. Since it was only partially applied and the merge window has opened let's revert and try again for v6.11. Revert 6916e461e793 ("net: phy: Introduce ethernet link topology representation") Revert 0ec5ed6c130e ("net: sfp: pass the phy_device when disconnecting an sfp module's PHY") Revert e75e4e074c44 ("net: phy: add helpers to handle sfp phy connect/disconnect") Revert fdd353965b52 ("net: sfp: Add helper to return the SFP bus name") Revert 841942bc6212 ("net: ethtool: Allow passing a phy index for some commands") Link: https://lore.kernel.org/all/171242462917.4000.9759453824684907063.git-patchwork-notify@kernel.org/ Link: https://lore.kernel.org/all/20240507102822.2023826-1-maxime.chevallier@bootlin.com/ Link: https://lore.kernel.org/r/20240513154156.104281-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysMerge branch 'move-est-lock-and-est-structure-to-struct-stmmac_priv'Jakub Kicinski5-69/+70
Xiaolei Wang says: ==================== Move EST lock and EST structure to struct stmmac_priv 1. Pulling the mutex protecting the EST structure out to avoid clearing it during reinit/memset of the EST structure,and reacquire the mutex lock when doing this initialization. 2. Moving the EST structure to a more logical location ==================== Link: https://lore.kernel.org/r/20240513014346.1718740-1-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: stmmac: move the EST structure to struct stmmac_privXiaolei Wang5-56/+54
Move the EST structure to struct stmmac_priv, because the EST configs don't look like platform config, but EST is enabled in runtime with the settings retrieved for the TC TAPRIO feature also in runtime. So it's better to have the EST-data preserved in the driver private data instead of the platform data storage. Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20240513014346.1718740-3-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: stmmac: move the EST lock to struct stmmac_privXiaolei Wang4-13/+16
Reinitialize the whole EST structure would also reset the mutex lock which is embedded in the EST structure, and then trigger the following warning. To address this, move the lock to struct stmmac_priv. We also need to reacquire the mutex lock when doing this initialization. DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 3 PID: 505 at kernel/locking/mutex.c:587 __mutex_lock+0xd84/0x1068 Modules linked in: CPU: 3 PID: 505 Comm: tc Not tainted 6.9.0-rc6-00053-g0106679839f7-dirty #29 Hardware name: NXP i.MX8MPlus EVK board (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __mutex_lock+0xd84/0x1068 lr : __mutex_lock+0xd84/0x1068 sp : ffffffc0864e3570 x29: ffffffc0864e3570 x28: ffffffc0817bdc78 x27: 0000000000000003 x26: ffffff80c54f1808 x25: ffffff80c9164080 x24: ffffffc080d723ac x23: 0000000000000000 x22: 0000000000000002 x21: 0000000000000000 x20: 0000000000000000 x19: ffffffc083bc3000 x18: ffffffffffffffff x17: ffffffc08117b080 x16: 0000000000000002 x15: ffffff80d2d40000 x14: 00000000000002da x13: ffffff80d2d404b8 x12: ffffffc082b5a5c8 x11: ffffffc082bca680 x10: ffffffc082bb2640 x9 : ffffffc082bb2698 x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001 x5 : ffffff8178fe0d48 x4 : 0000000000000000 x3 : 0000000000000027 x2 : ffffff8178fe0d50 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: __mutex_lock+0xd84/0x1068 mutex_lock_nested+0x28/0x34 tc_setup_taprio+0x118/0x68c stmmac_setup_tc+0x50/0xf0 taprio_change+0x868/0xc9c Fixes: b2aae654a479 ("net: stmmac: add mutex lock to protect est parameters") Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20240513014346.1718740-2-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysMerge branch 'mptcp-small-improvements-fix-and-clean-ups'Jakub Kicinski10-24/+143
Mat Martineau says: ==================== mptcp: small improvements, fix and clean-ups This series contain mostly unrelated patches: - The two first patches can be seen as "fixes". They are part of this series for -next because it looks like the last batch of fixes for v6.9 has already been sent. These fixes are not urgent, so they can wait if an unlikely v6.9-rc8 is published. About the two patches: - Patch 1 fixes getsockopt(SO_KEEPALIVE) support on MPTCP sockets - Patch 2 makes sure the full TCP keep-alive feature is supported, not just SO_KEEPALIVE. - Patch 3 is a small optimisation when getsockopt(MPTCP_INFO) is used without buffer, just to check if MPTCP is still being used: no fallback to TCP. - Patch 4 adds net.mptcp.available_schedulers sysctl knob to list packet schedulers, similar to net.ipv4.tcp_available_congestion_control. - Patch 5 and 6 fix CheckPatch warnings: "prefer strscpy over strcpy" and "else is not generally useful after a break or return". - Patch 7 and 8 remove and add header includes to avoid unused ones, and add missing ones to be self-contained. ==================== Link: https://lore.kernel.org/r/20240514011335.176158-1-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysmptcp: include inet_common in mib.hMatthieu Baerts (NGI0)1-0/+2
So this file is now self-contained: it can be compiled alone with analytic tools. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-9-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysmptcp: move mptcp_pm_gen.h's includeMatthieu Baerts (NGI0)3-2/+2
Nothing from protocol.h depends on mptcp_pm_gen.h, only code from pm_netlink.c and pm_userspace.c depends on it. So this include can be moved where it is needed to avoid a "unused includes" warning. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-8-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysmptcp: remove unnecessary else statementsMatthieu Baerts (NGI0)1-15/+17
The 'else' statements are not needed here, because their previous 'if' block ends with a 'return'. This fixes CheckPatch warnings: WARNING: else is not generally useful after a break or return Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-7-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysmptcp: prefer strscpy over strcpyMatthieu Baerts (NGI0)3-4/+5
strcpy() performs no bounds checking on the destination buffer. This could result in linear overflows beyond the end of the buffer, leading to all kinds of misbehaviors. The safe replacement is strscpy() [1]. This is in preparation of a possible future step where all strcpy() uses will be removed in favour of strscpy() [2]. This fixes CheckPatch warnings: WARNING: Prefer strscpy over strcpy Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy [1] Link: https://github.com/KSPP/linux/issues/88 [2] Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-6-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysmptcp: add net.mptcp.available_schedulersGregory Detal4-1/+52
The sysctl lists the available schedulers that can be set using net.mptcp.scheduler similarly to net.ipv4.tcp_available_congestion_control. Signed-off-by: Gregory Detal <gregory.detal@gmail.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Tested-by: Geliang Tang <geliang@kernel.org> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-5-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysmptcp: sockopt: info: stop early if no bufferMatthieu Baerts (NGI0)1-0/+4
Up to recently, it has been recommended to use getsockopt(MPTCP_INFO) to check if a fallback to TCP happened, or if the client requested to use MPTCP. In this case, the userspace app is only interested by the returned value of the getsocktop() call, and can then give 0 for the option length, and NULL for the buffer address. An easy optimisation is then to stop early, and avoid filling a local buffer -- which now requires two different locks -- if it is not needed. Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-4-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysmptcp: fix full TCP keep-alive supportMatthieu Baerts (NGI0)2-0/+61
SO_KEEPALIVE support has been added a while ago, as part of a series "adding SOL_SOCKET" support. To have a full control of this keep-alive feature, it is important to also support TCP_KEEP* socket options at the SOL_TCP level. Supporting them on the setsockopt() part is easy, it is just a matter of remembering each value in the MPTCP sock structure, and calling tcp_sock_set_keep*() helpers on each subflow. If the value is not modified (0), calling these helpers will not do anything. For the getsockopt() part, the corresponding value from the MPTCP sock structure or the default one is simply returned. All of this is very similar to other TCP_* socket options supported by MPTCP. It looks important for kernels supporting SO_KEEPALIVE, to also support TCP_KEEP* options as well: some apps seem to (wrongly) consider that if the former is supported, the latter ones will be supported as well. But also, not having this simple and isolated change is preventing MPTCP support in some apps, and libraries like GoLang [1]. This is why this patch is seen as a fix. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/383 Fixes: 1b3e7ede1365 ("mptcp: setsockopt: handle SO_KEEPALIVE and SO_PRIORITY") Link: https://github.com/golang/go/issues/56539 [1] Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-3-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysmptcp: SO_KEEPALIVE: fix getsockopt supportMatthieu Baerts (NGI0)1-2/+0
SO_KEEPALIVE support has to be set on each subflow: on each TCP socket, where sk_prot->keepalive is defined. Technically, nothing has to be done on the MPTCP socket. That's why mptcp_sol_socket_sync_intval() was called instead of mptcp_sol_socket_intval(). Except that when nothing is done on the MPTCP socket, the getsockopt(SO_KEEPALIVE), handled in net/core/sock.c:sk_getsockopt(), will not know if SO_KEEPALIVE has been set on the different subflows or not. The fix is simple: simply call mptcp_sol_socket_intval() which will end up calling net/core/sock.c:sk_setsockopt() where the SOCK_KEEPOPEN flag will be set, the one used in sk_getsockopt(). So now, getsockopt(SO_KEEPALIVE) on an MPTCP socket will return the same value as the one previously set with setsockopt(SO_KEEPALIVE). Fixes: 1b3e7ede1365 ("mptcp: setsockopt: handle SO_KEEPALIVE and SO_PRIORITY") Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-2-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: mana: Enable MANA driver on ARM64 with 4K page sizeHaiyang Zhang1-1/+2
Change the Kconfig dependency, so this driver can be built and run on ARM64 with 4K page size. 16/64K page sizes are not supported yet. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://lore.kernel.org/r/1715632141-8089-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: prestera: Add flex arrays to some structsErick Archer1-46/+37
The "struct prestera_msg_vtcam_rule_add_req" uses a dynamically sized set of trailing elements. Specifically, it uses an array of structures of type "prestera_msg_acl_action actions_msg". The "struct prestera_msg_flood_domain_ports_set_req" also uses a dynamically sized set of trailing elements. Specifically, it uses an array of structures of type "prestera_msg_acl_action actions_msg". So, use the preferred way in the kernel declaring flexible arrays [1]. At the same time, prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). In this case, it is important to note that the attribute used is specifically __counted_by_le since the counters are of type __le32. The logic does not need to change since the counters for the flexible arrays are asigned before any access to the arrays. The order in which the structure prestera_msg_vtcam_rule_add_req and the structure prestera_msg_flood_domain_ports_set_req are defined must be changed to avoid incomplete type errors. Also, avoid the open-coded arithmetic in memory allocator functions [2] using the "struct_size" macro. Moreover, the new structure members also allow us to avoid the open- coded arithmetic on pointers. So, take advantage of this refactoring accordingly. This code was detected with the help of Coccinelle, and audited and modified manually. Link: https://www.kernel.org/doc/html/next/process/deprecated.html#zero-length-and-one-element-arrays [1] Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [2] Signed-off-by: Erick Archer <erick.archer@outlook.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/AS8PR02MB7237E8469568A59795F1F0408BE12@AS8PR02MB7237.eurprd02.prod.outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysMerge branch 'tcp-support-rstreasons-in-the-passive-logic'Jakub Kicinski4-3/+64
Jason Xing says: ==================== tcp: support rstreasons in the passive logic In this series, I split all kinds of reasons into five part which, I think, can be easily reviewed. I respectively implement corresponding rstreasons in those functions. After this, we can trace the whole tcp passive reset with clear reasons. ==================== Link: https://lore.kernel.org/r/20240510122502.27850-1-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daystcp: rstreason: fully support in tcp_check_req()Jason Xing2-1/+9
We're going to send an RST due to invalid syn packet which is already checked whether 1) it is in sequence, 2) it is a retransmitted skb. As RFC 793 says, if the state of socket is not CLOSED/LISTEN/SYN-SENT, then we should send an RST when receiving bad syn packet: "fourth, check the SYN bit,...If the SYN is in the window it is an error, send a reset" Signed-off-by: Jason Xing <kernelxing@tencent.com> Link: https://lore.kernel.org/r/20240510122502.27850-6-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daystcp: rstreason: handle timewait cases in the receive pathJason Xing3-2/+7
There are two possible cases where TCP layer can send an RST. Since they happen in the same place, I think using one independent reason is enough to identify this special situation. Signed-off-by: Jason Xing <kernelxing@tencent.com> Link: https://lore.kernel.org/r/20240510122502.27850-5-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daystcp: rstreason: fully support in tcp_rcv_state_process()Jason Xing1-0/+18
Like the previous patch does in this series, finish the conversion map is enough to let rstreason mechanism work in this function. Signed-off-by: Jason Xing <kernelxing@tencent.com> Link: https://lore.kernel.org/r/20240510122502.27850-4-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daystcp: rstreason: fully support in tcp_ack()Jason Xing1-0/+13
Based on the existing skb drop reason, updating the rstreason map can help us finish the rstreason job in this function. Signed-off-by: Jason Xing <kernelxing@tencent.com> Link: https://lore.kernel.org/r/20240510122502.27850-3-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daystcp: rstreason: fully support in tcp_rcv_synsent_state_process()Jason Xing1-0/+17
In this function, only updating the map can finish the job for socket reset reason because the corresponding drop reasons are ready. Signed-off-by: Jason Xing <kernelxing@tencent.com> Link: https://lore.kernel.org/r/20240510122502.27850-2-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysMerge branch 'net-stmmac-add-support-for-rzn1-gmac-devices'Jakub Kicinski10-77/+272
Romain Gantois says: ==================== net: stmmac: Add support for RZN1 GMAC devices This is version seven of my series that adds support for a Gigabit Ethernet controller featured in the Renesas r9a06g032 SoC, of the RZ/N1 family. This GMAC device is based on a Synopsys IP and is compatible with the stmmac driver. My former colleague Clément Léger originally sent a series for this driver, but an issue in bringing up the PCS clock had blocked the upstreaming process. This issue has since been resolved by the following series: https://lore.kernel.org/all/20240326-rxc_bugfix-v6-0-24a74e5c761f@bootlin.com/ This series consists of a devicetree binding describing the RZN1 GMAC controller IP, a node for the GMAC1 device in the r9a06g032 SoC device tree, and the GMAC driver itself which is a glue layer in stmmac. There are also two patches by Russell that improve pcs initialization handling in stmmac. ==================== Link: https://lore.kernel.org/r/20240513-rzn1-gmac1-v7-0-6acf58b5440d@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: stmmac: add support for RZ/N1 GMACClément Léger4-0/+105
Add support for the Renesas RZ/N1 GMAC. This support can make use of a custom RZ/N1 PCS which is fetched by parsing the pcs-handle device tree property. Signed-off-by: Clément Léger <clement.leger@bootlin.com> Co-developed-by: Romain Gantois <romain.gantois@bootlin.com> Signed-off-by: Romain Gantois <romain.gantois@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://lore.kernel.org/r/20240513-rzn1-gmac1-v7-6-6acf58b5440d@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: stmmac: dwmac-socfpga: use pcs_init/pcs_exitRussell King (Oracle)1-54/+53
Use the newly introduced pcs_init() and pcs_exit() operations to create and destroy the PCS instance at a more appropriate moment during the driver lifecycle, thereby avoiding publishing a network device to userspace that has not yet finished its PCS initialisation. There are other similar issues with this driver which remain unaddressed, but these are out of scope for this patch. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> [rgantois: removed second parameters of new callbacks] Signed-off-by: Romain Gantois <romain.gantois@bootlin.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://lore.kernel.org/r/20240513-rzn1-gmac1-v7-5-6acf58b5440d@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: stmmac: introduce pcs_init/pcs_exit stmmac operationsRussell King (Oracle)2-1/+9
Introduce a mechanism whereby platforms can create their PCS instances prior to the network device being published to userspace, but after some of the core stmmac initialisation has been completed. This means that the data structures that platforms need will be available. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Co-developed-by: Romain Gantois <romain.gantois@bootlin.com> Signed-off-by: Romain Gantois <romain.gantois@bootlin.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://lore.kernel.org/r/20240513-rzn1-gmac1-v7-4-6acf58b5440d@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: stmmac: Make stmmac_xpcs_setup() generic to all PCS devicesSerge Semin3-19/+23
A pcs_init() callback will be introduced to stmmac in a future patch. This new function will be called during the hardware initialization phase. Instead of separately initializing XPCS and PCS components, let's group all PCS-related hardware initialization logic in the current stmmac_xpcs_setup() function. Rename stmmac_xpcs_setup() to stmmac_pcs_setup() and move the conditional call to stmmac_xpcs_setup() inside the function itself. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Co-developed-by: Romain Gantois <romain.gantois@bootlin.com> Signed-off-by: Romain Gantois <romain.gantois@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://lore.kernel.org/r/20240513-rzn1-gmac1-v7-3-6acf58b5440d@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: stmmac: Add dedicated XPCS cleanup methodSerge Semin3-4/+17
Currently the XPCS handler destruction is performed in the stmmac_mdio_unregister() method. It doesn't look good because the handler isn't originally created in the corresponding protagonist stmmac_mdio_unregister(), but in the stmmac_xpcs_setup() function. In order to have more coherent MDIO and XPCS setup/cleanup procedures, let's move the DW XPCS destruction to the dedicated stmmac_pcs_clean() method. This method will also be used to cleanup PCS hardware using the pcs_exit() callback that will be introduced to stmmac in a subsequent patch. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Co-developed-by: Romain Gantois <romain.gantois@bootlin.com> Signed-off-by: Romain Gantois <romain.gantois@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://lore.kernel.org/r/20240513-rzn1-gmac1-v7-2-6acf58b5440d@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysdt-bindings: net: renesas,rzn1-gmac: Document RZ/N1 GMAC supportClément Léger1-0/+66
The RZ/N1 series of MPUs feature up to two Gigabit Ethernet controllers. These controllers are based on Synopsys IPs. They can be connected to RZ/N1 RGMII/RMII converters. Add a binding that describes these GMAC devices. Signed-off-by: Clément Léger <clement.leger@bootlin.com> [rgantois: commit log] Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Romain Gantois <romain.gantois@bootlin.com> Link: https://lore.kernel.org/r/20240513-rzn1-gmac1-v7-1-6acf58b5440d@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: qede: flower: validate control flagsAsbjørn Sloth Tønnesen1-0/+3
This driver currently doesn't support any control flags. Use flow_rule_match_has_control_flags() to check for control flags, such as can be set through `tc flower ... ip_flags frag`. In case any control flags are masked, flow_rule_match_has_control_flags() sets a NL extended error message, and we return -EOPNOTSUPP. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240511073705.230507-1-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysMerge branch 'virtio_net-rx-enable-premapped-mode-by-default'Jakub Kicinski2-59/+38
Xuan Zhuo says: ==================== virtio_net: rx enable premapped mode by default Actually, for the virtio drivers, we can enable premapped mode whatever the value of use_dma_api. Because we provide the virtio dma apis. So the driver can enable premapped mode unconditionally. This patch set makes the big mode of virtio-net to support premapped mode. And enable premapped mode for rx by default. Based on the following points, we do not use page pool to manage these pages: 1. virtio-net uses the DMA APIs wrapped by virtio core. Therefore, we can only prevent the page pool from performing DMA operations, and let the driver perform DMA operations on the allocated pages. 2. But when the page pool releases the page, we have no chance to execute dma unmap. 3. A solution to #2 is to execute dma unmap every time before putting the page back to the page pool. (This is actually a waste, we don't execute unmap so frequently.) 4. But there is another problem, we still need to use page.dma_addr to save the dma address. Using page.dma_addr while using page pool is unsafe behavior. 5. And we need space the chain the pages submitted once to virtio core. More: https://lore.kernel.org/all/CACGkMEu=Aok9z2imB_c5qVuujSh=vjj1kx12fy9N7hqyi+M5Ow@mail.gmail.com/ Why we do not use the page space to store the dma? http://lore.kernel.org/all/CACGkMEuyeJ9mMgYnnB42=hw6umNuo=agn7VBqBqYPd7GN=+39Q@mail.gmail.com ==================== Link: https://lore.kernel.org/r/20240511031404.30903-1-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysvirtio_net: remove the misleading commentXuan Zhuo1-1/+0
We call the build_skb() actually without copying data. The comment is misleading. So remove it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20240511031404.30903-5-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysvirtio_net: rx remove premapped failover codeXuan Zhuo1-50/+35
Now, the premapped mode can be enabled unconditionally. So we can remove the failover code for merge and small mode. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20240511031404.30903-4-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysvirtio_net: big mode skip the unmap checkXuan Zhuo1-2/+2
The virtio-net big mode did not enable premapped mode, so we did not need to check the unmap. And the subsequent commit will remove the failover code for failing enable premapped for merge and small mode. So we need to remove the checking do_dma code in the big mode path. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20240511031404.30903-3-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysvirtio_ring: enable premapped mode whatever use_dma_apiXuan Zhuo1-6/+1
Now, we have virtio DMA APIs, the driver can be the premapped mode whatever the virtio core uses dma api or not. So remove the limit of checking use_dma_api from virtqueue_set_dma_premapped(). Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20240511031404.30903-2-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysMerge tag 'for-netdev' of ↵Jakub Kicinski134-4738/+9458
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-05-13 We've added 119 non-merge commits during the last 14 day(s) which contain a total of 134 files changed, 9462 insertions(+), 4742 deletions(-). The main changes are: 1) Add BPF JIT support for 32-bit ARCv2 processors, from Shahab Vahedi. 2) Add BPF range computation improvements to the verifier in particular around XOR and OR operators, refactoring of checks for range computation and relaxing MUL range computation so that src_reg can also be an unknown scalar, from Cupertino Miranda. 3) Add support to attach kprobe BPF programs through kprobe_multi link in a session mode, meaning, a BPF program is attached to both function entry and return, the entry program can decide if the return program gets executed and the entry program can share u64 cookie value with return program. Session mode is a common use-case for tetragon and bpftrace, from Jiri Olsa. 4) Fix a potential overflow in libbpf's ring__consume_n() and improve libbpf as well as BPF selftest's struct_ops handling, from Andrii Nakryiko. 5) Improvements to BPF selftests in context of BPF gcc backend, from Jose E. Marchesi & David Faust. 6) Migrate remaining BPF selftest tests from test_sock_addr.c to prog_test- -style in order to retire the old test, run it in BPF CI and additionally expand test coverage, from Jordan Rife. 7) Big batch for BPF selftest refactoring in order to remove duplicate code around common network helpers, from Geliang Tang. 8) Another batch of improvements to BPF selftests to retire obsolete bpf_tcp_helpers.h as everything is available vmlinux.h, from Martin KaFai Lau. 9) Fix BPF map tear-down to not walk the map twice on free when both timer and wq is used, from Benjamin Tissoires. 10) Fix BPF verifier assumptions about socket->sk that it can be non-NULL, from Alexei Starovoitov. 11) Change BTF build scripts to using --btf_features for pahole v1.26+, from Alan Maguire. 12) Small improvements to BPF reusing struct_size() and krealloc_array(), from Andy Shevchenko. 13) Fix s390 JIT to emit a barrier for BPF_FETCH instructions, from Ilya Leoshkevich. 14) Extend TCP ->cong_control() callback in order to feed in ack and flag parameters and allow write-access to tp->snd_cwnd_stamp from BPF program, from Miao Xu. 15) Add support for internal-only per-CPU instructions to inline bpf_get_smp_processor_id() helper call for arm64 and riscv64 BPF JITs, from Puranjay Mohan. 16) Follow-up to remove the redundant ethtool.h from tooling infrastructure, from Tushar Vyavahare. 17) Extend libbpf to support "module:<function>" syntax for tracing programs, from Viktor Malik. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits) bpf: make list_for_each_entry portable bpf: ignore expected GCC warning in test_global_func10.c bpf: disable strict aliasing in test_global_func9.c selftests/bpf: Free strdup memory in xdp_hw_metadata selftests/bpf: Fix a few tests for GCC related warnings. bpf: avoid gcc overflow warning in test_xdp_vlan.c tools: remove redundant ethtool.h from tooling infra selftests/bpf: Expand ATTACH_REJECT tests selftests/bpf: Expand getsockname and getpeername tests sefltests/bpf: Expand sockaddr hook deny tests selftests/bpf: Expand sockaddr program return value tests selftests/bpf: Retire test_sock_addr.(c|sh) selftests/bpf: Remove redundant sendmsg test cases selftests/bpf: Migrate ATTACH_REJECT test cases selftests/bpf: Migrate expected_attach_type tests selftests/bpf: Migrate wildcard destination rewrite test selftests/bpf: Migrate sendmsg6 v4 mapped address tests selftests/bpf: Migrate sendmsg deny test cases selftests/bpf: Migrate WILDCARD_IP test selftests/bpf: Handle SYSCALL_EPERM and SYSCALL_ENOTSUPP test cases ... ==================== Link: https://lore.kernel.org/r/20240513134114.17575-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: pcs: lynx: no need to read LPA in lynx_pcs_get_state_2500basex()Vladimir Oltean1-3/+2
Nothing useful is done with the LPA variable in lynx_pcs_get_state_2500basex(), we can just remove the read. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240513115345.2452799-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysMerge branch 'mlx5-misc-patches'Jakub Kicinski6-114/+48
Tariq Toukan says: ==================== mlx5 misc patches This series includes patches for the mlx5 driver. Patch 1 by Shay enables LAG with HCAs of 8 ports. Patch 2 by Carolina optimizes the safe switch channels operation for the TX-only changes. Patch 3 by Parav cleans up some unused code. ==================== Link: https://lore.kernel.org/r/20240512124306.740898-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet/mlx5: Remove unused msix related exported APIsParav Pandit2-59/+0
MSIX irq allocation and free APIs are no longer in use. Hence, remove the dead code. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://lore.kernel.org/r/20240512124306.740898-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet/mlx5e: Modifying channels number and updating TX queuesCarolina Jubran3-51/+47
It is not appropriate for the mlx5e_num_channels_changed function to be called solely for updating the TX queues, even if the channels number has not been changed. Move the code responsible for updating the TC and TX queues from mlx5e_num_channels_changed and produce a new function called mlx5e_update_tc_and_tx_queues. This new function should only be called when the channels number remains unchanged. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240512124306.740898-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet/mlx5: Enable 8 ports LAGShay Drory2-4/+1
This patch adds to mlx5 drivers support for 8 ports HCAs. Starting with ConnectX-8 HCAs with 8 ports are possible. As most driver parts aren't affected by such configuration most driver code is unchanged. Specially the only affected areas are: - Lag - Multiport E-Switch - Single FDB E-Switch All of the above are already factored in generic way, and LAG and VF LAG are tested, so all that left is to change a #define and remove checks which are no longer needed. However, Multiport E-Switch is not tested yet, so it is left untouched. This patch will allow to create hardware LAG/VF LAG when all 8 ports are added to the same bond device. for example, In order to activate the hardware lag a user can execute the following: ip link add bond0 type bond ip link set bond0 type bond miimon 100 mode 2 ip link set eth2 master bond0 ip link set eth3 master bond0 ip link set eth4 master bond0 ip link set eth5 master bond0 ip link set eth6 master bond0 ip link set eth7 master bond0 ip link set eth8 master bond0 ip link set eth9 master bond0 Where eth2, eth3, eth4, eth5, eth6, eth7, eth8 and eth9 are the PFs of the same HCA. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240512124306.740898-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daystest: hsr: Extend the hsr_redbox.sh to have more SAN devices connectedLukasz Majewski1-22/+49
After this change the single SAN device (ns3eth1) is now replaced with two SAN devices - respectively ns4eth1 and ns5eth1. It is possible to extend this script to have more SAN devices connected by adding them to ns3br1 bridge. Signed-off-by: Lukasz Majewski <lukma@denx.de> Link: https://lore.kernel.org/r/20240510143710.3916631-1-lukma@denx.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysMerge branch 'net-dsa-microchip-dcb-fixes'Jakub Kicinski3-69/+85
Oleksij Rempel says: ==================== net: dsa: microchip: DCB fixes This patch series address recommendation to rename IPV to IPM to avoid confusion with IPV name used in 802.1Qci PSFP. And restores default "PCP only" configuration as source of priorities to avoid possible regressions. ==================== Link: https://lore.kernel.org/r/20240510053828.2412516-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: dsa: microchip: dcb: set default apptrust to PCP onlyOleksij Rempel1-18/+3
Before DCB support, the KSZ driver had only PCP as source of packet priority values. To avoid regressions, make PCP only as default value. User will need enable DSCP support manually. This patch do not affect other KSZ8 related quirks. User will still be warned by setting not support configurations for the port 2. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240510053828.2412516-4-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: dsa: microchip: dcb: add comments for DSCP related functionsOleksij Rempel1-0/+31
All other functions are commented. Add missing comments to following functions: ksz_set_global_dscp_entry() ksz_port_add_dscp_prio() ksz_port_del_dscp_prio() Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240510053828.2412516-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: dsa: microchip: dcb: rename IPV to IPMOleksij Rempel3-51/+51
IPV is added and used term in 802.1Qci PSFP and merged into 802.1Q (from 802.1Q-2018) for another functions. Even it does similar operation holding temporal priority value internally (as it is named), because KSZ datasheet doesn't use the term of IPV (Internal Priority Value) and avoiding any confusion later when PSFP is in the Linux world, it is better to rename IPV to IPM (Internal Priority Mapping). In addition, LAN937x documentation already use IPV for 802.1Qci PSFP related functionality. Suggested-by: Woojung Huh <Woojung.Huh@microchip.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Woojung Huh <woojung.huh@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240510053828.2412516-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysl2tp: Support different protocol versions with same IP/port quadrupleSamuel Thibault1-8/+10
628bc3e5a1be ("l2tp: Support several sockets with same IP/port quadruple") added support for several L2TPv2 tunnels using the same IP/port quadruple, but if an L2TPv3 socket exists it could eat all the trafic. We thus have to first use the version from the packet to get the proper tunnel, and only then check that the version matches. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Reviewed-by: James Chapman <jchapman@katalix.com> Link: https://lore.kernel.org/r/20240509205812.4063198-1-samuel.thibault@ens-lyon.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysynl: ensure exact-len value is resolvedAntonio Quartulli1-2/+2
For type String and Binary we are currently usinig the exact-len limit value as is without attempting any name resolution. However, the spec may specify the name of a constant rather than an actual value, which would result in using the constant name as is and thus break the policy. Ensure the limit value is passed to get_limit(), which will always attempt resolving the name before printing the policy rule. Signed-off-by: Antonio Quartulli <a@unstable.cc> Link: https://lore.kernel.org/r/20240510232202.24051-1-a@unstable.cc Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysMerge branch 'add-tx-stop-wake-counters'Jakub Kicinski6-3/+50
Daniel Jurgens says: ==================== Add TX stop/wake counters Several drivers provide TX stop and wake counters via ethtool stats. Add those to the netdev queue stats, and use them in virtio_net. ==================== Link: https://lore.kernel.org/r/20240510201927.1821109-1-danielj@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysvirtio_net: Add TX stopped and wake countersDaniel Jurgens1-2/+26
Add a tx queue stop and wake counters, they are useful for debugging. $ ./tools/net/ynl/cli.py --spec netlink/specs/netdev.yaml \ --dump qstats-get --json '{"scope": "queue"}' ... {'ifindex': 13, 'queue-id': 0, 'queue-type': 'tx', 'tx-bytes': 14756682850, 'tx-packets': 226465, 'tx-stop': 113208, 'tx-wake': 113208}, {'ifindex': 13, 'queue-id': 1, 'queue-type': 'tx', 'tx-bytes': 18167675008, 'tx-packets': 278660, 'tx-stop': 8632, 'tx-wake': 8632}] Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20240510201927.1821109-3-danielj@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnetdev: Add queue stats for TX stop and wakeDaniel Jurgens5-1/+24
TX queue stop and wake are counted by some drivers. Support reporting these via netdev-genl queue stats. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20240510201927.1821109-2-danielj@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daystcp: socket option to check for MPTCP fallback to TCPMatthieu Baerts (NGI0)3-0/+7
A way for an application to know if an MPTCP connection fell back to TCP is to use getsockopt(MPTCP_INFO) and look for errors. The issue with this technique is that the same errors -- EOPNOTSUPP (IPv4) and ENOPROTOOPT (IPv6) -- are returned if there was a fallback, *or* if the kernel doesn't support this socket option. The userspace then has to look at the kernel version to understand what the errors mean. It is not clean, and it doesn't take into account older kernels where the socket option has been backported. A cleaner way would be to expose this info to the TCP socket level. In case of MPTCP socket where no fallback happened, the socket options for the TCP level will be handled in MPTCP code, in mptcp_getsockopt_sol_tcp(). If not, that will be in TCP code, in do_tcp_getsockopt(). So MPTCP simply has to set the value 1, while TCP has to set 0. If the socket option is not supported, one of these two errors will be reported: - EOPNOTSUPP (95 - Operation not supported) for MPTCP sockets - ENOPROTOOPT (92 - Protocol not available) for TCP sockets, e.g. on the socket received after an 'accept()', when the client didn't request to use MPTCP: this socket will be a TCP one, even if the listen socket was an MPTCP one. With this new option, the kernel can return a clear answer to both "Is this kernel new enough to tell me the fallback status?" and "If it is new enough, is it currently a TCP or MPTCP socket?" questions, while not breaking the previous method. Acked-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240509-upstream-net-next-20240509-mptcp-tcp_is_mptcp-v1-1-f846df999202@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysMerge branch ↵Jakub Kicinski8-95/+224
'net-gro-remove-network_header-use-move-p-flush-flush_id-calculations-to-l4' Richard Gobert says: ==================== net: gro: remove network_header use, move p->{flush/flush_id} calculations to L4 The cb fields network_offset and inner_network_offset are used instead of skb->network_header throughout GRO. These fields are then leveraged in the next commit to remove flush_id state from napi_gro_cb, and stateful code in {ipv6,inet}_gro_receive which may be unnecessarily complicated due to encapsulation support in GRO. These fields are checked in L4 instead. 3rd patch adds tests for different flush_id flows in GRO. ==================== Link: https://lore.kernel.org/r/20240509190819.2985-1-richardbgobert@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysselftests/net: add flush id selftestsRichard Gobert1-0/+138
Added flush id selftests to test different cases where DF flag is set or unset and id value changes in the following packets. All cases where the packets should coalesce or should not coalesce are tested. Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240509190819.2985-4-richardbgobert@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: gro: move L3 flush checks to tcp_gro_receive and udp_gro_receive_segmentRichard Gobert6-84/+73
{inet,ipv6}_gro_receive functions perform flush checks (ttl, flags, iph->id, ...) against all packets in a loop. These flush checks are used in all merging UDP and TCP flows. These checks need to be done only once and only against the found p skb, since they only affect flush and not same_flow. This patch leverages correct network header offsets from the cb for both outer and inner network headers - allowing these checks to be done only once, in tcp_gro_receive and udp_gro_receive_segment. As a result, NAPI_GRO_CB(p)->flush is not used at all. In addition, flush_id checks are more declarative and contained in inet_gro_flush, thus removing the need for flush_id in napi_gro_cb. This results in less parsing code for non-loop flush tests for TCP and UDP flows. To make sure results are not within noise range - I've made netfilter drop all TCP packets, and measured CPU performance in GRO (in this case GRO is responsible for about 50% of the CPU utilization). perf top while replaying 64 parallel IP/TCP streams merging in GRO: (gro_receive_network_flush is compiled inline to tcp_gro_receive) net-next: 6.94% [kernel] [k] inet_gro_receive 3.02% [kernel] [k] tcp_gro_receive patch applied: 4.27% [kernel] [k] tcp_gro_receive 4.22% [kernel] [k] inet_gro_receive perf top while replaying 64 parallel IP/IP/TCP streams merging in GRO (same results for any encapsulation, in this case inet_gro_receive is top offender in net-next) net-next: 10.09% [kernel] [k] inet_gro_receive 2.08% [kernel] [k] tcp_gro_receive patch applied: 6.97% [kernel] [k] inet_gro_receive 3.68% [kernel] [k] tcp_gro_receive Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240509190819.2985-3-richardbgobert@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: gro: use cb instead of skb->network_headerRichard Gobert5-11/+13
This patch converts references of skb->network_header to napi_gro_cb's network_offset and inner_network_offset. Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240509190819.2985-2-richardbgobert@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysMerge branch 'ena-driver-changes-may-2024'Jakub Kicinski7-28/+73
David Arinzon says: ==================== ENA driver changes May 2024 This patchset contains several misc and minor changes to the ENA driver. ==================== Link: https://lore.kernel.org/r/20240512134637.25299-1-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: ena: Change initial rx_usec intervalDavid Arinzon1-1/+1
For the purpose of obtaining better CPU utilization, minimum rx moderation interval is set to 20 usec. Signed-off-by: Osama Abboud <osamaabb@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240512134637.25299-6-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: ena: Changes around strscpy callsDavid Arinzon2-8/+25
strscpy copies as much of the string as possible, meaning that the destination string will be truncated in case of no space. As this is a non-critical error in our case, adding a debug level print for indication. This patch also removes a -1 which was added to ensure enough space for NUL, but strscpy destination string is guaranteed to be NUL-terminted, therefore, the -1 is not needed. Signed-off-by: David Arinzon <darinzon@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240512134637.25299-5-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: ena: Add validation for completion descriptors consistencyDavid Arinzon3-10/+30
Validate that `first` flag is set only for the first descriptor in multi-buffer packets. In case of an invalid descriptor, a reset will occur. A new reset reason for RX data corruption has been added. Signed-off-by: Shahar Itzko <itzko@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240512134637.25299-4-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: ena: Reduce holes in ena_com structuresDavid Arinzon2-3/+3
This patch makes two changes in order to fill holes and reduce ther overall size of the structures ena_com_dev and ena_com_rx_ctx. Signed-off-by: Shahar Itzko <itzko@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240512134637.25299-3-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: ena: Add a counter for driver's reset failuresDavid Arinzon3-6/+14
This patch adds a counter to the ena_adapter struct in order to keep track of reset failures. The counter is incremented every time either ena_restore_device() or ena_destroy_device() fail. Signed-off-by: Osama Abboud <osamaabb@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240512134637.25299-2-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysselftests: netfilter: nft_flowtable.sh: bump socat timeout to 1mFlorian Westphal1-2/+3
Now that this test runs in netdev CI it looks like 10s isn't enough for debug kernels: selftests: net/netfilter: nft_flowtable.sh 2024/05/10 20:33:08 socat[12204] E write(7, 0x563feb16a000, 8192): Broken pipe FAIL: file mismatch for ns1 -> ns2 -rw------- 1 root root 37345280 May 10 20:32 /tmp/tmp.Am0yEHhNqI ... Looks like socat gets zapped too quickly, so increase timeout to 1m. Could also reduce tx file size for KSFT_MACHINE_SLOW, but its preferrable to have same test for both debug and nondebug. Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/20240511064814.561525-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysselftests: net: use upstream mtoolsVladimir Oltean1-2/+17
Joachim kindly merged the IPv6 support in https://github.com/troglobit/mtools/pull/2, so we can just use his version now. A few more fixes subsequently came in for IPv6, so even better. Check that the deployed mtools version is 3.0 or above. Note that the version check breaks compatibility with my fork where I didn't bump the version, but I assume that won't be a problem. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20240510112856.1262901-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysselftest: epoll_busy_poll: Fix spelling mistake "couldnt" -> "couldn't"Colin Ian King1-1/+1
There is a spelling mistake in a TH_LOG message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240510084811.3299685-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysnet: phy: air_en8811h: reset netdev rules when LED is set manuallyDaniel Golle1-0/+4
Setting LED_OFF via brightness_set should deactivate hw control, so make sure netdev trigger rules also get cleared in that case. This fixes unwanted restoration of the default netdev trigger rules and matches the behaviour when using the 'netdev' trigger without any hardware offloading. Fixes: 71e79430117d ("net: phy: air_en8811h: Add the Airoha EN8811H PHY driver") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://lore.kernel.org/r/5ed8ea615890a91fa4df59a7ae8311bbdf63cdcf.1715248281.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysMerge tag 'nf-next-24-05-12' of ↵Jakub Kicinski28-175/+639
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: Patch #1 skips transaction if object type provides no .update interface. Patch #2 skips NETDEV_CHANGENAME which is unused. Patch #3 enables conntrack to handle Multicast Router Advertisements and Multicast Router Solicitations from the Multicast Router Discovery protocol (RFC4286) as untracked opposed to invalid packets. From Linus Luessing. Patch #4 updates DCCP conntracker to mark invalid as invalid, instead of dropping them, from Jason Xing. Patch #5 uses NF_DROP instead of -NF_DROP since NF_DROP is 0, also from Jason. Patch #6 removes reference in netfilter's sysctl documentation on pickup entries which were already removed by Florian Westphal. Patch #7 removes check for IPS_OFFLOAD flag to disable early drop which allows to evict entries from the conntrack table, also from Florian. Patches #8 to #16 updates nf_tables pipapo set backend to allocate the datastructure copy on-demand from preparation phase, to better deal with OOM situations where .commit step is too late to fail. Series from Florian Westphal. Patch #17 adds a selftest with packetdrill to cover conntrack TCP state transitions, also from Florian. Patch #18 use GFP_KERNEL to clone elements from control plane to avoid quick atomic reserves exhaustion with large sets, reporter refers to million entries magnitude. * tag 'nf-next-24-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: allow clone callbacks to sleep selftests: netfilter: add packetdrill based conntrack tests netfilter: nft_set_pipapo: remove dirty flag netfilter: nft_set_pipapo: move cloning of match info to insert/removal path netfilter: nft_set_pipapo: prepare pipapo_get helper for on-demand clone netfilter: nft_set_pipapo: merge deactivate helper into caller netfilter: nft_set_pipapo: prepare walk function for on-demand clone netfilter: nft_set_pipapo: prepare destroy function for on-demand clone netfilter: nft_set_pipapo: make pipapo_clone helper return NULL netfilter: nft_set_pipapo: move prove_locking helper around netfilter: conntrack: remove flowtable early-drop test netfilter: conntrack: documentation: remove reference to non-existent sysctl netfilter: use NF_DROP instead of -NF_DROP netfilter: conntrack: dccp: try not to drop skb in conntrack netfilter: conntrack: fix ct-state for ICMPv6 Multicast Router Discovery netfilter: nf_tables: remove NETDEV_CHANGENAME from netdev chain event handler netfilter: nf_tables: skip transaction if update object is not implemented ==================== Link: https://lore.kernel.org/r/20240512161436.168973-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysbpf: make list_for_each_entry portableJose E. Marchesi4-10/+38
[Changes from V1: - The __compat_break has been abandoned in favor of a more readable can_loop macro that can be used anywhere, including loop conditions.] The macro list_for_each_entry is defined in bpf_arena_list.h as follows: #define list_for_each_entry(pos, head, member) \ for (void * ___tmp = (pos = list_entry_safe((head)->first, \ typeof(*(pos)), member), \ (void *)0); \ pos && ({ ___tmp = (void *)pos->member.next; 1; }); \ cond_break, \ pos = list_entry_safe((void __arena *)___tmp, typeof(*(pos)), member)) The macro cond_break, in turn, expands to a statement expression that contains a `break' statement. Compound statement expressions, and the subsequent ability of placing statements in the header of a `for' loop, are GNU extensions. Unfortunately, clang implements this GNU extension differently than GCC: - In GCC the `break' statement is bound to the containing "breakable" context in which the defining `for' appears. If there is no such context, GCC emits a warning: break statement without enclosing `for' o `switch' statement. - In clang the `break' statement is bound to the defining `for'. If the defining `for' is itself inside some breakable construct, then clang emits a -Wgcc-compat warning. This patch adds a new macro can_loop to bpf_experimental, that implements the same logic than cond_break but evaluates to a boolean expression. The patch also changes all the current instances of usage of cond_break withing the header of loop accordingly. Tested in bpf-next master. No regressions. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Link: https://lore.kernel.org/r/20240511212243.23477-1-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysbpf: ignore expected GCC warning in test_global_func10.cJose E. Marchesi1-0/+4
The BPF selftest global_func10 in progs/test_global_func10.c contains: struct Small { long x; }; struct Big { long x; long y; }; [...] __noinline int foo(const struct Big *big) { if (!big) return 0; return bpf_get_prandom_u32() < big->y; } [...] SEC("cgroup_skb/ingress") __failure __msg("invalid indirect access to stack") int global_func10(struct __sk_buff *skb) { const struct Small small = {.x = skb->len }; return foo((struct Big *)&small) ? 1 : 0; } GCC emits a "maybe uninitialized" warning for the code above, because it knows `foo' accesses `big->y'. Since the purpose of this selftest is to check that the verifier will fail on this sort of invalid memory access, this patch just silences the compiler warning. Tested in bpf-next master. No regressions. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240511212349.23549-1-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysbpf: disable strict aliasing in test_global_func9.cJose E. Marchesi1-0/+1
The BPF selftest test_global_func9.c performs type punning and breaks srict-aliasing rules. In particular, given: int global_func9(struct __sk_buff *skb) { int result = 0; [...] { const struct C c = {.x = skb->len, .y = skb->family }; result |= foo((const struct S *)&c); } } When building with strict-aliasing enabled (the default) the initialization of `c' gets optimized away in its entirely: [... no initialization of `c' ...] r1 = r10 r1 += -40 call foo w0 |= w6 Since GCC knows that `foo' accesses s->x, we get a "maybe uninitialized" warning. On the other hand, when strict-aliasing is disabled GCC only optimizes away the store to `.y': r1 = *(u32 *) (r6+0) *(u32 *) (r10+-40) = r1 ; This is .x = skb->len in `c' r1 = r10 r1 += -40 call foo w0 |= w6 In this case the warning is not emitted, because s-> is initialized. This patch disables strict aliasing in this test when building with GCC. clang seems to not optimize this particular code even when strict aliasing is enabled. Tested in bpf-next master. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240511212213.23418-1-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Free strdup memory in xdp_hw_metadataGeliang Tang1-0/+2
The strdup() function returns a pointer to a new string which is a duplicate of the string "ifname". Memory for the new string is obtained with malloc(), and need to be freed with free(). This patch adds this missing "free(saved_hwtstamp_ifname)" in cleanup() to avoid a potential memory leak in xdp_hw_metadata.c. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/af9bcccb96655e82de5ce2b4510b88c9c8ed5ed0.1715417367.git.tanggeliang@kylinos.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Fix a few tests for GCC related warnings.Cupertino Miranda4-29/+37
This patch corrects a few warnings to allow selftests to compile for GCC. -- progs/cpumask_failure.c -- progs/bpf_misc.h:136:22: error: ‘cpumask’ is used uninitialized [-Werror=uninitialized] 136 | #define __sink(expr) asm volatile("" : "+g"(expr)) | ^~~ progs/cpumask_failure.c:68:9: note: in expansion of macro ‘__sink’ 68 | __sink(cpumask); The macro __sink(cpumask) with the '+' contraint modifier forces the the compiler to expect a read and write from cpumask. GCC detects that cpumask is never initialized and reports an error. This patch removes the spurious non required definitions of cpumask. -- progs/dynptr_fail.c -- progs/dynptr_fail.c:1444:9: error: ‘ptr1’ may be used uninitialized [-Werror=maybe-uninitialized] 1444 | bpf_dynptr_clone(&ptr1, &ptr2); Many of the tests in the file are related to the detection of uninitialized pointers by the verifier. GCC is able to detect possible uninitialized values, and reports this as an error. The patch initializes all of the previous uninitialized structs. -- progs/test_tunnel_kern.c -- progs/test_tunnel_kern.c:590:9: error: array subscript 1 is outside array bounds of ‘struct geneve_opt[1]’ [-Werror=array-bounds=] 590 | *(int *) &gopt.opt_data = bpf_htonl(0xdeadbeef); | ^~~~~~~~~~~~~~~~~~~~~~~ progs/test_tunnel_kern.c:575:27: note: at offset 4 into object ‘gopt’ of size 4 575 | struct geneve_opt gopt; This tests accesses beyond the defined data for the struct geneve_opt which contains as last field "u8 opt_data[0]" which clearly does not get reserved space (in stack) in the function header. This pattern is repeated in ip6geneve_set_tunnel and geneve_set_tunnel functions. GCC is able to see this and emits a warning. The patch introduces a local struct that allocates enough space to safely allow the write to opt_data field. -- progs/jeq_infer_not_null_fail.c -- progs/jeq_infer_not_null_fail.c:21:40: error: array subscript ‘struct bpf_map[0]’ is partly outside array bounds of ‘struct <anonymous>[1]’ [-Werror=array-bounds=] 21 | struct bpf_map *inner_map = map->inner_map_meta; | ^~ progs/jeq_infer_not_null_fail.c:14:3: note: object ‘m_hash’ of size 32 14 | } m_hash SEC(".maps"); This example defines m_hash in the context of the compilation unit and casts it to struct bpf_map which is much smaller than the size of struct bpf_map. It errors out in GCC when it attempts to access an element that would be defined in struct bpf_map outsize of the defined limits for m_hash. This patch disables the warning through a GCC pragma. This changes were tested in bpf-next master selftests without any regressions. Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com> Cc: jose.marchesi@oracle.com Cc: david.faust@oracle.com Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Link: https://lore.kernel.org/r/20240510183850.286661-2-cupertino.miranda@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysbpf: avoid gcc overflow warning in test_xdp_vlan.cDavid Faust1-1/+1
This patch fixes an integer overflow warning raised by GCC in xdp_prognum1 of progs/test_xdp_vlan.c: GCC-BPF [test_maps] test_xdp_vlan.bpf.o progs/test_xdp_vlan.c: In function 'xdp_prognum1': progs/test_xdp_vlan.c:163:25: error: integer overflow in expression '(short int)(((__builtin_constant_p((int)vlan_hdr->h_vlan_TCI)) != 0 ? (int)(short unsigned int)((short int)((int)vlan_hdr->h_vlan_TCI << 8 >> 8) << 8 | (short int)((int)vlan_hdr->h_vlan_TCI << 0 >> 8 << 0)) & 61440 : (int)__builtin_bswap16(vlan_hdr->h_vlan_TCI) & 61440) << 8 >> 8) << 8' of type 'short int' results in '0' [-Werror=overflow] 163 | bpf_htons((bpf_ntohs(vlan_hdr->h_vlan_TCI) & 0xf000) | ^~~~~~~~~ The problem lies with the expansion of the bpf_htons macro and the expression passed into it. The bpf_htons macro (and similarly the bpf_ntohs macro) expand to a ternary operation using either __builtin_bswap16 or ___bpf_swab16 to swap the bytes, depending on whether the expression is constant. For an expression, with 'value' as a u16, like: bpf_htons (value & 0xf000) The entire (value & 0xf000) is 'x' in the expansion of ___bpf_swab16 and we get as one part of the expanded swab16: ((__u16)(value & 0xf000) << 8 >> 8 << 8 This will always evaluate to 0, which is intentional since this subexpression deals with the byte guaranteed to be 0 by the mask. However, GCC warns because the precise reason this always evaluates to 0 is an overflow. Specifically, the plain 0xf000 in the expression is a signed 32-bit integer, which causes 'value' to also be promoted to a signed 32-bit integer, and the combination of the 8-bit left shift and down-cast back to __u16 results in a signed overflow (really a 'warning: overflow in conversion from int to __u16' which is propegated up through the rest of the expression leading to the ultimate overflow warning above), which is a valid warning despite being the intended result of this code. Clang does not warn on this case, likely because it performs constant folding later in the compilation process relative to GCC. It seems that by the time clang does constant folding for this expression, the side of the ternary with this overflow has already been discarded. Fortunately, this warning is easily silenced by simply making the 0xf000 mask explicitly unsigned. This has no impact on the result. Signed-off-by: David Faust <david.faust@oracle.com> Cc: jose.marchesi@oracle.com Cc: cupertino.miranda@oracle.com Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240508193512.152759-1-david.faust@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daystools: remove redundant ethtool.h from tooling infraTushar Vyavahare1-2271/+0
Remove the redundant ethtool.h header file from tools/include/uapi/linux. The file is unnecessary as the system uses the kernel's include/uapi/linux/ethtool.h directly. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20240508104123.434769-1-tushar.vyavahare@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysMerge branch 'retire-progs-test_sock_addr'Alexei Starovoitov19-1274/+1992
Jordan Rife says: ==================== Retire progs/test_sock_addr.c This patch series migrates remaining tests from bpf/test_sock_addr.c to prog_tests/sock_addr.c and progs/verifier_sock_addr.c in order to fully retire the old-style test program and expands test coverage to test previously untested scenarios related to sockaddr hooks. This is a continuation of the work started recently during the expansion of prog_tests/sock_addr.c. Link: https://lore.kernel.org/bpf/20240429214529.2644801-1-jrife@google.com/T/#u ======= Patches ======= * Patch 1 moves tests that check valid return values for recvmsg hooks into progs/verifier_sock_addr.c, a new addition to the verifier test suite. * Patches 2-5 lay the groundwork for test migration, enabling prog_tests/sock_addr.c to handle more test dimensions. * Patches 6-11 move existing tests to prog_tests/sock_addr.c. * Patch 12 removes some redundant test cases. * Patches 14-17 expand on existing test coverage. ==================== Link: https://lore.kernel.org/r/20240510190246.3247730-1-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Expand ATTACH_REJECT testsJordan Rife1-0/+187
This expands coverage for ATTACH_REJECT tests to include connect_unix, sendmsg_unix, recvmsg*, getsockname*, and getpeername*. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-18-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Expand getsockname and getpeername testsJordan Rife5-2/+412
This expands coverage for getsockname and getpeername hooks to include getsockname4, getsockname6, getpeername4, and getpeername6. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-17-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 dayssefltests/bpf: Expand sockaddr hook deny testsJordan Rife7-0/+378
This patch expands test coverage for EPERM tests to include connect and bind calls and rounds out the coverage for sendmsg by adding tests for sendmsg_unix. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-16-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Expand sockaddr program return value testsJordan Rife1-0/+294
This patch expands verifier coverage for program return values to cover bind, connect, sendmsg, getsockname, and getpeername hooks. It also rounds out the recvmsg coverage by adding test cases for recvmsg_unix hooks. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-15-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Retire test_sock_addr.(c|sh)Jordan Rife4-636/+1
Fully remove test_sock_addr.c and test_sock_addr.sh, as test coverage has been fully moved to prog_tests/sock_addr.c. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-14-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Remove redundant sendmsg test casesJordan Rife1-161/+0
Remove these test cases completely, as the same behavior is already covered by other sendmsg* test cases in prog_tests/sock_addr.c. This just rewrites the destination address similar to sendmsg_v4_prog and sendmsg_v6_prog. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-13-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Migrate ATTACH_REJECT test casesJordan Rife2-146/+102
Migrate test case from bpf/test_sock_addr.c ensuring that program attachment fails when using an inappropriate attach type. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-12-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Migrate expected_attach_type testsJordan Rife2-84/+96
Migrates tests from progs/test_sock_addr.c ensuring that programs fail to load when the expected attach type does not match. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-11-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Migrate wildcard destination rewrite testJordan Rife3-20/+37
Migrate test case from bpf/test_sock_addr.c ensuring that sendmsg respects when sendmsg6 hooks rewrite the destination IP with the IPv6 wildcard IP, [::]. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-10-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Migrate sendmsg6 v4 mapped address testsJordan Rife3-20/+42
Migrate test case from bpf/test_sock_addr.c ensuring that sendmsg returns -ENOTSUPP when sending to an IPv4-mapped IPv6 address to prog_tests/sock_addr.c. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-9-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Migrate sendmsg deny test casesJordan Rife4-45/+110
This set of tests checks that sendmsg calls are rejected (return -EPERM) when the sendmsg* hook returns 0. Replace those in bpf/test_sock_addr.c with corresponding tests in prog_tests/sock_addr.c. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-8-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Migrate WILDCARD_IP testJordan Rife3-20/+56
Move wildcard IP sendmsg test case out of bpf/test_sock_addr.c into prog_tests/sock_addr.c. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-7-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Handle SYSCALL_EPERM and SYSCALL_ENOTSUPP test casesJordan Rife1-20/+58
In preparation to move test cases from bpf/test_sock_addr.c that expect system calls to return ENOTSUPP or EPERM, this patch propagates errno from relevant system calls up to test_sock_addr() where the result can be checked. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-6-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Handle ATTACH_REJECT test casesJordan Rife1-1/+34
In preparation to move test cases from bpf/test_sock_addr.c that expect ATTACH_REJECT, this patch adds BPF_SKEL_FUNCS_RAW to generate load and destroy functions that use bpf_prog_attach() to control the attach_type. The normal load functions use bpf_program__attach_cgroup which does not have the same degree of control over the attach type, as bpf_program_attach_fd() calls bpf_link_create() with the attach type extracted from prog using bpf_program__expected_attach_type(). It is currently not possible to modify the attach type before bpf_program__attach_cgroup() is called, since bpf_program__set_expected_attach_type() has no effect after the program is loaded. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-5-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Handle LOAD_REJECT test casesJordan Rife1-5/+98
In preparation to move test cases from bpf/test_sock_addr.c that expect LOAD_REJECT, this patch adds expected_attach_type and extends load_fn to accept an expected attach type and a flag indicating whether or not rejection is expected. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-4-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Use program name for skel load/destroy functionsJordan Rife1-46/+50
In preparation to migrate tests from bpf/test_sock_addr.c to sock_addr.c, update BPF_SKEL_FUNCS so that it generates functions based on prog_name instead of skel_name. This allows us to differentiate between programs in the same skeleton. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-3-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysselftests/bpf: Migrate recvmsg* return code tests to verifier_sock_addr.cJordan Rife3-70/+39
This set of tests check that the BPF verifier rejects programs with invalid return codes (recvmsg4 and recvmsg6 hooks can only return 1). This patch replaces the tests in test_sock_addr.c with verifier_sock_addr.c, a new verifier prog_tests for sockaddr hooks, in a step towards fully retiring test_sock_addr.c. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240510190246.3247730-2-jrife@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysriscv, bpf: make some atomic operations fully orderedPuranjay Mohan1-10/+10
The BPF atomic operations with the BPF_FETCH modifier along with BPF_XCHG and BPF_CMPXCHG are fully ordered but the RISC-V JIT implements all atomic operations except BPF_CMPXCHG with relaxed ordering. Section 8.1 of the "The RISC-V Instruction Set Manual Volume I: Unprivileged ISA" [1], titled, "Specifying Ordering of Atomic Instructions" says: | To provide more efficient support for release consistency [5], each | atomic instruction has two bits, aq and rl, used to specify additional | memory ordering constraints as viewed by other RISC-V harts. and | If only the aq bit is set, the atomic memory operation is treated as | an acquire access. | If only the rl bit is set, the atomic memory operation is treated as a | release access. | | If both the aq and rl bits are set, the atomic memory operation is | sequentially consistent. Fix this by setting both aq and rl bits as 1 for operations with BPF_FETCH and BPF_XCHG. [1] https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf Fixes: dd642ccb45ec ("riscv, bpf: Implement more atomic operations for RV64") Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Reviewed-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/r/20240505201633.123115-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysriscv, bpf: Fix typo in commentXiao Wang1-2/+2
We can use either "instruction" or "insn" in the comment. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/r/20240507111618.437121-1-xiao.w.wang@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 dayss390/bpf: Emit a barrier for BPF_FETCH instructionsIlya Leoshkevich1-2/+6
BPF_ATOMIC_OP() macro documentation states that "BPF_ADD | BPF_FETCH" should be the same as atomic_fetch_add(), which is currently not the case on s390x: the serialization instruction "bcr 14,0" is missing. This applies to "and", "or" and "xor" variants too. s390x is allowed to reorder stores with subsequent fetches from different addresses, so code relying on BPF_FETCH acting as a barrier, for example: stw [%r0], 1 afadd [%r1], %r2 ldxw %r3, [%r4] may be broken. Fix it by emitting "bcr 14,0". Note that a separate serialization instruction is not needed for BPF_XCHG and BPF_CMPXCHG, because COMPARE AND SWAP performs serialization itself. Fixes: ba3b86b9cef0 ("s390/bpf: Implement new atomic ops") Reported-by: Puranjay Mohan <puranjay12@gmail.com> Closes: https://lore.kernel.org/bpf/mb61p34qvq3wf.fsf@kernel.org/ Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20240507000557.12048-1-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysMerge branch 'bpf-inline-helpers-in-arm64-and-riscv-jits'Alexei Starovoitov8-0/+132
Puranjay Mohan says: ==================== bpf: Inline helpers in arm64 and riscv JITs Changes in v5 -> v6: arm64 v5: https://lore.kernel.org/all/20240430234739.79185-1-puranjay@kernel.org/ riscv v2: https://lore.kernel.org/all/20240430175834.33152-1-puranjay@kernel.org/ - Combine riscv and arm64 changes in single series - Some coding style fixes Changes in v4 -> v5: v4: https://lore.kernel.org/all/20240429131647.50165-1-puranjay@kernel.org/ - Implement the inlining of the bpf_get_smp_processor_id() in the JIT. NOTE: This needs to be based on: https://lore.kernel.org/all/20240430175834.33152-1-puranjay@kernel.org/ to be built. Manual run of bpf-ci with this series rebased on above: https://github.com/kernel-patches/bpf/pull/6929 Changes in v3 -> v4: v3: https://lore.kernel.org/all/20240426121349.97651-1-puranjay@kernel.org/ - Fix coding style issue related to C89 standards. Changes in v2 -> v3: v2: https://lore.kernel.org/all/20240424173550.16359-1-puranjay@kernel.org/ - Fixed the xlated dump of percpu mov to "r0 = &(void __percpu *)(r0)" - Made ARM64 and x86-64 use the same code for inlining. The only difference that remains is the per-cpu address of the cpu_number. Changes in v1 -> v2: v1: https://lore.kernel.org/all/20240405091707.66675-1-puranjay12@gmail.com/ - Add a patch to inline bpf_get_smp_processor_id() - Fix an issue in MRS instruction encoding as pointed out by Will - Remove CONFIG_SMP check because arm64 kernel always compiles with CONFIG_SMP This series adds the support of internal only per-CPU instructions and inlines the bpf_get_smp_processor_id() helper call for ARM64 and RISC-V BPF JITs. Here is an example of calls to bpf_get_smp_processor_id() and percpu_array_map_lookup_elem() before and after this series on ARM64. BPF ===== BEFORE AFTER -------- ------- int cpu = bpf_get_smp_processor_id(); int cpu = bpf_get_smp_processor_id(); (85) call bpf_get_smp_processor_id#229032 (85) call bpf_get_smp_processor_id#8 p = bpf_map_lookup_elem(map, &zero); p = bpf_map_lookup_elem(map, &zero); (18) r1 = map[id:78] (18) r1 = map[id:153] (18) r2 = map[id:82][0]+65536 (18) r2 = map[id:157][0]+65536 (85) call percpu_array_map_lookup_elem#313512 (07) r1 += 496 (61) r0 = *(u32 *)(r2 +0) (35) if r0 >= 0x1 goto pc+5 (67) r0 <<= 3 (0f) r0 += r1 (79) r0 = *(u64 *)(r0 +0) (bf) r0 = &(void __percpu *)(r0) (05) goto pc+1 (b7) r0 = 0 ARM64 JIT =========== BEFORE AFTER -------- ------- int cpu = bpf_get_smp_processor_id(); int cpu = bpf_get_smp_processor_id(); mov x10, #0xfffffffffffff4d0 mrs x10, sp_el0 movk x10, #0x802b, lsl #16 ldr w7, [x10, #24] movk x10, #0x8000, lsl #32 blr x10 add x7, x0, #0x0 p = bpf_map_lookup_elem(map, &zero); p = bpf_map_lookup_elem(map, &zero); mov x0, #0xffff0003ffffffff mov x0, #0xffff0003ffffffff movk x0, #0xce5c, lsl #16 movk x0, #0xe0f3, lsl #16 movk x0, #0xca00 movk x0, #0x7c00 mov x1, #0xffff8000ffffffff mov x1, #0xffff8000ffffffff movk x1, #0x8bdb, lsl #16 movk x1, #0xb0c7, lsl #16 movk x1, #0x6000 movk x1, #0xe000 mov x10, #0xffffffffffff3ed0 add x0, x0, #0x1f0 movk x10, #0x802d, lsl #16 ldr w7, [x1] movk x10, #0x8000, lsl #32 cmp x7, #0x1 blr x10 b.cs 0x0000000000000090 add x7, x0, #0x0 lsl x7, x7, #3 add x7, x7, x0 ldr x7, [x7] mrs x10, tpidr_el1 add x7, x7, x10 b 0x0000000000000094 mov x7, #0x0 Performance improvement found using benchmark[1] ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc +---------------+-------------------+-------------------+--------------+ | Name | Before | After | % change | |---------------+-------------------+-------------------+--------------| | glob-arr-inc | 23.380 ± 1.675M/s | 25.893 ± 0.026M/s | + 10.74% | | arr-inc | 23.928 ± 0.034M/s | 25.213 ± 0.063M/s | + 5.37% | | hash-inc | 12.352 ± 0.005M/s | 12.609 ± 0.013M/s | + 2.08% | +---------------+-------------------+-------------------+--------------+ [1] https://github.com/anakryiko/linux/commit/8dec900975ef RISCV64 JIT output for `call bpf_get_smp_processor_id` ======================================================= Before After -------- ------- auipc t1,0x848c ld a5,32(tp) jalr 604(t1) mv a5,a0 Benchmark using [1] on Qemu. ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc +---------------+------------------+------------------+--------------+ | Name | Before | After | % change | |---------------+------------------+------------------+--------------| | glob-arr-inc | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s | + 24.04% | | arr-inc | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s | + 23.56% | | hash-inc | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s | + 32.18% | +---------------+------------------+------------------+--------------+ ==================== Link: https://lore.kernel.org/r/20240502151854.9810-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysbpf, arm64: inline bpf_get_smp_processor_id() helperPuranjay Mohan3-0/+28
Inline calls to bpf_get_smp_processor_id() helper in the JIT by emitting a read from struct thread_info. The SP_EL0 system register holds the pointer to the task_struct and thread_info is the first member of this struct. We can read the cpu number from the thread_info. Here is how the ARM64 JITed assembly changes after this commit: ARM64 JIT =========== BEFORE AFTER -------- ------- int cpu = bpf_get_smp_processor_id(); int cpu = bpf_get_smp_processor_id(); mov x10, #0xfffffffffffff4d0 mrs x10, sp_el0 movk x10, #0x802b, lsl #16 ldr w7, [x10, #24] movk x10, #0x8000, lsl #32 blr x10 add x7, x0, #0x0 Performance improvement using benchmark[1] ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc +---------------+-------------------+-------------------+--------------+ | Name | Before | After | % change | |---------------+-------------------+-------------------+--------------| | glob-arr-inc | 23.380 ± 1.675M/s | 25.893 ± 0.026M/s | + 10.74% | | arr-inc | 23.928 ± 0.034M/s | 25.213 ± 0.063M/s | + 5.37% | | hash-inc | 12.352 ± 0.005M/s | 12.609 ± 0.013M/s | + 2.08% | +---------------+-------------------+-------------------+--------------+ [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240502151854.9810-5-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysarm64, bpf: add internal-only MOV instruction to resolve per-CPU addrsPuranjay Mohan4-0/+38
Support an instruction for resolving absolute addresses of per-CPU data from their per-CPU offsets. This instruction is internal-only and users are not allowed to use them directly. They will only be used for internal inlining optimizations for now between BPF verifier and BPF JITs. Since commit 7158627686f0 ("arm64: percpu: implement optimised pcpu access using tpidr_el1"), the per-cpu offset for the CPU is stored in the tpidr_el1/2 register of that CPU. To support this BPF instruction in the ARM64 JIT, the following ARM64 instructions are emitted: mov dst, src // Move src to dst, if src != dst mrs tmp, tpidr_el1/2 // Move per-cpu offset of the current cpu in tmp. add dst, dst, tmp // Add the per cpu offset to the dst. To measure the performance improvement provided by this change, the benchmark in [1] was used: Before: glob-arr-inc : 23.597 ± 0.012M/s arr-inc : 23.173 ± 0.019M/s hash-inc : 12.186 ± 0.028M/s After: glob-arr-inc : 23.819 ± 0.034M/s arr-inc : 23.285 ± 0.017M/s hash-inc : 12.419 ± 0.011M/s [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240502151854.9810-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysriscv, bpf: inline bpf_get_smp_processor_id()Puranjay Mohan4-0/+42
Inline the calls to bpf_get_smp_processor_id() in the riscv bpf jit. RISCV saves the pointer to the CPU's task_struct in the TP (thread pointer) register. This makes it trivial to get the CPU's processor id. As thread_info is the first member of task_struct, we can read the processor id from TP + offsetof(struct thread_info, cpu). RISCV64 JIT output for `call bpf_get_smp_processor_id` ====================================================== Before After -------- ------- auipc t1,0x848c ld a5,32(tp) jalr 604(t1) mv a5,a0 Benchmark using [1] on Qemu. ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc +---------------+------------------+------------------+--------------+ | Name | Before | After | % change | |---------------+------------------+------------------+--------------| | glob-arr-inc | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s | + 24.04% | | arr-inc | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s | + 23.56% | | hash-inc | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s | + 32.18% | +---------------+------------------+------------------+--------------+ NOTE: This benchmark includes changes from this patch and the previous patch that implemented the per-cpu insn. [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20240502151854.9810-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysriscv, bpf: add internal-only MOV instruction to resolve per-CPU addrsPuranjay Mohan1-0/+24
Support an instruction for resolving absolute addresses of per-CPU data from their per-CPU offsets. This instruction is internal-only and users are not allowed to use them directly. They will only be used for internal inlining optimizations for now between BPF verifier and BPF JITs. RISC-V uses generic per-cpu implementation where the offsets for CPUs are kept in an array called __per_cpu_offset[cpu_number]. RISCV stores the address of the task_struct in TP register. The first element in task_struct is struct thread_info, and we can get the cpu number by reading from the TP register + offsetof(struct thread_info, cpu). Once we have the cpu number in a register we read the offset for that cpu from address: &__per_cpu_offset + cpu_number << 3. Then we add this offset to the destination register. To measure the improvement from this change, the benchmark in [1] was used on Qemu: Before: glob-arr-inc : 1.127 ± 0.013M/s arr-inc : 1.121 ± 0.004M/s hash-inc : 0.681 ± 0.052M/s After: glob-arr-inc : 1.138 ± 0.011M/s arr-inc : 1.366 ± 0.006M/s hash-inc : 0.676 ± 0.001M/s [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20240502151854.9810-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 daysARC: Add eBPF JIT supportShahab Vahedi9-2/+4611
This will add eBPF JIT support to the 32-bit ARCv2 processors. The implementation is qualified by running the BPF tests on a Synopsys HSDK board with "ARC HS38 v2.1c at 500 MHz" as the 4-core CPU. The test_bpf.ko reports 2-10 fold improvements in execution time of its tests. For instance: test_bpf: #33 tcpdump port 22 jited:0 704 1766 2104 PASS test_bpf: #33 tcpdump port 22 jited:1 120 224 260 PASS test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:0 238 PASS test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:1 23 PASS test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:0 2034681 PASS test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:1 1020022 PASS Deployment and structure ------------------------ The related codes are added to "arch/arc/net": - bpf_jit.h -- The interface that a back-end translator must provide - bpf_jit_core.c -- Knows how to handle the input eBPF byte stream - bpf_jit_arcv2.c -- The back-end code that knows the translation logic The bpf_int_jit_compile() at the end of bpf_jit_core.c is the entrance to the whole process. Normally, the translation is done in one pass, namely the "normal pass". In case some relocations are not known during this pass, some data (arc_jit_data) is allocated for the next pass to come. This possible next (and last) pass is called the "extra pass". 1. Normal pass # The necessary pass 1a. Dry run # Get the whole JIT length, epilogue offset, etc. 1b. Emit phase # Allocate memory and start emitting instructions 2. Extra pass # Only needed if there are relocations to be fixed 2a. Patch relocations Support status -------------- The JIT compiler supports BPF instructions up to "cpu=v4". However, it does not yet provide support for: - Tail calls - Atomic operations - 64-bit division/remainder - BPF_PROBE_MEM* (exception table) The result of "test_bpf" test suite on an HSDK board is: hsdk-lnx# insmod test_bpf.ko test_suite=test_bpf test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed] All the failing test cases are due to the ones that were not JIT'ed. Categorically, they can be represented as: .-----------.------------.-------------. | test type | opcodes | # of cases | |-----------+------------+-------------| | atomic | 0xC3, 0xDB | 149 | | div64 | 0x37, 0x3F | 22 | | mod64 | 0x97, 0x9F | 15 | `-----------^------------+-------------| | (total) 186 | `-------------' Setup: build config ------------------- The following configs must be set to have a working JIT test: CONFIG_BPF_JIT=y CONFIG_BPF_JIT_ALWAYS_ON=y CONFIG_TEST_BPF=m The following options are not necessary for the tests module, but are good to have: CONFIG_DEBUG_INFO=y # prerequisite for below CONFIG_DEBUG_INFO_BTF=y # so bpftool can generate vmlinux.h CONFIG_FTRACE=y # CONFIG_BPF_SYSCALL=y # all these options lead to CONFIG_KPROBE_EVENTS=y # having CONFIG_BPF_EVENTS=y CONFIG_PERF_EVENTS=y # Some BPF programs provide data through /sys/kernel/debug: CONFIG_DEBUG_FS=y arc# mount -t debugfs debugfs /sys/kernel/debug Setup: elfutils --------------- The libdw.{so,a} library that is used by pahole for processing the final binary must come from elfutils 0.189 or newer. The support for ARCv2 [1] has been added since that version. [1] https://sourceware.org/git/?p=elfutils.git;a=commit;h=de3d46b3e7 Setup: pahole ------------- The line below in linux/scripts/Makefile.btf must be commented out: pahole-flags-$(call test-ge, $(pahole-ver), 121) += --btf_gen_floats Or else, the build will fail: $ make V=1 ... BTF .btf.vmlinux.bin.o pahole -J --btf_gen_floats \ -j --lang_exclude=rust \ --skip_encoding_btf_inconsistent_proto \ --btf_gen_optimized .tmp_vmlinux.btf Complex, interval and imaginary float types are not supported Encountered error while encoding BTF. ... BTFIDS vmlinux ./tools/bpf/resolve_btfids/resolve_btfids vmlinux libbpf: failed to find '.BTF' ELF section in vmlinux FAILED: load BTF from vmlinux: No data available This is due to the fact that the ARC toolchains generate "complex float" DIE entries in libgcc and at the moment, pahole can't handle such entries. Running the tests ----------------- host$ scp /bld/linux/lib/test_bpf.ko arc: arc # sysctl net.core.bpf_jit_enable=1 arc # insmod test_bpf.ko test_suite=test_bpf ... test_bpf: #1048 Staggered jumps: JMP32_JSLE_X jited:1 697811 PASS test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed] Acknowledgments --------------- - Claudiu Zissulescu for his unwavering support - Yuriy Kolerov for testing and troubleshooting - Vladimir Isaev for the pahole workaround - Sergey Matyukevich for paving the road by adding the interpreter support Signed-off-by: Shahab Vahedi <shahab@synopsys.com> Link: https://lore.kernel.org/r/20240430145604.38592-1-list+bpf@vahedi.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
8 daysMerge branch '40GbE' of ↵Jakub Kicinski8-6/+22
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-05-08 (most Intel drivers) This series contains updates to i40e, iavf, ice, igb, igc, e1000e, and ixgbe drivers. Asbjørn Sloth Tønnesen adds checks against supported flower control flags for i40e, iavf, ice, and igb drivers. Michal corrects filters removed during eswitch release for ice. Corinna Vinschen defers PTP initialization to later in probe so that netdev log entry is initialized on igc. Ilpo Järvinen removes a couple of unused, duplicate defines on e1000e and ixgbe. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: net: e1000e & ixgbe: Remove PCI_HEADER_TYPE_MFD duplicates igc: fix a log entry using uninitialized netdev ice: remove correct filters during eswitch release igb: flower: validate control flags ice: flower: validate control flags iavf: flower: validate control flags i40e: flower: validate control flags ==================== Link: https://lore.kernel.org/r/20240508173342.2760994-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysMerge branch 'net-qede-convert-filter-code-to-use-extack'Jakub Kicinski1-51/+63
Asbjørn Sloth Tønnesen says: ==================== net: qede: convert filter code to use extack This series converts the filter code in the qede driver to use NL_SET_ERR_MSG_*(extack, ...) for error handling. Patch 1-12 converts qede_parse_flow_attr() to use extack, along with all it's static helper functions. qede_parse_flow_attr() is used in two places: - qede_add_tc_flower_fltr() - qede_flow_spec_to_rule() In the latter call site extack is faked in the same way as is done in mlxsw (patch 12). While the conversion is going on, some error messages are silenced in between patch 1-12. If wanted could squash patch 1-12 in a v3, but I felt that it would be easier to review as 12 more trivial patches. Patch 13 and 14, finishes up by converting qede_parse_actions(), and ensures that extack is propagated to it, in both call contexts. v1: https://lore.kernel.org/netdev/20240507104421.1628139-1-ast@fiberby.net/ ==================== Link: https://lore.kernel.org/r/20240508143404.95901-1-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: use extack in qede_parse_actions()Asbjørn Sloth Tønnesen1-2/+3
Convert DP_NOTICE/DP_INFO to NL_SET_ERR_MSG_MOD. Keep edev around for use with QEDE_RSS_COUNT(). Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-15-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: propagate extack through qede_flow_spec_validate()Asbjørn Sloth Tønnesen1-3/+4
Pass extack to qede_flow_spec_validate() when called in qede_flow_spec_to_rule(). Pass extack to qede_parse_actions(). Not converting qede_flow_spec_validate() to use extack for errors, as it's only called from qede_flow_spec_to_rule(), where extack is faked into a DP_NOTICE anyway, so opting to keep DP_VERBOSE/DP_NOTICE usage. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-14-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: use faked extack in qede_flow_spec_to_rule()Asbjørn Sloth Tønnesen1-1/+4
Since qede_parse_flow_attr() now does error reporting through extack, then give it a fake extack and extract the error message afterwards if one was set. The extracted error message is then passed on through DP_NOTICE(), including messages that was earlier issued with DP_INFO(). This fake extack approach is already used by mlxsw_env_linecard_modules_power_mode_apply() in drivers/net/ethernet/mellanox/mlxsw/core_env.c Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-13-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: use extack in qede_parse_flow_attr()Asbjørn Sloth Tønnesen1-12/+14
Convert qede_parse_flow_attr() to take extack, and drop the edev argument. Convert DP_NOTICE calls to use NL_SET_ERR_MSG_* instead. Pass extack in calls to qede_flow_parse_{tcp,udp}_v{4,6}(). In calls to qede_parse_flow_attr(), if extack is unavailable, then use NULL for now, until a subsequent patch makes extack available. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-12-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: add extack in qede_add_tc_flower_fltr()Asbjørn Sloth Tønnesen1-1/+2
Define extack locally, to reduce line lengths and aid future users. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-11-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: use extack in qede_flow_parse_udp_v4()Asbjørn Sloth Tønnesen1-4/+4
Convert qede_flow_parse_udp_v4() to take extack, and drop the edev argument. Pass extack in call to qede_flow_parse_v4_common(). In call to qede_flow_parse_udp_v4(), use NULL as extack for now, until a subsequent patch makes extack available. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-10-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: use extack in qede_flow_parse_udp_v6()Asbjørn Sloth Tønnesen1-4/+4
Convert qede_flow_parse_udp_v6() to take extack, and drop the edev argument. Pass extack in call to qede_flow_parse_v6_common(). In call to qede_flow_parse_udp_v6(), use NULL as extack for now, until a subsequent patch makes extack available. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-9-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: use extack in qede_flow_parse_tcp_v4()Asbjørn Sloth Tønnesen1-4/+4
Convert qede_flow_parse_tcp_v4() to take extack, and drop the edev argument. Pass extack in call to qede_flow_parse_v4_common(). In call to qede_flow_parse_tcp_v4(), use NULL as extack for now, until a subsequent patch makes extack available. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-8-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: use extack in qede_flow_parse_tcp_v6()Asbjørn Sloth Tønnesen1-4/+4
Convert qede_flow_parse_tcp_v6() to take extack, and drop the edev argument. Pass extack in call to qede_flow_parse_v6_common(). In call to qede_flow_parse_tcp_v6(), use NULL as extack for now, until a subsequent patch makes extack available. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-7-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: use extack in qede_flow_parse_v4_common()Asbjørn Sloth Tønnesen1-7/+9
Convert qede_flow_parse_v4_common() to take extack, and drop the edev argument. Convert DP_NOTICE call to use NL_SET_ERR_MSG_MOD instead. Pass extack in calls to qede_flow_parse_ports() and qede_set_v4_tuple_to_profile(). In calls to qede_flow_parse_v4_common(), use NULL as extack for now, until a subsequent patch makes extack available. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-6-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: use extack in qede_flow_parse_v6_common()Asbjørn Sloth Tønnesen1-8/+9
Convert qede_flow_parse_v6_common() to take extack, and drop the edev argument. Convert DP_NOTICE call to use NL_SET_ERR_MSG_MOD instead. Pass extack in calls to qede_flow_parse_ports() and qede_set_v6_tuple_to_profile(). In calls to qede_flow_parse_v6_common(), use NULL as extack for now, until a subsequent patch makes extack available. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-5-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: use extack in qede_set_v4_tuple_to_profile()Asbjørn Sloth Tønnesen1-4/+4
Convert qede_set_v4_tuple_to_profile() to take extack, and drop the edev argument. Convert DP_INFO call to use NL_SET_ERR_MSG_MOD instead. In calls to qede_set_v4_tuple_to_profile(), use NULL as extack for now, until a subsequent patch makes extack available. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-4-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: use extack in qede_set_v6_tuple_to_profile()Asbjørn Sloth Tønnesen1-5/+5
Convert qede_set_v6_tuple_to_profile() to take extack, and drop the edev argument. Convert DP_INFO call to use NL_SET_ERR_MSG_MOD instead. In calls to qede_set_v6_tuple_to_profile(), use NULL as extack for now, until a subsequent patch makes extack available. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-3-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qede: use extack in qede_flow_parse_ports()Asbjørn Sloth Tønnesen1-5/+6
Convert qede_flow_parse_ports to use extack, and drop the edev argument. Convert DP_NOTICE call to use NL_SET_ERR_MSG_MOD instead. In calls to qede_flow_parse_ports(), use NULL as extack for now, until a subsequent patch makes extack available. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508143404.95901-2-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: usb: smsc95xx: stop lying about skb->truesizeEric Dumazet1-8/+7
Some usb drivers try to set small skb->truesize and break core networking stacks. In this patch, I removed one of the skb->truesize override. I also replaced one skb_clone() by an allocation of a fresh and small skb, to get minimally sized skbs, like we did in commit 1e2c61172342 ("net: cdc_ncm: reduce skb truesize in rx path") and 4ce62d5b2f7a ("net: usb: ax88179_178a: stop lying about skb->truesize") v3: also fix a sparse error ( https://lore.kernel.org/oe-kbuild-all/202405091310.KvncIecx-lkp@intel.com/ ) v2: leave the skb_trim() game because smsc95xx_rx_csum_offload() needs the csum part. (Jakub) While we are it, use get_unaligned() in smsc95xx_rx_csum_offload(). Fixes: 2f7ca802bdae ("net: Add SMSC LAN9500 USB2.0 10/100 ethernet adapter driver") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steve Glendinning <steve.glendinning@shawell.net> Cc: UNGLinuxDriver@microchip.com Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240509083313.2113832-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: dsa: microchip: Fix spellig mistake "configur" -> "configure"Colin Ian King1-1/+1
There is a spelling mistake in a dev_err message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240509065023.3033397-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysaf_unix: Add dead flag to struct scm_fp_list.Kuniyuki Iwashima3-4/+12
Commit 1af2dface5d2 ("af_unix: Don't access successor in unix_del_edges() during GC.") fixed use-after-free by avoid accessing edge->successor while GC is in progress. However, there could be a small race window where another process could call unix_del_edges() while gc_in_progress is true and __skb_queue_purge() is on the way. So, we need another marker for struct scm_fp_list which indicates if the skb is garbage-collected. This patch adds dead flag in struct scm_fp_list and set it true before calling __skb_queue_purge(). Fixes: 1af2dface5d2 ("af_unix: Don't access successor in unix_del_edges() during GC.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240508171150.50601-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: ethernet: adi: adin1110: Replace linux/gpio.h by proper oneAndy Shevchenko1-1/+1
linux/gpio.h is deprecated and subject to remove. The driver doesn't use it directly, replace it with what is really being used. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508114519.972082-1-andriy.shevchenko@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysocteontx2-pf: Reuse Transmit queue/Send queue index of HTB classHariprasad Kelam1-1/+79
Real number of Transmit queues are incremented when user enables HTB class and vice versa. Depending on SKB priority driver returns transmit queue (Txq). Transmit queues and Send queues are one-to-one mapped. In few scenarios, Driver is returning transmit queue value which is greater than real number of transmit queue and Stack detects this as error and overwrites transmit queue value. For example user has added two classes and real number of queues are incremented accordingly - tc class add dev eth1 parent 1: classid 1:1 htb rate 100Mbit ceil 100Mbit prio 1 quantum 1024 - tc class add dev eth1 parent 1: classid 1:2 htb rate 100Mbit ceil 200Mbit prio 7 quantum 1024 now if user deletes the class with id 1:1, driver decrements the real number of queues - tc class del dev eth1 classid 1:1 But for the class with id 1:2, driver is returning transmit queue value which is higher than real number of transmit queue leading to below error eth1 selects TX queue x, but real number of TX queues is x This patch solves the problem by assigning deleted class transmit queue/send queue to active class. Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508070935.11501-1-hkelam@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysMerge branch 'gve-minor-cleanups'Jakub Kicinski2-27/+19
Simon Horman says: ==================== gve: Minor cleanups This short patchset provides two minor cleanups for the gve driver. These were found by tooling as mentioned in each patch, and otherwise by inspection. No change in run time behaviour is intended. Each patch is compile tested only. v1: https://lore.kernel.org/r/20240503-gve-comma-v1-0-b50f965694ef@kernel.org ==================== Link: https://lore.kernel.org/r/20240508-gve-comma-v2-0-1ac919225f13@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysgve: Use ethtool_sprintf/puts() to fill stats stringsSimon Horman1-25/+17
Make use of standard helpers to simplify filling in stats strings. The first two ethtool_puts() changes address the following fortification warnings flagged by W=1 builds with clang-18. (The last ethtool_puts change does not because the warning relates to writing beyond the first element of an array, and gve_gstrings_priv_flags only has one element.) .../fortify-string.h:562:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 562 | __read_overflow2_field(q_size_field, size); | ^ .../fortify-string.h:562:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] Likewise, the same changes resolve the same problems flagged by Smatch. .../gve_ethtool.c:100 gve_get_strings() error: __builtin_memcpy() '*gve_gstrings_main_stats' too small (32 vs 576) .../gve_ethtool.c:120 gve_get_strings() error: __builtin_memcpy() '*gve_gstrings_adminq_stats' too small (32 vs 512) Compile tested only. Reviewed-by: Shailend Chand <shailend@google.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Simon Horman <horms@kernel.org> Acked-by: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/r/20240508-gve-comma-v2-2-1ac919225f13@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysgve: Avoid unnecessary use of comma operatorSimon Horman1-2/+2
Although it does not seem to have any untoward side-effects, the use of ';' to separate to assignments seems more appropriate than ','. Flagged by clang-18 -Wcomma No functional change intended. Compile tested only. Reviewed-by: Shailend Chand <shailend@google.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240508-gve-comma-v2-1-1ac919225f13@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysselftests: net: increase the delay for relative cmsg_time.sh testJakub Kicinski2-16/+23
Slow machines can delay scheduling of the packets for milliseconds. Increase the delay to 8ms if KSFT_MACHINE_SLOW. Try to limit the variability by moving setsockopts earlier (before we read time). This fixes the "TXTIME rel" failures on debug kernels, like: Case ICMPv4 - TXTIME rel returned '', expected 'OK' Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240510005705.43069-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysselftests: net: fix timestamp not arriving in cmsg_time.shJakub Kicinski1-5/+15
On slow machines the SND timestamp sometimes doesn't arrive before we quit. The test only waits as long as the packet delay, so it's easy for a race condition to happen. Double the wait but do a bit of polling, once the SND timestamp arrives there's no point to wait any longer. This fixes the "TXTIME abs" failures on debug kernels, like: Case ICMPv4 - TXTIME abs returned '', expected 'OK' Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240510005705.43069-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysvirtio_net: Fix memory leak in virtnet_rx_mod_workDaniel Jurgens1-2/+1
The pointer delcaration was missing the __free(kfree). Fixes: ff7c7d9f5261 ("virtio_net: Remove command data from control_buf") Reported-by: Jens Axboe <axboe@kernel.dk> Closes: https://lore.kernel.org/netdev/0674ca1b-020f-4f93-94d0-104964566e3f@kernel.dk/ Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Tested-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20240509183634.143273-1-danielj@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysbnxt_en: silence clang build warningVadim Fedorenko1-1/+1
Clang build brings a warning: ../drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:133:12: warning: comparison of distinct pointer types ('typeof (tmo_us) *' (aka 'unsigned int *') and 'typeof (65535) *' (aka 'int *')) [-Wcompare-distinct-pointer-types] 133 | tmo_us = min(tmo_us, BNXT_PTP_QTS_MAX_TMO_US); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix it by specifying proper type for BNXT_PTP_QTS_MAX_TMO_US. Fixes: 7de3c2218eed ("bnxt_en: Add a timeout parameter to bnxt_hwrm_port_ts_query()") Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240509151833.12579-1-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysMerge tag 'gtp-24-05-07' of ↵David S. Miller4-130/+745
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/gtp Pablo neira Ayuso says: ==================== gtp pull request 24-05-07 This v3 includes: - fix for clang uninitialized variable per Jakub. - address Smatch and Coccinelle reports per Simon - remove inline in new IPv6 support per Simon - fix memleaks in netlink control plane per Simon -o- The following patchset contains IPv6 GTP driver support for net-next, this also includes IPv6 over IPv4 and vice-versa: Patch #1 removes a unnecessary stack variable initialization in the socket routine. Patch #2 deals with GTP extension headers. This variable length extension header to decapsulate packets accordingly. Otherwise, packets are dropped when these extension headers are present which breaks interoperation with other non-Linux based GTP implementations. Patch #3 prepares for IPv6 support by moving IPv4 specific fields in PDP context objects to a union. Patch #4 adds IPv6 support while retaining backward compatibility. Three new attributes allows to declare an IPv6 GTP tunnel GTPA_FAMILY, GTPA_PEER_ADDR6 and GTPA_MS_ADDR6 as well as IFLA_GTP_LOCAL6 to declare the IPv6 GTP UDP socket. Up to this patch, only IPv6 outer in IPv6 inner is supported. Patch #5 uses IPv6 address /64 prefix for UE/MS in the inner headers. Unlike IPv4, which provides a 1:1 mapping between UE/MS, IPv6 tunnel encapsulates traffic for /64 address as specified by 3GPP TS. Patch has been split from Patch #4 to highlight this behaviour. Patch #6 passes up IPv6 link-local traffic, such as IPv6 SLAAC, for handling to userspace so they are handled as control packets. Patch #7 prepares to allow for GTP IPv4 over IPv6 and vice-versa by moving IP specific debugging out of the function to build IPv4 and IPv6 GTP packets. Patch #8 generalizes TOS/DSCP handling following similar approach as in the existing iptunnel infrastructure. Patch #9 adds a helper function to build an IPv4 GTP packet in the outer header. Patch #10 adds a helper function to build an IPv6 GTP packet in the outer header. Patch #11 adds support for GTP IPv4-over-IPv6 and vice-versa. Patch #12 allows to use the same TID/TEID (tunnel identifier) for inner IPv4 and IPv6 packets for better UE/MS dual stack integration. This series integrates with the osmocom.org project CI and TTCN-3 test infrastructure (Oliver Smith) as well as the userspace libgtpnl library. Thanks to Harald Welte, Oliver Smith and Pau Espin for reviewing and providing feedback through the osmocom.org redmine platform to make this happen. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
8 daysnetfilter: nf_tables: allow clone callbacks to sleepFlorian Westphal8-21/+23
Sven Auhagen reports transaction failures with following error: ./main.nft:13:1-26: Error: Could not process rule: Cannot allocate memory percpu: allocation failed, size=16 align=8 atomic=1, atomic alloc failed, no space left This points to failing pcpu allocation with GFP_ATOMIC flag. However, transactions happen from user context and are allowed to sleep. One case where we can call into percpu allocator with GFP_ATOMIC is nft_counter expression. Normally this happens from control plane, so this could use GFP_KERNEL instead. But one use case, element insertion from packet path, needs to use GFP_ATOMIC allocations (nft_dynset expression). At this time, .clone callbacks always use GFP_ATOMIC for this reason. Add gfp_t argument to the .clone function and pass GFP_KERNEL or GFP_ATOMIC flag depending on context, this allows all clone memory allocations to sleep for the normal (transaction) case. Cc: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 daysselftests: netfilter: add packetdrill based conntrack testsFlorian Westphal10-0/+475
Add a new test script that uses packetdrill tool to exercise conntrack state machine. Needs ip/ip6tables and conntrack tool (to check if we have an entry in the expected state). Test cases added here cover following scenarios: 1. already-acked (retransmitted) packets are not tagged as INVALID 2. RST packet coming when conntrack is already closing (FIN/CLOSE_WAIT) transitions conntrack to CLOSE even if the RST is not an exact match 3. RST packets with out-of-window sequence numbers are marked as INVALID 4. SYN+Challenge ACK: check that challenge ack is allowed to pass 5. Old SYN/ACK: check conntrack handles the case where SYN is answered with SYN/ACK for an old, previous connection attempt 6. Check SYN reception while in ESTABLISHED state generates a challenge ack, RST response clears 'outdated' state + next SYN retransmit gets us into 'SYN_RECV' conntrack state. Tests get run twice, once with ipv4 and once with ipv6. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 daysnetfilter: nft_set_pipapo: remove dirty flagFlorian Westphal2-27/+0
After previous change: ->clone exists: ->dirty is always true ->clone == NULL ->dirty is always false So remove this flag. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 daysnetfilter: nft_set_pipapo: move cloning of match info to insert/removal pathFlorian Westphal1-21/+49
This set type keeps two copies of the sets' content, priv->match (live version, used to match from packet path) priv->clone (work-in-progress version of the 'future' priv->match). All additions and removals are done on priv->clone. When transaction completes, priv->clone becomes priv->match and a new clone is allocated for use by next transaction. Problem is that the cloning requires GFP_KERNEL allocations but we cannot fail at either commit or abort time. This patch defers the clone until we get an insertion or removal request. This allows us to handle OOM situations correctly. This also allows to remove ->dirty in a followup change: If ->clone exists, ->dirty is always true If ->clone is NULL, ->dirty is always false, no elements were added or removed (except catchall elements which are external to the specific set backend). Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 daysnetfilter: nft_set_pipapo: prepare pipapo_get helper for on-demand cloneFlorian Westphal1-7/+10
The helper uses priv->clone unconditionally which will fail once we do the clone conditionally on first insert or removal. 'nft get element' from userspace needs to use priv->match since this runs from rcu read side lock section. Prepare for this by passing the match backend data as argument. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 daysnet: ipv6: fix wrong start position when receive hop-by-hop fragmentgaoxingwang1-1/+1
In IPv6, ipv6_rcv_core will parse the hop-by-hop type extension header and increase skb->transport_header by one extension header length. But if there are more other extension headers like fragment header at this time, the skb->transport_header points to the second extension header, not the transport layer header or the first extension header. This will result in the start and nexthdrp variable not pointing to the same position in ipv6frag_thdr_trunced, and ipv6_skip_exthdr returning incorrect offset and frag_off.Sometimes,the length of the last sharded packet is smaller than the calculated incorrect offset, resulting in packet loss. We can use network header to offset and calculate the correct position to solve this problem. Fixes: 9d9e937b1c8b (ipv6/netfilter: Discard first fragment not including all headers) Signed-off-by: Gao Xingwang <gaoxingwang1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
9 daystcp: get rid of twsk_unique()Eric Dumazet5-13/+5
DCCP is going away soon, and had no twsk_unique() method. We can directly call tcp_twsk_unique() for TCP sockets. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240507164140.940547-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 daysnet/sched: adjust device watchdog timer to detect stopped queue at right timePraveen Kumar Kannoju1-4/+7
Applications are sensitive to long network latency, particularly heartbeat monitoring ones. Longer the tx timeout recovery higher the risk with such applications on a production machines. This patch remedies, yet honoring device set tx timeout. Modify watchdog next timeout to be shorter than the device specified. Compute the next timeout be equal to device watchdog timeout less the how long ago queue stop had been done. At next watchdog timeout tx timeout handler is called into if still in stopped state. Either called or not called, restore the watchdog timeout back to device specified. Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com> Link: https://lore.kernel.org/r/20240508133617.4424-1-praveen.kannoju@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 dayskbuild,bpf: Switch to using --btf_features for pahole v1.26 and laterAlan Maguire1-2/+13
The btf_features list can be used for pahole v1.26 and later - it is useful because if a feature is not yet implemented it will not exit with a failure message. This will allow us to add feature requests to the pahole options without having to check pahole versions in future; if the version of pahole supports the feature it will be added. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240507135514.490467-1-alan.maguire@oracle.com
9 daysMerge branch 'use network helpers, part 4'Martin KaFai Lau5-150/+59
Geliang Tang says: ==================== From: Geliang Tang <tanggeliang@kylinos.cn> This patchset adds post_socket_cb pointer into struct network_helper_opts to make start_server_addr() helper more flexible. With these modifications, many duplicate codes can be dropped. Patches 1-3 address Martin's comments in the previous series. ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
9 daysselftests/bpf: Drop get_port in test_tcp_check_syncookieGeliang Tang1-18/+3
The arguments "addr" and "len" of run_test() have dropped. This makes function get_port() useless. Drop it from test_tcp_check_syncookie_user.c. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/a9b5c8064ab4cbf0f68886fe0e4706428b8d0d47.1714907662.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
9 daysselftests/bpf: Use connect_to_fd in test_tcp_check_syncookieGeliang Tang1-33/+5
This patch uses public helper connect_to_fd() exported in network_helpers.h instead of the local defined function connect_to_server() in test_tcp_check_syncookie_user.c. This can avoid duplicate code. Then the arguments "addr" and "len" of run_test() become useless, drop them too. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/e0ae6b790ac0abc7193aadfb2660c8c9eb0fe1f0.1714907662.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
9 daysselftests/bpf: Use connect_to_fd in sockopt_inheritGeliang Tang1-30/+1
This patch uses public helper connect_to_fd() exported in network_helpers.h instead of the local defined function connect_to_server() in prog_tests/sockopt_inherit.c. This can avoid duplicate code. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/71db79127cc160b0643fd9a12c70ae019ae076a1.1714907662.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>