net: hold instance lock during NETDEV_CHANGE

Cosmin reports an issue with ipv6_add_dev being called from
NETDEV_CHANGE notifier:

[ 3455.008776]  ? ipv6_add_dev+0x370/0x620
[ 3455.010097]  ipv6_find_idev+0x96/0xe0
[ 3455.010725]  addrconf_add_dev+0x1e/0xa0
[ 3455.011382]  addrconf_init_auto_addrs+0xb0/0x720
[ 3455.013537]  addrconf_notify+0x35f/0x8d0
[ 3455.014214]  notifier_call_chain+0x38/0xf0
[ 3455.014903]  netdev_state_change+0x65/0x90
[ 3455.015586]  linkwatch_do_dev+0x5a/0x70
[ 3455.016238]  rtnl_getlink+0x241/0x3e0
[ 3455.019046]  rtnetlink_rcv_msg+0x177/0x5e0

Similarly, linkwatch might get to ipv6_add_dev without ops lock:
[ 3456.656261]  ? ipv6_add_dev+0x370/0x620
[ 3456.660039]  ipv6_find_idev+0x96/0xe0
[ 3456.660445]  addrconf_add_dev+0x1e/0xa0
[ 3456.660861]  addrconf_init_auto_addrs+0xb0/0x720
[ 3456.661803]  addrconf_notify+0x35f/0x8d0
[ 3456.662236]  notifier_call_chain+0x38/0xf0
[ 3456.662676]  netdev_state_change+0x65/0x90
[ 3456.663112]  linkwatch_do_dev+0x5a/0x70
[ 3456.663529]  __linkwatch_run_queue+0xeb/0x200
[ 3456.663990]  linkwatch_event+0x21/0x30
[ 3456.664399]  process_one_work+0x211/0x610
[ 3456.664828]  worker_thread+0x1cc/0x380
[ 3456.665691]  kthread+0xf4/0x210

Reclassify NETDEV_CHANGE as a notifier that consistently runs under the
instance lock.

Link: https://lore.kernel.org/netdev/aac073de8beec3e531c86c101b274d434741c28e.camel@nvidia.com/
Reported-by: Cosmin Ratiu <cratiu@nvidia.com>
Tested-by: Cosmin Ratiu <cratiu@nvidia.com>
Fixes: ad7c7b2172 ("net: hold netdev instance lock during sysfs operations")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250404161122.3907628-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit is contained in:
Stanislav Fomichev
2025-04-04 09:11:22 -07:00
committed by Jakub Kicinski
parent 54f5fafcce
commit 04efcee6ef
10 changed files with 63 additions and 31 deletions

View File

@@ -338,10 +338,11 @@ operations directly under the netdev instance lock.
Devices drivers are encouraged to rely on the instance lock where possible.
For the (mostly software) drivers that need to interact with the core stack,
there are two sets of interfaces: ``dev_xxx`` and ``netif_xxx`` (e.g.,
``dev_set_mtu`` and ``netif_set_mtu``). The ``dev_xxx`` functions handle
acquiring the instance lock themselves, while the ``netif_xxx`` functions
assume that the driver has already acquired the instance lock.
there are two sets of interfaces: ``dev_xxx``/``netdev_xxx`` and ``netif_xxx``
(e.g., ``dev_set_mtu`` and ``netif_set_mtu``). The ``dev_xxx``/``netdev_xxx``
functions handle acquiring the instance lock themselves, while the
``netif_xxx`` functions assume that the driver has already acquired
the instance lock.
Notifiers and netdev instance lock
==================================
@@ -354,6 +355,7 @@ For devices with locked ops, currently only the following notifiers are
running under the lock:
* ``NETDEV_REGISTER``
* ``NETDEV_UP``
* ``NETDEV_CHANGE``
The following notifiers are running without the lock:
* ``NETDEV_UNREGISTER``