Discussion: [dpdk-dev] Problems running netvsc multiq
Mohammed Gamal
2018-11-30 11:04:41 UTC
Hi All,
I am having the following errors when I run testpmd with the netvsc
driver and --txq 2 and --rxq 2 options:

testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=155456,
size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
hn_dev_configure():  >>
hn_rndis_link_status(): link status 0x40020006
hn_subchan_configure(): open 1 subchannels
vmbus_uio_get_subchan(): invalid subchannel id 0
hn_subchan_configure(): open subchannel failed: -5
hn_dev_configure(): subchannel configuration failed
Port0 dev_configure = -5
hn_dev_rx_queue_release():  >>
hn_dev_rx_queue_release():  >>
hn_dev_tx_queue_release():  >>
hn_dev_tx_queue_release():  >>
Fail to configure port 0
EAL: Error - exiting with code: 1
  Cause: Start ports failed

Multiq support was enabled and the kernel module was loaded. The full
command line was:
./testpmd -l 0-1 -n 2 --log-level=8 --log-level='pmd.*,8' --log-
level='bus.vmbus,8' -- --port-topology=chained --forward-mode=rxonly --
stats-period 1 --eth-peer=0,00:15:5d:1e:20:c0 --txq 2 --rxq 2

I am running latest upstream kernel from the Linus tree and latest DPDK
upstream from git.dpdk.org.

Could you also reproduce this? If not, what could I be missing?

Regards,
Mohammed
Stephen Hemminger
2018-11-30 18:27:56 UTC
On Fri, 30 Nov 2018 12:04:41 +0100
Post by Mohammed Gamal
Hi All,
I am having the following errors when I run testpmd with the netvsc
driver and --txq 2 and --rxq 2 options:
[...]
Investigating now.
Does single queue work for you?
Mohammed Gamal
2018-11-30 19:06:52 UTC
Post by Stephen Hemminger
On Fri, 30 Nov 2018 12:04:41 +0100
[...]
Investigating now.
Does single queue work for you?
Yes it does.
Stephen Hemminger
2018-12-04 16:48:58 UTC
On Fri, 30 Nov 2018 14:06:52 -0500 (EST)
Post by Mohammed Gamal
[...]
Yes it does.
What version of Windows are you running on? Multi-queue requires WS2016 or later.
The driver is missing a check of NDIS version, will add that.
Mohammed Gamal
2018-12-04 16:56:11 UTC
Post by Stephen Hemminger
[...]
What version of Windows are you running on? Multi-queue requires WS2016 or later.
The driver is missing a check of NDIS version, will add that.
I am running WS2016.
Stephen Hemminger
2018-12-05 22:12:46 UTC
On WS2016 and 4.19.7 kernel (with the 4 patches), this is what I see:

$ sudo ./testpmd -l 0-1 -n2 --log-level=8 --log-level='pmd.*,8' --log-level='bus.vmbus,8' -- --port-topology=chained --forward-mode=rxonly --stats-period 1 --eth-peer=0,00:15:5d:1e:20:c0 --txq 2 --rxq 2
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
vmbus_scan_one(): Adding vmbus device 2dd1ce17-079e-403c-b352-a1921ee207ee
vmbus_scan_one(): Adding vmbus device 635a7ae3-091e-4410-ad59-667c4f8c04c3
vmbus_scan_one(): Adding vmbus device 58f75a6d-d949-4320-99e1-a2a2576d581c
vmbus_scan_one(): Adding vmbus device 242ff919-07db-4180-9c2e-b86cb68c8c55
vmbus_scan_one(): Adding vmbus device 7cb9f65d-684d-44dc-9d55-13d40dd60570
vmbus_scan_one(): Adding vmbus device a6dcdcb3-c4da-445a-bc12-9050eb9cebfc
vmbus_scan_one(): Adding vmbus device 2450ee40-33bf-4fbd-892e-9fb06e9214cf
vmbus_scan_one(): Adding vmbus device 99221fa0-24ad-11e2-be98-001aa01bbf6e
vmbus_scan_one(): Adding vmbus device d34b2567-b9b6-42b9-8778-0a4ec0b955bf
vmbus_scan_one(): Adding vmbus device fd149e91-82e0-4a7d-afa6-2a4166cbd7c0
vmbus_scan_one(): Adding vmbus device b6650ff7-33bc-4840-8048-e0676786f393
vmbus_scan_one(): Adding vmbus device 5620e0c7-8062-4dce-aeb7-520c7ef76171
vmbus_scan_one(): Adding vmbus device 1eccfd72-4b41-45ef-b73a-4a6e44c12924
vmbus_scan_one(): Adding vmbus device 4487b255-b88c-403f-bb51-d1f69cf17f87
vmbus_scan_one(): Adding vmbus device b5fa4c59-1916-4725-935f-5c8d09d596c5
vmbus_scan_one(): Adding vmbus device b30ed368-1a6f-4921-8d2b-4160a0dfc667
vmbus_scan_one(): Adding vmbus device f5bee29c-1741-4aad-a4c2-8fdedb46dcc2
EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
EAL: PCI device 935f:00:02.0 on NUMA socket 0
EAL: probe driver: 15b3:1014 net_mlx5
net_mlx5: checking device "mlx5_0"
net_mlx5: PCI information matches for device "mlx5_0"
net_mlx5: no switch support detected
net_mlx5: MPW isn't supported
net_mlx5: SWP support: 0
net_mlx5: tunnel offloading is supported
net_mlx5: MPLS over GRE/UDP tunnel offloading disabled due to old OFED/rdma-core version or firmware configuration
net_mlx5: naming Ethernet device "935f:00:02.0"
net_mlx5: port is not active: "down" (1)
net_mlx5: checksum offloading is supported
net_mlx5: counters are not supported
net_mlx5: maximum Rx indirection table size is 512
net_mlx5: VLAN stripping is supported
net_mlx5: FCS stripping configuration is supported
net_mlx5: hardware Rx end alignment padding is not supported
net_mlx5: MPS is disabled
net_mlx5: port 0 reserved UAR address space: 0x7f5523f6f000
net_mlx5: port 0 MAC address is 00:15:5d:2a:16:66
net_mlx5: port 0 MTU is 1500
net_mlx5: port 0 forcing Ethernet interface up
net_mlx5: port 0 flow maximum priority: 5
dpaax: read_memory_node(): Unable to glob device-tree memory node: (/proc/device-tree/memory[@0-9]*/reg)(3)
dpaax: PA->VA translation not available;
dpaax: Expect performance impact.
vmbus_probe_one_driver(): VMBUS device 635a7ae3-091e-4410-ad59-667c4f8c04c3 on NUMA socket 0
vmbus_probe_one_driver(): probe driver: net_netvsc
eth_hn_probe(): >>
eth_hn_dev_init(): >>
hn_nvs_init(): NVS version 0x60001, NDIS version 6.30
hn_nvs_conn_rxbuf(): connect rxbuff va=0x2200402000 gpad=0xe1e2f
hn_nvs_conn_rxbuf(): receive buffer size 1728 count 9102
hn_nvs_conn_chim(): connect send buf va=0x2201302000 gpad=0xe1e30
hn_nvs_conn_chim(): send buffer 15728640 section size:6144, count:2560
hn_rndis_init(): RNDIS ver 1.0, aggpkt size 4026531839, aggpkt cnt 8, aggpkt align 8
hn_nvs_handle_vfassoc(): VF serial 2 add to port 1
hn_rndis_link_status(): link status 0x4001000b
hn_rndis_set_rxfilter(): set RX filter 0 done
hn_tx_pool_init(): create a TX send pool hn_txd_1 n=2560 size=32 socket=0
hn_rndis_get_eaddr(): MAC address 00:15:5d:2a:16:66
eth_hn_dev_init(): VMBus max channels 64
hn_rndis_query_rsscaps(): RX rings 64 indirect 128 caps 0x301
eth_hn_dev_init(): Adding VF device
hn_vf_attach(): Attach VF device 0
hn_nvs_set_datapath(): set datapath VF
vmbus_probe_one_driver(): VMBUS device 7cb9f65d-684d-44dc-9d55-13d40dd60570 on NUMA socket 0
rte_vmbus_map_device(): Not managed by UIO driver, skipped
vmbus_probe_one_driver(): VMBUS device b30ed368-1a6f-4921-8d2b-4160a0dfc667 on NUMA socket 0
rte_vmbus_map_device(): Not managed by UIO driver, skipped
Set rxonly packet forwarding mode
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=155456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 1 (socket 0)
hn_dev_configure(): >>
hn_rndis_link_status(): link status 0x40020006
hn_subchan_configure(): open 1 subchannels
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 20
hn_subchan_configure(): new sub channel 1
hn_rndis_conf_rss(): >>
_hn_vf_configure(): enabling LSC for VF 0
net_mlx5: port 0 Tx queues number update: 0 -> 2
net_mlx5: port 0 Rx queues number update: 0 -> 2
hn_dev_tx_queue_setup(): >>
net_mlx5: port 0 configuring queue 0 for 256 descriptors
net_mlx5: port 0 priv->device_attr.max_qp_wr is 32768
net_mlx5: port 0 priv->device_attr.max_sge is 30
net_mlx5: port 0 adding Tx queue 0 to list
hn_dev_tx_queue_setup(): >>
net_mlx5: port 0 configuring queue 1 for 256 descriptors
net_mlx5: port 0 priv->device_attr.max_qp_wr is 32768
net_mlx5: port 0 priv->device_attr.max_sge is 30
net_mlx5: port 0 adding Tx queue 1 to list
hn_dev_rx_queue_setup(): >>
net_mlx5: port 0 configuring Rx queue 0 for 256 descriptors
net_mlx5: port 0 maximum number of segments per packet: 1
net_mlx5: port 0 CRC stripping is enabled, 0 bytes will be subtracted from incoming frames to hide it
net_mlx5: port 0 adding Rx queue 0 to list
hn_dev_rx_queue_setup(): >>
net_mlx5: port 0 configuring Rx queue 1 for 256 descriptors
net_mlx5: port 0 maximum number of segments per packet: 1
net_mlx5: port 0 CRC stripping is enabled, 0 bytes will be subtracted from incoming frames to hide it
net_mlx5: port 0 adding Rx queue 1 to list
hn_dev_start(): >>
hn_rndis_set_rxfilter(): set RX filter 0xd done
net_mlx5: port 0 starting device
net_mlx5: port 0 Tx queue 0 allocated and configured 256 WRs
net_mlx5: port 0: uar_mmap_offset 0x6000
net_mlx5: port 0 Tx queue 1 allocated and configured 256 WRs
net_mlx5: port 0: uar_mmap_offset 0x6000
net_mlx5: port 0 Rx queue 0 registering mp mbuf_pool_socket_0 having 1 chunks
net_mlx5: port 0 creating a MR using address (0x1611be400)
net_mlx5: port 0 inserting MR(0x161184e80) to global cache
net_mlx5: inserted B-tree(0x17ffe85b8)[1], [0x140000000, 0x180000000) lkey=0x4040800
net_mlx5: inserted B-tree(0x16119475e)[1], [0x140000000, 0x180000000) lkey=0x4040800
net_mlx5: port 0 Rx queue 0 allocated and configured 256 segments (max 256 packets)
net_mlx5: port 0 priv->device_attr.max_qp_wr is 32768
net_mlx5: port 0 priv->device_attr.max_sge is 30
net_mlx5: port 0 rxq 0 updated with 0x7ffca42c1388
net_mlx5: port 0 Rx queue 1 registering mp mbuf_pool_socket_0 having 1 chunks
net_mlx5: inserted B-tree(0x16119145e)[1], [0x140000000, 0x180000000) lkey=0x4040800
net_mlx5: port 0 Rx queue 1 allocated and configured 256 segments (max 256 packets)
net_mlx5: port 0 priv->device_attr.max_qp_wr is 32768
net_mlx5: port 0 priv->device_attr.max_sge is 30
net_mlx5: port 0 rxq 1 updated with 0x7ffca42c1388
net_mlx5: port 0 selected Rx vectorized function
net_mlx5: port 0 setting primary MAC address
hn_rndis_set_rxfilter(): set RX filter 0x9 done
hn_rndis_set_rxfilter(): set RX filter 0x9 done
Port 1: 00:15:5D:2A:16:66
Checking link statuses...
Done
hn_rndis_set_rxfilter(): set RX filter 0x20 done
No commandline core given, start packet forwarding
rxonly packet forwarding - ports=1 - cores=1 - streams=2 - NUMA support enabled, MP allocation mode: native
Logical Core 1 (socket 0) forwards packets on 2 streams:
RX P=1/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
RX P=1/Q=1 (socket 0) -> TX P=1/Q=1 (socket 0) peer=02:00:00:00:00:01

rxonly packet forwarding packets/burst=32
nb forwarding cores=1 - nb forwarding ports=1
port 1: RX queue number: 2 Tx queue number: 2
Rx offloads=0x0 Tx offloads=0x0
RX queue: 0
RX desc=0 - RX free threshold=0
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
RX Offloads=0x0
TX queue: 0
TX desc=0 - TX free threshold=0
TX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX offloads=0x0 - TX RS bit threshold=0


Port statistics ====================================
######################## NIC statistics for port 1 ########################
RX-packets: 0 RX-missed: 0 RX-bytes: 0
RX-errors: 0
RX-nombuf: 0
TX-packets: 0 TX-errors: 0 TX-bytes: 0

Throughput (since last show)
Rx-pps: 0
Tx-pps: 0
############################################################################




Also make sure you have N VCPU >= N DPDK queues.
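As a quick sanity check, something like this compares the online vCPU count against the requested queue count (an illustrative standalone sketch, not DPDK code; nb_queues stands in for the --txq/--rxq value):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
	long vcpus = sysconf(_SC_NPROCESSORS_ONLN);	/* online vCPUs in the guest */
	int nb_queues = 2;				/* e.g. --txq 2 --rxq 2 */

	if (vcpus < nb_queues) {
		fprintf(stderr, "only %ld vCPUs for %d queues\n", vcpus, nb_queues);
		return 1;
	}
	printf("ok: %ld vCPUs >= %d queues\n", vcpus, nb_queues);
	return 0;
}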
Stephen Hemminger
2018-12-05 22:32:38 UTC
The problem is a regression in the 4.20 kernel. Bisecting now.
The 4.19.7 kernel works.

Failure with logging is:

$ sudo ./testpmd -l 0-1 -n2 --log-level=8 --log-level='pmd.*,8' --log-level='bus.vmbus,8' -- --port-topology=chained --forward-mode=rxonly --stats-period 1 --eth-peer=0,00:15:5d:1e:20:c0 --txq 2 --rxq 2
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
vmbus_scan_one(): Adding vmbus device 2dd1ce17-079e-403c-b352-a1921ee207ee
vmbus_scan_one(): Adding vmbus device 635a7ae3-091e-4410-ad59-667c4f8c04c3
vmbus_scan_one(): Adding vmbus device 58f75a6d-d949-4320-99e1-a2a2576d581c
vmbus_scan_one(): Adding vmbus device 242ff919-07db-4180-9c2e-b86cb68c8c55
vmbus_scan_one(): Adding vmbus device 7cb9f65d-684d-44dc-9d55-13d40dd60570
vmbus_scan_one(): Adding vmbus device a6dcdcb3-c4da-445a-bc12-9050eb9cebfc
vmbus_scan_one(): Adding vmbus device 2450ee40-33bf-4fbd-892e-9fb06e9214cf
vmbus_scan_one(): Adding vmbus device 99221fa0-24ad-11e2-be98-001aa01bbf6e
vmbus_scan_one(): Adding vmbus device d34b2567-b9b6-42b9-8778-0a4ec0b955bf
vmbus_scan_one(): Adding vmbus device fd149e91-82e0-4a7d-afa6-2a4166cbd7c0
vmbus_scan_one(): Adding vmbus device b6650ff7-33bc-4840-8048-e0676786f393
vmbus_scan_one(): Adding vmbus device 5620e0c7-8062-4dce-aeb7-520c7ef76171
vmbus_scan_one(): Adding vmbus device 1eccfd72-4b41-45ef-b73a-4a6e44c12924
vmbus_scan_one(): Adding vmbus device 4487b255-b88c-403f-bb51-d1f69cf17f87
vmbus_scan_one(): Adding vmbus device b5fa4c59-1916-4725-935f-5c8d09d596c5
vmbus_scan_one(): Adding vmbus device b30ed368-1a6f-4921-8d2b-4160a0dfc667
vmbus_scan_one(): Adding vmbus device f5bee29c-1741-4aad-a4c2-8fdedb46dcc2
EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
EAL: PCI device 935f:00:02.0 on NUMA socket 0
EAL: probe driver: 15b3:1014 net_mlx5
net_mlx5: checking device "mlx5_0"
net_mlx5: PCI information matches for device "mlx5_0"
net_mlx5: no switch support detected
net_mlx5: MPW isn't supported
net_mlx5: SWP support: 0
net_mlx5: tunnel offloading is supported
net_mlx5: MPLS over GRE/UDP tunnel offloading disabled due to old OFED/rdma-core version or firmware configuration
net_mlx5: naming Ethernet device "935f:00:02.0"
net_mlx5: port is not active: "down" (1)
net_mlx5: checksum offloading is supported
net_mlx5: counters are not supported
net_mlx5: maximum Rx indirection table size is 512
net_mlx5: VLAN stripping is supported
net_mlx5: FCS stripping configuration is supported
net_mlx5: hardware Rx end alignment padding is not supported
net_mlx5: MPS is disabled
net_mlx5: port 0 reserved UAR address space: 0x7f346659e000
net_mlx5: port 0 MAC address is 00:15:5d:2a:16:66
net_mlx5: port 0 MTU is 1500
net_mlx5: port 0 forcing Ethernet interface up
net_mlx5: port 0 flow maximum priority: 5
vmbus_probe_one_driver(): VMBUS device 635a7ae3-091e-4410-ad59-667c4f8c04c3 on NUMA socket 0
vmbus_probe_one_driver(): probe driver: net_netvsc
eth_hn_probe(): >>
eth_hn_dev_init(): >>
hn_nvs_init(): NVS version 0x60001, NDIS version 6.30
hn_nvs_conn_rxbuf(): connect rxbuff va=0x2200402000 gpad=0xe1e2d
hn_nvs_conn_rxbuf(): receive buffer size 1728 count 18811
hn_nvs_conn_chim(): connect send buf va=0x2202302000 gpad=0xe1e2e
hn_nvs_conn_chim(): send buffer 16777216 section size:6144, count:2730
hn_nvs_handle_vfassoc(): VF serial 2 add to port 1
hn_rndis_init(): RNDIS ver 1.0, aggpkt size 4026531839, aggpkt cnt 8, aggpkt align 8
hn_rndis_link_status(): link status 0x4001000b
hn_rndis_set_rxfilter(): set RX filter 0 done
hn_tx_pool_init(): create a TX send pool hn_txd_1 n=2730 size=32 socket=0
hn_rndis_get_eaddr(): MAC address 00:15:5d:2a:16:66
eth_hn_dev_init(): VMBus max channels 64
hn_rndis_query_rsscaps(): RX rings 64 indirect 128 caps 0x301
eth_hn_dev_init(): Adding VF device
hn_vf_attach(): Attach VF device 0
hn_nvs_set_datapath(): set datapath VF
vmbus_probe_one_driver(): VMBUS device 7cb9f65d-684d-44dc-9d55-13d40dd60570 on NUMA socket 0
rte_vmbus_map_device(): Not managed by UIO driver, skipped
vmbus_probe_one_driver(): VMBUS device b30ed368-1a6f-4921-8d2b-4160a0dfc667 on NUMA socket 0
rte_vmbus_map_device(): Not managed by UIO driver, skipped
Set rxonly packet forwarding mode
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=155456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 1 (socket 0)
hn_dev_configure(): >>
hn_rndis_link_status(): link status 0x40020006
hn_subchan_configure(): open 1 subchannels
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 21
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 22
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 20
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 21
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 22
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 20
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 21
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 22
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 20
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 21
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 22
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 20
Mohammed Gamal
2018-12-07 11:15:43 UTC
Post by Stephen Hemminger
The problem is a regression in the 4.20 kernel. Bisecting now.
I was bisecting the kernel and the change that seems to introduce this
regression is this one:

commit ae6935ed7d424ffa74d634da00767e7b03c98fd3
Author: Stephen Hemminger <***@networkplumber.org>
Date:   Fri Sep 14 09:10:17 2018 -0700

    vmbus: split ring buffer allocation from open
    
    The UIO driver needs the ring buffer to be persistent(reused)
    across open/close. Split the allocation and setup of ring buffer
    out of vmbus_open. For normal usage vmbus_open/vmbus_close there
    are no changes; only impacts uio_hv_generic which needs to keep
    ring buffer memory and reuse when application restarts.
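To make the new lifetime concrete, here is a rough sketch of the usage pattern the commit describes (illustrative only, not the kernel source; it assumes the vmbus_alloc_ring()/vmbus_connect_ring()/vmbus_disconnect_ring()/vmbus_free_ring() helpers this commit introduces):

#include <linux/hyperv.h>
#include <linux/sizes.h>

/* Allocate once at probe time; the ring memory outlives open/close. */
static int example_probe(struct vmbus_channel *chan,
			 void (*cb)(void *ctx), void *ctx)
{
	int err;

	err = vmbus_alloc_ring(chan, SZ_2M, SZ_2M);
	if (err)
		return err;

	/* Re-connecting reuses the same ring pages on each restart. */
	err = vmbus_connect_ring(chan, cb, ctx);
	if (err)
		vmbus_free_ring(chan);
	return err;
}

static void example_close(struct vmbus_channel *chan)
{
	vmbus_disconnect_ring(chan);	/* ring memory is kept for reuse */
}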
    
Post by Stephen Hemminger
[...]
Stephen Hemminger
2018-12-07 17:31:06 UTC
On Fri, 07 Dec 2018 13:15:43 +0200
Post by Mohammed Gamal
[...]
Yes, this is the kernel commit that introduced the problem.
The issue is actually in the unbind logic: when the device is unbound
from the netvsc driver, the subchannels aren't cleaned up.

Still debugging.
Stephen Hemminger
2018-12-07 19:18:40 UTC
On Fri, 07 Dec 2018 13:15:43 +0200
Post by Mohammed Gamal
[...]
Patch posted:

From ***@networkplumber.org Fri Dec 7 10:58:47 2018
From: Stephen Hemminger <***@networkplumber.org>
Subject: [PATCH] vmbus: fix subchannel removal

The changes to split ring allocation from open/close, broke
the cleanup of subchannels. This resulted in problems using
uio on network devices because the subchannel was left behind
when the network device was unbound.

The cause was in the disconnect logic which used list splice
to move the subchannel list into a local variable. This won't
work because the subchannel list is needed later during the
process of the rescind messages (relid2channel).

The fix is to just leave the subchannel list in place
which is what the original code did. The list is cleaned
up later when the host rescind is processed.

Fixes: ae6935ed7d42 ("vmbus: split ring buffer allocation from open")
Signed-off-by: Stephen Hemminger <***@microsoft.com>
---
drivers/hv/channel.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index fe00b12e4417..bea4c9850247 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -701,20 +701,12 @@ static int vmbus_close_internal(struct vmbus_channel *channel)
 int vmbus_disconnect_ring(struct vmbus_channel *channel)
 {
 	struct vmbus_channel *cur_channel, *tmp;
-	unsigned long flags;
-	LIST_HEAD(list);
 	int ret;
 
 	if (channel->primary_channel != NULL)
 		return -EINVAL;
 
-	/* Snapshot the list of subchannels */
-	spin_lock_irqsave(&channel->lock, flags);
-	list_splice_init(&channel->sc_list, &list);
-	channel->num_sc = 0;
-	spin_unlock_irqrestore(&channel->lock, flags);
-
-	list_for_each_entry_safe(cur_channel, tmp, &list, sc_list) {
+	list_for_each_entry_safe(cur_channel, tmp, &channel->sc_list, sc_list) {
 		if (cur_channel->rescind)
 			wait_for_completion(&cur_channel->rescind_event);
--
2.19.2
Mohammed Gamal
2018-12-08 08:10:19 UTC
Post by Stephen Hemminger
[...]
Hi Stephen,
This indeed works for the first run. On any subsequent run, I get this:

testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=155456,
size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
hn_dev_configure():  >>
hn_rndis_link_status(): link status 0x40020006
hn_subchan_configure(): open 1 subchannels
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
[...]
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
^C
Signal 2 received, preparing to exit...
LATENCY_STATS: failed to remove Rx callback for pid=0, qid=0
LATENCY_STATS: failed to remove Rx callback for pid=0, qid=1
LATENCY_STATS: failed to remove Tx callback for pid=0, qid=0
LATENCY_STATS: failed to remove Tx callback for pid=0, qid=1

Shutting down port 0...
Stopping ports...
Done
Closing ports...
Port 0 is now not stopped
Done
Bye...

Do you see that on your end as well?

Stephen Hemminger
2018-11-30 20:24:57 UTC
When using multiple queues, there was a race with the kernel
in setting up the second channel. This is due to a kernel change
which does not allow accessing sysfs files for Hyper-V
channels that are not opened.

The fix is simple, just move the logic to detect not ready
sub channels earlier in the existing loop.

Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
Reported-by: Mohammed Gamal <***@redhat.com>
Signed-off-by: Stephen Hemminger <***@microsoft.com>
---
drivers/bus/vmbus/linux/vmbus_uio.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/bus/vmbus/linux/vmbus_uio.c b/drivers/bus/vmbus/linux/vmbus_uio.c
index 12e97e3a420a..38df4d724ed5 100644
--- a/drivers/bus/vmbus/linux/vmbus_uio.c
+++ b/drivers/bus/vmbus/linux/vmbus_uio.c
@@ -357,6 +357,12 @@ int vmbus_uio_get_subchan(struct vmbus_channel *primary,
 			continue;
 		}
 
+		if (!vmbus_isnew_subchannel(primary, relid))
+			continue; /* Already know about you */
+
+		if (!vmbus_uio_ring_present(dev, relid))
+			continue; /* Ring may not be ready yet */
+
 		snprintf(subchan_path, sizeof(subchan_path), "%s/%lu",
 			 chan_path, relid);
 		err = vmbus_uio_sysfs_read(subchan_path, "subchannel_id",
@@ -370,12 +376,6 @@ int vmbus_uio_get_subchan(struct vmbus_channel *primary,
 		if (subid == 0)
 			continue; /* skip primary channel */
 
-		if (!vmbus_isnew_subchannel(primary, relid))
-			continue;
-
-		if (!vmbus_uio_ring_present(dev, relid))
-			continue; /* Ring may not be ready yet */
-
 		err = vmbus_uio_sysfs_read(subchan_path, "monitor_id",
 					   &monid, UINT8_MAX);
 		if (err) {
--
2.19.2
Mohammed Gamal
2018-12-03 06:02:55 UTC
Post by Stephen Hemminger
[...]
With this patch I am now getting the following error:
[...]
Configuring Port 0 (socket 0)
hn_dev_configure():  >>
hn_rndis_link_status(): link status 0x40020006
hn_subchan_configure(): open 1 subchannels
hn_subchan_configure(): open subchannel failed: -2
hn_dev_configure(): subchannel configuration failed
Port0 dev_configure = -2
hn_dev_rx_queue_release():  >>
hn_dev_rx_queue_release():  >>
hn_dev_tx_queue_release():  >>
hn_dev_tx_queue_release():  >>
Fail to configure port 0
EAL: Error - exiting with code: 1
  Cause: Start ports failed

Apparently, no subchannels were ready. Anything I may have missed or
misconfigured?

Regards,
Mohammed
Stephen Hemminger
2018-12-03 16:48:44 UTC
On Mon, 03 Dec 2018 07:02:55 +0100
Post by Mohammed Gamal
[...]
Apparently, no subchannels were ready. Anything I may have missed or
misconfigured?
Regards,
Mohammed
Could you check the kernel log?

The way sub channel configuration works is that the userspace code in DPDK
sends a message to the hypervisor that it would like N subchannels, then
the response from the hypervisor is processed by the kernel causing sysfs
files to be created. Meanwhile the userspace is polling waiting for the
sysfs files to show up (for 10 seconds). You could increase the timeout or
go looking in the sysfs directory to see what is present.
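Roughly, the userspace side boils down to a poll loop like this (an illustrative sketch with made-up names, not the actual vmbus_uio code; the sysfs path layout is an assumption):

#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

#define SUBCHAN_TIMEOUT_SEC 10	/* DPDK gives up after ~10 seconds */

static bool wait_for_subchan_sysfs(const char *chan_path, unsigned long relid)
{
	char path[512];
	struct stat st;
	int i;

	/* e.g. .../device/channels/<relid>, created when the host responds */
	snprintf(path, sizeof(path), "%s/%lu", chan_path, relid);

	for (i = 0; i < SUBCHAN_TIMEOUT_SEC * 10; i++) {
		if (stat(path, &st) == 0)
			return true;	/* sysfs entry appeared */
		usleep(100 * 1000);	/* poll every 100 ms */
	}
	return false;	/* timed out waiting on the hypervisor */
}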

There is no good way to handle errors here, the hypervisor doesn't really
give much feedback.
Mohammed Gamal
2018-12-04 11:59:46 UTC
Post by Stephen Hemminger
[...]
Could you check the kernel log?
I did. No relevant messages seem to be there.
Post by Stephen Hemminger
[...]
Tried increasing that up to 100 seconds, still nothing. Could it be a
problem on my host? The VM I am using is on a local Hyper-V instance.
Post by Stephen Hemminger
There is no good way to handle errors here, the hypervisor doesn't really
give much feedback.
Stephen Hemminger
2018-12-05 22:11:56 UTC
When using multiple queues, there was a race with the kernel
in setting up the second channel. This regression is due to a kernel change
which does not allow accessing sysfs files for Hyper-V channels that are not opened.

The fix is simple, just move the logic to detect not ready
sub channels earlier in the existing loop.

Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
Reported-by: Mohammed Gamal <***@redhat.com>
Signed-off-by: Stephen Hemminger <***@microsoft.com>
---
drivers/bus/vmbus/linux/vmbus_uio.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/bus/vmbus/linux/vmbus_uio.c b/drivers/bus/vmbus/linux/vmbus_uio.c
index 12e97e3a420a..38df4d724ed5 100644
--- a/drivers/bus/vmbus/linux/vmbus_uio.c
+++ b/drivers/bus/vmbus/linux/vmbus_uio.c
@@ -357,6 +357,12 @@ int vmbus_uio_get_subchan(struct vmbus_channel *primary,
 			continue;
 		}
 
+		if (!vmbus_isnew_subchannel(primary, relid))
+			continue; /* Already know about you */
+
+		if (!vmbus_uio_ring_present(dev, relid))
+			continue; /* Ring may not be ready yet */
+
 		snprintf(subchan_path, sizeof(subchan_path), "%s/%lu",
 			 chan_path, relid);
 		err = vmbus_uio_sysfs_read(subchan_path, "subchannel_id",
@@ -370,12 +376,6 @@ int vmbus_uio_get_subchan(struct vmbus_channel *primary,
 		if (subid == 0)
 			continue; /* skip primary channel */
 
-		if (!vmbus_isnew_subchannel(primary, relid))
-			continue;
-
-		if (!vmbus_uio_ring_present(dev, relid))
-			continue; /* Ring may not be ready yet */
-
 		err = vmbus_uio_sysfs_read(subchan_path, "monitor_id",
 					   &monid, UINT8_MAX);
 		if (err) {
--
2.19.2
Stephen Hemminger
2018-12-05 22:11:57 UTC
Make DPDK enable the SRIOV flag in the same way as Linux and FreeBSD.

Fixes: dc7680e8597c ("net/netvsc: support integrated VF")
Signed-off-by: Stephen Hemminger <***@microsoft.com>
---
drivers/net/netvsc/hn_nvs.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/netvsc/hn_nvs.c b/drivers/net/netvsc/hn_nvs.c
index 9690c5f8a3e7..d58770e0455e 100644
--- a/drivers/net/netvsc/hn_nvs.c
+++ b/drivers/net/netvsc/hn_nvs.c
@@ -326,9 +326,9 @@ hn_nvs_conf_ndis(struct hn_data *hv, unsigned int mtu)
 	conf.mtu = mtu + ETHER_HDR_LEN;
 	conf.caps = NVS_NDIS_CONF_VLAN;
 
-	/* TODO enable SRIOV */
-	//if (hv->nvs_ver >= NVS_VERSION_5)
-	//	conf.caps |= NVS_NDIS_CONF_SRIOV;
+	/* enable SRIOV */
+	if (hv->nvs_ver >= NVS_VERSION_5)
+		conf.caps |= NVS_NDIS_CONF_SRIOV;
 
 	/* NOTE: No response. */
 	error = hn_nvs_req_send(hv, &conf, sizeof(conf));
--
2.19.2
Stephen Hemminger
2018-12-05 22:11:58 UTC
NDIS multi-queue support is only in WS2012 or later. Check the NDIS
version to limit to single queue on older versions. Similar code
exists in the Linux driver.

Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Signed-off-by: Stephen Hemminger <***@microsoft.com>
---
drivers/net/netvsc/hn_ethdev.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
index b330bf3d7255..1256fa399b16 100644
--- a/drivers/net/netvsc/hn_ethdev.c
+++ b/drivers/net/netvsc/hn_ethdev.c
@@ -732,6 +732,7 @@ eth_hn_dev_init(struct rte_eth_dev *eth_dev)
 	hv->chim_res  = &vmbus->resource[HV_SEND_BUF_MAP];
 	hv->port_id = eth_dev->data->port_id;
 	hv->latency = HN_CHAN_LATENCY_NS;
+	hv->max_queues = 1;
 
 	err = hn_parse_args(eth_dev);
 	if (err)
@@ -770,6 +771,10 @@ eth_hn_dev_init(struct rte_eth_dev *eth_dev)
 	if (err)
 		goto failed;
 
+	/* Multi queue requires later versions of windows server */
+	if (hv->nvs_ver < NVS_VERSION_5)
+		return 0;
+
 	max_chan = rte_vmbus_max_channels(vmbus);
 	PMD_INIT_LOG(DEBUG, "VMBus max channels %d", max_chan);
 	if (max_chan <= 0)
--
2.19.2
Stephen Hemminger
2018-12-05 22:11:59 UTC
Add more instrumentation to subchannel setup to help diagnose
startup issues.

Signed-off-by: Stephen Hemminger <***@microsoft.com>
---
drivers/bus/vmbus/linux/vmbus_uio.c | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/bus/vmbus/linux/vmbus_uio.c b/drivers/bus/vmbus/linux/vmbus_uio.c
index 38df4d724ed5..09f7efdca286 100644
--- a/drivers/bus/vmbus/linux/vmbus_uio.c
+++ b/drivers/bus/vmbus/linux/vmbus_uio.c
@@ -357,19 +357,25 @@ int vmbus_uio_get_subchan(struct vmbus_channel *primary,
 			continue;
 		}
 
-		if (!vmbus_isnew_subchannel(primary, relid))
-			continue; /* Already know about you */
+		if (!vmbus_isnew_subchannel(primary, relid)) {
+			VMBUS_LOG(DEBUG, "skip already found channel: %lu",
+				  relid);
+			continue;
+		}
 
-		if (!vmbus_uio_ring_present(dev, relid))
-			continue; /* Ring may not be ready yet */
+		if (!vmbus_uio_ring_present(dev, relid)) {
+			VMBUS_LOG(DEBUG, "ring mmap not found (yet) for: %lu",
+				  relid);
+			continue;
+		}
 
 		snprintf(subchan_path, sizeof(subchan_path), "%s/%lu",
 			 chan_path, relid);
 		err = vmbus_uio_sysfs_read(subchan_path, "subchannel_id",
 					   &subid, UINT16_MAX);
 		if (err) {
-			VMBUS_LOG(NOTICE, "invalid subchannel id %lu",
-				  subid);
+			VMBUS_LOG(NOTICE, "no subchannel_id in %s:%s",
+				  subchan_path, strerror(-err));
 			goto fail;
 		}
 
@@ -379,14 +385,14 @@ int vmbus_uio_get_subchan(struct vmbus_channel *primary,
 		err = vmbus_uio_sysfs_read(subchan_path, "monitor_id",
 					   &monid, UINT8_MAX);
 		if (err) {
-			VMBUS_LOG(NOTICE, "invalid monitor id %lu",
-				  monid);
+			VMBUS_LOG(NOTICE, "no monitor_id in %s:%s",
+				  subchan_path, strerror(-err));
 			goto fail;
 		}
 
 		err = vmbus_chan_create(dev, relid, subid, monid, subchan);
 		if (err) {
-			VMBUS_LOG(NOTICE, "subchannel setup failed");
+			VMBUS_LOG(ERR, "subchannel setup failed");
 			goto fail;
 		}
 		break;
--
2.19.2