[dpdk-dev] [PATCH 00/22] Single virtio implementation
Ouyang Changchun
2015-01-15 05:15:08 UTC
This patch set consolidates the virtio PMDs into a single virtio implementation.

Why do we need a single virtio?
=================================
There are currently at least 3 virtio PMD driver implementations:
A) lib/librte_pmd_virtio (referred to as virtio A);
B) virtio_net_pmd by 6wind (referred to as virtio B);
C) virtio by Brocade/vyatta (referred to as virtio C);

Integrating the 3 implementations into one reduces maintenance cost and time.
In addition, users no longer need to try their application on each of the 3 variants
to see which one works best for them.

What's the status?
====================
Currently virtio A covers most features of virtio B, except for using port IO to get the PCI resource;
patch 17/22 addresses that. There are, however, a few differences between virtio A and virtio C,
so the features and code of virtio C need to be integrated into virtio A.
This patch set is based on two original RFC patch sets from Stephen Hemminger [***@networkplumber.org].
Refer to [http://dpdk.org/ml/archives/dev/2014-August/004845.html ] for the original one.
This patch set also resolves some conflicts with the latest code, removes duplicated code, and fixes
some issues in the original code.

What this patch set contains:
===============================
1) virtio: Rearrange resource initialization by extracting a function to set up PCI resources;
2) virtio: Use weaker barriers. The DPDK driver only has to deal with the case of running on PCI
with SMP, so the code can use weaker barriers instead of hard (fence) barriers.
This may help performance a bit;
3) virtio: Allow starting with link down, as other drivers behave the same way;
4) virtio: Add support for Link State interrupt;
5) ether: Add soft vlan encap/decap functions, which help when the HW doesn't support vlan strip;
6) virtio: Use software vlan stripping;
7) virtio: Remove unnecessary adapter structure;
8) virtio: Remove redundant vq_alignment, as vq alignment is always 4K, so use a constant when needed;
9) virtio: Fix how states are handled during initialization, to match the Linux kernel;
10) virtio: Make vtpci_get_status a local function, as it is used in only one file;
11) virtio: Check for packet headroom at compile time;
12) virtio: Move allocation before initialization to avoid getting stuck in the middle of virtio init;
13) virtio: Add support for vlan filtering;
14) virtio: Add support for multiple mac addresses;
15) virtio: Add ability to set MAC address;
16) virtio: Free mbufs with threshold, which makes the behavior more like ixgbe;
17) virtio: Use port IO to get PCI resource, for security reasons and to match virtio-net-pmd;
18) virtio: Fix descriptor index issue;
19) ether: Fix vlan strip/insert issue;
20) example/vhost: Avoid inserting the vlan tag twice, on both guest and host;
21) example/vhost: Add vlan-strip cmd line option to turn vlan strip on/off on the host;
22) virtio: Use soft vlan strip in mergeable Rx path, which makes its logic consistent
with the normal Rx path.

Changchun Ouyang (6):
virtio: Use port IO to get PCI resource.
virtio: Fix descriptor index issue
ether: Fix vlan strip/insert issue
example/vhost: Avoid inserting vlan twice
example/vhost: Add vlan-strip cmd line option
virtio: Use soft vlan strip in mergeable Rx path

Stephen Hemminger (16):
virtio: Rearrange resource initialization
virtio: Use weaker barriers
virtio: Allow starting with link down
virtio: Add support for Link State interrupt
ether: Add soft vlan encap/decap functions
virtio: Use software vlan stripping
virtio: Remove unnecessary adapter structure
virtio: Remove redundant vq_alignment
virtio: Fix how states are handled during initialization
virtio: Make vtpci_get_status local
virtio: Check for packet headroom at compile time
virtio: Move allocation before initialization
virtio: Add support for vlan filtering
virtio: Add support for multiple mac addresses
virtio: Add ability to set MAC address
virtio: Free mbuf's with threshold

config/common_linuxapp | 2 +
examples/vhost/main.c | 43 ++-
lib/librte_eal/common/include/rte_pci.h | 4 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 5 +-
lib/librte_ether/rte_ethdev.h | 8 +
lib/librte_ether/rte_ether.h | 76 +++++
lib/librte_pmd_virtio/virtio_ethdev.c | 497 +++++++++++++++++++++++++-------
lib/librte_pmd_virtio/virtio_ethdev.h | 12 +-
lib/librte_pmd_virtio/virtio_pci.c | 20 +-
lib/librte_pmd_virtio/virtio_pci.h | 8 +-
lib/librte_pmd_virtio/virtio_rxtx.c | 110 +++++--
lib/librte_pmd_virtio/virtqueue.h | 59 +++-
12 files changed, 676 insertions(+), 168 deletions(-)
--
1.8.4.2
Ouyang Changchun
2015-01-15 05:15:09 UTC
For clarity, move the setup of PCI resources for Linux into a function rather
than a block of code #ifdef'd in the middle of dev_init.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 76 ++++++++++++++++++++---------------
1 file changed, 43 insertions(+), 33 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index c009f2a..6c31598 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -794,6 +794,41 @@ virtio_has_msix(const struct rte_pci_addr *loc)

return (d != NULL);
}
+
+/* Extract I/O port numbers from sysfs */
+static int virtio_resource_init(struct rte_pci_device *pci_dev)
+{
+ char dirname[PATH_MAX];
+ char filename[PATH_MAX];
+ unsigned long start, size;
+
+ if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname)) < 0)
+ return -1;
+
+ /* get portio size */
+ snprintf(filename, sizeof(filename),
+ "%s/portio/port0/size", dirname);
+ if (parse_sysfs_value(filename, &size) < 0) {
+ PMD_INIT_LOG(ERR, "%s(): cannot parse size",
+ __func__);
+ return -1;
+ }
+
+ /* get portio start */
+ snprintf(filename, sizeof(filename),
+ "%s/portio/port0/start", dirname);
+ if (parse_sysfs_value(filename, &start) < 0) {
+ PMD_INIT_LOG(ERR, "%s(): cannot parse portio start",
+ __func__);
+ return -1;
+ }
+ pci_dev->mem_resource[0].addr = (void *)(uintptr_t)start;
+ pci_dev->mem_resource[0].len = (uint64_t)size;
+ PMD_INIT_LOG(DEBUG,
+ "PCI Port IO found start=0x%lx with size=0x%lx",
+ start, size);
+ return 0;
+}
#else
static int
virtio_has_msix(const struct rte_pci_addr *loc __rte_unused)
@@ -801,6 +836,12 @@ virtio_has_msix(const struct rte_pci_addr *loc __rte_unused)
/* nic_uio does not enable interrupts, return 0 (false). */
return 0;
}
+
+static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused)
+{
+ /* no setup required */
+ return 0;
+}
#endif

/*
@@ -831,40 +872,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
return 0;

pci_dev = eth_dev->pci_dev;
+ if (virtio_resource_init(pci_dev) < 0)
+ return -1;

-#ifdef RTE_EXEC_ENV_LINUXAPP
- {
- char dirname[PATH_MAX];
- char filename[PATH_MAX];
- unsigned long start, size;
-
- if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname)) < 0)
- return -1;
-
- /* get portio size */
- snprintf(filename, sizeof(filename),
- "%s/portio/port0/size", dirname);
- if (parse_sysfs_value(filename, &size) < 0) {
- PMD_INIT_LOG(ERR, "%s(): cannot parse size",
- __func__);
- return -1;
- }
-
- /* get portio start */
- snprintf(filename, sizeof(filename),
- "%s/portio/port0/start", dirname);
- if (parse_sysfs_value(filename, &start) < 0) {
- PMD_INIT_LOG(ERR, "%s(): cannot parse portio start",
- __func__);
- return -1;
- }
- pci_dev->mem_resource[0].addr = (void *)(uintptr_t)start;
- pci_dev->mem_resource[0].len = (uint64_t)size;
- PMD_INIT_LOG(DEBUG,
- "PCI Port IO found start=0x%lx with size=0x%lx",
- start, size);
- }
-#endif
hw->use_msix = virtio_has_msix(&pci_dev->addr);
hw->io_base = (uint32_t)(uintptr_t)pci_dev->mem_resource[0].addr;
--
1.8.4.2
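
The extracted function above reads the port IO start and size as text from sysfs and converts them to integers. A minimal sketch of that conversion step follows, assuming the attribute holds a single number such as "0xc040" followed by a newline. `parse_value_text` is a hypothetical helper written for illustration; it is not the EAL's `parse_sysfs_value()` and may differ from it in behavior.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the EAL's parse_sysfs_value()): convert the text
 * of a sysfs attribute, e.g. "0xc040\n", into an unsigned long.
 * Returns 0 on success, -1 if nothing was parsed, the value overflowed,
 * or unexpected characters follow the number.
 */
static int
parse_value_text(const char *text, unsigned long *val)
{
	char *end = NULL;

	errno = 0;
	*val = strtoul(text, &end, 0);	/* base 0 accepts "0x..." and decimal */
	if (errno != 0 || end == text)
		return -1;
	if (*end != '\0' && *end != '\n')
		return -1;
	return 0;
}
```

Passing base 0 to strtoul() lets the same helper handle both hexadecimal and decimal attribute values.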
Ouyang Changchun
2015-01-15 05:15:10 UTC
The DPDK driver only has to deal with the case of running on PCI
and with SMP. In this case, the code can use the weaker barriers
instead of using hard (fence) barriers. This will help performance.
The rationale is explained in Linux kernel virtio_ring.h.

To make it clearer that this is a virtio thing and not some generic
barrier, prefix the barrier calls with virtio_.

Add missing (and needed) barrier between updating ring data
structure and notifying host.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 2 +-
lib/librte_pmd_virtio/virtio_rxtx.c | 8 +++++---
lib/librte_pmd_virtio/virtqueue.h | 19 ++++++++++++++-----
3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 6c31598..78018f9 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -175,7 +175,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
uint32_t idx, desc_idx, used_idx;
struct vring_used_elem *uep;

- rmb();
+ virtio_rmb();

used_idx = (uint32_t)(vq->vq_used_cons_idx
& (vq->vq_nentries - 1));
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index 3f6bad2..f878c62 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -456,7 +456,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)

nb_used = VIRTQUEUE_NUSED(rxvq);

- rmb();
+ virtio_rmb();

num = (uint16_t)(likely(nb_used <= nb_pkts) ? nb_used : nb_pkts);
num = (uint16_t)(likely(num <= VIRTIO_MBUF_BURST_SZ) ? num : VIRTIO_MBUF_BURST_SZ);
@@ -516,6 +516,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
}

if (likely(nb_enqueued)) {
+ virtio_wmb();
if (unlikely(virtqueue_kick_prepare(rxvq))) {
virtqueue_notify(rxvq);
PMD_RX_LOG(DEBUG, "Notified\n");
@@ -547,7 +548,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,

nb_used = VIRTQUEUE_NUSED(rxvq);

- rmb();
+ virtio_rmb();

if (nb_used == 0)
return 0;
@@ -694,7 +695,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
PMD_TX_LOG(DEBUG, "%d packets to xmit", nb_pkts);
nb_used = VIRTQUEUE_NUSED(txvq);

- rmb();
+ virtio_rmb();

num = (uint16_t)(likely(nb_used < VIRTIO_MBUF_BURST_SZ) ? nb_used : VIRTIO_MBUF_BURST_SZ);

@@ -735,6 +736,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
}
}
vq_update_avail_idx(txvq);
+ virtio_wmb();

txvq->packets += nb_tx;

diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index fdee054..f6ad98d 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -46,9 +46,18 @@
#include "virtio_ring.h"
#include "virtio_logs.h"

-#define mb() rte_mb()
-#define wmb() rte_wmb()
-#define rmb() rte_rmb()
+/*
+ * Per virtio_config.h in Linux.
+ * For virtio_pci on SMP, we don't need to order with respect to MMIO
+ * accesses through relaxed memory I/O windows, so smp_mb() et al are
+ * sufficient.
+ *
+ * This driver is for virtio_pci on SMP and therefore can assume
+ * weaker (compiler barriers)
+ */
+#define virtio_mb() rte_mb()
+#define virtio_rmb() rte_compiler_barrier()
+#define virtio_wmb() rte_compiler_barrier()

#ifdef RTE_PMD_PACKET_PREFETCH
#define rte_packet_prefetch(p) rte_prefetch1(p)
@@ -225,7 +234,7 @@ virtqueue_full(const struct virtqueue *vq)
static inline void
vq_update_avail_idx(struct virtqueue *vq)
{
- rte_compiler_barrier();
+ virtio_rmb();
vq->vq_ring.avail->idx = vq->vq_avail_idx;
}

@@ -255,7 +264,7 @@ static inline void
virtqueue_notify(struct virtqueue *vq)
{
/*
- * Ensure updated avail->idx is visible to host. mb() necessary?
+ * Ensure updated avail->idx is visible to host.
* For virtio on IA, the notificaiton is through io port operation
* which is a serialization instruction itself.
*/
--
1.8.4.2
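
A minimal sketch of the idea behind the weaker barriers, assuming a strongly ordered CPU such as x86, where stores are not reordered with other stores, so a compiler-only barrier is enough to keep the ring slot write ahead of the index update. `mini_ring`, `mini_ring_publish` and `compiler_barrier` are hypothetical names used for illustration and are not part of the driver; on weakly ordered architectures a real fence would be required instead.

```c
#include <stdint.h>

/*
 * Compiler-only barrier (GCC/Clang syntax): prevents the compiler from
 * reordering memory accesses across this point, but emits no CPU fence.
 */
#define compiler_barrier() __asm__ volatile("" ::: "memory")

#define RING_SIZE 8u	/* hypothetical size, must be a power of two */

struct mini_ring {
	uint16_t slots[RING_SIZE];
	volatile uint16_t idx;	/* producer index read by the consumer */
};

/* Write the slot first, then publish the new index. */
static void
mini_ring_publish(struct mini_ring *r, uint16_t desc)
{
	uint16_t next = r->idx;

	r->slots[next & (RING_SIZE - 1)] = desc;
	compiler_barrier();	/* slot store must not sink below idx store */
	r->idx = (uint16_t)(next + 1);
}
```

The consumer side would mirror this: read the index, issue the barrier, then read the slots up to that index.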
Ouyang Changchun
2015-01-15 05:15:11 UTC
Starting the driver with the link down should be OK, as it is with every
other driver, so just allow it.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 78018f9..4bff0fe 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -1057,14 +1057,12 @@ virtio_dev_start(struct rte_eth_dev *dev)
vtpci_read_dev_config(hw,
offsetof(struct virtio_net_config, status),
&status, sizeof(status));
- if ((status & VIRTIO_NET_S_LINK_UP) == 0) {
+ if ((status & VIRTIO_NET_S_LINK_UP) == 0)
PMD_INIT_LOG(ERR, "Port: %d Link is DOWN",
dev->data->port_id);
- return -EIO;
- } else {
+ else
PMD_INIT_LOG(DEBUG, "Port: %d Link is UP",
dev->data->port_id);
- }
}
vtpci_reinit_complete(hw);
--
1.8.4.2
Ouyang Changchun
2015-01-15 05:15:12 UTC
Virtio has a link state interrupt which can be used.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 78 +++++++++++++++++++++++++++--------
lib/librte_pmd_virtio/virtio_pci.c | 22 ++++++++++
lib/librte_pmd_virtio/virtio_pci.h | 4 ++
3 files changed, 86 insertions(+), 18 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 4bff0fe..d37f2e9 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -845,6 +845,34 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused)
#endif

/*
+ * Process Virtio Config changed interrupt and call the callback
+ * if link state changed.
+ */
+static void
+virtio_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+ void *param)
+{
+ struct rte_eth_dev *dev = param;
+ struct virtio_hw *hw =
+ VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ uint8_t isr;
+
+ /* Read interrupt status which clears interrupt */
+ isr = vtpci_isr(hw);
+ PMD_DRV_LOG(INFO, "interrupt status = %#x", isr);
+
+ if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0)
+ PMD_DRV_LOG(ERR, "interrupt enable failed");
+
+ if (isr & VIRTIO_PCI_ISR_CONFIG) {
+ if (virtio_dev_link_update(dev, 0) == 0)
+ _rte_eth_dev_callback_process(dev,
+ RTE_ETH_EVENT_INTR_LSC);
+ }
+
+}
+
+/*
* This function is based on probe() function in virtio_pci.c
* It returns 0 on success.
*/
@@ -968,6 +996,10 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x",
eth_dev->data->port_id, pci_dev->id.vendor_id,
pci_dev->id.device_id);
+
+ /* Setup interrupt callback */
+ rte_intr_callback_register(&pci_dev->intr_handle,
+ virtio_interrupt_handler, eth_dev);
return 0;
}

@@ -975,7 +1007,7 @@ static struct eth_driver rte_virtio_pmd = {
{
.name = "rte_virtio_pmd",
.id_table = pci_id_virtio_map,
- .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
},
.eth_dev_init = eth_virtio_dev_init,
.dev_private_size = sizeof(struct virtio_adapter),
@@ -1021,6 +1053,9 @@ static int
virtio_dev_configure(struct rte_eth_dev *dev)
{
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
+ struct virtio_hw *hw =
+ VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ int ret;

PMD_INIT_LOG(DEBUG, "configure");

@@ -1029,7 +1064,11 @@ virtio_dev_configure(struct rte_eth_dev *dev)
return (-EINVAL);
}

- return 0;
+ ret = vtpci_irq_config(hw, 0);
+ if (ret != 0)
+ PMD_DRV_LOG(ERR, "failed to set config vector");
+
+ return ret;
}


@@ -1037,7 +1076,6 @@ static int
virtio_dev_start(struct rte_eth_dev *dev)
{
uint16_t nb_queues, i;
- uint16_t status;
struct virtio_hw *hw =
VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);

@@ -1052,18 +1090,22 @@ virtio_dev_start(struct rte_eth_dev *dev)
/* Do final configuration before rx/tx engine starts */
virtio_dev_rxtx_start(dev);

- /* Check VIRTIO_NET_F_STATUS for link status*/
- if (vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
- vtpci_read_dev_config(hw,
- offsetof(struct virtio_net_config, status),
- &status, sizeof(status));
- if ((status & VIRTIO_NET_S_LINK_UP) == 0)
- PMD_INIT_LOG(ERR, "Port: %d Link is DOWN",
- dev->data->port_id);
- else
- PMD_INIT_LOG(DEBUG, "Port: %d Link is UP",
- dev->data->port_id);
+ /* check if lsc interrupt feature is enabled */
+ if (dev->data->dev_conf.intr_conf.lsc) {
+ if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
+ PMD_DRV_LOG(ERR, "link status not supported by host");
+ return -ENOTSUP;
+ }
+
+ if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0) {
+ PMD_DRV_LOG(ERR, "interrupt enable failed");
+ return -EIO;
+ }
}
+
+ /* Initialize Link state */
+ virtio_dev_link_update(dev, 0);
+
vtpci_reinit_complete(hw);

/*Notify the backend
@@ -1145,6 +1187,7 @@ virtio_dev_stop(struct rte_eth_dev *dev)
VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);

/* reset the NIC */
+ vtpci_irq_config(hw, 0);
vtpci_reset(hw);
virtio_dev_free_mbufs(dev);
}
@@ -1161,6 +1204,7 @@ virtio_dev_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complet
old = link;
link.link_duplex = FULL_DUPLEX;
link.link_speed = SPEED_10G;
+
if (vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
PMD_INIT_LOG(DEBUG, "Get link status from hw");
vtpci_read_dev_config(hw,
@@ -1179,10 +1223,8 @@ virtio_dev_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complet
link.link_status = 1; /* Link up */
}
virtio_dev_atomic_write_link_status(dev, &link);
- if (old.link_status == link.link_status)
- return -1;
- /*changed*/
- return 0;
+
+ return (old.link_status == link.link_status) ? -1 : 0;
}

static void
diff --git a/lib/librte_pmd_virtio/virtio_pci.c b/lib/librte_pmd_virtio/virtio_pci.c
index ca9c748..6d51032 100644
--- a/lib/librte_pmd_virtio/virtio_pci.c
+++ b/lib/librte_pmd_virtio/virtio_pci.c
@@ -127,3 +127,25 @@ vtpci_set_status(struct virtio_hw *hw, uint8_t status)

VIRTIO_WRITE_REG_1(hw, VIRTIO_PCI_STATUS, status);
}
+
+uint8_t
+vtpci_isr(struct virtio_hw *hw)
+{
+
+ return VIRTIO_READ_REG_1(hw, VIRTIO_PCI_ISR);
+}
+
+
+/* Enable one vector (0) for Link State Interrupt */
+int
+vtpci_irq_config(struct virtio_hw *hw, uint16_t vec)
+{
+ VIRTIO_WRITE_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR, vec);
+ vec = VIRTIO_READ_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR);
+ if (vec == VIRTIO_MSI_NO_VECTOR) {
+ PMD_DRV_LOG(ERR, "failed to set config vector");
+ return -EBUSY;
+ }
+
+ return 0;
+}
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 373f9dc..6998737 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -263,4 +263,8 @@ void vtpci_write_dev_config(struct virtio_hw *, uint64_t, void *, int);

void vtpci_read_dev_config(struct virtio_hw *, uint64_t, void *, int);

+uint8_t vtpci_isr(struct virtio_hw *);
+
+int vtpci_irq_config(struct virtio_hw *, uint16_t);
+
#endif /* _VIRTIO_PCI_H_ */
--
1.8.4.2
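
The interrupt handler in the patch above only fires the LSC callback when the link update reports a change, which it signals by returning 0 (and -1 when nothing changed). A minimal sketch of that return convention, with the device access replaced by plain arguments: `link_state`, `link_update` and `LINK_UP_BIT` are hypothetical stand-ins, and treating the link as up when the status feature is absent is an assumption based on the surrounding context lines.

```c
#include <stdint.h>

#define LINK_UP_BIT 0x1u	/* stand-in for VIRTIO_NET_S_LINK_UP */

struct link_state {
	uint8_t link_status;	/* 1 = up, 0 = down */
};

/*
 * Update the cached link state from a status word.
 * Returns 0 if the link state changed, -1 if it stayed the same.
 */
static int
link_update(struct link_state *cur, int has_status_feature, uint16_t status)
{
	uint8_t old = cur->link_status;

	if (has_status_feature)
		cur->link_status = (status & LINK_UP_BIT) ? 1 : 0;
	else
		cur->link_status = 1;	/* assumed: no status feature means up */

	return (old == cur->link_status) ? -1 : 0;
}
```

A caller following the same pattern as the handler would invoke its callback only when `link_update()` returns 0.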
Ouyang Changchun
2015-01-15 05:15:14 UTC
Implement VLAN stripping in software. This allows applications
to be device independent.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_ether/rte_ethdev.h | 3 +++
lib/librte_pmd_virtio/virtio_ethdev.c | 2 ++
lib/librte_pmd_virtio/virtio_pci.h | 1 +
lib/librte_pmd_virtio/virtio_rxtx.c | 20 ++++++++++++++++++--
4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index f66805d..07d55b8 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -643,6 +643,9 @@ struct rte_eth_rxconf {
#define ETH_TXQ_FLAGS_NOOFFLOADS \
(ETH_TXQ_FLAGS_NOVLANOFFL | ETH_TXQ_FLAGS_NOXSUMSCTP | \
ETH_TXQ_FLAGS_NOXSUMUDP | ETH_TXQ_FLAGS_NOXSUMTCP)
+#define ETH_TXQ_FLAGS_NOXSUMS \
+ (ETH_TXQ_FLAGS_NOXSUMSCTP | ETH_TXQ_FLAGS_NOXSUMUDP | \
+ ETH_TXQ_FLAGS_NOXSUMTCP)
/**
* A structure used to configure a TX ring of an Ethernet port.
*/
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index d37f2e9..829838c 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -1064,6 +1064,8 @@ virtio_dev_configure(struct rte_eth_dev *dev)
return (-EINVAL);
}

+ hw->vlan_strip = rxmode->hw_vlan_strip;
+
ret = vtpci_irq_config(hw, 0);
if (ret != 0)
PMD_DRV_LOG(ERR, "failed to set config vector");
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 6998737..6d93fac 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -168,6 +168,7 @@ struct virtio_hw {
uint32_t max_tx_queues;
uint32_t max_rx_queues;
uint16_t vtnet_hdr_size;
+ uint8_t vlan_strip;
uint8_t use_msix;
uint8_t mac_addr[ETHER_ADDR_LEN];
};
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index f878c62..a5756e1 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -49,6 +49,7 @@
#include <rte_prefetch.h>
#include <rte_string_fns.h>
#include <rte_errno.h>
+#include <rte_byteorder.h>

#include "virtio_logs.h"
#include "virtio_ethdev.h"
@@ -408,8 +409,8 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,

PMD_INIT_FUNC_TRACE();

- if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOOFFLOADS)
- != ETH_TXQ_FLAGS_NOOFFLOADS) {
+ if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOXSUMS)
+ != ETH_TXQ_FLAGS_NOXSUMS) {
PMD_INIT_LOG(ERR, "TX checksum offload not supported\n");
return -EINVAL;
}
@@ -446,6 +447,7 @@ uint16_t
virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
struct virtqueue *rxvq = rx_queue;
+ struct virtio_hw *hw = rxvq->hw;
struct rte_mbuf *rxm, *new_mbuf;
uint16_t nb_used, num, nb_rx = 0;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
@@ -489,6 +491,9 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
rxm->pkt_len = (uint32_t)(len[i] - hdr_size);
rxm->data_len = (uint16_t)(len[i] - hdr_size);

+ if (hw->vlan_strip)
+ rte_vlan_strip(rxm);
+
VIRTIO_DUMP_PACKET(rxm, rxm->data_len);

rx_pkts[nb_rx++] = rxm;
@@ -717,6 +722,17 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
*/
if (likely(need <= 0)) {
txm = tx_pkts[nb_tx];
+
+ /* Do VLAN tag insertion */
+ if (txm->ol_flags & PKT_TX_VLAN_PKT) {
+ error = rte_vlan_insert(&txm);
+ if (unlikely(error)) {
+ rte_pktmbuf_free(txm);
+ ++nb_tx;
+ continue;
+ }
+ }
+
/* Enqueue Packet buffers */
error = virtqueue_enqueue_xmit(txvq, txm);
if (unlikely(error)) {
--
1.8.4.2
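
The Tx queue setup change above swaps the mask in an "all of these bits must be set" test, so that a queue may leave the no-VLAN-offload flag clear while still declaring that it needs no checksum offloads. A minimal sketch of that mask test follows; the `TXQ_*` names and their bit values are illustrative assumptions, not the definitions from rte_ethdev.h.

```c
#include <stdint.h>

/* Illustrative flag bits (assumed values, for demonstration only). */
#define TXQ_NOVLANOFFL 0x0100u
#define TXQ_NOXSUMSCTP 0x0200u
#define TXQ_NOXSUMUDP  0x0400u
#define TXQ_NOXSUMTCP  0x0800u

#define TXQ_NOXSUMS \
	(TXQ_NOXSUMSCTP | TXQ_NOXSUMUDP | TXQ_NOXSUMTCP)
#define TXQ_NOOFFLOADS \
	(TXQ_NOVLANOFFL | TXQ_NOXSUMS)

/* Returns non-zero only if every bit of mask is set in flags. */
static int
all_bits_set(uint32_t flags, uint32_t mask)
{
	return (flags & mask) == mask;
}
```

With the narrower mask, a queue configured with only the three checksum flags passes the check, whereas the older, wider mask would have rejected it.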
Ouyang Changchun
2015-01-15 05:15:15 UTC
Clean up the virtio code by eliminating the unnecessary nesting of the
virtio hardware structure inside the adapter structure.
This also allows removing an unneeded macro, making the code clearer.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 43 ++++++++++++-----------------------
lib/librte_pmd_virtio/virtio_ethdev.h | 9 --------
lib/librte_pmd_virtio/virtio_rxtx.c | 3 +--
3 files changed, 16 insertions(+), 39 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 829838c..c89614d 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -207,8 +207,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
static int
virtio_set_multiple_queues(struct rte_eth_dev *dev, uint16_t nb_queues)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -242,8 +241,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
const struct rte_memzone *mz;
uint16_t vq_size;
int size;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtqueue *vq = NULL;

/* Write the virtqueue index to the Queue Select Field */
@@ -383,8 +381,7 @@ virtio_dev_cq_queue_setup(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx,
struct virtqueue *vq;
uint16_t nb_desc = 0;
int ret;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

PMD_INIT_FUNC_TRACE();
ret = virtio_dev_queue_setup(dev, VTNET_CQ, VTNET_SQ_CQ_QUEUE_IDX,
@@ -410,8 +407,7 @@ virtio_dev_close(struct rte_eth_dev *dev)
static void
virtio_dev_promiscuous_enable(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -430,8 +426,7 @@ virtio_dev_promiscuous_enable(struct rte_eth_dev *dev)
static void
virtio_dev_promiscuous_disable(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -450,8 +445,7 @@ virtio_dev_promiscuous_disable(struct rte_eth_dev *dev)
static void
virtio_dev_allmulticast_enable(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -470,8 +464,7 @@ virtio_dev_allmulticast_enable(struct rte_eth_dev *dev)
static void
virtio_dev_allmulticast_disable(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -853,8 +846,7 @@ virtio_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
void *param)
{
struct rte_eth_dev *dev = param;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
uint8_t isr;

/* Read interrupt status which clears interrupt */
@@ -880,12 +872,11 @@ static int
eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
struct rte_eth_dev *eth_dev)
{
+ struct virtio_hw *hw = eth_dev->data->dev_private;
struct virtio_net_config *config;
struct virtio_net_config local_config;
uint32_t offset_conf = sizeof(config->mac);
struct rte_pci_device *pci_dev;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);

if (RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr)) {
PMD_INIT_LOG(ERR,
@@ -1010,7 +1001,7 @@ static struct eth_driver rte_virtio_pmd = {
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
},
.eth_dev_init = eth_virtio_dev_init,
- .dev_private_size = sizeof(struct virtio_adapter),
+ .dev_private_size = sizeof(struct virtio_hw),
};

/*
@@ -1053,8 +1044,7 @@ static int
virtio_dev_configure(struct rte_eth_dev *dev)
{
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
int ret;

PMD_INIT_LOG(DEBUG, "configure");
@@ -1078,8 +1068,7 @@ static int
virtio_dev_start(struct rte_eth_dev *dev)
{
uint16_t nb_queues, i;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

/* Tell the host we've noticed this device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
@@ -1185,8 +1174,7 @@ static void virtio_dev_free_mbufs(struct rte_eth_dev *dev)
static void
virtio_dev_stop(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

/* reset the NIC */
vtpci_irq_config(hw, 0);
@@ -1199,8 +1187,7 @@ virtio_dev_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complet
{
struct rte_eth_link link, old;
uint16_t status;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
memset(&link, 0, sizeof(link));
virtio_dev_atomic_read_link_status(dev, &link);
old = link;
@@ -1232,7 +1219,7 @@ virtio_dev_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complet
static void
virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
{
- struct virtio_hw *hw = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

dev_info->driver_name = dev->driver->pci_drv.name;
dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues;
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.h b/lib/librte_pmd_virtio/virtio_ethdev.h
index 1da3c62..55c9749 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.h
+++ b/lib/librte_pmd_virtio/virtio_ethdev.h
@@ -110,15 +110,6 @@ uint16_t virtio_recv_mergeable_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);

-/*
- * Structure to store private data for each driver instance (for each port).
- */
-struct virtio_adapter {
- struct virtio_hw hw;
-};
-
-#define VIRTIO_DEV_PRIVATE_TO_HW(adapter)\
- (&((struct virtio_adapter *)adapter)->hw)

/*
* The VIRTIO_NET_F_GUEST_TSO[46] features permit the host to send us
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index a5756e1..73ad3ac 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -326,8 +326,7 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)
void
virtio_dev_cq_start(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

if (hw->cvq) {
virtio_dev_vring_start(hw->cvq, VTNET_CQ);
--
1.8.4.2
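
The cleanup above is behavior preserving because a wrapper structure whose only member is the hardware structure places that member at offset 0, so the conversion macro returns the same address it was given. A minimal sketch of that property, using hypothetical stand-in types (`hw_state`, `adapter`, `PRIVATE_TO_HW`) rather than the driver's own definitions:

```c
#include <stddef.h>

struct hw_state {		/* stand-in for struct virtio_hw */
	unsigned int io_base;
	unsigned char use_msix;
};

struct adapter {		/* stand-in for the removed wrapper */
	struct hw_state hw;
};

#define PRIVATE_TO_HW(p) (&((struct adapter *)(p))->hw)

/* Returns non-zero if going through the wrapper yields the same pointer. */
static int
wrapper_is_transparent(void *priv)
{
	return offsetof(struct adapter, hw) == 0 &&
	       (void *)PRIVATE_TO_HW(priv) == priv;
}
```

Since the first member of a C structure always starts at the structure's own address, the private data pointer can be used as the hardware structure pointer directly once the wrapper is gone.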
Ouyang Changchun
2015-01-15 05:15:13 UTC
It is helpful to allow device drivers that don't support hardware
VLAN stripping to emulate this in software. This allows applications
to be device independent.

Avoid discarding shared mbufs. Make a copy in rte_vlan_insert() of any
packet to be tagged that has a reference count > 1.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_ether/rte_ether.h | 76 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 76 insertions(+)

diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
index 187608d..3b6ab4b 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -49,6 +49,8 @@ extern "C" {

#include <rte_memcpy.h>
#include <rte_random.h>
+#include <rte_mbuf.h>
+#include <rte_byteorder.h>

#define ETHER_ADDR_LEN 6 /**< Length of Ethernet address. */
#define ETHER_TYPE_LEN 2 /**< Length of Ethernet type field. */
@@ -332,6 +334,80 @@ struct vxlan_hdr {
#define ETHER_VXLAN_HLEN (sizeof(struct udp_hdr) + sizeof(struct vxlan_hdr))
/**< VXLAN tunnel header length. */

+/**
+ * Extract VLAN tag information into mbuf
+ *
+ * Software version of VLAN stripping
+ *
+ * @param m
+ * The packet mbuf.
+ * @return
+ * - 0: Success
+ * - 1: not a vlan packet
+ */
+static inline int rte_vlan_strip(struct rte_mbuf *m)
+{
+ struct ether_hdr *eh
+ = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ if (eh->ether_type != ETHER_TYPE_VLAN)
+ return -1;
+
+ struct vlan_hdr *vh = (struct vlan_hdr *)(eh + 1);
+ m->ol_flags |= PKT_RX_VLAN_PKT;
+ m->vlan_tci = rte_be_to_cpu_16(vh->vlan_tci);
+
+ /* Copy ether header over rather than moving whole packet */
+ memmove(rte_pktmbuf_adj(m, sizeof(struct vlan_hdr)),
+ eh, 2 * ETHER_ADDR_LEN);
+
+ return 0;
+}
+
+/**
+ * Insert VLAN tag into mbuf.
+ *
+ * Software version of VLAN unstripping
+ *
+ * @param m
+ * The packet mbuf.
+ * @return
+ * - 0: On success
+ * -EPERM: mbuf is shared, overwriting would be unsafe
+ * -ENOSPC: not enough headroom in mbuf
+ */
+static inline int rte_vlan_insert(struct rte_mbuf **m)
+{
+ struct ether_hdr *oh, *nh;
+ struct vlan_hdr *vh;
+
+#ifdef RTE_MBUF_REFCNT
+ /* Can't insert header if mbuf is shared */
+ if (rte_mbuf_refcnt_read(*m) > 1) {
+ struct rte_mbuf *copy;
+
+ copy = rte_pktmbuf_clone(*m, (*m)->pool);
+ if (unlikely(copy == NULL))
+ return -ENOMEM;
+ rte_pktmbuf_free(*m);
+ *m = copy;
+ }
+#endif
+ oh = rte_pktmbuf_mtod(*m, struct ether_hdr *);
+ nh = (struct ether_hdr *)
+ rte_pktmbuf_prepend(*m, sizeof(struct vlan_hdr));
+ if (nh == NULL)
+ return -ENOSPC;
+
+ memmove(nh, oh, 2 * ETHER_ADDR_LEN);
+ nh->ether_type = rte_cpu_to_be_16(ETHER_TYPE_VLAN);
+
+ vh = (struct vlan_hdr *) (nh + 1);
+ vh->vlan_tci = rte_cpu_to_be_16((*m)->vlan_tci);
+
+ return 0;
+}
+
#ifdef __cplusplus
}
#endif
--
1.8.4.2
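The tag handling in this patch can be followed without the DPDK headers by sketching it on a plain byte buffer. What follows is a minimal illustration under stated assumptions, not the patch's code: frame_vlan_strip(), frame_vlan_insert() and the ETH_* macros are hypothetical names, and for insertion the caller is assumed to own 4 writable bytes of headroom directly in front of the frame. The TPID and TCI are read and written in network byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN  6      /* bytes per MAC address */
#define VLAN_TAG_LEN  4      /* TPID (2 bytes) + TCI (2 bytes) */
#define ETH_TYPE_VLAN 0x8100 /* 802.1Q TPID, host byte order */

/* Read a big-endian 16-bit value from a byte buffer. */
static uint16_t load_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Write a 16-bit value to a byte buffer in big-endian order. */
static void store_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/*
 * Strip the 802.1Q tag from the frame that starts at frame[0].
 * On success the TCI is stored in *tci and VLAN_TAG_LEN is returned;
 * the stripped frame then starts at frame + VLAN_TAG_LEN.
 * Returns -1 if the frame is too short or not tagged.
 */
static int frame_vlan_strip(uint8_t *frame, size_t len, uint16_t *tci)
{
	if (len < 2 * ETH_ADDR_LEN + VLAN_TAG_LEN + 2)
		return -1;
	if (load_be16(frame + 2 * ETH_ADDR_LEN) != ETH_TYPE_VLAN)
		return -1;

	*tci = load_be16(frame + 2 * ETH_ADDR_LEN + 2);
	/* Slide only the two MAC addresses forward over the tag. */
	memmove(frame + VLAN_TAG_LEN, frame, 2 * ETH_ADDR_LEN);
	return VLAN_TAG_LEN;
}

/*
 * Insert an 802.1Q tag in front of the EtherType.
 * Assumes VLAN_TAG_LEN writable bytes exist immediately before frame.
 * Returns the new start of the frame.
 */
static uint8_t *frame_vlan_insert(uint8_t *frame, uint16_t tci)
{
	uint8_t *nf = frame - VLAN_TAG_LEN;

	/* Move the MAC addresses back into the headroom. */
	memmove(nf, frame, 2 * ETH_ADDR_LEN);
	store_be16(nf + 2 * ETH_ADDR_LEN, ETH_TYPE_VLAN);
	store_be16(nf + 2 * ETH_ADDR_LEN + 2, tci);
	return nf;
}
```

In both directions only the 12 address bytes are moved, which is the same trick the patch uses to avoid shifting the whole packet.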
Ouyang Changchun
2015-01-15 05:15:16 UTC
Since vq_alignment is constant (always 4K), it does not
need to be part of the vring struct.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 1 -
lib/librte_pmd_virtio/virtio_rxtx.c | 2 +-
lib/librte_pmd_virtio/virtqueue.h | 3 +--
3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index c89614d..b7f65b9 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -294,7 +294,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
vq->port_id = dev->data->port_id;
vq->queue_id = queue_idx;
vq->vq_queue_index = vtpci_queue_idx;
- vq->vq_alignment = VIRTIO_PCI_VRING_ALIGN;
vq->vq_nentries = vq_size;
vq->vq_free_cnt = vq_size;

diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index 73ad3ac..b44f091 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -258,7 +258,7 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)
* Reinitialise since virtio port might have been stopped and restarted
*/
memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size);
- vring_init(vr, size, ring_mem, vq->vq_alignment);
+ vring_init(vr, size, ring_mem, VIRTIO_PCI_VRING_ALIGN);
vq->vq_used_cons_idx = 0;
vq->vq_desc_head_idx = 0;
vq->vq_avail_idx = 0;
diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index f6ad98d..5b8a255 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -138,8 +138,7 @@ struct virtqueue {
uint8_t port_id; /**< Device port identifier. */

void *vq_ring_virt_mem; /**< linear address of vring*/
- int vq_alignment;
- int vq_ring_size;
+ unsigned int vq_ring_size;
phys_addr_t vq_ring_mem; /**< physical address of vring */

struct vring vq_ring; /**< vring keeping desc, used and avail */
--
1.8.4.2
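To make the role of the fixed 4K alignment concrete, here is a rough sketch of how a ring's memory footprint depends on it. It follows the size formula given in the virtio specification for the legacy ring layout; the helper names (align_up, vring_bytes) and macros are hypothetical, and the driver's own vring_size() helper may differ by a few bytes depending on whether it counts the event-index fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VRING_ALIGN          4096u /* legacy virtio-pci ring alignment */
#define VRING_DESC_SIZE      16u   /* addr(8) + len(4) + flags(2) + next(2) */
#define VRING_USED_ELEM_SIZE 8u    /* id(4) + len(4) */

/* Round v up to the next multiple of a (a must be a power of two). */
static size_t align_up(size_t v, size_t a)
{
	return (v + a - 1) & ~(a - 1);
}

/* Bytes needed for a ring of num entries with the constant alignment. */
static size_t vring_bytes(unsigned int num)
{
	/* Descriptor table. */
	size_t sz = (size_t)num * VRING_DESC_SIZE;

	/* Available ring: flags, idx, ring[num], used_event. */
	sz += sizeof(uint16_t) * (3 + (size_t)num);

	/* The used ring starts on the next alignment boundary. */
	sz = align_up(sz, VRING_ALIGN);

	/* Used ring: flags, idx, avail_event, ring[num]. */
	sz += sizeof(uint16_t) * 3 + (size_t)num * VRING_USED_ELEM_SIZE;
	return sz;
}
```

Because the alignment is a compile-time constant, nothing about it needs to be stored per queue, which is the point of the patch.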
Ouyang Changchun
2015-01-15 05:15:17 UTC
Change the order of initialization to match the Linux kernel.
Don't blow away the control queue by doing a reset when stopped.

Calling dev_stop and then dev_start did not work: dev_stop called
virtio reset, which cleared all queues and all feature negotiation.
Resolved by doing the reset only on device removal.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 58 ++++++++++++++++++++---------------
lib/librte_pmd_virtio/virtio_pci.c | 10 ++----
lib/librte_pmd_virtio/virtio_pci.h | 3 +-
3 files changed, 37 insertions(+), 34 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index b7f65b9..a07f4ca 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -398,9 +398,14 @@ virtio_dev_cq_queue_setup(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx,
static void
virtio_dev_close(struct rte_eth_dev *dev)
{
+ struct virtio_hw *hw = dev->data->dev_private;
+
PMD_INIT_LOG(DEBUG, "virtio_dev_close");

- virtio_dev_stop(dev);
+ /* reset the NIC */
+ vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
+ vtpci_reset(hw);
+ virtio_dev_free_mbufs(dev);
}

static void
@@ -889,6 +894,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
if (rte_eal_process_type() == RTE_PROC_SECONDARY)
return 0;

+ /* Tell the host we've noticed this device. */
+ vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
+
pci_dev = eth_dev->pci_dev;
if (virtio_resource_init(pci_dev) < 0)
return -1;
@@ -899,9 +907,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
/* Reset the device although not necessary at startup */
vtpci_reset(hw);

- /* Tell the host we've noticed this device. */
- vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
-
/* Tell the host we've known how to drive the device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
virtio_negotiate_features(hw);
@@ -990,6 +995,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
/* Setup interrupt callback */
rte_intr_callback_register(&pci_dev->intr_handle,
virtio_interrupt_handler, eth_dev);
+
+ virtio_dev_cq_start(eth_dev);
+
return 0;
}

@@ -1044,7 +1052,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
{
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
struct virtio_hw *hw = dev->data->dev_private;
- int ret;

PMD_INIT_LOG(DEBUG, "configure");

@@ -1055,11 +1062,12 @@ virtio_dev_configure(struct rte_eth_dev *dev)

hw->vlan_strip = rxmode->hw_vlan_strip;

- ret = vtpci_irq_config(hw, 0);
- if (ret != 0)
+ if (vtpci_irq_config(hw, 0) == VIRTIO_MSI_NO_VECTOR) {
PMD_DRV_LOG(ERR, "failed to set config vector");
+ return -EBUSY;
+ }

- return ret;
+ return 0;
}


@@ -1069,17 +1077,6 @@ virtio_dev_start(struct rte_eth_dev *dev)
uint16_t nb_queues, i;
struct virtio_hw *hw = dev->data->dev_private;

- /* Tell the host we've noticed this device. */
- vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
-
- /* Tell the host we've known how to drive the device. */
- vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
-
- virtio_dev_cq_start(dev);
-
- /* Do final configuration before rx/tx engine starts */
- virtio_dev_rxtx_start(dev);
-
/* check if lsc interrupt feature is enabled */
if (dev->data->dev_conf.intr_conf.lsc) {
if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
@@ -1096,8 +1093,16 @@ virtio_dev_start(struct rte_eth_dev *dev)
/* Initialize Link state */
virtio_dev_link_update(dev, 0);

+ /* On restart after stop do not touch queues */
+ if (hw->started)
+ return 0;
+
vtpci_reinit_complete(hw);

+ /* Do final configuration before rx/tx engine starts */
+ virtio_dev_rxtx_start(dev);
+ hw->started = 1;
+
/*Notify the backend
*Otherwise the tap backend might already stop its queue due to fullness.
*vhost backend will have no chance to be waked up
@@ -1168,17 +1173,20 @@ static void virtio_dev_free_mbufs(struct rte_eth_dev *dev)
}

/*
- * Stop device: disable rx and tx functions to allow for reconfiguring.
+ * Stop device: disable interrupt and mark link down
*/
static void
virtio_dev_stop(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw = dev->data->dev_private;
+ struct rte_eth_link link;

- /* reset the NIC */
- vtpci_irq_config(hw, 0);
- vtpci_reset(hw);
- virtio_dev_free_mbufs(dev);
+ PMD_INIT_LOG(DEBUG, "stop");
+
+ if (dev->data->dev_conf.intr_conf.lsc)
+ rte_intr_disable(&dev->pci_dev->intr_handle);
+
+ memset(&link, 0, sizeof(link));
+ virtio_dev_atomic_write_link_status(dev, &link);
}

static int
diff --git a/lib/librte_pmd_virtio/virtio_pci.c b/lib/librte_pmd_virtio/virtio_pci.c
index 6d51032..b099e4f 100644
--- a/lib/librte_pmd_virtio/virtio_pci.c
+++ b/lib/librte_pmd_virtio/virtio_pci.c
@@ -137,15 +137,9 @@ vtpci_isr(struct virtio_hw *hw)


/* Enable one vector (0) for Link State Intrerrupt */
-int
+uint16_t
vtpci_irq_config(struct virtio_hw *hw, uint16_t vec)
{
VIRTIO_WRITE_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR, vec);
- vec = VIRTIO_READ_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR);
- if (vec == VIRTIO_MSI_NO_VECTOR) {
- PMD_DRV_LOG(ERR, "failed to set config vector");
- return -EBUSY;
- }
-
- return 0;
+ return VIRTIO_READ_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR);
}
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 6d93fac..0a4b578 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -170,6 +170,7 @@ struct virtio_hw {
uint16_t vtnet_hdr_size;
uint8_t vlan_strip;
uint8_t use_msix;
+ uint8_t started;
uint8_t mac_addr[ETHER_ADDR_LEN];
};

@@ -266,6 +267,6 @@ void vtpci_read_dev_config(struct virtio_hw *, uint64_t, void *, int);

uint8_t vtpci_isr(struct virtio_hw *);

-int vtpci_irq_config(struct virtio_hw *, uint16_t);
+uint16_t vtpci_irq_config(struct virtio_hw *, uint16_t);

#endif /* _VIRTIO_PCI_H_ */
--
1.8.4.2
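The intended lifecycle (reset only on close, stop leaves the queues alone, a second start is harmless) can be modelled with a small stand-in device. This is only a sketch: struct fake_dev and the dev_* functions are hypothetical, and the init sequence shown follows the virtio specification's order (reset, ACKNOWLEDGE, DRIVER, then DRIVER_OK) rather than reproducing the exact call order in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Device status bits as defined by the virtio specification. */
#define STATUS_ACK       0x01
#define STATUS_DRIVER    0x02
#define STATUS_DRIVER_OK 0x04

/* Hypothetical stand-in for the device plus the driver's private state. */
struct fake_dev {
	uint8_t status;   /* device status register */
	int queues_ready; /* nonzero once the rings are set up */
	int started;      /* plays the role of hw->started */
	int link_up;
};

/* A reset clears the status, the queues and all negotiated state. */
static void dev_reset(struct fake_dev *d)
{
	d->status = 0;
	d->queues_ready = 0;
	d->started = 0;
	d->link_up = 0;
}

static void dev_init(struct fake_dev *d)
{
	dev_reset(d);
	d->status |= STATUS_ACK;    /* we noticed the device */
	d->status |= STATUS_DRIVER; /* we know how to drive it */
	/* Feature negotiation and control queue setup would go here. */
}

static void dev_start(struct fake_dev *d)
{
	d->link_up = 1;
	if (d->started)
		return; /* restart after stop: do not touch the queues */
	d->status |= STATUS_DRIVER_OK;
	d->queues_ready = 1;
	d->started = 1;
}

/* Stop only marks the link down; no reset, so the queues survive. */
static void dev_stop(struct fake_dev *d)
{
	d->link_up = 0;
}

/* Close is the only place that resets the device. */
static void dev_close(struct fake_dev *d)
{
	dev_reset(d);
}
```

With this split, a stop followed by a start finds the status bits and queues exactly as they were, which is the behaviour the commit message asks for.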
Ouyang Changchun
2015-01-15 05:15:18 UTC
Make vtpci_get_status a local function as it is used in one file.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_pci.c | 4 +++-
lib/librte_pmd_virtio/virtio_pci.h | 2 --
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_pci.c b/lib/librte_pmd_virtio/virtio_pci.c
index b099e4f..2245bec 100644
--- a/lib/librte_pmd_virtio/virtio_pci.c
+++ b/lib/librte_pmd_virtio/virtio_pci.c
@@ -35,6 +35,8 @@
#include "virtio_pci.h"
#include "virtio_logs.h"

+static uint8_t vtpci_get_status(struct virtio_hw *);
+
void
vtpci_read_dev_config(struct virtio_hw *hw, uint64_t offset,
void *dst, int length)
@@ -113,7 +115,7 @@ vtpci_reinit_complete(struct virtio_hw *hw)
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER_OK);
}

-uint8_t
+static uint8_t
vtpci_get_status(struct virtio_hw *hw)
{
return VIRTIO_READ_REG_1(hw, VIRTIO_PCI_STATUS);
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 0a4b578..64d9c34 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -255,8 +255,6 @@ void vtpci_reset(struct virtio_hw *);

void vtpci_reinit_complete(struct virtio_hw *);

-uint8_t vtpci_get_status(struct virtio_hw *);
-
void vtpci_set_status(struct virtio_hw *, uint8_t);

uint32_t vtpci_negotiate_features(struct virtio_hw *, uint32_t);
--
1.8.4.2
Ouyang Changchun
2015-01-15 05:15:19 UTC
Better to check at compile time than fail at runtime.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index a07f4ca..c17cac8 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -882,11 +882,7 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
uint32_t offset_conf = sizeof(config->mac);
struct rte_pci_device *pci_dev;

- if (RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr)) {
- PMD_INIT_LOG(ERR,
- "MBUF HEADROOM should be enough to hold virtio net hdr\n");
- return -1;
- }
+ RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr));

eth_dev->dev_ops = &virtio_eth_dev_ops;
eth_dev->tx_pkt_burst = &virtio_xmit_pkts;
--
1.8.4.2
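For anyone unfamiliar with build-time assertions, the idea can be sketched with the usual negative-array-size idiom. MY_BUILD_BUG_ON, PKT_HEADROOM and struct net_hdr below are hypothetical stand-ins chosen for illustration; the struct only mimics a 10-byte header layout, and the headroom value is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * If cond is true the array size becomes -1 and compilation fails;
 * if cond is false the expression is a harmless sizeof(char[1]).
 */
#define MY_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

#define PKT_HEADROOM 128 /* hypothetical headroom in bytes */

/* Hypothetical 10-byte per-packet header. */
struct net_hdr {
	uint8_t flags;
	uint8_t gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* The check costs nothing at run time; it only has to compile. */
static size_t checked_hdr_size(void)
{
	MY_BUILD_BUG_ON(PKT_HEADROOM < sizeof(struct net_hdr));
	return sizeof(struct net_hdr);
}
```

Setting PKT_HEADROOM to something smaller than the header size (for example 8) makes this translation unit fail to compile, which is the desired early failure.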
Ouyang Changchun
2015-01-15 05:15:21 UTC
Virtio supports vlan filtering.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 31 +++++++++++++++++++++++++++++--
1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 13feda5..ec5a51e 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -84,6 +84,8 @@ static void virtio_dev_tx_queue_release(__rte_unused void *txq);
static void virtio_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats);
static void virtio_dev_stats_reset(struct rte_eth_dev *dev);
static void virtio_dev_free_mbufs(struct rte_eth_dev *dev);
+static int virtio_vlan_filter_set(struct rte_eth_dev *dev,
+ uint16_t vlan_id, int on);

static int virtio_dev_queue_stats_mapping_set(
__rte_unused struct rte_eth_dev *eth_dev,
@@ -511,6 +513,7 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
.tx_queue_release = virtio_dev_tx_queue_release,
/* collect stats per queue */
.queue_stats_mapping_set = virtio_dev_queue_stats_mapping_set,
+ .vlan_filter_set = virtio_vlan_filter_set,
};

static inline int
@@ -640,14 +643,31 @@ virtio_get_hwaddr(struct virtio_hw *hw)
}
}

+static int
+virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
+{
+ struct virtio_hw *hw = dev->data->dev_private;
+ struct virtio_pmd_ctrl ctrl;
+ int len;
+
+ if (!vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN))
+ return -ENOTSUP;
+
+ ctrl.hdr.class = VIRTIO_NET_CTRL_VLAN;
+ ctrl.hdr.cmd = on ? VIRTIO_NET_CTRL_VLAN_ADD : VIRTIO_NET_CTRL_VLAN_DEL;
+ memcpy(ctrl.data, &vlan_id, sizeof(vlan_id));
+ len = sizeof(vlan_id);
+
+ return virtio_send_command(hw->cvq, &ctrl, &len, 1);
+}

static void
virtio_negotiate_features(struct virtio_hw *hw)
{
uint32_t host_features, mask;

- mask = VIRTIO_NET_F_CTRL_VLAN;
- mask |= VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM;
+ /* checksum offload not implemented */
+ mask = VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM;

/* TSO and LRO are only available when their corresponding
* checksum offload feature is also negotiated.
@@ -1058,6 +1078,13 @@ virtio_dev_configure(struct rte_eth_dev *dev)

hw->vlan_strip = rxmode->hw_vlan_strip;

+ if (rxmode->hw_vlan_filter
+ && !vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) {
+ PMD_DRV_LOG(NOTICE,
+ "vlan filtering not available on this host");
+ return -ENOTSUP;
+ }
+
if (vtpci_irq_config(hw, 0) == VIRTIO_MSI_NO_VECTOR) {
PMD_DRV_LOG(ERR, "failed to set config vector");
return -EBUSY;
--
1.8.4.2
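As a rough picture of what the VLAN filter request looks like on the control queue, the sketch below builds the class/command header and the 2-byte VLAN id, after checking the negotiated feature bit. The numeric values are the ones from the virtio specification (feature bit 19, class 2, ADD 0, DEL 1); struct ctrl_msg and build_vlan_cmd() are hypothetical, and the real driver additionally queues the message and waits for the device's ack byte.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define F_CTRL_VLAN     (1u << 19) /* VIRTIO_NET_F_CTRL_VLAN */
#define CTRL_CLASS_VLAN 2          /* VIRTIO_NET_CTRL_VLAN */
#define CTRL_VLAN_ADD   0          /* VIRTIO_NET_CTRL_VLAN_ADD */
#define CTRL_VLAN_DEL   1          /* VIRTIO_NET_CTRL_VLAN_DEL */

/* Hypothetical flattened control message: header plus payload. */
struct ctrl_msg {
	uint8_t class_id;
	uint8_t cmd;
	uint8_t data[2]; /* VLAN id, copied as-is like the patch does */
};

/*
 * Fill *out with an add (on != 0) or delete (on == 0) request.
 * Returns -ENOTSUP when the feature was not negotiated.
 */
static int build_vlan_cmd(uint32_t features, uint16_t vlan_id, int on,
			  struct ctrl_msg *out)
{
	if (!(features & F_CTRL_VLAN))
		return -ENOTSUP;

	out->class_id = CTRL_CLASS_VLAN;
	out->cmd = on ? CTRL_VLAN_ADD : CTRL_VLAN_DEL;
	memcpy(out->data, &vlan_id, sizeof(vlan_id));
	return 0;
}
```

Checking the feature bit first mirrors the patch: the filter table lives in the host, so without VIRTIO_NET_F_CTRL_VLAN there is nothing to program.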
Ouyang Changchun
2015-01-15 05:15:20 UTC
If allocation fails, we don't want to leave the virtio device stuck
in the middle of its initialization sequence.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index c17cac8..13feda5 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -890,6 +890,15 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
if (rte_eal_process_type() == RTE_PROC_SECONDARY)
return 0;

+ /* Allocate memory for storing MAC addresses */
+ eth_dev->data->mac_addrs = rte_zmalloc("virtio", ETHER_ADDR_LEN, 0);
+ if (eth_dev->data->mac_addrs == NULL) {
+ PMD_INIT_LOG(ERR,
+ "Failed to allocate %d bytes needed to store MAC addresses",
+ ETHER_ADDR_LEN);
+ return -ENOMEM;
+ }
+
/* Tell the host we've noticed this device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);

@@ -916,15 +925,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr);
}

- /* Allocate memory for storing MAC addresses */
- eth_dev->data->mac_addrs = rte_zmalloc("virtio", ETHER_ADDR_LEN, 0);
- if (eth_dev->data->mac_addrs == NULL) {
- PMD_INIT_LOG(ERR,
- "Failed to allocate %d bytes needed to store MAC addresses",
- ETHER_ADDR_LEN);
- return -ENOMEM;
- }
-
/* Copy the permanent MAC address to: virtio_hw */
virtio_get_hwaddr(hw);
ether_addr_copy((struct ether_addr *) hw->mac_addr,
--
1.8.4.2
Ouyang Changchun
2015-01-15 05:15:22 UTC
Virtio supports multiple MAC addresses.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 94 ++++++++++++++++++++++++++++++++++-
lib/librte_pmd_virtio/virtio_ethdev.h | 3 +-
lib/librte_pmd_virtio/virtqueue.h | 34 ++++++++++++-
3 files changed, 127 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index ec5a51e..e469ac2 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -86,6 +86,10 @@ static void virtio_dev_stats_reset(struct rte_eth_dev *dev);
static void virtio_dev_free_mbufs(struct rte_eth_dev *dev);
static int virtio_vlan_filter_set(struct rte_eth_dev *dev,
uint16_t vlan_id, int on);
+static void virtio_mac_addr_add(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr,
+ uint32_t index, uint32_t vmdq __rte_unused);
+static void virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index);

static int virtio_dev_queue_stats_mapping_set(
__rte_unused struct rte_eth_dev *eth_dev,
@@ -503,8 +507,6 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
.stats_get = virtio_dev_stats_get,
.stats_reset = virtio_dev_stats_reset,
.link_update = virtio_dev_link_update,
- .mac_addr_add = NULL,
- .mac_addr_remove = NULL,
.rx_queue_setup = virtio_dev_rx_queue_setup,
/* meaningfull only to multiple queue */
.rx_queue_release = virtio_dev_rx_queue_release,
@@ -514,6 +516,8 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
/* collect stats per queue */
.queue_stats_mapping_set = virtio_dev_queue_stats_mapping_set,
.vlan_filter_set = virtio_vlan_filter_set,
+ .mac_addr_add = virtio_mac_addr_add,
+ .mac_addr_remove = virtio_mac_addr_remove,
};

static inline int
@@ -644,6 +648,92 @@ virtio_get_hwaddr(struct virtio_hw *hw)
}

static int
+virtio_mac_table_set(struct virtio_hw *hw,
+ const struct virtio_net_ctrl_mac *uc,
+ const struct virtio_net_ctrl_mac *mc)
+{
+ struct virtio_pmd_ctrl ctrl;
+ int err, len[2];
+
+ ctrl.hdr.class = VIRTIO_NET_CTRL_MAC;
+ ctrl.hdr.cmd = VIRTIO_NET_CTRL_MAC_TABLE_SET;
+
+ len[0] = uc->entries * ETHER_ADDR_LEN + sizeof(uc->entries);
+ memcpy(ctrl.data, uc, len[0]);
+
+ len[1] = mc->entries * ETHER_ADDR_LEN + sizeof(mc->entries);
+ memcpy(ctrl.data + len[0], mc, len[1]);
+
+ err = virtio_send_command(hw->cvq, &ctrl, len, 2);
+ if (err != 0)
+ PMD_DRV_LOG(NOTICE, "mac table set failed: %d", err);
+
+ return err;
+}
+
+static void
+virtio_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
+ uint32_t index, uint32_t vmdq __rte_unused)
+{
+ struct virtio_hw *hw = dev->data->dev_private;
+ const struct ether_addr *addrs = dev->data->mac_addrs;
+ unsigned int i;
+ struct virtio_net_ctrl_mac *uc, *mc;
+
+ if (index >= VIRTIO_MAX_MAC_ADDRS) {
+ PMD_DRV_LOG(ERR, "mac address index %u out of range", index);
+ return;
+ }
+
+ uc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(uc->entries));
+ uc->entries = 0;
+ mc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(mc->entries));
+ mc->entries = 0;
+
+ for (i = 0; i < VIRTIO_MAX_MAC_ADDRS; i++) {
+ const struct ether_addr *addr
+ = (i == index) ? mac_addr : addrs + i;
+ struct virtio_net_ctrl_mac *tbl
+ = is_multicast_ether_addr(addr) ? mc : uc;
+
+ memcpy(&tbl->macs[tbl->entries++], addr, ETHER_ADDR_LEN);
+ }
+
+ virtio_mac_table_set(hw, uc, mc);
+}
+
+static void
+virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
+{
+ struct virtio_hw *hw = dev->data->dev_private;
+ struct ether_addr *addrs = dev->data->mac_addrs;
+ struct virtio_net_ctrl_mac *uc, *mc;
+ unsigned int i;
+
+ if (index >= VIRTIO_MAX_MAC_ADDRS) {
+ PMD_DRV_LOG(ERR, "mac address index %u out of range", index);
+ return;
+ }
+
+ uc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(uc->entries));
+ uc->entries = 0;
+ mc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(mc->entries));
+ mc->entries = 0;
+
+ for (i = 0; i < VIRTIO_MAX_MAC_ADDRS; i++) {
+ struct virtio_net_ctrl_mac *tbl;
+
+ if (i == index || is_zero_ether_addr(addrs + i))
+ continue;
+
+ tbl = is_multicast_ether_addr(addrs + i) ? mc : uc;
+ memcpy(&tbl->macs[tbl->entries++], addrs + i, ETHER_ADDR_LEN);
+ }
+
+ virtio_mac_table_set(hw, uc, mc);
+}
+
+static int
virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
{
struct virtio_hw *hw = dev->data->dev_private;
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.h b/lib/librte_pmd_virtio/virtio_ethdev.h
index 55c9749..74ac7e0 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.h
+++ b/lib/librte_pmd_virtio/virtio_ethdev.h
@@ -51,7 +51,7 @@

#define VIRTIO_MAX_RX_QUEUES 128
#define VIRTIO_MAX_TX_QUEUES 128
-#define VIRTIO_MAX_MAC_ADDRS 1
+#define VIRTIO_MAX_MAC_ADDRS 64
#define VIRTIO_MIN_RX_BUFSIZE 64
#define VIRTIO_MAX_RX_PKTLEN 9728

@@ -60,6 +60,7 @@
(VIRTIO_NET_F_MAC | \
VIRTIO_NET_F_STATUS | \
VIRTIO_NET_F_MQ | \
+ VIRTIO_NET_F_CTRL_MAC_ADDR | \
VIRTIO_NET_F_CTRL_VQ | \
VIRTIO_NET_F_CTRL_RX | \
VIRTIO_NET_F_CTRL_VLAN | \
diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index 5b8a255..d210f4f 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -99,6 +99,34 @@ enum { VTNET_RQ = 0, VTNET_TQ = 1, VTNET_CQ = 2 };
#define VIRTIO_NET_CTRL_RX_NOBCAST 5

/**
+ * Control the MAC
+ *
+ * The MAC filter table is managed by the hypervisor, the guest should
+ * assume the size is infinite. Filtering should be considered
+ * non-perfect, ie. based on hypervisor resources, the guest may
+ * receive packets from sources not specified in the filter list.
+ *
+ * In addition to the class/cmd header, the TABLE_SET command requires
+ * two out scatterlists. Each contains a 4 byte count of entries followed
+ * by a concatenated byte stream of the ETH_ALEN MAC addresses. The
+ * first sg list contains unicast addresses, the second is for multicast.
+ * This functionality is present if the VIRTIO_NET_F_CTRL_RX feature
+ * is available.
+ *
+ * The ADDR_SET command requests one out scatterlist, it contains a
+ * 6 bytes MAC address. This functionality is present if the
+ * VIRTIO_NET_F_CTRL_MAC_ADDR feature is available.
+ */
+struct virtio_net_ctrl_mac {
+ uint32_t entries;
+ uint8_t macs[][ETHER_ADDR_LEN];
+} __attribute__((__packed__));
+
+#define VIRTIO_NET_CTRL_MAC 1
+ #define VIRTIO_NET_CTRL_MAC_TABLE_SET 0
+ #define VIRTIO_NET_CTRL_MAC_ADDR_SET 1
+
+/**
* Control VLAN filtering
*
* The VLAN filter table is controlled via a simple ADD/DEL interface.
@@ -121,7 +149,7 @@ typedef uint8_t virtio_net_ctrl_ack;
#define VIRTIO_NET_OK 0
#define VIRTIO_NET_ERR 1

-#define VIRTIO_MAX_CTRL_DATA 128
+#define VIRTIO_MAX_CTRL_DATA 2048

struct virtio_pmd_ctrl {
struct virtio_net_ctrl_hdr hdr;
@@ -180,6 +208,10 @@ struct virtqueue {
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN 1
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX 0x8000
#endif
+#ifndef VIRTIO_NET_F_CTRL_MAC_ADDR
+#define VIRTIO_NET_F_CTRL_MAC_ADDR 0x800000
+#define VIRTIO_NET_CTRL_MAC_ADDR_SET 1
+#endif

/**
* This is the first element of the scatter-gather list. If you don't
--
1.8.4.2
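The core of the TABLE_SET handling is splitting the port's address list into a unicast table and a multicast table. A minimal sketch of that split is below; struct mac_table, split_mac_tables() and the small MAX_MACS value are hypothetical, and unlike the add path in the patch this sketch skips all-zero (unused) slots in every case.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN  6
#define MAX_MACS 4 /* small hypothetical table size for the sketch */

/* Entry count followed by the addresses, as TABLE_SET expects. */
struct mac_table {
	uint32_t entries;
	uint8_t macs[MAX_MACS][MAC_LEN];
};

static int mac_is_zero(const uint8_t *m)
{
	static const uint8_t zero[MAC_LEN];

	return memcmp(m, zero, MAC_LEN) == 0;
}

/* Multicast addresses have the low bit of the first octet set. */
static int mac_is_multicast(const uint8_t *m)
{
	return m[0] & 0x01;
}

/*
 * addrs points at n consecutive 6-byte addresses. Unused (all-zero)
 * slots are skipped; the rest go to uc or mc by address type.
 */
static void split_mac_tables(const uint8_t *addrs, unsigned int n,
			     struct mac_table *uc, struct mac_table *mc)
{
	unsigned int i;

	uc->entries = 0;
	mc->entries = 0;

	for (i = 0; i < n && i < MAX_MACS; i++) {
		const uint8_t *a = addrs + (size_t)i * MAC_LEN;
		struct mac_table *tbl;

		if (mac_is_zero(a))
			continue;

		tbl = mac_is_multicast(a) ? mc : uc;
		memcpy(tbl->macs[tbl->entries++], a, MAC_LEN);
	}
}
```

Each resulting table is then sent as its own scatterlist, with the length computed from the entry count as in virtio_mac_table_set().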
Ouyang Changchun
2015-01-15 05:15:23 UTC
Setting the default MAC address needs special handling.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_ether/rte_ethdev.h | 5 +++++
lib/librte_pmd_virtio/virtio_ethdev.c | 24 ++++++++++++++++++++++++
2 files changed, 29 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 07d55b8..cbe3fdf 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1249,6 +1249,10 @@ typedef void (*eth_mac_addr_add_t)(struct rte_eth_dev *dev,
uint32_t vmdq);
/**< @internal Set a MAC address into Receive Address Address Register */

+typedef void (*eth_mac_addr_set_t)(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr);
+/**< @internal Set a MAC address into Receive Address Address Register */
+
typedef int (*eth_uc_hash_table_set_t)(struct rte_eth_dev *dev,
struct ether_addr *mac_addr,
uint8_t on);
@@ -1482,6 +1486,7 @@ struct eth_dev_ops {
priority_flow_ctrl_set_t priority_flow_ctrl_set; /**< Setup priority flow control.*/
eth_mac_addr_remove_t mac_addr_remove; /**< Remove MAC address */
eth_mac_addr_add_t mac_addr_add; /**< Add a MAC address */
+ eth_mac_addr_set_t mac_addr_set; /**< Set a MAC address */
eth_uc_hash_table_set_t uc_hash_table_set; /**< Set Unicast Table Array */
eth_uc_all_hash_table_set_t uc_all_hash_table_set; /**< Set Unicast hash bitmap */
eth_mirror_rule_set_t mirror_rule_set; /**< Add a traffic mirror rule.*/
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index e469ac2..c5f21c1 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -90,6 +90,8 @@ static void virtio_mac_addr_add(struct rte_eth_dev *dev,
struct ether_addr *mac_addr,
uint32_t index, uint32_t vmdq __rte_unused);
static void virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index);
+static void virtio_mac_addr_set(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr);

static int virtio_dev_queue_stats_mapping_set(
__rte_unused struct rte_eth_dev *eth_dev,
@@ -518,6 +520,7 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
.vlan_filter_set = virtio_vlan_filter_set,
.mac_addr_add = virtio_mac_addr_add,
.mac_addr_remove = virtio_mac_addr_remove,
+ .mac_addr_set = virtio_mac_addr_set,
};

static inline int
@@ -733,6 +736,27 @@ virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
virtio_mac_table_set(hw, uc, mc);
}

+static void
+virtio_mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr)
+{
+ struct virtio_hw *hw = dev->data->dev_private;
+
+ memcpy(hw->mac_addr, mac_addr, ETHER_ADDR_LEN);
+
+ /* Use atomic update if available */
+ if (vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_MAC_ADDR)) {
+ struct virtio_pmd_ctrl ctrl;
+ int len = ETHER_ADDR_LEN;
+
+ ctrl.hdr.class = VIRTIO_NET_CTRL_MAC;
+ ctrl.hdr.cmd = VIRTIO_NET_CTRL_MAC_ADDR_SET;
+
+ memcpy(ctrl.data, mac_addr, ETHER_ADDR_LEN);
+ virtio_send_command(hw->cvq, &ctrl, &len, 1);
+ } else if (vtpci_with_feature(hw, VIRTIO_NET_F_MAC))
+ virtio_set_hwaddr(hw);
+}
+
static int
virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
{
--
1.8.4.2
Ouyang Changchun
2015-01-15 05:15:24 UTC
This makes the virtio driver work like ixgbe. Transmit buffers are
held until a transmit threshold is reached. The previous behavior
was to hold mbufs until the ring entry was reused, which caused
more memory usage than needed.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 7 ++--
lib/librte_pmd_virtio/virtio_rxtx.c | 75 +++++++++++++++++++++++++----------
lib/librte_pmd_virtio/virtqueue.h | 3 +-
3 files changed, 60 insertions(+), 25 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index c5f21c1..1ec29e1 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -176,15 +176,16 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,

virtqueue_notify(vq);

- while (vq->vq_used_cons_idx == vq->vq_ring.used->idx)
+ rte_rmb();
+ while (vq->vq_used_cons_idx == vq->vq_ring.used->idx) {
+ rte_rmb();
usleep(100);
+ }

while (vq->vq_used_cons_idx != vq->vq_ring.used->idx) {
uint32_t idx, desc_idx, used_idx;
struct vring_used_elem *uep;

- virtio_rmb();
-
used_idx = (uint32_t)(vq->vq_used_cons_idx
& (vq->vq_nentries - 1));
uep = &vq->vq_ring.used->ring[used_idx];
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index b44f091..12c2310 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -129,17 +129,32 @@ virtqueue_dequeue_burst_rx(struct virtqueue *vq, struct rte_mbuf **rx_pkts,
return i;
}

+#ifndef DEFAULT_TX_FREE_THRESH
+#define DEFAULT_TX_FREE_THRESH 32
+#endif
+
+/* Cleanup from completed transmits. */
static void
-virtqueue_dequeue_pkt_tx(struct virtqueue *vq)
+virtio_xmit_cleanup(struct virtqueue *vq, uint16_t num)
{
- struct vring_used_elem *uep;
- uint16_t used_idx, desc_idx;
+ uint16_t i, used_idx, desc_idx;
+ for (i = 0; i < num; i++) {
+ struct vring_used_elem *uep;
+ struct vq_desc_extra *dxp;
+
+ used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
+ uep = &vq->vq_ring.used->ring[used_idx];
+ dxp = &vq->vq_descx[used_idx];
+
+ desc_idx = (uint16_t) uep->id;
+ vq->vq_used_cons_idx++;
+ vq_ring_free_chain(vq, desc_idx);

- used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
- uep = &vq->vq_ring.used->ring[used_idx];
- desc_idx = (uint16_t) uep->id;
- vq->vq_used_cons_idx++;
- vq_ring_free_chain(vq, desc_idx);
+ if (dxp->cookie != NULL) {
+ rte_pktmbuf_free(dxp->cookie);
+ dxp->cookie = NULL;
+ }
+ }
}


@@ -203,8 +218,6 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)

idx = head_idx;
dxp = &txvq->vq_descx[idx];
- if (dxp->cookie != NULL)
- rte_pktmbuf_free(dxp->cookie);
dxp->cookie = (void *)cookie;
dxp->ndescs = needed;

@@ -404,6 +417,7 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
{
uint8_t vtpci_queue_idx = 2 * queue_idx + VTNET_SQ_TQ_QUEUE_IDX;
struct virtqueue *vq;
+ uint16_t tx_free_thresh;
int ret;

PMD_INIT_FUNC_TRACE();
@@ -421,6 +435,22 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
return ret;
}

+ tx_free_thresh = tx_conf->tx_free_thresh;
+ if (tx_free_thresh == 0)
+ tx_free_thresh =
+ RTE_MIN(vq->vq_nentries / 4, DEFAULT_TX_FREE_THRESH);
+
+ if (tx_free_thresh >= (vq->vq_nentries - 3)) {
+ RTE_LOG(ERR, PMD, "tx_free_thresh must be less than the "
+ "number of TX entries minus 3 (%u)."
+ " (tx_free_thresh=%u port=%u queue=%u)\n",
+ vq->vq_nentries - 3,
+ tx_free_thresh, dev->data->port_id, queue_idx);
+ return -EINVAL;
+ }
+
+ vq->vq_free_thresh = tx_free_thresh;
+
dev->data->tx_queues[queue_idx] = vq;
return 0;
}
@@ -688,11 +718,9 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
{
struct virtqueue *txvq = tx_queue;
struct rte_mbuf *txm;
- uint16_t nb_used, nb_tx, num;
+ uint16_t nb_used, nb_tx;
int error;

- nb_tx = 0;
-
if (unlikely(nb_pkts < 1))
return nb_pkts;

@@ -700,21 +728,26 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
nb_used = VIRTQUEUE_NUSED(txvq);

virtio_rmb();
+ if (likely(nb_used > txvq->vq_free_thresh))
+ virtio_xmit_cleanup(txvq, nb_used);

- num = (uint16_t)(likely(nb_used < VIRTIO_MBUF_BURST_SZ) ? nb_used : VIRTIO_MBUF_BURST_SZ);
+ nb_tx = 0;

while (nb_tx < nb_pkts) {
/* Need one more descriptor for virtio header. */
int need = tx_pkts[nb_tx]->nb_segs - txvq->vq_free_cnt + 1;
- int deq_cnt = RTE_MIN(need, (int)num);

- num -= (deq_cnt > 0) ? deq_cnt : 0;
- while (deq_cnt > 0) {
- virtqueue_dequeue_pkt_tx(txvq);
- deq_cnt--;
+ /* A positive value indicates that vring descriptors must be freed */
+ if (unlikely(need > 0)) {
+ nb_used = VIRTQUEUE_NUSED(txvq);
+ virtio_rmb();
+ need = RTE_MIN(need, (int)nb_used);
+
+ virtio_xmit_cleanup(txvq, need);
+ need = (int)tx_pkts[nb_tx]->nb_segs -
+ txvq->vq_free_cnt + 1;
}

- need = (int)tx_pkts[nb_tx]->nb_segs - txvq->vq_free_cnt + 1;
/*
* Zero or negative value indicates it has enough free
* descriptors to use for transmitting.
@@ -723,7 +756,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
txm = tx_pkts[nb_tx];

/* Do VLAN tag insertion */
- if (txm->ol_flags & PKT_TX_VLAN_PKT) {
+ if (unlikely(txm->ol_flags & PKT_TX_VLAN_PKT)) {
error = rte_vlan_insert(&txm);
if (unlikely(error)) {
rte_pktmbuf_free(txm);
diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index d210f4f..6c45c27 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -164,6 +164,7 @@ struct virtqueue {
struct rte_mempool *mpool; /**< mempool for mbuf allocation */
uint16_t queue_id; /**< DPDK queue index. */
uint8_t port_id; /**< Device port identifier. */
+ uint16_t vq_queue_index; /**< PCI queue index */

void *vq_ring_virt_mem; /**< linear address of vring*/
unsigned int vq_ring_size;
@@ -172,7 +173,7 @@ struct virtqueue {
struct vring vq_ring; /**< vring keeping desc, used and avail */
uint16_t vq_free_cnt; /**< num of desc available */
uint16_t vq_nentries; /**< vring desc numbers */
- uint16_t vq_queue_index; /**< PCI queue index */
+ uint16_t vq_free_thresh; /**< free threshold */
/**
* Head of the free chain in the descriptor table. If
* there are no free descriptors, this will be set to
--
1.8.4.2
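The transmit loop in the patch above computes how many descriptors a packet is still short of, reclaims at most that many from the used ring, and then recomputes. A minimal sketch of that arithmetic follows. The struct and function names are hypothetical, and it assumes each reclaimed used entry frees exactly one descriptor, which the real virtio_xmit_cleanup does not guarantee for chained descriptors.

```c
#include <stdint.h>

/* Hypothetical stand-in for the few virtqueue fields used here. */
struct txq_state {
    uint16_t free_cnt;  /* free descriptors in the ring */
    uint16_t nb_used;   /* entries the host has handed back */
};

/* Descriptors still missing for a packet of nb_segs segments.
 * One extra descriptor carries the virtio header.
 * A result <= 0 means the packet fits. */
static int tx_desc_shortfall(const struct txq_state *q, uint16_t nb_segs)
{
    return (int)nb_segs - (int)q->free_cnt + 1;
}

/* Reclaim at most the shortfall, bounded by what the host has
 * returned, then report the remaining shortfall. */
static int tx_reclaim_for(struct txq_state *q, uint16_t nb_segs)
{
    int need = tx_desc_shortfall(q, nb_segs);

    if (need > 0) {
        int reclaim = need < (int)q->nb_used ? need : (int)q->nb_used;

        /* Assumption: one descriptor per reclaimed used entry. */
        q->nb_used -= (uint16_t)reclaim;
        q->free_cnt += (uint16_t)reclaim;
        need = tx_desc_shortfall(q, nb_segs);
    }
    return need;
}
```

A caller would stop transmitting when the returned value is still positive, since the ring cannot hold the packet yet.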
Ouyang Changchun
2015-01-15 05:15:26 UTC
Permalink
It should use vring descriptor index instead of used_ring index to index vq_descx.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_rxtx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index 12c2310..2529dc4 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -144,9 +144,9 @@ virtio_xmit_cleanup(struct virtqueue *vq, uint16_t num)

used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
uep = &vq->vq_ring.used->ring[used_idx];
- dxp = &vq->vq_descx[used_idx];

desc_idx = (uint16_t) uep->id;
+ dxp = &vq->vq_descx[desc_idx];
vq->vq_used_cons_idx++;
vq_ring_free_chain(vq, desc_idx);
--
1.8.4.2
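The fix above matters because the host may return descriptors in a different order than they were posted, so the slot in the used ring and the descriptor it names can differ. The following is a small sketch of that lookup, using a hypothetical mini_vq type rather than the real struct virtqueue; only the layout of the used element (id, len) follows the virtio ring format.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 4  /* must be a power of two */

struct used_elem { uint32_t id; uint32_t len; };  /* mirrors vring_used_elem */

/* Hypothetical, reduced queue state for illustration only. */
struct mini_vq {
    struct used_elem used[RING_SIZE];  /* used ring, written by the host */
    void *cookie[RING_SIZE];           /* per-descriptor side table, like vq_descx */
    uint16_t used_cons_idx;            /* free-running consumer index */
};

/* Pop one used entry and return the cookie of the descriptor it names. */
static void *mini_vq_pop_cookie(struct mini_vq *vq)
{
    uint16_t used_idx = vq->used_cons_idx & (RING_SIZE - 1);
    uint16_t desc_idx = (uint16_t)vq->used[used_idx].id;
    void *c = vq->cookie[desc_idx];  /* index by descriptor, not by used slot */

    vq->cookie[desc_idx] = NULL;
    vq->used_cons_idx++;
    return c;
}
```

If the side table were indexed with used_idx instead, slot 0 of the used ring naming descriptor 2 would return the cookie of descriptor 0.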
Stephen Hemminger
2015-01-15 16:54:02 UTC
Permalink
On Thu, 15 Jan 2015 13:15:26 +0800
Post by Ouyang Changchun
It should use vring descriptor index instead of used_ring index to index vq_descx.
---
lib/librte_pmd_virtio/virtio_rxtx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index 12c2310..2529dc4 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -144,9 +144,9 @@ virtio_xmit_cleanup(struct virtqueue *vq, uint16_t num)
used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
uep = &vq->vq_ring.used->ring[used_idx];
- dxp = &vq->vq_descx[used_idx];
desc_idx = (uint16_t) uep->id;
+ dxp = &vq->vq_descx[desc_idx];
vq->vq_used_cons_idx++;
vq_ring_free_chain(vq, desc_idx);
Rather than patching code added by an earlier patch in the series, why
not just fix/merge the two patches?
Ouyang, Changchun
2015-01-16 00:55:35 UTC
Permalink
-----Original Message-----
Sent: Friday, January 16, 2015 12:54 AM
To: Ouyang, Changchun
Subject: Re: [PATCH 18/22] virtio: Fix descriptor index issue
On Thu, 15 Jan 2015 13:15:26 +0800
Post by Ouyang Changchun
It should use vring descriptor index instead of used_ring index to index vq_descx.
---
lib/librte_pmd_virtio/virtio_rxtx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c
b/lib/librte_pmd_virtio/virtio_rxtx.c
index 12c2310..2529dc4 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -144,9 +144,9 @@ virtio_xmit_cleanup(struct virtqueue *vq, uint16_t num)
used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
uep = &vq->vq_ring.used->ring[used_idx];
- dxp = &vq->vq_descx[used_idx];
desc_idx = (uint16_t) uep->id;
+ dxp = &vq->vq_descx[desc_idx];
vq->vq_used_cons_idx++;
vq_ring_free_chain(vq, desc_idx);
Rather than patching code added by an earlier patch in the series, why not just
fix/merge the two patches?
I think keeping them separate makes it clearer what the original patch looks like and what I needed to fix.
It also resolves the patch authorship question, which I don't care about, but someone else may. :-)
Thanks
Changchun
Ouyang Changchun
2015-01-15 05:15:27 UTC
Permalink
The vlan ether type must be converted from CPU byte order to BE (big endian).

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_ether/rte_ether.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
index 3b6ab4b..90fb3c9 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -350,7 +350,7 @@ static inline int rte_vlan_strip(struct rte_mbuf *m)
struct ether_hdr *eh
= rte_pktmbuf_mtod(m, struct ether_hdr *);

- if (eh->ether_type != ETHER_TYPE_VLAN)
+ if (eh->ether_type != rte_cpu_to_be_16(ETHER_TYPE_VLAN))
return -1;

struct vlan_hdr *vh = (struct vlan_hdr *)(eh + 1);
@@ -400,7 +400,7 @@ static inline int rte_vlan_insert(struct rte_mbuf **m)
return -ENOSPC;

memmove(nh, oh, 2 * ETHER_ADDR_LEN);
- nh->ether_type = ETHER_TYPE_VLAN;
+ nh->ether_type = rte_cpu_to_be_16(ETHER_TYPE_VLAN);

vh = (struct vlan_hdr *) (nh + 1);
vh->vlan_tci = rte_cpu_to_be_16((*m)->vlan_tci);
--
1.8.4.2
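The reason for the conversion is that the ether type sits in the frame in network (big endian) byte order, so comparing it against a host-order constant only works on big endian CPUs. A minimal sketch on a raw frame buffer, using the POSIX htons() as a stand-in for rte_cpu_to_be_16(); the function name is hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define VLAN_ETHER_TYPE 0x8100  /* same value as ETHER_TYPE_VLAN */

/* Return 1 if the Ethernet frame carries an 802.1Q tag, 0 otherwise.
 * The frame must hold at least 14 bytes. */
static int frame_is_vlan_tagged(const uint8_t *frame)
{
    uint16_t type_be;

    /* The type field follows the two 6-byte MAC addresses. */
    memcpy(&type_be, frame + 12, sizeof(type_be));
    return type_be == htons(VLAN_ETHER_TYPE);
}
```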
Ouyang Changchun
2015-01-15 05:15:25 UTC
Permalink
Make virtio not require UIO, for security reasons; this matches 6Wind's virtio-net-pmd.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
config/common_linuxapp | 2 +
lib/librte_eal/common/include/rte_pci.h | 4 ++
lib/librte_eal/linuxapp/eal/eal_pci.c | 5 +-
lib/librte_pmd_virtio/virtio_ethdev.c | 91 ++++++++++++++++++++++++++++++++-
4 files changed, 100 insertions(+), 2 deletions(-)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 6243d4b..a3227a2 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -125,6 +125,8 @@ CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=n
CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
CONFIG_RTE_EAL_IGB_UIO=y
CONFIG_RTE_EAL_VFIO=y
+# Only for VIRTIO PMD currently
+CONFIG_RTE_EAL_PORT_IO=n

#
# Special configurations in PCI Config Space for high performance
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 66ed793..19abc1f 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -193,6 +193,10 @@ struct rte_pci_driver {

/** Device needs PCI BAR mapping (done with either IGB_UIO or VFIO) */
#define RTE_PCI_DRV_NEED_MAPPING 0x0001
+/** Device needs port IO(done with /proc/ioports) */
+#ifdef RTE_EAL_PORT_IO
+#define RTE_PCI_DRV_PORT_IO 0x0002
+#endif
/** Device driver must be registered several times until failure - deprecated */
#pragma GCC poison RTE_PCI_DRV_MULTIPLE
/** Device needs to be unbound even if no module is provided */
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index b5f5410..5db0059 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -574,7 +574,10 @@ rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *d
/* map resources for devices that use igb_uio */
ret = pci_map_device(dev);
if (ret != 0)
- return ret;
+#ifdef RTE_EAL_PORT_IO
+ if ((dr->drv_flags & RTE_PCI_DRV_PORT_IO) == 0)
+#endif
+ return ret;
} else if (dr->drv_flags & RTE_PCI_DRV_FORCE_UNBIND &&
rte_eal_process_type() == RTE_PROC_PRIMARY) {
/* unbind current driver */
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 1ec29e1..15324c9 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -961,6 +961,71 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev)
start, size);
return 0;
}
+
+#ifdef RTE_EAL_PORT_IO
+/* Extract port I/O numbers from proc/ioports */
+static int virtio_resource_init_by_portio(struct rte_pci_device *pci_dev)
+{
+ uint16_t start, end;
+ int size;
+ FILE *fp;
+ char *line = NULL;
+ char pci_id[16];
+ int found = 0;
+ size_t linesz;
+
+ snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,
+ pci_dev->addr.domain,
+ pci_dev->addr.bus,
+ pci_dev->addr.devid,
+ pci_dev->addr.function);
+
+ fp = fopen("/proc/ioports", "r");
+ if (fp == NULL) {
+ PMD_INIT_LOG(ERR, "%s(): can't open ioports", __func__);
+ return -1;
+ }
+
+ while (getdelim(&line, &linesz, '\n', fp) > 0) {
+ char *ptr = line;
+ char *left;
+ int n;
+
+ n = strcspn(ptr, ":");
+ ptr[n] = 0;
+ left = &ptr[n+1];
+
+ while (*left && isspace(*left))
+ left++;
+
+ if (!strncmp(left, pci_id, strlen(pci_id))) {
+ found = 1;
+
+ while (*ptr && isspace(*ptr))
+ ptr++;
+
+ sscanf(ptr, "%04hx-%04hx", &start, &end);
+ size = end - start + 1;
+
+ break;
+ }
+ }
+
+ free(line);
+ fclose(fp);
+
+ if (!found)
+ return -1;
+
+ pci_dev->mem_resource[0].addr = (void *)(uintptr_t)(uint32_t)start;
+ pci_dev->mem_resource[0].len = (uint64_t)size;
+ PMD_INIT_LOG(DEBUG,
+ "PCI Port IO found start=0x%x with size=0x%x",
+ (unsigned int)start, (unsigned int)size);
+ return 0;
+}
+#endif
+
#else
static int
virtio_has_msix(const struct rte_pci_addr *loc __rte_unused)
@@ -974,6 +1039,14 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused)
/* no setup required */
return 0;
}
+
+#ifdef RTE_EAL_PORT_IO
+static int virtio_resource_init_by_portio(struct rte_pci_device *pci_dev)
+{
+ /* no setup required */
+ return 0;
+}
+#endif
#endif

/*
@@ -1039,7 +1112,10 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,

pci_dev = eth_dev->pci_dev;
if (virtio_resource_init(pci_dev) < 0)
- return -1;
+#ifdef RTE_EAL_PORT_IO
+ if (virtio_resource_init_by_portio(pci_dev) < 0)
+#endif
+ return -1;

hw->use_msix = virtio_has_msix(&pci_dev->addr);
hw->io_base = (uint32_t)(uintptr_t)pci_dev->mem_resource[0].addr;
@@ -1132,6 +1208,18 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
return 0;
}

+#ifdef RTE_EAL_PORT_IO
+static struct eth_driver rte_virtio_pmd = {
+ {
+ .name = "rte_virtio_pmd",
+ .id_table = pci_id_virtio_map,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_PORT_IO |
+ RTE_PCI_DRV_INTR_LSC,
+ },
+ .eth_dev_init = eth_virtio_dev_init,
+ .dev_private_size = sizeof(struct virtio_hw),
+};
+#else
static struct eth_driver rte_virtio_pmd = {
{
.name = "rte_virtio_pmd",
@@ -1141,6 +1229,7 @@ static struct eth_driver rte_virtio_pmd = {
.eth_dev_init = eth_virtio_dev_init,
.dev_private_size = sizeof(struct virtio_hw),
};
+#endif

/*
* Driver initialization routine.
--
1.8.4.2
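The port IO fallback above scans /proc/ioports for the line owned by the device's PCI address and takes the range in front of the colon. Below is a sketch of matching a single line, assuming the usual "start-end : owner" layout with hexadecimal bounds. The function name is hypothetical, and unlike the patch it checks the sscanf result and the range before using them.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parse one /proc/ioports-style line, e.g. "  c040-c05f : 0000:00:03.0".
 * Returns 0 and fills start/size when the owner begins with pci_id,
 * otherwise returns -1 and leaves the outputs untouched. */
static int ioports_match_line(const char *line, const char *pci_id,
                              uint16_t *start, uint32_t *size)
{
    unsigned int lo, hi;
    char owner[64];

    if (sscanf(line, " %x-%x : %63s", &lo, &hi, owner) != 3)
        return -1;
    if (strncmp(owner, pci_id, strlen(pci_id)) != 0)
        return -1;
    if (hi < lo || hi > 0xffff)  /* x86 port numbers are 16 bits wide */
        return -1;

    *start = (uint16_t)lo;
    *size = hi - lo + 1;
    return 0;
}
```

A caller would feed it each line read from the file and stop at the first match.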
Ouyang Changchun
2015-01-15 05:15:29 UTC
Permalink
Support turning RX VLAN strip on the host on or off; this gives the guest the chance to
use its software VLAN strip functionality.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
examples/vhost/main.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index a7e623e..4df4977 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -178,6 +178,9 @@ static uint32_t num_devices;
static uint32_t zero_copy;
static int mergeable;

+/* Do vlan strip on host, enabled by default */
+static uint32_t vlan_strip = 1;
+
/* number of descriptors to apply*/
static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP;
static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP;
@@ -570,6 +573,7 @@ us_vhost_usage(const char *prgname)
" --rx-retry-delay [0-N]: timeout(in usecond) between retries on RX. This makes effect only if retries on rx enabled\n"
" --rx-retry-num [0-N]: the number of retries on rx. This makes effect only if retries on rx enabled\n"
" --mergeable [0|1]: disable(default)/enable RX mergeable buffers\n"
+ " --vlan-strip [0|1]: disable/enable(default) RX VLAN strip on host\n"
" --stats [0-N]: 0: Disable stats, N: Time in seconds to print stats\n"
" --dev-basename: The basename to be used for the character device.\n"
" --zero-copy [0|1]: disable(default)/enable rx/tx "
@@ -597,6 +601,7 @@ us_vhost_parse_args(int argc, char **argv)
{"rx-retry-delay", required_argument, NULL, 0},
{"rx-retry-num", required_argument, NULL, 0},
{"mergeable", required_argument, NULL, 0},
+ {"vlan-strip", required_argument, NULL, 0},
{"stats", required_argument, NULL, 0},
{"dev-basename", required_argument, NULL, 0},
{"zero-copy", required_argument, NULL, 0},
@@ -697,6 +702,22 @@ us_vhost_parse_args(int argc, char **argv)
}
}

+ /* Enable/disable RX VLAN strip on host. */
+ if (!strncmp(long_option[option_index].name,
+ "vlan-strip", MAX_LONG_OPT_SZ)) {
+ ret = parse_num_opt(optarg, 1);
+ if (ret == -1) {
+ RTE_LOG(INFO, VHOST_CONFIG,
+ "Invalid argument for VLAN strip [0|1]\n");
+ us_vhost_usage(prgname);
+ return -1;
+ } else {
+ vlan_strip = !!ret;
+ vmdq_conf_default.rxmode.hw_vlan_strip =
+ vlan_strip;
+ }
+ }
+
/* Enable/disable stats. */
if (!strncmp(long_option[option_index].name, "stats", MAX_LONG_OPT_SZ)) {
ret = parse_num_opt(optarg, INT32_MAX);
@@ -955,7 +976,9 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
dev->device_fh);

/* Enable stripping of the vlan tag as we handle routing. */
- rte_eth_dev_set_vlan_strip_on_queue(ports[0], (uint16_t)vdev->vmdq_rx_q, 1);
+ if (vlan_strip)
+ rte_eth_dev_set_vlan_strip_on_queue(ports[0],
+ (uint16_t)vdev->vmdq_rx_q, 1);

/* Set device as ready for RX. */
vdev->ready = DEVICE_RX;
--
1.8.4.2
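The new --vlan-strip option accepts only 0 or 1 and is rejected otherwise. A sketch of a bounded numeric option parser in that spirit is shown below. It is not the example's own parse_num_opt(), whose body is not part of this thread; the name parse_bounded_opt is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a non-negative decimal option value no larger than max_value.
 * Returns the value, or -1 on malformed or out-of-range input.
 * max_value is expected to fit in an int. */
static int parse_bounded_opt(const char *arg, long max_value)
{
    char *end = NULL;
    long v;

    if (arg == NULL || *arg == '\0')
        return -1;

    errno = 0;
    v = strtol(arg, &end, 10);
    if (errno != 0 || *end != '\0' || v < 0 || v > max_value)
        return -1;
    return (int)v;
}
```

With max_value set to 1, the result can be normalized with !! and stored as the on/off flag, as the patch does.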
Ouyang Changchun
2015-01-15 05:15:28 UTC
Permalink
Check whether the packet is already vlan-tagged; if so, avoid inserting a
duplicate vlan tag into it.

This can happen when the guest has the capability of inserting a vlan tag.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
examples/vhost/main.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 1f1edbe..a7e623e 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1120,6 +1120,7 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
unsigned len, ret, offset = 0;
const uint16_t lcore_id = rte_lcore_id();
struct virtio_net *dev = vdev->dev;
+ struct ether_hdr *nh;

/*check if destination is local VM*/
if ((vm2vm_mode == VM2VM_SOFTWARE) && (virtio_tx_local(vdev, m) == 0)) {
@@ -1141,12 +1142,21 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
tx_q = &lcore_tx_queue[lcore_id];
len = tx_q->len;

- m->ol_flags = PKT_TX_VLAN_PKT;
+ nh = rte_pktmbuf_mtod(m, struct ether_hdr *);
+ if (unlikely(nh->ether_type == rte_cpu_to_be_16(ETHER_TYPE_VLAN))) {
+ /* Guest has inserted the vlan tag. */
+ struct vlan_hdr *vh = (struct vlan_hdr *) (nh + 1);
+ uint16_t vlan_tag_be = rte_cpu_to_be_16(vlan_tag);
+ if (vh->vlan_tci != vlan_tag_be)
+ vh->vlan_tci = vlan_tag_be;
+ } else {
+ m->ol_flags = PKT_TX_VLAN_PKT;

- m->data_len += offset;
- m->pkt_len += offset;
+ m->data_len += offset;
+ m->pkt_len += offset;

- m->vlan_tci = vlan_tag;
+ m->vlan_tci = vlan_tag;
+ }

tx_q->m_table[len] = m;
len++;
--
1.8.4.2
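The decision made in virtio_tx_route above can be sketched on a raw frame buffer: if the frame already carries an 802.1Q tag, overwrite its TCI, otherwise tell the caller to request tag insertion. The enum and function names are hypothetical, and htons() stands in for rte_cpu_to_be_16().

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

enum tag_action {
    TAG_REWRITTEN,      /* frame was already tagged, TCI overwritten */
    TAG_OFFLOAD_NEEDED  /* frame is untagged, caller must request insertion */
};

/* frame must hold at least 18 bytes when it is tagged, 14 otherwise. */
static enum tag_action ensure_vlan_tag(uint8_t *frame, uint16_t vlan_tag)
{
    uint16_t type_be;
    uint16_t tci_be = htons(vlan_tag);

    memcpy(&type_be, frame + 12, sizeof(type_be));
    if (type_be != htons(0x8100))
        return TAG_OFFLOAD_NEEDED;

    /* The TCI sits directly after the 0x8100 type field. */
    memcpy(frame + 14, &tci_be, sizeof(tci_be));
    return TAG_REWRITTEN;
}
```

In the patch, the untagged branch corresponds to setting PKT_TX_VLAN_PKT and vlan_tci on the mbuf.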
Ouyang Changchun
2015-01-15 05:15:30 UTC
Permalink
To stay consistent with the normal Rx path, the mergeable
Rx path also needs software vlan strip/decap when it is enabled.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_rxtx.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index 2529dc4..9090613 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -568,6 +568,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,
uint16_t nb_pkts)
{
struct virtqueue *rxvq = rx_queue;
+ struct virtio_hw *hw = rxvq->hw;
struct rte_mbuf *rxm, *new_mbuf;
uint16_t nb_used, num, nb_rx = 0;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
@@ -674,6 +675,9 @@ virtio_recv_mergeable_pkts(void *rx_queue,
seg_res -= rcv_cnt;
}

+ if (hw->vlan_strip)
+ rte_vlan_strip(rx_pkts[nb_rx]);
+
VIRTIO_DUMP_PACKET(rx_pkts[nb_rx],
rx_pkts[nb_rx]->data_len);
--
1.8.4.2
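For readers unfamiliar with software vlan stripping, the operation amounts to saving the TCI and sliding the two MAC addresses forward over the 4-byte tag. The sketch below works on a plain byte buffer instead of an mbuf. The names are hypothetical and it is not the body of rte_vlan_strip(), which is not shown in this part of the thread.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define MAC_BYTES  12  /* destination + source MAC */
#define VLAN_BYTES 4   /* 0x8100 type field + TCI */

/* Strip an 802.1Q tag in place. On success stores the host-order TCI
 * and returns the offset of the new frame start (VLAN_BYTES).
 * Returns -1 if the frame is untagged or too short. */
static int soft_vlan_strip(uint8_t *frame, size_t len, uint16_t *tci)
{
    uint16_t type_be, tci_be;

    if (len < MAC_BYTES + VLAN_BYTES + 2)
        return -1;
    memcpy(&type_be, frame + MAC_BYTES, sizeof(type_be));
    if (type_be != htons(0x8100))
        return -1;

    memcpy(&tci_be, frame + MAC_BYTES + 2, sizeof(tci_be));
    *tci = ntohs(tci_be);

    /* Slide the MAC addresses forward so they abut the inner type field. */
    memmove(frame + VLAN_BYTES, frame, MAC_BYTES);
    return VLAN_BYTES;
}
```

The stripped frame then begins at frame + VLAN_BYTES and is 4 bytes shorter.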
Stephen Hemminger
2015-01-15 16:55:57 UTC
Permalink
On Thu, 15 Jan 2015 13:15:30 +0800
Post by Ouyang Changchun
To stay consistent with the normal Rx path, the mergeable
Rx path also needs software vlan strip/decap when it is enabled.
---
lib/librte_pmd_virtio/virtio_rxtx.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index 2529dc4..9090613 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -568,6 +568,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,
uint16_t nb_pkts)
{
struct virtqueue *rxvq = rx_queue;
+ struct virtio_hw *hw = rxvq->hw;
struct rte_mbuf *rxm, *new_mbuf;
uint16_t nb_used, num, nb_rx = 0;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
@@ -674,6 +675,9 @@ virtio_recv_mergeable_pkts(void *rx_queue,
seg_res -= rcv_cnt;
}
+ if (hw->vlan_strip)
+ rte_vlan_strip(rx_pkts[nb_rx]);
+
VIRTIO_DUMP_PACKET(rx_pkts[nb_rx],
rx_pkts[nb_rx]->data_len);
If you resubmit, just combine this with the earlier patch that does vlan strip.
Ouyang, Changchun
2015-01-16 00:56:52 UTC
Permalink
-----Original Message-----
Sent: Friday, January 16, 2015 12:56 AM
To: Ouyang, Changchun
Subject: Re: [PATCH 22/22] virtio: Use soft vlan strip in mergeable Rx path
On Thu, 15 Jan 2015 13:15:30 +0800
To stay consistent with the normal Rx path, the mergeable Rx
path also needs software vlan strip/decap when it is enabled.
---
lib/librte_pmd_virtio/virtio_rxtx.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c
b/lib/librte_pmd_virtio/virtio_rxtx.c
index 2529dc4..9090613 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -568,6 +568,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,
uint16_t nb_pkts)
{
struct virtqueue *rxvq = rx_queue;
+ struct virtio_hw *hw = rxvq->hw;
struct rte_mbuf *rxm, *new_mbuf;
uint16_t nb_used, num, nb_rx = 0;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
@@ -674,6 +675,9 @@ virtio_recv_mergeable_pkts(void *rx_queue,
seg_res -= rcv_cnt;
}
+ if (hw->vlan_strip)
+ rte_vlan_strip(rx_pkts[nb_rx]);
+
VIRTIO_DUMP_PACKET(rx_pkts[nb_rx],
rx_pkts[nb_rx]->data_len);
If you resubmit, just combine this with the earlier patch that does vlan strip.
I think keeping it as a separate patch may be the better approach.
Thanks
Changchun
Ouyang Changchun
2015-01-27 02:35:40 UTC
Permalink
This is the patch set for single virtio implementation.

Why do we need a single virtio?
============================
As we know, there are currently at least 3 virtio PMD driver implementations:
A) lib/librte_pmd_virtio (referred to as virtio A);
B) virtio_net_pmd by 6wind (referred to as virtio B);
C) virtio by Brocade/vyatta (referred to as virtio C);

Integrating the 3 implementations into one reduces maintenance cost and time.
In addition, users no longer need to try their application on each of the 3 variants
to see which one works best for them.

What's the status?
====================
Currently virtio A covers most features of virtio B, except for using port IO to get the PCI resource,
so there is a patch (17/22) to resolve that. On the other hand, there are a few differences between
virtio A and virtio C, so the features/code of virtio C need to be integrated into virtio A.
This patch set is based on two original RFC patch sets from Stephen Hemminger [***@networkplumber.org].
Refer to [http://dpdk.org/ml/archives/dev/2014-August/004845.html ] for the original one.
This patch set also resolves some conflicts with the latest code, removes duplicated code, and fixes some
issues in the original code.

What this patch set contains:
===============================
1) virtio: Rearrange resource initialization; extracts a function to set up PCI resources;
2) virtio: Use weaker barriers; the DPDK driver only has to deal with running on PCI
with SMP, and in this case the code can use weaker barriers instead of hard (fence)
barriers. This may help performance a bit;
3) virtio: Allow starting with link down; other drivers have similar behavior;
4) virtio: Add support for Link State interrupt;
5) ether: Add soft vlan encap/decap functions; this helps if the HW doesn't support vlan strip;
6) virtio: Use software vlan stripping;
7) virtio: Remove unnecessary adapter structure;
8) virtio: Remove redundant vq_alignment; vq alignment is always 4K, so use a constant when needed;
9) virtio: Fix how states are handled during initialization, to match the Linux kernel;
10) virtio: Make vtpci_get_status a local function, as it is used in only one file;
11) virtio: Check for packet headroom at compile time;
12) virtio: Move allocation before initialization to avoid getting stuck in the middle of virtio init;
13) virtio: Add support for vlan filtering;
14) virtio: Add support for multiple mac addresses;
15) virtio: Add ability to set MAC address;
16) virtio: Free mbufs with threshold; this makes its behavior more like ixgbe;
17) virtio: Use port IO to get the PCI resource, for security reasons and to match virtio-net-pmd;
18) virtio: Fix descriptor index issue;
19) ether: Fix vlan strip/insert issue;
20) example/vhost: Avoid inserting the vlan tag twice, in both guest and host;
21) example/vhost: Add vlan-strip cmd line option to turn on/off vlan strip on host;
22) virtio: Use soft vlan strip in mergeable Rx path; this keeps its logic consistent
with the normal Rx path.

Changes in v2:
23) virtio: Fix zero copy break issue; the vring should be ready before the virtio PMD sets
the status of DRIVER_OK;
24) virtio: Remove unnecessary hotspots in data path.

Changchun Ouyang (8):
virtio: Use port IO to get PCI resource.
virtio: Fix descriptor index issue
ether: Fix vlan strip/insert issue
example/vhost: Avoid inserting vlan twice
example/vhost: Add vlan-strip cmd line option
virtio: Use soft vlan strip in mergeable Rx path
virtio: Fix zero copy break issue
virtio: Remove hotspots

Stephen Hemminger (16):
virtio: Rearrange resource initialization
virtio: Use weaker barriers
virtio: Allow starting with link down
virtio: Add support for Link State interrupt
ether: Add soft vlan encap/decap functions
virtio: Use software vlan stripping
virtio: Remove unnecessary adapter structure
virtio: Remove redundant vq_alignment
virtio: Fix how states are handled during initialization
virtio: Make vtpci_get_status local
virtio: Check for packet headroom at compile time
virtio: Move allocation before initialization
virtio: Add support for vlan filtering
virtio: Add support for multiple mac addresses
virtio: Add ability to set MAC address
virtio: Free mbuf's with threshold

config/common_linuxapp | 2 +
examples/vhost/main.c | 39 ++-
lib/librte_eal/common/include/rte_pci.h | 4 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 5 +-
lib/librte_ether/rte_ethdev.h | 8 +
lib/librte_ether/rte_ether.h | 76 +++++
lib/librte_pmd_virtio/virtio_ethdev.c | 492 +++++++++++++++++++++++++-------
lib/librte_pmd_virtio/virtio_ethdev.h | 12 +-
lib/librte_pmd_virtio/virtio_pci.c | 20 +-
lib/librte_pmd_virtio/virtio_pci.h | 8 +-
lib/librte_pmd_virtio/virtio_rxtx.c | 139 ++++++---
lib/librte_pmd_virtio/virtqueue.h | 59 +++-
12 files changed, 693 insertions(+), 171 deletions(-)
--
1.8.4.2
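Item 2 of the cover letter argues that a ring shared between CPUs only needs ordering between the data write and the index write, not a full fence. The sketch below illustrates that idea with C11 release/acquire atomics on a hypothetical mini_ring. It is not the barrier code from the patch, which is not included in this part of the thread; on x86 such release/acquire accesses typically compile to ordinary loads and stores with only compiler-level ordering.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical single-producer ring, for illustration only. */
struct mini_ring {
    uint16_t slots[8];
    _Atomic uint16_t idx;  /* producer index, read by the consumer */
};

/* Producer: write the slot first, then publish the new index. */
static void ring_publish(struct mini_ring *r, uint16_t value)
{
    uint16_t i = atomic_load_explicit(&r->idx, memory_order_relaxed);

    r->slots[i & 7] = value;
    /* Release: the slot write is visible before the index update. */
    atomic_store_explicit(&r->idx, (uint16_t)(i + 1), memory_order_release);
}

/* Consumer: read the index first, then the slot it covers. */
static int ring_peek_last(struct mini_ring *r, uint16_t *value)
{
    /* Acquire: the slot read cannot move ahead of the index read. */
    uint16_t i = atomic_load_explicit(&r->idx, memory_order_acquire);

    if (i == 0)
        return -1;  /* nothing published yet */
    *value = r->slots[(i - 1) & 7];
    return 0;
}
```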
Ouyang Changchun
2015-01-27 02:35:41 UTC
Permalink
For clarity, move the setup of PCI resources for Linux into a function rather
than leaving it as a block of code #ifdef'd in the middle of dev_init.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 76 ++++++++++++++++++++---------------
1 file changed, 43 insertions(+), 33 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index b3b5bb6..662a49c 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -794,6 +794,41 @@ virtio_has_msix(const struct rte_pci_addr *loc)

return (d != NULL);
}
+
+/* Extract I/O port numbers from sysfs */
+static int virtio_resource_init(struct rte_pci_device *pci_dev)
+{
+ char dirname[PATH_MAX];
+ char filename[PATH_MAX];
+ unsigned long start, size;
+
+ if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname)) < 0)
+ return -1;
+
+ /* get portio size */
+ snprintf(filename, sizeof(filename),
+ "%s/portio/port0/size", dirname);
+ if (parse_sysfs_value(filename, &size) < 0) {
+ PMD_INIT_LOG(ERR, "%s(): cannot parse size",
+ __func__);
+ return -1;
+ }
+
+ /* get portio start */
+ snprintf(filename, sizeof(filename),
+ "%s/portio/port0/start", dirname);
+ if (parse_sysfs_value(filename, &start) < 0) {
+ PMD_INIT_LOG(ERR, "%s(): cannot parse portio start",
+ __func__);
+ return -1;
+ }
+ pci_dev->mem_resource[0].addr = (void *)(uintptr_t)start;
+ pci_dev->mem_resource[0].len = (uint64_t)size;
+ PMD_INIT_LOG(DEBUG,
+ "PCI Port IO found start=0x%lx with size=0x%lx",
+ start, size);
+ return 0;
+}
#else
static int
virtio_has_msix(const struct rte_pci_addr *loc __rte_unused)
@@ -801,6 +836,12 @@ virtio_has_msix(const struct rte_pci_addr *loc __rte_unused)
/* nic_uio does not enable interrupts, return 0 (false). */
return 0;
}
+
+static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused)
+{
+ /* no setup required */
+ return 0;
+}
#endif

/*
@@ -831,40 +872,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
return 0;

pci_dev = eth_dev->pci_dev;
+ if (virtio_resource_init(pci_dev) < 0)
+ return -1;

-#ifdef RTE_EXEC_ENV_LINUXAPP
- {
- char dirname[PATH_MAX];
- char filename[PATH_MAX];
- unsigned long start, size;
-
- if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname)) < 0)
- return -1;
-
- /* get portio size */
- snprintf(filename, sizeof(filename),
- "%s/portio/port0/size", dirname);
- if (parse_sysfs_value(filename, &size) < 0) {
- PMD_INIT_LOG(ERR, "%s(): cannot parse size",
- __func__);
- return -1;
- }
-
- /* get portio start */
- snprintf(filename, sizeof(filename),
- "%s/portio/port0/start", dirname);
- if (parse_sysfs_value(filename, &start) < 0) {
- PMD_INIT_LOG(ERR, "%s(): cannot parse portio start",
- __func__);
- return -1;
- }
- pci_dev->mem_resource[0].addr = (void *)(uintptr_t)start;
- pci_dev->mem_resource[0].len = (uint64_t)size;
- PMD_INIT_LOG(DEBUG,
- "PCI Port IO found start=0x%lx with size=0x%lx",
- start, size);
- }
-#endif
hw->use_msix = virtio_has_msix(&pci_dev->addr);
hw->io_base = (uint32_t)(uintptr_t)pci_dev->mem_resource[0].addr;
--
1.8.4.2
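virtio_resource_init() above reads the port IO start and size from two small sysfs files, each holding a single number. A sketch of reading such a file is given below, assuming the value is decimal or 0x-prefixed hexadecimal on one line. The function name is hypothetical and this is not DPDK's parse_sysfs_value(), whose body is not shown in this thread.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Read one unsigned number (decimal or 0x-prefixed hex) from a file.
 * Returns 0 on success, -1 if the file is missing or malformed. */
static int read_ulong_file(const char *path, unsigned long *val)
{
    char buf[64];
    char *end = NULL;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    errno = 0;
    *val = strtoul(buf, &end, 0);  /* base 0 accepts both 10 and 0x... */
    if (errno != 0 || end == buf || (*end != '\n' && *end != '\0'))
        return -1;
    return 0;
}
```

A caller would invoke it once for the "start" file and once for the "size" file and fail initialization if either call returns -1.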
Ouyang Changchun
2015-01-27 02:35:43 UTC
Permalink
Starting the driver with the link down should be OK; every
other driver allows it. So just allow it.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index dc47e72..5df3b54 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -1057,14 +1057,12 @@ virtio_dev_start(struct rte_eth_dev *dev)
vtpci_read_dev_config(hw,
offsetof(struct virtio_net_config, status),
&status, sizeof(status));
- if ((status & VIRTIO_NET_S_LINK_UP) == 0) {
+ if ((status & VIRTIO_NET_S_LINK_UP) == 0)
PMD_INIT_LOG(ERR, "Port: %d Link is DOWN",
dev->data->port_id);
- return -EIO;
- } else {
+ else
PMD_INIT_LOG(DEBUG, "Port: %d Link is UP",
dev->data->port_id);
- }
}
vtpci_reinit_complete(hw);
--
1.8.4.2
Ouyang Changchun
2015-01-27 02:35:44 UTC
Permalink
Virtio has a link state interrupt that can be used.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 78 +++++++++++++++++++++++++++--------
lib/librte_pmd_virtio/virtio_pci.c | 22 ++++++++++
lib/librte_pmd_virtio/virtio_pci.h | 4 ++
3 files changed, 86 insertions(+), 18 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 5df3b54..ef87ff8 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -845,6 +845,34 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused)
#endif

/*
+ * Process Virtio Config changed interrupt and call the callback
+ * if link state changed.
+ */
+static void
+virtio_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+ void *param)
+{
+ struct rte_eth_dev *dev = param;
+ struct virtio_hw *hw =
+ VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ uint8_t isr;
+
+ /* Read interrupt status which clears interrupt */
+ isr = vtpci_isr(hw);
+ PMD_DRV_LOG(INFO, "interrupt status = %#x", isr);
+
+ if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0)
+ PMD_DRV_LOG(ERR, "interrupt enable failed");
+
+ if (isr & VIRTIO_PCI_ISR_CONFIG) {
+ if (virtio_dev_link_update(dev, 0) == 0)
+ _rte_eth_dev_callback_process(dev,
+ RTE_ETH_EVENT_INTR_LSC);
+ }
+
+}
+
+/*
* This function is based on probe() function in virtio_pci.c
* It returns 0 on success.
*/
@@ -968,6 +996,10 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x",
eth_dev->data->port_id, pci_dev->id.vendor_id,
pci_dev->id.device_id);
+
+ /* Setup interrupt callback */
+ rte_intr_callback_register(&pci_dev->intr_handle,
+ virtio_interrupt_handler, eth_dev);
return 0;
}

@@ -975,7 +1007,7 @@ static struct eth_driver rte_virtio_pmd = {
{
.name = "rte_virtio_pmd",
.id_table = pci_id_virtio_map,
- .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
},
.eth_dev_init = eth_virtio_dev_init,
.dev_private_size = sizeof(struct virtio_adapter),
@@ -1021,6 +1053,9 @@ static int
virtio_dev_configure(struct rte_eth_dev *dev)
{
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
+ struct virtio_hw *hw =
+ VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ int ret;

PMD_INIT_LOG(DEBUG, "configure");

@@ -1029,7 +1064,11 @@ virtio_dev_configure(struct rte_eth_dev *dev)
return (-EINVAL);
}

- return 0;
+ ret = vtpci_irq_config(hw, 0);
+ if (ret != 0)
+ PMD_DRV_LOG(ERR, "failed to set config vector");
+
+ return ret;
}


@@ -1037,7 +1076,6 @@ static int
virtio_dev_start(struct rte_eth_dev *dev)
{
uint16_t nb_queues, i;
- uint16_t status;
struct virtio_hw *hw =
VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);

@@ -1052,18 +1090,22 @@ virtio_dev_start(struct rte_eth_dev *dev)
/* Do final configuration before rx/tx engine starts */
virtio_dev_rxtx_start(dev);

- /* Check VIRTIO_NET_F_STATUS for link status*/
- if (vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
- vtpci_read_dev_config(hw,
- offsetof(struct virtio_net_config, status),
- &status, sizeof(status));
- if ((status & VIRTIO_NET_S_LINK_UP) == 0)
- PMD_INIT_LOG(ERR, "Port: %d Link is DOWN",
- dev->data->port_id);
- else
- PMD_INIT_LOG(DEBUG, "Port: %d Link is UP",
- dev->data->port_id);
+ /* check if lsc interrupt feature is enabled */
+ if (dev->data->dev_conf.intr_conf.lsc) {
+ if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
+ PMD_DRV_LOG(ERR, "link status not supported by host");
+ return -ENOTSUP;
+ }
+
+ if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0) {
+ PMD_DRV_LOG(ERR, "interrupt enable failed");
+ return -EIO;
+ }
}
+
+ /* Initialize Link state */
+ virtio_dev_link_update(dev, 0);
+
vtpci_reinit_complete(hw);

/*Notify the backend
@@ -1145,6 +1187,7 @@ virtio_dev_stop(struct rte_eth_dev *dev)
VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);

/* reset the NIC */
+ vtpci_irq_config(hw, 0);
vtpci_reset(hw);
virtio_dev_free_mbufs(dev);
}
@@ -1161,6 +1204,7 @@ virtio_dev_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complet
old = link;
link.link_duplex = FULL_DUPLEX;
link.link_speed = SPEED_10G;
+
if (vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
PMD_INIT_LOG(DEBUG, "Get link status from hw");
vtpci_read_dev_config(hw,
@@ -1179,10 +1223,8 @@ virtio_dev_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complet
link.link_status = 1; /* Link up */
}
virtio_dev_atomic_write_link_status(dev, &link);
- if (old.link_status == link.link_status)
- return -1;
- /*changed*/
- return 0;
+
+ return (old.link_status == link.link_status) ? -1 : 0;
}

static void
diff --git a/lib/librte_pmd_virtio/virtio_pci.c b/lib/librte_pmd_virtio/virtio_pci.c
index ca9c748..6d51032 100644
--- a/lib/librte_pmd_virtio/virtio_pci.c
+++ b/lib/librte_pmd_virtio/virtio_pci.c
@@ -127,3 +127,25 @@ vtpci_set_status(struct virtio_hw *hw, uint8_t status)

VIRTIO_WRITE_REG_1(hw, VIRTIO_PCI_STATUS, status);
}
+
+uint8_t
+vtpci_isr(struct virtio_hw *hw)
+{
+
+ return VIRTIO_READ_REG_1(hw, VIRTIO_PCI_ISR);
+}
+
+
+/* Enable one vector (0) for Link State Interrupt */
+int
+vtpci_irq_config(struct virtio_hw *hw, uint16_t vec)
+{
+ VIRTIO_WRITE_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR, vec);
+ vec = VIRTIO_READ_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR);
+ if (vec == VIRTIO_MSI_NO_VECTOR) {
+ PMD_DRV_LOG(ERR, "failed to set config vector");
+ return -EBUSY;
+ }
+
+ return 0;
+}
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 373f9dc..6998737 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -263,4 +263,8 @@ void vtpci_write_dev_config(struct virtio_hw *, uint64_t, void *, int);

void vtpci_read_dev_config(struct virtio_hw *, uint64_t, void *, int);

+uint8_t vtpci_isr(struct virtio_hw *);
+
+int vtpci_irq_config(struct virtio_hw *, uint16_t);
+
#endif /* _VIRTIO_PCI_H_ */
--
1.8.4.2
Xie, Huawei
2015-01-27 09:04:07 UTC
Permalink
-----Original Message-----
Sent: Tuesday, January 27, 2015 10:36 AM
Subject: [dpdk-dev] [PATCH v2 04/24] virtio: Add support for Link State interrupt
Virtio has link state interrupt which can be used.
---
lib/librte_pmd_virtio/virtio_ethdev.c | 78 +++++++++++++++++++++++++++--------
lib/librte_pmd_virtio/virtio_pci.c | 22 ++++++++++
lib/librte_pmd_virtio/virtio_pci.h | 4 ++
3 files changed, 86 insertions(+), 18 deletions(-)
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 5df3b54..ef87ff8 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -845,6 +845,34 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused)
#endif
/*
+ * Process Virtio Config changed interrupt and call the callback
+ * if link state changed.
+ */
+static void
+virtio_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+ void *param)
+{
+ struct rte_eth_dev *dev = param;
+ struct virtio_hw *hw =
+ VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ uint8_t isr;
+
+ /* Read interrupt status which clears interrupt */
+ isr = vtpci_isr(hw);
+ PMD_DRV_LOG(INFO, "interrupt status = %#x", isr);
+
+ if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0)
+ PMD_DRV_LOG(ERR, "interrupt enable failed");
+
Would it be better to call rte_intr_enable() after we have handled the interrupt?
Is interrupt re-entrancy possible in the UIO interrupt framework?
+ if (isr & VIRTIO_PCI_ISR_CONFIG) {
+ if (virtio_dev_link_update(dev, 0) == 0)
+ _rte_eth_dev_callback_process(dev,
+ RTE_ETH_EVENT_INTR_LSC);
+ }
+
+}
+
Stephen Hemminger
2015-01-27 10:00:06 UTC
Permalink
On Tue, 27 Jan 2015 09:04:07 +0000
Post by Xie, Huawei
Would it be better to call rte_intr_enable() after we have handled the interrupt?
Is interrupt re-entrancy possible in the UIO interrupt framework?
The UIO framework handles IRQs via a POSIX thread that reads the fd and then
calls this code. Therefore the handler is always single-threaded.
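For readers following the thread, the dispatch model described here can be sketched roughly as below. This is an illustration only, not the DPDK EAL code: a pipe stands in for the UIO device fd, and sk_dispatch(), sk_link_handler() and the counters are hypothetical names. sk_dispatch() is what the body of the single interrupt thread would look like; because one loop reads the fd and then calls the callback, two invocations of the callback cannot overlap.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

typedef void (*sk_intr_cb)(void *arg);

static int sk_in_handler;   /* nonzero while a callback is running */
static int sk_handled;      /* callbacks completed so far */

/* Hypothetical stand-in for the driver's interrupt callback. */
static void sk_link_handler(void *arg)
{
	(void)arg;
	assert(!sk_in_handler);   /* would fire if callbacks ever overlapped */
	sk_in_handler = 1;
	sk_handled++;
	sk_in_handler = 0;
}

/* Body of the single interrupt thread: block on fd, then call cb. */
static int sk_dispatch(int fd, sk_intr_cb cb, void *arg)
{
	uint32_t count;
	int n = 0;

	while (read(fd, &count, sizeof(count)) == (ssize_t)sizeof(count)) {
		cb(arg);
		n++;
	}
	return n;
}

/* Queue n simulated interrupts on a pipe, then drain them in one loop. */
static int sk_demo(int n)
{
	int pfd[2], i, got;
	uint32_t one = 1;

	sk_handled = 0;
	if (pipe(pfd) < 0)
		return -1;
	for (i = 0; i < n; i++)
		if (write(pfd[1], &one, sizeof(one)) != (ssize_t)sizeof(one))
			return -1;
	close(pfd[1]);                 /* EOF ends the dispatch loop */
	got = sk_dispatch(pfd[0], sk_link_handler, NULL);
	close(pfd[0]);
	return got;
}
```

Calling sk_demo(5) dispatches five callbacks one after another; the assert inside the handler documents the no-overlap property rather than enforcing it.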
Ouyang, Changchun
2015-01-28 03:03:32 UTC
Permalink
Hi Stephen,
-----Original Message-----
Sent: Tuesday, January 27, 2015 6:00 PM
To: Xie, Huawei
Subject: Re: [dpdk-dev] [PATCH v2 04/24] virtio: Add support for Link State
interrupt
On Tue, 27 Jan 2015 09:04:07 +0000
Post by Xie, Huawei
Would it be better to call rte_intr_enable() after we have handled the interrupt?
Is interrupt re-entrancy possible in the UIO interrupt framework?
The UIO framework handles IRQs via a POSIX thread that reads the fd and then
calls this code. Therefore the handler is always single-threaded.
Even though it runs under the UIO framework and is always single-threaded,
how about moving rte_intr_enable() to after virtio_dev_link_update() and
_rte_eth_dev_callback_process() have been called?
That would make it more like an interrupt handler in the Linux kernel.
What do you think?
Thanks
Changchun
Stephen Hemminger
2015-01-28 15:11:40 UTC
Permalink
On Wed, 28 Jan 2015 03:03:32 +0000
Post by Ouyang, Changchun
Hi Stephen,
-----Original Message-----
Sent: Tuesday, January 27, 2015 6:00 PM
To: Xie, Huawei
Subject: Re: [dpdk-dev] [PATCH v2 04/24] virtio: Add support for Link State
interrupt
On Tue, 27 Jan 2015 09:04:07 +0000
Post by Xie, Huawei
Would it be better to call rte_intr_enable() after we have handled the interrupt?
Is interrupt re-entrancy possible in the UIO interrupt framework?
The UIO framework handles IRQs via a POSIX thread that reads the fd and then
calls this code. Therefore the handler is always single-threaded.
Even though it runs under the UIO framework and is always single-threaded,
how about moving rte_intr_enable() to after virtio_dev_link_update() and
_rte_eth_dev_callback_process() have been called?
That would make it more like an interrupt handler in the Linux kernel.
What do you think?
I ordered the interrupt handling to match what happens in the e1000/igb
handler. My concern was that the interrupt is level-triggered (not
edge-triggered), and another link transition could occur and be missed.
Ouyang Changchun
2015-01-27 02:35:42 UTC
Permalink
The DPDK driver only has to deal with the case of running on PCI
and with SMP. In this case, the code can use the weaker barriers
instead of using hard (fence) barriers. This will help performance.
The rationale is explained in Linux kernel virtio_ring.h.

To make it clearer that this is a virtio thing and not some generic
barrier, prefix the barrier calls with virtio_.

Add missing (and needed) barrier between updating ring data
structure and notifying host.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 2 +-
lib/librte_pmd_virtio/virtio_rxtx.c | 8 +++++---
lib/librte_pmd_virtio/virtqueue.h | 19 ++++++++++++++-----
3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 662a49c..dc47e72 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -175,7 +175,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
uint32_t idx, desc_idx, used_idx;
struct vring_used_elem *uep;

- rmb();
+ virtio_rmb();

used_idx = (uint32_t)(vq->vq_used_cons_idx
& (vq->vq_nentries - 1));
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index c013f97..78af334 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -456,7 +456,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)

nb_used = VIRTQUEUE_NUSED(rxvq);

- rmb();
+ virtio_rmb();

num = (uint16_t)(likely(nb_used <= nb_pkts) ? nb_used : nb_pkts);
num = (uint16_t)(likely(num <= VIRTIO_MBUF_BURST_SZ) ? num : VIRTIO_MBUF_BURST_SZ);
@@ -516,6 +516,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
}

if (likely(nb_enqueued)) {
+ virtio_wmb();
if (unlikely(virtqueue_kick_prepare(rxvq))) {
virtqueue_notify(rxvq);
PMD_RX_LOG(DEBUG, "Notified\n");
@@ -547,7 +548,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,

nb_used = VIRTQUEUE_NUSED(rxvq);

- rmb();
+ virtio_rmb();

if (nb_used == 0)
return 0;
@@ -694,7 +695,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
PMD_TX_LOG(DEBUG, "%d packets to xmit", nb_pkts);
nb_used = VIRTQUEUE_NUSED(txvq);

- rmb();
+ virtio_rmb();

num = (uint16_t)(likely(nb_used < VIRTIO_MBUF_BURST_SZ) ? nb_used : VIRTIO_MBUF_BURST_SZ);

@@ -735,6 +736,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
}
}
vq_update_avail_idx(txvq);
+ virtio_wmb();

txvq->packets += nb_tx;

diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index fdee054..f6ad98d 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -46,9 +46,18 @@
#include "virtio_ring.h"
#include "virtio_logs.h"

-#define mb() rte_mb()
-#define wmb() rte_wmb()
-#define rmb() rte_rmb()
+/*
+ * Per virtio_config.h in Linux.
+ * For virtio_pci on SMP, we don't need to order with respect to MMIO
+ * accesses through relaxed memory I/O windows, so smp_mb() et al are
+ * sufficient.
+ *
+ * This driver is for virtio_pci on SMP and therefore can assume
+ * weaker (compiler barriers)
+ */
+#define virtio_mb() rte_mb()
+#define virtio_rmb() rte_compiler_barrier()
+#define virtio_wmb() rte_compiler_barrier()

#ifdef RTE_PMD_PACKET_PREFETCH
#define rte_packet_prefetch(p) rte_prefetch1(p)
@@ -225,7 +234,7 @@ virtqueue_full(const struct virtqueue *vq)
static inline void
vq_update_avail_idx(struct virtqueue *vq)
{
- rte_compiler_barrier();
+ virtio_rmb();
vq->vq_ring.avail->idx = vq->vq_avail_idx;
}

@@ -255,7 +264,7 @@ static inline void
virtqueue_notify(struct virtqueue *vq)
{
/*
- * Ensure updated avail->idx is visible to host. mb() necessary?
+ * Ensure updated avail->idx is visible to host.
* For virtio on IA, the notificaiton is through io port operation
* which is a serialization instruction itself.
*/
--
1.8.4.2
Xie, Huawei
2015-01-27 07:03:47 UTC
Permalink
-----Original Message-----
Sent: Tuesday, January 27, 2015 10:36 AM
Subject: [dpdk-dev] [PATCH v2 02/24] virtio: Use weaker barriers
@@ -225,7 +234,7 @@ virtqueue_full(const struct virtqueue *vq)
static inline void
vq_update_avail_idx(struct virtqueue *vq)
{
- rte_compiler_barrier();
+ virtio_rmb();
As I recall, our original code used virtio_wmb() here: a store fence to ensure
that all updates to the ring entries are complete before the index is updated.
Why do we need virtio_rmb() here, and why add virtio_wmb() after vq_update_avail_idx()?
vq->vq_ring.avail->idx = vq->vq_avail_idx;
}
@@ -255,7 +264,7 @@ static inline void
virtqueue_notify(struct virtqueue *vq)
{
/*
- * Ensure updated avail->idx is visible to host. mb() necessary?
+ * Ensure updated avail->idx is visible to host.
* For virtio on IA, the notificaiton is through io port operation
* which is a serialization instruction itself.
*/
--
1.8.4.2
Stephen Hemminger
2015-01-27 09:58:31 UTC
Permalink
Post by Xie, Huawei
As I recall, our original code used virtio_wmb() here: a store fence to ensure
that all updates to the ring entries are complete before the index is updated.
Why do we need virtio_rmb() here, and why add virtio_wmb() after vq_update_avail_idx()?
A store fence is unnecessary: Intel CPUs are cache coherent. Please read
the Linux virtio ring header file for the explanation. A full-fence WMB
is more expensive and causes a CPU stall.
Xie, Huawei
2015-01-27 16:16:10 UTC
Permalink
-----Original Message-----
Sent: Tuesday, January 27, 2015 5:59 PM
To: Xie, Huawei
Subject: Re: [dpdk-dev] [PATCH v2 02/24] virtio: Use weaker barriers
Post by Xie, Huawei
As I recall, our original code used virtio_wmb() here: a store fence to ensure
that all updates to the ring entries are complete before the index is updated.
Why do we need virtio_rmb() here, and why add virtio_wmb() after
vq_update_avail_idx()?
A store fence is unnecessary: Intel CPUs are cache coherent. Please read
the Linux virtio ring header file for the explanation. A full-fence WMB
is more expensive and causes a CPU stall.
My point is that virtio_wmb(), rather than virtio_rmb(), should be used here;
both of them are defined as a compiler barrier.

The following code is from the Linux virtio driver, where it adds a buffer to the vring:
/* Put entry in available array (but don't update avail->idx until they
* do sync). */
avail = (vq->vring.avail->idx & (vq->vring.num-1));
vq->vring.avail->ring[avail] = head;

/* Descriptors and available array need to be set before we expose the
* new available array entries. */
virtio_wmb(vq->weak_barriers);
vq->vring.avail->idx++;
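The ordering in the Linux snippet above can be reduced to a small stand-alone sketch. The names (struct sk_avail, sk_publish, sk_wmb) are hypothetical and this is not the DPDK or Linux code. The barrier here is a compiler-only barrier in GCC/Clang syntax, which is the assumption this thread makes for x86; an architecture with weaker store ordering would need a real store barrier in its place.

```c
#include <assert.h>
#include <stdint.h>

#define SK_RING_SIZE 8u     /* power of two, as in a vring */

struct sk_avail {
	uint16_t idx;                  /* free-running index read by the host */
	uint16_t ring[SK_RING_SIZE];   /* heads of available descriptor chains */
};

/* Compiler-only barrier: stops the compiler from reordering across it. */
#define sk_wmb() __asm__ __volatile__("" ::: "memory")

static void sk_publish(struct sk_avail *a, uint16_t head)
{
	assert(a != 0);
	a->ring[a->idx & (SK_RING_SIZE - 1)] = head;   /* 1. fill the slot */
	sk_wmb();                                      /* 2. slot before index */
	a->idx++;                                      /* 3. expose the slot */
}
```

The point of the write barrier sitting between steps 1 and 3 is that the host must never observe the new index before the slot it refers to has been written.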
Ouyang, Changchun
2015-01-28 06:12:59 UTC
Permalink
-----Original Message-----
From: Xie, Huawei
Sent: Wednesday, January 28, 2015 12:16 AM
To: Stephen Hemminger
Subject: RE: [dpdk-dev] [PATCH v2 02/24] virtio: Use weaker barriers
My point is that virtio_wmb(), rather than virtio_rmb(), should be used here;
both of them are defined as a compiler barrier.
The following code is from the Linux virtio driver, where it adds a buffer to the vring:
/* Put entry in available array (but don't update avail->idx until they
* do sync). */
avail = (vq->vring.avail->idx & (vq->vring.num-1));
vq->vring.avail->ring[avail] = head;
/* Descriptors and available array need to be set before we expose the
* new available array entries. */
virtio_wmb(vq->weak_barriers);
vq->vring.avail->idx++;
Yes, using virtio_wmb() is better here; I will change it in the next version.

Thanks
Changchun
Xie, Huawei
2015-01-27 07:56:36 UTC
Permalink
    if (likely(nb_enqueued)) {
        virtio_wmb();
        if (unlikely(virtqueue_kick_prepare(rxvq))) {
            virtqueue_notify(rxvq);
            PMD_RX_LOG(DEBUG, "Notified\n");
        }
    }
    vq_update_avail_idx(rxvq);
Two things confuse me about this modification:

1.
Why notify the host without first updating the avail idx?
Could this cause a deadlock?

2.
Why update the avail index even when no packets were enqueued?
Ouyang, Changchun
2015-01-27 08:04:44 UTC
Permalink
Hi Stephen,
This is the original code logic, but we could move
vq_update_avail_idx(rxvq) into the if block to resolve it.
What do you think?

Thanks
Changchun
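One possible reading of the reordering proposed in this reply is sketched below. It is not taken from any posted version of the patch: struct sk_vq and its fields are hypothetical stand-ins for the virtqueue, for virtqueue_kick_prepare() and for virtqueue_notify(). It shows only the statement ordering (update the index inside the if block, before the notify) and leaves aside the separate question of which barrier strength is required.

```c
#include <assert.h>
#include <stdint.h>

struct sk_vq {
	uint16_t shadow_idx;    /* driver-private next avail index */
	uint16_t avail_idx;     /* index the host reads */
	int host_wants_kick;    /* stand-in for virtqueue_kick_prepare() */
	int kicks;              /* counts stand-in virtqueue_notify() calls */
};

/* Compiler-only barrier (GCC/Clang syntax), as assumed for x86 here. */
#define sk_wmb() __asm__ __volatile__("" ::: "memory")

static void sk_rx_refill_finish(struct sk_vq *vq, uint16_t nb_enqueued)
{
	assert(vq != 0);
	if (nb_enqueued == 0)
		return;                        /* nothing new: no update, no kick */
	vq->shadow_idx = (uint16_t)(vq->shadow_idx + nb_enqueued);
	sk_wmb();                          /* ring entries before the index */
	vq->avail_idx = vq->shadow_idx;    /* expose the new buffers first */
	if (vq->host_wants_kick)
		vq->kicks++;                   /* then notify the host */
}
```

With this shape, both questions raised in the quoted review go away: the host is only notified after the index is visible, and the index is not touched when nothing was enqueued.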

-----Original Message-----
From: Xie, Huawei
Sent: Tuesday, January 27, 2015 3:57 PM
To: Ouyang, Changchun; ***@dpdk.org
Cc: Stephen Hemminger (***@networkplumber.org)
Subject: RE: [dpdk-dev] [PATCH v2 02/24] virtio: Use weaker barriers
    if (likely(nb_enqueued)) {
        virtio_wmb();
        if (unlikely(virtqueue_kick_prepare(rxvq))) {
            virtqueue_notify(rxvq);
            PMD_RX_LOG(DEBUG, "Notified\n");
        }
    }
    vq_update_avail_idx(rxvq);
Two things confuse me about this modification:

1.
Why notify the host without first updating the avail idx?
Could this cause a deadlock?

2.
Why update the avail index even when no packets were enqueued?
Ouyang Changchun
2015-01-27 02:35:45 UTC
Permalink
It is helpful to allow device drivers that don't support hardware
VLAN stripping to emulate it in software. This allows applications
to be device independent.

Avoid discarding shared mbufs. Make a copy in rte_vlan_insert() of any
packet to be tagged that has a reference count > 1.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_ether/rte_ether.h | 76 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 76 insertions(+)

diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
index 7e7d22c..74f71c2 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -49,6 +49,8 @@ extern "C" {

#include <rte_memcpy.h>
#include <rte_random.h>
+#include <rte_mbuf.h>
+#include <rte_byteorder.h>

#define ETHER_ADDR_LEN 6 /**< Length of Ethernet address. */
#define ETHER_TYPE_LEN 2 /**< Length of Ethernet type field. */
@@ -333,6 +335,80 @@ struct vxlan_hdr {
#define ETHER_VXLAN_HLEN (sizeof(struct udp_hdr) + sizeof(struct vxlan_hdr))
/**< VXLAN tunnel header length. */

+/**
+ * Extract VLAN tag information into mbuf
+ *
+ * Software version of VLAN stripping
+ *
+ * @param m
+ * The packet mbuf.
+ * @return
+ * - 0: Success
+ * - -1: not a VLAN packet
+ */
+static inline int rte_vlan_strip(struct rte_mbuf *m)
+{
+ struct ether_hdr *eh
+ = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ if (eh->ether_type != rte_cpu_to_be_16(ETHER_TYPE_VLAN))
+ return -1;
+
+ struct vlan_hdr *vh = (struct vlan_hdr *)(eh + 1);
+ m->ol_flags |= PKT_RX_VLAN_PKT;
+ m->vlan_tci = rte_be_to_cpu_16(vh->vlan_tci);
+
+ /* Copy ether header over rather than moving whole packet */
+ memmove(rte_pktmbuf_adj(m, sizeof(struct vlan_hdr)),
+ eh, 2 * ETHER_ADDR_LEN);
+
+ return 0;
+}
+
+/**
+ * Insert VLAN tag into mbuf.
+ *
+ * Software version of VLAN unstripping
+ *
+ * @param m
+ * The packet mbuf.
+ * @return
+ * - 0: On success
+ * -ENOMEM: mbuf is shared and could not be cloned
+ * -ENOSPC: not enough headroom in mbuf
+ */
+static inline int rte_vlan_insert(struct rte_mbuf **m)
+{
+ struct ether_hdr *oh, *nh;
+ struct vlan_hdr *vh;
+
+#ifdef RTE_MBUF_REFCNT
+ /* Can't insert header if mbuf is shared */
+ if (rte_mbuf_refcnt_read(*m) > 1) {
+ struct rte_mbuf *copy;
+
+ copy = rte_pktmbuf_clone(*m, (*m)->pool);
+ if (unlikely(copy == NULL))
+ return -ENOMEM;
+ rte_pktmbuf_free(*m);
+ *m = copy;
+ }
+#endif
+ oh = rte_pktmbuf_mtod(*m, struct ether_hdr *);
+ nh = (struct ether_hdr *)
+ rte_pktmbuf_prepend(*m, sizeof(struct vlan_hdr));
+ if (nh == NULL)
+ return -ENOSPC;
+
+ memmove(nh, oh, 2 * ETHER_ADDR_LEN);
+ nh->ether_type = rte_cpu_to_be_16(ETHER_TYPE_VLAN);
+
+ vh = (struct vlan_hdr *) (nh + 1);
+ vh->vlan_tci = rte_cpu_to_be_16((*m)->vlan_tci);
+
+ return 0;
+}
+
#ifdef __cplusplus
}
#endif
--
1.8.4.2
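To make the header shuffling in the patch above easier to follow, here is a minimal sketch of the same technique on a plain byte buffer instead of an rte_mbuf. All sk_ names are hypothetical and nothing here is DPDK API; htons()/ntohs() play the role of rte_cpu_to_be_16()/rte_be_to_cpu_16(). As in the patch, only the two MAC addresses (12 bytes) are moved rather than the whole packet.

```c
#include <arpa/inet.h>   /* htons, ntohs */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_MAC_BYTES 12      /* destination + source MAC */
#define SK_TAG_BYTES 4       /* TPID (2) + TCI (2) */
#define SK_TPID_VLAN 0x8100

/*
 * Strip the tag from the frame starting at 'frame'. On success the
 * untagged frame starts SK_TAG_BYTES further on: returns that offset and
 * stores the TCI in host order. Returns -1 if the frame is not tagged.
 */
static int sk_vlan_strip(uint8_t *frame, size_t len, uint16_t *tci)
{
	uint16_t type, tci_be;

	assert(frame != NULL && tci != NULL);
	if (len < SK_MAC_BYTES + SK_TAG_BYTES + 2)
		return -1;
	memcpy(&type, frame + SK_MAC_BYTES, sizeof(type));
	if (type != htons(SK_TPID_VLAN))       /* compare in network order */
		return -1;
	memcpy(&tci_be, frame + SK_MAC_BYTES + 2, sizeof(tci_be));
	*tci = ntohs(tci_be);
	/* Slide only the two MAC addresses forward over the tag. */
	memmove(frame + SK_TAG_BYTES, frame, SK_MAC_BYTES);
	return SK_TAG_BYTES;
}

/*
 * Insert a tag into the frame at buf + start, using SK_TAG_BYTES of
 * headroom in front of it. Returns the new start offset, or -1 if there
 * is not enough headroom.
 */
static int sk_vlan_insert(uint8_t *buf, size_t start, uint16_t tci)
{
	uint16_t tpid_be = htons(SK_TPID_VLAN), tci_be = htons(tci);
	uint8_t *nh;

	assert(buf != NULL);
	if (start < SK_TAG_BYTES)
		return -1;
	nh = buf + start - SK_TAG_BYTES;
	memmove(nh, buf + start, SK_MAC_BYTES);   /* MACs move back 4 bytes */
	memcpy(nh + SK_MAC_BYTES, &tpid_be, sizeof(tpid_be));
	memcpy(nh + SK_MAC_BYTES + 2, &tci_be, sizeof(tci_be));
	return (int)(start - SK_TAG_BYTES);
}
```

Inserting a tag and then stripping it again should return the frame to its original bytes, which is a convenient round-trip check for either direction.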
Ouyang Changchun
2015-01-27 02:35:49 UTC
Permalink
Change the order of initialization to match the Linux kernel.
Don't blow away the control queue by doing a reset when stopped.

Previously, calling dev_stop followed by dev_start would not work:
dev_stop called virtio reset, which cleared all queues
and all feature negotiation.
Resolved by only doing the reset on device removal.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 58 ++++++++++++++++++++---------------
lib/librte_pmd_virtio/virtio_pci.c | 10 ++----
lib/librte_pmd_virtio/virtio_pci.h | 3 +-
3 files changed, 37 insertions(+), 34 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 0d41e7f..47dd33d 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -398,9 +398,14 @@ virtio_dev_cq_queue_setup(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx,
static void
virtio_dev_close(struct rte_eth_dev *dev)
{
+ struct virtio_hw *hw = dev->data->dev_private;
+
PMD_INIT_LOG(DEBUG, "virtio_dev_close");

- virtio_dev_stop(dev);
+ /* reset the NIC */
+ vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
+ vtpci_reset(hw);
+ virtio_dev_free_mbufs(dev);
}

static void
@@ -889,6 +894,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
if (rte_eal_process_type() == RTE_PROC_SECONDARY)
return 0;

+ /* Tell the host we've noticed this device. */
+ vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
+
pci_dev = eth_dev->pci_dev;
if (virtio_resource_init(pci_dev) < 0)
return -1;
@@ -899,9 +907,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
/* Reset the device although not necessary at startup */
vtpci_reset(hw);

- /* Tell the host we've noticed this device. */
- vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
-
/* Tell the host we've known how to drive the device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
virtio_negotiate_features(hw);
@@ -990,6 +995,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
/* Setup interrupt callback */
rte_intr_callback_register(&pci_dev->intr_handle,
virtio_interrupt_handler, eth_dev);
+
+ virtio_dev_cq_start(eth_dev);
+
return 0;
}

@@ -1044,7 +1052,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
{
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
struct virtio_hw *hw = dev->data->dev_private;
- int ret;

PMD_INIT_LOG(DEBUG, "configure");

@@ -1055,11 +1062,12 @@ virtio_dev_configure(struct rte_eth_dev *dev)

hw->vlan_strip = rxmode->hw_vlan_strip;

- ret = vtpci_irq_config(hw, 0);
- if (ret != 0)
+ if (vtpci_irq_config(hw, 0) == VIRTIO_MSI_NO_VECTOR) {
PMD_DRV_LOG(ERR, "failed to set config vector");
+ return -EBUSY;
+ }

- return ret;
+ return 0;
}


@@ -1069,17 +1077,6 @@ virtio_dev_start(struct rte_eth_dev *dev)
uint16_t nb_queues, i;
struct virtio_hw *hw = dev->data->dev_private;

- /* Tell the host we've noticed this device. */
- vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
-
- /* Tell the host we've known how to drive the device. */
- vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
-
- virtio_dev_cq_start(dev);
-
- /* Do final configuration before rx/tx engine starts */
- virtio_dev_rxtx_start(dev);
-
/* check if lsc interrupt feature is enabled */
if (dev->data->dev_conf.intr_conf.lsc) {
if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
@@ -1096,8 +1093,16 @@ virtio_dev_start(struct rte_eth_dev *dev)
/* Initialize Link state */
virtio_dev_link_update(dev, 0);

+ /* On restart after stop do not touch queues */
+ if (hw->started)
+ return 0;
+
vtpci_reinit_complete(hw);

+ /* Do final configuration before rx/tx engine starts */
+ virtio_dev_rxtx_start(dev);
+ hw->started = 1;
+
/*Notify the backend
*Otherwise the tap backend might already stop its queue due to fullness.
*vhost backend will have no chance to be waked up
@@ -1168,17 +1173,20 @@ static void virtio_dev_free_mbufs(struct rte_eth_dev *dev)
}

/*
- * Stop device: disable rx and tx functions to allow for reconfiguring.
+ * Stop device: disable interrupt and mark link down
*/
static void
virtio_dev_stop(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw = dev->data->dev_private;
+ struct rte_eth_link link;

- /* reset the NIC */
- vtpci_irq_config(hw, 0);
- vtpci_reset(hw);
- virtio_dev_free_mbufs(dev);
+ PMD_INIT_LOG(DEBUG, "stop");
+
+ if (dev->data->dev_conf.intr_conf.lsc)
+ rte_intr_disable(&dev->pci_dev->intr_handle);
+
+ memset(&link, 0, sizeof(link));
+ virtio_dev_atomic_write_link_status(dev, &link);
}

static int
diff --git a/lib/librte_pmd_virtio/virtio_pci.c b/lib/librte_pmd_virtio/virtio_pci.c
index 6d51032..b099e4f 100644
--- a/lib/librte_pmd_virtio/virtio_pci.c
+++ b/lib/librte_pmd_virtio/virtio_pci.c
@@ -137,15 +137,9 @@ vtpci_isr(struct virtio_hw *hw)


/* Enable one vector (0) for Link State Intrerrupt */
-int
+uint16_t
vtpci_irq_config(struct virtio_hw *hw, uint16_t vec)
{
VIRTIO_WRITE_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR, vec);
- vec = VIRTIO_READ_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR);
- if (vec == VIRTIO_MSI_NO_VECTOR) {
- PMD_DRV_LOG(ERR, "failed to set config vector");
- return -EBUSY;
- }
-
- return 0;
+ return VIRTIO_READ_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR);
}
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 6d93fac..0a4b578 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -170,6 +170,7 @@ struct virtio_hw {
uint16_t vtnet_hdr_size;
uint8_t vlan_strip;
uint8_t use_msix;
+ uint8_t started;
uint8_t mac_addr[ETHER_ADDR_LEN];
};

@@ -266,6 +267,6 @@ void vtpci_read_dev_config(struct virtio_hw *, uint64_t, void *, int);

uint8_t vtpci_isr(struct virtio_hw *);

-int vtpci_irq_config(struct virtio_hw *, uint16_t);
+uint16_t vtpci_irq_config(struct virtio_hw *, uint16_t);

#endif /* _VIRTIO_PCI_H_ */
--
1.8.4.2
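The start/stop/close behaviour this patch aims for can be summarised with a small state sketch. struct sk_dev and the sk_dev_* functions are hypothetical and only model the control flow; they are not the driver code. The 'started' field mirrors the flag the patch adds to struct virtio_hw.

```c
#include <assert.h>

struct sk_dev {
	int started;        /* mirrors the 'started' flag added by the patch */
	int link_up;
	int queue_setups;   /* times the rx/tx queues were initialised */
	int resets;         /* times the device was reset */
};

static int sk_dev_start(struct sk_dev *d)
{
	assert(d != 0);
	d->link_up = 1;
	if (d->started)
		return 0;           /* restart after stop: do not touch queues */
	d->queue_setups++;
	d->started = 1;
	return 0;
}

static void sk_dev_stop(struct sk_dev *d)
{
	assert(d != 0);
	d->link_up = 0;         /* no reset: queues and features survive */
}

static void sk_dev_close(struct sk_dev *d)
{
	assert(d != 0);
	d->resets++;            /* reset only when the device goes away */
	d->link_up = 0;
}
```

A start, stop, start sequence therefore sets the queues up once and never resets the device; only close does.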
Ouyang Changchun
2015-01-27 02:35:52 UTC
Permalink
If allocation fails, we don't want to leave the virtio device stuck
in the middle of its initialization sequence.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 9679c2f..39b1fb4 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -890,6 +890,15 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
if (rte_eal_process_type() == RTE_PROC_SECONDARY)
return 0;

+ /* Allocate memory for storing MAC addresses */
+ eth_dev->data->mac_addrs = rte_zmalloc("virtio", ETHER_ADDR_LEN, 0);
+ if (eth_dev->data->mac_addrs == NULL) {
+ PMD_INIT_LOG(ERR,
+ "Failed to allocate %d bytes needed to store MAC addresses",
+ ETHER_ADDR_LEN);
+ return -ENOMEM;
+ }
+
/* Tell the host we've noticed this device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);

@@ -916,15 +925,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr);
}

- /* Allocate memory for storing MAC addresses */
- eth_dev->data->mac_addrs = rte_zmalloc("virtio", ETHER_ADDR_LEN, 0);
- if (eth_dev->data->mac_addrs == NULL) {
- PMD_INIT_LOG(ERR,
- "Failed to allocate %d bytes needed to store MAC addresses",
- ETHER_ADDR_LEN);
- return -ENOMEM;
- }
-
/* Copy the permanent MAC address to: virtio_hw */
virtio_get_hwaddr(hw);
ether_addr_copy((struct ether_addr *) hw->mac_addr,
--
1.8.4.2
Ouyang Changchun
2015-01-27 02:35:46 UTC
Implement VLAN stripping in software. This allows applications
to be device independent.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
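Note for readers: software VLAN stripping amounts to removing the 4-byte 802.1Q tag that follows the two MAC addresses. The sketch below works on a plain byte buffer rather than an mbuf, and `sw_vlan_strip` is a hypothetical helper, not the `rte_vlan_strip` added by this series (which may well be implemented differently).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDRS_LEN 12     /* destination MAC + source MAC */
#define VLAN_TAG_LEN  4      /* TPID (2 bytes) + TCI (2 bytes) */
#define TPID_8021Q    0x8100

/*
 * Remove an 802.1Q tag from a raw Ethernet frame held in buf.
 * Returns the new frame length; *tci receives the tag in host order.
 * An untagged or too-short frame is left alone: the original length
 * is returned and *tci is set to 0.
 */
static size_t
sw_vlan_strip(uint8_t *buf, size_t len, uint16_t *tci)
{
	*tci = 0;
	if (len < ETH_ADDRS_LEN + VLAN_TAG_LEN + 2)
		return len;
	if (((buf[12] << 8) | buf[13]) != TPID_8021Q)
		return len;

	*tci = (uint16_t)((buf[14] << 8) | buf[15]);
	/* Slide everything after the tag down over it. */
	memmove(buf + ETH_ADDRS_LEN, buf + ETH_ADDRS_LEN + VLAN_TAG_LEN,
		len - ETH_ADDRS_LEN - VLAN_TAG_LEN);
	return len - VLAN_TAG_LEN;
}
```

This version moves the payload for simplicity. A driver working on mbufs could instead move the 12 address bytes forward and advance the data pointer, which touches less memory.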
lib/librte_ether/rte_ethdev.h | 3 +++
lib/librte_pmd_virtio/virtio_ethdev.c | 2 ++
lib/librte_pmd_virtio/virtio_pci.h | 1 +
lib/librte_pmd_virtio/virtio_rxtx.c | 20 ++++++++++++++++++--
4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 1200c1c..94d6b2b 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -643,6 +643,9 @@ struct rte_eth_rxconf {
#define ETH_TXQ_FLAGS_NOOFFLOADS \
(ETH_TXQ_FLAGS_NOVLANOFFL | ETH_TXQ_FLAGS_NOXSUMSCTP | \
ETH_TXQ_FLAGS_NOXSUMUDP | ETH_TXQ_FLAGS_NOXSUMTCP)
+#define ETH_TXQ_FLAGS_NOXSUMS \
+ (ETH_TXQ_FLAGS_NOXSUMSCTP | ETH_TXQ_FLAGS_NOXSUMUDP | \
+ ETH_TXQ_FLAGS_NOXSUMTCP)
/**
* A structure used to configure a TX ring of an Ethernet port.
*/
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index ef87ff8..da74659 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -1064,6 +1064,8 @@ virtio_dev_configure(struct rte_eth_dev *dev)
return (-EINVAL);
}

+ hw->vlan_strip = rxmode->hw_vlan_strip;
+
ret = vtpci_irq_config(hw, 0);
if (ret != 0)
PMD_DRV_LOG(ERR, "failed to set config vector");
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 6998737..6d93fac 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -168,6 +168,7 @@ struct virtio_hw {
uint32_t max_tx_queues;
uint32_t max_rx_queues;
uint16_t vtnet_hdr_size;
+ uint8_t vlan_strip;
uint8_t use_msix;
uint8_t mac_addr[ETHER_ADDR_LEN];
};
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index 78af334..e0216ec 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -49,6 +49,7 @@
#include <rte_prefetch.h>
#include <rte_string_fns.h>
#include <rte_errno.h>
+#include <rte_byteorder.h>

#include "virtio_logs.h"
#include "virtio_ethdev.h"
@@ -408,8 +409,8 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,

PMD_INIT_FUNC_TRACE();

- if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOOFFLOADS)
- != ETH_TXQ_FLAGS_NOOFFLOADS) {
+ if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOXSUMS)
+ != ETH_TXQ_FLAGS_NOXSUMS) {
PMD_INIT_LOG(ERR, "TX checksum offload not supported\n");
return -EINVAL;
}
@@ -446,6 +447,7 @@ uint16_t
virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
struct virtqueue *rxvq = rx_queue;
+ struct virtio_hw *hw = rxvq->hw;
struct rte_mbuf *rxm, *new_mbuf;
uint16_t nb_used, num, nb_rx = 0;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
@@ -489,6 +491,9 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
rxm->pkt_len = (uint32_t)(len[i] - hdr_size);
rxm->data_len = (uint16_t)(len[i] - hdr_size);

+ if (hw->vlan_strip)
+ rte_vlan_strip(rxm);
+
VIRTIO_DUMP_PACKET(rxm, rxm->data_len);

rx_pkts[nb_rx++] = rxm;
@@ -717,6 +722,17 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
*/
if (likely(need <= 0)) {
txm = tx_pkts[nb_tx];
+
+ /* Do VLAN tag insertion */
+ if (txm->ol_flags & PKT_TX_VLAN_PKT) {
+ error = rte_vlan_insert(&txm);
+ if (unlikely(error)) {
+ rte_pktmbuf_free(txm);
+ ++nb_tx;
+ continue;
+ }
+ }
+
/* Enqueue Packet buffers */
error = virtqueue_enqueue_xmit(txvq, txm);
if (unlikely(error)) {
--
1.8.4.2
Ouyang Changchun
2015-01-27 02:35:47 UTC
Clean up the virtio code by eliminating the unnecessary nesting of the
virtio hardware structure inside the adapter structure.
This also allows removing an unneeded macro, making the code clearer.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
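Note for readers: the removal is safe because a wrapper struct whose only member is the hardware struct places that member at offset 0, so the old macro and a plain pointer assignment yield the same address. A minimal sketch, with hypothetical `sketch_*` types standing in for `virtio_adapter` and `virtio_hw`:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much reduced stand-ins for the driver structures. */
struct sketch_hw {
	uint16_t hdr_size;
	uint8_t vlan_strip;
};

struct sketch_adapter {
	struct sketch_hw hw; /* only member, hence at offset 0 */
};

/* What the removed macro did: cast to the wrapper, take &wrapper->hw. */
static struct sketch_hw *
private_to_hw_old(void *dev_private)
{
	return &((struct sketch_adapter *)dev_private)->hw;
}

/* What the code does after the cleanup: use the pointer directly. */
static struct sketch_hw *
private_to_hw_new(void *dev_private)
{
	return dev_private;
}
```

Both helpers return the same pointer for the same private area, so only the indirection disappears.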
lib/librte_pmd_virtio/virtio_ethdev.c | 43 ++++++++++++-----------------------
lib/librte_pmd_virtio/virtio_ethdev.h | 9 --------
lib/librte_pmd_virtio/virtio_rxtx.c | 3 +--
3 files changed, 16 insertions(+), 39 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index da74659..59b74b7 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -207,8 +207,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
static int
virtio_set_multiple_queues(struct rte_eth_dev *dev, uint16_t nb_queues)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -242,8 +241,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
const struct rte_memzone *mz;
uint16_t vq_size;
int size;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtqueue *vq = NULL;

/* Write the virtqueue index to the Queue Select Field */
@@ -383,8 +381,7 @@ virtio_dev_cq_queue_setup(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx,
struct virtqueue *vq;
uint16_t nb_desc = 0;
int ret;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

PMD_INIT_FUNC_TRACE();
ret = virtio_dev_queue_setup(dev, VTNET_CQ, VTNET_SQ_CQ_QUEUE_IDX,
@@ -410,8 +407,7 @@ virtio_dev_close(struct rte_eth_dev *dev)
static void
virtio_dev_promiscuous_enable(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -430,8 +426,7 @@ virtio_dev_promiscuous_enable(struct rte_eth_dev *dev)
static void
virtio_dev_promiscuous_disable(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -450,8 +445,7 @@ virtio_dev_promiscuous_disable(struct rte_eth_dev *dev)
static void
virtio_dev_allmulticast_enable(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -470,8 +464,7 @@ virtio_dev_allmulticast_enable(struct rte_eth_dev *dev)
static void
virtio_dev_allmulticast_disable(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -853,8 +846,7 @@ virtio_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
void *param)
{
struct rte_eth_dev *dev = param;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
uint8_t isr;

/* Read interrupt status which clears interrupt */
@@ -880,12 +872,11 @@ static int
eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
struct rte_eth_dev *eth_dev)
{
+ struct virtio_hw *hw = eth_dev->data->dev_private;
struct virtio_net_config *config;
struct virtio_net_config local_config;
uint32_t offset_conf = sizeof(config->mac);
struct rte_pci_device *pci_dev;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);

if (RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr)) {
PMD_INIT_LOG(ERR,
@@ -1010,7 +1001,7 @@ static struct eth_driver rte_virtio_pmd = {
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
},
.eth_dev_init = eth_virtio_dev_init,
- .dev_private_size = sizeof(struct virtio_adapter),
+ .dev_private_size = sizeof(struct virtio_hw),
};

/*
@@ -1053,8 +1044,7 @@ static int
virtio_dev_configure(struct rte_eth_dev *dev)
{
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
int ret;

PMD_INIT_LOG(DEBUG, "configure");
@@ -1078,8 +1068,7 @@ static int
virtio_dev_start(struct rte_eth_dev *dev)
{
uint16_t nb_queues, i;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

/* Tell the host we've noticed this device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
@@ -1185,8 +1174,7 @@ static void virtio_dev_free_mbufs(struct rte_eth_dev *dev)
static void
virtio_dev_stop(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

/* reset the NIC */
vtpci_irq_config(hw, 0);
@@ -1199,8 +1187,7 @@ virtio_dev_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complet
{
struct rte_eth_link link, old;
uint16_t status;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
memset(&link, 0, sizeof(link));
virtio_dev_atomic_read_link_status(dev, &link);
old = link;
@@ -1232,7 +1219,7 @@ virtio_dev_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complet
static void
virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
{
- struct virtio_hw *hw = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

dev_info->driver_name = dev->driver->pci_drv.name;
dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues;
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.h b/lib/librte_pmd_virtio/virtio_ethdev.h
index 1da3c62..55c9749 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.h
+++ b/lib/librte_pmd_virtio/virtio_ethdev.h
@@ -110,15 +110,6 @@ uint16_t virtio_recv_mergeable_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);

-/*
- * Structure to store private data for each driver instance (for each port).
- */
-struct virtio_adapter {
- struct virtio_hw hw;
-};
-
-#define VIRTIO_DEV_PRIVATE_TO_HW(adapter)\
- (&((struct virtio_adapter *)adapter)->hw)

/*
* The VIRTIO_NET_F_GUEST_TSO[46] features permit the host to send us
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index e0216ec..a82d5ff 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -326,8 +326,7 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)
void
virtio_dev_cq_start(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

if (hw->cvq) {
virtio_dev_vring_start(hw->cvq, VTNET_CQ);
--
1.8.4.2
Ouyang Changchun
2015-01-27 02:35:53 UTC
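Virtio supports VLAN filtering. Implement the vlan_filter_set operation
through the control queue, and fail configuration when VLAN filtering is
requested but VIRTIO_NET_F_CTRL_VLAN has not been negotiated with the host.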

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 31 +++++++++++++++++++++++++++++--
1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 39b1fb4..591d692 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -84,6 +84,8 @@ static void virtio_dev_tx_queue_release(__rte_unused void *txq);
static void virtio_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats);
static void virtio_dev_stats_reset(struct rte_eth_dev *dev);
static void virtio_dev_free_mbufs(struct rte_eth_dev *dev);
+static int virtio_vlan_filter_set(struct rte_eth_dev *dev,
+ uint16_t vlan_id, int on);

static int virtio_dev_queue_stats_mapping_set(
__rte_unused struct rte_eth_dev *eth_dev,
@@ -511,6 +513,7 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
.tx_queue_release = virtio_dev_tx_queue_release,
/* collect stats per queue */
.queue_stats_mapping_set = virtio_dev_queue_stats_mapping_set,
+ .vlan_filter_set = virtio_vlan_filter_set,
};

static inline int
@@ -640,14 +643,31 @@ virtio_get_hwaddr(struct virtio_hw *hw)
}
}

+static int
+virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
+{
+ struct virtio_hw *hw = dev->data->dev_private;
+ struct virtio_pmd_ctrl ctrl;
+ int len;
+
+ if (!vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN))
+ return -ENOTSUP;
+
+ ctrl.hdr.class = VIRTIO_NET_CTRL_VLAN;
+ ctrl.hdr.cmd = on ? VIRTIO_NET_CTRL_VLAN_ADD : VIRTIO_NET_CTRL_VLAN_DEL;
+ memcpy(ctrl.data, &vlan_id, sizeof(vlan_id));
+ len = sizeof(vlan_id);
+
+ return virtio_send_command(hw->cvq, &ctrl, &len, 1);
+}

static void
virtio_negotiate_features(struct virtio_hw *hw)
{
uint32_t host_features, mask;

- mask = VIRTIO_NET_F_CTRL_VLAN;
- mask |= VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM;
+ /* checksum offload not implemented */
+ mask = VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM;

/* TSO and LRO are only available when their corresponding
* checksum offload feature is also negotiated.
@@ -1058,6 +1078,13 @@ virtio_dev_configure(struct rte_eth_dev *dev)

hw->vlan_strip = rxmode->hw_vlan_strip;

+ if (rxmode->hw_vlan_filter
+ && !vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) {
+ PMD_DRV_LOG(NOTICE,
+ "vlan filtering not available on this host");
+ return -ENOTSUP;
+ }
+
if (vtpci_irq_config(hw, 0) == VIRTIO_MSI_NO_VECTOR) {
PMD_DRV_LOG(ERR, "failed to set config vector");
return -EBUSY;
--
1.8.4.2
Ouyang Changchun
2015-01-27 02:35:50 UTC
Make vtpci_get_status a static (file-local) function, as it is used in only one file.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_pci.c | 4 +++-
lib/librte_pmd_virtio/virtio_pci.h | 2 --
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_pci.c b/lib/librte_pmd_virtio/virtio_pci.c
index b099e4f..2245bec 100644
--- a/lib/librte_pmd_virtio/virtio_pci.c
+++ b/lib/librte_pmd_virtio/virtio_pci.c
@@ -35,6 +35,8 @@
#include "virtio_pci.h"
#include "virtio_logs.h"

+static uint8_t vtpci_get_status(struct virtio_hw *);
+
void
vtpci_read_dev_config(struct virtio_hw *hw, uint64_t offset,
void *dst, int length)
@@ -113,7 +115,7 @@ vtpci_reinit_complete(struct virtio_hw *hw)
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER_OK);
}

-uint8_t
+static uint8_t
vtpci_get_status(struct virtio_hw *hw)
{
return VIRTIO_READ_REG_1(hw, VIRTIO_PCI_STATUS);
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 0a4b578..64d9c34 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -255,8 +255,6 @@ void vtpci_reset(struct virtio_hw *);

void vtpci_reinit_complete(struct virtio_hw *);

-uint8_t vtpci_get_status(struct virtio_hw *);
-
void vtpci_set_status(struct virtio_hw *, uint8_t);

uint32_t vtpci_negotiate_features(struct virtio_hw *, uint32_t);
--
1.8.4.2
Ouyang Changchun
2015-01-27 02:36:03 UTC
vhost zero copy needs the vring descriptors and their buffer addresses in order
to set the DMA addresses of the HW ring. This is done in new_device when the
set_backend ioctl is called. It requires virtio_dev_rxtx_start to be called
before vtpci_reinit_complete, which makes sure the vring descriptors and their
buffers are ready before they are used.

This patch also fixes a status-setting issue: according to the virtio spec,
VIRTIO_CONFIG_STATUS_ACK should be set after the virtio hw reset.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
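Note for readers: the ordering issue can be seen with a tiny model of the status register. The bit values (ACK = 1, DRIVER = 2, DRIVER_OK = 4) are the ones from the virtio specification. Everything else here (`struct sim_dev`, the `sim_*` helpers, and the assumption that a non-zero write is OR-ed into the current status) is a hypothetical simplification for illustration, not the driver code.

```c
#include <stdint.h>

/* Status bits as defined by the virtio specification. */
#define ST_RESET     0x00
#define ST_ACK       0x01
#define ST_DRIVER    0x02
#define ST_DRIVER_OK 0x04

/* Hypothetical simulated device; only the status register is modeled. */
struct sim_dev {
	uint8_t status;
	int rings_ready;
};

/* Writing 0 resets; any other value is accumulated into the register. */
static void
sim_set_status(struct sim_dev *dev, uint8_t status)
{
	if (status == ST_RESET)
		dev->status = ST_RESET;
	else
		dev->status |= status;
}

/* Init order after this patch: reset first, then ACK, then DRIVER. */
static void
sim_init(struct sim_dev *dev)
{
	sim_set_status(dev, ST_RESET);
	sim_set_status(dev, ST_ACK);
	sim_set_status(dev, ST_DRIVER);
}

/* Start order after this patch: rings first, DRIVER_OK last. */
static void
sim_start(struct sim_dev *dev)
{
	dev->rings_ready = 1; /* stands in for the rx/tx ring setup */
	sim_set_status(dev, ST_DRIVER_OK);
}
```

In this model, an ACK written before the reset is simply wiped out by the reset, which is why the ACK has to follow it. Likewise, by the time DRIVER_OK becomes visible the rings are already populated.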
lib/librte_pmd_virtio/virtio_ethdev.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index b905532..648c761 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -414,6 +414,7 @@ virtio_dev_close(struct rte_eth_dev *dev)
/* reset the NIC */
vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
vtpci_reset(hw);
+ hw->started = 0;
virtio_dev_free_mbufs(dev);
}

@@ -1107,9 +1108,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
return -ENOMEM;
}

- /* Tell the host we've noticed this device. */
- vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
-
pci_dev = eth_dev->pci_dev;
if (virtio_resource_init(pci_dev) < 0)
#ifdef RTE_EAL_PORT_IO
@@ -1123,6 +1121,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
/* Reset the device although not necessary at startup */
vtpci_reset(hw);

+ /* Tell the host we've noticed this device. */
+ vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
+
/* Tell the host we've known how to drive the device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
virtio_negotiate_features(hw);
@@ -1324,10 +1325,10 @@ virtio_dev_start(struct rte_eth_dev *dev)
if (hw->started)
return 0;

- vtpci_reinit_complete(hw);
-
/* Do final configuration before rx/tx engine starts */
virtio_dev_rxtx_start(dev);
+ vtpci_reinit_complete(hw);
+
hw->started = 1;

/*Notify the backend
--
1.8.4.2
Ouyang Changchun
2015-01-27 02:35:56 UTC
This makes the virtio driver work like ixgbe. Transmit buffers are
held until a transmit threshold is reached. The previous behavior
was to hold mbufs until the ring entry was reused, which used
more memory than needed.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
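Note for readers: the threshold selection in this patch can be summarized as "use the requested value, or default to min(ring size / 4, 32), and reject anything not smaller than ring size minus 3". The sketch below restates that rule as standalone functions. `pick_tx_free_thresh`, `should_cleanup` and `SKETCH_DEFAULT_TX_FREE_THRESH` are hypothetical names, and returning the threshold through the return value is a choice of this sketch only.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_DEFAULT_TX_FREE_THRESH 32

/*
 * Returns the threshold to use, or -EINVAL if the requested value is
 * too large for a ring with nentries descriptors. A request of 0
 * means "pick a default".
 */
static int
pick_tx_free_thresh(uint16_t nentries, uint16_t requested)
{
	uint16_t thresh = requested;

	if (thresh == 0) {
		thresh = nentries / 4;
		if (thresh > SKETCH_DEFAULT_TX_FREE_THRESH)
			thresh = SKETCH_DEFAULT_TX_FREE_THRESH;
	}
	/* nentries is promoted to int, so this cannot wrap around. */
	if (thresh >= nentries - 3)
		return -EINVAL;
	return thresh;
}

/* Completed buffers are freed only once more than thresh are pending. */
static int
should_cleanup(uint16_t nb_used, uint16_t thresh)
{
	return nb_used > thresh;
}
```

For a 256-entry ring with no explicit request this gives a threshold of 32, so completed transmit buffers are freed in batches rather than one ring slot at a time.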
lib/librte_pmd_virtio/virtio_ethdev.c | 7 ++--
lib/librte_pmd_virtio/virtio_rxtx.c | 75 +++++++++++++++++++++++++----------
lib/librte_pmd_virtio/virtqueue.h | 3 +-
3 files changed, 60 insertions(+), 25 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index b30ab2a..8cd2d51 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -176,15 +176,16 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,

virtqueue_notify(vq);

- while (vq->vq_used_cons_idx == vq->vq_ring.used->idx)
+ rte_rmb();
+ while (vq->vq_used_cons_idx == vq->vq_ring.used->idx) {
+ rte_rmb();
usleep(100);
+ }

while (vq->vq_used_cons_idx != vq->vq_ring.used->idx) {
uint32_t idx, desc_idx, used_idx;
struct vring_used_elem *uep;

- virtio_rmb();
-
used_idx = (uint32_t)(vq->vq_used_cons_idx
& (vq->vq_nentries - 1));
uep = &vq->vq_ring.used->ring[used_idx];
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index b6d6832..580701a 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -129,17 +129,32 @@ virtqueue_dequeue_burst_rx(struct virtqueue *vq, struct rte_mbuf **rx_pkts,
return i;
}

+#ifndef DEFAULT_TX_FREE_THRESH
+#define DEFAULT_TX_FREE_THRESH 32
+#endif
+
+/* Cleanup from completed transmits. */
static void
-virtqueue_dequeue_pkt_tx(struct virtqueue *vq)
+virtio_xmit_cleanup(struct virtqueue *vq, uint16_t num)
{
- struct vring_used_elem *uep;
- uint16_t used_idx, desc_idx;
+ uint16_t i, used_idx, desc_idx;
+ for (i = 0; i < num; i++) {
+ struct vring_used_elem *uep;
+ struct vq_desc_extra *dxp;
+
+ used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
+ uep = &vq->vq_ring.used->ring[used_idx];
+ dxp = &vq->vq_descx[used_idx];
+
+ desc_idx = (uint16_t) uep->id;
+ vq->vq_used_cons_idx++;
+ vq_ring_free_chain(vq, desc_idx);

- used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
- uep = &vq->vq_ring.used->ring[used_idx];
- desc_idx = (uint16_t) uep->id;
- vq->vq_used_cons_idx++;
- vq_ring_free_chain(vq, desc_idx);
+ if (dxp->cookie != NULL) {
+ rte_pktmbuf_free(dxp->cookie);
+ dxp->cookie = NULL;
+ }
+ }
}


@@ -203,8 +218,6 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)

idx = head_idx;
dxp = &txvq->vq_descx[idx];
- if (dxp->cookie != NULL)
- rte_pktmbuf_free(dxp->cookie);
dxp->cookie = (void *)cookie;
dxp->ndescs = needed;

@@ -404,6 +417,7 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
{
uint8_t vtpci_queue_idx = 2 * queue_idx + VTNET_SQ_TQ_QUEUE_IDX;
struct virtqueue *vq;
+ uint16_t tx_free_thresh;
int ret;

PMD_INIT_FUNC_TRACE();
@@ -421,6 +435,22 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
return ret;
}

+ tx_free_thresh = tx_conf->tx_free_thresh;
+ if (tx_free_thresh == 0)
+ tx_free_thresh =
+ RTE_MIN(vq->vq_nentries / 4, DEFAULT_TX_FREE_THRESH);
+
+ if (tx_free_thresh >= (vq->vq_nentries - 3)) {
+ RTE_LOG(ERR, PMD, "tx_free_thresh must be less than the "
+ "number of TX entries minus 3 (%u)."
+ " (tx_free_thresh=%u port=%u queue=%u)\n",
+ vq->vq_nentries - 3,
+ tx_free_thresh, dev->data->port_id, queue_idx);
+ return -EINVAL;
+ }
+
+ vq->vq_free_thresh = tx_free_thresh;
+
dev->data->tx_queues[queue_idx] = vq;
return 0;
}
@@ -688,11 +718,9 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
{
struct virtqueue *txvq = tx_queue;
struct rte_mbuf *txm;
- uint16_t nb_used, nb_tx, num;
+ uint16_t nb_used, nb_tx;
int error;

- nb_tx = 0;
-
if (unlikely(nb_pkts < 1))
return nb_pkts;

@@ -700,21 +728,26 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
nb_used = VIRTQUEUE_NUSED(txvq);

virtio_rmb();
+ if (likely(nb_used > txvq->vq_free_thresh))
+ virtio_xmit_cleanup(txvq, nb_used);

- num = (uint16_t)(likely(nb_used < VIRTIO_MBUF_BURST_SZ) ? nb_used : VIRTIO_MBUF_BURST_SZ);
+ nb_tx = 0;

while (nb_tx < nb_pkts) {
/* Need one more descriptor for virtio header. */
int need = tx_pkts[nb_tx]->nb_segs - txvq->vq_free_cnt + 1;
- int deq_cnt = RTE_MIN(need, (int)num);

- num -= (deq_cnt > 0) ? deq_cnt : 0;
- while (deq_cnt > 0) {
- virtqueue_dequeue_pkt_tx(txvq);
- deq_cnt--;
+ /*Positive value indicates it need free vring descriptors */
+ if (unlikely(need > 0)) {
+ nb_used = VIRTQUEUE_NUSED(txvq);
+ virtio_rmb();
+ need = RTE_MIN(need, (int)nb_used);
+
+ virtio_xmit_cleanup(txvq, need);
+ need = (int)tx_pkts[nb_tx]->nb_segs -
+ txvq->vq_free_cnt + 1;
}

- need = (int)tx_pkts[nb_tx]->nb_segs - txvq->vq_free_cnt + 1;
/*
* Zero or negative value indicates it has enough free
* descriptors to use for transmitting.
@@ -723,7 +756,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
txm = tx_pkts[nb_tx];

/* Do VLAN tag insertion */
- if (txm->ol_flags & PKT_TX_VLAN_PKT) {
+ if (unlikely(txm->ol_flags & PKT_TX_VLAN_PKT)) {
error = rte_vlan_insert(&txm);
if (unlikely(error)) {
rte_pktmbuf_free(txm);
diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index d210f4f..6c45c27 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -164,6 +164,7 @@ struct virtqueue {
struct rte_mempool *mpool; /**< mempool for mbuf allocation */
uint16_t queue_id; /**< DPDK queue index. */
uint8_t port_id; /**< Device port identifier. */
+ uint16_t vq_queue_index; /**< PCI queue index */

void *vq_ring_virt_mem; /**< linear address of vring*/
unsigned int vq_ring_size;
@@ -172,7 +173,7 @@ struct virtqueue {
struct vring vq_ring; /**< vring keeping desc, used and avail */
uint16_t vq_free_cnt; /**< num of desc available */
uint16_t vq_nentries; /**< vring desc numbers */
- uint16_t vq_queue_index; /**< PCI queue index */
+ uint16_t vq_free_thresh; /**< free threshold */
/**
* Head of the free chain in the descriptor table. If
* there are no free descriptors, this will be set to
--
1.8.4.2
Ouyang Changchun
2015-01-27 02:35:48 UTC
Since vq_alignment is constant (always 4K), it does not
need to be part of the virtqueue struct.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 1 -
lib/librte_pmd_virtio/virtio_rxtx.c | 2 +-
lib/librte_pmd_virtio/virtqueue.h | 3 +--
3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 59b74b7..0d41e7f 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -294,7 +294,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
vq->port_id = dev->data->port_id;
vq->queue_id = queue_idx;
vq->vq_queue_index = vtpci_queue_idx;
- vq->vq_alignment = VIRTIO_PCI_VRING_ALIGN;
vq->vq_nentries = vq_size;
vq->vq_free_cnt = vq_size;

diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index a82d5ff..b6d6832 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -258,7 +258,7 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)
* Reinitialise since virtio port might have been stopped and restarted
*/
memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size);
- vring_init(vr, size, ring_mem, vq->vq_alignment);
+ vring_init(vr, size, ring_mem, VIRTIO_PCI_VRING_ALIGN);
vq->vq_used_cons_idx = 0;
vq->vq_desc_head_idx = 0;
vq->vq_avail_idx = 0;
diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index f6ad98d..5b8a255 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -138,8 +138,7 @@ struct virtqueue {
uint8_t port_id; /**< Device port identifier. */

void *vq_ring_virt_mem; /**< linear address of vring*/
- int vq_alignment;
- int vq_ring_size;
+ unsigned int vq_ring_size;
phys_addr_t vq_ring_mem; /**< physical address of vring */

struct vring vq_ring; /**< vring keeping desc, used and avail */
--
1.8.4.2
Ouyang Changchun
2015-01-27 02:35:54 UTC
Virtio supports multiple MAC addresses.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
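Note for readers: the TABLE_SET command takes two lists, one of unicast and one of multicast addresses. An address is multicast when the least significant bit of its first octet is set. The sketch below shows one way to split an address array into the two lists while skipping empty slots. `struct mac_table`, `MAX_ADDRS` and the `mac_*` helpers are hypothetical and use a fixed-size table instead of the flexible array in the patch.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN   6
#define MAX_ADDRS 8 /* illustrative only; the patch uses 64 */

/* Hypothetical fixed-size table: an entry count followed by addresses. */
struct mac_table {
	uint32_t entries;
	uint8_t macs[MAX_ADDRS][MAC_LEN];
};

static int
mac_is_zero(const uint8_t *mac)
{
	static const uint8_t zero[MAC_LEN];

	return memcmp(mac, zero, MAC_LEN) == 0;
}

/* The group (multicast) bit is the LSB of the first octet. */
static int
mac_is_multicast(const uint8_t *mac)
{
	return mac[0] & 0x01;
}

/* addrs points to n addresses of MAC_LEN bytes each, back to back. */
static void
mac_tables_build(const uint8_t *addrs, unsigned int n,
		 struct mac_table *uc, struct mac_table *mc)
{
	unsigned int i;

	uc->entries = 0;
	mc->entries = 0;
	for (i = 0; i < n && i < MAX_ADDRS; i++) {
		const uint8_t *mac = addrs + i * MAC_LEN;
		struct mac_table *tbl;

		if (mac_is_zero(mac))
			continue; /* unused slot */
		tbl = mac_is_multicast(mac) ? mc : uc;
		memcpy(tbl->macs[tbl->entries++], mac, MAC_LEN);
	}
}
```

Because at most `MAX_ADDRS` slots are visited, neither table can overflow, even if every address lands in the same list.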
lib/librte_pmd_virtio/virtio_ethdev.c | 94 ++++++++++++++++++++++++++++++++++-
lib/librte_pmd_virtio/virtio_ethdev.h | 3 +-
lib/librte_pmd_virtio/virtqueue.h | 34 ++++++++++++-
3 files changed, 127 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 591d692..0e74eea 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -86,6 +86,10 @@ static void virtio_dev_stats_reset(struct rte_eth_dev *dev);
static void virtio_dev_free_mbufs(struct rte_eth_dev *dev);
static int virtio_vlan_filter_set(struct rte_eth_dev *dev,
uint16_t vlan_id, int on);
+static void virtio_mac_addr_add(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr,
+ uint32_t index, uint32_t vmdq __rte_unused);
+static void virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index);

static int virtio_dev_queue_stats_mapping_set(
__rte_unused struct rte_eth_dev *eth_dev,
@@ -503,8 +507,6 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
.stats_get = virtio_dev_stats_get,
.stats_reset = virtio_dev_stats_reset,
.link_update = virtio_dev_link_update,
- .mac_addr_add = NULL,
- .mac_addr_remove = NULL,
.rx_queue_setup = virtio_dev_rx_queue_setup,
/* meaningfull only to multiple queue */
.rx_queue_release = virtio_dev_rx_queue_release,
@@ -514,6 +516,8 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
/* collect stats per queue */
.queue_stats_mapping_set = virtio_dev_queue_stats_mapping_set,
.vlan_filter_set = virtio_vlan_filter_set,
+ .mac_addr_add = virtio_mac_addr_add,
+ .mac_addr_remove = virtio_mac_addr_remove,
};

static inline int
@@ -644,6 +648,92 @@ virtio_get_hwaddr(struct virtio_hw *hw)
}

static int
+virtio_mac_table_set(struct virtio_hw *hw,
+ const struct virtio_net_ctrl_mac *uc,
+ const struct virtio_net_ctrl_mac *mc)
+{
+ struct virtio_pmd_ctrl ctrl;
+ int err, len[2];
+
+ ctrl.hdr.class = VIRTIO_NET_CTRL_MAC;
+ ctrl.hdr.cmd = VIRTIO_NET_CTRL_MAC_TABLE_SET;
+
+ len[0] = uc->entries * ETHER_ADDR_LEN + sizeof(uc->entries);
+ memcpy(ctrl.data, uc, len[0]);
+
+ len[1] = mc->entries * ETHER_ADDR_LEN + sizeof(mc->entries);
+ memcpy(ctrl.data + len[0], mc, len[1]);
+
+ err = virtio_send_command(hw->cvq, &ctrl, len, 2);
+ if (err != 0)
+ PMD_DRV_LOG(NOTICE, "mac table set failed: %d", err);
+
+ return err;
+}
+
+static void
+virtio_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
+ uint32_t index, uint32_t vmdq __rte_unused)
+{
+ struct virtio_hw *hw = dev->data->dev_private;
+ const struct ether_addr *addrs = dev->data->mac_addrs;
+ unsigned int i;
+ struct virtio_net_ctrl_mac *uc, *mc;
+
+ if (index >= VIRTIO_MAX_MAC_ADDRS) {
+ PMD_DRV_LOG(ERR, "mac address index %u out of range", index);
+ return;
+ }
+
+ uc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(uc->entries));
+ uc->entries = 0;
+ mc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(mc->entries));
+ mc->entries = 0;
+
+ for (i = 0; i < VIRTIO_MAX_MAC_ADDRS; i++) {
+ const struct ether_addr *addr
+ = (i == index) ? mac_addr : addrs + i;
+ struct virtio_net_ctrl_mac *tbl
+ = is_multicast_ether_addr(addr) ? mc : uc;
+
+ memcpy(&tbl->macs[tbl->entries++], addr, ETHER_ADDR_LEN);
+ }
+
+ virtio_mac_table_set(hw, uc, mc);
+}
+
+static void
+virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
+{
+ struct virtio_hw *hw = dev->data->dev_private;
+ struct ether_addr *addrs = dev->data->mac_addrs;
+ struct virtio_net_ctrl_mac *uc, *mc;
+ unsigned int i;
+
+ if (index >= VIRTIO_MAX_MAC_ADDRS) {
+ PMD_DRV_LOG(ERR, "mac address index %u out of range", index);
+ return;
+ }
+
+ uc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(uc->entries));
+ uc->entries = 0;
+ mc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(mc->entries));
+ mc->entries = 0;
+
+ for (i = 0; i < VIRTIO_MAX_MAC_ADDRS; i++) {
+ struct virtio_net_ctrl_mac *tbl;
+
+ if (i == index || is_zero_ether_addr(addrs + i))
+ continue;
+
+ tbl = is_multicast_ether_addr(addrs + i) ? mc : uc;
+ memcpy(&tbl->macs[tbl->entries++], addrs + i, ETHER_ADDR_LEN);
+ }
+
+ virtio_mac_table_set(hw, uc, mc);
+}
+
+static int
virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
{
struct virtio_hw *hw = dev->data->dev_private;
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.h b/lib/librte_pmd_virtio/virtio_ethdev.h
index 55c9749..74ac7e0 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.h
+++ b/lib/librte_pmd_virtio/virtio_ethdev.h
@@ -51,7 +51,7 @@

#define VIRTIO_MAX_RX_QUEUES 128
#define VIRTIO_MAX_TX_QUEUES 128
-#define VIRTIO_MAX_MAC_ADDRS 1
+#define VIRTIO_MAX_MAC_ADDRS 64
#define VIRTIO_MIN_RX_BUFSIZE 64
#define VIRTIO_MAX_RX_PKTLEN 9728

@@ -60,6 +60,7 @@
(VIRTIO_NET_F_MAC | \
VIRTIO_NET_F_STATUS | \
VIRTIO_NET_F_MQ | \
+ VIRTIO_NET_F_CTRL_MAC_ADDR | \
VIRTIO_NET_F_CTRL_VQ | \
VIRTIO_NET_F_CTRL_RX | \
VIRTIO_NET_F_CTRL_VLAN | \
diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index 5b8a255..d210f4f 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -99,6 +99,34 @@ enum { VTNET_RQ = 0, VTNET_TQ = 1, VTNET_CQ = 2 };
#define VIRTIO_NET_CTRL_RX_NOBCAST 5

/**
+ * Control the MAC
+ *
+ * The MAC filter table is managed by the hypervisor, the guest should
+ * assume the size is infinite. Filtering should be considered
+ * non-perfect, ie. based on hypervisor resources, the guest may
+ * received packets from sources not specified in the filter list.
+ *
+ * In addition to the class/cmd header, the TABLE_SET command requires
+ * two out scatterlists. Each contains a 4 byte count of entries followed
+ * by a concatenated byte stream of the ETH_ALEN MAC addresses. The
+ * first sg list contains unicast addresses, the second is for multicast.
+ * This functionality is present if the VIRTIO_NET_F_CTRL_RX feature
+ * is available.
+ *
+ * The ADDR_SET command requests one out scatterlist, it contains a
+ * 6 bytes MAC address. This functionality is present if the
+ * VIRTIO_NET_F_CTRL_MAC_ADDR feature is available.
+ */
+struct virtio_net_ctrl_mac {
+ uint32_t entries;
+ uint8_t macs[][ETHER_ADDR_LEN];
+} __attribute__((__packed__));
+
+#define VIRTIO_NET_CTRL_MAC 1
+ #define VIRTIO_NET_CTRL_MAC_TABLE_SET 0
+ #define VIRTIO_NET_CTRL_MAC_ADDR_SET 1
+
+/**
* Control VLAN filtering
*
* The VLAN filter table is controlled via a simple ADD/DEL interface.
@@ -121,7 +149,7 @@ typedef uint8_t virtio_net_ctrl_ack;
#define VIRTIO_NET_OK 0
#define VIRTIO_NET_ERR 1

-#define VIRTIO_MAX_CTRL_DATA 128
+#define VIRTIO_MAX_CTRL_DATA 2048

struct virtio_pmd_ctrl {
struct virtio_net_ctrl_hdr hdr;
@@ -180,6 +208,10 @@ struct virtqueue {
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN 1
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX 0x8000
#endif
+#ifndef VIRTIO_NET_F_CTRL_MAC_ADDR
+#define VIRTIO_NET_F_CTRL_MAC_ADDR 0x800000
+#define VIRTIO_NET_CTRL_MAC_ADDR_SET 1
+#endif

/**
* This is the first element of the scatter-gather list. If you don't
--
1.8.4.2
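For readers following the TABLE_SET layout described in the comment above, here is a minimal standalone sketch of how one table (a 4-byte entry count followed by the concatenated MAC addresses) could be packed. It is not the driver code: `mac_table_size`, `mac_table_pack` and `TOY_MAC_LEN` are names made up for illustration, and the count is written in host byte order on the assumption of a legacy (guest-native endian) virtio device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAC_LEN 6 /* stands in for ETHER_ADDR_LEN */

/* Bytes needed for one table: 4-byte entry count plus n addresses. */
static size_t
mac_table_size(uint32_t n)
{
    return sizeof(uint32_t) + (size_t)n * TOY_MAC_LEN;
}

/*
 * Pack one table into buf: the entry count first, then the addresses
 * back to back. macs points at n * TOY_MAC_LEN bytes.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t
mac_table_pack(uint8_t *buf, size_t cap, const uint8_t *macs, uint32_t n)
{
    size_t need = mac_table_size(n);

    assert(buf != NULL);
    if (need > cap)
        return 0;

    /* Assumption: count is stored in host (guest-native) byte order. */
    memcpy(buf, &n, sizeof(n));
    if (n > 0)
        memcpy(buf + sizeof(n), macs, (size_t)n * TOY_MAC_LEN);
    return need;
}
```

With 64 addresses a single table needs 4 + 64 * 6 = 388 bytes, which would not fit in a 128-byte control buffer but fits comfortably in 2048 bytes, even with a unicast and a multicast table sent together.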
Ouyang Changchun
2015-01-27 02:36:00 UTC
Check whether the packet is already VLAN-tagged; if so, avoid inserting a
duplicate VLAN tag into it.

This can happen when the guest is capable of inserting the VLAN tag itself.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
examples/vhost/main.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 04f0118..1d31520 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1115,6 +1115,7 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
unsigned len, ret, offset = 0;
const uint16_t lcore_id = rte_lcore_id();
struct virtio_net *dev = vdev->dev;
+ struct ether_hdr *nh;

/*check if destination is local VM*/
if ((vm2vm_mode == VM2VM_SOFTWARE) && (virtio_tx_local(vdev, m) == 0)) {
@@ -1135,7 +1136,15 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
tx_q = &lcore_tx_queue[lcore_id];
len = tx_q->len;

- m->ol_flags = PKT_TX_VLAN_PKT;
+ nh = rte_pktmbuf_mtod(m, struct ether_hdr *);
+ if (unlikely(nh->ether_type == rte_cpu_to_be_16(ETHER_TYPE_VLAN))) {
+ /* Guest has inserted the vlan tag. */
+ struct vlan_hdr *vh = (struct vlan_hdr *) (nh + 1);
+ uint16_t vlan_tag_be = rte_cpu_to_be_16(vlan_tag);
+ if (vh->vlan_tci != vlan_tag_be)
+ vh->vlan_tci = vlan_tag_be;
+ } else {
+ m->ol_flags = PKT_TX_VLAN_PKT;

/*
* Find the right seg to adjust the data len when offset is
@@ -1156,7 +1165,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
m->pkt_len += offset;
}

- m->vlan_tci = vlan_tag;
+ m->vlan_tci = vlan_tag;
+ }

tx_q->m_table[len] = m;
len++;
--
1.8.4.2
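The check in this patch can be tried outside DPDK with a small standalone sketch. It works on a raw frame buffer rather than an mbuf, and `frame_fix_vlan` plus the `TOY_*` constants are hypothetical names used only here: if the frame already carries an 802.1Q tag, the TCI is overwritten in place; otherwise the caller is told to request tag insertion as before.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETHER_TYPE_VLAN 0x8100
#define TOY_TYPE_OFF 12  /* ether type follows dst and src MAC */
#define TOY_TCI_OFF  14  /* TCI follows the 0x8100 type field */
#define TOY_MIN_TAGGED_LEN 18

/*
 * Returns 1 if the frame was already tagged and its TCI was set to
 * vlan_tag, 0 if the frame is untagged (caller should ask for the
 * tag to be inserted).
 */
static int
frame_fix_vlan(uint8_t *frame, size_t len, uint16_t vlan_tag)
{
    uint16_t type_be;
    uint16_t tci_be = htons(vlan_tag);

    assert(frame != NULL);
    if (len < TOY_MIN_TAGGED_LEN)
        return 0;

    /* memcpy avoids unaligned 16-bit loads on the packet buffer. */
    memcpy(&type_be, frame + TOY_TYPE_OFF, sizeof(type_be));
    if (type_be != htons(TOY_ETHER_TYPE_VLAN))
        return 0;

    memcpy(frame + TOY_TCI_OFF, &tci_be, sizeof(tci_be));
    return 1;
}
```

Overwriting the guest's TCI, rather than trusting it, mirrors what the patch does: the host-assigned tag always wins.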
Ouyang Changchun
2015-01-27 02:36:01 UTC
Support turning RX VLAN strip on the host on or off; turning it off gives the
guest the chance to use its software VLAN strip functionality.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
examples/vhost/main.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 1d31520..4ff916d 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -159,6 +159,9 @@ static uint32_t num_devices;
static uint32_t zero_copy;
static int mergeable;

+/* Do vlan strip on host, enabled by default */
+static uint32_t vlan_strip = 1;
+
/* number of descriptors to apply*/
static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP;
static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP;
@@ -564,6 +567,7 @@ us_vhost_usage(const char *prgname)
" --rx-retry-delay [0-N]: timeout(in usecond) between retries on RX. This makes effect only if retries on rx enabled\n"
" --rx-retry-num [0-N]: the number of retries on rx. This makes effect only if retries on rx enabled\n"
" --mergeable [0|1]: disable(default)/enable RX mergeable buffers\n"
+ " --vlan-strip [0|1]: disable/enable(default) RX VLAN strip on host\n"
" --stats [0-N]: 0: Disable stats, N: Time in seconds to print stats\n"
" --dev-basename: The basename to be used for the character device.\n"
" --zero-copy [0|1]: disable(default)/enable rx/tx "
@@ -591,6 +595,7 @@ us_vhost_parse_args(int argc, char **argv)
{"rx-retry-delay", required_argument, NULL, 0},
{"rx-retry-num", required_argument, NULL, 0},
{"mergeable", required_argument, NULL, 0},
+ {"vlan-strip", required_argument, NULL, 0},
{"stats", required_argument, NULL, 0},
{"dev-basename", required_argument, NULL, 0},
{"zero-copy", required_argument, NULL, 0},
@@ -691,6 +696,22 @@ us_vhost_parse_args(int argc, char **argv)
}
}

+ /* Enable/disable RX VLAN strip on host. */
+ if (!strncmp(long_option[option_index].name,
+ "vlan-strip", MAX_LONG_OPT_SZ)) {
+ ret = parse_num_opt(optarg, 1);
+ if (ret == -1) {
+ RTE_LOG(INFO, VHOST_CONFIG,
+ "Invalid argument for VLAN strip [0|1]\n");
+ us_vhost_usage(prgname);
+ return -1;
+ } else {
+ vlan_strip = !!ret;
+ vmdq_conf_default.rxmode.hw_vlan_strip =
+ vlan_strip;
+ }
+ }
+
/* Enable/disable stats. */
if (!strncmp(long_option[option_index].name, "stats", MAX_LONG_OPT_SZ)) {
ret = parse_num_opt(optarg, INT32_MAX);
@@ -950,7 +971,9 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
dev->device_fh);

/* Enable stripping of the vlan tag as we handle routing. */
- rte_eth_dev_set_vlan_strip_on_queue(ports[0], (uint16_t)vdev->vmdq_rx_q, 1);
+ if (vlan_strip)
+ rte_eth_dev_set_vlan_strip_on_queue(ports[0],
+ (uint16_t)vdev->vmdq_rx_q, 1);

/* Set device as ready for RX. */
vdev->ready = DEVICE_RX;
--
1.8.4.2
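As an aside on the option handling: the patch relies on the example's existing `parse_num_opt()` helper, whose body is not shown in this thread. A minimal sketch of what such a bounded numeric parser might look like is below; `parse_bounded_opt` is a hypothetical name, not the helper from examples/vhost/main.c.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a decimal option argument in the range [0, max].
 * Returns the value, or -1 on any parse or range error.
 */
static int
parse_bounded_opt(const char *arg, unsigned long max)
{
    char *end = NULL;
    unsigned long v;

    assert(max <= INT_MAX); /* result must fit the int return type */
    if (arg == NULL || *arg == '\0')
        return -1;

    errno = 0;
    v = strtoul(arg, &end, 10);
    /* Reject overflow, trailing junk and out-of-range values. */
    if (errno != 0 || *end != '\0' || v > max)
        return -1;

    return (int)v;
}
```

For a [0|1] switch such as --vlan-strip the caller would pass max = 1 and normalize the result with `!!ret`, as the patch does.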
Ouyang Changchun
2015-01-27 02:36:04 UTC
Defer work on the hot path that is unnecessary when the function returns early.
Also change one likely() to unlikely() so the compiler can make a better decision.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_rxtx.c | 33 +++++++++++++++++++++++----------
1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index c6d9ae7..c4731b5 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -476,13 +476,13 @@ uint16_t
virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
struct virtqueue *rxvq = rx_queue;
- struct virtio_hw *hw = rxvq->hw;
+ struct virtio_hw *hw;
struct rte_mbuf *rxm, *new_mbuf;
- uint16_t nb_used, num, nb_rx = 0;
+ uint16_t nb_used, num, nb_rx;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
struct rte_mbuf *rcv_pkts[VIRTIO_MBUF_BURST_SZ];
int error;
- uint32_t i, nb_enqueued = 0;
+ uint32_t i, nb_enqueued;
const uint32_t hdr_size = sizeof(struct virtio_net_hdr);

nb_used = VIRTQUEUE_NUSED(rxvq);
@@ -491,7 +491,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)

num = (uint16_t)(likely(nb_used <= nb_pkts) ? nb_used : nb_pkts);
num = (uint16_t)(likely(num <= VIRTIO_MBUF_BURST_SZ) ? num : VIRTIO_MBUF_BURST_SZ);
- if (likely(num > DESC_PER_CACHELINE))
+ if (unlikely(num > DESC_PER_CACHELINE))
num = num - ((rxvq->vq_used_cons_idx + num) % DESC_PER_CACHELINE);

if (num == 0)
@@ -499,6 +499,11 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)

num = virtqueue_dequeue_burst_rx(rxvq, rcv_pkts, len, num);
PMD_RX_LOG(DEBUG, "used:%d dequeue:%d", nb_used, num);
+
+ hw = rxvq->hw;
+ nb_rx = 0;
+ nb_enqueued = 0;
+
for (i = 0; i < num ; i++) {
rxm = rcv_pkts[i];

@@ -568,17 +573,17 @@ virtio_recv_mergeable_pkts(void *rx_queue,
uint16_t nb_pkts)
{
struct virtqueue *rxvq = rx_queue;
- struct virtio_hw *hw = rxvq->hw;
+ struct virtio_hw *hw;
struct rte_mbuf *rxm, *new_mbuf;
- uint16_t nb_used, num, nb_rx = 0;
+ uint16_t nb_used, num, nb_rx;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
struct rte_mbuf *rcv_pkts[VIRTIO_MBUF_BURST_SZ];
struct rte_mbuf *prev;
int error;
- uint32_t i = 0, nb_enqueued = 0;
- uint32_t seg_num = 0;
- uint16_t extra_idx = 0;
- uint32_t seg_res = 0;
+ uint32_t i, nb_enqueued;
+ uint32_t seg_num;
+ uint16_t extra_idx;
+ uint32_t seg_res;
const uint32_t hdr_size = sizeof(struct virtio_net_hdr_mrg_rxbuf);

nb_used = VIRTQUEUE_NUSED(rxvq);
@@ -590,6 +595,14 @@ virtio_recv_mergeable_pkts(void *rx_queue,

PMD_RX_LOG(DEBUG, "used:%d\n", nb_used);

+ hw = rxvq->hw;
+ nb_rx = 0;
+ i = 0;
+ nb_enqueued = 0;
+ seg_num = 0;
+ extra_idx = 0;
+ seg_res = 0;
+
while (i < nb_used) {
struct virtio_net_hdr_mrg_rxbuf *header;
--
1.8.4.2
Ouyang Changchun
2015-01-27 02:35:55 UTC
Setting the default MAC address needs special handling.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_ether/rte_ethdev.h | 5 +++++
lib/librte_pmd_virtio/virtio_ethdev.c | 24 ++++++++++++++++++++++++
2 files changed, 29 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 94d6b2b..5a54276 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1240,6 +1240,10 @@ typedef void (*eth_mac_addr_add_t)(struct rte_eth_dev *dev,
uint32_t vmdq);
/**< @internal Set a MAC address into Receive Address Address Register */

+typedef void (*eth_mac_addr_set_t)(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr);
+/**< @internal Set a MAC address into Receive Address Register */
+
typedef int (*eth_uc_hash_table_set_t)(struct rte_eth_dev *dev,
struct ether_addr *mac_addr,
uint8_t on);
@@ -1459,6 +1463,7 @@ struct eth_dev_ops {
priority_flow_ctrl_set_t priority_flow_ctrl_set; /**< Setup priority flow control.*/
eth_mac_addr_remove_t mac_addr_remove; /**< Remove MAC address */
eth_mac_addr_add_t mac_addr_add; /**< Add a MAC address */
+ eth_mac_addr_set_t mac_addr_set; /**< Set a MAC address */
eth_uc_hash_table_set_t uc_hash_table_set; /**< Set Unicast Table Array */
eth_uc_all_hash_table_set_t uc_all_hash_table_set; /**< Set Unicast hash bitmap */
eth_mirror_rule_set_t mirror_rule_set; /**< Add a traffic mirror rule.*/
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 0e74eea..b30ab2a 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -90,6 +90,8 @@ static void virtio_mac_addr_add(struct rte_eth_dev *dev,
struct ether_addr *mac_addr,
uint32_t index, uint32_t vmdq __rte_unused);
static void virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index);
+static void virtio_mac_addr_set(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr);

static int virtio_dev_queue_stats_mapping_set(
__rte_unused struct rte_eth_dev *eth_dev,
@@ -518,6 +520,7 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
.vlan_filter_set = virtio_vlan_filter_set,
.mac_addr_add = virtio_mac_addr_add,
.mac_addr_remove = virtio_mac_addr_remove,
+ .mac_addr_set = virtio_mac_addr_set,
};

static inline int
@@ -733,6 +736,27 @@ virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
virtio_mac_table_set(hw, uc, mc);
}

+static void
+virtio_mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr)
+{
+ struct virtio_hw *hw = dev->data->dev_private;
+
+ memcpy(hw->mac_addr, mac_addr, ETHER_ADDR_LEN);
+
+ /* Use atomic update if available */
+ if (vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_MAC_ADDR)) {
+ struct virtio_pmd_ctrl ctrl;
+ int len = ETHER_ADDR_LEN;
+
+ ctrl.hdr.class = VIRTIO_NET_CTRL_MAC;
+ ctrl.hdr.cmd = VIRTIO_NET_CTRL_MAC_ADDR_SET;
+
+ memcpy(ctrl.data, mac_addr, ETHER_ADDR_LEN);
+ virtio_send_command(hw->cvq, &ctrl, &len, 1);
+ } else if (vtpci_with_feature(hw, VIRTIO_NET_F_MAC))
+ virtio_set_hwaddr(hw);
+}
+
static int
virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
{
--
1.8.4.2
Ouyang Changchun
2015-01-27 02:35:58 UTC
Use the vring descriptor index, not the used-ring index, to index vq_descx.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_rxtx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index 580701a..a82e8eb 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -144,9 +144,9 @@ virtio_xmit_cleanup(struct virtqueue *vq, uint16_t num)

used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
uep = &vq->vq_ring.used->ring[used_idx];
- dxp = &vq->vq_descx[used_idx];

desc_idx = (uint16_t) uep->id;
+ dxp = &vq->vq_descx[desc_idx];
vq->vq_used_cons_idx++;
vq_ring_free_chain(vq, desc_idx);
--
1.8.4.2
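To illustrate why the two indexes differ, here is a toy model of a used ring, under the assumption that the backend may return descriptors in a different order than they were submitted. `toy_vq`, `desc_extra` and `toy_vq_pop_cookie` are invented for this sketch and only loosely mirror the real virtqueue structures.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RING_SIZE 4 /* power of two, like vq_nentries */

struct used_elem {
    uint32_t id;  /* head descriptor index of the completed chain */
    uint32_t len;
};

struct desc_extra {
    int cookie;   /* stands in for the per-descriptor vq_descx data */
};

struct toy_vq {
    struct used_elem used[TOY_RING_SIZE];
    struct desc_extra descx[TOY_RING_SIZE];
    uint16_t used_cons_idx;
};

/* Consume one used entry and return the cookie of its descriptor. */
static int
toy_vq_pop_cookie(struct toy_vq *vq)
{
    /* Position in the used ring: tells us where to read ... */
    uint16_t used_idx = (uint16_t)(vq->used_cons_idx & (TOY_RING_SIZE - 1));
    /* ... but the per-descriptor data is keyed by the descriptor id. */
    uint16_t desc_idx = (uint16_t)vq->used[used_idx].id;

    assert(desc_idx < TOY_RING_SIZE);
    vq->used_cons_idx++;
    return vq->descx[desc_idx].cookie;
}
```

If the first used entry reports descriptor 2, indexing `descx` with the used-ring position (0) would fetch the wrong slot; indexing with the reported id fetches the right one.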
Ouyang Changchun
2015-01-27 02:35:51 UTC
Better to check at compile time than fail at runtime.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 47dd33d..9679c2f 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -882,11 +882,7 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
uint32_t offset_conf = sizeof(config->mac);
struct rte_pci_device *pci_dev;

- if (RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr)) {
- PMD_INIT_LOG(ERR,
- "MBUF HEADROOM should be enough to hold virtio net hdr\n");
- return -1;
- }
+ RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr));

eth_dev->dev_ops = &virtio_eth_dev_ops;
eth_dev->tx_pkt_burst = &virtio_xmit_pkts;
--
1.8.4.2
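For anyone unfamiliar with this style of check, a compile-time assertion can be sketched with the negative-array-size trick. The macro below is a local stand-in written for this example, not a copy of RTE_BUILD_BUG_ON, and `TOY_HEADROOM` and `toy_net_hdr` are hypothetical values chosen only to make the sketch self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Fails to compile when cond is true: the array size becomes -1. */
#define TOY_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

#define TOY_HEADROOM 128u /* hypothetical headroom size */

/* Same field widths as a basic virtio net header (10 bytes). */
struct toy_net_hdr {
    uint8_t flags;
    uint8_t gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
};

static int
toy_dev_init(void)
{
    /* Checked by the compiler; costs nothing at run time. */
    TOY_BUILD_BUG_ON(TOY_HEADROOM < sizeof(struct toy_net_hdr));

    /* Sanity check for this sketch only: no padding is expected. */
    assert(sizeof(struct toy_net_hdr) == 10);
    return 0;
}
```

Setting `TOY_HEADROOM` below 10 makes the build fail at the macro line, which is the behavior the patch is after: a misconfiguration is caught when building rather than when the port is initialized.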
Ouyang Changchun
2015-01-27 02:35:59 UTC
The VLAN ether type must be converted from CPU byte order to BE (big endian)
before it is compared against or stored into the packet header.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_ether/rte_ether.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
index 74f71c2..0797908 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -351,7 +351,7 @@ static inline int rte_vlan_strip(struct rte_mbuf *m)
struct ether_hdr *eh
= rte_pktmbuf_mtod(m, struct ether_hdr *);

- if (eh->ether_type != ETHER_TYPE_VLAN)
+ if (eh->ether_type != rte_cpu_to_be_16(ETHER_TYPE_VLAN))
return -1;

struct vlan_hdr *vh = (struct vlan_hdr *)(eh + 1);
@@ -401,7 +401,7 @@ static inline int rte_vlan_insert(struct rte_mbuf **m)
return -ENOSPC;

memmove(nh, oh, 2 * ETHER_ADDR_LEN);
- nh->ether_type = ETHER_TYPE_VLAN;
+ nh->ether_type = rte_cpu_to_be_16(ETHER_TYPE_VLAN);

vh = (struct vlan_hdr *) (nh + 1);
vh->vlan_tci = rte_cpu_to_be_16((*m)->vlan_tci);
--
1.8.4.2
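The effect of the missing byte swap is easy to reproduce in isolation. The sketch below compares the two on-wire type bytes against the constant with and without conversion; it uses POSIX `htons()` in place of `rte_cpu_to_be_16()`, and the function names are made up for this example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETHER_TYPE_VLAN 0x8100 /* same value as ETHER_TYPE_VLAN */

/* Old behavior: raw wire field compared with a host-order constant. */
static int
is_vlan_naive(const uint8_t *type_bytes)
{
    uint16_t raw;

    assert(type_bytes != NULL);
    memcpy(&raw, type_bytes, sizeof(raw));
    return raw == TOY_ETHER_TYPE_VLAN;
}

/* Fixed behavior: constant converted to big endian first. */
static int
is_vlan_fixed(const uint8_t *type_bytes)
{
    uint16_t raw;

    assert(type_bytes != NULL);
    memcpy(&raw, type_bytes, sizeof(raw));
    return raw == htons(TOY_ETHER_TYPE_VLAN);
}
```

On a little-endian host the naive version never matches a real tagged frame (bytes 0x81 0x00 on the wire), so rte_vlan_strip() would have returned -1 for every packet; on a big-endian host both versions agree.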
Ouyang Changchun
2015-01-27 02:35:57 UTC
Allow virtio to work without UIO, for security reasons; this matches 6WIND's virtio-net-pmd.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
config/common_linuxapp | 2 +
lib/librte_eal/common/include/rte_pci.h | 4 ++
lib/librte_eal/linuxapp/eal/eal_pci.c | 5 +-
lib/librte_pmd_virtio/virtio_ethdev.c | 91 ++++++++++++++++++++++++++++++++-
4 files changed, 100 insertions(+), 2 deletions(-)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 2f9643b..a412457 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -100,6 +100,8 @@ CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=n
CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
CONFIG_RTE_EAL_IGB_UIO=y
CONFIG_RTE_EAL_VFIO=y
+# Only for VIRTIO PMD currently
+CONFIG_RTE_EAL_PORT_IO=n

#
# Special configurations in PCI Config Space for high performance
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 66ed793..19abc1f 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -193,6 +193,10 @@ struct rte_pci_driver {

/** Device needs PCI BAR mapping (done with either IGB_UIO or VFIO) */
#define RTE_PCI_DRV_NEED_MAPPING 0x0001
+/** Device needs port IO(done with /proc/ioports) */
+#ifdef RTE_EAL_PORT_IO
+#define RTE_PCI_DRV_PORT_IO 0x0002
+#endif
/** Device driver must be registered several times until failure - deprecated */
#pragma GCC poison RTE_PCI_DRV_MULTIPLE
/** Device needs to be unbound even if no module is provided */
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index b5f5410..5db0059 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -574,7 +574,10 @@ rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *d
/* map resources for devices that use igb_uio */
ret = pci_map_device(dev);
if (ret != 0)
- return ret;
+#ifdef RTE_EAL_PORT_IO
+ if ((dr->drv_flags & RTE_PCI_DRV_PORT_IO) == 0)
+#endif
+ return ret;
} else if (dr->drv_flags & RTE_PCI_DRV_FORCE_UNBIND &&
rte_eal_process_type() == RTE_PROC_PRIMARY) {
/* unbind current driver */
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 8cd2d51..b905532 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -961,6 +961,71 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev)
start, size);
return 0;
}
+
+#ifdef RTE_EAL_PORT_IO
+/* Extract port I/O numbers from /proc/ioports */
+static int virtio_resource_init_by_portio(struct rte_pci_device *pci_dev)
+{
+ uint16_t start, end;
+ int size;
+ FILE *fp;
+ char *line = NULL;
+ char pci_id[16];
+ int found = 0;
+ size_t linesz;
+
+ snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,
+ pci_dev->addr.domain,
+ pci_dev->addr.bus,
+ pci_dev->addr.devid,
+ pci_dev->addr.function);
+
+ fp = fopen("/proc/ioports", "r");
+ if (fp == NULL) {
+ PMD_INIT_LOG(ERR, "%s(): can't open ioports", __func__);
+ return -1;
+ }
+
+ while (getdelim(&line, &linesz, '\n', fp) > 0) {
+ char *ptr = line;
+ char *left;
+ int n;
+
+ n = strcspn(ptr, ":");
+ ptr[n] = 0;
+ left = &ptr[n+1];
+
+ while (*left && isspace(*left))
+ left++;
+
+ if (!strncmp(left, pci_id, strlen(pci_id))) {
+ found = 1;
+
+ while (*ptr && isspace(*ptr))
+ ptr++;
+
+ sscanf(ptr, "%04hx-%04hx", &start, &end);
+ size = end - start + 1;
+
+ break;
+ }
+ }
+
+ free(line);
+ fclose(fp);
+
+ if (!found)
+ return -1;
+
+ pci_dev->mem_resource[0].addr = (void *)(uintptr_t)(uint32_t)start;
+ pci_dev->mem_resource[0].len = (uint64_t)size;
+ PMD_INIT_LOG(DEBUG,
+ "PCI Port IO found start=0x%lx with size=0x%lx",
+ (unsigned long)start, (unsigned long)size);
+ return 0;
+}
+#endif
+
#else
static int
virtio_has_msix(const struct rte_pci_addr *loc __rte_unused)
@@ -974,6 +1039,14 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused)
/* no setup required */
return 0;
}
+
+#ifdef RTE_EAL_PORT_IO
+static int virtio_resource_init_by_portio(struct rte_pci_device *pci_dev)
+{
+ /* no setup required */
+ return 0;
+}
+#endif
#endif

/*
@@ -1039,7 +1112,10 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,

pci_dev = eth_dev->pci_dev;
if (virtio_resource_init(pci_dev) < 0)
- return -1;
+#ifdef RTE_EAL_PORT_IO
+ if (virtio_resource_init_by_portio(pci_dev) < 0)
+#endif
+ return -1;

hw->use_msix = virtio_has_msix(&pci_dev->addr);
hw->io_base = (uint32_t)(uintptr_t)pci_dev->mem_resource[0].addr;
@@ -1132,6 +1208,18 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
return 0;
}

+#ifdef RTE_EAL_PORT_IO
+static struct eth_driver rte_virtio_pmd = {
+ {
+ .name = "rte_virtio_pmd",
+ .id_table = pci_id_virtio_map,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_PORT_IO |
+ RTE_PCI_DRV_INTR_LSC,
+ },
+ .eth_dev_init = eth_virtio_dev_init,
+ .dev_private_size = sizeof(struct virtio_hw),
+};
+#else
static struct eth_driver rte_virtio_pmd = {
{
.name = "rte_virtio_pmd",
@@ -1141,6 +1229,7 @@ static struct eth_driver rte_virtio_pmd = {
.eth_dev_init = eth_virtio_dev_init,
.dev_private_size = sizeof(struct virtio_hw),
};
+#endif

/*
* Driver initialization routine.
--
1.8.4.2
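The /proc/ioports lookup in this patch boils down to matching one line of the form "  c000-c03f : 0000:00:03.0". A standalone sketch of that per-line step follows; `ioports_match` is a hypothetical helper, and unlike the patch it checks the sscanf() return value so that a malformed range is rejected instead of leaving start and end unset.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Check whether one /proc/ioports line belongs to pci_id
 * (e.g. "0000:00:03.0"). On a match, store the port range and
 * return 1; otherwise return 0.
 */
static int
ioports_match(const char *line, const char *pci_id,
              unsigned int *start, unsigned int *end)
{
    const char *colon;
    const char *owner;

    assert(line != NULL && pci_id != NULL);

    /* The first ':' separates the "start-end" range from the owner. */
    colon = strchr(line, ':');
    if (colon == NULL)
        return 0;

    owner = colon + 1;
    while (*owner != '\0' && isspace((unsigned char)*owner))
        owner++;

    if (strncmp(owner, pci_id, strlen(pci_id)) != 0)
        return 0;

    /* %x skips the leading indentation used for nested entries. */
    if (sscanf(line, "%x-%x", start, end) != 2)
        return 0;

    return 1;
}
```

The region size is then `end - start + 1`, as in the patch.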
Ouyang Changchun
2015-01-27 02:36:02 UTC
For consistency with the normal Rx path, the mergeable Rx path also
performs software VLAN strip/decap when it is enabled.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_rxtx.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index a82e8eb..c6d9ae7 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -568,6 +568,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,
uint16_t nb_pkts)
{
struct virtqueue *rxvq = rx_queue;
+ struct virtio_hw *hw = rxvq->hw;
struct rte_mbuf *rxm, *new_mbuf;
uint16_t nb_used, num, nb_rx = 0;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
@@ -674,6 +675,9 @@ virtio_recv_mergeable_pkts(void *rx_queue,
seg_res -= rcv_cnt;
}

+ if (hw->vlan_strip)
+ rte_vlan_strip(rx_pkts[nb_rx]);
+
VIRTIO_DUMP_PACKET(rx_pkts[nb_rx],
rx_pkts[nb_rx]->data_len);
--
1.8.4.2
Matthew Hall
2015-01-27 03:06:12 UTC
Post by Ouyang Changchun
This is the patch set for single virtio implementation.
Why we need single virtio?
============================
A) lib/librte_pmd_virtio(refer as virtio A);
B) virtio_net_pmd by 6wind(refer as virtio B);
C) virtio by Brocade/vyatta(refer as virtio C);
Integrating 3 implementations into one could reduce the maintaining cost and time,
in other hand, user don't need practice their application on 3 variant one by one to see
which one is the best for them;
Thank you so much for this, using virtio drivers in DPDK has been messy and
unpleasant in the past, and you clearly wrote a lot of nice new code to help
improve it all.

Previously I'd reported a bug, where all RTE virtio drivers I tried (A and B,
because I did not know C existed), failed to work with the virtio-net
interfaces exposed in VirtualBox, due to various strange errors, and they all
only worked with the virtio-net interfaces from qemu.

I wanted to find out if we managed to fix this other problem, because I would
really like to use the Vagrant VM deployment tool (https://www.vagrantup.com/)
to distribute my open-source DPDK based application to everyone in the
open source community.

The better the out-of-box experience of practical community-created DPDK-based
real-life example applications similar to mine, the more adoption of DPDK and
better DPDK community we will be able to have as time marches forward.

If we could manage to get it to work in VirtualBox, then I could surely help
do some app-level testing on the new code, if we could see it in a test branch
or test repo somewhere I could access it.

Sincerely,
Matthew Hall
Wiles, Keith
2015-01-27 03:42:00 UTC
Post by Matthew Hall
Post by Ouyang Changchun
This is the patch set for single virtio implementation.
Why we need single virtio?
============================
As we know currently there are at least 3 virtio PMD driver
A) lib/librte_pmd_virtio(refer as virtio A);
B) virtio_net_pmd by 6wind(refer as virtio B);
C) virtio by Brocade/vyatta(refer as virtio C);
Integrating 3 implementations into one could reduce the maintaining cost and time,
in other hand, user don't need practice their application on 3 variant one by one to see
which one is the best for them;
Thank you so much for this, using virtio drivers in DPDK has been messy and
unpleasant in the past, and you clearly wrote a lot of nice new code to help
improve it all.
Previously I'd reported a bug, where all RTE virtio drivers I tried (A and B,
because I did not know C existed), failed to work with the virtio-net
interfaces exposed in VirtualBox, due to various strange errors, and they all
only worked with the virtio-net interfaces from qemu.
I wanted to find out if we managed to fix this other problem, because I would
really like to use the Vagrant VM deployment tool
(https://www.vagrantup.com/)
to distribute my open-source DPDK based application to everyone in the
open source community.
The better the out-of-box experience of practical community-created DPDK-based
real-life example applications similar to mine, the more adoption of DPDK and
better DPDK community we will be able to have as time marches forward.
If we could manage to get it to work in VirtualBox, then I could surely help
do some app-level testing on the new code, if we could see it in a test branch
or test repo somewhere I could access it.
There is an app note on how to get DPDK working in VirtualBox; getting it to
work is a bit bumpy.
Here is the link:
http://plvision.eu/blog/deploying-intel-dpdk-in-oracle-virtualbox/

I have not tried it, but it was suggested to me it should work. It will be
nice if the new driver works better :-)
Post by Matthew Hall
Sincerely,
Matthew Hall
Matthew Hall
2015-01-27 09:41:13 UTC
Post by Wiles, Keith
There is an app note on how to get DPDK working in VirtualBox; getting it to
work is a bit bumpy.
http://plvision.eu/blog/deploying-intel-dpdk-in-oracle-virtualbox/
I have not tried it, but it was suggested to me it should work. It will be
nice if the new driver works better :-)
I already used a derivative of these directions... "cheated" and used the igb
driver like they did. Unlike them I automated the entire process, including
updating the base OS to latest kernel and recompiling against it, as well as
auto-enabling the NICs, the SSE instruction sets, etc. etc.

However their directions use an IGB NIC not a virtio-net NIC which would be
much better for performance and resource consumption. So I really would be
very very happy if we had a virtio-net which worked properly with both qemu
and VirtualBox.

Matthew.
Stephen Hemminger
2015-01-27 10:02:24 UTC
On Mon, 26 Jan 2015 19:06:12 -0800
Post by Matthew Hall
Thank you so much for this, using virtio drivers in DPDK has been messy and
unpleasant in the past, and you clearly wrote a lot of nice new code to help
improve it all.
Previously I'd reported a bug, where all RTE virtio drivers I tried (A and B,
because I did not know C existed), failed to work with the virtio-net
interfaces exposed in VirtualBox, due to various strange errors, and they all
only worked with the virtio-net interfaces from qemu.
I suspect a problem with features required (and not supported by VirtualBox).
Build driver with debug enabled and send the log please.
Matthew Hall
2015-01-27 18:59:54 UTC
Post by Stephen Hemminger
On Mon, 26 Jan 2015 19:06:12 -0800
Post by Matthew Hall
Thank you so much for this, using virtio drivers in DPDK has been messy and
unpleasant in the past, and you clearly wrote a lot of nice new code to help
improve it all.
Previously I'd reported a bug, where all RTE virtio drivers I tried (A and B,
because I did not know C existed), failed to work with the virtio-net
interfaces exposed in VirtualBox, due to various strange errors, and they all
only worked with the virtio-net interfaces from qemu.
I suspect a problem with features required (and not supported by VirtualBox).
Build driver with debug enabled and send the log please.
Hi Stephen,

Here is everything that happened when I tried it before.

http://dpdk.org/ml/archives/dev/2014-October/006623.html

Matthew.
Ouyang Changchun
2015-01-29 07:23:45 UTC
For clarity, move the setup of PCI resources for Linux into a function rather
than keeping it as a block of code #ifdef'd in the middle of dev_init.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 76 ++++++++++++++++++++---------------
1 file changed, 43 insertions(+), 33 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index b3b5bb6..662a49c 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -794,6 +794,41 @@ virtio_has_msix(const struct rte_pci_addr *loc)

return (d != NULL);
}
+
+/* Extract I/O port numbers from sysfs */
+static int virtio_resource_init(struct rte_pci_device *pci_dev)
+{
+ char dirname[PATH_MAX];
+ char filename[PATH_MAX];
+ unsigned long start, size;
+
+ if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname)) < 0)
+ return -1;
+
+ /* get portio size */
+ snprintf(filename, sizeof(filename),
+ "%s/portio/port0/size", dirname);
+ if (parse_sysfs_value(filename, &size) < 0) {
+ PMD_INIT_LOG(ERR, "%s(): cannot parse size",
+ __func__);
+ return -1;
+ }
+
+ /* get portio start */
+ snprintf(filename, sizeof(filename),
+ "%s/portio/port0/start", dirname);
+ if (parse_sysfs_value(filename, &start) < 0) {
+ PMD_INIT_LOG(ERR, "%s(): cannot parse portio start",
+ __func__);
+ return -1;
+ }
+ pci_dev->mem_resource[0].addr = (void *)(uintptr_t)start;
+ pci_dev->mem_resource[0].len = (uint64_t)size;
+ PMD_INIT_LOG(DEBUG,
+ "PCI Port IO found start=0x%lx with size=0x%lx",
+ start, size);
+ return 0;
+}
#else
static int
virtio_has_msix(const struct rte_pci_addr *loc __rte_unused)
@@ -801,6 +836,12 @@ virtio_has_msix(const struct rte_pci_addr *loc __rte_unused)
/* nic_uio does not enable interrupts, return 0 (false). */
return 0;
}
+
+static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused)
+{
+ /* no setup required */
+ return 0;
+}
#endif

/*
@@ -831,40 +872,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
return 0;

pci_dev = eth_dev->pci_dev;
+ if (virtio_resource_init(pci_dev) < 0)
+ return -1;

-#ifdef RTE_EXEC_ENV_LINUXAPP
- {
- char dirname[PATH_MAX];
- char filename[PATH_MAX];
- unsigned long start, size;
-
- if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname)) < 0)
- return -1;
-
- /* get portio size */
- snprintf(filename, sizeof(filename),
- "%s/portio/port0/size", dirname);
- if (parse_sysfs_value(filename, &size) < 0) {
- PMD_INIT_LOG(ERR, "%s(): cannot parse size",
- __func__);
- return -1;
- }
-
- /* get portio start */
- snprintf(filename, sizeof(filename),
- "%s/portio/port0/start", dirname);
- if (parse_sysfs_value(filename, &start) < 0) {
- PMD_INIT_LOG(ERR, "%s(): cannot parse portio start",
- __func__);
- return -1;
- }
- pci_dev->mem_resource[0].addr = (void *)(uintptr_t)start;
- pci_dev->mem_resource[0].len = (uint64_t)size;
- PMD_INIT_LOG(DEBUG,
- "PCI Port IO found start=0x%lx with size=0x%lx",
- start, size);
- }
-#endif
hw->use_msix = virtio_has_msix(&pci_dev->addr);
hw->io_base = (uint32_t)(uintptr_t)pci_dev->mem_resource[0].addr;
--
1.8.4.2
Ouyang Changchun
2015-01-29 07:23:48 UTC
Virtio provides a link state interrupt, so use it.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 78 +++++++++++++++++++++++++++--------
lib/librte_pmd_virtio/virtio_pci.c | 22 ++++++++++
lib/librte_pmd_virtio/virtio_pci.h | 4 ++
3 files changed, 86 insertions(+), 18 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 5df3b54..ef87ff8 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -845,6 +845,34 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused)
#endif

/*
+ * Process Virtio Config changed interrupt and call the callback
+ * if link state changed.
+ */
+static void
+virtio_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+ void *param)
+{
+ struct rte_eth_dev *dev = param;
+ struct virtio_hw *hw =
+ VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ uint8_t isr;
+
+ /* Read interrupt status which clears interrupt */
+ isr = vtpci_isr(hw);
+ PMD_DRV_LOG(INFO, "interrupt status = %#x", isr);
+
+ if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0)
+ PMD_DRV_LOG(ERR, "interrupt enable failed");
+
+ if (isr & VIRTIO_PCI_ISR_CONFIG) {
+ if (virtio_dev_link_update(dev, 0) == 0)
+ _rte_eth_dev_callback_process(dev,
+ RTE_ETH_EVENT_INTR_LSC);
+ }
+
+}
+
+/*
* This function is based on probe() function in virtio_pci.c
* It returns 0 on success.
*/
@@ -968,6 +996,10 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x",
eth_dev->data->port_id, pci_dev->id.vendor_id,
pci_dev->id.device_id);
+
+ /* Setup interrupt callback */
+ rte_intr_callback_register(&pci_dev->intr_handle,
+ virtio_interrupt_handler, eth_dev);
return 0;
}

@@ -975,7 +1007,7 @@ static struct eth_driver rte_virtio_pmd = {
{
.name = "rte_virtio_pmd",
.id_table = pci_id_virtio_map,
- .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
},
.eth_dev_init = eth_virtio_dev_init,
.dev_private_size = sizeof(struct virtio_adapter),
@@ -1021,6 +1053,9 @@ static int
virtio_dev_configure(struct rte_eth_dev *dev)
{
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
+ struct virtio_hw *hw =
+ VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ int ret;

PMD_INIT_LOG(DEBUG, "configure");

@@ -1029,7 +1064,11 @@ virtio_dev_configure(struct rte_eth_dev *dev)
return (-EINVAL);
}

- return 0;
+ ret = vtpci_irq_config(hw, 0);
+ if (ret != 0)
+ PMD_DRV_LOG(ERR, "failed to set config vector");
+
+ return ret;
}


@@ -1037,7 +1076,6 @@ static int
virtio_dev_start(struct rte_eth_dev *dev)
{
uint16_t nb_queues, i;
- uint16_t status;
struct virtio_hw *hw =
VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);

@@ -1052,18 +1090,22 @@ virtio_dev_start(struct rte_eth_dev *dev)
/* Do final configuration before rx/tx engine starts */
virtio_dev_rxtx_start(dev);

- /* Check VIRTIO_NET_F_STATUS for link status*/
- if (vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
- vtpci_read_dev_config(hw,
- offsetof(struct virtio_net_config, status),
- &status, sizeof(status));
- if ((status & VIRTIO_NET_S_LINK_UP) == 0)
- PMD_INIT_LOG(ERR, "Port: %d Link is DOWN",
- dev->data->port_id);
- else
- PMD_INIT_LOG(DEBUG, "Port: %d Link is UP",
- dev->data->port_id);
+ /* check if lsc interrupt feature is enabled */
+ if (dev->data->dev_conf.intr_conf.lsc) {
+ if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
+ PMD_DRV_LOG(ERR, "link status not supported by host");
+ return -ENOTSUP;
+ }
+
+ if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0) {
+ PMD_DRV_LOG(ERR, "interrupt enable failed");
+ return -EIO;
+ }
}
+
+ /* Initialize Link state */
+ virtio_dev_link_update(dev, 0);
+
vtpci_reinit_complete(hw);

/*Notify the backend
@@ -1145,6 +1187,7 @@ virtio_dev_stop(struct rte_eth_dev *dev)
VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);

/* reset the NIC */
+ vtpci_irq_config(hw, 0);
vtpci_reset(hw);
virtio_dev_free_mbufs(dev);
}
@@ -1161,6 +1204,7 @@ virtio_dev_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complet
old = link;
link.link_duplex = FULL_DUPLEX;
link.link_speed = SPEED_10G;
+
if (vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
PMD_INIT_LOG(DEBUG, "Get link status from hw");
vtpci_read_dev_config(hw,
@@ -1179,10 +1223,8 @@ virtio_dev_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complet
link.link_status = 1; /* Link up */
}
virtio_dev_atomic_write_link_status(dev, &link);
- if (old.link_status == link.link_status)
- return -1;
- /*changed*/
- return 0;
+
+ return (old.link_status == link.link_status) ? -1 : 0;
}

static void
diff --git a/lib/librte_pmd_virtio/virtio_pci.c b/lib/librte_pmd_virtio/virtio_pci.c
index ca9c748..6d51032 100644
--- a/lib/librte_pmd_virtio/virtio_pci.c
+++ b/lib/librte_pmd_virtio/virtio_pci.c
@@ -127,3 +127,25 @@ vtpci_set_status(struct virtio_hw *hw, uint8_t status)

VIRTIO_WRITE_REG_1(hw, VIRTIO_PCI_STATUS, status);
}
+
+uint8_t
+vtpci_isr(struct virtio_hw *hw)
+{
+
+ return VIRTIO_READ_REG_1(hw, VIRTIO_PCI_ISR);
+}
+
+
+/* Enable one vector (0) for Link State Interrupt */
+int
+vtpci_irq_config(struct virtio_hw *hw, uint16_t vec)
+{
+ VIRTIO_WRITE_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR, vec);
+ vec = VIRTIO_READ_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR);
+ if (vec == VIRTIO_MSI_NO_VECTOR) {
+ PMD_DRV_LOG(ERR, "failed to set config vector");
+ return -EBUSY;
+ }
+
+ return 0;
+}
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 373f9dc..6998737 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -263,4 +263,8 @@ void vtpci_write_dev_config(struct virtio_hw *, uint64_t, void *, int);

void vtpci_read_dev_config(struct virtio_hw *, uint64_t, void *, int);

+uint8_t vtpci_isr(struct virtio_hw *);
+
+int vtpci_irq_config(struct virtio_hw *, uint16_t);
+
#endif /* _VIRTIO_PCI_H_ */
--
1.8.4.2
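The link-state handling in the patch above can be sketched in isolation. This is a minimal model, not the driver code: the `sketch_` names and the `lsc_events` counter are hypothetical, and the two bit values are the legacy virtio ones (ISR bit 1 for a config change, status bit 0 for link up). It shows the convention the patch relies on: the link update returns 0 only when the state changed, and the callback fires only in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy virtio ISR bit that signals a configuration change. */
#define SKETCH_ISR_CONFIG    0x2
/* Bit in virtio_net_config.status that means "link up". */
#define SKETCH_NET_S_LINK_UP 0x1

/* Hypothetical per-port state, standing in for rte_eth_dev. */
struct sketch_port {
	uint8_t link_status;  /* last link state published to the app */
	unsigned lsc_events;  /* number of link-state callbacks fired */
};

/* Returns 0 if the link state changed, -1 if it did not. */
static int sketch_link_update(struct sketch_port *p, uint16_t dev_status)
{
	uint8_t old;

	assert(p != NULL);
	old = p->link_status;
	p->link_status = (dev_status & SKETCH_NET_S_LINK_UP) ? 1 : 0;
	return (old == p->link_status) ? -1 : 0;
}

/* Models the handler: act only on config-change interrupts. */
static void sketch_interrupt(struct sketch_port *p, uint8_t isr,
			     uint16_t dev_status)
{
	if (isr & SKETCH_ISR_CONFIG) {
		if (sketch_link_update(p, dev_status) == 0)
			p->lsc_events++;  /* stands in for the LSC callback */
	}
}
```

A repeated interrupt with an unchanged status word does not count as an event, which is why the handler checks the return value before invoking the callback.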
Ouyang Changchun
2015-01-29 07:23:47 UTC
Starting the driver with the link down should be OK, as it is with
every other driver, so just allow it.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index dc47e72..5df3b54 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -1057,14 +1057,12 @@ virtio_dev_start(struct rte_eth_dev *dev)
vtpci_read_dev_config(hw,
offsetof(struct virtio_net_config, status),
&status, sizeof(status));
- if ((status & VIRTIO_NET_S_LINK_UP) == 0) {
+ if ((status & VIRTIO_NET_S_LINK_UP) == 0)
PMD_INIT_LOG(ERR, "Port: %d Link is DOWN",
dev->data->port_id);
- return -EIO;
- } else {
+ else
PMD_INIT_LOG(DEBUG, "Port: %d Link is UP",
dev->data->port_id);
- }
}
vtpci_reinit_complete(hw);
--
1.8.4.2
Ouyang Changchun
2015-01-29 07:23:49 UTC
It is helpful to allow device drivers that don't support hardware
VLAN stripping to emulate it in software. This allows applications
to be device independent.

Avoid discarding shared mbufs. Make a copy in rte_vlan_insert() of any
packet to be tagged that has a reference count > 1.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_ether/rte_ether.h | 76 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 76 insertions(+)

diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
index 7e7d22c..74f71c2 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -49,6 +49,8 @@ extern "C" {

#include <rte_memcpy.h>
#include <rte_random.h>
+#include <rte_mbuf.h>
+#include <rte_byteorder.h>

#define ETHER_ADDR_LEN 6 /**< Length of Ethernet address. */
#define ETHER_TYPE_LEN 2 /**< Length of Ethernet type field. */
@@ -333,6 +335,80 @@ struct vxlan_hdr {
#define ETHER_VXLAN_HLEN (sizeof(struct udp_hdr) + sizeof(struct vxlan_hdr))
/**< VXLAN tunnel header length. */

+/**
+ * Extract VLAN tag information into mbuf
+ *
+ * Software version of VLAN stripping
+ *
+ * @param m
+ * The packet mbuf.
+ * @return
+ * - 0: Success
+ * - -1: not a vlan packet
+ */
+static inline int rte_vlan_strip(struct rte_mbuf *m)
+{
+ struct ether_hdr *eh
+ = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+ if (eh->ether_type != rte_cpu_to_be_16(ETHER_TYPE_VLAN))
+ return -1;
+
+ struct vlan_hdr *vh = (struct vlan_hdr *)(eh + 1);
+ m->ol_flags |= PKT_RX_VLAN_PKT;
+ m->vlan_tci = rte_be_to_cpu_16(vh->vlan_tci);
+
+ /* Copy ether header over rather than moving whole packet */
+ memmove(rte_pktmbuf_adj(m, sizeof(struct vlan_hdr)),
+ eh, 2 * ETHER_ADDR_LEN);
+
+ return 0;
+}
+
+/**
+ * Insert VLAN tag into mbuf.
+ *
+ * Software version of VLAN unstripping
+ *
+ * @param m
+ * The packet mbuf.
+ * @return
+ * - 0: On success
+ * - -ENOMEM: mbuf is shared and could not be cloned
+ * - -ENOSPC: not enough headroom in mbuf
+ */
+static inline int rte_vlan_insert(struct rte_mbuf **m)
+{
+ struct ether_hdr *oh, *nh;
+ struct vlan_hdr *vh;
+
+#ifdef RTE_MBUF_REFCNT
+ /* Can't insert header if mbuf is shared */
+ if (rte_mbuf_refcnt_read(*m) > 1) {
+ struct rte_mbuf *copy;
+
+ copy = rte_pktmbuf_clone(*m, (*m)->pool);
+ if (unlikely(copy == NULL))
+ return -ENOMEM;
+ rte_pktmbuf_free(*m);
+ *m = copy;
+ }
+#endif
+ oh = rte_pktmbuf_mtod(*m, struct ether_hdr *);
+ nh = (struct ether_hdr *)
+ rte_pktmbuf_prepend(*m, sizeof(struct vlan_hdr));
+ if (nh == NULL)
+ return -ENOSPC;
+
+ memmove(nh, oh, 2 * ETHER_ADDR_LEN);
+ nh->ether_type = rte_cpu_to_be_16(ETHER_TYPE_VLAN);
+
+ vh = (struct vlan_hdr *) (nh + 1);
+ vh->vlan_tci = rte_cpu_to_be_16((*m)->vlan_tci);
+
+ return 0;
+}
+
#ifdef __cplusplus
}
#endif
--
1.8.4.2
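The header shuffling done by the two helpers above can be tried without DPDK. The following is a sketch on a plain byte buffer rather than an mbuf: all `sk_` names are hypothetical, headroom is modeled as spare bytes in front of the frame, and the EtherType is compared in network byte order. In both directions only the 12 bytes of MAC addresses move, not the payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_ADDR_LEN  6       /* length of one MAC address */
#define SK_TYPE_VLAN 0x8100  /* 802.1Q EtherType, host order */
#define SK_VLAN_HLEN 4       /* TPID/EtherType (2) + TCI (2) */

/* Read and write 16-bit big-endian (network order) fields. */
static uint16_t sk_get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static void sk_put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/*
 * Remove the VLAN tag. Returns the new start of the frame (4 bytes
 * further in) and stores the TCI, or returns NULL if untagged.
 */
static uint8_t *sk_vlan_strip(uint8_t *frame, size_t *len, uint16_t *tci)
{
	assert(frame != NULL && len != NULL && tci != NULL);
	if (*len < 2 * SK_ADDR_LEN + SK_VLAN_HLEN + 2)
		return NULL;
	if (sk_get_be16(frame + 2 * SK_ADDR_LEN) != SK_TYPE_VLAN)
		return NULL;

	*tci = sk_get_be16(frame + 2 * SK_ADDR_LEN + 2);
	/* Slide the two MAC addresses forward over the tag. */
	memmove(frame + SK_VLAN_HLEN, frame, 2 * SK_ADDR_LEN);
	*len -= SK_VLAN_HLEN;
	return frame + SK_VLAN_HLEN;
}

/*
 * Insert a VLAN tag. The caller must guarantee 4 bytes of headroom
 * in front of the frame. Returns the new start of the frame.
 */
static uint8_t *sk_vlan_insert(uint8_t *frame, size_t *len, uint16_t tci)
{
	uint8_t *nf = frame - SK_VLAN_HLEN;

	/* Slide the MAC addresses back into the headroom. */
	memmove(nf, frame, 2 * SK_ADDR_LEN);
	sk_put_be16(nf + 2 * SK_ADDR_LEN, SK_TYPE_VLAN);
	sk_put_be16(nf + 2 * SK_ADDR_LEN + 2, tci);
	*len += SK_VLAN_HLEN;
	return nf;
}
```

Stripping and then re-inserting with the saved TCI restores the original frame, provided the buffer in front of the stripped frame is still available as headroom.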
Ouyang Changchun
2015-01-29 07:23:46 UTC
The DPDK driver only has to deal with the case of running on PCI
and with SMP. In this case, the code can use the weaker barriers
instead of using hard (fence) barriers. This will help performance.
The rationale is explained in Linux kernel virtio_ring.h.

To make it clearer that this is a virtio thing and not some generic
barrier, prefix the barrier calls with virtio_.

Add missing (and needed) barrier between updating ring data
structure and notifying host.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 2 +-
lib/librte_pmd_virtio/virtio_rxtx.c | 8 +++++---
lib/librte_pmd_virtio/virtqueue.h | 19 ++++++++++++++-----
3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 662a49c..dc47e72 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -175,7 +175,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
uint32_t idx, desc_idx, used_idx;
struct vring_used_elem *uep;

- rmb();
+ virtio_rmb();

used_idx = (uint32_t)(vq->vq_used_cons_idx
& (vq->vq_nentries - 1));
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index c013f97..78af334 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -456,7 +456,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)

nb_used = VIRTQUEUE_NUSED(rxvq);

- rmb();
+ virtio_rmb();

num = (uint16_t)(likely(nb_used <= nb_pkts) ? nb_used : nb_pkts);
num = (uint16_t)(likely(num <= VIRTIO_MBUF_BURST_SZ) ? num : VIRTIO_MBUF_BURST_SZ);
@@ -516,6 +516,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
}

if (likely(nb_enqueued)) {
+ virtio_wmb();
if (unlikely(virtqueue_kick_prepare(rxvq))) {
virtqueue_notify(rxvq);
PMD_RX_LOG(DEBUG, "Notified\n");
@@ -547,7 +548,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,

nb_used = VIRTQUEUE_NUSED(rxvq);

- rmb();
+ virtio_rmb();

if (nb_used == 0)
return 0;
@@ -694,7 +695,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
PMD_TX_LOG(DEBUG, "%d packets to xmit", nb_pkts);
nb_used = VIRTQUEUE_NUSED(txvq);

- rmb();
+ virtio_rmb();

num = (uint16_t)(likely(nb_used < VIRTIO_MBUF_BURST_SZ) ? nb_used : VIRTIO_MBUF_BURST_SZ);

@@ -735,6 +736,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
}
}
vq_update_avail_idx(txvq);
+ virtio_wmb();

txvq->packets += nb_tx;

diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index fdee054..f6ad98d 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -46,9 +46,18 @@
#include "virtio_ring.h"
#include "virtio_logs.h"

-#define mb() rte_mb()
-#define wmb() rte_wmb()
-#define rmb() rte_rmb()
+/*
+ * Per virtio_ring.h in Linux.
+ * For virtio_pci on SMP, we don't need to order with respect to MMIO
+ * accesses through relaxed memory I/O windows, so smp_mb() et al are
+ * sufficient.
+ *
+ * This driver is for virtio_pci on SMP and therefore can use
+ * weaker (compiler-only) barriers.
+ */
+#define virtio_mb() rte_mb()
+#define virtio_rmb() rte_compiler_barrier()
+#define virtio_wmb() rte_compiler_barrier()

#ifdef RTE_PMD_PACKET_PREFETCH
#define rte_packet_prefetch(p) rte_prefetch1(p)
@@ -225,7 +234,7 @@ virtqueue_full(const struct virtqueue *vq)
static inline void
vq_update_avail_idx(struct virtqueue *vq)
{
- rte_compiler_barrier();
+ virtio_wmb();
vq->vq_ring.avail->idx = vq->vq_avail_idx;
}

@@ -255,7 +264,7 @@ static inline void
virtqueue_notify(struct virtqueue *vq)
{
/*
- * Ensure updated avail->idx is visible to host. mb() necessary?
+ * Ensure updated avail->idx is visible to host.
* For virtio on IA, the notification is through io port operation
* which is a serialization instruction itself.
*/
--
1.8.4.2
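The ordering that the barrier patch above depends on can be illustrated with a toy ring. This is a sketch under stated assumptions, not the virtqueue code: the `sk_` names are hypothetical, C11 `atomic_signal_fence()` is used as a stand-in for `rte_compiler_barrier()`, and a compiler-only barrier is sufficient only on a strongly ordered CPU such as x86. On a weakly ordered CPU real fences would be needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SK_RING_SIZE 8  /* must be a power of two */

/* Hypothetical single-producer, single-consumer ring. */
struct sk_ring {
	uint32_t slots[SK_RING_SIZE];
	volatile uint16_t avail_idx;  /* written by the producer */
	uint16_t used_cons_idx;       /* private to the consumer */
};

/* Compiler-only barrier: no CPU fence instruction is emitted. */
static inline void sk_compiler_barrier(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/* Fill the slot first, then publish the index. */
static void sk_publish(struct sk_ring *r, uint32_t v)
{
	assert((uint16_t)(r->avail_idx - r->used_cons_idx) < SK_RING_SIZE);
	r->slots[r->avail_idx & (SK_RING_SIZE - 1)] = v;
	sk_compiler_barrier();  /* plays the role of virtio_wmb() */
	r->avail_idx = (uint16_t)(r->avail_idx + 1);
}

/* Read the index first, then the slot it covers. */
static int sk_consume(struct sk_ring *r, uint32_t *out)
{
	if (r->used_cons_idx == r->avail_idx)
		return -1;          /* nothing published yet */
	sk_compiler_barrier();  /* plays the role of virtio_rmb() */
	*out = r->slots[r->used_cons_idx & (SK_RING_SIZE - 1)];
	r->used_cons_idx++;
	return 0;
}
```

The barrier sits between the data access and the index access on both sides, so the compiler cannot reorder the slot write after the index update, or the slot read before the index check.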
Ouyang Changchun
2015-01-29 07:23:50 UTC
Implement VLAN stripping (on receive) and tag insertion (on transmit)
in software. This allows applications to be device independent.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_ether/rte_ethdev.h | 3 +++
lib/librte_pmd_virtio/virtio_ethdev.c | 2 ++
lib/librte_pmd_virtio/virtio_pci.h | 1 +
lib/librte_pmd_virtio/virtio_rxtx.c | 20 ++++++++++++++++++--
4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 1200c1c..94d6b2b 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -643,6 +643,9 @@ struct rte_eth_rxconf {
#define ETH_TXQ_FLAGS_NOOFFLOADS \
(ETH_TXQ_FLAGS_NOVLANOFFL | ETH_TXQ_FLAGS_NOXSUMSCTP | \
ETH_TXQ_FLAGS_NOXSUMUDP | ETH_TXQ_FLAGS_NOXSUMTCP)
+#define ETH_TXQ_FLAGS_NOXSUMS \
+ (ETH_TXQ_FLAGS_NOXSUMSCTP | ETH_TXQ_FLAGS_NOXSUMUDP | \
+ ETH_TXQ_FLAGS_NOXSUMTCP)
/**
* A structure used to configure a TX ring of an Ethernet port.
*/
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index ef87ff8..da74659 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -1064,6 +1064,8 @@ virtio_dev_configure(struct rte_eth_dev *dev)
return (-EINVAL);
}

+ hw->vlan_strip = rxmode->hw_vlan_strip;
+
ret = vtpci_irq_config(hw, 0);
if (ret != 0)
PMD_DRV_LOG(ERR, "failed to set config vector");
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 6998737..6d93fac 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -168,6 +168,7 @@ struct virtio_hw {
uint32_t max_tx_queues;
uint32_t max_rx_queues;
uint16_t vtnet_hdr_size;
+ uint8_t vlan_strip;
uint8_t use_msix;
uint8_t mac_addr[ETHER_ADDR_LEN];
};
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index 78af334..e0216ec 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -49,6 +49,7 @@
#include <rte_prefetch.h>
#include <rte_string_fns.h>
#include <rte_errno.h>
+#include <rte_byteorder.h>

#include "virtio_logs.h"
#include "virtio_ethdev.h"
@@ -408,8 +409,8 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,

PMD_INIT_FUNC_TRACE();

- if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOOFFLOADS)
- != ETH_TXQ_FLAGS_NOOFFLOADS) {
+ if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOXSUMS)
+ != ETH_TXQ_FLAGS_NOXSUMS) {
PMD_INIT_LOG(ERR, "TX checksum offload not supported\n");
return -EINVAL;
}
@@ -446,6 +447,7 @@ uint16_t
virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
struct virtqueue *rxvq = rx_queue;
+ struct virtio_hw *hw = rxvq->hw;
struct rte_mbuf *rxm, *new_mbuf;
uint16_t nb_used, num, nb_rx = 0;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
@@ -489,6 +491,9 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
rxm->pkt_len = (uint32_t)(len[i] - hdr_size);
rxm->data_len = (uint16_t)(len[i] - hdr_size);

+ if (hw->vlan_strip)
+ rte_vlan_strip(rxm);
+
VIRTIO_DUMP_PACKET(rxm, rxm->data_len);

rx_pkts[nb_rx++] = rxm;
@@ -717,6 +722,17 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
*/
if (likely(need <= 0)) {
txm = tx_pkts[nb_tx];
+
+ /* Do VLAN tag insertion */
+ if (txm->ol_flags & PKT_TX_VLAN_PKT) {
+ error = rte_vlan_insert(&txm);
+ if (unlikely(error)) {
+ rte_pktmbuf_free(txm);
+ ++nb_tx;
+ continue;
+ }
+ }
+
/* Enqueue Packet buffers */
error = virtqueue_enqueue_xmit(txvq, txm);
if (unlikely(error)) {
--
1.8.4.2
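The change to the TX queue flag check in the patch above is easy to miss, so here is a small sketch of it. The `SK_` flag values are illustrative assumptions, not taken from rte_ethdev.h. The point is that the narrower mask requires the application to disable checksum offloads only, so a queue that leaves VLAN offload enabled is no longer rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bits, modeled on the ETH_TXQ_FLAGS_* names. */
#define SK_NOVLANOFFL 0x0100
#define SK_NOXSUMSCTP 0x0200
#define SK_NOXSUMUDP  0x0400
#define SK_NOXSUMTCP  0x0800

/* Mask covering only the checksum-related "no offload" flags. */
#define SK_NOXSUMS (SK_NOXSUMSCTP | SK_NOXSUMUDP | SK_NOXSUMTCP)

static_assert((SK_NOXSUMS & SK_NOVLANOFFL) == 0,
	      "checksum mask must not include the VLAN flag");

/*
 * Accept the queue only if every checksum offload is disabled.
 * The VLAN flag is deliberately not part of the test.
 */
static int sk_tx_queue_check(uint32_t txq_flags)
{
	if ((txq_flags & SK_NOXSUMS) != SK_NOXSUMS)
		return -EINVAL;
	return 0;
}
```

With the previous, wider mask, a queue configured with only the three checksum flags would have failed the check because the VLAN flag was missing.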
Ouyang Changchun
2015-01-29 07:23:53 UTC
Change the order of initialization to match the Linux kernel.
Don't blow away the control queue by doing a reset when stopped.

Calling dev_stop then dev_start would not work.
Dev_stop was calling virtio reset and that would clear all queues
and clear all feature negotiation.
Resolved by only doing reset on device removal.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 58 ++++++++++++++++++++---------------
lib/librte_pmd_virtio/virtio_pci.c | 10 ++----
lib/librte_pmd_virtio/virtio_pci.h | 3 +-
3 files changed, 37 insertions(+), 34 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 0d41e7f..47dd33d 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -398,9 +398,14 @@ virtio_dev_cq_queue_setup(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx,
static void
virtio_dev_close(struct rte_eth_dev *dev)
{
+ struct virtio_hw *hw = dev->data->dev_private;
+
PMD_INIT_LOG(DEBUG, "virtio_dev_close");

- virtio_dev_stop(dev);
+ /* reset the NIC */
+ vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
+ vtpci_reset(hw);
+ virtio_dev_free_mbufs(dev);
}

static void
@@ -889,6 +894,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
if (rte_eal_process_type() == RTE_PROC_SECONDARY)
return 0;

+ /* Tell the host we've noticed this device. */
+ vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
+
pci_dev = eth_dev->pci_dev;
if (virtio_resource_init(pci_dev) < 0)
return -1;
@@ -899,9 +907,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
/* Reset the device although not necessary at startup */
vtpci_reset(hw);

- /* Tell the host we've noticed this device. */
- vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
-
/* Tell the host we've known how to drive the device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
virtio_negotiate_features(hw);
@@ -990,6 +995,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
/* Setup interrupt callback */
rte_intr_callback_register(&pci_dev->intr_handle,
virtio_interrupt_handler, eth_dev);
+
+ virtio_dev_cq_start(eth_dev);
+
return 0;
}

@@ -1044,7 +1052,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
{
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
struct virtio_hw *hw = dev->data->dev_private;
- int ret;

PMD_INIT_LOG(DEBUG, "configure");

@@ -1055,11 +1062,12 @@ virtio_dev_configure(struct rte_eth_dev *dev)

hw->vlan_strip = rxmode->hw_vlan_strip;

- ret = vtpci_irq_config(hw, 0);
- if (ret != 0)
+ if (vtpci_irq_config(hw, 0) == VIRTIO_MSI_NO_VECTOR) {
PMD_DRV_LOG(ERR, "failed to set config vector");
+ return -EBUSY;
+ }

- return ret;
+ return 0;
}


@@ -1069,17 +1077,6 @@ virtio_dev_start(struct rte_eth_dev *dev)
uint16_t nb_queues, i;
struct virtio_hw *hw = dev->data->dev_private;

- /* Tell the host we've noticed this device. */
- vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
-
- /* Tell the host we've known how to drive the device. */
- vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
-
- virtio_dev_cq_start(dev);
-
- /* Do final configuration before rx/tx engine starts */
- virtio_dev_rxtx_start(dev);
-
/* check if lsc interrupt feature is enabled */
if (dev->data->dev_conf.intr_conf.lsc) {
if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
@@ -1096,8 +1093,16 @@ virtio_dev_start(struct rte_eth_dev *dev)
/* Initialize Link state */
virtio_dev_link_update(dev, 0);

+ /* On restart after stop do not touch queues */
+ if (hw->started)
+ return 0;
+
vtpci_reinit_complete(hw);

+ /* Do final configuration before rx/tx engine starts */
+ virtio_dev_rxtx_start(dev);
+ hw->started = 1;
+
/*Notify the backend
*Otherwise the tap backend might already stop its queue due to fullness.
*vhost backend will have no chance to be waked up
@@ -1168,17 +1173,20 @@ static void virtio_dev_free_mbufs(struct rte_eth_dev *dev)
}

/*
- * Stop device: disable rx and tx functions to allow for reconfiguring.
+ * Stop device: disable interrupt and mark link down
*/
static void
virtio_dev_stop(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw = dev->data->dev_private;
+ struct rte_eth_link link;

- /* reset the NIC */
- vtpci_irq_config(hw, 0);
- vtpci_reset(hw);
- virtio_dev_free_mbufs(dev);
+ PMD_INIT_LOG(DEBUG, "stop");
+
+ if (dev->data->dev_conf.intr_conf.lsc)
+ rte_intr_disable(&dev->pci_dev->intr_handle);
+
+ memset(&link, 0, sizeof(link));
+ virtio_dev_atomic_write_link_status(dev, &link);
}

static int
diff --git a/lib/librte_pmd_virtio/virtio_pci.c b/lib/librte_pmd_virtio/virtio_pci.c
index 6d51032..b099e4f 100644
--- a/lib/librte_pmd_virtio/virtio_pci.c
+++ b/lib/librte_pmd_virtio/virtio_pci.c
@@ -137,15 +137,9 @@ vtpci_isr(struct virtio_hw *hw)


/* Enable one vector (0) for Link State Interrupt */
-int
+uint16_t
vtpci_irq_config(struct virtio_hw *hw, uint16_t vec)
{
VIRTIO_WRITE_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR, vec);
- vec = VIRTIO_READ_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR);
- if (vec == VIRTIO_MSI_NO_VECTOR) {
- PMD_DRV_LOG(ERR, "failed to set config vector");
- return -EBUSY;
- }
-
- return 0;
+ return VIRTIO_READ_REG_2(hw, VIRTIO_MSI_CONFIG_VECTOR);
}
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 6d93fac..0a4b578 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -170,6 +170,7 @@ struct virtio_hw {
uint16_t vtnet_hdr_size;
uint8_t vlan_strip;
uint8_t use_msix;
+ uint8_t started;
uint8_t mac_addr[ETHER_ADDR_LEN];
};

@@ -266,6 +267,6 @@ void vtpci_read_dev_config(struct virtio_hw *, uint64_t, void *, int);

uint8_t vtpci_isr(struct virtio_hw *);

-int vtpci_irq_config(struct virtio_hw *, uint16_t);
+uint16_t vtpci_irq_config(struct virtio_hw *, uint16_t);

#endif /* _VIRTIO_PCI_H_ */
--
1.8.4.2
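The stop/start behaviour introduced by the patch above can be modeled as a small state machine. This is a sketch with hypothetical `sk_` names and counters. Setting `link_up` directly is a simplification of reading the link state from the device. It shows that queues are set up only on the first start, that stop merely reports link down, and that the reset happens only on close.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device state, standing in for struct virtio_hw. */
struct sk_hw {
	uint8_t started;       /* mirrors the new hw->started flag */
	uint8_t link_up;       /* link state reported to the app */
	unsigned queue_inits;  /* times the RX/TX rings were set up */
	unsigned resets;       /* times the device was reset */
};

static int sk_dev_start(struct sk_hw *hw)
{
	assert(hw != NULL);
	hw->link_up = 1;    /* stands in for the link update */
	if (hw->started)
		return 0;       /* restart after stop: leave queues alone */
	hw->queue_inits++;  /* stands in for the RX/TX ring setup */
	hw->started = 1;
	return 0;
}

static void sk_dev_stop(struct sk_hw *hw)
{
	hw->link_up = 0;    /* report link down, but do not reset */
}

static void sk_dev_close(struct sk_hw *hw)
{
	hw->resets++;       /* the reset now happens only here */
}
```

A start, stop, start sequence therefore leaves the rings and the negotiated features untouched, which is what makes the restart work.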
Ouyang Changchun
2015-01-29 07:23:55 UTC
Check that the mbuf headroom can hold the virtio net header at
compile time rather than failing at runtime.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 47dd33d..9679c2f 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -882,11 +882,7 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
uint32_t offset_conf = sizeof(config->mac);
struct rte_pci_device *pci_dev;

- if (RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr)) {
- PMD_INIT_LOG(ERR,
- "MBUF HEADROOM should be enough to hold virtio net hdr\n");
- return -1;
- }
+ RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr));

eth_dev->dev_ops = &virtio_eth_dev_ops;
eth_dev->tx_pkt_burst = &virtio_xmit_pkts;
--
1.8.4.2
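The compile-time check in the patch above can be reproduced outside DPDK. In this sketch C11 `static_assert` is swapped in for `RTE_BUILD_BUG_ON`, the `sk_` names are hypothetical, the headroom value of 128 is an assumed configuration, and the struct follows the 10-byte legacy virtio-net header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for RTE_PKTMBUF_HEADROOM. */
#define SK_PKTMBUF_HEADROOM 128

/* Field layout of the legacy virtio-net header (10 bytes). */
struct sk_virtio_net_hdr {
	uint8_t flags;
	uint8_t gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* The build stops here if the headroom cannot hold the header. */
static_assert(SK_PKTMBUF_HEADROOM >= sizeof(struct sk_virtio_net_hdr),
	      "mbuf headroom too small for virtio net header");

/* Headroom left for other uses once the header is prepended. */
static size_t sk_spare_headroom(void)
{
	return SK_PKTMBUF_HEADROOM - sizeof(struct sk_virtio_net_hdr);
}
```

Lowering `SK_PKTMBUF_HEADROOM` below 10 turns a misconfiguration into a build error instead of an init failure on every port.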
Ouyang Changchun
2015-01-29 07:23:52 UTC
Since vq_alignment is constant (always 4K), it does not
need to be part of the vring struct.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 1 -
lib/librte_pmd_virtio/virtio_rxtx.c | 2 +-
lib/librte_pmd_virtio/virtqueue.h | 3 +--
3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 59b74b7..0d41e7f 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -294,7 +294,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
vq->port_id = dev->data->port_id;
vq->queue_id = queue_idx;
vq->vq_queue_index = vtpci_queue_idx;
- vq->vq_alignment = VIRTIO_PCI_VRING_ALIGN;
vq->vq_nentries = vq_size;
vq->vq_free_cnt = vq_size;

diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index a82d5ff..b6d6832 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -258,7 +258,7 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)
* Reinitialise since virtio port might have been stopped and restarted
*/
memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size);
- vring_init(vr, size, ring_mem, vq->vq_alignment);
+ vring_init(vr, size, ring_mem, VIRTIO_PCI_VRING_ALIGN);
vq->vq_used_cons_idx = 0;
vq->vq_desc_head_idx = 0;
vq->vq_avail_idx = 0;
diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index f6ad98d..5b8a255 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -138,8 +138,7 @@ struct virtqueue {
uint8_t port_id; /**< Device port identifier. */

void *vq_ring_virt_mem; /**< linear address of vring*/
- int vq_alignment;
- int vq_ring_size;
+ unsigned int vq_ring_size;
phys_addr_t vq_ring_mem; /**< physical address of vring */

struct vring vq_ring; /**< vring keeping desc, used and avail */
--
1.8.4.2
Ouyang Changchun
2015-01-29 07:23:56 UTC
If allocation fails, we don't want to leave the virtio device stuck
in the middle of its initialization sequence.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 9679c2f..39b1fb4 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -890,6 +890,15 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
if (rte_eal_process_type() == RTE_PROC_SECONDARY)
return 0;

+ /* Allocate memory for storing MAC addresses */
+ eth_dev->data->mac_addrs = rte_zmalloc("virtio", ETHER_ADDR_LEN, 0);
+ if (eth_dev->data->mac_addrs == NULL) {
+ PMD_INIT_LOG(ERR,
+ "Failed to allocate %d bytes needed to store MAC addresses",
+ ETHER_ADDR_LEN);
+ return -ENOMEM;
+ }
+
/* Tell the host we've noticed this device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);

@@ -916,15 +925,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr);
}

- /* Allocate memory for storing MAC addresses */
- eth_dev->data->mac_addrs = rte_zmalloc("virtio", ETHER_ADDR_LEN, 0);
- if (eth_dev->data->mac_addrs == NULL) {
- PMD_INIT_LOG(ERR,
- "Failed to allocate %d bytes needed to store MAC addresses",
- ETHER_ADDR_LEN);
- return -ENOMEM;
- }
-
/* Copy the permanent MAC address to: virtio_hw */
virtio_get_hwaddr(hw);
ether_addr_copy((struct ether_addr *) hw->mac_addr,
--
1.8.4.2
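The reordering in the patch above follows a general rule: do everything that can fail in software before touching device state. The sketch below uses hypothetical `sk_` names and an injected allocator so that the failure path can be exercised. The two status bits use the virtio ACKNOWLEDGE (0x01) and DRIVER (0x02) values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SK_STATUS_ACK    0x01  /* guest has noticed the device */
#define SK_STATUS_DRIVER 0x02  /* guest knows how to drive it */
#define SK_ADDR_LEN      6

/* Hypothetical device model; status mimics the device status register. */
struct sk_dev {
	uint8_t status;
	uint8_t *mac_addrs;
};

/* Allocate first, so that a failure leaves the device status untouched. */
static int sk_dev_init(struct sk_dev *d, void *(*alloc)(size_t))
{
	assert(d != NULL && alloc != NULL);

	d->mac_addrs = alloc(SK_ADDR_LEN);
	if (d->mac_addrs == NULL)
		return -ENOMEM;

	d->status |= SK_STATUS_ACK;
	d->status |= SK_STATUS_DRIVER;
	return 0;
}

/* Test helper: an allocator that always fails. */
static void *sk_failing_alloc(size_t n)
{
	(void)n;
	return NULL;
}
```

With the allocation done after the status writes, as in the earlier ordering, the failing case would return with the ACK and DRIVER bits set but no driver actually running.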
Ouyang Changchun
2015-01-29 07:23:51 UTC
Clean up the virtio code by eliminating the unnecessary nesting of
the virtio hardware structure inside the adapter structure.
This also allows removing an unneeded macro, making the code clearer.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 43 ++++++++++++-----------------------
lib/librte_pmd_virtio/virtio_ethdev.h | 9 --------
lib/librte_pmd_virtio/virtio_rxtx.c | 3 +--
3 files changed, 16 insertions(+), 39 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index da74659..59b74b7 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -207,8 +207,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
static int
virtio_set_multiple_queues(struct rte_eth_dev *dev, uint16_t nb_queues)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -242,8 +241,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
const struct rte_memzone *mz;
uint16_t vq_size;
int size;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtqueue *vq = NULL;

/* Write the virtqueue index to the Queue Select Field */
@@ -383,8 +381,7 @@ virtio_dev_cq_queue_setup(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx,
struct virtqueue *vq;
uint16_t nb_desc = 0;
int ret;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

PMD_INIT_FUNC_TRACE();
ret = virtio_dev_queue_setup(dev, VTNET_CQ, VTNET_SQ_CQ_QUEUE_IDX,
@@ -410,8 +407,7 @@ virtio_dev_close(struct rte_eth_dev *dev)
static void
virtio_dev_promiscuous_enable(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -430,8 +426,7 @@ virtio_dev_promiscuous_enable(struct rte_eth_dev *dev)
static void
virtio_dev_promiscuous_disable(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -450,8 +445,7 @@ virtio_dev_promiscuous_disable(struct rte_eth_dev *dev)
static void
virtio_dev_allmulticast_enable(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -470,8 +464,7 @@ virtio_dev_allmulticast_enable(struct rte_eth_dev *dev)
static void
virtio_dev_allmulticast_disable(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
struct virtio_pmd_ctrl ctrl;
int dlen[1];
int ret;
@@ -853,8 +846,7 @@ virtio_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
void *param)
{
struct rte_eth_dev *dev = param;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
uint8_t isr;

/* Read interrupt status which clears interrupt */
@@ -880,12 +872,11 @@ static int
eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
struct rte_eth_dev *eth_dev)
{
+ struct virtio_hw *hw = eth_dev->data->dev_private;
struct virtio_net_config *config;
struct virtio_net_config local_config;
uint32_t offset_conf = sizeof(config->mac);
struct rte_pci_device *pci_dev;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);

if (RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr)) {
PMD_INIT_LOG(ERR,
@@ -1010,7 +1001,7 @@ static struct eth_driver rte_virtio_pmd = {
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
},
.eth_dev_init = eth_virtio_dev_init,
- .dev_private_size = sizeof(struct virtio_adapter),
+ .dev_private_size = sizeof(struct virtio_hw),
};

/*
@@ -1053,8 +1044,7 @@ static int
virtio_dev_configure(struct rte_eth_dev *dev)
{
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
int ret;

PMD_INIT_LOG(DEBUG, "configure");
@@ -1078,8 +1068,7 @@ static int
virtio_dev_start(struct rte_eth_dev *dev)
{
uint16_t nb_queues, i;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

/* Tell the host we've noticed this device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
@@ -1185,8 +1174,7 @@ static void virtio_dev_free_mbufs(struct rte_eth_dev *dev)
static void
virtio_dev_stop(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

/* reset the NIC */
vtpci_irq_config(hw, 0);
@@ -1199,8 +1187,7 @@ virtio_dev_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complet
{
struct rte_eth_link link, old;
uint16_t status;
- struct virtio_hw *hw =
- VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;
memset(&link, 0, sizeof(link));
virtio_dev_atomic_read_link_status(dev, &link);
old = link;
@@ -1232,7 +1219,7 @@ virtio_dev_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complet
static void
virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
{
- struct virtio_hw *hw = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

dev_info->driver_name = dev->driver->pci_drv.name;
dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues;
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.h b/lib/librte_pmd_virtio/virtio_ethdev.h
index 1da3c62..55c9749 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.h
+++ b/lib/librte_pmd_virtio/virtio_ethdev.h
@@ -110,15 +110,6 @@ uint16_t virtio_recv_mergeable_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);

-/*
- * Structure to store private data for each driver instance (for each port).
- */
-struct virtio_adapter {
- struct virtio_hw hw;
-};
-
-#define VIRTIO_DEV_PRIVATE_TO_HW(adapter)\
- (&((struct virtio_adapter *)adapter)->hw)

/*
* The VIRTIO_NET_F_GUEST_TSO[46] features permit the host to send us
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index e0216ec..a82d5ff 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -326,8 +326,7 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)
void
virtio_dev_cq_start(struct rte_eth_dev *dev)
{
- struct virtio_hw *hw
- = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ struct virtio_hw *hw = dev->data->dev_private;

if (hw->cvq) {
virtio_dev_vring_start(hw->cvq, VTNET_CQ);
--
1.8.4.2
Ouyang Changchun
2015-01-29 07:23:57 UTC
Virtio supports VLAN filtering.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 31 +++++++++++++++++++++++++++++--
1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 39b1fb4..591d692 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -84,6 +84,8 @@ static void virtio_dev_tx_queue_release(__rte_unused void *txq);
static void virtio_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats);
static void virtio_dev_stats_reset(struct rte_eth_dev *dev);
static void virtio_dev_free_mbufs(struct rte_eth_dev *dev);
+static int virtio_vlan_filter_set(struct rte_eth_dev *dev,
+ uint16_t vlan_id, int on);

static int virtio_dev_queue_stats_mapping_set(
__rte_unused struct rte_eth_dev *eth_dev,
@@ -511,6 +513,7 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
.tx_queue_release = virtio_dev_tx_queue_release,
/* collect stats per queue */
.queue_stats_mapping_set = virtio_dev_queue_stats_mapping_set,
+ .vlan_filter_set = virtio_vlan_filter_set,
};

static inline int
@@ -640,14 +643,31 @@ virtio_get_hwaddr(struct virtio_hw *hw)
}
}

+static int
+virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
+{
+ struct virtio_hw *hw = dev->data->dev_private;
+ struct virtio_pmd_ctrl ctrl;
+ int len;
+
+ if (!vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN))
+ return -ENOTSUP;
+
+ ctrl.hdr.class = VIRTIO_NET_CTRL_VLAN;
+ ctrl.hdr.cmd = on ? VIRTIO_NET_CTRL_VLAN_ADD : VIRTIO_NET_CTRL_VLAN_DEL;
+ memcpy(ctrl.data, &vlan_id, sizeof(vlan_id));
+ len = sizeof(vlan_id);
+
+ return virtio_send_command(hw->cvq, &ctrl, &len, 1);
+}

static void
virtio_negotiate_features(struct virtio_hw *hw)
{
uint32_t host_features, mask;

- mask = VIRTIO_NET_F_CTRL_VLAN;
- mask |= VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM;
+ /* checksum offload not implemented */
+ mask = VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM;

/* TSO and LRO are only available when their corresponding
* checksum offload feature is also negotiated.
@@ -1058,6 +1078,13 @@ virtio_dev_configure(struct rte_eth_dev *dev)

hw->vlan_strip = rxmode->hw_vlan_strip;

+ if (rxmode->hw_vlan_filter
+ && !vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) {
+ PMD_DRV_LOG(NOTICE,
+ "vlan filtering not available on this host");
+ return -ENOTSUP;
+ }
+
if (vtpci_irq_config(hw, 0) == VIRTIO_MSI_NO_VECTOR) {
PMD_DRV_LOG(ERR, "failed to set config vector");
return -EBUSY;
--
1.8.4.2
Ouyang Changchun
2015-01-29 07:23:59 UTC
Setting the default MAC address needs special handling.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_ether/rte_ethdev.h | 5 +++++
lib/librte_pmd_virtio/virtio_ethdev.c | 24 ++++++++++++++++++++++++
2 files changed, 29 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 94d6b2b..5a54276 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1240,6 +1240,10 @@ typedef void (*eth_mac_addr_add_t)(struct rte_eth_dev *dev,
uint32_t vmdq);
/**< @internal Set a MAC address into Receive Address Address Register */

+typedef void (*eth_mac_addr_set_t)(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr);
+/**< @internal Set a MAC address into Receive Address Address Register */
+
typedef int (*eth_uc_hash_table_set_t)(struct rte_eth_dev *dev,
struct ether_addr *mac_addr,
uint8_t on);
@@ -1459,6 +1463,7 @@ struct eth_dev_ops {
priority_flow_ctrl_set_t priority_flow_ctrl_set; /**< Setup priority flow control.*/
eth_mac_addr_remove_t mac_addr_remove; /**< Remove MAC address */
eth_mac_addr_add_t mac_addr_add; /**< Add a MAC address */
+ eth_mac_addr_set_t mac_addr_set; /**< Set a MAC address */
eth_uc_hash_table_set_t uc_hash_table_set; /**< Set Unicast Table Array */
eth_uc_all_hash_table_set_t uc_all_hash_table_set; /**< Set Unicast hash bitmap */
eth_mirror_rule_set_t mirror_rule_set; /**< Add a traffic mirror rule.*/
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 0e74eea..b30ab2a 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -90,6 +90,8 @@ static void virtio_mac_addr_add(struct rte_eth_dev *dev,
struct ether_addr *mac_addr,
uint32_t index, uint32_t vmdq __rte_unused);
static void virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index);
+static void virtio_mac_addr_set(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr);

static int virtio_dev_queue_stats_mapping_set(
__rte_unused struct rte_eth_dev *eth_dev,
@@ -518,6 +520,7 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
.vlan_filter_set = virtio_vlan_filter_set,
.mac_addr_add = virtio_mac_addr_add,
.mac_addr_remove = virtio_mac_addr_remove,
+ .mac_addr_set = virtio_mac_addr_set,
};

static inline int
@@ -733,6 +736,27 @@ virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
virtio_mac_table_set(hw, uc, mc);
}

+static void
+virtio_mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr)
+{
+ struct virtio_hw *hw = dev->data->dev_private;
+
+ memcpy(hw->mac_addr, mac_addr, ETHER_ADDR_LEN);
+
+ /* Use atomic update if available */
+ if (vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_MAC_ADDR)) {
+ struct virtio_pmd_ctrl ctrl;
+ int len = ETHER_ADDR_LEN;
+
+ ctrl.hdr.class = VIRTIO_NET_CTRL_MAC;
+ ctrl.hdr.cmd = VIRTIO_NET_CTRL_MAC_ADDR_SET;
+
+ memcpy(ctrl.data, mac_addr, ETHER_ADDR_LEN);
+ virtio_send_command(hw->cvq, &ctrl, &len, 1);
+ } else if (vtpci_with_feature(hw, VIRTIO_NET_F_MAC))
+ virtio_set_hwaddr(hw);
+}
+
static int
virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
{
--
1.8.4.2
Ouyang Changchun
2015-01-29 07:23:54 UTC
Make vtpci_get_status a local (static) function, as it is used in only one file.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_pci.c | 4 +++-
lib/librte_pmd_virtio/virtio_pci.h | 2 --
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_pci.c b/lib/librte_pmd_virtio/virtio_pci.c
index b099e4f..2245bec 100644
--- a/lib/librte_pmd_virtio/virtio_pci.c
+++ b/lib/librte_pmd_virtio/virtio_pci.c
@@ -35,6 +35,8 @@
#include "virtio_pci.h"
#include "virtio_logs.h"

+static uint8_t vtpci_get_status(struct virtio_hw *);
+
void
vtpci_read_dev_config(struct virtio_hw *hw, uint64_t offset,
void *dst, int length)
@@ -113,7 +115,7 @@ vtpci_reinit_complete(struct virtio_hw *hw)
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER_OK);
}

-uint8_t
+static uint8_t
vtpci_get_status(struct virtio_hw *hw)
{
return VIRTIO_READ_REG_1(hw, VIRTIO_PCI_STATUS);
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 0a4b578..64d9c34 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -255,8 +255,6 @@ void vtpci_reset(struct virtio_hw *);

void vtpci_reinit_complete(struct virtio_hw *);

-uint8_t vtpci_get_status(struct virtio_hw *);
-
void vtpci_set_status(struct virtio_hw *, uint8_t);

uint32_t vtpci_negotiate_features(struct virtio_hw *, uint32_t);
--
1.8.4.2
Ouyang Changchun
2015-01-29 07:24:00 UTC
This makes the virtio driver work like ixgbe. Transmit buffers are
held until a transmit threshold is reached. The previous behavior
was to hold mbufs until the ring entry was reused, which caused
more memory usage than needed.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 7 ++--
lib/librte_pmd_virtio/virtio_rxtx.c | 75 +++++++++++++++++++++++++----------
lib/librte_pmd_virtio/virtqueue.h | 3 +-
3 files changed, 60 insertions(+), 25 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index b30ab2a..8cd2d51 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -176,15 +176,16 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,

virtqueue_notify(vq);

- while (vq->vq_used_cons_idx == vq->vq_ring.used->idx)
+ rte_rmb();
+ while (vq->vq_used_cons_idx == vq->vq_ring.used->idx) {
+ rte_rmb();
usleep(100);
+ }

while (vq->vq_used_cons_idx != vq->vq_ring.used->idx) {
uint32_t idx, desc_idx, used_idx;
struct vring_used_elem *uep;

- virtio_rmb();
-
used_idx = (uint32_t)(vq->vq_used_cons_idx
& (vq->vq_nentries - 1));
uep = &vq->vq_ring.used->ring[used_idx];
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index b6d6832..580701a 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -129,17 +129,32 @@ virtqueue_dequeue_burst_rx(struct virtqueue *vq, struct rte_mbuf **rx_pkts,
return i;
}

+#ifndef DEFAULT_TX_FREE_THRESH
+#define DEFAULT_TX_FREE_THRESH 32
+#endif
+
+/* Cleanup from completed transmits. */
static void
-virtqueue_dequeue_pkt_tx(struct virtqueue *vq)
+virtio_xmit_cleanup(struct virtqueue *vq, uint16_t num)
{
- struct vring_used_elem *uep;
- uint16_t used_idx, desc_idx;
+ uint16_t i, used_idx, desc_idx;
+ for (i = 0; i < num; i++) {
+ struct vring_used_elem *uep;
+ struct vq_desc_extra *dxp;
+
+ used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
+ uep = &vq->vq_ring.used->ring[used_idx];
+ dxp = &vq->vq_descx[used_idx];
+
+ desc_idx = (uint16_t) uep->id;
+ vq->vq_used_cons_idx++;
+ vq_ring_free_chain(vq, desc_idx);

- used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
- uep = &vq->vq_ring.used->ring[used_idx];
- desc_idx = (uint16_t) uep->id;
- vq->vq_used_cons_idx++;
- vq_ring_free_chain(vq, desc_idx);
+ if (dxp->cookie != NULL) {
+ rte_pktmbuf_free(dxp->cookie);
+ dxp->cookie = NULL;
+ }
+ }
}


@@ -203,8 +218,6 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)

idx = head_idx;
dxp = &txvq->vq_descx[idx];
- if (dxp->cookie != NULL)
- rte_pktmbuf_free(dxp->cookie);
dxp->cookie = (void *)cookie;
dxp->ndescs = needed;

@@ -404,6 +417,7 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
{
uint8_t vtpci_queue_idx = 2 * queue_idx + VTNET_SQ_TQ_QUEUE_IDX;
struct virtqueue *vq;
+ uint16_t tx_free_thresh;
int ret;

PMD_INIT_FUNC_TRACE();
@@ -421,6 +435,22 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
return ret;
}

+ tx_free_thresh = tx_conf->tx_free_thresh;
+ if (tx_free_thresh == 0)
+ tx_free_thresh =
+ RTE_MIN(vq->vq_nentries / 4, DEFAULT_TX_FREE_THRESH);
+
+ if (tx_free_thresh >= (vq->vq_nentries - 3)) {
+ RTE_LOG(ERR, PMD, "tx_free_thresh must be less than the "
+ "number of TX entries minus 3 (%u)."
+ " (tx_free_thresh=%u port=%u queue=%u)\n",
+ vq->vq_nentries - 3,
+ tx_free_thresh, dev->data->port_id, queue_idx);
+ return -EINVAL;
+ }
+
+ vq->vq_free_thresh = tx_free_thresh;
+
dev->data->tx_queues[queue_idx] = vq;
return 0;
}
@@ -688,11 +718,9 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
{
struct virtqueue *txvq = tx_queue;
struct rte_mbuf *txm;
- uint16_t nb_used, nb_tx, num;
+ uint16_t nb_used, nb_tx;
int error;

- nb_tx = 0;
-
if (unlikely(nb_pkts < 1))
return nb_pkts;

@@ -700,21 +728,26 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
nb_used = VIRTQUEUE_NUSED(txvq);

virtio_rmb();
+ if (likely(nb_used > txvq->vq_free_thresh))
+ virtio_xmit_cleanup(txvq, nb_used);

- num = (uint16_t)(likely(nb_used < VIRTIO_MBUF_BURST_SZ) ? nb_used : VIRTIO_MBUF_BURST_SZ);
+ nb_tx = 0;

while (nb_tx < nb_pkts) {
/* Need one more descriptor for virtio header. */
int need = tx_pkts[nb_tx]->nb_segs - txvq->vq_free_cnt + 1;
- int deq_cnt = RTE_MIN(need, (int)num);

- num -= (deq_cnt > 0) ? deq_cnt : 0;
- while (deq_cnt > 0) {
- virtqueue_dequeue_pkt_tx(txvq);
- deq_cnt--;
+ /*Positive value indicates it need free vring descriptors */
+ if (unlikely(need > 0)) {
+ nb_used = VIRTQUEUE_NUSED(txvq);
+ virtio_rmb();
+ need = RTE_MIN(need, (int)nb_used);
+
+ virtio_xmit_cleanup(txvq, need);
+ need = (int)tx_pkts[nb_tx]->nb_segs -
+ txvq->vq_free_cnt + 1;
}

- need = (int)tx_pkts[nb_tx]->nb_segs - txvq->vq_free_cnt + 1;
/*
* Zero or negative value indicates it has enough free
* descriptors to use for transmitting.
@@ -723,7 +756,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
txm = tx_pkts[nb_tx];

/* Do VLAN tag insertion */
- if (txm->ol_flags & PKT_TX_VLAN_PKT) {
+ if (unlikely(txm->ol_flags & PKT_TX_VLAN_PKT)) {
error = rte_vlan_insert(&txm);
if (unlikely(error)) {
rte_pktmbuf_free(txm);
diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index d210f4f..6c45c27 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -164,6 +164,7 @@ struct virtqueue {
struct rte_mempool *mpool; /**< mempool for mbuf allocation */
uint16_t queue_id; /**< DPDK queue index. */
uint8_t port_id; /**< Device port identifier. */
+ uint16_t vq_queue_index; /**< PCI queue index */

void *vq_ring_virt_mem; /**< linear address of vring*/
unsigned int vq_ring_size;
@@ -172,7 +173,7 @@ struct virtqueue {
struct vring vq_ring; /**< vring keeping desc, used and avail */
uint16_t vq_free_cnt; /**< num of desc available */
uint16_t vq_nentries; /**< vring desc numbers */
- uint16_t vq_queue_index; /**< PCI queue index */
+ uint16_t vq_free_thresh; /**< free threshold */
/**
* Head of the free chain in the descriptor table. If
* there are no free descriptors, this will be set to
--
1.8.4.2
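The threshold selection in virtio_dev_tx_queue_setup() above can be restated outside the driver. The following is only a minimal sketch, assuming a hypothetical helper name (pick_tx_free_thresh) and plain integer arguments in place of tx_conf and the virtqueue; the default of 32 and the "less than entries minus 3" check are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEFAULT_TX_FREE_THRESH 32 /* same default as in the patch */

/*
 * Hypothetical helper (not part of the patch): pick the TX free threshold.
 * requested == 0 means "use the default", i.e. min(nentries / 4, 32).
 * Returns the threshold, or -1 if it is not below nentries - 3.
 */
static int pick_tx_free_thresh(uint16_t requested, uint16_t nentries)
{
	uint16_t thresh = requested;

	assert(nentries > 3); /* assumption: ring has more than 3 entries */

	if (thresh == 0) {
		uint16_t quarter = nentries / 4;

		thresh = quarter < DEFAULT_TX_FREE_THRESH ?
			quarter : DEFAULT_TX_FREE_THRESH;
	}

	if (thresh >= nentries - 3)
		return -1;

	return thresh;
}
```

With a 256-entry ring and no value requested this yields 32; with a 64-entry ring it yields 16. In virtio_xmit_pkts() the cleanup then runs only once the number of used entries exceeds this value.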
Ouyang Changchun
2015-01-29 07:23:58 UTC
Virtio supports multiple MAC addresses.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 94 ++++++++++++++++++++++++++++++++++-
lib/librte_pmd_virtio/virtio_ethdev.h | 3 +-
lib/librte_pmd_virtio/virtqueue.h | 34 ++++++++++++-
3 files changed, 127 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 591d692..0e74eea 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -86,6 +86,10 @@ static void virtio_dev_stats_reset(struct rte_eth_dev *dev);
static void virtio_dev_free_mbufs(struct rte_eth_dev *dev);
static int virtio_vlan_filter_set(struct rte_eth_dev *dev,
uint16_t vlan_id, int on);
+static void virtio_mac_addr_add(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr,
+ uint32_t index, uint32_t vmdq __rte_unused);
+static void virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index);

static int virtio_dev_queue_stats_mapping_set(
__rte_unused struct rte_eth_dev *eth_dev,
@@ -503,8 +507,6 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
.stats_get = virtio_dev_stats_get,
.stats_reset = virtio_dev_stats_reset,
.link_update = virtio_dev_link_update,
- .mac_addr_add = NULL,
- .mac_addr_remove = NULL,
.rx_queue_setup = virtio_dev_rx_queue_setup,
/* meaningfull only to multiple queue */
.rx_queue_release = virtio_dev_rx_queue_release,
@@ -514,6 +516,8 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
/* collect stats per queue */
.queue_stats_mapping_set = virtio_dev_queue_stats_mapping_set,
.vlan_filter_set = virtio_vlan_filter_set,
+ .mac_addr_add = virtio_mac_addr_add,
+ .mac_addr_remove = virtio_mac_addr_remove,
};

static inline int
@@ -644,6 +648,92 @@ virtio_get_hwaddr(struct virtio_hw *hw)
}

static int
+virtio_mac_table_set(struct virtio_hw *hw,
+ const struct virtio_net_ctrl_mac *uc,
+ const struct virtio_net_ctrl_mac *mc)
+{
+ struct virtio_pmd_ctrl ctrl;
+ int err, len[2];
+
+ ctrl.hdr.class = VIRTIO_NET_CTRL_MAC;
+ ctrl.hdr.cmd = VIRTIO_NET_CTRL_MAC_TABLE_SET;
+
+ len[0] = uc->entries * ETHER_ADDR_LEN + sizeof(uc->entries);
+ memcpy(ctrl.data, uc, len[0]);
+
+ len[1] = mc->entries * ETHER_ADDR_LEN + sizeof(mc->entries);
+ memcpy(ctrl.data + len[0], mc, len[1]);
+
+ err = virtio_send_command(hw->cvq, &ctrl, len, 2);
+ if (err != 0)
+ PMD_DRV_LOG(NOTICE, "mac table set failed: %d", err);
+
+ return err;
+}
+
+static void
+virtio_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
+ uint32_t index, uint32_t vmdq __rte_unused)
+{
+ struct virtio_hw *hw = dev->data->dev_private;
+ const struct ether_addr *addrs = dev->data->mac_addrs;
+ unsigned int i;
+ struct virtio_net_ctrl_mac *uc, *mc;
+
+ if (index >= VIRTIO_MAX_MAC_ADDRS) {
+ PMD_DRV_LOG(ERR, "mac address index %u out of range", index);
+ return;
+ }
+
+ uc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(uc->entries));
+ uc->entries = 0;
+ mc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(mc->entries));
+ mc->entries = 0;
+
+ for (i = 0; i < VIRTIO_MAX_MAC_ADDRS; i++) {
+ const struct ether_addr *addr
+ = (i == index) ? mac_addr : addrs + i;
+ struct virtio_net_ctrl_mac *tbl
+ = is_multicast_ether_addr(addr) ? mc : uc;
+
+ memcpy(&tbl->macs[tbl->entries++], addr, ETHER_ADDR_LEN);
+ }
+
+ virtio_mac_table_set(hw, uc, mc);
+}
+
+static void
+virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
+{
+ struct virtio_hw *hw = dev->data->dev_private;
+ struct ether_addr *addrs = dev->data->mac_addrs;
+ struct virtio_net_ctrl_mac *uc, *mc;
+ unsigned int i;
+
+ if (index >= VIRTIO_MAX_MAC_ADDRS) {
+ PMD_DRV_LOG(ERR, "mac address index %u out of range", index);
+ return;
+ }
+
+ uc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(uc->entries));
+ uc->entries = 0;
+ mc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(mc->entries));
+ mc->entries = 0;
+
+ for (i = 0; i < VIRTIO_MAX_MAC_ADDRS; i++) {
+ struct virtio_net_ctrl_mac *tbl;
+
+ if (i == index || is_zero_ether_addr(addrs + i))
+ continue;
+
+ tbl = is_multicast_ether_addr(addrs + i) ? mc : uc;
+ memcpy(&tbl->macs[tbl->entries++], addrs + i, ETHER_ADDR_LEN);
+ }
+
+ virtio_mac_table_set(hw, uc, mc);
+}
+
+static int
virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
{
struct virtio_hw *hw = dev->data->dev_private;
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.h b/lib/librte_pmd_virtio/virtio_ethdev.h
index 55c9749..74ac7e0 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.h
+++ b/lib/librte_pmd_virtio/virtio_ethdev.h
@@ -51,7 +51,7 @@

#define VIRTIO_MAX_RX_QUEUES 128
#define VIRTIO_MAX_TX_QUEUES 128
-#define VIRTIO_MAX_MAC_ADDRS 1
+#define VIRTIO_MAX_MAC_ADDRS 64
#define VIRTIO_MIN_RX_BUFSIZE 64
#define VIRTIO_MAX_RX_PKTLEN 9728

@@ -60,6 +60,7 @@
(VIRTIO_NET_F_MAC | \
VIRTIO_NET_F_STATUS | \
VIRTIO_NET_F_MQ | \
+ VIRTIO_NET_F_CTRL_MAC_ADDR | \
VIRTIO_NET_F_CTRL_VQ | \
VIRTIO_NET_F_CTRL_RX | \
VIRTIO_NET_F_CTRL_VLAN | \
diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index 5b8a255..d210f4f 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -99,6 +99,34 @@ enum { VTNET_RQ = 0, VTNET_TQ = 1, VTNET_CQ = 2 };
#define VIRTIO_NET_CTRL_RX_NOBCAST 5

/**
+ * Control the MAC
+ *
+ * The MAC filter table is managed by the hypervisor, the guest should
+ * assume the size is infinite. Filtering should be considered
+ * non-perfect, ie. based on hypervisor resources, the guest may
+ * received packets from sources not specified in the filter list.
+ *
+ * In addition to the class/cmd header, the TABLE_SET command requires
+ * two out scatterlists. Each contains a 4 byte count of entries followed
+ * by a concatenated byte stream of the ETH_ALEN MAC addresses. The
+ * first sg list contains unicast addresses, the second is for multicast.
+ * This functionality is present if the VIRTIO_NET_F_CTRL_RX feature
+ * is available.
+ *
+ * The ADDR_SET command requests one out scatterlist, it contains a
+ * 6 bytes MAC address. This functionality is present if the
+ * VIRTIO_NET_F_CTRL_MAC_ADDR feature is available.
+ */
+struct virtio_net_ctrl_mac {
+ uint32_t entries;
+ uint8_t macs[][ETHER_ADDR_LEN];
+} __attribute__((__packed__));
+
+#define VIRTIO_NET_CTRL_MAC 1
+ #define VIRTIO_NET_CTRL_MAC_TABLE_SET 0
+ #define VIRTIO_NET_CTRL_MAC_ADDR_SET 1
+
+/**
* Control VLAN filtering
*
* The VLAN filter table is controlled via a simple ADD/DEL interface.
@@ -121,7 +149,7 @@ typedef uint8_t virtio_net_ctrl_ack;
#define VIRTIO_NET_OK 0
#define VIRTIO_NET_ERR 1

-#define VIRTIO_MAX_CTRL_DATA 128
+#define VIRTIO_MAX_CTRL_DATA 2048

struct virtio_pmd_ctrl {
struct virtio_net_ctrl_hdr hdr;
@@ -180,6 +208,10 @@ struct virtqueue {
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN 1
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX 0x8000
#endif
+#ifndef VIRTIO_NET_F_CTRL_MAC_ADDR
+#define VIRTIO_NET_F_CTRL_MAC_ADDR 0x800000
+#define VIRTIO_NET_CTRL_MAC_ADDR_SET 1
+#endif

/**
* This is the first element of the scatter-gather list. If you don't
--
1.8.4.2
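The TABLE_SET command in the patch above sends two tables, unicast first and multicast second, each a 32-bit entry count followed by the addresses. A minimal standalone sketch of the splitting step follows. The names split_mac_table, mac_table and MAX_ADDRS are hypothetical, and the fixed-size array stands in for the flexible array member of struct virtio_net_ctrl_mac. Like virtio_mac_addr_remove(), the sketch skips all-zero (unused) slots.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN 6
#define MAX_ADDRS 4 /* illustrative only; the patch uses VIRTIO_MAX_MAC_ADDRS */

/* Simplified stand-in for struct virtio_net_ctrl_mac. */
struct mac_table {
	uint32_t entries;
	uint8_t macs[MAX_ADDRS][ETHER_ADDR_LEN];
};

/* Multicast addresses have the least significant bit of the first byte set. */
static int is_multicast(const uint8_t *addr)
{
	return addr[0] & 0x01;
}

static int is_zero(const uint8_t *addr)
{
	static const uint8_t zero[ETHER_ADDR_LEN];

	return memcmp(addr, zero, ETHER_ADDR_LEN) == 0;
}

/* Hypothetical helper: sort n addresses into unicast and multicast tables. */
static void split_mac_table(const uint8_t addrs[][ETHER_ADDR_LEN],
			    unsigned int n,
			    struct mac_table *uc, struct mac_table *mc)
{
	unsigned int i;

	assert(n <= MAX_ADDRS);
	uc->entries = 0;
	mc->entries = 0;

	for (i = 0; i < n; i++) {
		struct mac_table *tbl;

		if (is_zero(addrs[i]))
			continue; /* unused slot */

		tbl = is_multicast(addrs[i]) ? mc : uc;
		memcpy(tbl->macs[tbl->entries++], addrs[i], ETHER_ADDR_LEN);
	}
}
```

Each resulting table would then be copied into the control buffer with a length of entries * ETHER_ADDR_LEN + sizeof(entries), as virtio_mac_table_set() does.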
Ouyang Changchun
2015-01-29 07:24:02 UTC
vq_descx must be indexed by the vring descriptor index, not by the used ring index.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_rxtx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index 580701a..a82e8eb 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -144,9 +144,9 @@ virtio_xmit_cleanup(struct virtqueue *vq, uint16_t num)

used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
uep = &vq->vq_ring.used->ring[used_idx];
- dxp = &vq->vq_descx[used_idx];

desc_idx = (uint16_t) uep->id;
+ dxp = &vq->vq_descx[desc_idx];
vq->vq_used_cons_idx++;
vq_ring_free_chain(vq, desc_idx);
--
1.8.4.2
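The distinction fixed by this patch can be shown with a toy ring. This is a rough sketch only: toy_vq, used_elem and toy_consume are hypothetical names, and the structures are heavily simplified compared with struct virtqueue. The point is that a used ring slot merely names a descriptor through its id field, so per-descriptor state has to be looked up with that id rather than with the slot position.

```c
#include <assert.h>
#include <stdint.h>

#define NENTRIES 4 /* power of two, as the index masking assumes */

struct used_elem {
	uint32_t id;  /* index of the completed descriptor */
	uint32_t len;
};

struct toy_vq {
	struct used_elem used_ring[NENTRIES];
	int cookie[NENTRIES];   /* per-descriptor state, indexed by descriptor id */
	uint16_t used_cons_idx; /* free-running consumer index */
};

/* Consume one used entry and return the cookie of the descriptor it names. */
static int toy_consume(struct toy_vq *vq)
{
	uint16_t used_idx = vq->used_cons_idx & (NENTRIES - 1);
	uint16_t desc_idx = (uint16_t)vq->used_ring[used_idx].id;
	int cookie = vq->cookie[desc_idx]; /* desc_idx, not used_idx */

	vq->cookie[desc_idx] = 0;
	vq->used_cons_idx++;
	return cookie;
}
```

If the host completes descriptor 2 first, used ring slot 0 carries id 2. Indexing the cookie array with the slot position would then release the state of descriptor 0 instead.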
Ouyang Changchun
2015-01-29 07:24:03 UTC
Need to swap the data from CPU to BE (big endian) byte order for the VLAN ether type.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_ether/rte_ether.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
index 74f71c2..0797908 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -351,7 +351,7 @@ static inline int rte_vlan_strip(struct rte_mbuf *m)
struct ether_hdr *eh
= rte_pktmbuf_mtod(m, struct ether_hdr *);

- if (eh->ether_type != ETHER_TYPE_VLAN)
+ if (eh->ether_type != rte_cpu_to_be_16(ETHER_TYPE_VLAN))
return -1;

struct vlan_hdr *vh = (struct vlan_hdr *)(eh + 1);
@@ -401,7 +401,7 @@ static inline int rte_vlan_insert(struct rte_mbuf **m)
return -ENOSPC;

memmove(nh, oh, 2 * ETHER_ADDR_LEN);
- nh->ether_type = ETHER_TYPE_VLAN;
+ nh->ether_type = rte_cpu_to_be_16(ETHER_TYPE_VLAN);

vh = (struct vlan_hdr *) (nh + 1);
vh->vlan_tci = rte_cpu_to_be_16((*m)->vlan_tci);
--
1.8.4.2
Xie, Huawei
2015-02-04 10:54:23 UTC
Post by Ouyang Changchun
Need swap the data from cpu to BE(big endian) for vlan-type.
---
lib/librte_ether/rte_ether.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
index 74f71c2..0797908 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -351,7 +351,7 @@ static inline int rte_vlan_strip(struct rte_mbuf *m)
struct ether_hdr *eh
= rte_pktmbuf_mtod(m, struct ether_hdr *);
- if (eh->ether_type != ETHER_TYPE_VLAN)
+ if (eh->ether_type != rte_cpu_to_be_16(ETHER_TYPE_VLAN))
rte_be_to_cpu_16?
Post by Ouyang Changchun
return -1;
struct vlan_hdr *vh = (struct vlan_hdr *)(eh + 1);
@@ -401,7 +401,7 @@ static inline int rte_vlan_insert(struct rte_mbuf **m)
return -ENOSPC;
memmove(nh, oh, 2 * ETHER_ADDR_LEN);
- nh->ether_type = ETHER_TYPE_VLAN;
+ nh->ether_type = rte_cpu_to_be_16(ETHER_TYPE_VLAN);
rte_be_to_cpu_16?
Post by Ouyang Changchun
vh = (struct vlan_hdr *) (nh + 1);
vh->vlan_tci = rte_cpu_to_be_16((*m)->vlan_tci);
Ouyang, Changchun
2015-02-05 00:54:27 UTC
Hi Huawei,
-----Original Message-----
From: Xie, Huawei
Sent: Wednesday, February 4, 2015 6:54 PM
Subject: Re: [PATCH v3 19/25] ether: Fix vlan strip/insert issue
Post by Ouyang Changchun
Need swap the data from cpu to BE(big endian) for vlan-type.
---
lib/librte_ether/rte_ether.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/librte_ether/rte_ether.h
b/lib/librte_ether/rte_ether.h index 74f71c2..0797908 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -351,7 +351,7 @@ static inline int rte_vlan_strip(struct rte_mbuf *m)
struct ether_hdr *eh
= rte_pktmbuf_mtod(m, struct ether_hdr *);
- if (eh->ether_type != ETHER_TYPE_VLAN)
+ if (eh->ether_type != rte_cpu_to_be_16(ETHER_TYPE_VLAN))
rte_be_to_cpu_16?
eh->ether_type is in network byte order (big endian),
while ETHER_TYPE_VLAN is in host byte order (little endian on x86), so the constant needs to be converted to big endian.
Post by Ouyang Changchun
return -1;
**m)
return -ENOSPC;
memmove(nh, oh, 2 * ETHER_ADDR_LEN);
- nh->ether_type = ETHER_TYPE_VLAN;
+ nh->ether_type = rte_cpu_to_be_16(ETHER_TYPE_VLAN);
rte_be_to_cpu_16?
Similar reason as above.

Thanks
Changchun
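The byte-order argument above can be checked with a small standalone sketch. It assumes htons() from <arpa/inet.h> as a stand-in for rte_cpu_to_be_16() so that it builds without DPDK, and frame_is_vlan is a hypothetical helper, not a DPDK function.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define ETHER_TYPE_VLAN 0x8100 /* constant in host byte order */

/*
 * Hypothetical helper: returns 1 if the 14-byte Ethernet header at
 * frame carries the VLAN ether type, 0 otherwise.
 */
static int frame_is_vlan(const uint8_t *frame)
{
	uint16_t ether_type;

	/* Bytes 12 and 13 hold the ether type exactly as seen on the wire. */
	memcpy(&ether_type, frame + 12, sizeof(ether_type));

	/* Convert the host-order constant to network order, then compare. */
	return ether_type == htons(ETHER_TYPE_VLAN);
}
```

For a 16-bit value the CPU-to-BE and BE-to-CPU conversions perform the same operation (a byte swap on little-endian hosts, nothing on big-endian ones), so either name would give the same result here. The cpu_to_be form simply states the direction that is actually intended for the constant.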
Ouyang Changchun
2015-01-29 07:24:01 UTC
Make virtio not require UIO, for security reasons; this matches the behavior of 6WIND's virtio-net-pmd.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
config/common_linuxapp | 2 +
lib/librte_eal/common/include/rte_pci.h | 4 ++
lib/librte_eal/linuxapp/eal/eal_pci.c | 5 +-
lib/librte_pmd_virtio/virtio_ethdev.c | 91 ++++++++++++++++++++++++++++++++-
4 files changed, 100 insertions(+), 2 deletions(-)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 2f9643b..a412457 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -100,6 +100,8 @@ CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=n
CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
CONFIG_RTE_EAL_IGB_UIO=y
CONFIG_RTE_EAL_VFIO=y
+# Only for VIRTIO PMD currently
+CONFIG_RTE_EAL_PORT_IO=n

#
# Special configurations in PCI Config Space for high performance
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 66ed793..19abc1f 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -193,6 +193,10 @@ struct rte_pci_driver {

/** Device needs PCI BAR mapping (done with either IGB_UIO or VFIO) */
#define RTE_PCI_DRV_NEED_MAPPING 0x0001
+/** Device needs port IO(done with /proc/ioports) */
+#ifdef RTE_EAL_PORT_IO
+#define RTE_PCI_DRV_PORT_IO 0x0002
+#endif
/** Device driver must be registered several times until failure - deprecated */
#pragma GCC poison RTE_PCI_DRV_MULTIPLE
/** Device needs to be unbound even if no module is provided */
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index b5f5410..5db0059 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -574,7 +574,10 @@ rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *d
/* map resources for devices that use igb_uio */
ret = pci_map_device(dev);
if (ret != 0)
- return ret;
+#ifdef RTE_EAL_PORT_IO
+ if ((dr->drv_flags & RTE_PCI_DRV_PORT_IO) == 0)
+#endif
+ return ret;
} else if (dr->drv_flags & RTE_PCI_DRV_FORCE_UNBIND &&
rte_eal_process_type() == RTE_PROC_PRIMARY) {
/* unbind current driver */
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 8cd2d51..b905532 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -961,6 +961,71 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev)
start, size);
return 0;
}
+
+#ifdef RTE_EAL_PORT_IO
+/* Extract port I/O numbers from proc/ioports */
+static int virtio_resource_init_by_portio(struct rte_pci_device *pci_dev)
+{
+ uint16_t start, end;
+ int size;
+ FILE *fp;
+ char *line = NULL;
+ char pci_id[16];
+ int found = 0;
+ size_t linesz;
+
+ snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,
+ pci_dev->addr.domain,
+ pci_dev->addr.bus,
+ pci_dev->addr.devid,
+ pci_dev->addr.function);
+
+ fp = fopen("/proc/ioports", "r");
+ if (fp == NULL) {
+ PMD_INIT_LOG(ERR, "%s(): can't open ioports", __func__);
+ return -1;
+ }
+
+ while (getdelim(&line, &linesz, '\n', fp) > 0) {
+ char *ptr = line;
+ char *left;
+ int n;
+
+ n = strcspn(ptr, ":");
+ ptr[n] = 0;
+ left = &ptr[n+1];
+
+ while (*left && isspace(*left))
+ left++;
+
+ if (!strncmp(left, pci_id, strlen(pci_id))) {
+ found = 1;
+
+ while (*ptr && isspace(*ptr))
+ ptr++;
+
+ sscanf(ptr, "%04hx-%04hx", &start, &end);
+ size = end - start + 1;
+
+ break;
+ }
+ }
+
+ free(line);
+ fclose(fp);
+
+ if (!found)
+ return -1;
+
+ pci_dev->mem_resource[0].addr = (void *)(uintptr_t)(uint32_t)start;
+ pci_dev->mem_resource[0].len = (uint64_t)size;
+ PMD_INIT_LOG(DEBUG,
+ "PCI Port IO found start=0x%lx with size=0x%lx",
+ start, size);
+ return 0;
+}
+#endif
+
#else
static int
virtio_has_msix(const struct rte_pci_addr *loc __rte_unused)
@@ -974,6 +1039,14 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused)
/* no setup required */
return 0;
}
+
+#ifdef RTE_EAL_PORT_IO
+static int virtio_resource_init_by_portio(struct rte_pci_device *pci_dev)
+{
+ /* no setup required */
+ return 0;
+}
+#endif
#endif

/*
@@ -1039,7 +1112,10 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,

pci_dev = eth_dev->pci_dev;
if (virtio_resource_init(pci_dev) < 0)
- return -1;
+#ifdef RTE_EAL_PORT_IO
+ if (virtio_resource_init_by_portio(pci_dev) < 0)
+#endif
+ return -1;

hw->use_msix = virtio_has_msix(&pci_dev->addr);
hw->io_base = (uint32_t)(uintptr_t)pci_dev->mem_resource[0].addr;
@@ -1132,6 +1208,18 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
return 0;
}

+#ifdef RTE_EAL_PORT_IO
+static struct eth_driver rte_virtio_pmd = {
+ {
+ .name = "rte_virtio_pmd",
+ .id_table = pci_id_virtio_map,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_PORT_IO |
+ RTE_PCI_DRV_INTR_LSC,
+ },
+ .eth_dev_init = eth_virtio_dev_init,
+ .dev_private_size = sizeof(struct virtio_hw),
+};
+#else
static struct eth_driver rte_virtio_pmd = {
{
.name = "rte_virtio_pmd",
@@ -1141,6 +1229,7 @@ static struct eth_driver rte_virtio_pmd = {
.eth_dev_init = eth_virtio_dev_init,
.dev_private_size = sizeof(struct virtio_hw),
};
+#endif

/*
* Driver initialization routine.
--
1.8.4.2
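The /proc/ioports lookup in virtio_resource_init_by_portio() comes down to matching one line per device. A minimal sketch of that per-line step, assuming the usual "start-end : owner" layout of /proc/ioports, is shown below. parse_ioports_line is a hypothetical helper and not part of the patch. Unlike the patch, it checks the sscanf() return value before trusting the parsed range.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse one /proc/ioports line such as
 * "  c040-c05f : 0000:00:03.0". If the owner field starts with pci_id,
 * store the port range and return 1; otherwise return 0.
 */
static int parse_ioports_line(const char *line, const char *pci_id,
			      unsigned int *start, unsigned int *end)
{
	const char *owner = strchr(line, ':'); /* separator after the range */
	unsigned int s, e;

	if (owner == NULL)
		return 0;

	owner++;
	while (*owner && isspace((unsigned char)*owner))
		owner++;

	if (strncmp(owner, pci_id, strlen(pci_id)) != 0)
		return 0;

	/* The leading space in the format skips indentation. */
	if (sscanf(line, " %x-%x", &s, &e) != 2 || e < s)
		return 0;

	*start = s;
	*end = e;
	return 1;
}
```

The size of the region is then end - start + 1, as computed in the patch.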
Thomas Monjalon
2015-01-29 23:14:27 UTC
Hi Changchun,
Post by Ouyang Changchun
Make virtio not require UIO for some security reasons, this is to match 6Wind's virtio-net-pmd.
Thanks for your effort.
I think port IO is a really interesting option but it needs more EAL rework
to be correctly integrated. Then virtio-net-pmd (http://dpdk.org/browse/virtio-net-pmd/)
will be obsolete and moved in a deprecated area.
Post by Ouyang Changchun
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
+# Only for VIRTIO PMD currently
+CONFIG_RTE_EAL_PORT_IO=n
This is the first problem. We must stop adding new build-time options.
We should be able to choose between PCI mapping and port IO at runtime.
Post by Ouyang Changchun
+/** Device needs port IO(done with /proc/ioports) */
+#ifdef RTE_EAL_PORT_IO
+#define RTE_PCI_DRV_PORT_IO 0x0002
+#endif
A flag should never be ifdef'ed.
Post by Ouyang Changchun
@@ -574,7 +574,10 @@ rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *d
/* map resources for devices that use igb_uio */
ret = pci_map_device(dev);
if (ret != 0)
- return ret;
+#ifdef RTE_EAL_PORT_IO
+ if ((dr->drv_flags & RTE_PCI_DRV_PORT_IO) == 0)
+#endif
+ return ret;
} else if (dr->drv_flags & RTE_PCI_DRV_FORCE_UNBIND &&
rte_eal_process_type() == RTE_PROC_PRIMARY) {
/* unbind current driver */
Why do you need this ugly return?
Post by Ouyang Changchun
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -961,6 +961,71 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev)
start, size);
return 0;
}
+
+#ifdef RTE_EAL_PORT_IO
+/* Extract port I/O numbers from proc/ioports */
+static int virtio_resource_init_by_portio(struct rte_pci_device *pci_dev)
+{
+ uint16_t start, end;
+ int size;
+ FILE *fp;
+ char *line = NULL;
+ char pci_id[16];
+ int found = 0;
+ size_t linesz;
+
+ snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,
+ pci_dev->addr.domain,
+ pci_dev->addr.bus,
+ pci_dev->addr.devid,
+ pci_dev->addr.function);
+
+ fp = fopen("/proc/ioports", "r");
+ if (fp == NULL) {
+ PMD_INIT_LOG(ERR, "%s(): can't open ioports", __func__);
+ return -1;
+ }
+
+ while (getdelim(&line, &linesz, '\n', fp) > 0) {
+ char *ptr = line;
+ char *left;
+ int n;
+
+ n = strcspn(ptr, ":");
+ ptr[n] = 0;
+ left = &ptr[n+1];
+
+ while (*left && isspace(*left))
+ left++;
+
+ if (!strncmp(left, pci_id, strlen(pci_id))) {
+ found = 1;
+
+ while (*ptr && isspace(*ptr))
+ ptr++;
+
+ sscanf(ptr, "%04hx-%04hx", &start, &end);
+ size = end - start + 1;
+
+ break;
+ }
+ }
+
+ free(line);
+ fclose(fp);
+
+ if (!found)
+ return -1;
+
+ pci_dev->mem_resource[0].addr = (void *)(uintptr_t)(uint32_t)start;
+ pci_dev->mem_resource[0].len = (uint64_t)size;
+ PMD_INIT_LOG(DEBUG,
+ "PCI Port IO found start=0x%lx with size=0x%lx",
+ start, size);
+ return 0;
+}
+#endif
This part should be a Linux EAL service.
Post by Ouyang Changchun
+#ifdef RTE_EAL_PORT_IO
+static struct eth_driver rte_virtio_pmd = {
+ {
+ .name = "rte_virtio_pmd",
+ .id_table = pci_id_virtio_map,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_PORT_IO |
Why does it need PCI mapping in port IO mode?
Post by Ouyang Changchun
+ RTE_PCI_DRV_INTR_LSC,
+ },
+ .eth_dev_init = eth_virtio_dev_init,
+ .dev_private_size = sizeof(struct virtio_hw),
+};
+#else
static struct eth_driver rte_virtio_pmd = {
{
.name = "rte_virtio_pmd",
This is the biggest problem.
You are defining port IO as a different driver instead of providing a way to
choose the method for each virtio device.
I think that you should use devargs to configure the pci device.

Thanks
--
Thomas
Ouyang, Changchun
2015-02-04 01:31:43 UTC
Hi Thomas,
-----Original Message-----
Sent: Friday, January 30, 2015 7:14 AM
To: Ouyang, Changchun
Subject: Re: [dpdk-dev] [PATCH v3 17/25] virtio: Use port IO to get PCI
resource.
Hi Changchun,
Post by Ouyang Changchun
Make virtio not require UIO for some security reasons, this is to match
6Wind's virtio-net-pmd.
Thanks for your effort.
I think port IO is a really interesting option but it needs more EAL rework to
be correctly integrated. Then virtio-net-pmd (http://dpdk.org/browse/virtio-net-pmd/)
will be obsolete and moved in a deprecated area.
Post by Ouyang Changchun
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
+# Only for VIRTIO PMD currently
+CONFIG_RTE_EAL_PORT_IO=n
This is the first problem. We must stop adding new build-time options.
We should be able to choose between PCI mapping and port IO at runtime.
But I don't think choosing between PCI mapping and port IO at runtime is easy.
It means virtio-pmd would need to support both methods. As we discussed before,
port IO can't support the LSC interrupt while PCI mapping can, so the two
conflict; e.g. it is unclear which value the driver flag should take.

So I think a build flag is the better way: it lets virtio-pmd determine its method at compilation time.
Post by Ouyang Changchun
+/** Device needs port IO(done with /proc/ioports) */
+#ifdef RTE_EAL_PORT_IO
+#define RTE_PCI_DRV_PORT_IO 0x0002
+#endif
A flag should never be ifdef'ed.
I can remove the ifdef.
Post by Ouyang Changchun
@@ -574,7 +574,10 @@ rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *d
/* map resources for devices that use igb_uio */
ret = pci_map_device(dev);
if (ret != 0)
- return ret;
+#ifdef RTE_EAL_PORT_IO
+ if ((dr->drv_flags & RTE_PCI_DRV_PORT_IO) == 0)
+#endif
+ return ret;
} else if (dr->drv_flags & RTE_PCI_DRV_FORCE_UNBIND &&
rte_eal_process_type() == RTE_PROC_PRIMARY) {
/* unbind current driver */
Why do you need this ugly return?
Without it, the port-io method returns an error when probing the driver for the virtio device.
As it doesn't use uio, it can't map the BAR to a virtual address.
Here, the port-io method simply doesn't check the return value of pci_map_device.
Post by Ouyang Changchun
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -961,6 +961,71 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev)
start, size);
return 0;
}
+
+#ifdef RTE_EAL_PORT_IO
+/* Extract port I/O numbers from proc/ioports */
+static int virtio_resource_init_by_portio(struct rte_pci_device *pci_dev)
+{
+ uint16_t start, end;
+ int size;
+ FILE *fp;
+ char *line = NULL;
+ char pci_id[16];
+ int found = 0;
+ size_t linesz;
+
+ snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,
+ pci_dev->addr.domain,
+ pci_dev->addr.bus,
+ pci_dev->addr.devid,
+ pci_dev->addr.function);
+
+ fp = fopen("/proc/ioports", "r");
+ if (fp == NULL) {
+ PMD_INIT_LOG(ERR, "%s(): can't open ioports", __func__);
+ return -1;
+ }
+
+ while (getdelim(&line, &linesz, '\n', fp) > 0) {
+ char *ptr = line;
+ char *left;
+ int n;
+
+ n = strcspn(ptr, ":");
+ ptr[n] = 0;
+ left = &ptr[n+1];
+
+ while (*left && isspace(*left))
+ left++;
+
+ if (!strncmp(left, pci_id, strlen(pci_id))) {
+ found = 1;
+
+ while (*ptr && isspace(*ptr))
+ ptr++;
+
+ sscanf(ptr, "%04hx-%04hx", &start, &end);
+ size = end - start + 1;
+
+ break;
+ }
+ }
+
+ free(line);
+ fclose(fp);
+
+ if (!found)
+ return -1;
+
+ pci_dev->mem_resource[0].addr = (void *)(uintptr_t)(uint32_t)start;
+ pci_dev->mem_resource[0].len = (uint64_t)size;
+ PMD_INIT_LOG(DEBUG,
+ "PCI Port IO found start=0x%lx with size=0x%lx",
+ start, size);
+ return 0;
+}
+#endif
This part should be a Linux EAL service.
The port-io method is used only by the virtio pmd and not by any other pmd code, so it is virtio-specific code and
putting it here is a good way.

For a similar reason, the function get_uio_dev for the uio method is also put here.
So this just keeps the uio and port-io methods consistent.
Post by Ouyang Changchun
+#ifdef RTE_EAL_PORT_IO
+static struct eth_driver rte_virtio_pmd = {
+ {
+ .name = "rte_virtio_pmd",
+ .id_table = pci_id_virtio_map,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_PORT_IO |
Why does it need PCI mapping in port IO mode?
Good catch, I need to remove the PCI mapping here in the next patch.
Post by Ouyang Changchun
+ RTE_PCI_DRV_INTR_LSC,
+ },
+ .eth_dev_init = eth_virtio_dev_init,
+ .dev_private_size = sizeof(struct virtio_hw),
+};
+#else
static struct eth_driver rte_virtio_pmd = {
{
.name = "rte_virtio_pmd",
This is the biggest problem.
You are defining port IO as a different driver instead of providing a way to
choose the method for each virtio device.
I think that you should use devargs to configure the pci device.
Do you mean I need a new rte_devtype to handle it?
Currently the implementation checks whether the virtio dev is bound to igb_uio or not.
If it is bound to igb_uio, it uses the uio/mapping method to get the address.
If it isn't bound to igb_uio and it is in the whitelist, it uses the port-io method to get the address.
I don't see any big issue with this logic.

Thanks
Changchun
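
For readers following the /proc/ioports discussion above, here is a minimal sketch of the per-line matching step, assuming lines of the form "  c040-c05f : 0000:00:03.0". The helper name ioports_line_match and its return convention are hypothetical and are not part of DPDK or of the patch; unlike the patch, the sketch checks the sscanf result before using the range.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of DPDK): examine one line of
 * /proc/ioports and report the port range if the owner field starts
 * with pci_id.
 * Returns 1 on a match, 0 when the line belongs to something else,
 * and -1 when the line is malformed.
 */
static int
ioports_line_match(const char *line, const char *pci_id,
		   uint16_t *start, uint16_t *end)
{
	unsigned int lo, hi;
	const char *owner;

	/* the owner name follows the first ':' on the line */
	owner = strchr(line, ':');
	if (owner == NULL)
		return -1;
	owner++;
	while (*owner != '\0' && isspace((unsigned char)*owner))
		owner++;

	if (strncmp(owner, pci_id, strlen(pci_id)) != 0)
		return 0;

	/* %x skips the leading indentation of nested entries */
	if (sscanf(line, "%x-%x", &lo, &hi) != 2 || lo > hi || hi > 0xffff)
		return -1;

	*start = (uint16_t)lo;
	*end = (uint16_t)hi;
	return 1;
}
```

As in the patch, the size of the region would then be end - start + 1.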
Ouyang Changchun
2015-01-29 07:24:04 UTC
Check whether the packet is already vlan-tagged; if so, avoid inserting a
duplicate vlan tag into it.

This can happen when the guest is capable of inserting the vlan tag itself.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
examples/vhost/main.c | 45 ++++++++++++++++++++++++++++-----------------
1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 04f0118..6af7874 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1115,6 +1115,7 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
unsigned len, ret, offset = 0;
const uint16_t lcore_id = rte_lcore_id();
struct virtio_net *dev = vdev->dev;
+ struct ether_hdr *nh;

/*check if destination is local VM*/
if ((vm2vm_mode == VM2VM_SOFTWARE) && (virtio_tx_local(vdev, m) == 0)) {
@@ -1135,28 +1136,38 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
tx_q = &lcore_tx_queue[lcore_id];
len = tx_q->len;

- m->ol_flags = PKT_TX_VLAN_PKT;
+ nh = rte_pktmbuf_mtod(m, struct ether_hdr *);
+ if (unlikely(nh->ether_type == rte_cpu_to_be_16(ETHER_TYPE_VLAN))) {
+ /* Guest has inserted the vlan tag. */
+ struct vlan_hdr *vh = (struct vlan_hdr *) (nh + 1);
+ uint16_t vlan_tag_be = rte_cpu_to_be_16(vlan_tag);
+ if ((vm2vm_mode == VM2VM_HARDWARE) &&
+ (vh->vlan_tci != vlan_tag_be))
+ vh->vlan_tci = vlan_tag_be;
+ } else {
+ m->ol_flags = PKT_TX_VLAN_PKT;

- /*
- * Find the right seg to adjust the data len when offset is
- * bigger than tail room size.
- */
- if (unlikely(vm2vm_mode == VM2VM_HARDWARE)) {
- if (likely(offset <= rte_pktmbuf_tailroom(m)))
- m->data_len += offset;
- else {
- struct rte_mbuf *seg = m;
+ /*
+ * Find the right seg to adjust the data len when offset is
+ * bigger than tail room size.
+ */
+ if (unlikely(vm2vm_mode == VM2VM_HARDWARE)) {
+ if (likely(offset <= rte_pktmbuf_tailroom(m)))
+ m->data_len += offset;
+ else {
+ struct rte_mbuf *seg = m;

- while ((seg->next != NULL) &&
- (offset > rte_pktmbuf_tailroom(seg)))
- seg = seg->next;
+ while ((seg->next != NULL) &&
+ (offset > rte_pktmbuf_tailroom(seg)))
+ seg = seg->next;

- seg->data_len += offset;
+ seg->data_len += offset;
+ }
+ m->pkt_len += offset;
}
- m->pkt_len += offset;
- }

- m->vlan_tci = vlan_tag;
+ m->vlan_tci = vlan_tag;
+ }

tx_q->m_table[len] = m;
len++;
--
1.8.4.2
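
The check this patch adds can be illustrated on a raw byte buffer. This is a minimal sketch assuming a plain Ethernet frame held in contiguous memory; the helper frame_has_vlan_tag and the macro names are hypothetical and do not use the rte_mbuf or ether_hdr types from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define MAC_ADDR_LEN  6       /* bytes per MAC address */
#define ETH_TYPE_VLAN 0x8100  /* 802.1Q tag protocol identifier */

/*
 * Hypothetical helper: return 1 if the Ethernet frame in buf already
 * carries an 802.1Q tag, 0 otherwise. The EtherType field sits right
 * after the two MAC addresses and is stored in network byte order.
 */
static int
frame_has_vlan_tag(const uint8_t *buf, size_t len)
{
	const size_t type_off = 2 * MAC_ADDR_LEN;
	uint16_t ether_type;

	if (len < type_off + 2)
		return 0;
	ether_type = (uint16_t)((buf[type_off] << 8) | buf[type_off + 1]);
	return ether_type == ETH_TYPE_VLAN;
}
```

A caller would only request tag insertion when this returns 0, which mirrors the else branch in the patch.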
Ouyang Changchun
2015-01-29 07:24:05 UTC
Support turning RX VLAN strip on the host on or off; this gives the guest the chance to
use its software VLAN strip functionality.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
examples/vhost/main.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 6af7874..1876c8e 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -159,6 +159,9 @@ static uint32_t num_devices;
static uint32_t zero_copy;
static int mergeable;

+/* Do vlan strip on host, enabled on default */
+static uint32_t vlan_strip = 1;
+
/* number of descriptors to apply*/
static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP;
static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP;
@@ -564,6 +567,7 @@ us_vhost_usage(const char *prgname)
" --rx-retry-delay [0-N]: timeout(in usecond) between retries on RX. This makes effect only if retries on rx enabled\n"
" --rx-retry-num [0-N]: the number of retries on rx. This makes effect only if retries on rx enabled\n"
" --mergeable [0|1]: disable(default)/enable RX mergeable buffers\n"
+ " --vlan-strip [0|1]: disable/enable(default) RX VLAN strip on host\n"
" --stats [0-N]: 0: Disable stats, N: Time in seconds to print stats\n"
" --dev-basename: The basename to be used for the character device.\n"
" --zero-copy [0|1]: disable(default)/enable rx/tx "
@@ -591,6 +595,7 @@ us_vhost_parse_args(int argc, char **argv)
{"rx-retry-delay", required_argument, NULL, 0},
{"rx-retry-num", required_argument, NULL, 0},
{"mergeable", required_argument, NULL, 0},
+ {"vlan-strip", required_argument, NULL, 0},
{"stats", required_argument, NULL, 0},
{"dev-basename", required_argument, NULL, 0},
{"zero-copy", required_argument, NULL, 0},
@@ -691,6 +696,22 @@ us_vhost_parse_args(int argc, char **argv)
}
}

+ /* Enable/disable RX VLAN strip on host. */
+ if (!strncmp(long_option[option_index].name,
+ "vlan-strip", MAX_LONG_OPT_SZ)) {
+ ret = parse_num_opt(optarg, 1);
+ if (ret == -1) {
+ RTE_LOG(INFO, VHOST_CONFIG,
+ "Invalid argument for VLAN strip [0|1]\n");
+ us_vhost_usage(prgname);
+ return -1;
+ } else {
+ vlan_strip = !!ret;
+ vmdq_conf_default.rxmode.hw_vlan_strip =
+ vlan_strip;
+ }
+ }
+
/* Enable/disable stats. */
if (!strncmp(long_option[option_index].name, "stats", MAX_LONG_OPT_SZ)) {
ret = parse_num_opt(optarg, INT32_MAX);
@@ -950,7 +971,9 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
dev->device_fh);

/* Enable stripping of the vlan tag as we handle routing. */
- rte_eth_dev_set_vlan_strip_on_queue(ports[0], (uint16_t)vdev->vmdq_rx_q, 1);
+ if (vlan_strip)
+ rte_eth_dev_set_vlan_strip_on_queue(ports[0],
+ (uint16_t)vdev->vmdq_rx_q, 1);

/* Set device as ready for RX. */
vdev->ready = DEVICE_RX;
--
1.8.4.2
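
The [0|1] validation that the new option relies on can be sketched as a bounded numeric parse. The helper below, parse_bounded_opt, is a hypothetical stand-in written for illustration; it is not the parse_num_opt from examples/vhost/main.c and may differ from it in detail.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a non-negative decimal option value and
 * reject anything above max. Returns the value, or -1 on any error
 * (empty string, trailing characters, overflow, out of range).
 * max is assumed to fit in an int.
 */
static int
parse_bounded_opt(const char *arg, long max)
{
	char *end = NULL;
	long v;

	errno = 0;
	v = strtol(arg, &end, 10);
	if (errno != 0 || end == arg || *end != '\0')
		return -1;
	if (v < 0 || v > max)
		return -1;
	return (int)v;
}
```

With max set to 1, only the strings 0 and 1 are accepted, which matches the usage text added by the patch.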
Ouyang Changchun
2015-01-29 07:24:06 UTC
To keep the logic consistent with the normal Rx path, the mergeable
Rx path also needs software vlan strip/decap if it is enabled.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_rxtx.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index a82e8eb..c6d9ae7 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -568,6 +568,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,
uint16_t nb_pkts)
{
struct virtqueue *rxvq = rx_queue;
+ struct virtio_hw *hw = rxvq->hw;
struct rte_mbuf *rxm, *new_mbuf;
uint16_t nb_used, num, nb_rx = 0;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
@@ -674,6 +675,9 @@ virtio_recv_mergeable_pkts(void *rx_queue,
seg_res -= rcv_cnt;
}

+ if (hw->vlan_strip)
+ rte_vlan_strip(rx_pkts[nb_rx]);
+
VIRTIO_DUMP_PACKET(rx_pkts[nb_rx],
rx_pkts[nb_rx]->data_len);
--
1.8.4.2
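
What a software VLAN strip does to the frame can be shown on a flat buffer. This is a minimal sketch under the assumption of a single contiguous frame; soft_vlan_strip is a hypothetical helper and is not the rte_vlan_strip function from this series, which works on an rte_mbuf.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VLAN_TAG_LEN 4   /* TPID (2 bytes) + TCI (2 bytes) */
#define MAC_PAIR_LEN 12  /* destination + source MAC */

/*
 * Hypothetical helper: remove an 802.1Q tag from the frame in buf.
 * On success the TCI is stored in *tci and the new start of the frame
 * (buf + 4) is returned; NULL is returned if the frame is too short
 * or not tagged.
 */
static uint8_t *
soft_vlan_strip(uint8_t *buf, size_t len, uint16_t *tci)
{
	if (len < MAC_PAIR_LEN + VLAN_TAG_LEN + 2)
		return NULL;
	if (buf[12] != 0x81 || buf[13] != 0x00)
		return NULL;

	*tci = (uint16_t)((buf[14] << 8) | buf[15]);

	/* slide both MAC addresses forward so they overwrite the tag */
	memmove(buf + VLAN_TAG_LEN, buf, MAC_PAIR_LEN);
	return buf + VLAN_TAG_LEN;
}
```

After the move, the inner EtherType directly follows the MAC addresses, so the stripped frame is 4 bytes shorter than the original.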
Ouyang Changchun
2015-01-29 07:24:07 UTC
vHOST zero copy needs to get the vring descriptor and its buffer address to
set the DMA address of the HW ring; this is done in new_device when ioctl set_backend
is called. This requires virtio_dev_rxtx_start to be called before vtpci_reinit_complete,
which makes sure the vring descriptor and its buffer are ready before they are used.

This patch also fixes one set-status issue: according to the virtio spec,
VIRTIO_CONFIG_STATUS_ACK should be set after the virtio hw reset.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index b905532..648c761 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -414,6 +414,7 @@ virtio_dev_close(struct rte_eth_dev *dev)
/* reset the NIC */
vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
vtpci_reset(hw);
+ hw->started = 0;
virtio_dev_free_mbufs(dev);
}

@@ -1107,9 +1108,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
return -ENOMEM;
}

- /* Tell the host we've noticed this device. */
- vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
-
pci_dev = eth_dev->pci_dev;
if (virtio_resource_init(pci_dev) < 0)
#ifdef RTE_EAL_PORT_IO
@@ -1123,6 +1121,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
/* Reset the device although not necessary at startup */
vtpci_reset(hw);

+ /* Tell the host we've noticed this device. */
+ vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
+
/* Tell the host we've known how to drive the device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
virtio_negotiate_features(hw);
@@ -1324,10 +1325,10 @@ virtio_dev_start(struct rte_eth_dev *dev)
if (hw->started)
return 0;

- vtpci_reinit_complete(hw);
-
/* Do final configuration before rx/tx engine starts */
virtio_dev_rxtx_start(dev);
+ vtpci_reinit_complete(hw);
+
hw->started = 1;

/*Notify the backend
--
1.8.4.2
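
The ordering this patch establishes can be sketched against a mock device. The status bit values below are the ones defined for legacy virtio devices; the struct and the mock_* functions are hypothetical stand-ins and do not touch real hardware or the vtpci_* functions.

```c
#include <stdint.h>

/* Status bits as defined for legacy virtio devices. */
#define STATUS_RESET     0x00
#define STATUS_ACK       0x01
#define STATUS_DRIVER    0x02
#define STATUS_DRIVER_OK 0x04

/* Hypothetical stand-in for the device: a status byte and a ring flag. */
struct mock_virtio_dev {
	uint8_t status;
	int rings_ready;
};

static void
mock_reset(struct mock_virtio_dev *d)
{
	d->status = STATUS_RESET;   /* a reset clears every status bit */
	d->rings_ready = 0;
}

static void
mock_set_status(struct mock_virtio_dev *d, uint8_t bit)
{
	d->status |= bit;           /* bits accumulate until the next reset */
}

/* Init-time order: reset first, then ACK, then DRIVER. */
static void
mock_dev_init(struct mock_virtio_dev *d)
{
	mock_reset(d);
	mock_set_status(d, STATUS_ACK);
	mock_set_status(d, STATUS_DRIVER);
}

/*
 * Start-time order: rings first, DRIVER_OK last. Returns the ring
 * state observed at the moment DRIVER_OK is set.
 */
static int
mock_dev_start(struct mock_virtio_dev *d)
{
	int ready_at_driver_ok;

	d->rings_ready = 1;         /* stands in for virtio_dev_rxtx_start() */
	ready_at_driver_ok = d->rings_ready;
	mock_set_status(d, STATUS_DRIVER_OK);
	return ready_at_driver_ok;
}
```

In this model, setting ACK before the reset would be lost because the reset clears the status byte, which is the set-status issue the commit message describes.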
Ouyang Changchun
2015-01-29 07:24:08 UTC
Remove hotspots that are unnecessary when an early return occurs.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_rxtx.c | 31 ++++++++++++++++++++++---------
1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index c6d9ae7..0225cc9 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -476,13 +476,13 @@ uint16_t
virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
struct virtqueue *rxvq = rx_queue;
- struct virtio_hw *hw = rxvq->hw;
+ struct virtio_hw *hw;
struct rte_mbuf *rxm, *new_mbuf;
- uint16_t nb_used, num, nb_rx = 0;
+ uint16_t nb_used, num, nb_rx;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
struct rte_mbuf *rcv_pkts[VIRTIO_MBUF_BURST_SZ];
int error;
- uint32_t i, nb_enqueued = 0;
+ uint32_t i, nb_enqueued;
const uint32_t hdr_size = sizeof(struct virtio_net_hdr);

nb_used = VIRTQUEUE_NUSED(rxvq);
@@ -499,6 +499,11 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)

num = virtqueue_dequeue_burst_rx(rxvq, rcv_pkts, len, num);
PMD_RX_LOG(DEBUG, "used:%d dequeue:%d", nb_used, num);
+
+ hw = rxvq->hw;
+ nb_rx = 0;
+ nb_enqueued = 0;
+
for (i = 0; i < num ; i++) {
rxm = rcv_pkts[i];

@@ -568,17 +573,17 @@ virtio_recv_mergeable_pkts(void *rx_queue,
uint16_t nb_pkts)
{
struct virtqueue *rxvq = rx_queue;
- struct virtio_hw *hw = rxvq->hw;
+ struct virtio_hw *hw;
struct rte_mbuf *rxm, *new_mbuf;
- uint16_t nb_used, num, nb_rx = 0;
+ uint16_t nb_used, num, nb_rx;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
struct rte_mbuf *rcv_pkts[VIRTIO_MBUF_BURST_SZ];
struct rte_mbuf *prev;
int error;
- uint32_t i = 0, nb_enqueued = 0;
- uint32_t seg_num = 0;
- uint16_t extra_idx = 0;
- uint32_t seg_res = 0;
+ uint32_t i, nb_enqueued;
+ uint32_t seg_num;
+ uint16_t extra_idx;
+ uint32_t seg_res;
const uint32_t hdr_size = sizeof(struct virtio_net_hdr_mrg_rxbuf);

nb_used = VIRTQUEUE_NUSED(rxvq);
@@ -590,6 +595,14 @@ virtio_recv_mergeable_pkts(void *rx_queue,

PMD_RX_LOG(DEBUG, "used:%d\n", nb_used);

+ hw = rxvq->hw;
+ nb_rx = 0;
+ i = 0;
+ nb_enqueued = 0;
+ seg_num = 0;
+ extra_idx = 0;
+ seg_res = 0;
+
while (i < nb_used) {
struct virtio_net_hdr_mrg_rxbuf *header;
--
1.8.4.2
Ouyang Changchun
2015-01-29 07:24:09 UTC
A store memory barrier is needed here, so use virtio_wmb instead of virtio_rmb.

Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtqueue.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index 6c45c27..41dda50 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -266,7 +266,7 @@ virtqueue_full(const struct virtqueue *vq)
static inline void
vq_update_avail_idx(struct virtqueue *vq)
{
- virtio_rmb();
+ virtio_wmb();
vq->vq_ring.avail->idx = vq->vq_avail_idx;
}
--
1.8.4.2
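
Why a write-side barrier is the right one here can be sketched in portable C11: the ring slot must be stored before the index that publishes it. The sketch below uses a release store from <stdatomic.h> in place of virtio_wmb(); the struct, the ring size and the function name are hypothetical and are not the driver's vring layout.

```c
#include <stdatomic.h>
#include <stdint.h>

#define MOCK_RING_SIZE 8  /* hypothetical size, must be a power of two */

/* Hypothetical, simplified avail ring shared with a consumer. */
struct mock_avail {
	_Atomic uint16_t idx;
	uint16_t ring[MOCK_RING_SIZE];
};

/*
 * Place one descriptor index in the ring and then publish it.
 * The release store orders the ring-slot write before the idx write,
 * so a consumer that loads idx with acquire semantics also sees the slot.
 */
static void
publish_desc(struct mock_avail *avail, uint16_t *shadow_idx, uint16_t desc)
{
	avail->ring[*shadow_idx & (MOCK_RING_SIZE - 1)] = desc;
	(*shadow_idx)++;
	atomic_store_explicit(&avail->idx, *shadow_idx, memory_order_release);
}
```

A read barrier in the same position would order loads only, which does not help a path that only performs stores.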
Ouyang Changchun
2015-02-09 01:13:49 UTC
This is the patch set for single virtio implementation.

Why we need single virtio?
============================
As we know currently there are at least 3 virtio PMD driver implementations:
A) lib/librte_pmd_virtio(refer as virtio A);
B) virtio_net_pmd by 6wind(refer as virtio B);
C) virtio by Brocade/vyatta(refer as virtio C);

Integrating the 3 implementations into one reduces the maintenance cost and time;
on the other hand, users don't need to try their application on the 3 variants one by one to see
which one is best for them.

What's the status?
====================
Currently virtio A covers most features of virtio B except for using port IO to get the PCI resource,
so there is a patch (17/22) to resolve that. On the other hand, there are a few differences between
virtio A and virtio C, so the features/code of virtio C need to be integrated into virtio A.
This patch set is based on two original RFC patch sets from Stephen Hemminger[***@networkplumber.org]
Refer to [http://dpdk.org/ml/archives/dev/2014-August/004845.html ] for the original one.
This patch set also resolves some conflicts with the latest code, removes duplicated code, and fixes some
issues in the original code.

What this patch set contains:
===============================
1) virtio: Rearrange resource initialization, it extracts a function to setup PCI resources;
2) virtio: Use weaker barriers, as DPDK driver only has to deal with the case of running on PCI
and with SMP, In this case, the code can use the weaker barriers instead of using hard (fence)
barriers. This may help performance a bit;
3) virtio: Allow starting with link down, other driver has similar behavior;
4) virtio: Add support for Link State interrupt;
5) ether: Add soft vlan encap/decap functions, it helps if HW don't support vlan strip;
6) virtio: Use software vlan stripping;
7) virtio: Remove unnecessary adapter structure;
8) virtio: Remove redundant vq_alignment, as vq alignment is always 4K, so use constant when needed;
9) virtio: Fix how states are handled during initialization, this is to match Linux kernel;
10) virtio: Make vtpci_get_status a local function as it is used in one file;
11) virtio: Check for packet headroom at compile time;
12) virtio: Move allocation before initialization to avoid being stuck in middle of virtio init;
13) virtio: Add support for vlan filtering;
14) virtio: Add support for multiple mac addresses;
15) virtio: Add ability to set MAC address;
16) virtio: Free mbuf's with threshold, this makes its behavior more like ixgbe;
17) virtio: Use port IO to get PCI resource for security reasons and match virtio-net-pmd;
18) virtio: Fix descriptor index issue;
19) ether: Fix vlan strip/insert issue;
20) example/vhost: Avoid inserting vlan twice, on both guest and host;
21) example/vhost: Add vlan-strip cmd line option to turn on/off vlan strip on host;
22) virtio: Use soft vlan strip in mergeable Rx path, this makes its logic consistent
with the normal Rx path.

Changes in v2:
23) virtio: Fix zero copy break issue, the vring should be ready before virtio PMD set
the status of DRIVER_OK;
24) virtio: Remove unnecessary hotspots in data path.

Changes in v3:
25) virtio: Fix wmb issue;
26) Fix one minor issue in patch 20, also fix its indentation.

Changes in v4:
27) Fix updating vring descriptor index issue and memory barrier issue;
28) Resolve comments for patch 17.

Changchun Ouyang (10):
virtio: Use port IO to get PCI resource.
virtio: Fix descriptor index issue
ether: Fix vlan strip/insert issue
example/vhost: Avoid inserting vlan twice
example/vhost: Add vlan-strip cmd line option
virtio: Use soft vlan strip in mergeable Rx path
virtio: Fix zero copy break issue
virtio: Remove hotspots
virtio: Fix wmb issue
virtio: Fix updating vring descriptor index issue

Stephen Hemminger (16):
virtio: Rearrange resource initialization
virtio: Use weaker barriers
virtio: Allow starting with link down
virtio: Add support for Link State interrupt
ether: Add soft vlan encap/decap functions
virtio: Use software vlan stripping
virtio: Remove unnecessary adapter structure
virtio: Remove redundant vq_alignment
virtio: Fix how states are handled during initialization
virtio: Make vtpci_get_status local
virtio: Check for packet headroom at compile time
virtio: Move allocation before initialization
virtio: Add support for vlan filtering
virtio: Add suport for multiple mac addresses
virtio: Add ability to set MAC address
virtio: Free mbuf's with threshold

examples/vhost/main.c | 70 +++--
lib/librte_ether/rte_ethdev.h | 8 +
lib/librte_ether/rte_ether.h | 76 +++++
lib/librte_pmd_virtio/Makefile | 2 +
lib/librte_pmd_virtio/virtio_ethdev.c | 519 ++++++++++++++++++++++++++--------
lib/librte_pmd_virtio/virtio_ethdev.h | 12 +-
lib/librte_pmd_virtio/virtio_pci.c | 20 +-
lib/librte_pmd_virtio/virtio_pci.h | 8 +-
lib/librte_pmd_virtio/virtio_rxtx.c | 154 +++++++---
lib/librte_pmd_virtio/virtqueue.h | 59 +++-
10 files changed, 729 insertions(+), 199 deletions(-)
--
1.8.4.2
Ouyang Changchun
2015-02-09 01:13:50 UTC
For clarity make the setup of PCI resources for Linux into a function rather
than block of code #ifdef'd in middle of dev_init.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 76 ++++++++++++++++++++---------------
1 file changed, 43 insertions(+), 33 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index b3b5bb6..662a49c 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -794,6 +794,41 @@ virtio_has_msix(const struct rte_pci_addr *loc)

return (d != NULL);
}
+
+/* Extract I/O port numbers from sysfs */
+static int virtio_resource_init(struct rte_pci_device *pci_dev)
+{
+ char dirname[PATH_MAX];
+ char filename[PATH_MAX];
+ unsigned long start, size;
+
+ if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname)) < 0)
+ return -1;
+
+ /* get portio size */
+ snprintf(filename, sizeof(filename),
+ "%s/portio/port0/size", dirname);
+ if (parse_sysfs_value(filename, &size) < 0) {
+ PMD_INIT_LOG(ERR, "%s(): cannot parse size",
+ __func__);
+ return -1;
+ }
+
+ /* get portio start */
+ snprintf(filename, sizeof(filename),
+ "%s/portio/port0/start", dirname);
+ if (parse_sysfs_value(filename, &start) < 0) {
+ PMD_INIT_LOG(ERR, "%s(): cannot parse portio start",
+ __func__);
+ return -1;
+ }
+ pci_dev->mem_resource[0].addr = (void *)(uintptr_t)start;
+ pci_dev->mem_resource[0].len = (uint64_t)size;
+ PMD_INIT_LOG(DEBUG,
+ "PCI Port IO found start=0x%lx with size=0x%lx",
+ start, size);
+ return 0;
+}
#else
static int
virtio_has_msix(const struct rte_pci_addr *loc __rte_unused)
@@ -801,6 +836,12 @@ virtio_has_msix(const struct rte_pci_addr *loc __rte_unused)
/* nic_uio does not enable interrupts, return 0 (false). */
return 0;
}
+
+static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused)
+{
+ /* no setup required */
+ return 0;
+}
#endif

/*
@@ -831,40 +872,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
return 0;

pci_dev = eth_dev->pci_dev;
+ if (virtio_resource_init(pci_dev) < 0)
+ return -1;

-#ifdef RTE_EXEC_ENV_LINUXAPP
- {
- char dirname[PATH_MAX];
- char filename[PATH_MAX];
- unsigned long start, size;
-
- if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname)) < 0)
- return -1;
-
- /* get portio size */
- snprintf(filename, sizeof(filename),
- "%s/portio/port0/size", dirname);
- if (parse_sysfs_value(filename, &size) < 0) {
- PMD_INIT_LOG(ERR, "%s(): cannot parse size",
- __func__);
- return -1;
- }
-
- /* get portio start */
- snprintf(filename, sizeof(filename),
- "%s/portio/port0/start", dirname);
- if (parse_sysfs_value(filename, &start) < 0) {
- PMD_INIT_LOG(ERR, "%s(): cannot parse portio start",
- __func__);
- return -1;
- }
- pci_dev->mem_resource[0].addr = (void *)(uintptr_t)start;
- pci_dev->mem_resource[0].len = (uint64_t)size;
- PMD_INIT_LOG(DEBUG,
- "PCI Port IO found start=0x%lx with size=0x%lx",
- start, size);
- }
-#endif
hw->use_msix = virtio_has_msix(&pci_dev->addr);
hw->io_base = (uint32_t)(uintptr_t)pci_dev->mem_resource[0].addr;
--
1.8.4.2
Ouyang Changchun
2015-02-09 01:13:51 UTC
The DPDK driver only has to deal with the case of running on PCI
and with SMP. In this case, the code can use the weaker barriers
instead of using hard (fence) barriers. This will help performance.
The rationale is explained in Linux kernel virtio_ring.h.

To make it clearer that this is a virtio thing and not some generic
barrier, prefix the barrier calls with virtio_.

Add missing (and needed) barrier between updating ring data
structure and notifying host.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 2 +-
lib/librte_pmd_virtio/virtio_rxtx.c | 8 +++++---
lib/librte_pmd_virtio/virtqueue.h | 19 ++++++++++++++-----
3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 662a49c..dc47e72 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -175,7 +175,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
uint32_t idx, desc_idx, used_idx;
struct vring_used_elem *uep;

- rmb();
+ virtio_rmb();

used_idx = (uint32_t)(vq->vq_used_cons_idx
& (vq->vq_nentries - 1));
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index c013f97..78af334 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -456,7 +456,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)

nb_used = VIRTQUEUE_NUSED(rxvq);

- rmb();
+ virtio_rmb();

num = (uint16_t)(likely(nb_used <= nb_pkts) ? nb_used : nb_pkts);
num = (uint16_t)(likely(num <= VIRTIO_MBUF_BURST_SZ) ? num : VIRTIO_MBUF_BURST_SZ);
@@ -516,6 +516,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
}

if (likely(nb_enqueued)) {
+ virtio_wmb();
if (unlikely(virtqueue_kick_prepare(rxvq))) {
virtqueue_notify(rxvq);
PMD_RX_LOG(DEBUG, "Notified\n");
@@ -547,7 +548,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,

nb_used = VIRTQUEUE_NUSED(rxvq);

- rmb();
+ virtio_rmb();

if (nb_used == 0)
return 0;
@@ -694,7 +695,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
PMD_TX_LOG(DEBUG, "%d packets to xmit", nb_pkts);
nb_used = VIRTQUEUE_NUSED(txvq);

- rmb();
+ virtio_rmb();

num = (uint16_t)(likely(nb_used < VIRTIO_MBUF_BURST_SZ) ? nb_used : VIRTIO_MBUF_BURST_SZ);

@@ -735,6 +736,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
}
}
vq_update_avail_idx(txvq);
+ virtio_wmb();

txvq->packets += nb_tx;

diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index fdee054..f6ad98d 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -46,9 +46,18 @@
#include "virtio_ring.h"
#include "virtio_logs.h"

-#define mb() rte_mb()
-#define wmb() rte_wmb()
-#define rmb() rte_rmb()
+/*
+ * Per virtio_config.h in Linux.
+ * For virtio_pci on SMP, we don't need to order with respect to MMIO
+ * accesses through relaxed memory I/O windows, so smp_mb() et al are
+ * sufficient.
+ *
+ * This driver is for virtio_pci on SMP and therefore can assume
+ * weaker (compiler barriers)
+ */
+#define virtio_mb() rte_mb()
+#define virtio_rmb() rte_compiler_barrier()
+#define virtio_wmb() rte_compiler_barrier()

#ifdef RTE_PMD_PACKET_PREFETCH
#define rte_packet_prefetch(p) rte_prefetch1(p)
@@ -225,7 +234,7 @@ virtqueue_full(const struct virtqueue *vq)
static inline void
vq_update_avail_idx(struct virtqueue *vq)
{
- rte_compiler_barrier();
+ virtio_rmb();
vq->vq_ring.avail->idx = vq->vq_avail_idx;
}

@@ -255,7 +264,7 @@ static inline void
virtqueue_notify(struct virtqueue *vq)
{
/*
- * Ensure updated avail->idx is visible to host. mb() necessary?
+ * Ensure updated avail->idx is visible to host.
* For virtio on IA, the notificaiton is through io port operation
* which is a serialization instruction itself.
*/
--
1.8.4.2
Ouyang Changchun
2015-02-09 01:13:52 UTC
Starting driver with link down should be ok, it is with every
other driver. So just allow it.

Signed-off-by: Stephen Hemminger <***@networkplumber.org>
Signed-off-by: Changchun Ouyang <***@intel.com>
---
lib/librte_pmd_virtio/virtio_ethdev.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index dc47e72..5df3b54 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -1057,14 +1057,12 @@ virtio_dev_start(struct rte_eth_dev *dev)
vtpci_read_dev_config(hw,
offsetof(struct virtio_net_config, status),
&status, sizeof(status));
- if ((status & VIRTIO_NET_S_LINK_UP) == 0) {
+ if ((status & VIRTIO_NET_S_LINK_UP) == 0)
PMD_INIT_LOG(ERR, "Port: %d Link is DOWN",
dev->data->port_id);
- return -EIO;
- } else {
+ else
PMD_INIT_LOG(DEBUG, "Port: %d Link is UP",
dev->data->port_id);
- }
}
vtpci_reinit_complete(hw);
--
1.8.4.2