
NVMe/TCP Q&A: Technical Overview and Answers to Common Questions

This article provides a comprehensive technical overview of NVMe over TCP, covering its specification status, required host components, namespace limits, latency impact, OS support, performance considerations with DPDK, TCP congestion control, R2T handling, upgrade procedures, open‑source implementations, comparisons with iSCSI, NVMe/FC, and NVMe/RDMA, as well as practical guidance for data‑center deployments.

Architects' Tech Alliance

NVM Express, Inc. recently added NVMe over TCP (NVMe/TCP) to the NVMe transport family, marking a significant development for NVMe‑oF.

The author’s 2019 webcast explored the advantages and features of the new specification; the recording is available at https://www.brighttalk.com/webcast/12367/348656.

In this blog, the author answers the audience's NVMe/TCP questions that went unanswered during the webcast.

1. Official documentation for NVMe/TCP? The approved Technical Proposal TP‑8000 will be integrated into the NVMe‑oF 1.1 specification, with an expected release later this year.

2. Host requirements (hardware, firmware, software, drivers)? No special hardware/firmware is needed; the host software (NVMe/TCP host) and NVM subsystem can run on Linux Kernel v5.0, SPDK v19.01, and commercial target devices.
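To illustrate how lightweight the transport framing is, the sketch below packs the 8‑byte common header that begins every NVMe/TCP PDU (PDU‑Type, FLAGS, HLEN, PDO, PLEN), as defined in TP‑8000. The field values are illustrative, not taken from a real capture.

```python
import struct

# NVMe/TCP common PDU header (8 bytes), per TP-8000:
#   PDU-Type (1B) | FLAGS (1B) | HLEN (1B) | PDO (1B) | PLEN (4B, little-endian)
CH_FORMAT = "<BBBBI"

def pack_common_header(pdu_type: int, flags: int, hlen: int,
                       pdo: int, plen: int) -> bytes:
    """Pack the 8-byte common header that starts every NVMe/TCP PDU."""
    return struct.pack(CH_FORMAT, pdu_type, flags, hlen, pdo, plen)

# Example: an ICReq (Initialize Connection Request) PDU is type 0x00 and is
# 128 bytes of header with no data, so HLEN and PLEN are both 128.
icreq_header = pack_common_header(pdu_type=0x00, flags=0, hlen=128, pdo=0, plen=128)
assert len(icreq_header) == 8
```

A real host would follow this ICReq with the controller's ICResp before exchanging command capsules; the kernel driver handles all of this transparently.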

3. Namespace count limits and host resources? NVMe/TCP imposes no limits on the number of namespaces; they are purely logical and do not allocate host resources.

4. Does NVMe/TCP add latency for directly attached NVMe SSDs? It generally adds some latency compared with direct attachment, though specific controller implementations can mitigate it.

5. Operating system kernel support? Linux kernel versions 5.0 and newer support NVMe/TCP.

6. Performance differences when running NVMe/TCP on a DPDK‑based data‑plane? On platforms with sufficient CPU resources, NVMe/TCP over the generic Linux network stack performs comparably; DPDK can yield better performance when the controller cannot dedicate enough CPU to the Linux network stack.

7. Recommendation to use Data Center TCP for NVMe/TCP workloads? Data Center TCP may provide better congestion control, but overall NVMe/TCP benefits from TCP’s flow control mechanisms.
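If the fabric supports ECN marking end to end, DCTCP can be enabled on Linux hosts and targets with two sysctls. This is a deployment sketch only; verify that the switches actually mark ECN before rolling it out, since DCTCP without ECN support degrades badly.

```conf
# /etc/sysctl.d/nvme-tcp.conf (illustrative)
net.ipv4.tcp_congestion_control = dctcp
net.ipv4.tcp_ecn = 1
```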

8. Multiple R2T support compared to FCP buffers? Controllers can send multiple lightweight R2T PDUs, subject to a MAXR2T limit; the credit mechanism is analogous to FCP’s credit system but operates at the NVMe command level.
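The R2T accounting can be pictured with a toy model (hypothetical names; the real host/controller logic lives in the kernel driver): the controller never has more than MAXR2T grants outstanding for a command, and the host only sends H2CData PDUs in response to a grant.

```python
from collections import deque

def transfer(total_len: int, r2t_len: int, maxr2t: int) -> int:
    """Simulate one write command: the controller solicits data with R2T
    PDUs, keeping at most `maxr2t` grants outstanding at any time.
    Returns the number of R2T PDUs issued."""
    next_offset = 0        # next byte the controller will solicit
    received = 0           # bytes the controller has received
    outstanding = deque()  # outstanding R2T grants: (offset, length)
    r2t_count = 0

    while received < total_len:
        # Controller side: issue grants up to the MAXR2T limit.
        while len(outstanding) < maxr2t and next_offset < total_len:
            length = min(r2t_len, total_len - next_offset)
            outstanding.append((next_offset, length))
            next_offset += length
            r2t_count += 1
        # Host side: answer the oldest grant with an H2CData PDU.
        _, length = outstanding.popleft()
        received += length

    return r2t_count

# 1 MiB write, 64 KiB per R2T, at most 4 grants in flight -> 16 R2Ts
print(transfer(1 << 20, 64 << 10, maxr2t=4))  # 16
```

Because each R2T is a small PDU rather than a pre‑allocated buffer, keeping several in flight is cheap, which is why the mechanism scales differently from FCP's buffer credits.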

9. Traffic management – R2T and standard TCP congestion window? End‑to‑end flow control is handled by TCP/IP, while NVMe‑oF uses the R2T credit mechanism for transport‑level flow control.

10. PDU ordering constraints for multiple outstanding SQ requests? No ordering constraints; PDUs for different NVMe commands are unordered.
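Because PDUs for different commands are unordered, a host matches each completion to its command by Command Identifier (CID) rather than by arrival order. The sketch below is a simplified illustration; real drivers track far more state per command.

```python
# Outstanding commands, keyed by Command Identifier (CID).
outstanding = {}

def submit(cid: int, desc: str) -> None:
    """Record a command as submitted on this queue pair."""
    outstanding[cid] = desc

def complete(cid: int) -> str:
    """Retire a command; completions may arrive in any order."""
    return outstanding.pop(cid)

submit(1, "read LBA 0x1000")
submit(2, "read LBA 0x2000")
submit(3, "write LBA 0x3000")

# Completions arrive out of submission order -- that's fine.
for cid in (3, 1, 2):
    complete(cid)
assert not outstanding  # every command was matched and retired
```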

11. Patch and upgrade management – is it non‑destructive? Large‑scale environments should follow vendor‑specific rollback procedures; the NVMe/TCP protocol itself does not enforce or forbid such operations.

12. Open‑source implementations? Both Linux and SPDK include NVMe/TCP target implementations.

13. Equivalent implementation in iSCSI? iSCSI does not have a direct NVMe/TCP equivalent, though both use TCP/IP for transport.

14. Performance comparison with NVMe/FC? While not tested by the author, NVMe/TCP is expected to exhibit a modest performance drop compared to directly attached NVMe, similar to NVMe/FC.

15. CPU utilization data for NVMe/RoCE vs. NVMe/TCP? No official data; NVMe/TCP generally consumes more CPU than NVMe/RDMA because the latter offloads part of the protocol to hardware.

16. Pros and cons of NVMe/TCP vs. NVMe/RDMA? NVMe/TCP offers hardware‑agnostic scalability without requiring network changes, whereas NVMe/RDMA can achieve lower latency and CPU usage when supported.

17. Can NVMe/TCP and NVMe/RDMA coexist on the same 100 Gb/s Ethernet? Yes; both run over Ethernet and can share the same physical network.

18. Maximum tolerable latency for NVMe PDU on Ethernet switches? NVMe/TCP does not define a maximum latency; the default Keep‑Alive timeout is two minutes.
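The Keep‑Alive behavior amounts to a simple deadline check, sketched below with hypothetical names (the actual timer lives in the host driver and controller):

```python
DEFAULT_KATO_SECONDS = 120  # the two-minute default mentioned above

def connection_expired(last_keep_alive: float, now: float,
                       kato: float = DEFAULT_KATO_SECONDS) -> bool:
    """A controller tears down the association if no Keep Alive command
    has arrived within the Keep Alive Timeout (KATO)."""
    return (now - last_keep_alive) > kato

assert not connection_expired(last_keep_alive=0.0, now=60.0)
assert connection_expired(last_keep_alive=0.0, now=121.0)
```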

In summary, the NVMe/TCP transport binding specification is publicly available and adds TCP as a new transport alongside PCIe, RDMA, and FC, supporting optional features such as inline data integrity (DIGEST) and TLS. It enables efficient end‑to‑end NVMe operations over standard IP networks, facilitating large‑scale data‑center deployments without requiring specialized hardware.

Tags: performance, Linux, storage, Networking, NVMe, SPDK, NVMe/TCP
Written by

Architects' Tech Alliance

Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
