Deficiencies of Existing Virtualized User Plane in Commercial 5G Scenarios
3GPP R15 defines three major 5G application scenarios: eMBB (enhanced Mobile Broadband), URLLC (Ultra-Reliable Low-Latency Communications), and mMTC (Massive Machine-Type Communications). The eMBB scenario provides high-traffic mobile broadband services aimed mainly at human users, such as high-speed download, HD video, and VR/AR, with peak rates exceeding 10 Gbps. The URLLC scenario provides ultra-reliable, ultra-low-latency communication. For example, autonomous driving and industrial automation require 99.999% end-to-end reliability and an end-to-end latency of less than 1 ms.
To meet the high-bandwidth and low-latency requirements of these 5G application scenarios, the user plane of the 5G core network should be deployed at the edge or in a regional data center, which shortens the transmission path and greatly reduces the forwarding delay of user-plane packets. However, the 5G core network is designed on the NFV virtualization architecture, and its hardware is generally the x86 general-purpose server. The I/O performance of such servers, in terms of throughput and delay, falls far short of traditional dedicated hardware and cannot meet commercial deployment requirements in 5G scenarios. It is therefore necessary to optimize and accelerate the I/O performance of the virtualized user plane in order to reduce service delay, increase system bandwidth, and achieve better service adaptability. To improve the I/O performance of the virtualized user plane, ZTE has researched and applied optimization and acceleration in two areas: software acceleration and hardware acceleration (mainly smart NIC acceleration). Each is introduced below.
Research and Application of the Virtualized User Plane Software Acceleration Technology
At present, the most commonly used I/O virtualization acceleration technology on the user plane is SR-IOV (Single Root I/O Virtualization). However, SR-IOV alone only brings the I/O performance of the virtualized user plane close to that of bare metal, and it is very difficult to go beyond that. Therefore, building on SR-IOV, ZTE has improved the forwarding procedure for upper-layer service flows and introduced an intelligent self-learning function that learns the rules of service flows. Most service flows can be matched to a learned rule, and the rules are automatically updated as the flows change. Once a service flow matches a rule, it is forwarded in vector mode: the original single stream is expanded into multiple concurrent streams, which improves forwarding efficiency and relieves the forwarding bottleneck.
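The idea of learning a rule per service flow and then forwarding matched packets as vectors can be illustrated with a minimal sketch. It is not ZTE's implementation: the class `LearnedRuleCache`, the function `forward_batch`, the `slow_path` callback, and the string actions are all hypothetical names chosen for illustration, and a flow key stands in for whatever fields the real product matches on.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

# (flow_key, payload); flow_key is an assumed stand-in for the matched header fields
Packet = Tuple[Hashable, bytes]


class LearnedRuleCache:
    """Caches one forwarding action per flow, learned from a slow-path lookup."""

    def __init__(self, slow_path: Callable[[Hashable], str]) -> None:
        self._slow_path = slow_path            # full rule lookup (hypothetical)
        self._rules: Dict[Hashable, str] = {}
        self.misses = 0

    def lookup(self, flow_key: Hashable) -> str:
        action = self._rules.get(flow_key)
        if action is None:                     # first packet of a flow: learn its rule
            self.misses += 1
            action = self._slow_path(flow_key)
            self._rules[flow_key] = action
        return action

    def update(self, flow_key: Hashable, action: str) -> None:
        """Overwrite a learned rule when the service flow changes."""
        self._rules[flow_key] = action


def forward_batch(cache: LearnedRuleCache,
                  batch: Iterable[Packet]) -> Dict[str, List[bytes]]:
    """Group a batch by matched action so each group can be handled as one vector."""
    vectors: Dict[str, List[bytes]] = defaultdict(list)
    for flow_key, payload in batch:
        vectors[cache.lookup(flow_key)].append(payload)
    return dict(vectors)
```

In this sketch only the first packet of each flow pays for the full rule lookup. Later packets hit the cache and are grouped with other packets that share the same action, so the per-packet cost is amortized over the whole vector.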
Figure 1 Performance Comparison between Software Acceleration + SR-IOV and Pure SR-IOV
By combining the improved software acceleration technology with SR-IOV, ZTE virtualized user plane products have greatly improved I/O performance. In lab tests, the overall throughput of a single server roughly doubled compared with SR-IOV alone, reaching 60 Gbps, which is close to the performance limit of a single server. In addition to the lab tests, a 5G AR/VR service test was carried out in the field. Before software acceleration was enabled, the picture began to stutter and show mosaic artifacts as the server approached its capacity threshold, with three stalls within 15 minutes. After software acceleration was enabled, the rate improved greatly: over 30 minutes there were no stalls or mosaic artifacts, and picture smoothness and user experience improved noticeably.
Research and Application of the User-Plane Hardware Acceleration Technology
At present, software acceleration alone cannot meet the commercial requirements of 5G scenarios, so hardware acceleration needs to be introduced. Hardware acceleration of the 5G user plane usually relies on a smart NIC: packet processing previously done by the CPU is offloaded to the smart NIC. Most packets are processed and forwarded directly by the smart NIC itself. Only a few packets (such as the initial packets of a flow and packets with abnormal flow matching) need to be processed by the CPU, which greatly reduces CPU usage, improves performance, and reduces delay. There are several types of smart NIC: FPGA-based, NP-based, and ASIC-based. ZTE 5G user-plane products use the FPGA smart NIC solution, which is the most mature and cost-effective.
Figure 2 Hardware Acceleration Solution for ZTE 5G User Plane Products
The FPGA smart NIC processes data packets as follows. The server CPU creates a service flow table from dynamic service traffic information and delivers it to the smart NIC. The smart NIC processes data flows according to the flow table delivered by the CPU and intelligently learns the data flows. The smart NIC synchronizes the flow table with the CPU in real time. Packets of flows that need to be accelerated are not sent up to the CPU; they are processed and forwarded locally by the smart NIC. This implements hardware-level processing and forwarding of data packets, minimizes CPU handling of service packets, and saves a large amount of CPU processing and system I/O (input/output) resources.
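The division of work between the CPU (slow path) and the smart NIC (fast path) described above can be modeled in a few lines. This is a toy sketch under stated assumptions, not the product's design: `SmartNic`, `HostCpu`, `process_packet`, the five-field flow key, and the `classify` policy with its `fwd:port0`/`fwd:port1` actions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Assumed flow key: src IP, dst IP, src port, dst port, protocol
FlowKey = Tuple[str, str, int, int, str]


@dataclass
class SmartNic:
    """Toy fast path: forwards packets that hit the locally held flow table."""
    flow_table: Dict[FlowKey, str] = field(default_factory=dict)
    fast_path_hits: int = 0

    def install(self, key: FlowKey, action: str) -> None:
        self.flow_table[key] = action

    def try_forward(self, key: FlowKey) -> Optional[str]:
        action = self.flow_table.get(key)
        if action is not None:
            self.fast_path_hits += 1
        return action


@dataclass
class HostCpu:
    """Toy slow path: decides the action for a new flow and offloads it to the NIC."""
    nic: SmartNic
    slow_path_packets: int = 0

    def handle_miss(self, key: FlowKey) -> str:
        self.slow_path_packets += 1
        action = self.classify(key)
        self.nic.install(key, action)  # later packets of this flow stay on the NIC
        return action

    @staticmethod
    def classify(key: FlowKey) -> str:
        # Hypothetical policy, for illustration only
        return "fwd:port0" if key[4] == "udp" else "fwd:port1"


def process_packet(nic: SmartNic, cpu: HostCpu, key: FlowKey) -> str:
    """Try the NIC flow table first; fall back to the CPU only on a miss."""
    action = nic.try_forward(key)
    if action is None:
        action = cpu.handle_miss(key)
    return action
```

If five packets of the same flow pass through `process_packet`, the CPU sees only the first one and the remaining four are counted as fast-path hits on the NIC. This is the effect the text describes as saving CPU processing and system I/O resources.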
The ZTE smart NIC is based on an FPGA. Its hardware logic is programmable, it provides large-capacity flow tables and AI algorithms, and it can intelligently identify services that need acceleration, such as industrial control and Internet of Vehicles services. In addition, the ZTE smart NIC is designed for the standard PCIe (Peripheral Component Interconnect Express) bus and has passed Open Lab testing for compatibility with mainstream general-purpose servers on the market. ZTE smart NICs can be deployed in the central data center (core network) or on edge computing nodes (MEC) to further shorten service forwarding paths and reduce packet delay.
Because a local forwarding flow table is created on the smart NIC, delay-sensitive traffic is processed and forwarded directly on the NIC without passing through the CPU, which greatly reduces forwarding delay and CPU load and improves forwarding efficiency. The average packet delay is reduced from 100 µs to 10 µs, and single-server throughput is increased from 60 Gbps to 180 Gbps. Compared with the software acceleration solution, the FPGA smart NIC acceleration solution reduces forwarding delay by 90%, improves throughput by 200%, and reduces power consumption by 55%, better meeting the special requirements of 5G URLLC and eMBB for the forwarding capability of the edge data center.
Test in a real hybrid-service scenario: with the server near full load, industrial sensor control traffic is carried amid a large volume of background video traffic. When smart NIC acceleration is not enabled, traffic forwarding has reached the processing capacity of the server: the average packet forwarding delay reaches 260 µs, the delay of some packets exceeds 500 µs, and industrial control traffic is treated no differently from video traffic, so both see the same packet delay. When smart NIC acceleration is enabled, the delay of industrial control traffic drops immediately to below 80 µs, a reduction of about 70%.
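The percentages quoted in the two preceding paragraphs follow directly from the before and after figures. The short check below recomputes them; `percent_change` is a helper introduced here for illustration. The 55% power figure is not recomputed because the text gives no underlying values for it.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after` in percent (negative = reduction)."""
    return (after - before) * 100.0 / before


# Figures taken from the text above
latency_change = percent_change(100, 10)       # average packet delay, microseconds
throughput_change = percent_change(60, 180)    # single-server throughput, Gbps
hybrid_change = percent_change(260, 80)        # industrial control delay, microseconds

print(f"latency: {latency_change:.1f}%")
print(f"throughput: {throughput_change:.1f}%")
print(f"hybrid scenario delay: {hybrid_change:.1f}%")
```

The first two results match the stated 90% delay reduction and 200% throughput improvement. The hybrid-scenario result is a reduction of roughly 69%, measured from the 260 µs average to the 80 µs bound, which the text rounds to 70%.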
At Mobile World Congress 2019 in Barcelona, ZTE launched its 5G UPF (User Plane Function, the 5G user-plane network element) product based on smart NIC acceleration. ZTE also demonstrated services and tested the 5G UPF live using a third-party test instrument. Its high throughput, low delay, and other indicators were widely recognized by the industry.
ZTE's research on and application of both software and hardware acceleration technologies have achieved good results in 5G user plane optimization, greatly improving I/O performance such as throughput and delay, so that the virtualized user plane meets the high-bandwidth and low-delay requirements of 5G service scenarios such as the Internet of Vehicles (IoV) and AR/VR. In this way, the 5G network can adopt a unified virtualization platform, helping carriers build a green, energy-saving network with high bandwidth and low delay.