Over the past year, the rapid development of cloud computing, the Internet, and AI has driven very fast growth in data centers, and both the construction of new data centers and the need to improve data center performance have become increasingly urgent. By 2019, global data center optical transceiver shipments reached 10 million, and the market is expected to reach 4.9 billion U.S. dollars in 2021, which is very rapid growth. In earlier technology generations, the main driving force was the telecommunications network: equipment such as routers and optical transmission systems had the more urgent need for bandwidth growth. Taking 100G as an example, it appeared on routers and transmission equipment several years earlier than on data center switches. For 400G, however, CFP modules have already appeared, and the corresponding data center optical transceivers are expected at the end of this year, so the gap has shrunk to about one year, and in the future the demand may arrive at the same time. From another point of view, the data center's requirements for optical transceivers are not the same as those of the telecommunications network: the data center places higher demands on miniaturization, high density, low power consumption, and low cost. In other words, we believe the data center has now become another engine for the development of optical communication technologies.
Discussion on optical interconnection technology of next-generation data center
In addition, we see that data center hardware and software are both moving toward comprehensive openness. The infrastructure of the data center is increasingly a white box to us as end users, rather than a black box we cannot see into. Besides reducing cost, this lets us get closer to the internal technology, so that we can convey our real needs to upstream suppliers faster and turn those needs into reality. This is why a variety of standards emerged in the 100G era, unlike earlier generations: the needs of data center users are also becoming more diverse.
Open Optical Transceiver Experience
The first is developing technical specifications, which is very important. Everyone knows that optical transceivers and AOCs have standards organizations that define all of the optical parameters. Yet when we integrate, we often find that an optical transceiver or AOC plugged into the system is not recognized, does not work, is unstable, or reports incorrect information when accessed. The reason is that, although the standard exists, equipment suppliers and optical transceiver suppliers interpret the specification differently during implementation, or, in order to launch a product quickly, do not implement the standard completely. Examples include differences in how the content of the specification is understood, and the matching of high-speed signals. This is especially true in the 25G era, where optical transceivers and AOCs include CDR and equalization, and the combination of these parameters causes more problems than it did with 10G and 1G optical transceivers.
Second, integration testing is important and needs to be combined with the specification. The problems we find during integration are quickly fed back into our specifications, and together these two activities make the use of optical transceivers in system devices much smoother.
Third, there are challenges in performance, stability, and reliability. We know that the cloud computing business is very critical. As optical communication rates keep rising, the same bit error rate produces errors more often: at 1×10^-12, a 1G link sees an error roughly every 16 minutes, a 10G link every 100 seconds, and a 100G link every 10 seconds. As the volume of data grows, the same error rate becomes more and more noticeable to users. In fact, our data centers do not want to see any errors at all, especially since today's storage business is increasingly sensitive to packet loss. So our performance requirements for optical transceivers are rising, not falling, and our requirements for stability and reliability are also higher. We require optical transceiver manufacturers to pass a 2000-hour test.
Fourth, when we use open optical transceivers, we also have to face the challenges of construction, operation, and maintenance. In the past we bought optical transceivers from the system equipment manufacturer; now the user handles construction, operation, and maintenance. If there is any problem, we have to locate it ourselves.
More importantly, we should keep summarizing what we learn in practice: how to find these problems, both process problems and technical problems, and finally how to clear up the whole process, so that open third-party optical transceivers and AOCs do not cause problems in the data center.
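The error intervals quoted above follow from simple arithmetic. The sketch below is only an illustration, assuming errors are spread evenly so that one error occurs every 1/BER bits; the function name is hypothetical.

```python
# Mean time between bit errors at a fixed bit error rate (BER).
# Assumption: errors are evenly spread, i.e. one error per 1/BER bits.

BER = 1e-12  # bit error rate quoted in the text


def seconds_between_errors(line_rate_bps: float, ber: float = BER) -> float:
    """Average number of seconds between bit errors at the given line rate."""
    bits_per_error = 1.0 / ber
    return bits_per_error / line_rate_bps


for label, rate_bps in [("1G", 1e9), ("10G", 10e9), ("100G", 100e9)]:
    secs = seconds_between_errors(rate_bps)
    print(f"{label}: one error about every {secs:.0f} s ({secs / 60:.1f} min)")
```

At 1G this gives about 1000 seconds, which is the "more than 16 minutes" figure; each tenfold increase in rate shortens the interval tenfold.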
Trend of future data center network evolution
Now let us talk about the direction in which the next-generation data center network will evolve. Our optical interconnection is divided into two parts. The first is from the server to the access switch, which typically uses active optical cable (AOC) as the transmission medium. The second is from the access switch to the core switch, where we use optical transceivers. In terms of rate, the two layers now differ by a factor of 4; earlier, with gigabit and 10 gigabit, the factor was 10.
Because the distance between the server and the access switch is relatively short, this connection generally uses AOC. The distance between switches is usually longer, so there we use optical transceivers plus fiber cable. Past deployments were 10G and 40G, today we deploy 25G and 100G, and in the future we want a 100G and 400G network, with 100G at the access layer and 400G for the interconnection to the core switch. The next step after single-channel 25G would normally be 50G. Why do we skip 50G and 200G? Because we feel that, for upstream suppliers and users alike, putting in so much effort to raise the rate is not worthwhile if the benefit is only a factor of two. We want to jump directly to 100G and 400G.
400G Optical Transceiver Package
Now we introduce the possible packages for future 400G optical transceivers. Optical transceiver packages come in many types, some small and some large. A larger form factor makes it easier to include more optical devices, can provide more interfaces, and allows higher transceiver speeds. CDFP and CFP8 are packages of this kind and represent the early approach. With CDFP or CFP8, only 16 modules fit per U, power consumption can reach 12W, and the maximum bandwidth per U is 6.4T. These two packages are too large, so we do not think they will be the choice for data center switches; they are more likely to be the choice for the telecommunications network. Their electrical interface is 16x25G, which means the transceivers can be used with currently available electrical signal capability.
The packages more likely to be used for 400G optical transceivers in the data center are OSFP and QSFP-DD. Both use an 8x50G electrical interface, and the maximum number of ports per U differs only slightly, 32 and 36. We lean toward QSFP-DD: its size is the same as the previous QSFP28, so for our data center field personnel the size and appearance of the transceiver do not change much, and for those of us who do operation and maintenance it is easier to identify and carries no additional risk. At the same time, it is still small, so system equipment manufacturers can keep the previous density, and our architecture designs can carry over from the past.
QSFP112 is also a 400G package, but in the short term a low-cost 4x100G electrical channel scheme should be difficult to achieve. In addition to these pluggable module packages, there is also an on-board scheme. For data center users, the fact that it cannot be serviced on site is a relatively large pain point, so unless a pluggable package really proves unachievable, we will not choose the on-board scheme.
Now let us highlight our choice of data center transceivers. 10G uses AOC, and 40G mainly uses ESR4, deployed in 2013. 25G and 100G are being deployed this year, but because 100G multimode reaches only 100 meters, we use PSM4 to solve the problem of distances longer than 100 meters. For the future 100G and 400G network, our initial plan is to use 100G SFP56-DD at the access layer. The entire evolution is clear: as access goes from 10G to 25G to 100G, the port density on the switch is maintained, while the bandwidth density increases 2.5 times and 10 times.
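The per-U bandwidth figures can be checked with a one-line calculation. This is a minimal sketch using only the port counts quoted in the text (16, 32, and 36 ports of 400G per U); the function name is hypothetical.

```python
# Front-panel bandwidth per rack unit (U) for 400G ports.
# Port counts come from the text; nothing else about the packages is modelled.

def bandwidth_per_u_tbps(ports_per_u: int, port_rate_gbps: int = 400) -> float:
    """Total front-panel bandwidth of one rack unit, in Tbit/s."""
    return ports_per_u * port_rate_gbps / 1000


for ports in (16, 32, 36):
    print(f"{ports} ports x 400G = {bandwidth_per_u_tbps(ports):.1f}T per U")
```

With 16 ports the result is the 6.4T quoted for the large packages; 32 and 36 ports give 12.8T and 14.4T, which is why the smaller packages are more attractive for switches.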
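The rate relationships between generations can be tabulated as follows. This sketch uses only the access and uplink rates named in the article; the labels and function names are illustrative.

```python
# (label, access rate, uplink rate) in Gbit/s for each generation in the text.
GENERATIONS = [
    ("past", 10, 40),
    ("today", 25, 100),
    ("next", 100, 400),
]


def uplink_ratio(access_gbps: int, uplink_gbps: int) -> float:
    """How many times faster the uplink is than the access port."""
    return uplink_gbps / access_gbps


def access_growth(generations) -> list:
    """Access bandwidth of each generation relative to the first one."""
    base = generations[0][1]
    return [access / base for _, access, _ in generations]


for (label, access, uplink), growth in zip(GENERATIONS, access_growth(GENERATIONS)):
    print(f"{label}: {access}G access, {uplink}G uplink, "
          f"ratio {uplink_ratio(access, uplink):.0f}x, growth {growth}x")
```

With port density held constant, the growth column is also the growth in bandwidth density, which is where the 2.5 times and 10 times figures come from.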
Why choose such a scheme?
In the 10G and 40G era, there were in fact not many standards, mainly 40G SR4, ESR4, and LR4. For connections inside the data center, 300 meters covers most scenarios, and that is why in the 40G era we chose multimode solutions most of the time. For the very small number of links longer than 300 meters, we choose the 40G LR4 Lite or LR4 single-mode solution. 10G access mainly uses AOC, which has few distance restrictions and an acceptable cost.
In today's 25G and 100G era, 100G SR4 multimode technology is relatively mature, but it reaches only 100 meters at most. This distance covers most scenarios, but there are still many connections longer than 70 or 100 meters, and for those we choose 100G PSM4, which is more advantageous. Most data centers in China can mix multimode and single-mode, and only a handful use structured cabling, whereas structured cabling is more common in the United States, possibly with fully single-mode solutions. For 25G access, AOC is still relatively expensive, but its cost is falling quickly. DAC does not match AOC in performance or ease of maintenance, but its cost is currently relatively low, so it still has room for application in some places.
With the next generation of 100G and 400G networks, high-speed signals are getting harder to implement. There are two ways to increase the bandwidth of an optical connection: the first is to increase the bit rate of each channel, and the second is to increase the number of channels. There are in turn two ways to increase the bit rate: the simple one is to raise the baud rate directly, and the other is to keep the baud rate unchanged and use a higher-order modulation format. At gigabit rates and below 10 gigabit, technology was not yet the bottleneck, so we raised the baud rate directly. Above 10G, raising the baud rate becomes more and more difficult, whether electrically or optically, so we have to use modulation to add bandwidth. The other option is to add wavelength channels or fiber channels, which leads to higher cost.
For the future 100G access solution, our analysis is that there will probably be three generations of evolution. The first generation is now: we already have a small number of 100G access scenarios, and with current technology we have to choose QSFP28 transceivers. The second generation will work with the next generation of IC chips, and both the electrical and the optical side will become 2x50G. The third generation, in the future, is single-channel 100G. Within 100G access, each medium has its own application scenario: AOC is mainly responsible for relatively long connections, and copper cable is used for short connections.
For the 400G solution, development is divided into four generations; generally speaking, the optical side advances faster than the electrical side. The first generation has products available now, using the CFP8 optical transceiver package. The electrical signals are still 16x25G, and the multimode optical signals are also 16x25G; there are also 8x50G FR8 and LR8 solutions. In the second generation, the electrical signals are upgraded to 8 channels of 50G. Single mode has FR8 and LR8, and the electrical and optical signals match fully. In the third generation, the electrical signals are still 50G while the optical signals can be upgraded to 100G, and there are three kinds of schemes. 400G SR4 depends on whether multimode technology has the potential to reach a single-channel 100G scheme. In the fourth and last generation, both the electrical and the optical side move to single-channel 100G. Cost should be lowest when the electrical and optical signals match; if they do not match, gearbox technology has to be added.
For the next-generation 100G access solution, we currently favor 100G SR2 AOC at the access layer. This AOC can mainly handle access connections of up to 25 to 30 meters. Its advantages are obvious: long reach and few limitations. The disadvantage is that chip and module development and standardization will progress more slowly, and the initial cost will be relatively high. For the copper connection scheme, the main advantage is that it can build on 25G DAC, so products can be developed quickly. Its shortcomings are also obvious: short reach and a thicker cable, which is a risk for large-scale deployment.
For the 400G optical interconnection scheme, we first look at multimode, where the candidate schemes are SR4.2 and SR8. The potential of multimode is now difficult to extend further, but the cost advantage of VCSELs is very large: if 50G can be implemented at an acceptable cost, the module cost can be kept very low. We do not recommend SR16. Both SR8 and SR4.2 meet our requirements, but based on our previous operation and maintenance experience we still lean toward SR4.2. It carries two channels on each multimode fiber, so it may need wideband multimode fiber, and we think the fiber cost of eight-core wideband multimode is lower. So the multimode scheme can continue in the 400G era. The key is the optical fiber: if the overall cost of fiber plus optical transceivers has an advantage over single mode, multimode still has vitality.
Next is the 400G single-mode scheme. Single mode is clearer and simpler. Because a maximum length of 500 meters covers the vast majority of applications inside our data centers, DR4 should be the main single-mode scheme. It can use the 8-core single-mode fiber already used for PSM4, the fiber cost is acceptable, and it needs no wavelength multiplexing devices, which gives DR4 further advantages in implementation. FR4 also has its application scenario, most likely cross-building links of more than 500 meters. These two schemes are the main solutions we are considering for future data centers.
Next is the 400G optical transceiver package that we want to select. As mentioned earlier, QSFP-DD is one choice: it is pluggable, keeps the same maintenance habits and density as before, is backward-compatible, and can also be upgraded smoothly to the next generation of 400G, so the evolution route is very clear.
As for the package of the 100G optical module, because it becomes two channels, continuing to use the previous 4-channel package would not help miniaturization. We have therefore innovated on the basis of SFP and, together with a number of suppliers, promoted and established the SFP-DD MSA organization, which adds a high-speed signal channel. Its most important significance is that it fills the gap of a two-channel optical transceiver package. We chose it because the package is smaller than QSFP and suits our data centers, and it also maintains compatibility with 25G and 50G, which some customers may need.
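The two levers described above (bit rate per channel and number of channels) combine multiplicatively. The sketch below assumes NRZ (1 bit per symbol) and PAM4 (2 bits per symbol) as the modulation formats; the text itself does not name a format, so PAM4 is an illustrative assumption, and the rates are nominal figures that ignore coding and FEC overhead.

```python
# Nominal link rate = baud rate x bits per symbol x number of channels.
# Assumption: NRZ and PAM4 are used purely as examples of modulation formats.

BITS_PER_SYMBOL = {
    "NRZ": 1,   # two signal levels
    "PAM4": 2,  # four signal levels
}


def link_rate_gbps(baud_gbd: float, modulation: str, channels: int) -> float:
    """Nominal aggregate rate in Gbit/s, ignoring coding and FEC overhead."""
    return baud_gbd * BITS_PER_SYMBOL[modulation] * channels


# Same baud rate, higher-order modulation: each channel doubles from 25G to 50G.
print(link_rate_gbps(25, "NRZ", 4))    # 4 channels of 25G
print(link_rate_gbps(25, "PAM4", 8))   # 8 channels of 50G
# Higher baud rate as well: 4 channels of 100G.
print(link_rate_gbps(50, "PAM4", 4))
```

The second and third calls reach the same aggregate rate with different trade-offs: more channels at a lower baud rate, or fewer channels at a higher one.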
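The gearbox condition can be expressed as a comparison of per-channel rates on the two sides of the module. The following is a rough sketch based on one reading of the four generations described above; the `Lanes` type, the scheme labels, and the pairing of the first-generation 8x50G optics with 16x25G electrical signals are illustrative assumptions.

```python
from typing import NamedTuple


class Lanes(NamedTuple):
    """One side of a module: number of channels and rate per channel (Gbit/s)."""
    count: int
    rate_gbps: int

    @property
    def total_gbps(self) -> int:
        return self.count * self.rate_gbps


def needs_gearbox(electrical: Lanes, optical: Lanes) -> bool:
    """A gearbox is needed when the per-channel rates of the two sides differ."""
    if electrical.total_gbps != optical.total_gbps:
        raise ValueError("both sides must carry the same total bandwidth")
    return electrical.rate_gbps != optical.rate_gbps


# (label, electrical side, optical side), as read from the text.
SCHEMES = [
    ("gen 1, multimode 16x25G", Lanes(16, 25), Lanes(16, 25)),
    ("gen 1, single-mode 8x50G", Lanes(16, 25), Lanes(8, 50)),
    ("gen 2, 8x50G", Lanes(8, 50), Lanes(8, 50)),
    ("gen 3, 4x100G optics", Lanes(8, 50), Lanes(4, 100)),
    ("gen 4, 4x100G", Lanes(4, 100), Lanes(4, 100)),
]

for label, electrical, optical in SCHEMES:
    verdict = "gearbox" if needs_gearbox(electrical, optical) else "direct match"
    print(f"{label}: {verdict}")
```

The matched cases are the ones the text expects to be cheapest, since no rate conversion sits between the electrical and optical channels.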
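The reach thresholds quoted across the generations in this article (300 meters for 40G, 100 meters for 100G, 500 meters for 400G single mode) can be collected into a simple lookup. This is only a sketch of the selection logic as described here: the function and table names are hypothetical, the 400G row covers the single-mode options only, and no upper reach limit is checked for the longer-reach option because the text does not give one.

```python
# rate (Gbit/s) -> (reach threshold in meters, option up to it, option beyond it)
REACH_RULES = {
    40: (300, "40G multimode (SR4/ESR4)", "40G LR4 Lite/LR4"),
    100: (100, "100G SR4", "100G PSM4"),
    400: (500, "400G DR4", "400G FR4"),
}


def pick_optic(rate_gbps: int, distance_m: float) -> str:
    """Return the option the article associates with this rate and distance."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    threshold_m, short_option, long_option = REACH_RULES[rate_gbps]
    return short_option if distance_m <= threshold_m else long_option


print(pick_optic(100, 80))    # within multimode reach
print(pick_optic(100, 150))   # longer than 100 meters
print(pick_optic(400, 800))   # cross-building link
```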