Cloud Computing Drives Data Center Upgrades

DateTime:2018/9/20 0:00:00

---by sinovo telecom r&d team
Cloud computing has become a recognized trend. As a result, data center hardware and software, and the data communication technology that links them, keep advancing and being upgraded. The Open Data Center Committee (ODCC) holds regular activities and committee exchanges, and leading companies steer the development of industry technology. This report summarizes the views of the experts at the ODCC 2017 summit and identifies the key points of current technology and industry trends.

Technical upgrade of server hardware: As artificial intelligence moves from shallow learning to deep learning, AI places higher demands on server computing performance, and AI chips seek performance breakthroughs on heterogeneous platforms. Huge volumes of data require large numbers of servers for computing and processing, which also challenges the data centers that host those servers. Data center energy conservation and emission reduction will become a focus of the entire industry. The ODCC server working group has also opened two sets of storage technology standards to the public. Standardizing board design further reduces R&D and production delivery costs, optimizes the supply chain, and reduces R&D and supply risks.
In-depth development of the Internet drives demand for data center construction: According to a forecast by China's IDC Circle, China's IDC market will reach nearly 140 billion yuan by 2018. The overseas IDC industry is run mainly by professional service providers. As data centers continue to develop, overseas companies keep expanding through mergers and acquisitions, and the industry is entering an integration stage. At present, traditional telecom equipment rooms need to be rebuilt. At the same time, to improve the mobile Internet user experience, and with the development of the industrial Internet and the Internet of Things, much content and computing demand is being pushed to the edge. Miniaturized edge data centers will gradually emerge and take on important roles.
Data center optical interconnect: According to Cisco's forecast, mobile data traffic will grow at a compound annual rate of 53% over 2016-2021, and mobile video traffic at 62%. The development of the mobile Internet industry and the resulting traffic growth drive data center construction. By 2020, global cloud data center traffic will be growing at 30% and will account for more than 90% of data center traffic. Cloud IT facilities are concentrating in super data centers, whose number reaches 485. Growing east-west traffic pushes data center architecture toward flatter designs, which raises demand for high-end optical modules. Meanwhile, data center port rates keep rising: 10G/40G is rapidly being upgraded to 25G/100G and is gradually evolving toward 400G. Open optical modules in the data center are characterized by an open attitude toward new technologies, including simplified packaging, silicon photonics, and onboard optical modules.
Investment advice:

Optimistic about optical device and optical module manufacturers: Massive server shipments and cloud computing are driving the rapid development of data centers, and in turn optical interconnect inside the data center and DCI (data center interconnect) will use large numbers of high-speed optical modules. Supply and demand together determine the continued prosperity of the industry. In addition, according to LightCounting, sales of 100GbE optical modules to data centers will surpass those to the carrier market from 2017, and the main driver of the 100G optical module market will shift from telecom operators to Internet companies. Among the leaders in high-speed optical modules and devices, we recommend focusing on Botron Technology, Tianfu Communication and Acceleration Technology.

Pay attention to IDC operating companies and IDC post-cycle enterprises: Consider the IDC operating companies driven by traffic growth, The industry's new website and Halo. We also suggest paying attention to Invicia, a company that benefits from IDC post-cycle energy conservation and emission reduction.
First, surging data volume drives server hardware technology upgrades

1.1 Heterogeneous Computing in the AI Era - Breakthrough and Compatibility of Computational Capabilities

As artificial intelligence moves from shallow learning to deep learning, AI workloads demand higher server computing performance. Early research relied mainly on existing experience and let the machine search within predefined rules. This kind of statistics-based machine learning shows superiority in many respects, but it is really only a shallow learning model. After 2006, as data volumes and hardware computing capability increased substantially, people began to hand data directly to the computer and let it combine low-level features into more abstract high-level representations (attribute categories or features), discovering distributed feature representations of the data. This is called deep learning. In the whole computing process, the algorithm is the core, and hardware and data are the foundation.

Deep learning, the core technology of artificial intelligence, analyzes and interprets data such as images, sounds, and text by mimicking the neural mechanisms of the human brain. A neural network consists of an input layer, an output layer, and hidden layers. Each layer contains several neurons, and the neurons are connected through weight parameters to form the network.
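To make the layered structure described above concrete, here is a minimal sketch of a forward pass through a small fully connected network. The layer sizes, the sigmoid activation, the random weights, and all function names are illustrative assumptions, not details taken from the report.

```python
import math
import random


def make_layer(n_inputs, n_neurons, rng):
    """Create one fully connected layer: one weight row and one bias per neuron."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases


def sigmoid(x):
    """Squash a weighted sum into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(layers, inputs):
    """Propagate the inputs through every layer in order and return the outputs."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Assumed sizes: 4 inputs -> 5 hidden neurons -> 3 output neurons.
network = [make_layer(4, 5, rng), make_layer(5, 3, rng)]
outputs = forward(network, [0.2, 0.7, 0.1, 0.9])
print(len(outputs), all(0.0 < y < 1.0 for y in outputs))
```

Training, which adjusts the weight parameters from data, is deliberately left out. The sketch only shows how neurons in adjacent layers are tied together by weights.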
As transistor size gradually approaches its physical limit, Moore's Law is likely to lose effect in the future, and AI chips are seeking computing performance breakthroughs on heterogeneous platforms. In the AI field, the sudden increase in data volume and computation raises the demand for computing performance, so powerful devices such as GPUs, FPGAs, and ASICs have come to the fore. Compared with a CPU, a GPU is built from a large number of computational cores, so its large-scale data computing capability far exceeds the CPU's. FPGAs differ from CPUs and GPUs and consume less power than GPUs, while ASICs raise performance at the expense of versatility. Off-the-shelf heterogeneous platforms are already widespread, from data centers to mobile phones, tablets, and wearable devices. Heterogeneous platforms are described in "CPU +" form: CPU + GPU, CPU + FPGA, CPU + TPU, and so on.
Whether driven by the development of IoT or by personal demands for life, office, home, and device management, the computing load placed on mobile phones will keep increasing. But a device cannot grow without limit, and smartphone shipments are even predicted to reach an inflection point and possibly decline in 2018 and 2019. However, migrating smartphone workloads to traditional servers faces obstacles. Smartphone services can use the traditional client-server (CS) model to move the computational load to traditional servers. But smartphones are based on the ARM processor architecture, while traditional servers today use the x86 architecture, so their hardware and software environments differ from those of smart terminals. Many pioneers in the industry have tried running an Android emulator on x86 to host smart-terminal applications and move the computation to the cloud, and they have run into many problems. Recompiling the software is another option, but the workload of recompiling, redeveloping, and re-debugging turns out to be even less acceptable.
1.2 The explosion of data and energy consumption - exploring new cooling solutions

In the coming years data will grow even more explosively, and the number of servers and data centers will rise sharply. In the 1960s there were only about 1 million mainframes, and in the 1980s about 10 million minicomputers. From the private network era around 2000, through the mobile Internet era, the world will enter the Internet of Things era around 2020, when an estimated 50 billion devices will continuously generate large amounts of data. Global data grew from 0.1 ZB in 2005 to 1.2 ZB in 2010, 2.8 ZB in 2012, and 8.5 ZB in 2015. The huge amount of data requires large numbers of servers for computing and processing, which also challenges the data centers that host those servers.

According to the "National Green Data Center Pilot Work Program", China's data centers developed rapidly in 2016: their total number exceeded 400,000, and their annual power consumption exceeded 1.5% of the electricity consumption of the whole society. The PUE (Power Usage Effectiveness) of most of these data centers is still generally above 2.2, a big gap from the advanced international level. Meanwhile, with the rapid development of information technology, global data center construction has accelerated markedly. The total number has exceeded 3 million, their power consumption accounts for 1.1%-1.5% of global electricity consumption, and this high energy use has drawn serious attention from governments.
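The data volumes quoted above imply a compound annual growth rate that is easy to check. The sketch below computes it for each interval and for the whole 2005-2015 span. The function name `cagr` and the dictionary layout are illustrative choices, and only the four figures from the text are used.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Global data volume in zettabytes, as quoted in the text.
data_zb = {2005: 0.1, 2010: 1.2, 2012: 2.8, 2015: 8.5}

years = sorted(data_zb)
for first, last in zip(years, years[1:]):
    rate = cagr(data_zb[first], data_zb[last], last - first)
    print(f"{first}-{last}: {rate:.1%} per year")

overall = cagr(data_zb[2005], data_zb[2015], 2015 - 2005)
print(f"2005-2015 overall: {overall:.1%} per year")
```

On these figures, the overall rate works out to roughly 56% per year, which is the scale of growth that server and data center capacity has to absorb.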
Globally, the PUE of some data centers in the United States is close to 1, making them models of data center energy efficiency. Microsoft has placed a data center on the sea floor, and Facebook has moved a data center to the edge of the frigid Arctic, where the temperature exceeds 30 degrees for no more than 24 hours a year. Beyond the energy use of traditionally designed equipment, AI is also extremely energy-hungry: a 3U GPU server can draw 3.2 kW. According to Alibaba's server usage trend, power density was perhaps 3-5 kW in the past, is 7.2-15 kW now, and may reach 25-40 kW in the future. Liquid cooling becomes the best way to dissipate heat. Data centers are replaced on an 8-10 year cycle and servers on a 3-5 year cycle. Liquid cooling replaces air with liquid to carry heat away from the CPU, memory, and other components. The methods are indirect liquid cooling and direct liquid cooling. The equipment room still needs air cooling equipment, and liquid cooling equipment must be added as well.
The PUE of China's small and medium-sized data centers is generally 2.2-3. Although the PUE of new large data centers has fallen, overall there is still a large gap from the international average of 1.3-2. The resulting cost has seriously affected the business development of operators and service providers. Using "free" cold sources in the external environment to cool the equipment room is expected to become the development direction for the next generation of high-density IDC data centers and equipment rooms. With low energy consumption and environmental protection in mind, Alibaba is currently studying immersion liquid cooling and is trying to design liquid cooling equipment that has a simple structure, a simplified system, lower construction and maintenance costs, a lower PUE, and greater durability. Immersion liquid cooling still faces many challenges, however: first the server itself, second the IDC, and third TCO. Alibaba addresses these through development in new materials technology, new IT equipment, new liquid cooling systems, and new monitoring and management.
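PUE is conventionally defined as total facility power divided by IT equipment power, so the gap between the PUE values quoted above translates directly into non-IT overhead. The sketch below compares two PUE values for the same IT load. The 1,000 kW load and the function names are hypothetical and chosen only for illustration.

```python
HOURS_PER_YEAR = 8760


def facility_power_kw(it_load_kw, pue):
    """Total facility power implied by an IT load and a PUE value."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_load_kw * pue


def overhead_kw(it_load_kw, pue):
    """Power spent on cooling, power distribution, and other non-IT uses."""
    return facility_power_kw(it_load_kw, pue) - it_load_kw


it_load = 1000.0  # hypothetical IT load in kW
for pue in (2.2, 1.3):
    print(f"PUE {pue}: overhead {overhead_kw(it_load, pue):.0f} kW")

# Energy saved per year by moving the same load from PUE 2.2 to PUE 1.3.
saved_kwh = (overhead_kw(it_load, 2.2) - overhead_kw(it_load, 1.3)) * HOURS_PER_YEAR
print(f"Annual saving: {saved_kwh:,.0f} kWh")
```

Under these assumptions, the non-IT overhead falls from about 1,200 kW to about 300 kW, which is why cooling design dominates the discussion of data center energy use.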
Alibaba's liquid cooling research route is divided into five steps: basic technology, component level, system level, small batch, and large scale.

Chart 7: Alibaba's liquid cooling research route. Source: ODCC, National Gold Securities Research Institute

1.3 Large Storage - Tianzhu Cold Storage Innovative Hardware Design

With the development of cloud computing and big data, global data will reach tens of ZB by 2020. This scale effect will foreseeably lead to more large data centers, and the scale of server deployments will also grow explosively. To meet storage capacity requirements, forward-looking storage hardware design and planning are especially important: they must address the data center's capacity needs while also keeping IT equipment energy-efficient.
In response to the rapid growth of global data, Tianzhu cold storage technology provides a low-cost mass storage solution. Based on the Tianyi 2.0 full-cabinet design, the ODCC server working group introduced the cost-effective Cold Storage Technology Specification 1.0 in September 2016. In 2017, the working group opened two sets of standards to the public: Tianzhu Cold Storage 2.0 and the Tianzhu 2.5 public board. Tianzhu Cold Storage Technology Specification 2.0 keeps the mass storage characteristic: a 1U node (a single U) supports up to 18 3.5-inch high-capacity disks. Compared with traditional 2U server designs, storage capacity increases by 50%, effectively reducing the overall cost per machine and per cluster.

Tianzhu Cold Storage 2.0 provides hardware solutions for the business in five main respects:
• Faster computing: based on a low-power SoC processor, with single-SoC computing performance doubled.
• Memory enhancement: supports up to 128 GB of DDR4.
• Open network: compatible with the OCP 2.0 design and flexibly adapts to different network cards.
• Compatible pooling: multi-level cascading through PCIe expansion enables deeper cold storage, and combining hard disks of different capacities (4 TB, 8 TB, 12 TB, 16 TB) can further reduce the cost per machine.
• Tiered storage: compatible support for 2 NVMe SSDs.

Tianzhu cold storage deployments have grown rapidly year by year. Since the industry's first cold storage deployment in 2014, deployments have grown very quickly every year. As of 2017, full-cabinet deployments numbered close to 1,000, effectively supporting the industry's private cloud and public cloud businesses. They serve application needs well, especially in artificial intelligence, which must store and compute on large amounts of data. The Tianzhu cold storage mass storage solution addresses exactly this need. It is hoped that, through the open ODCC platform, Tianzhu cold storage will both meet data centers' demand for mass storage and make the whole industry greener and more energy-efficient.

The 2.5 public version is based on the Tianyi 2.0 design and standardizes the boards and motherboard of the full cabinet. This further reduces R&D and production delivery costs, optimizes the supply chain, reduces R&D and supply risks, and improves the ecosystem of the full-cabinet system. The design features of the Tianzhu 2.5 public version can be summarized in four aspects.
• Architecture: The Tianyi 2.5 public version is based on Tianyi's 21-inch cabinet design and supports 1U1-node, 1U2-node, and 1U3-node designs.
• Normalized motherboard: Multiple manufacturers share a single motherboard design and one set of hardware solutions, with corresponding hardware and software interfaces. The API interfaces and the cabinet management unit follow software standards and are easy to operate through CLI commands.
• Low cost: First, a common motherboard design reduces R&D expenses. Second, on the end-user side, the most suitable design can be provided according to the application requirements of the business.
• Ecosystem supply chain: Multiple suppliers for each item ease cabinet integration and speed up delivery.
Third, data center optical interconnection

3.1 Data Center Leads Optical Interconnect Technology - Taking Alibaba as an Example

In 2017, Alibaba's network had fully deployed 10G/40G networks, along with 25G/100G networks, using open third-party optical modules and AOCs (active optical cables). The development of cloud computing, the Internet of Things, and AI has made Alibaba's data centers grow rapidly.

According to forecasts, data center optical module volume will reach 10 million units by 2019, and the market will reach 4.9 billion U.S. dollars by 2021, a very rapid growth rate. In the past, the main driver of optical technology was the telecommunications network, where routers and optical transmission kept demanding more bandwidth. Data centers require different characteristics from optical modules than telecom networks do: they place higher demands on small size, high density, low power consumption, and low cost. It can therefore be said that data centers are now driving optical communication technology in a new direction.
Open optical modules need to cover technical specifications, testing, and stability.

Key points in developing open optical modules and AOCs:

Technical specifications
• Standards organizations such as SLF and IEEE define all optoelectronic parameters of optical modules and AOCs
• Reduces problems during integration

Integration testing
• Problems found in technical specifications or integrated design should be fed back quickly

Stability
• The IEEE-defined bit error rate is 1×10^-12, which may mean one bit error about every 16 minutes per I/O at 1G and about every 100 s at 10G

Construction, operation and maintenance
• Users build and operate their own

Experience summary
• Across all open third-party optical modules and AOCs, no problems have appeared in the data center
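The stability figures above follow from simple arithmetic: at a bit error rate of 1×10^-12, a link sees on average one errored bit per 10^12 bits sent, so the mean time between errors is 1 / (line rate × BER). A minimal sketch, assuming nominal line rates of exactly 1, 10, and 100 Gbit/s and a hypothetical function name:

```python
def seconds_between_bit_errors(line_rate_bps, bit_error_rate=1e-12):
    """Mean time between bit errors on a link running continuously at line rate."""
    if line_rate_bps <= 0 or bit_error_rate <= 0:
        raise ValueError("line rate and bit error rate must be positive")
    return 1.0 / (line_rate_bps * bit_error_rate)


# Nominal rates are an assumption; real links carry encoding overhead.
for label, rate_bps in (("1G", 1e9), ("10G", 10e9), ("100G", 100e9)):
    secs = seconds_between_bit_errors(rate_bps)
    print(f"{label}: one bit error about every {secs:.0f} s ({secs / 60:.1f} min)")
```

At 1G this gives about 1,000 s, a little over 16 minutes, and at 10G about 100 s, matching the figures in the list. The same budget shrinks to roughly 10 s at 100G, which is one reason stability gets its own entry.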
At present, optical interconnection in Alibaba's data centers is divided into two parts. The first is server access to the switch, a layer that usually uses AOCs as the transmission medium. The second is the optical modules on the links going up to the core switches, which run at 4 times the access rate.

Exhibit 26: Alibaba data center optical interconnection rates

Network architecture    40G    100G    400G
Switch connection       40G    100G    400G
Server access           10G    25G     100G

Source: ODCC, National Gold Securities Research Institute

From the packaging point of view, CDFP and CFP8 fit 16 units in 1RU of space, with a power consumption of 12 W and a bandwidth of 6.4T per RU. But Alibaba's technical experts do not think these will be the choice for data center switches. They see them as the choice for telecom networks, and think that 400G optical modules for the data center should use OSFP and QSFP-DD packages.
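Two figures in this passage can be checked with a few lines of arithmetic: the 6.4T per RU quoted for 16 modules of 400G, and the fourfold step between server access and switch interconnect rates in Exhibit 26. The sketch below assumes that the 12 W figure applies per module, which the text does not state explicitly. The function and variable names are illustrative.

```python
def front_panel_bandwidth_tbps(modules_per_ru, module_rate_gbps):
    """Aggregate bandwidth of one rack unit of pluggable modules, in Tbit/s."""
    return modules_per_ru * module_rate_gbps / 1000.0


# 16 modules of 400G per RU, as described for CDFP and CFP8.
bandwidth = front_panel_bandwidth_tbps(16, 400)
# Assumption: 12 W is the power of each module, not of the whole RU.
power_w = 16 * 12
print(f"{bandwidth} Tbit/s per RU, about {power_w} W for the modules")

# Rate generations from Exhibit 26, in Gbit/s.
server_access = [10, 25, 100]
switch_links = [40, 100, 400]
ratios = [up / down for up, down in zip(switch_links, server_access)]
print(ratios)
```

Every generation keeps the same 4:1 ratio between the switch link and the server link, so the access and interconnect layers are upgraded in step.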
3.2 Optical Module: The Engine That Carries Traffic

The optical device industry chain can be divided into optical chips, optical components, optical devices, and optical modules. Optical devices fall into two categories, "optical passive devices" and "optical active devices". Optical passive devices connect and couple optical signals without requiring photoelectric conversion. They include ceramic sleeves, ceramic ferrules, fiber optic adapters, and other passive devices. The optical module is an optical active device.

Chart 32: Optical device industry chain. Source: National Securities Research Institute

Optical chips account for a high share of cost and occupy the commanding height of the optical device industry chain. In general, chips make up 30%-40% of the cost of optical devices and optical modules. For higher-end optical devices and modules, such as 100G optical modules, the chips have a high technology content and are controlled mainly by a handful of foreign companies, and their share of the cost may even exceed 50%.
Optical passive devices include fiber optic connectors, optical splitters, and wavelength division multiplexers and demultiplexers. The fiber optic connector is mainly used for non-permanent fixed connections in the system between pieces of equipment, between equipment and instruments, between equipment and fiber, and between fibers. It is an indispensable passive component in optical communication systems and is currently the most widely used optical passive component.

3.3 Cloud Data Center: Flat Structure Increases Mass Demand

New services such as cloud computing, virtualization, and hyper-convergence make data communication within the data center more and more intensive. As a result, traffic inside the data center is much higher than the traffic between the data center and the outside. According to Cisco's statistics, data center internal traffic accounts for 70% of the entire network. Traffic among servers, storage nodes, and network devices inside the IDC (east-west traffic) has grown explosively.
Cloud data centers require data exchange between servers, so their traffic is dominated by east-west traffic exchanged between servers inside the data center. Based on Cisco's forecast, more than 80% of cloud data center traffic is east-west. The shift from mainly "north-south" to mainly "east-west" traffic pushes the data center toward the leaf-spine network structure, which facilitates east-west transmission. The structure consists of a leaf switching layer and a spine switching layer, and every leaf switch node in the cluster is connected to every spine switch node.

Exhibit 34: Leaf-spine network architecture
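The defining property of the leaf-spine structure, every leaf connected to every spine, can be sketched in a few lines. The switch counts, node names, and function names below are hypothetical. The sketch only illustrates why any two leaves are always two hops apart and how many optical links the full mesh needs.

```python
from itertools import product


def build_leaf_spine(n_leaves, n_spines):
    """Return leaf names, spine names, and the full mesh of leaf-to-spine links."""
    leaves = [f"leaf{i}" for i in range(n_leaves)]
    spines = [f"spine{j}" for j in range(n_spines)]
    links = set(product(leaves, spines))  # every leaf connects to every spine
    return leaves, spines, links


def paths_between(leaf_a, leaf_b, spines, links):
    """All leaf -> spine -> leaf paths between two different leaves."""
    return [(leaf_a, spine, leaf_b) for spine in spines
            if (leaf_a, spine) in links and (leaf_b, spine) in links]


# Hypothetical fabric: 4 leaf switches and 2 spine switches.
leaves, spines, links = build_leaf_spine(4, 2)
paths = paths_between("leaf0", "leaf3", spines, links)
print(len(links), "links;", len(paths), "equal-length paths between leaf0 and leaf3")
```

The link count is the product of the two layer sizes, so adding spines for more east-west capacity multiplies the number of optical links, which ties the flat architecture to optical module demand.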
According to LightCounting, sales of 100GbE optical modules to data centers will exceed those to the operator market in 2017, and the main driver of the 100G optical module market will shift away from telecom operator networks. According to Ovum's prediction, in the data center high-speed optical module market, 2017 sales of 100G multimode and single-mode optical modules will exceed those of 40G.
At the same time, China still lags far behind the foreign giants. The capital expenditure of major cloud service providers worldwide is expanding, and the capital expenditure of domestic BAT (Baidu, Alibaba, Tencent) is still far below that of the foreign giants. As cloud computing and big data spread in China, BAT's data center capital expenditure will continue to increase.

U.S. data centers are currently transitioning from 40G to 100G. In the future, China will surely follow the United States toward high-speed, large-scale data centers, and optical module procurement will also move to the high end. Cloud computing leader Alibaba Cloud has entered the world's first echelon of cloud computing and achieved revenue of 5.566 billion yuan in 2016, up 138% year on year. Alibaba Cloud is estimated to exceed the 10 billion yuan mark in 2017. Tencent has also announced that it will keep investing 2 billion yuan a year in the construction and operation of Tencent Cloud's infrastructure over the next 5 years. So comprehensive judgment
3.4 New Technologies: Simplified Packaging, Silicon Photonics Technology
The data center market is one that defines optical module working conditions reasonably, makes full use of optical module capability, and continuously optimizes cost-effectiveness. Open optical modules in the data center are likewise based on new data center applications, such as custom temperature ranges and custom transmission distances. Data center applications are characterized by openness to new technologies. New technologies include simplified packaging, silicon photonics, and onboard optical modules, and new standards include 1310/1550 nm, QSFP, OSFP, and so on.
The optical module combines optics and electronics. Its basic function is to perform electro-optic conversion using a laser, and photoelectric conversion, so that data can be exchanged. The simplification of optical module packaging can be divided into four stages: traditional TO/BOX packages, COB packages, onboard optical components, and large-scale optoelectronic integration. Packaging technology is currently at the COB stage.

From a cost perspective, packaging and optoelectronic chips account for 80%-90% of the total cost of an optical module. We believe continuous optimization of packaging technology and chip technology is a major development direction for optical modules.

Figure 38: Evolution of optical module packaging. Source: ODCC, National Gold Securities Research Institute

TO (Transistor Outline) is an early package specification. Types such as TO-92, TO-92L, and TO-220 are through-hole (plug-in) package designs. In recent years, market demand for surface mounting has increased, and TO packages have also moved to surface-mount forms such as TO-252.
COB stands for Chip on Board: a bare chip is attached to the interconnect substrate with conductive or non-conductive adhesive and then wire bonded to make the electrical connection. A bare chip exposed directly to air is susceptible to contamination or man-made damage that can impair or destroy its function, so the chip and bonding wires are then encapsulated with glue.

Packaging stages (figure labels):
• TO/BOX package: hermetically protected optical port package
• COB package: removes TO/BOX; mature for multimode
• Onboard optical components: high port density; low loss
• Optoelectronic integration: System-in-Package

Optoelectronic integrated circuits fall into two categories. One converts optical information into electrical information and consists of photodetectors, amplifiers, and bias circuits. The other converts electrical information into optical information and consists of a light emitting device, a driving circuit, and a bias circuit.

For onboard optical components and large-scale optoelectronic integration, silicon photonics technology is of great use. The advantages of silicon photonics technology mainly include four points, spanning cost and performance.
The advantage of silicon semiconductor materials is that they can use the effective manufacturing equipment of the integrated circuit field to reduce cost and improve system integration. Under Moore's Law, the unit price of integrated circuits has fallen year after year. Compared with the traditional, manual manufacturing process for optical transceivers, silicon photonics reduces transceiver cost. In addition, silicon photonics is expected to significantly reduce the energy consumption and size of optical transceivers.

According to reports, Facebook is optimistic that silicon photonics will bring the cost of 100G CWDM4 down to 100 dollars. The IDC market is therefore expected to become the direction for commercial use of silicon photonics. According to a Yole Development forecast, the silicon photonics market will grow at a compound annual rate approaching 40% between 2013 and 2024, reaching 700 million U.S. dollars by 2024. In this process, demand from data centers may bring the silicon photonics market a breakthrough in 2018.