OPPO, which describes itself as an explorer of ultimate technology, is committed to building smart devices and services for the era of interconnectivity, with the aim of making life better. According to market reports on the global mobile phone industry, OPPO shipped over 100 million smartphones in 2023 alone, making it one of the world's leading smartphone suppliers. ColorOS now provides comprehensive coverage of system applications, serving 600 million users worldwide.
With the explosive growth of audio and video applications and artificial intelligence technology, OPPO Cloud Service, the main carrier of future cloud-based applications, holds not only massive user data but also OPPO's own analytical data, such as computing models and algorithm training data. OPPO's focus is on how to serve hundreds of millions of users around the world while protecting and securing their data as the user base grows.
For enterprises with massive amounts of data, it is impractical to store everything on a single storage medium. Different storage media offer different capacities and performance depending on requirements and functionality, so costs vary from one medium to another. Drawing on years of experience, storage vendors and users have jointly introduced tiered storage solutions to address these storage pain points.
He Xiaochun, head of OPPO's cloud computing department, said: "As an explorer in the mobile phone industry, OPPO has to cope with substantial annual growth in the data volume of its internal business units and a continuous increase in the backup data of mobile phone users. Weighing two goals, securing all stored data and reducing storage costs, our research and development team analyzed and compared the characteristics of various storage media. In 2023 we decided to adopt a tape storage system and began to trial the deployment of a long-term archiving solution for massive data."
He Xiaochun, Head of OPPO's Cloud Computing Department
Massive data can be stored on different storage media and managed in a distributed, scalable way through cloud computing. Based on access frequency, data falls into three categories: hot data, which must be immediately available; cold data, which has not been active for more than a year and can be archived outside daily processes but may still be recalled at some point; and warm data, which falls between the two.
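The three categories can be sketched as a simple classifier on idle time. This is a minimal illustration, not OPPO's implementation: the one-year cold threshold comes from the text, while the 7-day hot threshold and the function name `classify_temperature` are assumptions made for the example.

```python
from datetime import datetime, timedelta

# From the article: cold data has not been active for more than a year.
COLD_AFTER = timedelta(days=365)
# Assumption: the article gives no threshold for hot data; 7 days is illustrative.
HOT_WITHIN = timedelta(days=7)


def classify_temperature(last_access: datetime, now: datetime) -> str:
    """Return 'hot', 'warm', or 'cold' based on how long the data has been idle."""
    idle = now - last_access
    if idle > COLD_AFTER:
        return "cold"
    if idle <= HOT_WITHIN:
        return "hot"
    return "warm"  # everything between the two thresholds


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    for days in (1, 100, 400):
        print(days, classify_temperature(now - timedelta(days=days), now))
```

A real system would likely weigh access counts and business rules as well as the last access time.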
OPPO handles a huge amount of data, and the volume grows dramatically every year. To manage this data better and keep it safe, OPPO gauges the "temperature" of the data from signals such as access frequency. Based on that temperature, OPPO has defined a five-tier storage resource pool.
Tier 0 storage resource pool: Tier 0 stores existing hot, high-frequency data. It loads hot data into server memory in the various CDN data centers, which enables a quick response to users' real-time access requests.
Tier 1 storage resource pool: Using structured and unstructured storage technologies, data is stored on local flash drives to meet users' normal access needs.
Tier 2 storage resource pool: This low-cost, high-capacity pool combines distributed technology with traditional disk technology and serves users with no discernible access latency.
Tier 3 storage resource pool: This massive storage pool combines traditional tape technology with distributed storage technology. It provides high storage density and high write bandwidth while remaining cost-effective and consuming little power. It serves data access requests within minutes.
Tier 4 storage resource pool: This tier is offline data storage based on tape technology.
For data with different "temperatures", organizations need to provide different technical architectures. The nearline tape data storage solution meets the needs of OPPO's current and future cold data management. Not only does it reduce storage costs, but it also provides minute-level data access.
The current OPPO object storage adopts a distributed architecture, which is a technology that stores data in the form of objects in distributed storage clusters to achieve high availability, high reliability, and high scalability. In this mode, data is divided into multiple slices and stored on multiple nodes, which ensures data reliability and fault tolerance through data redundancy and replication mechanisms. This distributed object storage technology not only improves data read/write performance and concurrency but also enables dynamic expansion and flexible management of data, meeting the needs of different application scenarios.
OPPO's object HDD storage adopts the classic distributed object architecture. The metadata server stores the relevant index information of objects and slices, while the servers in the HDD storage pool store the slice information of objects. The client obtains index information from the metadata server through the S3 protocol and then retrieves the specific data of the object from the slice storage server.
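The two-step read path described above (index lookup, then slice retrieval) can be sketched with an in-memory model. This is an illustration under stated assumptions: the class names, the round-robin placement, and the 4-byte slice size are hypothetical, and the S3 protocol layer mentioned in the text is omitted entirely.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataServer:
    # Maps object key -> ordered list of (server_id, slice_id) locations.
    index: dict = field(default_factory=dict)

    def lookup(self, key):
        return self.index[key]


@dataclass
class SliceServer:
    # Maps slice_id -> raw bytes of that slice.
    slices: dict = field(default_factory=dict)


def put_object(key, data, meta, servers, slice_size=4):
    """Split data into slices, spread them round-robin, and record the index."""
    locations = []
    for n, start in enumerate(range(0, len(data), slice_size)):
        server_id = n % len(servers)
        slice_id = f"{key}#{n}"
        servers[server_id].slices[slice_id] = data[start:start + slice_size]
        locations.append((server_id, slice_id))
    meta.index[key] = locations


def get_object(key, meta, servers):
    """Step 1: fetch the index from the metadata server. Step 2: read each slice."""
    locations = meta.lookup(key)
    return b"".join(servers[sid].slices[slid] for sid, slid in locations)


if __name__ == "__main__":
    meta = MetadataServer()
    servers = [SliceServer() for _ in range(3)]
    put_object("photo.jpg", b"backup bytes of a photo", meta, servers)
    print(get_object("photo.jpg", meta, servers))
```

Keeping the index separate from the slice data is what lets the slice pool grow independently of the metadata service.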
Because of the enormous amount of data and the large scale of the distributed object storage cluster, OPPO needs to purchase a significant number of servers each year to build new storage space or replace out-of-warranty servers. OPPO realized that there was room to reduce its IT spending.
In order to reduce the TCO of object storage and optimize the storage service for massive data, OPPO introduced tape storage. Tape storage is an efficient, robust, and secure solution for organizations to manage long-term storage, active archiving, data protection, and recovery needs.
In distributed storage systems, tape storage is widely used for cold data. Rather than storing all data on expensive, high-performance storage devices, organizations can use tape to store less frequently accessed data. There are two types of tape media: LTO (Linear Tape-Open), an open tape format, and IBM's proprietary 3592 tape. Each has its own corresponding type of tape drive.
Dr. Shi Zemin, deputy head of the Recording Media Division of Fujifilm (China) Investment Co., Ltd., said: “Tape storage has the following advantages:
Cost-Effective: Tape storage is a highly economical option. By offloading cold data from high-performance storage to tape, organizations can take pressure off expensive storage and reduce overall storage costs. The 2022 study Improving Information Technology Sustainability with Modern Tape Storage [1] compared the cost of archiving 100 petabytes of information for 10 years on hard disk and on tape. It found that the total cost of ownership (TCO) is 78% lower when cold data on HDDs is archived to tape than with solutions that use only HDDs.
Reliability: Due to its low bit error rate, error correction capabilities, and redundancy features, tape is more reliable than other media. In a distributed storage system, archiving data on tape increases data redundancy and improves data reliability.
Scalability: Tape technology is easily scalable due to its ability to seamlessly add additional capacity with new media. By integrating tape storage into a tiered storage architecture, organizations can easily expand the storage capacity to meet the growing needs of data storage.
Green Energy: According to the paper Improving Information Technology Sustainability with Modern Tape Storage (2022), assuming that 100 petabytes of information need to be stored for 10 years, a deep archive solution storing 100% of the data on hard disk drives generates 2,663 tons of CO2e over ten years. A deep archive solution storing all the data on tape generates only 79 tons of CO2e, a 97% reduction. Converting carbon emissions to electricity consumption at 0.997 kg of CO2 per kWh, this is equivalent to saving 2.59 million kWh.”
Dr. Shi Zemin, General Manager of Recording Media Division in Fujifilm (China) Investment Co., Ltd.
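The emissions figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers from the quote; treating "tons" as metric tons (1,000 kg) is an assumption.

```python
# Figures quoted in the article for archiving 100 PB over 10 years.
HDD_TONS_CO2E = 2663
TAPE_TONS_CO2E = 79
KG_CO2_PER_KWH = 0.997  # conversion factor given in the article

# Assumption: "tons" means metric tons, i.e. 1,000 kg each.
saved_tons = HDD_TONS_CO2E - TAPE_TONS_CO2E
reduction_pct = saved_tons / HDD_TONS_CO2E * 100
saved_kwh = saved_tons * 1000 / KG_CO2_PER_KWH

if __name__ == "__main__":
    print(f"CO2e avoided:  {saved_tons} tons")
    print(f"Reduction:     {reduction_pct:.1f}%")
    print(f"Energy saved:  {saved_kwh / 1e6:.2f} million kWh")
```

The results round to the 97% reduction and 2.59 million kWh stated in the quote.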
The OPPO cold data storage project adopts the IBM tape nearline storage solution, which is based on the IBM tape library TS4500 and 3592 enterprise tape media.
Hou Miao, General Manager of IBM China and General Manager of Technology Department in China
Hou Miao, general manager of IBM China and general manager of its technology department in China, said:
“The TS4500 tape library is an IBM enterprise tape library that can be configured with 18 frames, 128 3592 tape drives, and 17,550 3592-type tapes. If configured with 20TB JE type 3592 tapes, the entire tape library can store 351PB of data; if configured with 50TB JF type 3592 tapes, the entire tape library can store up to 877.5PB of data.
IBM's enterprise-level 3592 tape media combines IBM's tape drive technology with Fujifilm’s accumulated expertise in nanotechnology, nano-coating, and nano-dispersion, which has been developed over decades of film manufacturing. This original technology gives the 3592 tape a higher storage density: the JF series reaches up to 50TB per cartridge. The 3592 tape is also more stable and reliable, meeting data storage requirements in data center environments, and it performs better, especially for random access to small files. Overall, the 3592 tape has six advantages over other types of media: low cost, low energy consumption, high security, high reliability, large capacity, and energy-saving environmental sustainability.”
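The library capacities quoted above follow directly from the cartridge count. A minimal check, assuming decimal units (1 PB = 1,000 TB) and a hypothetical helper name `library_capacity_pb`:

```python
# Maximum cartridge count for a fully configured TS4500, as quoted in the article.
CARTRIDGE_SLOTS = 17_550


def library_capacity_pb(tb_per_cartridge: float, cartridges: int = CARTRIDGE_SLOTS) -> float:
    """Total library capacity in PB, assuming decimal units (1 PB = 1,000 TB)."""
    return cartridges * tb_per_cartridge / 1000


if __name__ == "__main__":
    print("JE (20TB):", library_capacity_pb(20), "PB")
    print("JF (50TB):", library_capacity_pb(50), "PB")
```

These figures are raw capacity with every slot filled; they do not account for compression or redundancy overhead.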

Joint research on distributed tape storage architecture

Rao Youqing, IBM Hyperscale System Solutions Chief Architect
The OPPO cold data storage project was designed jointly by OPPO's core technology development and architecture team, including He Xiaochun, Tang Hu, Hou Jingpeng, and Wu HuoChen, together with Liang Xiao, IBM hyperscale system solutions sales manager, and chief architect Rao Youqing. It integrates OPPO object storage with IBM tape nearline storage technology to build an end-to-end distributed tape storage architecture. OPPO's warm and hot data is stored in the HDD storage pool, while cold data is stored in a storage pool built from tape. OPPO sorts data by ‘temperature’: once data is cold enough, the system archives it to tape, and when access is requested, the system can recall it from tape within minutes. The tape-based storage pool uses erasure coding to ensure high reliability and high availability of the data on tape while improving the use of tape capacity.
OPPO cold data storage architecture diagram
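To illustrate the idea behind erasure coding, the sketch below splits data into k shards plus one XOR parity shard, so that any single lost shard can be rebuilt. This is a deliberately simplified scheme chosen for the example; the article does not say which code or parameters OPPO uses, and all function names here are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal shards (zero-padded) and append one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    return shards + [reduce(xor_bytes, shards)]


def decode(shards: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild at most one missing shard (marked None) and return the original data."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can only recover one lost shard")
    shards = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        # XOR of all surviving shards equals the missing one.
        shards[missing[0]] = reduce(xor_bytes, present)
    return b"".join(shards[:-1])[:original_len]


if __name__ == "__main__":
    data = b"cold data archived to tape"
    shards = encode(data, k=4)
    shards[1] = None  # simulate an unreadable tape
    print(decode(shards, len(data)))
```

With k = 4 the storage overhead is 25%, compared with 100% or more for full replication, which is the usual reason erasure coding is preferred for archive tiers.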
Monitoring of the distributed tape storage system shows that the average tape throughput of a single node is about 440MB/s and the read/write throughput of the cache file system is around 1GB/s. The utilization of the local cache file system is kept within 70%, which meets OPPO’s archive requirements for cold data.
OPPO's object storage cold data archiving system has been in production for over half a year and is running stably. While building the system, OPPO found that although the files on the sliced-data nodes and the erasure-coded data nodes were the same size before compression, they occupied different amounts of tape after compression. OPPO has resolved this issue and achieved better allocation of tape capacity.
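The 440MB/s figure gives a rough sense of how long an archive job takes. The estimate below is a back-of-the-envelope sketch: the node count is a hypothetical input, throughput is assumed to scale linearly across nodes, and decimal units (1 TB = 1,000,000 MB) are assumed.

```python
# Average per-node tape throughput reported in the article.
TAPE_MB_PER_S = 440


def archive_hours(data_tb: float, nodes: int = 1) -> float:
    """Estimated hours to write data_tb terabytes to tape.

    Assumptions: 1 TB = 1,000,000 MB, and throughput scales linearly
    with the number of nodes (an idealization).
    """
    total_mb = data_tb * 1_000_000
    return total_mb / (TAPE_MB_PER_S * nodes) / 3600


if __name__ == "__main__":
    print(f"1 TB on 1 node:    {archive_hours(1):.2f} h")
    print(f"100 TB on 4 nodes: {archive_hours(100, nodes=4):.2f} h")
```

Actual job times also depend on tape mount and positioning time, which this estimate ignores.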
The Enterprise Tape Nearline Storage Solution, based on OPPO object storage, IBM Spectrum Scale/Archive EE, the IBM TS4500 tape library, and Fujifilm tape technology, integrates disk storage and tape storage, manages a unified enterprise-level object namespace, and moves data seamlessly among storage pools. The tape nearline storage system strengthens IT support for the business and effectively reduces the cost of data storage. Compared with traditional distributed disk storage, it offers low cost, low power consumption, and high storage density, supports a greener data center, and reduces the cost of archiving infrequently accessed data.
Looking ahead, OPPO will continue to explore long-term preservation technologies for massive cold data, diversify its storage services, reduce storage costs and data center energy consumption, and provide more economical, secure, and reliable data storage solutions for all kinds of businesses.