Even though sales of disks to Chia miners have slowed, lots of disks are sold to the server market, where 100PB setups are now becoming common. Some shops are now looking at 250PB setups and beyond. The main drivers for 100PB storage are drug design, biological systems and similar technologies. Database design for DNA has to consider various alleles, which complicates the design of the fields.
Total HDD volume: 351 EB. Chia: 40 PB. Pretty crazy considering that a decent chunk of that is new HDDs. SSD sales top 99 million, with Samsung being the dominant provider.
Seagate, Western Digital and Toshiba are all reporting excellent sales of hard disks. SSD vendors are doing fine, but hard disk sales have grown faster than SSD sales in recent years. One of the main drivers of hard disk sales is that LTO-9 has still not come to market. Storage servers have software to emulate tape libraries or meet other legacy requirements as needed.
Hard disk prices are stable as supplies are good. Seagate has ramped up production and the others are also looking at channel inventories to govern production. Toshiba hard disks were largely sold out in recent months but are now back in stock.
The Internet Archive has exceeded 10PB of data so far and it is not going to be long before it grows toward 100PB. Over time, some new site designs have begun to support users with larger screens properly; this site is more adaptive in nature. The archive is attempting to open a digital library, which has led to litigation over copyright, fair use, etc.
Most vendors are cloud based, but NASA is one organization that uses staggering amounts of storage for its various projects. NASA is the best example of brutal storage needs. NASA probably buys 20% of the hard disks made at present. NASA also buys a vast amount of LTO-8 tape for its storage robots. LTO-9 is late to come to market, probably due to some patent problems again. It appears that further development of LTO has slowed as well. This leaves hard disk based servers more room for sales in larger hyperscale systems.
SQL is simply a layer on top of some flat file system or another. SQL has made the use of storage more effective than earlier approaches. Today SQL runs well on SSD storage, given its need for high IOPS. Many smaller setups use a hybrid SSD and hard disk system. SQL is now very widely used and the future of the API seems assured.
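As a rough illustration of the "layer on top of a file" idea, the sketch below uses SQLite, which keeps a whole database in a single ordinary file. The table and column names (`samples`, `gene`, `allele`) and the sample rows are hypothetical, chosen only to echo the allele example mentioned earlier; other SQL engines lay out their storage differently.

```python
import os
import sqlite3
import tempfile


def build_demo_db(path):
    """Create a tiny database at `path` and return a per-gene row count."""
    conn = sqlite3.connect(path)
    try:
        # Hypothetical schema: one row per observed allele of a gene.
        conn.execute(
            "CREATE TABLE samples ("
            "id INTEGER PRIMARY KEY, gene TEXT, allele TEXT)"
        )
        conn.executemany(
            "INSERT INTO samples (gene, allele) VALUES (?, ?)",
            [("GENE_A", "a1"), ("GENE_A", "a2"), ("GENE_B", "b1")],
        )
        conn.commit()
        rows = conn.execute(
            "SELECT gene, COUNT(*) FROM samples GROUP BY gene ORDER BY gene"
        ).fetchall()
    finally:
        conn.close()
    return rows


def file_header(path, n=16):
    """Read the first `n` raw bytes of the database file."""
    with open(path, "rb") as f:
        return f.read(n)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        print(build_demo_db(db_path))   # query results via the SQL layer
        print(file_header(db_path))     # the same data seen as a plain file
```

The query goes through SQL, while `file_header` reads the same file as raw bytes, which is the sense in which the SQL engine sits on top of ordinary file storage.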
Content delivery networks (CDNs) are popular with many sites. WordPress with Jetpack provides free cloud storage for images. CDN works best when there ... The CDN business today is largely mature, with lower server costs.
NAS users are growing in number and many are buying server-class hard disks, which have a better workload rating. Generally, NAS units with 6 or more disks benefit from server-class disks.
Looking forward, the demand for more storage for science is largely being met by building large buildings and stuffing them with hard disk servers. Tape vendors are falling behind, as LTO-8 tape is smaller and slower than a server-class hard disk. NASA has 9 known major projects on the go and they need a spectacular amount of additional storage as the projects mature.
- High Energy Astrophysics Science Archive Research Center
- Infrared Science Archive
- Legacy Archive for Microwave Background Data Analysis
- NASA/IPAC Extragalactic Database
- NASA Space Science Data Coordinated Archive
- NASA Exoplanet Science Institute
- Astrophysics Data System
- Chandra X-Ray Observatory
- Spitzer Space Telescope
Many legacy projects such as Hubble are still operating to some extent, so server requirements are spectacular. The Allen radio array needs 50 PB servers to handle the volume of data received. Many new astronomy projects need even more storage, suggesting that vast data centers full of storage are needed.
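To get a feel for what a 50 PB requirement means in hardware, here is a back-of-the-envelope sketch. The drive size (18 TB) and the usable fraction after redundancy (0.8) are assumptions picked for illustration, not figures from any of the projects above, and decimal units (1 PB = 1000 TB) are assumed.

```python
import math


def drives_needed(target_pb, drive_tb, usable_fraction=1.0):
    """Estimate how many drives are needed for `target_pb` of usable space.

    drive_tb        -- raw capacity of one drive in TB (assumed value)
    usable_fraction -- share of raw capacity left after redundancy
                       overhead such as parity or replicas (assumed value)
    """
    if drive_tb <= 0 or not 0 < usable_fraction <= 1:
        raise ValueError("drive_tb must be > 0 and usable_fraction in (0, 1]")
    target_tb = target_pb * 1000  # decimal units: 1 PB = 1000 TB
    return math.ceil(target_tb / (drive_tb * usable_fraction))


if __name__ == "__main__":
    # Hypothetical inputs: 18 TB drives, 80% of raw capacity usable.
    print(drives_needed(50, 18, 0.8))
```

With these assumed inputs the estimate comes out at a few thousand drives for 50 PB; a different drive size or redundancy scheme shifts the count accordingly.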
Junk data is any kind of data that no longer serves a purpose. Estimates vary, but as much as 90% of accumulated data falls into the junk data category. The reason for so much junk is that people don’t properly focus on treating data as an asset that aligns with and helps deliver business goals. There are no easy solutions, but a focus on the needs of the business can reduce IT spending when servers are better leveraged.
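One simple first pass at finding candidates for junk data is to list files that have not been modified for a long time. The sketch below assumes that age is a usable proxy for "no longer serves a purpose", which is only a heuristic; the function name and the threshold are hypothetical, and any real cleanup would need a review by the people who own the data.

```python
import os
import time


def find_stale_files(root, max_age_days, now=None):
    """Return (path, size_in_bytes) for files under `root` whose last
    modification is older than `max_age_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_mtime < cutoff:
                stale.append((path, st.st_size))
    return stale


if __name__ == "__main__":
    # Hypothetical threshold: untouched for more than a year.
    for path, size in find_stale_files(".", max_age_days=365):
        print(f"{size:>12}  {path}")
```

Modification time is used rather than access time because many systems are mounted so that access times are not reliably updated.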