Why Move to Supermicro Servers with 4th Gen Intel Xeon Scalable Processors
The table below compares the base capabilities of the different generations of Intel Xeon CPUs.
Capability | 2nd Gen (Cascade Lake) (92xx series excluded) | 3rd Gen (Ice Lake) | 4th Gen (Sapphire Rapids) | Increase 2nd to 4th |
---|---|---|---|---|
Maximum Cores | 28 | 40 | 60 | 114% |
Max GHz at Max Cores | 2.7 | 2.3 | 1.9 | |
Max Core*GHz | 28*2.7 = 75.6 | 40*2.3 = 92 | 60*1.9 = 114 | 51% |
Memory Speed | 2400 MHz | 3200 MHz | 4800 MHz | 100% |
Max Memory Per Socket | 3TB | 8TB (DRAM Only) | 8TB (DRAM Only) | 166% |
High Bandwidth Memory | X | X | Up to 64 GB | N/A |
UPI Links*Performance | 2 @ 9.6 GT/s = 19.2 GT/s | 3 @ 11.2 GT/s = 33.6 GT/s | 4 @ 16 GT/s = 64 GT/s | 233% |
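The "Increase 2nd to 4th" column can be reproduced from the raw figures in the table. The short sketch below is only an illustration of that arithmetic; the helper name `percent_increase` is made up for this example and is not part of any Intel or Supermicro tool.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, expressed in percent."""
    return (new - old) / old * 100.0

# Figures copied from the 2nd Gen and 4th Gen columns of the table.
cores_2nd, ghz_2nd = 28, 2.7
cores_4th, ghz_4th = 60, 1.9

# "Max Core*GHz" is simply the core count multiplied by the clock rate.
core_ghz_2nd = cores_2nd * ghz_2nd
core_ghz_4th = cores_4th * ghz_4th

# "UPI Links*Performance" is the link count multiplied by the per-link rate.
upi_2nd = 2 * 9.6
upi_4th = 4 * 16.0

for label, old, new in [
    ("Maximum Cores", cores_2nd, cores_4th),
    ("Max Core*GHz", core_ghz_2nd, core_ghz_4th),
    ("Memory Speed (MHz)", 2400, 4800),
    ("UPI Links*Performance (GT/s)", upi_2nd, upi_4th),
]:
    print(f"{label}: {percent_increase(old, new):.0f}% increase")
```

Note that the per-core clock at maximum core count falls from generation to generation, so the aggregate Core*GHz figure grows more slowly than the core count alone.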
Range of Benchmarks
Although specific benchmarks may exist that are generally accepted, many workloads that a modern enterprise may run cannot be simply forced into a general benchmark report. Benchmarks can be categorized in the following hierarchy, from low level to full applications.
Lowest – absolute maximum performance based on capabilities of the CPU. This number is the theoretical performance of a single CPU and can generally be calculated by multiplying the clock rate by the number of cores by the instructions per clock.
Math Kernel levels – a small application highly tuned to the CPU architecture. Maximum performance is usually about 85% of the theoretical performance. The most common Math benchmark is LINPACK, which solves linear equations.
Small Applications – The most popular suites used to test the performance of enterprise-class servers come from SPEC (Standard Performance Evaluation Corporation), which has provided test suites and collected results for over 30 years.
Complete Applications – Entire applications are run, and the time to completion is recorded.
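The two lowest levels of this hierarchy can be estimated with simple arithmetic. The sketch below is a rough illustration only: the core count and clock rate come from the 4th Gen column of the table above, while the value of 32 double-precision floating-point operations per cycle per core is an assumption (it corresponds to two 512-bit fused multiply-add units) and the function name is hypothetical. Real clock rates under heavy vector load may differ from the nominal figure.

```python
def theoretical_peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak = cores x clock rate x operations per clock (in GFLOPS)."""
    return cores * clock_ghz * flops_per_cycle

# From the comparison table (4th Gen, maximum core count).
CORES = 60
CLOCK_GHZ = 1.9

# ASSUMPTION: 32 double-precision FLOPs per cycle per core; not stated in the text.
ASSUMED_FLOPS_PER_CYCLE = 32

# From the text: tuned math kernels usually reach about 85% of theoretical peak.
KERNEL_EFFICIENCY = 0.85

peak = theoretical_peak_gflops(CORES, CLOCK_GHZ, ASSUMED_FLOPS_PER_CYCLE)
kernel_estimate = peak * KERNEL_EFFICIENCY

print(f"Theoretical peak per CPU: {peak:.0f} GFLOPS")
print(f"Estimated math-kernel result: {kernel_estimate:.0f} GFLOPS")
```

Higher levels of the hierarchy (SPEC suites and full applications) cannot be derived this way and have to be measured.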
Supermicro servers with 4th Gen Intel Xeon Scalable Processors rank at or near the top among 8-socket and 4-socket systems on several SPEC benchmarks, as the results below show.
The SPECcpu2017 suite measures the performance of a system in the following ways:
Floating Point: (applications are heavily floating-point focused)
- Speed – A single copy of each application from the suite is run. The “score” is then calculated by dividing the time to completion on a reference machine by the time to completion on the system under test.
- Rate – The system is loaded with many copies of the test suite (typically equal to the number of threads), and the score is calculated from the number of copies and the ratio of the reference machine's time to the measured time.
Integer: (applications use only integer calculations)
- Speed – A single copy of each application from the suite is run. The “score” is then calculated by dividing the time to completion on a reference machine by the time to completion on the system under test.
- Rate – The system is loaded with many copies of the test suite (typically equal to the number of threads), and the score is calculated from the number of copies and the ratio of the reference machine's time to the measured time.
Peak – Each application's source code can be compiled with its own specific flags.
Base – The same compiler flags are used for compiling all applications.
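The scoring described above can be sketched in a few lines. This is a simplified illustration based on SPEC's published description of its metrics (a per-benchmark ratio of reference time to measured time, scaled by the number of copies for Rate, with the overall score taken as the geometric mean of the ratios). The benchmark names and timings below are invented for the example and are not real results.

```python
import math

def speed_ratio(reference_seconds: float, measured_seconds: float) -> float:
    """Speed: reference machine time divided by the measured time."""
    return reference_seconds / measured_seconds

def rate_ratio(copies: int, reference_seconds: float, measured_seconds: float) -> float:
    """Rate: the speed-style ratio scaled by the number of concurrent copies."""
    return copies * reference_seconds / measured_seconds

def overall_score(ratios: list[float]) -> float:
    """Overall metric: geometric mean of the per-benchmark ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# HYPOTHETICAL timings: (reference seconds, measured seconds) per benchmark.
timings = {
    "bench_a": (1000.0, 100.0),
    "bench_b": (2000.0, 500.0),
}

speed = overall_score([speed_ratio(ref, t) for ref, t in timings.values()])
rate = overall_score([rate_ratio(64, ref, t) for ref, t in timings.values()])

print(f"Speed score: {speed:.2f}")
print(f"Rate score with 64 copies: {rate:.1f}")
```

Because a geometric mean is used, no single benchmark with an unusually high ratio dominates the overall score. Higher scores are better in both cases.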
SPEC Results
Supermicro 8-socket SPEC CPU Benchmarks:
System | Intel Xeon | Workload | Significance | Score |
---|---|---|---|---|
SuperServer SYS-681E-TR | 8490H | SPECcpu2017_int_speed_base | Best 8 socket system | 13.8 |
SuperServer SYS-681E-TR | 8490H | SPECcpu2017_int_speed_peak | Best 8 socket system | 14.0 |
SuperServer SYS-681E-TR | 8490H | SPECcpu2017_int_rate_base | Top 3 Best 8 socket system | 3510 |
SuperServer SYS-681E-TR | 8490H | SPECcpu2017_int_rate_peak | Top 2 Best 8 socket system | 3560 |
System | Intel Xeon | Workload | Significance | Score |
---|---|---|---|---|
SuperServer SYS-681E-TR | 8490H | SPECcpu2017_fp_rate_base | Top 2 Best 8 socket system | 3540 |
SuperServer SYS-681E-TR | 8490H | SPECcpu2017_fp_rate_peak | Top 2 Best 8 socket system | 3560 |
SuperServer SYS-681E-TR | 8490H | SPECcpu2017_fp_speed_base | Best 8 socket system | 343 |
SuperServer SYS-681E-TR | 8490H | SPECcpu2017_fp_speed_peak | Best 8 socket system | 334 |
Supermicro 4-socket SPEC CPU Benchmarks:
System | Intel Xeon | Workload | Significance | Score |
---|---|---|---|---|
SuperServer SYS-241H-TNRTTP | 8490H | SPECcpu2017_int_rate_base | Top 4 Best 4 socket system | 1930 |
SuperServer SYS-241H-TNRTTP | 8490H | SPECcpu2017_int_rate_peak | Top 4 Best 4 socket system | 1970 |
SuperServer SYS-241H-TNRTTP | 8490H | SPECcpu2017_int_speed_base | Top 3 Best 4 socket system | 16 |
SuperServer SYS-241H-TNRTTP | 8490H | SPECcpu2017_int_speed_peak | Top 3 Best 4 socket system | 16.2 |
System | Intel Xeon | Workload | Significance | Score |
---|---|---|---|---|
SuperServer SYS-241H-TNRTTP | 8490H | SPECcpu2017_fp_rate_base | Top 2 Best 4 socket system | 1900 |
SuperServer SYS-241H-TNRTTP | 8490H | SPECcpu2017_fp_rate_peak | Top 2 Best 4 socket system | 2010 |
SuperServer SYS-241H-TNRTTP | 8490H | SPECcpu2017_fp_speed_base | Top 2 Best 4 socket system | 387 |
SuperServer SYS-241H-TNRTTP | 8490H | SPECcpu2017_fp_speed_peak | Top 2 Best 4 socket system | 387 |
SPECstorage
The SPECstorage Solution 2020 benchmark measures the performance of an entire storage configuration as it interacts with application-based workloads. The latest version includes new workloads for artificial intelligence (AI) and genomics, expanded custom workload capabilities, massively better scaling, and a statistical visualization mechanism for displaying benchmark results.
(https://www.spec.org/storage2020/press/release.html)
System | Intel Xeon | Workload | Significance | Score |
---|---|---|---|---|
SYS-221H-TN24R Hyper Storage Server | 8468V 8450H | SPECstorage Solution 2020 | Best SpecStorage_2020 result on AI Image | 0.57 |
SYS-221H-TN24R Hyper Storage Server | 8468V 8450H | SPECstorage Solution 2020 | Best SpecStorage_2020 result on SWBUILD/Jobs: 72 | 0.47 |
SYS-221H-TN24R Hyper Storage Server | 8468V 8450H | SPECstorage Solution 2020 | #1 SpecStorage_2020 leadership on Genomics per top 5 IDC vendors. | 0.19 |
SYS-221H-TN24R Hyper Storage Server | 8468V 8450H | SPECstorage Solution 2020 | #1 SpecStorage_2020 leadership on VDA/Jobs: 720 per top 5 IDC vendors. | 5.56 |
SYS-220U-TNR with 22 NVMe Storage Node | 8380 8360Y | SPECstorage Solution 2020 | #1 SpecStorage_2020 leadership on EDA/Jobs: 240 per top 5 IDC vendors. | 0.28 |
SuperServer SYS-741GE-TNRT | 8490H | SPEChpc2021_Tiny | Best single node base result on MPI model | 8.20 |
SuperServer SYS-741GE-TNRT | 8490H | SPEChpc2021_Tiny | #1 single node base result leadership on OMP model among top 5 vendors per IDC | 9.24 |
SuperServer SYS-741GE-TNRT | 8490H | SPECpower_ssj2008 | #1 leadership 4U server among top 5 vendors per IDC | 13546 |
Full Application Benchmarks Using Intel Accelerator Engines
Supermicro has run several benchmarks that compare the 4th Gen Intel Xeon Scalable processors with different Intel Accelerator Engines turned on. The chart below shows real-world benchmarks and compares an Intel Xeon 8380 to an Intel Xeon 8490H for both performance and performance per watt. The Intel Accelerator Engine that was used for the particular benchmark is listed as well. The specifics of the servers that the benchmarks were run on are described at the end of this document.
Supermicro’s X13 CloudDC server was used for testing the ResNet 50 v1.5 Inference benchmark, and the Intel Xeon 8480+ was compared to the Intel Xeon 8380 CPU. In the chart below, the performance gain was from 2.38X to 3.24X, depending on the data set. Intel’s AMX acceleration features were used for this benchmark.
Supermicro’s X13 GrandTwin® system with 4th Gen Intel Xeon Gold processors, using the Intel® AMX features, was compared with a system using 3rd Gen Intel Xeon Gold processors. The results show a speedup of between 2.38X and 3.24X when running the ResNet 50 v1.5 inference test.
There is a significant improvement for database and analytics applications when moving from a Supermicro X12 generation system with the 3rd Gen Intel Xeon Scalable processor (using 80 cores) to the 4th Gen Intel Xeon Scalable processor (using 48 cores). Using an X12 Ultra platform compared to an X13 Hyper platform, a 25% performance gain is observed using the ClickHouse database with 40% fewer cores.
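The per-core effect of this result can be worked out from the figures quoted above. The sketch below assumes that the 25% gain refers to total throughput on the same ClickHouse workload; the function name is made up for this illustration.

```python
def per_core_gain(performance_ratio: float, old_cores: int, new_cores: int) -> float:
    """Ratio of per-core throughput on the new system to the old system."""
    return performance_ratio * old_cores / new_cores

OLD_CORES = 80            # 3rd Gen system (X12 Ultra), from the text
NEW_CORES = 48            # 4th Gen system (X13 Hyper), from the text
PERFORMANCE_RATIO = 1.25  # a 25% gain, assumed to be total throughput

core_reduction = 1 - NEW_CORES / OLD_CORES
gain = per_core_gain(PERFORMANCE_RATIO, OLD_CORES, NEW_CORES)

print(f"Core reduction: {core_reduction:.0%}")
print(f"Per-core throughput ratio: {gain:.2f}x")
```

Under that assumption, each core of the newer system delivers roughly twice the throughput of a core in the older one on this workload.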
Summary
The 4th Gen Intel Xeon Scalable processors show significant performance gains running applications on the Supermicro 8-socket and 4-socket systems. The Supermicro SYS-681E-TR eight-socket system ranks among the top eight-socket results recorded for:
- SPECcpu2017_int_rate_base
- SPECcpu2017_int_rate_peak
- SPECcpu2017_fp_rate_base
- SPECcpu2017_fp_rate_peak
The SMP architecture of the eight-socket and four-socket Supermicro servers is ideal for large-scale enterprise applications that require many cores and a large memory capacity.
Intel consistently improves performance and security from generation to generation. Below is a comparison, courtesy of Intel, that shows how Intel is improving its performance. Supermicro servers incorporate the latest 4th Gen Intel Xeon Scalable processors across the product line, from the edge to multi-processor systems that reside in the data center.