Author
David Ellison
Published
17 Mar 2025
Form Number
LP2182
PDF size
5 pages, 48 KB
Abstract
Enhanced computational performance with ThinkSystem servers is showcased through a round of MLPerf benchmark tests, evaluating two server models (SR680a V3 and SR675 V3) equipped with NVIDIA accelerated computing.
The systems demonstrated exceptional performance in image generation using Stable Diffusion v2, LLM fine-tuning using Llama2 70B-LoRA, and object detection through RetinaNet, achieving world records in several MLPerf benchmarks.
These results highlight the superior capabilities of ThinkSystem servers in handling complex computational workloads, validating their potential to satisfy the demands of organizations today.
Introduction
This round of MLPerf testing analyzes and showcases the performance capabilities of two Lenovo ThinkSystem servers: the SR680a V3 with the 8-GPU NVIDIA HGX H200, and the SR675 V3 with 8x NVIDIA H100 NVL (PCIe form factor).
Three computationally intensive tasks are employed to evaluate these systems:
- Image generation using Stable Diffusion v2
- LLM fine-tuning using Llama2 70B-LoRA
- Object detection through RetinaNet
Two world records were set:
- Public ID: 4.1-0039
  - System name: SR675 V3 with 8x H100 NVL
  - Benchmark: RetinaNet
- Public ID: 4.1-0040
  - System name: SR680a V3 with 8x H200 SXM5
  - Benchmarks: Llama2 70B-LoRA and Stable Diffusion v2
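To give a sense of why LoRA makes fine-tuning a 70B-parameter model such as Llama2 tractable, the sketch below counts trainable parameters for a single weight matrix when the full update is replaced by a pair of low-rank factors. The matrix dimensions and rank used here are illustrative assumptions, not the MLPerf reference configuration.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Compare trainable parameter counts for one weight matrix.

    Full fine-tuning updates all d_out x d_in weights; LoRA instead trains
    two low-rank factors of shapes (d_out x rank) and (rank x d_in).
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora

# Illustrative numbers only (hypothetical layer size and rank):
full, lora = lora_trainable_params(8192, 8192, rank=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

With these illustrative values, the LoRA factors carry well under 1% of the parameters of the full matrix, which is the core reason LoRA-based fine-tuning runs complete in tens of minutes rather than hours.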
Methodology
These benchmarks were conducted under MLCommons rules on the specified ThinkSystem servers and submitted to MLCommons. Key hardware configurations for each system include:
- ThinkSystem SR680a V3: 8-GPU NVIDIA HGX H200 System
- ThinkSystem SR675 V3: 8x NVIDIA H100 NVL (PCIe form factor)
The chosen tasks and corresponding runtimes are outlined as follows:
- Stable Diffusion v2 on SR680a V3 (Image Generation): Completed in 29.364 minutes (Public ID: 4.1-0040)
- Llama2 70B-LoRA fine-tuning on SR680a V3 (LLM Fine-tuning): Completed in 23.062 minutes (Public ID: 4.1-0040)
- RetinaNet on SR675 V3 (Object Detection): Completed in 45.256 minutes (Public ID: 4.1-0039)
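The reported runtimes can be tabulated programmatically for comparison. The snippet below simply restates the figures from this document in a sortable form; it does not rerun the benchmarks.

```python
# Benchmark runtimes as reported in this document, in minutes.
results = {
    ("SR680a V3", "Llama2 70B-LoRA fine-tuning"): 23.062,
    ("SR680a V3", "Stable Diffusion v2"): 29.364,
    ("SR675 V3", "RetinaNet"): 45.256,
}

# Print each result from fastest to slowest, in minutes and seconds.
for (system, task), minutes in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{task} on {system}: {minutes:.3f} min ({minutes * 60:.0f} s)")
```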
Results and Analysis
Based on the benchmark data, the SR680a V3 server consistently demonstrates best-in-class performance in both image generation and LLM fine-tuning tasks, showcasing its robust capabilities for computationally demanding workloads.
- In image generation with Stable Diffusion v2, the SR680a V3 completed the benchmark in 29.364 minutes. (Public ID: 4.1-0040)
- Performance was similarly impressive in Large Language Model (LLM) fine-tuning with Llama2 70B-LoRA, which completed in just 23.062 minutes. (Public ID: 4.1-0040)
The SR675 V3 server, in turn, excelled at object detection using RetinaNet, completing the benchmark in 45.256 minutes, a notable milestone as the only system with an equivalent GPU configuration to achieve a result in this timeframe. (Public ID: 4.1-0039)
Conclusion
This round of benchmarks presents a comprehensive evaluation of the performance capacity of ThinkSystem servers bolstered with NVIDIA accelerated computing, showcasing exceptional results across specified computational tasks. Notable accomplishments include:
- Exceptional image generation capabilities on the SR680a V3 with Stable Diffusion v2
- Efficient LLM fine-tuning on the SR680a V3 with Llama2 70B-LoRA
- Fast object detection on the SR675 V3 with RetinaNet
These tests confirm the powerful performance of ThinkSystem servers in handling complex computational workloads, validating their capability to satisfy the computational demands of organizations today. These architectures can be expected to build on this success as they continue to unlock greater computing power and accelerate the generation of insights.
The insights from the latest MLPerf benchmarks are critical for stakeholders in the generative AI and machine learning ecosystem, from system architects to application developers. They provide a quantitative foundation for hardware selection and optimization, crucial for deploying scalable and efficient AI/ML systems. Future developments in hardware and software are anticipated to further influence these benchmarks, continuing the cycle of innovation and evaluation in the field of machine learning.
For more information
For more information, see the following resources:
- Explore Lenovo AI solutions:
  https://www.lenovo.com/us/en/servers-storage/solutions/ai/
- Engage the Lenovo AI Center of Excellence:
  https://lenovo-ai-discover.atlassian.net/servicedesk/customer/portal/3
- MLCommons®, the open engineering consortium and leading force behind MLPerf, has released new results for the MLPerf benchmark suites:
  - Benchmark results: https://mlcommons.org/benchmarks/inference-datacenter/
  - Latest news about MLCommons: https://mlcommons.org/news-blog
Author
David Ellison is the Chief Data Scientist for Lenovo ISG. Through Lenovo's US and European AI Discover Centers, he leads a team that uses cutting-edge AI techniques to deliver solutions for external customers while internally supporting the overall AI strategy for the World Wide Infrastructure Solutions Group. Before joining Lenovo, he ran an international scientific analysis and equipment company and worked as a Data Scientist for the US Postal Service. Prior to that, he received a PhD in Biomedical Engineering from Johns Hopkins University. He has numerous publications in top-tier journals, including two in the Proceedings of the National Academy of Sciences.
Trademarks
Lenovo and the Lenovo logo are trademarks or registered trademarks of Lenovo in the United States, other countries, or both. A current list of Lenovo trademarks is available on the Web at https://www.lenovo.com/us/en/legal/copytrade/.
The following terms are trademarks of Lenovo in the United States, other countries, or both:
Lenovo®
ThinkSystem®
Other company, product, or service names may be trademarks or service marks of others.