AMAX Launches MATRIX, The World's First Elastic Deep Learning Cloud, at GTC 2017
FREMONT, CA, May 8, 2017 (Newswire.com) - AMAX, a leading provider of Cloud/IaaS, Deep Learning, and Enterprise computing infrastructure solutions, today announced the release of the MATRIX, the world’s first elastic on-premise deep learning cloud platform developed for AI, Machine Learning and HPC. AMAX will be showcasing the MATRIX platforms and technology at Booth #400 at the GPU Technology Conference (GTC) 2017 from May 9th to May 11th.
The MATRIX combines AMAX’s award-winning deep learning platforms with first-in-industry GPU over Fabrics technology. The GPU over Fabrics technology was developed to transcend physical system limitations by aggregating and sharing GPU resources across multiple nodes within a single network. The MATRIX breaks the current limitations of CUDA to maximize GPU utilization, consolidate GPU compute power on demand, and spin up elastic on-premise GPU clouds for unlimited resource distribution and flexibility.
“The MATRIX unleashes GPU computing in the same way VMware revolutionized general computing years ago,” said Dr. Rene Meyer, VP of Technology, AMAX. “The MATRIX takes the high-performance capabilities of GPU computing one step further by removing restrictions to resource distribution while eliminating processing inefficiencies, with cost savings features thrown in for extra value.”
How the MATRIX Works
With NVIDIA GPUs and CUDA, applications normally communicate with GPUs via CUDA APIs, which call CUDA libraries to execute kernels on local GPUs. The MATRIX GPU virtualization framework replaces the CUDA APIs with MATRIX APIs that reroute API calls over high-speed Ethernet or InfiniBand fabrics to one or more remote GPU host servers (GPU over Fabrics). To the client application, MATRIX presents virtual (remote) GPUs as local. The framework supports as many as 64 vGPUs per client, with network bandwidth as the limiting factor. GPU sharing among multiple kernels and vGPU overprovisioning within VMware are also supported through the resource manager, which allocates resources dynamically.
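As a rough illustration of the rerouting idea described above, the following Python sketch simulates a client that presents remote GPUs as locally numbered devices and forwards each launch to the host that owns the GPU. It is not AMAX code and does not use the MATRIX or CUDA APIs: the names RemoteGPUHost, VirtualGPUClient, attach, launch and saxpy are hypothetical, the "kernels" are plain Python functions, and the fabric is simulated by an in-process method call. Only the 64-vGPU limit is taken from the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

MAX_VGPUS_PER_CLIENT = 64  # per-client limit quoted in the text above


@dataclass
class RemoteGPUHost:
    """Hypothetical stand-in for a GPU host server reached over the fabric."""
    name: str
    gpu_count: int
    calls_served: int = 0

    def execute(self, gpu_index: int, kernel: Callable, *args):
        # A real host would run the kernel on a physical GPU; here we
        # simply call the Python function and count the request.
        if not 0 <= gpu_index < self.gpu_count:
            raise IndexError(f"{self.name} has no GPU {gpu_index}")
        self.calls_served += 1
        return kernel(*args)


class VirtualGPUClient:
    """Hypothetical client shim: remote GPUs appear as local device IDs."""

    def __init__(self) -> None:
        self._devices: List[Tuple[RemoteGPUHost, int]] = []

    def attach(self, host: RemoteGPUHost, gpu_index: int) -> int:
        """Map one remote GPU to the next local device ID and return it."""
        if len(self._devices) >= MAX_VGPUS_PER_CLIENT:
            raise RuntimeError("vGPU limit per client reached")
        self._devices.append((host, gpu_index))
        return len(self._devices) - 1

    def device_count(self) -> int:
        return len(self._devices)

    def launch(self, device: int, kernel: Callable, *args):
        # The application addresses a "local" device; the shim forwards
        # the call to whichever host actually owns that GPU.
        host, gpu_index = self._devices[device]
        return host.execute(gpu_index, kernel, *args)


def saxpy(a: float, x: List[float], y: List[float]) -> List[float]:
    """Toy kernel: element-wise a * x + y."""
    return [a * xi + yi for xi, yi in zip(x, y)]


if __name__ == "__main__":
    host_a = RemoteGPUHost("host-a", gpu_count=2)
    host_b = RemoteGPUHost("host-b", gpu_count=1)
    client = VirtualGPUClient()
    for host in (host_a, host_b):
        for i in range(host.gpu_count):
            client.attach(host, i)
    print(client.device_count(), "virtual GPUs visible to the application")
    print(client.launch(2, saxpy, 2.0, [1.0, 2.0], [3.0, 4.0]))
```

In this sketch the application only ever sees device IDs 0, 1 and 2, while device 2 is in fact served by a different host than devices 0 and 1. That is the sense in which remote GPUs are presented as local.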
MATRIX features include:
- Increased hardware resource utilization across multiple jobs and users
- Support for major virtualization frameworks like VMware and Docker
- Unprecedented flexibility in GPU allocation to clients and virtual machines
- Dynamic concurrent GPU access across multiple users
- On-demand creation of virtual GPU clusters using workstations and servers
- Easy upgrade of non-GPU clusters to virtual GPU clusters via GPU over Fabrics
- Less than 5% overhead penalty for network traffic when using high-speed networks (10Gb and above)
- Reduced training and inference times to accelerate Deep Learning development
- Reduced processing time for HPC applications (Monte Carlo, Gene Sequencing, etc.)
- Lower infrastructure costs through improved resource efficiency
The MATRIX product line includes workstations ideal for startups, incubators and universities, allowing developers to work as individual pods yet leverage collective resources to increase compute power on demand. The MATRIX also includes high-performance servers and the Machine Learning [SMART]Rack, a data-center-ready, rack-scale Machine Learning and Analytics platform featuring 64x NVIDIA Tesla P100 cards per rack, as well as All-Flash storage, 25Gb high-speed networking, [SMART]DC Data Center Manager and an in-rack battery for graceful shutdown in power loss scenarios.
All MATRIX solutions come pre-bundled with Deep Learning software tools and libraries, as well as a one-year subscription to the MATRIX GPU Virtualization software.
AMAX will be hosting a Presenter Series on various topics around GPU and Cloud Computing for AI/Machine Learning & HPC throughout GTC 2017. To learn more about the AMAX MATRIX GPU virtualization solution or the Presenter Series schedule, please visit Booth #400 or www.amax.com.
Source: AMAX