DRIZTI - Delivering Personal Supercomputing

HPCBOX Intelligent Autoscaler available in preview

6/18/2021

 
A cool new Auto-Scaling feature is now available in preview on HPCBOX. The Intelligent AutoScaler, built into the HPCBOX platform, automatically starts the required number of Compute, GPU and CUDA workers suitable for a particular user job. Furthermore, the AutoScaler automatically identifies idle workers and powers them off when no user jobs are waiting to be executed on the HPCBOX cluster.

The HPCBOX AutoScaler is designed to require almost zero configuration from the administrator: there is no need to set up special host groups, scale sets, etc. to take advantage of autoscaling. It is also designed to be cloud vendor agnostic, meaning that when HPCBOX is available on other cloud platforms like AWS or GCP, autoscaling should work the same way as it does on Microsoft Azure, which is currently the preferred platform for HPCBOX.

Why AutoScaling?

HPCBOX has two modes of operation. A cluster can be used for Personal Supercomputing, meaning it is dedicated to a single user, or it can be in a Multi-User configuration, a more traditional setup in which multiple users share a cluster and run different kinds of applications: distributed parallel, GPU accelerated, and visualization applications that run on workers with an OpenGL-capable GPU.
 
Dedicated single-user clusters generally do not require any kind of autoscaling functionality, because the user operates the cluster like their own PC or laptop and has complete control over its operation. Multi-User setups, however, can involve more complexity, especially when:
  1. The cluster combines reserved instances and pay-as-you-go instances.
  2. Applications benefiting from different hardware configurations are executed on the same HPCBOX cluster.
In multi-user HPCBOX clusters, enterprises want to optimize spend on cloud resources, specifically on the pay-as-you-go hardware, and to automate the selection of the hardware best suited to a particular application, making it available as and when jobs come in.

Example Scenario

Let us walk through a use case to understand how the Intelligent AutoScaler in HPCBOX handles optimization of resources and budgets on the cluster.
​
The following picture represents an HPCBOX cluster that combines reserved resources (resources with a usage commitment, called Reserved Instances on Azure) and pay-as-you-go resources. To optimize the budget spend in such a configuration, one would want the reserved instances to be always powered on, providing a baseline capacity for the cluster, and the use of pay-as-you-go resources to be automated to minimize resource wastage. Furthermore, we could assume that the compute workers on this cluster have different hardware configurations; on Azure, for example, a combination of HB120rs_v2 and HB120-16rs_v3 (AMD EPYC “Rome” and “Milan” hardware, respectively).
Picture
Reserved Instances -> 3 HB120rs_v2
Pay-as-you-go -> 1 HB120rs_v2 and 2 HB120-16rs_v3
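
To make the example concrete, the cluster above could be modelled roughly as follows. This is only an illustrative sketch, not HPCBOX code: the `Worker` class, its fields and the worker names are hypothetical, and only the counts and VM sizes are taken from the scenario.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str         # hypothetical worker name, for illustration only
    sku: str          # Azure VM size from the example scenario
    reserved: bool    # True for Reserved Instances (the baseline capacity)
    powered_on: bool  # reserved workers stay on; pay-as-you-go start off


def build_example_cluster() -> list[Worker]:
    """Build the six-worker cluster described in the scenario."""
    # 3 reserved HB120rs_v2 workers, always powered on.
    workers = [Worker(f"reserved-{i}", "HB120rs_v2", True, True) for i in range(3)]
    # 1 pay-as-you-go HB120rs_v2 worker, powered off until needed.
    workers.append(Worker("payg-0", "HB120rs_v2", False, False))
    # 2 pay-as-you-go HB120-16rs_v3 workers, powered off until needed.
    workers += [Worker(f"payg-{i}", "HB120-16rs_v3", False, False) for i in (1, 2)]
    return workers


def baseline(workers: list[Worker]) -> list[Worker]:
    """Return the workers that form the always-on baseline capacity."""
    return [w for w in workers if w.reserved]


cluster = build_example_cluster()
print([w.name for w in baseline(cluster)])
```

Keeping the reserved flag on each worker, rather than in a separate host group, mirrors the idea that the baseline is a property of the cluster and not something an administrator has to configure separately.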

When a job which can be satisfied by the baseline resources comes into the system, the AutoScaler does nothing and lets the job get scheduled on the available compute workers.
Picture
When new jobs come into the system, the AutoScaler springs into action: it matches the jobs to the most suitable hardware, calculates the number of workers required to satisfy each job, and powers them on with no admin or user interaction. For example, the image below shows two jobs that have entered the system, each suited to a different hardware configuration: the AMD EPYC “Rome” powered HB120rs_v2 and the AMD EPYC “Milan” powered 16-core HB120-16rs_v3.
Picture
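
The scale-up decision described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the actual HPCBOX algorithm: the `Job` and `Worker` classes, the `workers_to_start` function and the job sizes are hypothetical, and the 120-core figure for HB120rs_v2 is assumed for the example (the 16-core figure comes from the text).

```python
import math
from dataclasses import dataclass

# Cores per worker. 16 is stated in the post; 120 is an assumption for this sketch.
CORES_PER_WORKER = {"HB120rs_v2": 120, "HB120-16rs_v3": 16}


@dataclass
class Worker:
    name: str
    sku: str
    powered_on: bool = False
    busy: bool = False


@dataclass
class Job:
    name: str
    sku: str    # hardware the job is best suited to (hypothetical field)
    cores: int  # cores requested by the job (hypothetical field)


def workers_to_start(job: Job, workers: list[Worker]) -> list[Worker]:
    """Return the powered-off workers that must be started for this job."""
    needed = math.ceil(job.cores / CORES_PER_WORKER[job.sku])
    matching = [w for w in workers if w.sku == job.sku]
    idle_on = [w for w in matching if w.powered_on and not w.busy]
    # Only start workers if the idle, powered-on ones cannot satisfy the job.
    shortfall = max(0, needed - len(idle_on))
    powered_off = [w for w in matching if not w.powered_on]
    return powered_off[:shortfall]


cluster = [Worker(f"reserved-{i}", "HB120rs_v2", powered_on=True) for i in range(3)]
cluster += [
    Worker("payg-0", "HB120rs_v2"),
    Worker("payg-1", "HB120-16rs_v3"),
    Worker("payg-2", "HB120-16rs_v3"),
]

# A job that fits on the idle baseline workers: nothing needs to be started.
no_action = workers_to_start(Job("baseline", "HB120rs_v2", cores=240), cluster)

# With the baseline busy, two new jobs arrive, each suited to different hardware.
for w in cluster[:3]:
    w.busy = True
rome_start = workers_to_start(Job("rome", "HB120rs_v2", cores=120), cluster)
milan_start = workers_to_start(Job("milan", "HB120-16rs_v3", cores=32), cluster)
print([w.name for w in rome_start], [w.name for w in milan_start])
```

The first call covers the case where the baseline resources already satisfy the job, so the returned list is empty and no worker is powered on.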
When jobs exit the system, the AutoScaler automatically identifies the idle workers and powers them off while maintaining the baseline configuration of the cluster. 
Picture
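
The scale-down rule can be sketched in the same spirit. Again, this is a hypothetical illustration and not HPCBOX code: `Worker`, `workers_to_stop` and the `queued_jobs` parameter are names invented for the example. The rule it encodes comes from the text: idle workers are powered off only when no jobs are waiting, and the reserved baseline always stays on.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    reserved: bool    # True for the always-on baseline workers
    powered_on: bool
    busy: bool


def workers_to_stop(workers: list[Worker], queued_jobs: int) -> list[Worker]:
    """Return the idle pay-as-you-go workers that can be powered off."""
    if queued_jobs > 0:
        # Jobs are still waiting, so keep idle workers available for them.
        return []
    return [w for w in workers if w.powered_on and not w.busy and not w.reserved]


cluster = [
    Worker("reserved-0", reserved=True, powered_on=True, busy=False),
    Worker("reserved-1", reserved=True, powered_on=True, busy=True),
    Worker("payg-0", reserved=False, powered_on=True, busy=False),
    Worker("payg-1", reserved=False, powered_on=True, busy=True),
    Worker("payg-2", reserved=False, powered_on=False, busy=False),
]

to_stop = workers_to_stop(cluster, queued_jobs=0)
print([w.name for w in to_stop])
```

Here only the idle pay-as-you-go worker is selected; the idle reserved worker is left running because it is part of the baseline.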

Visual Monitoring

Visualization is important for users to know when their jobs might start. HPCBOX includes a new AutoScaler event monitoring stream that is automatically updated with every action the AutoScaler is currently taking and every action it will take in upcoming iterations.

Availability

HPCBOX AutoScaler is now available in preview, and we would be very pleased to run a demo, POC or pilot with you to optimize your cloud spend on HPC resources while making sure your jobs are always matched to the most suitable hardware. Schedule a meeting here.

Author

Dev S.
Founder and CTO, Drizti Inc

All third-party product and company names are trademarks or registered trademarks 
of their respective holders. Use of them does not imply any affiliation or endorsement by them.


Copyright © 2017-2022. Drizti Inc. All Rights Reserved.