
Train it like FAANG

AI training 10x-1000x faster

 Reduce AI compute cost by 10x-30x 

 Zero Code Change 

Developing a DL model is now a sprint,
not a marathon.


DAP - Training On Steroids 
Hours turn into minutes

DAP (Disaggregated Asynchronous Processing) is an engine that relies on asynchronous and disaggregated execution of PyTorch training workloads. The result is training that runs 4x-30x faster on the same number of GPUs.
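Scaletorch hasn't published DAP's internals, but the general pattern behind asynchronous, disaggregated execution can be sketched in plain PyTorch: decouple batch preparation from the GPU so the two overlap instead of running in sequence. Everything below is illustrative, not Scaletorch's API.

import queue
import threading
import torch

def prefetch(loader, q):
    # Background thread: stage batches so CPU-side preparation never
    # blocks the GPU training step.
    for batch in loader:
        q.put(batch)
    q.put(None)  # sentinel: end of data

def train_epoch(model, loader, optimizer, loss_fn, device="cuda"):
    q = queue.Queue(maxsize=4)  # small buffer of ready batches
    threading.Thread(target=prefetch, args=(loader, q), daemon=True).start()
    while (batch := q.get()) is not None:
        x, y = (t.to(device, non_blocking=True) for t in batch)
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()

A production engine would disaggregate far more stages (decode, augmentation, host-to-device copies) and run them asynchronously, but the overlap principle is the same.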

 

A video AI startup used Scaletorch's DAP engine to train a pose-detection model for their crowd-control feature and went from 4 hours per epoch to 13 minutes.

How does it work? 4D Scaling!

DAP Engine

A unique Scaletorch invention: existing PyTorch code speeds up by 4x-30x on the same number of GPUs, with zero code changes.

Distributed training

Automatic multi-node training that scales linearly to 128 GPUs.
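For comparison, this is the setup multi-node PyTorch normally demands by hand; Scaletorch automates the equivalent. A minimal DistributedDataParallel skeleton using standard PyTorch APIs (not Scaletorch code):

# Launch one copy per node with, e.g.:
#   torchrun --nnodes=16 --nproc_per_node=8 train.py   (128 GPUs total)
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group("nccl")  # rank/world size come from torchrun env vars
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(1024, 10).cuda()  # stand-in for a real model
model = DDP(model, device_ids=[local_rank])
# ... usual training loop; gradients sync across all GPUs and nodes ...
dist.destroy_process_group()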

Trials Parallelization

Hundreds of trials of an experiment run in parallel: the time for 100 trials equals the time for one.
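Conceptually, trial parallelization is a fan-out: with one worker per trial, the wall-clock time for N trials collapses to the time of one. A toy local sketch with hypothetical names (Scaletorch fans out across cloud GPUs rather than local processes):

from concurrent.futures import ProcessPoolExecutor

def run_trial(lr):
    # Placeholder for one full training run at this learning rate.
    return lr, (lr - 3e-4) ** 2  # pretend validation loss

if __name__ == "__main__":
    candidates = [10 ** (-i / 10) for i in range(20, 60)]  # 40 trial configs
    with ProcessPoolExecutor() as pool:  # one worker per trial
        results = list(pool.map(run_trial, candidates))
    best_lr, best_loss = min(results, key=lambda r: r[1])
    print(f"best lr: {best_lr:.2e}")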

Multi-cloud

Get more GPUs across clouds

The speedup our clients achieved with the same number of GPUs using DAP

Privacy & Security

Scaletorch is purely software that works with your cloud accounts/VPCs as well as your existing data sources (S3, Azure Blob, HTTPS, GCS, etc).

Scaletorch isn't a cloud provider, so no data or code flows through or is stored in Scaletorch.

Use your cloud accounts

  • Connect clouds that you are already using (AWS, GCP, Azure) to Scaletorch. 

  • Scaletorch uses your cloud accounts/VPCs to create GPU VMs, execute the training script, and then perform a cleanup after the training job has completed.

  • Data from any source is automatically streamed into the GPU VMs, with Scaletorch acting purely as an orchestrator.

  • Use one or multiple clouds.

Virtual Mounts

  • Connect to any of your existing data sources (S3, Google Drive, Google Storage, HTTPS, etc.).

  • Virtual mounts present any data source as a local folder and leverage encryption, caching, and prefetching to make the process secure and fast (see the sketch after this list).

  • Virtual Mounts are created directly inside the GPU VMs that run in your cloud infrastructure, so no data passes through Scaletorch.
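Scaletorch's Virtual Mounts implementation isn't public, but the effect described above, remote object storage that reads like local files with caching and prefetching, can be approximated with off-the-shelf tools such as fsspec. The bucket path below is hypothetical:

# pip install fsspec s3fs
import fsspec

# "filecache" keeps an on-disk cache so repeated reads hit local storage.
with fsspec.open(
    "filecache::s3://my-bucket/train/shard-000.tar",  # hypothetical path
    filecache={"cache_storage": "/tmp/mount-cache"},
) as f:
    header = f.read(1024)  # bytes stream from S3 on first access only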

Why Scaletorch?

Privacy and Security

Operate inside your infrastructure. Use your existing cloud accounts and storage.

Decrease cloud costs by up to 30x

Faster training naturally means a smaller bill from cloud providers, since they charge by the hour.
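As an illustrative example (prices hypothetical): a job that needs 40 GPU-hours at $2 per GPU-hour costs $80; run 20x faster, it finishes in 2 GPU-hours and costs $4.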

Increase productivity

Get results faster using our DAP engine and run more experiments.

Zero code change. Seriously.

Launch your PyTorch script with our Web App or CLI, and we'll take care of the rest.
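The point is that the script you hand over is ordinary PyTorch with nothing Scaletorch-specific in it. Something as plain as this (illustrative, not taken from Scaletorch's docs) is all the CLI or Web App needs:

# train.py: an unmodified, vanilla PyTorch script
import torch
from torch import nn, optim

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
opt = optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for step in range(100):
    x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))  # stand-in data
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()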

Virtual Mounts

Present cloud storage as a local disk and scale experiments in one click. 


Supported Clouds

AWS, GCP, Azure