
Scaling science with Last-Mile HPC: Helping our aerospace customer 15x the capacity of their HPC clusters
Challenge
A customer wanted to scale their HPC installation after their initial research phase was complete. Their team of top-tier scientists lacked experience compiling low-level software for distributed systems, and had been told, “You’re smart, you can figure this out.” Six months later, they had not. The result was long simulation run times, high network latency, and millions of dollars in unnecessary cloud charges.
Solution
- Our infrastructure specialists set up compute, ML, and CFD clusters, backed by a Lustre parallel file system for high-throughput I/O.
- Implemented a fully managed “it just works” services environment, letting scientists simply submit their jobs without worrying about infrastructure configuration (see the sketch after this list).
- Set up pre-configured remote simulation environments, giving the science team ready-to-use machines with minimal setup effort.
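To make “just submit your jobs” concrete, here is a minimal sketch of what that workflow can look like from the scientist’s side. The case study does not name the scheduler or the file layout, so this assumes a Slurm-managed cluster with a Lustre scratch mount at /lustre/scratch; the partition name, solver binary, and paths are hypothetical placeholders.

```python
#!/usr/bin/env python3
"""Minimal sketch of job submission on a managed HPC cluster.

Assumes a Slurm scheduler and a Lustre scratch mount at /lustre/scratch.
The partition name, paths, and solver binary are hypothetical.
"""
import subprocess
import tempfile

JOB_SCRIPT = """\
#!/bin/bash
#SBATCH --job-name=cfd-sim
#SBATCH --partition=cfd            # hypothetical CFD cluster partition
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=32
#SBATCH --time=04:00:00
#SBATCH --output=/lustre/scratch/%u/cfd-sim-%j.log  # logs land on the Lustre mount

# Placeholder solver and input deck: in the managed environment the MPI
# stack is pre-installed, so scientists never compile it themselves.
srun ./cfd_solver --input case.cfg
"""

def main() -> None:
    # Write the batch script to a temp file and hand it to Slurm.
    with tempfile.NamedTemporaryFile("w", suffix=".sbatch", delete=False) as f:
        f.write(JOB_SCRIPT)
        script_path = f.name
    # On success, sbatch prints e.g. "Submitted batch job 12345".
    result = subprocess.run(["sbatch", script_path], check=True,
                            capture_output=True, text=True)
    print(result.stdout.strip())

if __name__ == "__main__":
    main()
```

The point of the managed environment is that everything above the srun line (the scheduler, the MPI stack, the Lustre mount) is already configured, so a scientist only edits the solver command and resource counts.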
Results
- Infrastructure improvements and software optimization allowed far more jobs to be submitted to the HPC stack, and the science team to scale its research accordingly
- Transitioning to the managed services environment, including 24/7 monitoring and support, freed the science team to focus on research rather than operations
- Once the infrastructure was properly configured, the HPC cluster scaled up massively: from a 40-node on-prem stack to a fully cloud-based 600-node HPC farm, a 15x increase in capacity
Switching to a managed services environment for their HPC cluster allowed the customer’s top-shelf science team to focus on doing their best work. Any worries about network configurations, vendor support relationships, and software licenses faded into the background with ISC’s managed services.
Work with Insight Softmax
If you have a problem that can be solved with data, we can help. Our problem-solving approach works across company sizes and industries. Contact us to set up a free consultation.
Book now