Yates Laboratory at The Scripps Research Institute is Pushing for Better Understanding of Cystic Fibrosis with the help of Cloud HPC on AWS with CloudyCluster and IP2

Cystic Fibrosis (CF) is one of the most common inherited childhood diseases, affecting 1 in 4,000 children born in the US (www.cff.org). CF is caused by mutations in the gene that encodes the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) protein, which regulates the flow of salt and fluids in and out of cells in different parts of the body. Mutations in the CFTR gene produce a defective protein that is unable to fold into the conformation necessary for proper function.

AWS Spot Pricing Jupyter Notebook

CloudyCluster and CCQ make it easy to use Spot Instances and Spot Fleet for HPC, reducing the cost of computation for compatible workloads (short runtime, restartable, etc.). All you have to do is add directives directly to the job script, and CCQ will take care of creating the Spot and Spot Fleet requests. To help determine the appropriate Spot Instance price and which region and availability zone (AZ) is optimal for the run, we have created a Jupyter notebook that can pull down the pricing for an instance type in a region, across as many AZs in that region as you select.
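As a rough illustration of the kind of lookup described here, the sketch below fetches recent Spot prices for one instance type across selected AZs and summarizes them per zone. It is not the notebook itself: the function names (`fetch_spot_prices`, `summarize_by_zone`, `cheapest_zone`) and the sample records are hypothetical, and it assumes the `boto3` library with configured AWS credentials for the live lookup. The summary functions work offline on any list of price records.

```python
from collections import defaultdict
from statistics import mean


def fetch_spot_prices(instance_type, region, zones, hours=24):
    """Fetch recent Spot price records via boto3 (needs AWS credentials)."""
    # Imported lazily so the offline summary functions work without boto3.
    import boto3
    from datetime import datetime, timedelta, timezone

    ec2 = boto3.client("ec2", region_name=region)
    start = datetime.now(timezone.utc) - timedelta(hours=hours)
    paginator = ec2.get_paginator("describe_spot_price_history")
    records = []
    for page in paginator.paginate(
        InstanceTypes=[instance_type],
        ProductDescriptions=["Linux/UNIX"],
        StartTime=start,
        Filters=[{"Name": "availability-zone", "Values": list(zones)}],
    ):
        records.extend(page["SpotPriceHistory"])
    return records


def summarize_by_zone(records):
    """Group price records by AZ and report min, mean, and max price."""
    prices = defaultdict(list)
    for rec in records:
        # The EC2 API returns SpotPrice as a string, so convert it.
        prices[rec["AvailabilityZone"]].append(float(rec["SpotPrice"]))
    return {
        az: {"min": min(p), "mean": mean(p), "max": max(p)}
        for az, p in prices.items()
    }


def cheapest_zone(records):
    """Return the AZ with the lowest mean Spot price in the records."""
    summary = summarize_by_zone(records)
    return min(summary, key=lambda az: summary[az]["mean"])


# Made-up records for illustration only; these are not real AWS prices.
SAMPLE = [
    {"AvailabilityZone": "us-east-1a", "SpotPrice": "0.10"},
    {"AvailabilityZone": "us-east-1a", "SpotPrice": "0.20"},
    {"AvailabilityZone": "us-east-1b", "SpotPrice": "0.12"},
]

if __name__ == "__main__":
    for az, stats in sorted(summarize_by_zone(SAMPLE).items()):
        print(az, stats)
    print("lowest mean price:", cheapest_zone(SAMPLE))
```

With credentials in place, the same summary could be run on live data with something like `cheapest_zone(fetch_spot_prices("c5.large", "us-east-1", ["us-east-1a", "us-east-1b"]))`. Ranking by mean price is only one possible choice; a workload sensitive to interruptions might rank by maximum price instead.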

Scaling HPC in AWS

Many people have approached us wanting to scale their science or engineering workloads to the level of the 1.1 million vCPU run recently completed by the DICE Lab at Clemson. Others have wanted to automate their workflows in AWS. We have written this paper to demonstrate how to apply the pattern created by that project to your own workflows, and we hope it will guide readers through the process and the tools used.

Natural Language Processing at Clemson University – 1.1 Million vCPUs & EC2 Spot Instances

(reprint)… My colleague Sanjay Padhi shared the guest post below in order to recognize an important milestone in the use of EC2 Spot Instances. — Jeff… A group of researchers from Clemson University achieved a remarkable milestone while studying topic modeling, an important component of machine learning associated with natural language processing, breaking the record for creating the largest high-performance cluster in the cloud by using more than 1,100,000 vCPUs on Amazon EC2 Spot Instances running in a single AWS region.

The OrangeFS Project, an Open Source Product with a Research Outreach

(reprint)… More and more businesses rely on data-driven decisions, simulations, models and customer interaction. More of these data-driven requirements need faster and faster responses, even real-time, and this demand is stressing traditional storage architectures. Historically, storage systems and protocols have single points of ingress and egress for data. These single pinch-points are increasingly becoming bottlenecks, limiting the ability to process data in a timely fashion. The High Performance Computing community has faced these problems since the inception of the Beowulf cluster, the original concept of tying several commodity computers together to distribute computational workloads.

How API-driven Public Clouds and HPC are Bringing a Brighter Future

(reprint)… API-Driven Public Cloud: The increasing popularity of cloud computing is evident in a myriad of news articles, blog posts and videos. Although cloud computing offers many benefits, one of the greatest is largely hidden within the cloud. When public clouds appeared, initial conversation focused on comparing the public cloud to data center virtualization. A critical aspect gradually emerged, the secret sauce: the public cloud is more than just virtualization.
