X-ISS Tweaks Cluster Configuration during Setup to Speed Pump Design Simulations

Personnel at a prominent engineering company in Houston, Texas, no longer have to stay up all night to make sure their pump design simulations finish on time and produce the desired results. Thanks to a Dell HPC cluster deployed and optimized by X-ISS Inc., the company is running simulations 18 times faster than was possible on their computer workstations.

Much of what the company’s Houston office does is destined for the oil industry. One type of time-critical project for the division’s engineers is figuring out why a pump from an oil rig or pipeline has failed and coming up with a solution to the failure.

Usually when a pump breaks down in the oil patch, the flow of petroleum products ceases until the faulty equipment can be repaired or replaced. Every minute that oil isn’t being pumped costs the operator money. It’s up to the company’s engineers to examine the defective hardware and develop a better design that can be rushed to manufacturing as quickly as possible. For the company’s dedicated personnel, that often meant keeping watch over computerized design simulations late into the night or early morning.

“The client uses the ANSYS Fluent software to simulate and test various pump designs,” explained X-ISS CEO Deepak Khosla. “As the simulations became more complex, they simply took too long, even on high-end computer workstations.”

By nature, design simulations are an iterative, trial-and-error process. Engineers have to choose just the right level of detail, or granularity, in the simulation to produce workable design alternatives. Using their workstations, the engineers sometimes had to run a simulation for several hours before seeing that the results would be inadequate. After tweaking the inputs, they then had to restart the simulation from the beginning.

In some situations requiring high detail, the workstation was overwhelmed by data volume and the simulation crashed before completion. Both problems often meant late nights at the office tending to the computer, rather than risking coming in the next morning only to find poor results or a stalled simulation.

The company contracted Dell to build a Windows 2012 HPC cluster that could speed up the simulations in ANSYS Fluent, which scales extremely well in the HPC environment. Among the many advantages, faster simulations meant the engineers could identify and correct problems with their designs in minutes rather than hours, ultimately delivering workable solutions to the oil field customers more quickly.

Tweaking the Deployment

X-ISS worked closely with Dell and ANSYS to design and build the HPC cluster. The HPC system consists of 10 mid-level Dell servers and Cisco 10-gigabit networking equipment. X-ISS personnel deployed, set up and configured the cluster onsite at the company. As is standard procedure, the X-ISS team ensured the Fluent application ran well, fine-tuning the cluster in the process so the client would get maximum speed from the new system.

A critical step in configuring the cluster was designing the networks. X-ISS created three separate networks for data transmission so that large volumes of data could move in many directions at once. The primary network carried the data needed for the nodes to run the Fluent simulations. The second was set up for cluster managers to monitor overall system operations. The third enabled the engineers to run the simulations.

“Three networks maintain system speed and throughput,” said Khosla. “If we had set up just one network for the cluster, it would have had to connect with the company LAN, and the large volume of data would have slowed both the LAN and the cluster.”

A second recommendation made by X-ISS during configuration also contributed to keeping the cluster running fast. The cluster itself had 9 TB of data storage available in the head node. X-ISS suggested the company’s engineers move their pump design data to the cluster and keep it there, rather than moving data back and forth from workstations or remote nodes, which would have slowed jobs.

This configuration concept required buy-in from the company’s IT department because they were the ones responsible for backing up massive volumes of data from the cluster on a regular basis. These backups were required for archiving purposes in case the engineers later had to revisit one of their design simulations. Fortunately, the IT staff agreed with the suggestion, and data is stored locally on the cluster.

To further maximize the power, speed and efficiency of the new HPC cluster, X-ISS ran several validation tests on it, adjusting power and BIOS settings as needed. At the client’s request, the team also set up redundant hardware connections so the cluster could be maintained while it was still operating.

“The company’s engineers can now run a simulation in five minutes that once took 90 minutes,” said Khosla. “The engineers are elated because the simulations no longer cut into their personal time, and the pump re-design process is faster than ever.”

Download this case study: SpeedBoostII.CaseStudy8

X-ISS Helps Provide Major Speed Boost for Water Disinfection Simulations

X-ISS worked closely with simulation software developer ANSYS Inc. to design and build an HPC cluster that completes water treatment simulations in minutes rather than the hours or days once required. X-ISS created the HPC system using seven mid-level Dell servers and InfiniBand networking equipment and then performed extensive fine-tuning on the cluster, resulting in a 98.9% efficiency rating.

The client is a company specializing in disinfecting water using ultraviolet light technology. It treats water destined for drinking use in municipal systems and makes wastewater safe prior to discharge into the environment. The company has thousands of UV treatment installations around the world. One impressive example is a system it developed to serve a metro area, which cleans more than two billion gallons of water per day. This $1.5 billion system is the largest UV drinking water facility in the world.

Designing and building these water treatment systems requires engineering simulations to verify that all the water flowing through a treatment device is exposed to sufficient UV light to kill bacteria and destroy contaminants. The company uses ANSYS Fluent to simulate its water treatment processes. Simulating these designs on a single workstation can take hours or even a couple of days. Company engineers needed to complete their designs faster in order to keep up with market demand for their systems. The company contracted X-ISS to design and build what became the company’s first HPC cluster.

One of the remarkable achievements for X-ISS was the sheer efficiency of the cluster. Cluster performance is measured in “FLOPS,” or floating point operations per second. In theory, the maximum a cluster can compute is found by multiplying these variables: number of processors, number of cores on each processor, clock speed of each processor in GHz, and number of floating point operations each core can perform per clock cycle.

For this cluster, the theoretical maximum was 1,200 gigaFLOPS (GFLOPS), or 1.2 trillion math calculations per second. Through proper design and tuning, the cluster achieved 1,187 GFLOPS on a Linpack cluster efficiency test. That is 1,187 divided by 1,200, or 98.9% efficient. To put that number in a practical perspective, the cluster ran a sample simulation in nine seconds that previously took several minutes on the desktop computers.
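The peak and efficiency arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, and the hardware figures in the second example are hypothetical, chosen to show the formula at work. The case study does not give the cluster's processor counts, core counts or clock speeds. Only the 1,200 and 1,187 GFLOPS figures come from the text.

```python
def theoretical_peak_gflops(processors, cores_per_processor, clock_ghz, flops_per_cycle):
    """Peak GFLOPS = processors x cores per processor x GHz x operations per cycle."""
    return processors * cores_per_processor * clock_ghz * flops_per_cycle


def efficiency(measured_gflops, peak_gflops):
    """Fraction of the theoretical peak actually achieved on a benchmark run."""
    return measured_gflops / peak_gflops


# Figures reported in the case study: 1,187 GFLOPS measured, 1,200 GFLOPS peak.
reported = efficiency(1187, 1200)
print(f"Reported efficiency: {reported:.1%}")  # Reported efficiency: 98.9%

# HYPOTHETICAL hardware, NOT the client's cluster: 2 processors with
# 8 cores each at 2.5 GHz, each core doing 8 operations per cycle.
example_peak = theoretical_peak_gflops(
    processors=2, cores_per_processor=8, clock_ghz=2.5, flops_per_cycle=8
)
print(f"Example peak: {example_peak} GFLOPS")  # Example peak: 320.0 GFLOPS
```

Because the clock speed is expressed in GHz, the product comes out directly in GFLOPS with no unit conversion.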

“The speed of this cluster is remarkable considering it comprises only seven nodes,” said X-ISS President Deepak Khosla. “The cluster performs at the same speed as systems that are much larger and more expensive.”

Most clusters cannot achieve this kind of efficiency because of the sheer volume of data that must be transferred to “feed” the cluster and gather the results. Imagine you are a math teacher passing out 1.2 trillion math problems. Just the logistics of distributing, tracking and gathering the results would take more time and effort than the actual computation. The cluster faces the same challenge: it must distribute trillions of data items and gather 1.2 trillion results, all in one second.

“Key to achieving this kind of performance was the way we ordered and configured the memory in the nodes and set up the InfiniBand topology,” said Khosla. “We also ran our own custom optimization routines on the cluster to improve the way the nodes talk to each other.”

For the company’s engineers, a more practical way to measure cluster performance is to ask how long one step, or iteration, of a simulation takes. The cluster performed each iteration in 0.087 seconds, or about 11.5 iterations per second. Most simulations require hundreds or even thousands of iterations to “solve.” At roughly a dozen iterations per second, simulations requiring thousands of iterations take mere minutes to solve, not hours or days.
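To see how the per-iteration time translates into total solve time, here is a rough Python sketch. The 0.087-second figure comes from the case study. The iteration counts are illustrative stand-ins for “hundreds, even thousands,” not figures from the client's simulations, and the function name is made up for this example. The sketch also assumes every iteration takes the same time, which real solver runs may not.

```python
SECONDS_PER_ITERATION = 0.087  # per-iteration time reported in the case study


def solve_time_seconds(iterations, seconds_per_iteration=SECONDS_PER_ITERATION):
    """Estimated wall-clock time, assuming a constant time per iteration."""
    return iterations * seconds_per_iteration


# Iterations per second is simply the reciprocal of the iteration time.
iterations_per_second = 1 / SECONDS_PER_ITERATION
print(f"About {iterations_per_second:.1f} iterations per second")

# ILLUSTRATIVE iteration counts only.
for iterations in (500, 1000, 5000):
    minutes = solve_time_seconds(iterations) / 60
    print(f"{iterations} iterations: about {minutes:.2f} minutes")
```

Even at 5,000 iterations the estimate stays under eight minutes, which is consistent with the claim that such simulations now solve in minutes.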

The company’s engineers are pleased with the cluster performance and expect to use it frequently to speed up their simulations and get their designs into customers’ hands much faster.

Download this case study: SpeedBoost.CaseStudy5