HPC Announcements

For current information on problems and outages, please go to the HPC News blog.

2013-07-15   CloudBlue - a Wiki for HPC Researchers

CloudBlue (mss.ccs.uky.edu/wiki) is the University of Kentucky's online community for High Performance Computing (HPC) researchers. Run by our researchers with support from CCS and APAT, this is a forum for UK HPC users. Ask questions, help other researchers, and learn more about the HPC facilities at UK. The wiki is what you make it!

Note: when visiting this site you might get a warning from your browser saying that "The certificate is not trusted because it is self-signed." This message can be safely ignored; CCS is working on getting a proper certificate for the site.

2013-07-09   HPC Training Workshop - September 26, 2013

The UK Center for Computational Sciences is hosting a day-long Intel HPC training workshop on Thursday, September 26, at the W.T. Young Library. You can find more details at www.ccs.uky.edu/Conferences/Intel2013/. Please register early, as space is limited; there is no registration fee. Please pass this information along to others who may be interested.

2013-06-18   CPU Limits on all Queues

On the recommendation of the HPC Advisory Committee, we have implemented CPU time limits on all DLX jobs. None of the national HPC sites we have looked at allow unlimited run times, and very long running jobs, especially unmonitored ones, have caused problems for us in the past.

If you don't specify a queue, the job will go into the default Compute queue, as usual. Currently this has a CPU limit of 30 days. If your job is shorter, use the Med queue (7 days) or the Short queue (1 day). For short debugging runs on one node, use the Debug queue, which has a limit of 1 hour. The gauss queue has a CPU limit of 30 days. FatComp and gauss_big have limits of 7 days. The GPU queues still have limits of 3 days.

The CPU limits are subject to change. Use the command queue_wcl to list the current queues and time limits.

$ queue_wcl
PARTITION  AVAIL  TIMELIMIT
debug      up     1:00:00
Compute*   up     30-00:00:00
Short      up     1-00:00:00
Med        up     7-00:00:00
Long       up     30-00:00:00
FatComp    up     7-00:00:00
gauss      up     30-00:00:00
gauss_big  up     7-00:00:00
GPU        up     3-00:00:00
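
For example, a job that fits within one day could request the Short queue with a minimal script like this sketch (the task count and program name are hypothetical; adjust them for your own work):

#!/bin/bash
#SBATCH --partition=Short       # Short queue: 1-day CPU time limit
#SBATCH --time=12:00:00         # request less than the queue maximum
#SBATCH --nodes=1
#SBATCH --ntasks=16             # hypothetical core count
./my_program                    # hypothetical executable

Submit the script with sbatch as usual; if no partition is given, the job goes to the default Compute queue.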

For more information, see the Getting Started: Job Queues web page (www.uky.edu/ukit/hpc/getting-started-job-queues). From the message of the day:

061413 Compute/gauss queue wall-clock times are now capped at 30 days; others are shorter.
061413 FatComp queue WCL is now capped at 7 days; GPU queue WCL is now capped at 3 days.

We're sorry for any inconvenience this may cause.

2012-12-11   New GPU Nodes Available

UKIT is pleased to announce that the GPU-enabled nodes on the new DLX supercomputing cluster are finally available for general use. The old GPU-enabled nodes, which were purchased separately from the old cluster, also remain available. GPU-enabled code often runs many times faster than it would on a CPU alone, but the code must be designed to run on a GPU.

The 28 new GPU nodes are identical to the basic compute nodes (16 cores and 64 GB each), plus each node has two Nvidia M2075 GPUs attached.

The 4 old GPU nodes are based on the old compute nodes (12 cores with 36 GB), plus each node has four Nvidia M2070 GPUs attached.

See Hardware for more details.

To run your GPU-enabled code, add the sbatch partition option to your script (#SBATCH --partition=GPU) or to the sbatch command (sbatch -pGPU). GPU1 is a queue for the old nodes, GPU2 is a queue for the new ones, and the GPU queue will allocate whichever is available. Please don't run non-GPU code on these nodes!
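
For instance (the script name here is hypothetical), either of these forms submits a job to a GPU partition:

sbatch --partition=GPU2 myjob.sh     # run only on the new M2075 nodes
sbatch -pGPU myjob.sh                # let the scheduler pick whichever GPU nodes are free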

2012-12-11   Media Attention

Here are some articles resulting from our Cyber Infrastructure Symposium:

Lexington Herald Leader: UK's new supercomputer puts school in top 10 for computing

Business Lexington: UK Launches New Supercomputer, Network

Kentucky Kernel: UK upgrading technology

UKNow: 'UK at the Half' Highlights Advantages of UK's Latest Supercomputing Capabilities

WKYT: New World-Class Super Computer Leased at UK

HPCWire: UK Takes Academic Supercomputing to Next Level

WUKY: UK's Supercomputer Now Three Times Faster

Symposium Poster

2012-12-03   UKIT Cyber Infrastructure Symposium

Press Conference

University of Kentucky (UK) is leading the way with cutting edge Cyber Infrastructure (CI) resources for our researchers. UK Information Technology (UKIT) is pleased to announce the deployment of the most powerful supercomputer in University of Kentucky history. In addition to the investment in High Performance Computing (HPC), UK has been awarded a million dollar Cyber Infrastructure grant from the National Science Foundation (NSF) to advance research through software-defined networking.

The NSF Cyber Infrastructure grant will be used to develop technologies and techniques to fundamentally change interaction with communication networks. Software-defined networks will allow researchers and their applications to directly control the flow of data between technical resources and collaborators.

UKIT recently deployed a new High Performance Computing cluster in partnership with Dell, Inc. This cluster is more than three times as fast as the one it replaces, installed just two years ago, and has a theoretical peak of just over 140 teraflops (one teraflop is a million million, or 1,000,000,000,000, mathematical calculations per second). The cluster contains almost 5000 CPUs and 48 high performance GPUs. See www.uky.edu/ukit/hpc for more information about High Performance Computing at UK.

As Research Computing at UK celebrates its 25th anniversary, this event will review UK's advancements in computing, reflect on its history, explore how CI enhances current research, and offer a glimpse into the future of research computing. Speakers will include UK President Eli Capilouto, Senior Vice Provost Vince Kellen, Computer Science Professor James Griffioen, and several researchers who utilize UK's cyber infrastructure in their work.

2012-11-15   Introduction to UK's Supercomputer

On Wednesday, November 14, from 1:30 to 2:30, Vijay Nadadur will teach Introduction to UKY-HPC (DLX), a class intended to show faculty, graduate students, and anyone else interested how to get started using UK's DLX cluster. If you have ever wondered what you could do with a world-class supercomputer, you are invited to attend. The class will be held in the Windstream Room of the Hardymon Building, and no registration is necessary. Feel free to pass this announcement along to anyone who might be interested.

2012-11-09   The New Cluster is Available

We are cautiously letting users back onto the new DLX cluster. Just ssh to dlx.uky.edu as usual. Not all of the software from the old cluster has been reinstalled yet, but the sysadmin team will do that as they have time. Email any questions, issues, or problems to help-hpc@uky.edu as usual, but PLEASE look at the New Cluster FAQ first (www.uky.edu/ukit/hpc/faq-new). Note that it may take quite some time before we can get an answer for you.
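
For example, if your account name were abc123 (a placeholder), you would connect with:

ssh abc123@dlx.uky.edu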

2012-10-26   Cutover

We had hoped to allow researchers back onto the upgraded DLX cluster this week, but copying the home and share filesystems took much longer than our vendor estimated (almost a week, rather than a day or so), and we've also had a series of small issues with the storage and software configurations. Almost all of the issues have been solved, but we still need to run the benchmark acceptance tests and the HPL test. At this point we hope to allow you back on some time next week. We're sorry for any inconvenience this might cause.

2012-09-20   New Migration Schedule

The current (always tentative) migration schedule looks like this:

  • Week 4, Sep 24-28: Finish configuring login, admin, and one compute rack [in progress]
  • Week 5, Oct 01-05: Phase I complete. UK sysadmins installing software and testing.
      Dell bringing up compute racks one at a time.
  • Week 6, Oct 08-12: [10/8] Power down most of old cluster, move XDH, power up new cluster.
      DDN starts copy of home / share.
      UK sysadmins continue installing software and testing.
      Dell begins validation testing.
  • Week 7, Oct 15-19: Sometime this week we let our users on.

On Monday, 10/1, our sysadmins will have access to the Phase I hardware. They will start installing software, including the applications needed for the verification runs.

The shutdown of the old cluster is scheduled for Monday, 10/8. The new cluster will be powered up, and Dell will finish its testing during this week. Our sysadmins will continue installing software and testing. At some point this week UK and Dell will finish the verification tests and the HPL (Top500) test.

As soon as possible during the week of 10/15, we will start letting users onto the system.

2012-09-21   Warning about /scratch

The scratch file system on the DLX will not be migrated to the new cluster! If you think you might ever need anything that's currently in scratch, copy it somewhere else ASAP. Scratch is not backed up. Once we turn off the Panasas disk system and ship it back to Dell, it will not be possible to recover any files that were in scratch.

The DLX home file system will be migrated to the new cluster, and home is backed up regularly. You don't need to do anything special for the files in your home directory.

2012-09-20   New Cluster Schedule

The current (tentative) cluster installation schedule is:

  • Week 1, Sep 4-7: Position racks, complete cabling [done]
  • Week 2, Sep 10-14: Configure the head node and one compute node rack [mostly done]
  • Week 3, Sep 17-21: Finish configuring login, admin, and one compute rack [in progress]
  • Week 4, Sep 24-28: Complete compute node rack configurations and rack testing
  • Week 5, Oct 1-5: Power down old cluster, move XDH, power up new cluster, validation testing

Of course, we still need to reinstall all of our compilers, applications, and utilities.

2012-09-04   New Cluster Delivered

The bulk of the hardware for the DLX 2012 cluster was delivered on Tuesday, September 4, 2012. Then the real fun began! For more details, see the New Cluster Delivery page.

2012-06-20   HPC Contract Awarded

The University of Kentucky has awarded a contract to Dell Inc. for a new supercomputer cluster to replace our Lipscomb High Performance Computing Cluster, also known as the DLX. The new cluster will be installed in McVey Hall when it arrives in July and must be fully operational before the end of August. Due to the limited power in McVey Hall, the Lipscomb cluster must be powered down before the full new cluster can be powered up. This will mean an intense effort by Dell technicians, our HPC team, Data Center Operations, and the Center for Computational Sciences to minimize the disruption for our researchers. As soon as possible, a conversion schedule will be posted. For more details, see the New Cluster Announcement page.

2012-03-29   HPC RFP Released

The RFP for our next HPC system was released on March 29, 2012. You can look it up on the Purchasing bid page or download it directly.

2012-02-16   GPU Nodes available

UKIT is pleased to announce that we have added four GPU-enabled nodes to the DLX supercomputing cluster. The nodes are identical to our basic compute nodes (12 cores with 36 GB), except that each node has four Nvidia M2070 GPUs attached. GPU-enabled code often runs many times faster than it would on a CPU alone.

To run your GPU enabled code, use the sbatch command to queue your job as usual, but put this option into your script:

#SBATCH --partition=GPU

Or, you can add the partition flag to the sbatch command line:

sbatch -n12 -pGPU aaa.sh

The PMEMD module in Amber is now GPU-enabled, and we'll send out information about running Amber on the GPU nodes soon. CCS tested Amber sample jobs that ran MUCH faster using GPUs.

If you are interested in GPU-enabling your own code, see the extensive Nvidia GPU developer info at http://developer.nvidia.com/gpu-computing-sdk.

Note that the "SDK" is a misnomer; this is mostly sample code. The Toolkit is the development environment, which you establish by loading the CUDA module (module load cuda).
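
As a quick illustration (the source file name is hypothetical), once the module is loaded the Toolkit's nvcc compiler is available on your path:

module load cuda
nvcc -O2 -o my_kernel my_kernel.cu    # compile a CUDA source file with nvcc from the Toolkit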

Please don't run non-GPU code on the GPU nodes! For technical help, please email your questions to help-hpc@uky.edu.

2010-09-17   Lipscomb Cluster Announcement

The Lipscomb High Performance Computing cluster (dlx.uky.edu) was announced on September 20, 2010, and named after UK alumnus and Nobel Laureate Dr. William N. Lipscomb, Jr.

Installation Photographs.

A video clip of the installation, provided by UK PR and hosted on YouTube.

Plaque presented to Dr. Lipscomb

2010-07-13   New Supercomputer Cluster Announcement

The University has recently awarded a contract to Dell for a new High Performance Computing cluster. See the Tentative Installation Schedule.
