Job Description:


Department Description
The Office of the Vice President for Research and for National Laboratories (OVPRNL) oversees the conduct of sponsored research, technology transfer, research program development, multi-institutional research institutes, and national laboratory board and contract management functions. OVPRNL helps to develop and coordinate research-related communications and educational programs at The University of Chicago. OVPRNL oversees the management of two Department of Energy contracts for Argonne National Laboratory and Fermi National Accelerator Laboratory. When combined with the Lab R&D budgets, the office oversees approximately $1.4 billion in sponsored research. OVPRNL works closely with individual scholars, departments, and divisions to encourage, seed, and coalesce research across the University, Argonne, and Fermilab campuses. This includes joint-faculty and -staff appointments that cross divisional and institutional boundaries.

General Job Summary:

The University of Chicago is seeking a highly qualified HPC System Administrator to build and manage the HPC systems and facility operations of its newly created Research Computing Center. The HPC System Administrator will be involved in identifying the University HPC datacenter facility and in the colocation, procurement, and management of HPC hardware.

The responsibilities include but are not limited to:

-        Installing, configuring, and maintaining large computer clusters/servers and software.
-        Day-to-day operation of the systems, including systems administration and monitoring of system and storage performance, up to and including network components.
-        Management of the system's network switch, parallel file system and HPC software stack and tools.
-        Configuration of the scheduling and queuing system.
-        Diagnosing and resolving system operational problems quickly and effectively.
-        Coordinating with vendors to resolve hardware and software problems.
-        Building and deploying open source software and software from vendors/partners.
-        Providing reliable and efficient backups/restores for all managed systems.
-        Documenting system administration procedures for routine and complex tasks.
-        Maintaining and monitoring the security of the HPC systems and servers.

-Bachelor's degree or higher in computer science or a closely related field, or equivalent experience, required.
-A minimum of three years of full-time Linux system administration experience in a large distributed computing environment required.
-Experience with installing, configuring, and maintaining job management support tools (Moab, TORQUE, SGE) required.
-Experience installing MPI libraries and OpenMP required.
-Experience with operating system deployment tools (e.g., xCAT, ROCKS) required.
-Experience configuring, administering, and supporting storage subsystems (e.g., NetApp, DataDirect, IBM, LSI) required.
-Experience installing and maintaining diskless clustered environments, and provisioning systems using automated installation methods required.
-Direct experience working with InfiniBand (must at least be able to demonstrate a working knowledge of InfiniBand concepts, OFED layers, and subnet managers) required.
-Experience with one or more distributed file systems (Lustre, Gluster, GPFS, GFS, IBRIX, PVFS, etc.) required.

-Experience configuring, installing, tuning and maintaining scientific application software preferred.
-A minimum of two years of experience supporting Linux HPC clusters used for scientific research preferred.
-Experience supporting HPC compilers and libraries preferred.
-Scientific programming experience preferred.
-Experience configuring, installing, maintaining and/or using performance monitoring and optimization tools preferred.
-Experience with installing and maintaining Condor high throughput computing software preferred.
Competencies

-Ability to develop and maintain programs and scripts that aid in the operation and automation of administrative tasks using various shell and scripting languages (bash, Perl, Python) required.
-Ability to plan, organize, prioritize tasks, and complete assigned projects with minimal supervision required.
-Strong interpersonal and communication skills required.

To apply for this position, please visit our Web site at https://jobopportunities.uchicago.edu and search postings for job requisition number: 091676

The University of Chicago is an Affirmative Action / Equal Opportunity Employer.

Job Category:

Computer and Information Technology

Career Level:

Mid Career (2+ years of experience)

Job Type:

Full Time/Permanent