2015-03-15

## _What You Will Do_

The Infrastructure team within HPC-3 is responsible for providing system and run-time support for HPC's production storage infrastructure in support of compute clusters. Team member duties include system administration of HPC infrastructure; diagnosing, solving, and implementing solutions for various system operational problems; tuning file systems to increase performance and reliability of services; automating common processes when possible; interacting with vendors; and communicating and collaborating with other groups, teams, projects, and sites. The selected candidate will participate in a regularly scheduled rotation of on-call support of production systems, including some systems under 24x7 support. In addition, some non-standard working hours may occasionally be required.

**_This position will be filled at one of the Scientist 1 - 4 levels, as dictated by current programmatic needs and the skills of the selected candidate. Job responsibilities (outlined below) will be assigned in accordance with the level at which the selected candidate is hired._**

**Level Title One: Scientist 1 ($72,100 - $117,900)**
The successful candidate will be required to:

* Work under the supervision and guidance of senior HPC filesystem administrators to provide technical assistance in problem solving and day-to-day operation of various supercomputing systems

* Steadily increase responsibilities as knowledge of our environment and HPC filesystems increases

**Level Title Two: Scientist 2 ($79,600 - $133,100)**

The successful candidate will be required to:

* Work independently and interactively with other filesystem administrators

* Participate in process improvement and deep multi-system problem isolation and resolution in coordination with administrators of other HPC subsystems

* Propose and implement solutions when presented with projects in our HPC environment

* Communicate the strategies and successes of HPC Division to Laboratory organizations

**Level Title Three: Scientist 3 ($86,400 - $148,200)**

The successful candidate will be required to:

* Work as a technical leader to implement solutions to current problems and future deficiencies in our HPC environment in conjunction with junior and senior filesystem administrators and technical members of other HPC teams

* Proactively examine our HPC environment and propose projects to make it better

* Communicate the strategies and successes of HPC Division to national peers and participate in national strategic partnerships

**Level Title Four: Scientist 4 ($105,500 - $177,800)**

In addition to the duties outlined above, the Scientist 4 will be required to:

* Work closely with fellow HPC filesystem administrators as a technical leader and mentor to define and implement solutions both on tactical and strategic levels

* Help set future direction of HPC-3 Group and HPC Division by directing work and participating in filesystem administration efforts

* Partner with national peers and lead national strategic partnerships

## _What You Need_

**Minimum Job Requirements:**

**Scientist 1:**

* Strong interpersonal and communication skills

* Basic Linux system management experience

* Knowledge of building, configuring, and administering production Linux or Unix computer systems

* Experience with system management programming in Bash, Perl, Python, or similar scripting languages; experience programming in C/C++ preferred

* Basic knowledge of networking concepts and technology such as Ethernet, InfiniBand, SAN, Fibre Channel, and LNet

* Experience diagnosing system-level problems

**Scientist 2:**

* Demonstrated experience with system management programming in Bash, Perl, Python, or similar scripting languages; experience programming in C/C++ preferred

* Knowledge of administration of parallel/distributed filesystems and archives

* Knowledge of data transfer tools such as Globus, BBCP, HSI, and BitTorrent

* Working knowledge of networking concepts and technology such as Ethernet, InfiniBand, SAN, Fibre Channel, and LNet

* Demonstrated experience troubleshooting and debugging computing system problems

* Ability to present practice and experience reports to peers locally or at conferences

**Scientist 3:**

* Advanced Linux system management experience with emphasis on configuration management and system automation

* Intricate knowledge of Unix/Linux operating systems

* Demonstrated experience with production system management programming in more than one scripting language

* Demonstrated ability to write useful programs in at least one high level language and more than one scripting language in a production environment

* Demonstrated experience configuring data transfer services such as Globus, BBCP, HSI, and BitTorrent

* Demonstrated experience configuring and implementing multiple filesystems or enterprise level archives in a production environment

* Experience evaluating competing filesystem and archive technologies

* Demonstrated experience troubleshooting and debugging computing systems at large scale

**Scientist 4:**

* Demonstrated ability to act as a technical leader in a core area of HPC filesystem administration, such as parallel filesystems, archive solutions, or data transfer

* Ability to mentor students and junior technical staff as demonstrated by a track record of mentoring activity

* Strong understanding of High Performance Computing system design

* Demonstrated experience designing HPC systems, focusing on scale and evolving capability

* Demonstrated ability to lead and manage technical projects

* Experience evaluating integrated HPC technologies with focus on tiered data storage

* Experience presenting talks or keynotes to peers locally or at conferences

**Desired Skills:**

* Experience or expertise with HPC filesystem and archive technologies such as HPSS, TSM, object storage, and erasure coding

* Demonstrated publication record

* Experience with additional file systems such as NFS, ZFS, Ceph, and OrangeFS

* Record of technical leadership

* Knowledge of statistics, data analytics, or similar fields

* Experience managing and administering production HPC clusters

* Demonstrated experience leading multi-person projects

* Experience working in a secure environment and security hardening systems

* Ability to present technical papers to peers locally or at conferences

* Ability to mentor students and junior technical staff as demonstrated by a track record of mentoring activity

**Education:**

* Scientist 1 and 2: Typical educational requirement is a bachelor’s, master’s, or doctoral degree in science from an accredited college or university.

* Scientist 3 and 4: Typical educational requirement is an advanced degree in science from an accredited college or university.

**Additional Details:**

**Clearance: Q** (Position will be cleared to this level). Applicants selected will be subject to a Federal background investigation and must meet eligibility requirements* for access to classified matter.

*Eligibility requirements: To obtain a clearance, an individual must be at least 18 years of age; U.S. citizenship is required except in very limited circumstances. See DOE Order 472.2 for additional information.

**Pre-Employment Drug Test:** The Laboratory requires successful applicants to complete a pre-employment drug test and maintains a substance abuse policy that includes random drug testing.

**Regular position:** Term-status Laboratory employees applying for regular-status positions are converted to regular status only with approval of the cognizant Principal Associate Director.

**Equal Opportunity:** Los Alamos National Laboratory is an equal opportunity employer and supports a diverse and inclusive workforce. All employment practices are based on qualification and merit, without regard to race, color, national origin, ancestry, religion, age, sex, gender identity, sexual orientation or preference, marital status or spousal affiliation, physical or mental disability, medical conditions, pregnancy, status as a protected veteran, genetic information, or citizenship within the limits imposed by federal laws and regulations. The Laboratory is also committed to making our workplace accessible to individuals with disabilities and will provide reasonable accommodations, upon request, for individuals to participate in the application and hiring process. To request such an accommodation, please send an email to applyhelp@lanl.gov or call 1-505-665-5627.

## _Where You Will Work_

Located in northern New Mexico, Los Alamos National Laboratory (LANL) is a multidisciplinary research institution engaged in strategic science on behalf of national security. LANL enhances national security by ensuring the safety and reliability of the U.S. nuclear stockpile, developing technologies to reduce threats from weapons of mass destruction, and solving problems related to energy, environment, infrastructure, health, and global security concerns.

The High-Performance Computing Division (HPC) provides production high-performance computing systems services to the Laboratory. Our work spans the early phases of acquisition, development, and production readiness of HPC systems and infrastructure, continuing to the maintenance and operation of these systems and the facilities in which they are housed, as well as the network, parallel file system, storage, and visualization infrastructure associated with these platforms. The Division’s goal is to create an effective HPC environment in which scientists can be as productive as possible. Additionally, we support selected research activities that we deem important to our mission.

*Location:* Los Alamos, NM, US

*Contact Name:* Vigil, Kenneth A

*Organization Name:* HPC-3/High Performance Computer Systems

*Email:* kennethv@lanl.gov

*Job Title:* HPC File Systems Administrator (Scientist 1-4)
