CLOUD INFRASTRUCTURE ENGINEER PRINCETON, NJ Direct Hire (Full Time Employee). Hybrid (2 days per week).
Project Description:
Systems engineering for scientific computing to support advanced compute environments in the AWS cloud.
Systems engineering for hybrid cloud environments, integrating data center infrastructure and compute clusters with cloud and lab workflows.
Participate in all stages of cloud infrastructure provisioning, primarily providing development, test, and production support.
AWS is the primary cloud platform.
Implement cloud services, process, and cost optimizations.
Implement security best practices and initiatives at all levels of the cloud, data center, and lab systems.
Adhere to principles/pillars of incident management and service level objectives (ServiceNow).
Work closely with Scientific Computing Engineering infrastructure engineers to apply and improve automation scripts and system designs that increase system efficiency in the production environment.
Ensure maximum uptime and stability of cloud and on-premises systems across development, test, and production environments.
Apply the latest OS and security patches while ensuring compatibility with the applications running on those systems.
Handle service desk & JIRA tickets and mitigate any production issues.
Ensure accurate knowledge base documentation in a timely manner.
Participate in a weekly on-call rotation (~every 3-4 weeks) as needed.
Provide mission-critical production support outside business hours in case of an outage, if necessary.
Required Skills:
Bachelor's degree in a technology-related field, engineering, or computer science (a plus).
AWS Cloud Practitioner / Solutions Architect - Associate certification is a must.
Strong knowledge of AWS cloud platforms and services (3+ years) is required.
Relevant work experience (3+ years) in IT infrastructure and data center management.
Experience with hybrid infrastructure systems.
Experience as a Linux and Windows server administrator.
Ability to work with little supervision; must be self-driven and motivated.
Manage and optimize the unified logging system and APM (Application Performance Management) monitoring tools to continually reduce MTTR (Mean Time to Recovery).
Monitoring and proactive incident management.
Some knowledge of web application programming languages (such as JavaScript, Node.js, Java, etc.).
Ability to triage and troubleshoot urgent production issues precisely under high time pressure.
Experience working collaboratively with various R&ED teams throughout the organization to resolve mission-critical problems.
Excellent written and oral communication skills necessary to produce and process technical documents.
Excellent problem-solving and analytical skills and the ability to translate business requirements into information systems solutions.
Experience with IT security.
Must be a team player.
Google Cloud or Azure experience is a plus.
Experience with containerized microservices delivered with Docker, Kubernetes (Kops, AWS EKS), or OpenShift 4.X is a plus.
Red Hat Certified Engineer certification is a plus.
Background in Pharmaceutical or Biotechnology is a plus.
Strong scripting skills using Shell and Python or Go (a plus).
This is a full-time, direct-hire position that starts ASAP.
Please e-mail your resume (as an attachment) with rate and availability to Malika: malika@alphaconsulting.com
ALPHA'S REQUIREMENT #23-00096
MUST BE ELIGIBLE TO WORK IN THE U.S. AS AN HOURLY W2 EMPLOYEE