A bit about us:
We are an innovative, AI-driven technology startup looking for a Principal Site Reliability Engineer (SRE) with a focus on AI security. Our company is built around a SaaS model, and we are at the forefront of protecting AI/ML algorithms for public and private companies around the world. As an SRE, you will play a critical role in maintaining the health and security of our SaaS solutions. This role offers the opportunity to work with a dynamic team and contribute to a groundbreaking product.
Job Details
Responsibilities:
- Lead the design, implementation, and maintenance of our AI-powered cybersecurity platform.
- Develop and implement best practices for system reliability, scalability, operability, and security.
- Use your Python programming skills to automate processes, reduce system complexity, and improve system performance.
- Use Kubernetes, Terraform, and AWS to deploy, scale, and manage our services and cloud resources.
- Collaborate with the engineering team to optimize code paths and database queries.
- Troubleshoot and resolve system outages or degradation and implement solutions to prevent their recurrence.
- Participate in the on-call rotation and respond to service incidents.
- Create and maintain detailed documentation of the system architecture and troubleshooting guides.
- Work closely with the product team to understand end-user requirements and translate them into robust solutions.
- Continually update our security practices in response to new and emerging threats.
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- Minimum of 8 years of experience as an SRE, DevOps engineer, or in a similar role, in a startup environment.
- Strong experience with Python, Kubernetes, Terraform, and AWS.
- Proven experience in cybersecurity, ideally with a focus on AI security.
- Deep understanding of SaaS architectures and delivery models.
- Familiarity with AI and machine learning concepts.
- Strong problem-solving skills and the ability to work under pressure.
- Excellent communication skills and the ability to work effectively with a remote team.
- Must be self-driven, proactive, and eager to learn and keep up with the latest technologies.
- Experience with continuous integration and continuous delivery (CI/CD) pipelines.
- Certifications in AWS, Kubernetes, or cybersecurity would be a plus.
If you are passionate about AI and cybersecurity, and you love the fast-paced and dynamic nature of a startup, then this is the perfect opportunity for you. We can't wait to see the unique contributions you'll bring to our team!