Jan 23, 2025
New York Life Insurance Company
Location Designation: Hybrid - 3 days per week

What You'll Do:

* Monitoring and Incident Management:
  o Develop and maintain monitoring, alerting, and logging systems to proactively detect and resolve incidents.
  o Perform root cause analysis and implement solutions to prevent recurrence.
  o Manage incident response, including on-call rotations, triaging, and escalation.
* Infrastructure Automation and Management:
  o Create and manage Infrastructure as Code (IaC) using tools like Terraform.
  o Automate deployments, scaling, backups, and disaster recovery processes.
  o Develop and maintain CI/CD pipelines to ensure smooth deployment and rollback processes.
* Performance and Reliability Optimization:
  o Analyze performance metrics and optimize infrastructure and application performance.
  o Define and enforce Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
  o Conduct capacity planning and scaling to manage anticipated loads.
* Security...
Professional Diversity Network
Lebanon, NJ, USA
Full-Time