We are looking for someone who can lead SRE in financial services to support challengers!
Job Overview
■Mission
Our mission is to provide customers with the most reliable financial infrastructure to process card transactions and offer a range of financial services that support businesses. As a Senior Site Reliability Engineer at UPSIDER, you will play a pivotal role in ensuring our systems are robust, resilient, and scalable. You will lead efforts to improve our service uptime and performance, enabling our customers to grow their businesses with confidence.
■Responsibilities
As a Senior Site Reliability Engineer, your primary responsibilities will include, but are not limited to, the following:
・Service Level Indicators (SLIs) & Metrics
Define key Service Level Indicators (SLIs) and break them down into specific, measurable metrics that reflect the health and performance of our financial services infrastructure.
・Managing Monitoring and Alerting Systems
Design, build, and maintain advanced monitoring and alerting systems to promptly detect and address issues before they impact our customers. This includes leveraging a variety of tools and technologies to gain insights into system performance and reliability.
・Service Level Objectives (SLOs)
Work closely with engineering teams to establish and support Service Level Objectives (SLOs) that align with our mission to provide reliable services. Provide guidance and support to engineers in achieving these SLOs through effective monitoring, alerting, and incident response strategies.
・Incident Management and Response
Lead and participate in the incident response process, including post-mortem analysis and implementing preventive measures to minimize recurrence.
・Continuous Improvement
Continuously evaluate and improve our infrastructure and processes to enhance reliability, scalability, and efficiency. Foster a culture of innovation and excellence within the team.
・Cross-Functional Collaboration
Foster a collaborative environment by working closely with Infrastructure/Platform Teams, Application Developers, and the Information Security Team to ensure system architectures and deployments are optimized for security, reliability, and scalability. Coordinate with these teams to implement best practices for infrastructure management, application development, and security. Drive the integration of SRE principles into the broader engineering culture, facilitating knowledge sharing and joint problem-solving efforts to enhance overall system performance and security posture.
・Team Leadership and Mentorship
Oversee a team of junior SREs and other technical staff, providing mentorship, guidance, and support to ensure professional growth and achievement of team objectives.
【For Your Information】
▼Engineer Deck
https://speakerdeck.com/upsider_official/upsider-engineering-deck
▼UPSIDER Tech Stack
https://whatweuse.dev/company/upsider
▼UPSIDER Entry Book (Japanese only)
https://www.notion.so/5c3e32a157fc4e368999848986b6c02a?v=16c8ebaebf214efaaa2f69f08834fb70
Required Skills
・A minimum of 3 years of experience in a Site Reliability Engineering role or similar, with at least 2 years in a leadership position.
・Deep understanding of SLIs, SLOs, and SLAs and their importance in maintaining high service reliability and performance.
・Proficiency in designing, building, and maintaining monitoring and alerting systems using tools such as Prometheus, Grafana, and the ELK stack.
・Experience with cloud services (e.g., AWS, Google Cloud Platform, Azure) and with container and orchestration technologies (e.g., Docker, Kubernetes).
・Strong knowledge of infrastructure as code (IaC) practices and tools (e.g., Terraform, Ansible).
・Excellent problem-solving skills, with the ability to lead root cause analysis and implement strategic solutions.
・Strong leadership and communication skills, capable of mentoring junior team members and collaborating with cross-functional teams.
Preferred Skills
・Experience working in a start-up or fast-growing company.
・Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
Who We Are Looking For
・You have a strong interest in and knowledge of Cloud Native technologies
・You are data-driven and can make bold, innovative decisions based on research and practical testing
・You have a strong focus on our users’ success, and their trust in our service as a whole
・You have experience building quality solutions iteratively as part of a team
・You are passionate about technical learning, and sharing your learnings with others
・You are comfortable with, and excited about, tackling ambiguous challenges and shaping our services
・You are comfortable writing documentation in English, and speaking in English both informally and during formal presentations
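Application Details
Salary | Determined according to company regulations. Bonuses available (paid once every half-year, based on grade and performance) |
---|---|
Work location | Remote work or Tokyo office (KDX Iidabashi Building). Note: in March 2023 we moved from WeWork Hibiya to our own office in Iidabashi. |
Employment type | Full-time (regular employee) |
Work arrangement | Full flex, fully remote (no designated in-office days or hours). Weekends and public holidays off, paid leave, year-end/New Year and summer holidays, maternity and childcare leave available |
Probation period | Yes (6 months) |
Benefits | ・Commuting expenses paid ・PC provided upon joining ・Full social insurance coverage ・3 days of "startup leave" granted upon joining |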
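Company Information
Company name | UPSIDER, Inc. (株式会社 UPSIDER) |
---|---|
Established | May 2018 |
Head office | 7-15-5 Roppongi, Minato-ku, Tokyo |
Capital | 8,794 million yen (including capital surplus) |
Number of employees | 100 |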