GEICO
Senior Staff Engineer – Data Lakehouse Platform
Job Description
At GEICO, we offer a rewarding career where your ambitions are met with endless possibilities. Every day we honor our iconic brand by offering quality coverage to millions of customers and being there when they need us most. We thrive through relentless innovation to exceed our customers’ expectations while making a real impact for our company through our shared purpose.
When you join our company, we want you to feel valued, supported, and proud to work here. That’s why we offer The GEICO Pledge: Great Company, Great Culture, Great Rewards and Great Careers.

Position Summary
GEICO is seeking an experienced engineer with a passion for building high-performance, low-maintenance, zero-downtime platforms and core data infrastructure. You will help drive our insurance business transformation as we transition from a traditional IT model to a tech organization with engineering excellence as its mission, while co-creating a culture of psychological safety and continuous improvement.
Position Description
Our Staff/Senior Staff Engineer is a key member of the engineering staff, working across the organization to innovate and bring the best open-source data infrastructure and practices into GEICO as we embark on a greenfield project to implement a core Data Lakehouse for all of GEICO’s core data use cases across each of the company’s business verticals.

Position Responsibilities
As a Senior Staff Engineer, you will:
- Scope, design, and build scalable, resilient Data Lakehouse components
- Lead architecture sessions and reviews with peers and leadership
- Spearhead new software evaluations and innovate with new tooling
- Design and lead the development and implementation of compute-efficiency projects such as a Smart Spark auto-tuning feature
- Drive performance regression testing, benchmarking, and continuous performance profiling
- Own the quality, usability, and performance of the solutions you deliver
- Determine and support resource requirements, evaluate operational processes, measure outcomes to ensure desired results, and demonstrate adaptability while sponsoring continuous learning
- Collaborate with customers, team members, and other engineering teams to solve our toughest problems
- Be a role model and mentor, helping to coach and strengthen the technical expertise and know-how of our engineering community
- Consistently share best practices and improve processes within and across teams
- Share your passion for staying on top of the latest open-source projects, experimenting with and learning new technologies, participating in internal and external OSS technology communities, and mentoring other members of the engineering community

Qualifications
- Deep knowledge of Spark internals, including Catalyst, Tungsten, AQE, CBO, scheduling, shuffle management, and memory tuning
- Proven experience tuning and optimizing Spark jobs on hyper-scale Spark compute platforms
- Mastery of Spark configuration parameters, resource tuning, partitioning strategies, and job execution behaviors
- Experience building automated optimization systems, from config auto-tuners to feedback loops and adaptive pipelines
- Strong software engineering skills in Scala, Java, and Python
- Ability to build tooling that surfaces meaningful performance insights at scale
- Deep understanding of auto-scaling and cost-efficiency strategies in cloud-based Spark environments
- Exemplary ability to design and develop solutions, run experiments, and influence engineering direction and the product roadmap
- Advanced experience developing new and enhancing existing open-source-based Data Lakehouse platform components
- Experience cultivating relationships with and contributing to open-source software projects
- Experience with open-source table formats (Apache Iceberg, Delta Lake, Apache Hudi, or equivalent)
- Advanced experience with open-source compute engines (Apache Spark, Apache Flink, Trino/Presto, or equivalent)
- Experience with cloud computing (AWS, Microsoft Azure, Google Cloud, hybrid cloud, or equivalent)
- Expertise in developing distributed systems that are scalable, resilient, and highly available
- Experience with container technologies such as Docker and with Kubernetes platform development
- Experience with continuous delivery and infrastructure as code
- In-depth knowledge of DevOps concepts and cloud architecture
- Experience with Azure networking (subscriptions, security zoning, etc.) or equivalent

Preferred Qualifications
- Active or past Apache Spark committer (or significant code contributions to OSS Apache Spark)
- Experience with ML-based optimization techniques (e.g., reinforcement learning, Bayesian tuning, predictive models)
- Contributions to other big data open-source projects (e.g., Delta Lake, Iceberg, Flink, Presto, Trino)
- Background in designing performance regression frameworks and benchmarking suites
- Deep understanding of Spark accelerators (Spark RAPIDS, Apache Gluten, Apache Comet, Apache Auron, etc.); committer status in one or more of these projects is a plus