Comcast

Sr. Software Engineer – Incident Management – Chicago, IL OR Denver, CO – Onsite

21 November 2025
£129516 - £194274 / year

Job Description

FreeWheel, a Comcast company, provides comprehensive ad platforms for publishers, advertisers, and media buyers. Powered by premium video content, robust data, and advanced technology, we’re making it easier for buyers and sellers to transact across all screens, data types, and sales channels. As a global company, we have offices in nine countries and can insert advertisements around the world.

Job Summary

We’re looking for a Sr. Software Engineer with Incident Management experience to be the central point of accountability for Incident Management in Software Engineering. This role is special because it combines deep technical expertise with strong collaboration and communication skills, ensuring we not only resolve incidents quickly but also turn them into long-term improvements. You’ll split your time between technical ownership (leading root cause analysis, retrospectives, and system hardening) and cross-functional collaboration (working with Engineering teams on improvement plans and with the COO/client-facing teams on impact analysis and clear communications).

This role is key to building a resilient, reliable, and learning-focused culture where every incident strengthens our systems, our processes, and our customer trust. As our customer base grows globally, you’ll also help us ensure consistent, high-quality service across time zones and regions.

This role is about creating consistency, building trust, and making sure escalations become opportunities to improve – not just problems to patch.

Job Description

Technical Ownership (50%)

Own the Escalations lifecycle within Engineering, from the beginning through resolution. Lead root cause analysis (RCA) sessions that dig deeper than symptoms and deliver long-lasting fixes.

Facilitate retrospectives and follow-ups, turning lessons learned into clear improvement plans. Define and track metrics (incident frequency, resolution times, client impact), and make them visible through dashboards and reports. Partner with teams to strengthen systems through tooling, automation, and platform hardening.

Keep a cross-platform perspective (TV, Data, Beeswax, Strata) to spot patterns and systemic issues.

Collaboration & Communication (50%)

Lead Incident Management reviews and improvement sessions with leadership, highlighting what happened, why, and how we’ll prevent it next time.

Support a culture of learning and transparency by running training, knowledge-sharing, and quality workshops. Act as the single voice for Engineering in incident management, making sure communication is consistent and clear at all levels. Collaborate with Engineering (Tier 2/3) to resolve incidents quickly and share learnings across teams.

Partner with Operations (Tier 1) to fine-tune escalation paths and help reduce unnecessary hand-offs. Work closely with the COO team to analyze client impact and provide crisp, timely updates during incidents.

Requirements

6+ years of technical experience in software engineering, site reliability, or production operations.

Proven track record of managing the full software development lifecycle (SDLC), from requirements gathering to production release.

Hands-on understanding of full-stack components:

Frontend/UI frameworks and client experience
APIs & service layers
Database layer (SQL/NoSQL, data modeling, performance tuning)
Backend servers and distributed systems

