Transforming Enterprise Remediation: UX for AI-Driven DevOps Automation

Modern enterprises are overwhelmed by the complexity of managing applications, infrastructure, and security vulnerabilities. At AT&T, 70% of manual vulnerability remediations took more than 90 days, which increased security risk, created operational inefficiencies, and wasted cloud resources. As the UX Lead, I was tasked with designing a user-centered, AI-driven platform to automate vulnerability management, software modernization, and cloud cost optimization, making remediation faster, safer, and more scalable.

 

Challenge

Enterprise IT teams struggled with:

  • Slow, manual vulnerability remediation processes

  • Inefficient cloud resource management (wasting up to $179B globally)

  • High cost of maintaining outdated infrastructure ($600B in legacy system maintenance in the U.S. alone)

  • Lack of visibility and user trust in automation processes

The opportunity: Create an intuitive, transparent platform that empowers users to trust, control, and scale AI-driven remediation, cutting costs and risk without overwhelming complexity.

 

Research & Discovery

Stakeholder & User Interviews

  • Conducted interviews with engineers, DevOps leads, and executive sponsors to uncover pain points

  • Synthesized needs: speed, trust, control, and contextual transparency

Heuristic Review

  • Assessed early internal MVP tools for gaps in usability, cognitive load, and transparency

 

[Figure: Early internal MVP]

[Figure: Comparison Research]

Feedback Highlights

  • Users feared "blind" automation: "What does this button do?"

  • Desired gradual engagement: "I started slowly... then moved more aggressively."

Artifacts Produced

  • Journey maps, lean UX canvases, and competitive analyses of DevOps automation platforms

 
 

Agile UX in DevOps Process

To deliver this solution, I applied an Agile UX methodology built on four principles and a seven-stage cycle: Plan, Design, Develop, Test, Deploy, Review, and Launch.

Principles:

  • Understand problems before attempting to solve them

  • Support risk-taking and experimentation

  • Build only what delivers real value to users and the business

  • Iterate based on data and user feedback

Plan

  • Gathered requirements and defined goals with engineers, product owners, and leadership

  • Conducted interviews with SMEs and internal users to understand remediation workflows

  • Created a comprehensive research backlog and documented user journeys, pain points, and opportunities

  • Defined UX metrics to track success: Users, Adoption, Engagement, Retention, Signals, and SUS (System Usability Scale)
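
Of the metrics above, SUS (the System Usability Scale) is the one with a fixed scoring formula, so it is worth showing how a score is derived. The sketch below follows the standard SUS scoring rule: ten items answered on a 1-5 scale, odd-numbered items contribute (answer - 1), even-numbered items contribute (5 - answer), and the sum is multiplied by 2.5 to give a 0-100 score. The function names `sus_score` and `mean_sus` are my own for illustration; this is not the project's actual tooling, and the sample answers are made up.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Return the 0-100 SUS score for one respondent's ten answers (1-5 each)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")

    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: higher is better.
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: reverse the scale.
            total += 5 - answer
    return total * 2.5


def mean_sus(all_responses: Sequence[Sequence[int]]) -> float:
    """Average the per-respondent SUS scores for a test round."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up answers from two hypothetical participants.
    round_one = [
        [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
        [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    ]
    print(mean_sus(round_one))
```

Tracking the mean per test round gives a single number to compare across iterations, alongside the adoption and engagement metrics.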

 

Design

  • Mapped out customer journeys and logical flows

  • Developed wireframes, high-fidelity mockups, and clickable prototypes

  • Introduced key concepts such as notifications, user profiles, and reoriented navigation

  • Provided Loom-recorded walkthroughs for broader team alignment

 

[Video: Loom Recording]

 

Develop

  • Collaborated closely with developers, clarifying requirements and rapidly responding to UX questions

  • Adapted designs based on technical constraints while preserving user-centered principles

Test

  • Conducted iterative usability testing and feedback sessions

  • Refined navigation patterns, terminology, and prioritization based on internal user insights

Deploy

  • Supported launch readiness by reviewing visual and functional consistency

  • Created onboarding materials and supported user guide creation for smooth adoption

Review

  • Analyzed user engagement metrics and gathered feedback

  • Documented sprint insights and surfaced future backlog items

  • Guided post-launch iterations and new feature planning

Launch

  • Ensured a polished final product for broader release

  • Monitored feedback and metrics post-launch to drive continuous improvement

 
 

Solution & Implementation

 

Key Product Components

  • Security Vulnerability Remediation: AI identifies and fixes security risks

  • DevOps Pipeline Refactoring: Automates modern CI/CD improvements

  • Infrastructure Automation: Powers VM resizing, cloud resource right-sizing, and cost minimization

  • Cloud Cost Optimization: Identifies inefficiencies in cloud usage

  • Software Upgrade Automation: Updates and patches legacy systems

  • Application Code Modernization: Refactors code for performance and security

UX Highlights

  • Centralized control panel with real-time remediation feedback

  • Transparent reporting of successful remediations, changes, projected impacts, and rollback options

  • Progressive onboarding: easing users into trusting AI interventions

 

Challenge: Building Trust in AI-Driven Remediation

Problem: Users were hesitant to trigger automated remediations, fearing unknown impacts on production systems.

Solution:

  • Implemented progressive disclosure to show what each remediation action would do

  • Introduced sandbox testing before live deployment

  • Designed clear feedback loops, including rollback options, cancellation of scheduled remediations, and success confirmations
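
One way to picture the safeguards above is as a small state machine in which a remediation must pass through a preview and a sandbox run before it can go live, and can be canceled or rolled back along the way. The sketch below is only an illustration of that idea. The stage names, the `ALLOWED` transition table, and the `Remediation` class are all assumptions of mine, not the platform's actual design.

```python
from enum import Enum


class Stage(Enum):
    PREVIEW = "preview"          # user sees what the action would change
    SANDBOX = "sandbox"          # change is tried in a test environment
    SCHEDULED = "scheduled"      # live run is queued and can still be canceled
    LIVE = "live"                # change is being applied to production
    CONFIRMED = "confirmed"      # success confirmation shown to the user
    ROLLED_BACK = "rolled_back"  # change was reverted
    CANCELED = "canceled"        # user backed out before the live run


# Hypothetical transition table: which stages may follow each stage.
ALLOWED = {
    Stage.PREVIEW: {Stage.SANDBOX, Stage.CANCELED},
    Stage.SANDBOX: {Stage.SCHEDULED, Stage.CANCELED},
    Stage.SCHEDULED: {Stage.LIVE, Stage.CANCELED},
    Stage.LIVE: {Stage.CONFIRMED, Stage.ROLLED_BACK},
    Stage.CONFIRMED: {Stage.ROLLED_BACK},
    Stage.ROLLED_BACK: set(),
    Stage.CANCELED: set(),
}


class Remediation:
    """Tracks one remediation as it moves through the stages above."""

    def __init__(self, description: str) -> None:
        self.description = description
        self.stage = Stage.PREVIEW
        self.history = [Stage.PREVIEW]

    def advance(self, target: Stage) -> None:
        """Move to `target`, refusing any step that skips a safeguard."""
        if target not in ALLOWED[self.stage]:
            raise ValueError(
                f"cannot go from {self.stage.value} to {target.value}"
            )
        self.stage = target
        self.history.append(target)


if __name__ == "__main__":
    fix = Remediation("example remediation")
    for step in (Stage.SANDBOX, Stage.SCHEDULED, Stage.LIVE, Stage.CONFIRMED):
        fix.advance(step)
    print([s.value for s in fix.history])
```

Because a live run is only reachable through the preview and sandbox stages in this sketch, a "blind" one-click change to production is not a valid path, which mirrors the concern users raised in research.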

Impact: Increased user confidence in AI actions, leading to faster adoption and more aggressive use of automated remediation features.

 

Challenge: Simplifying Complex Remediation Workflows

Problem: Manual remediation processes were long, fragmented, and difficult to prioritize, overwhelming users with too many decisions.

Solution:

  • Consolidated critical remediation data into a single control panel

  • Prioritized tasks based on severity and business impact scoring

  • Surfaced "Next Best Actions" to guide user workflows dynamically
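
To make the prioritization idea concrete, here is a minimal sketch of ranking findings by a combined severity and business impact score and surfacing the top few as "Next Best Actions". Everything specific in it is an assumption for illustration: the severity point values, the 1-5 business impact scale, the 0.6/0.4 weights, and the names `Finding`, `priority`, and `next_best_actions`. The actual scoring model used in the platform is not described here.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical mapping from severity label to points.
SEVERITY_POINTS = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass
class Finding:
    name: str
    severity: str         # one of the SEVERITY_POINTS keys
    business_impact: int  # assumed 1 (minor) to 5 (business critical)


def priority(finding: Finding,
             severity_weight: float = 0.6,
             impact_weight: float = 0.4) -> float:
    """Blend normalized severity and business impact into a 0-1 score."""
    severity = SEVERITY_POINTS[finding.severity] / 4
    impact = finding.business_impact / 5
    return severity_weight * severity + impact_weight * impact


def next_best_actions(findings: List[Finding], limit: int = 3) -> List[Finding]:
    """Return the highest-priority findings, most urgent first."""
    return sorted(findings, key=priority, reverse=True)[:limit]


if __name__ == "__main__":
    # Made-up findings for demonstration only.
    backlog = [
        Finding("finding-a", "critical", 5),
        Finding("finding-b", "low", 5),
        Finding("finding-c", "high", 1),
        Finding("finding-d", "medium", 2),
    ]
    for item in next_best_actions(backlog, limit=2):
        print(item.name, round(priority(item), 2))
```

Showing only a short ranked list, rather than the full backlog, is what keeps the number of decisions on screen small.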

Impact: Reduced cognitive load, improved task success rates, and dramatically sped up remediation cycles.

 

Outcomes & Reflections

Impact in Under 1 Year:

  • 28,000+ successful remediations

  • 291,890 hours of efficiency gained

  • $30 million saved in operational costs

User Feedback

  • "Fixed issues that would have normally taken 30 days in a single day."

  • "Astrix saved us enormous amounts of time and money."

  • "308 vulnerabilities remediated in one day!"

Internal Customer Quote

"Astrix has been an amazing tool to remediate security vulnerabilities… We were able to remediate over 50 vulnerabilities a day, fixing issues that would have normally taken 30 days. The time saved was a game changer, with vulnerabilities being corrected in real time with the click of a button. The value add that Astrix provided not only reduced the time to remediate vulnerabilities but also resulted in significant time and cost savings for our company. Thanks for all the effort that went into developing Astrix!" - Harvey Lynch, AT&T, August 2024

Key Lessons Learned

  • Transparency builds trust: show users the "why" and "how" of automation

  • Flexibility matters: empower users to start slow and build confidence

  • Human-centered design can make advanced AI accessible, safe, and scalable