Transforming Enterprise Remediation: UX for AI-Driven DevOps Automation
Modern enterprises are overwhelmed by the complexity of managing applications, infrastructure, and security vulnerabilities. At AT&T, 70% of manual vulnerability remediations took more than 90 days, which increased security risk, created operational inefficiencies, and wasted cloud resources. As the UX Lead, I was tasked with designing a user-centered, AI-driven platform to automate vulnerability management, software modernization, and cloud cost optimization, making remediation faster, safer, and more scalable.
Challenge
Enterprise IT teams struggled with:
Slow, manual vulnerability remediation processes
Inefficient cloud resource management (up to $179B wasted globally)
High costs of maintaining outdated infrastructure ($600B in legacy system maintenance in the U.S. alone)
Lack of visibility and user trust in automation processes
The opportunity: Create an intuitive, transparent platform that empowers users to trust, control, and scale AI-driven remediation, cutting costs and risk without overwhelming complexity.
Research & Discovery
Stakeholder & User Interviews
Conducted interviews with engineers, DevOps leads, and executive sponsors to uncover pain points
Synthesized needs: speed, trust, control, and contextual transparency
Heuristic Review
Assessed early internal MVP tools for gaps in usability, cognitive load, and transparency
Comparison Research
Feedback Highlights
Users feared "blind" automation: "What does this button do?"
Desired gradual engagement: "I started slowly... then moved more aggressively."
Artifacts Produced
Journey maps, lean UX canvases, and competitive analyses of DevOps automation platforms.
Agile UX in DevOps Process
To deliver this solution, I applied a comprehensive Agile UX methodology.
Principles:
Understand problems before attempting to solve them
Support risk-taking and experimentation
Build only what delivers real value to users and the business
Iterate based on data and user feedback
Plan
Gathered requirements and defined goals with engineers, product owners, and leadership
Conducted interviews with SMEs and internal users to understand remediation workflows
Created a comprehensive research backlog and documented user journeys, pain points, and opportunities
Defined UX metrics (Users, Adoption, Engagement, Retention, Signals, and System Usability Scale (SUS) scores) to track success
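For readers unfamiliar with SUS (System Usability Scale), here is a minimal sketch of the standard scoring rule for one respondent. The function name `sus_score` is my own for illustration, and the sketch assumes the standard ten-item questionnaire answered on a 1 to 5 scale. It is not the tooling we used on the project.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one respondent's ten answers (each 1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribute (answer - 1).
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribute (5 - answer).
            total += 5 - answer

    # The 0-40 raw total is scaled to 0-100.
    return total * 2.5


if __name__ == "__main__":
    # A respondent who answers 3 (neutral) on every item.
    print(sus_score([3] * 10))
```

Averaging this score across respondents gives a single number that can be tracked release over release alongside the adoption and retention metrics.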
Design
Mapped out customer journeys and logical flows
Developed wireframes, high-fidelity mockups, and clickable prototypes
Introduced key concepts such as notifications, user profiles, and reoriented navigation
Provided Loom-recorded walkthroughs for broader team alignment
Develop
Collaborated closely with developers, clarifying requirements and rapidly responding to UX questions
Adapted designs based on technical constraints while preserving user-centered principles
Test
Conducted iterative usability testing and feedback sessions
Refined navigation patterns, terminology, and prioritization based on internal user insights
Deploy
Supported launch readiness by reviewing visual and functional consistency
Created onboarding materials and supported user guide creation for smooth adoption
Review
Analyzed user engagement metrics and gathered feedback
Documented sprint insights and surfaced future backlog items
Guided post-launch iterations and new feature planning
Launch
Ensured a polished final product for broader release
Monitored feedback and metrics post-launch to drive continuous improvement
Solution & Implementation
Key Product Components
Security Vulnerability Remediation: AI identifies and fixes security risks
DevOps Pipeline Refactoring: Automates modern CI/CD improvements
Infrastructure Automation: Powers VM resizing, cloud resource right-sizing, and cost minimization
Cloud Cost Optimization: Identifies inefficiencies in cloud usage
Software Upgrade Automation: Updates and patches legacy systems
Application Code Modernization: Refactors code for performance and security
UX Highlights
Centralized control panel with real-time remediation feedback
Transparent reporting of successful remediations, changes, projected impacts, and rollback options
Progressive onboarding: easing users into trusting AI interventions
CHALLENGE
Building Trust in AI-Driven Remediation
Problem: Users were hesitant to trigger automated remediations, fearing unknown impacts on production systems.
Solution:
Implemented progressive disclosure to show what each remediation action would do
Introduced sandbox testing before live deployment
Designed clear feedback loops, including rollback options, the ability to cancel scheduled remediations, and success confirmations
Impact: Increased user confidence in AI actions, leading to faster adoption and more aggressive use of automated remediation features.
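One way to picture the guardrails described above is as a small state machine: a remediation cannot go live before passing the sandbox, it can be cancelled until it is applied, and it can be rolled back afterwards. The sketch below is purely illustrative. The state names, the `Remediation` class, and the transition table are assumptions of mine and do not describe how the platform is actually built.

```python
from enum import Enum


class State(Enum):
    PROPOSED = "proposed"
    SANDBOX_PASSED = "sandbox_passed"
    SCHEDULED = "scheduled"
    APPLIED = "applied"
    ROLLED_BACK = "rolled_back"
    CANCELLED = "cancelled"


# Hypothetical transition table: which states may follow each state.
ALLOWED = {
    State.PROPOSED: {State.SANDBOX_PASSED, State.CANCELLED},
    State.SANDBOX_PASSED: {State.SCHEDULED, State.CANCELLED},
    State.SCHEDULED: {State.APPLIED, State.CANCELLED},
    State.APPLIED: {State.ROLLED_BACK},
    State.ROLLED_BACK: set(),
    State.CANCELLED: set(),
}


class Remediation:
    def __init__(self, description):
        self.description = description
        self.state = State.PROPOSED
        # The history is what a UI could surface as a feedback trail.
        self.history = [State.PROPOSED]

    def move_to(self, new_state):
        """Advance to new_state, refusing any step the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
        self.history.append(new_state)


if __name__ == "__main__":
    fix = Remediation("patch outdated library")
    fix.move_to(State.SANDBOX_PASSED)
    fix.move_to(State.SCHEDULED)
    fix.move_to(State.CANCELLED)  # the user changes their mind before go-live
    print([s.value for s in fix.history])
```

Making the allowed steps explicit is what lets an interface answer "What does this button do?" before the user presses it.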
CHALLENGE
Simplifying Complex Remediation Workflows
Problem: Manual remediation processes were long, fragmented, and difficult to prioritize, overwhelming users with too many decisions.
Solution:
Consolidated critical remediation data into a single control panel
Prioritized tasks based on severity and business impact scoring
Surfaced "Next Best Actions" to guide user workflows dynamically
Impact: Reduced cognitive load, improved task success rates, and dramatically sped up remediation cycles.
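To make the prioritization idea concrete, here is a minimal sketch of ranking findings by severity and business impact and surfacing the top few as "Next Best Actions". The severity weights, the 1 to 5 business impact scale, the multiplication rule, and all names (`Finding`, `priority`, `next_best_actions`) are assumptions for illustration. The platform's real scoring model is not described here.

```python
from dataclasses import dataclass

# Hypothetical weights: higher means more urgent.
SEVERITY_WEIGHT = {"critical": 4, "high": 3, "medium": 2, "low": 1}


@dataclass
class Finding:
    name: str
    severity: str          # one of the SEVERITY_WEIGHT keys
    business_impact: int   # assumed scale: 1 (minor) to 5 (business-critical)


def priority(finding):
    """Combine severity and business impact into one sortable score."""
    return SEVERITY_WEIGHT[finding.severity] * finding.business_impact


def next_best_actions(findings, limit=3):
    """Return the highest-priority findings, most urgent first."""
    return sorted(findings, key=priority, reverse=True)[:limit]


if __name__ == "__main__":
    backlog = [
        Finding("internal wiki plugin", "low", 5),
        Finding("billing API library", "critical", 2),
        Finding("customer portal TLS config", "high", 4),
        Finding("test server package", "medium", 1),
    ]
    for item in next_best_actions(backlog, limit=2):
        print(item.name, priority(item))
```

The design point is that the user sees a short, ordered list rather than the full backlog, which is where the reduction in decisions comes from.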
Outcomes & Reflections
Impact in Under 1 Year:
28,000+ successful remediations
291,890 hours of efficiency gained
$30 million saved in operational costs
User Feedback
"Fixed issues that would have normally taken 30 days in a single day."
"Astrix saved us enormous amounts of time and money."
"308 vulnerabilities remediated in one day!"
Internal Customer Quote: "Astrix has been an amazing tool to remediate security vulnerabilities… We were able to remediate over 50 vulnerabilities a day, fixing issues that would have normally taken 30 days. The time saved was a game changer, with vulnerabilities being corrected in real time with the click of a button. The value add that Astrix provided not only reduced the time to remediate vulnerabilities but also resulted in significant time and cost savings for our company. Thanks for all the effort that went into developing Astrix!" - Harvey Lynch, AT&T, August 2024
Key Lessons Learned
Transparency builds trust: show users the "why" and "how" of automation
Flexibility matters: empower users to start slow and build confidence
Human-centered design can make advanced AI accessible, safe, and scalable