Comet AI: Advanced Mobile Web UI/UE Testing Solution

As AI systems move beyond text, a new generation of tools is emerging to tackle complex challenges in software development. Comet, an AI-driven automated experience review platform, has introduced a solution designed to address the persistent difficulties in mobile web application UI/UE testing. This development marks a significant shift in the testing paradigm, moving from mere code verification to simulating human perception.

Key Points

Mobile web application (H5/PWA) UI/UE testing presents distinct challenges compared with traditional desktop web testing and even native app testing. The platform identifies five core pain points that frequently trouble testers and designers:

Fragmentation Hell: The mobile ecosystem varies in screen size, form factor (notches, dynamic islands), browser engine (WebKit/Blink rendering nuances, custom manufacturer browsers, in-app WebViews such as those in WeChat and Alipay, for example Tencent's X5 kernel), and system-level settings (large fonts, dark mode). This variety often leads to inconsistent rendering and functionality. Typical issues include obscured buttons, overlapping navigation, and layout distortions.
Touch & Gesture Issues: Unlike a precise mouse click, a finger covers part of the screen and is prone to "fat finger" errors. Tap targets smaller than 44x44 pt on iOS or 48x48 dp on Android invite accidental selections. Gesture conflicts, such as an H5 page's internal swipe colliding with the browser's "swipe back" gesture, create clunky user experiences. Soft keyboard behavior (pushing content up vs. overlaying it) also differs between Android and iOS and often hides critical elements such as submit buttons.
Difficulty Simulating Real-world Context: Pages that perform well in controlled environments can fail under real-world conditions. Poor network connectivity can lead to "loading anxiety," with users abandoning pages that take longer than three seconds to load. Interruption testing, which assesses an application's ability to maintain state after a call, alarm, or app switch, is often overlooked, leading to data loss.
High Maintenance Cost, Low Automation ROI: UI automation for mobile web suffers from unstable element locators caused by dynamic DOM structures or obfuscated CSS. Visual regression testing is also problematic: minor pixel differences across devices can generate numerous "false positive" alerts, burdening testers with triage and script maintenance.
Unquantifiable "Subjective Feelings": Beyond bug detection, UI/UE testing evaluates "ease of use," a metric difficult for machines to quantify. Issues like perceived smoothness (frame drops, sticky scrolling) or timely feedback (lack of active states or loading indicators) often require manual interaction to identify.
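
The touch-target thresholds in the list above (44x44 pt on iOS, 48x48 dp on Android) lend themselves to a mechanical check. The following is a minimal sketch, assuming element sizes have already been extracted in the platform's own units. The TouchTarget type and the undersized_targets helper are hypothetical names used for illustration and are not part of Comet.

```python
from dataclasses import dataclass

# Minimum recommended touch-target edge lengths, taken from the text above:
# 44x44 pt on iOS, 48x48 dp on Android.
MIN_TARGET_SIZE = {"ios": 44.0, "android": 48.0}


@dataclass(frozen=True)
class TouchTarget:
    """Hypothetical description of a tappable element (sizes in pt or dp)."""
    name: str
    width: float
    height: float


def undersized_targets(targets, platform):
    """Return the targets whose width or height is below the platform minimum."""
    minimum = MIN_TARGET_SIZE[platform]
    return [t for t in targets if t.width < minimum or t.height < minimum]


if __name__ == "__main__":
    # Illustrative data only; not taken from any real page.
    page = [
        TouchTarget("submit_button", 120, 48),
        TouchTarget("close_icon", 24, 24),
        TouchTarget("tab_item", 46, 46),
    ]
    for platform in ("ios", "android"):
        flagged = [t.name for t in undersized_targets(page, platform)]
        print(platform, flagged)
```

A size check like this catches only the static half of the "fat finger" problem. Spacing between neighboring targets and elements hidden by the finger itself would need additional rules.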

Under the Hood

Comet aims to address these pain points through several core capabilities:

Visual Cognition: Utilizing computer vision, Comet can "see" like a human, identifying issues such as obscured buttons or low contrast, rather than solely relying on DOM code analysis.
Intent-Driven & Self-healing: Users can instruct Comet with high-level goals, such as "buy a product." The system then navigates the application like a human. If page code changes, Comet can automatically adjust its script as long as visual elements remain, reducing script maintenance.
Concurrent Simulation: The platform can control over 100 cloud instances simultaneously, simulating various device models and network conditions. This capability significantly reduces the time required for compatibility testing.
Rule-based Empathetic Audit: Comet integrates established usability heuristics, such as Nielsen's ten usability heuristics, and accessibility standards such as WCAG, to transform subjective user experience issues into quantifiable data.
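
"Low contrast" is one of the few visual issues in the list above that already has a published, quantifiable definition. The sketch below computes the WCAG 2.x contrast ratio between two sRGB colors and checks it against the AA thresholds (4.5:1 for normal text, 3:1 for large text). It illustrates the kind of rule such an audit could apply. The source does not describe how Comet implements its checks, and the function names here are illustrative.

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light, per the WCAG 2.x definition."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) tuple with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1.0 (identical colors) up to 21.0 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """True if the color pair meets the WCAG AA contrast threshold."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


if __name__ == "__main__":
    white = (255, 255, 255)
    for name, color in [("black", (0, 0, 0)), ("mid gray", (119, 119, 119))]:
        ratio = contrast_ratio(color, white)
        print(f"{name} on white: {ratio:.2f}:1, AA normal text: {passes_aa(color, white)}")
```

In a vision-based pipeline the foreground and background colors would have to be sampled from a screenshot rather than read from CSS, which is an extra step this sketch leaves out.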

Live Demonstration

In practice, Comet's workflow was demonstrated by reviewing a mobile web application named "Moments & Feelings." The system was provided with a prompt defining its role as a "senior experience review expert," outlining its profile, core capabilities, review style, and goals. The prompt also specified constraints, workflow steps, and an evaluation framework covering visual performance, interactive experience, content strategy, and technical perception.

Comet's diagnostic report for "Moments & Feelings" included an overall rating, highlights, core pain points categorized by severity (P0, P1, P2), and optimization suggestions. For example, one P0-level issue was "icon size too small," which Comet detected by simulating the "fat finger" effect, a problem that traditional coordinate-based automation scripts would typically miss.

Value Loop

Comet systematically addresses the identified pain points:

Fragmented Compatibility: Through concurrent testing, Comet accurately identifies layout breakdowns across different screen sizes.
Interaction Touch Issues: The platform's ability to simulate real user interaction, such as the "fat finger" effect, allows it to detect issues like undersized icons.
Real-world Scenario Simulation: Intent-driven testing enables Comet to simulate diverse user contexts, including varying light conditions and interruptions, and to assess whether application state is restored afterward.
High Maintenance Cost: Comet's self-healing and visual-based approach allows test processes to be reused even if underlying code changes, achieving near "zero maintenance" for scripts.
Quantifying Subjective Feelings: By applying principles such as Fitts's Law and Nielsen's heuristics, Comet translates subjective "feel" issues into reasoned expert diagnoses, supporting product decisions.
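
Fitts's Law, mentioned in the last item above, is one concrete way a "feel" issue becomes a number: it models the time to hit a target as growing with distance and shrinking with target width. The sketch below uses the common Shannon formulation, ID = log2(D / W + 1). The constants a and b are device- and user-specific and must be fitted empirically, so they are left as required parameters rather than guessed. How Comet applies the law internally is not stated in the source.

```python
import math


def index_of_difficulty(distance, width):
    """Fitts's index of difficulty in bits (Shannon formulation)."""
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return math.log2(distance / width + 1)


def movement_time(distance, width, a, b):
    """Predicted movement time: a + b * ID, with a and b fitted empirically."""
    return a + b * index_of_difficulty(distance, width)


if __name__ == "__main__":
    # Same thumb travel distance, two icon widths (units cancel in D / W).
    for width in (24, 48):
        bits = index_of_difficulty(300, width)
        print(f"width={width}: ID={bits:.2f} bits")
```

Doubling the icon width at a fixed distance lowers the index of difficulty, which gives a reviewer a quantitative argument for enlarging an undersized icon.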

What Comes Next

Comet represents a generational leap in user experience testing, moving beyond traditional automated testing tools like Selenium or Appium. While traditional methods focus on verifying code logic using DOM selectors and coordinates, Comet emphasizes simulating behavior and perception through visual recognition and semantic understanding.
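
The contrast between selector-based and intent-based location can be illustrated with a toy model. In the sketch below, a recorded CSS class stops matching after a rebuild regenerates class names, and the lookup falls back to the text the user actually sees. Everything here (the Element type, the locate function, the sample data) is hypothetical and stands in for a real DOM or screenshot. It shows the general fallback idea only, not Comet's actual mechanism.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    """Hypothetical snapshot of an on-screen element."""
    css_class: str     # may change between builds (e.g. obfuscated CSS)
    visible_text: str  # what the user actually sees


def find_by_class(elements: List[Element], css_class: str) -> Optional[Element]:
    """Selector-style lookup: fails when the class name is regenerated."""
    return next((e for e in elements if e.css_class == css_class), None)


def find_by_label(elements: List[Element], label: str) -> Optional[Element]:
    """Label-based lookup: case-insensitive match on the visible text."""
    wanted = label.strip().lower()
    return next(
        (e for e in elements if e.visible_text.strip().lower() == wanted), None
    )


def locate(elements: List[Element], css_class: str, label: str) -> Optional[Element]:
    """Try the recorded selector first, then fall back to the visible label."""
    found = find_by_class(elements, css_class)
    if found is None:
        found = find_by_label(elements, label)
    return found


if __name__ == "__main__":
    build_1 = [Element("btn-buy", "Buy now"), Element("btn-cart", "Add to cart")]
    build_2 = [Element("x9f3", "Buy now"), Element("k2a7", "Add to cart")]
    for build in (build_1, build_2):
        print(locate(build, css_class="btn-buy", label="buy now"))
```

The fallback works only while the visible label stays the same, which matches the source's caveat that self-healing applies "as long as visual elements remain."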

Structurally, Comet offers greater resilience, self-healing scripts, and intent understanding. Its output is a design-oriented diagnostic report with improvement suggestions rather than a bare pass/fail log. The shift aims to confirm not only that functionality works but also that the application is easy to use and human-centric.

Key differentiators include:

Human-like "Seeing": Using computer vision to identify visual anomalies.
Intent-driven Interaction: Allowing high-level instructions rather than precise coordinates.
Self-healing Scripts: Automatic correction of scripts when page structures change.
Exploratory Testing: AI-driven "Monkey Testing" to discover edge cases.
Persona Testing: Evaluating user experience from specific user profiles (e.g., elderly users, impatient business people).
Empathetic Audit: Analyzing the emotional tone of content.
Concurrent Simulation: Parallel execution across numerous cloud browser instances.
Visual Regression Automation: Intelligent comparison of screenshots, ignoring meaningless pixel differences.
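
The last item, ignoring meaningless pixel differences, can be approximated even without machine learning by comparing screenshots with a tolerance. The sketch below treats a pixel as changed only if some channel differs by more than channel_tolerance, and reports a regression only if the share of changed pixels exceeds max_changed_ratio. Both thresholds are arbitrary illustrative values, and screenshots are modeled as nested lists of RGB tuples to keep the example dependency-free. The source does not say how Comet decides which differences are meaningless.

```python
def changed_pixel_ratio(baseline, candidate, channel_tolerance=8):
    """Fraction of pixels whose color differs by more than the tolerance."""
    same_shape = len(baseline) == len(candidate) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(baseline, candidate)
    )
    if not same_shape:
        raise ValueError("screenshots must have identical dimensions")

    total = 0
    changed = 0
    for row_a, row_b in zip(baseline, candidate):
        for px_a, px_b in zip(row_a, row_b):
            total += 1
            # A pixel counts as changed if any channel exceeds the tolerance.
            if any(abs(ca - cb) > channel_tolerance for ca, cb in zip(px_a, px_b)):
                changed += 1
    return changed / total if total else 0.0


def is_regression(baseline, candidate, channel_tolerance=8, max_changed_ratio=0.01):
    """True if more than max_changed_ratio of the pixels changed noticeably."""
    ratio = changed_pixel_ratio(baseline, candidate, channel_tolerance)
    return ratio > max_changed_ratio


if __name__ == "__main__":
    white = (255, 255, 255)
    baseline = [[white, white], [white, white]]
    antialias_noise = [[(253, 254, 255), white], [white, (250, 250, 250)]]
    missing_button = [[(0, 0, 0), white], [white, white]]
    print("noise:", is_regression(baseline, antialias_noise))
    print("real change:", is_regression(baseline, missing_button))
```

A fixed tolerance suppresses anti-aliasing and color-profile noise but cannot tell an intentional redesign from a bug, which is the gap that a semantic comparison is meant to fill.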