UX Design Case Study · Kingston University CI7801
Transforming passive, text-only reading into a dynamic visual experience — using Generative AI and AR/VR, with the reader always in control.
01 — Overview
Reading is declining — not because stories matter less, but because text-only formats aren't meeting readers where they are cognitively and visually.
Existing reading platforms preserve the very problem they claim to solve. Readers who struggle to visualise scenes, sustain focus, or emotionally connect with narrative — particularly visual learners, students, and attention-challenged individuals — have no meaningful support beyond the same wall of text, now on a screen.
"Imagine As You Read" is a mobile-first reading companion that uses Generative AI to transform selected text into contextual visual scenes — user-triggered, non-intrusive, and designed to support imagination without ever replacing the act of reading itself.
"The most important design decision was the one I didn't make — choosing not to make visual generation automatic, resisting the temptation to showcase AI capability. Research consistently and emphatically said: give users control."
02 — Research & Discovery
A mixed-methods approach — combining qualitative depth with quantitative breadth — to ensure design decisions were grounded in real user behaviour, not assumptions.
8 participants were screened and interviewed using an 11-question open-ended script. The script moved from surface reading habits to deeper emotional territory: the precise moment a reader loses focus or hits cognitive friction.
66 responses across students and working professionals. Captured reading frequency, device usage, attitudes toward visual content, and openness to AI and AR features. Validated patterns at scale.
Reviewed cognitive load theory, visual learning research, digital reading behaviour, generative AI capabilities, and existing AR/VR educational applications to ground hypotheses in evidence.
Mapped Wonderscope, Google Learn Your Way, Microsoft Copilot, and Google Gemini against 8 dimensions to identify the precise gap "Imagine As You Read" was designed to fill.
Affinity Diagram: observations clustered into 5 themes (Reading Behaviour, Visual Cognition, Digital Engagement, Generative AI, and AR/VR).
Card Sort: mental model mapping that directly shaped the app's Information Architecture and mode structure.
No competitor integrated real-time text interpretation, generative visualisation, and reading-focused UX in a single product. This was the gap "Imagine As You Read" was designed to fill.
03 — User Personas
Evidence-based archetypes built from interviews, surveys, and literature — not invented profiles, but real patterns observed in the research data.
04 — Design Approach
Every structural decision traces directly to a specific research finding. The design rationale is as important as the design itself.
The IA was structured around a single governing principle: users should always feel in control of how deep they go. This meant separating the experience into clearly defined modes rather than layering everything onto one interface. Survey data showed 50% of respondents explicitly wanted "text with optional visuals", so burying visual features inside reading mode would have violated their primary expectation.
MoSCoW prioritisation was applied ruthlessly to prevent feature bloat — which would have directly contradicted the core research insight that users wanted simplicity and control. The "Won't Have" list was as important as the "Must Have" list. Each item on it represented a real assumption that testing could not yet validate.
AR/VR features surface only after the user engages with core reading → prevents cognitive overload for new users
IA Diagram: full IA map showing Library, Reading Mode, Visualise Mode, Immersive Mode, and Settings hierarchies.
User Flow Diagrams: onboarding flow and Home Screen feature flow, showing every branch and return path to the reading state.
Assumption Map: risk vs. value matrix plotting user and business assumptions to identify what to test first.
Bull's Eye Diagram: feature prioritisation, with core value in the centre ring, then future enhancements, then out-of-scope items.
05 — Prototyping
Each prototype level was designed to answer specific, bounded questions — not to demonstrate completeness, but to validate one layer of assumption at a time.
Initial concept sketches exploring the core layout patterns — text-first reading experience, mode-switching architecture, and the side-panel visual placement that became the defining structural decision of the project.
Wireframe: layout exploration for the central library hub and navigation architecture.
Wireframe: text-first layout with side-panel visual zone placement and minimal chrome.
Wireframe: Read → Visualise → Immersive mode switching flow and progressive escalation.
Paper prototypes created psychological safety: participants gave honest feedback because the prototype was clearly unfinished. The goal was to validate whether user-triggered visual generation made intuitive sense, not whether the button colour was right.
Lo-Fi Photos: Home, Onboarding, Text-to-Visual interaction, Mode switching, Settings, Immersive entry.
Lo-Fi Photos: Visual Mode panel, Regenerate/Dismiss flow, Reading Mode return, Accessibility settings.
Key discovery from Lo-Fi testing: users immediately looked for a way to dismiss a generated visual. The "dismiss/regenerate" control, which became a core feature, was discovered here, not designed in advance.
Grayscale digital layouts with realistic content placeholders. Validated navigation flows, mode-switching behaviour, and visual control comprehension. Participants found mode transitions more intuitive than expected — the separation of modes as distinct "spaces" mapped well to their mental models.
Mid-Fi Screen: text layout, scroll behaviour, bookmarking, and the AI visual trigger placement.
Mid-Fi Screen: side-panel visual zone, intensity toggle, regenerate/dismiss controls.
Mid-Fi Screen: AR/VR entry flow, mode switching, pause/disable controls.
Mid-Fi testing revealed that participants wanted finer-grained visual intensity control than the initial three-state toggle provided, which directly shaped the slider design in the Hi-Fi prototype.
A complete, realistic simulation covering 5 critical user flows. Full colour, typography, micro-interactions, WCAG 2.1 AA accessibility compliance, and ethical AI transparency indicators built in at every visual generation point.
Hi-Fi Screens: registration, welcome screen, reading mode entry, library view with search and history.
Hi-Fi Screens: Read Mode → AI visual trigger → side-panel generation → dismiss/save → return to reading.
Hi-Fi Screens: visual slider control, style selection, AR/VR progressive entry, contextual prompt.
Hi-Fi Screens: font scaling, contrast themes, AI transparency indicators, user preference controls.
Hi-Fi Screens: physical text digitisation via camera, feeding directly into the visual generation pipeline.
06 — Usability Testing
Usability testing was conducted across all three fidelity levels, with a structured quantitative scoring approach applied to the Hi-Fi prototype.
Testing Photos: participant interaction photos and screen recordings from Lo-Fi, Mid-Fi, and Hi-Fi testing rounds.
Hypothesis Canvas: risk vs. value matrix used to determine which hypotheses to validate first during testing phases.
Participants expressed unprompted relief when visual generation was confirmed as optional. Several noted they'd feared the app would "take over" their reading experience.
The "dismiss visual" control was used enthusiastically by all participants, not because visuals were bad, but because users needed to know they held the override.
One participant noted AI-generated imagery was occasionally "too literal." This signals a future need for visual style controls — realistic vs. abstract vs. symbolic interpretation.
07 — Design Iterations
Every design revision traces directly from a specific friction point observed in testing. No change was made from opinion — only from observed user behaviour.
Immersive Mode was accessible via the mode switcher — equally weighted alongside Read and Visualise modes, with no contextual prompt. P4's task failure (score 0.925, exceeding the 0.75 threshold) revealed users didn't encounter it naturally. They had to actively hunt for it.
Immersive Mode entry was repositioned as a progressive enhancement — surfaced after a user had generated at least one AI visual, with a gentle contextual prompt: "Want to step inside this scene?" This aligned the interaction with the natural user journey: text → visuals → immersion in sequence.
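The progressive entry rule described above can be written as a small predicate. This is a minimal sketch rather than the project's implementation: the `ReadingSession` class, its `visuals_generated` field, and the `immersive_prompt` function are hypothetical names introduced for illustration. Only the "at least one AI visual" condition and the prompt wording come from the case study.

```python
from dataclasses import dataclass
from typing import Optional

# Prompt wording taken from the case study.
IMMERSIVE_PROMPT = "Want to step inside this scene?"


@dataclass
class ReadingSession:
    """Hypothetical per-session state (illustrative field name)."""
    visuals_generated: int = 0


def immersive_prompt(session: ReadingSession) -> Optional[str]:
    """Return the contextual prompt only once the reader has
    generated at least one AI visual; otherwise surface nothing."""
    if session.visuals_generated >= 1:
        return IMMERSIVE_PROMPT
    return None


if __name__ == "__main__":
    session = ReadingSession()
    print(immersive_prompt(session))   # no prompt before any visual exists
    session.visuals_generated += 1
    print(immersive_prompt(session))   # prompt appears after the first visual
```

Keeping the rule in one function mirrors the design intent: Immersive Mode follows the text → visuals → immersion sequence instead of competing for attention from the first screen.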
The initial three-state toggle (Off / Low / Medium) felt binary in practice: most users either fully enabled or fully disabled it. Mid-Fi testing revealed participants wanted to communicate how much visual support they needed, not just whether it was on or off.
Replaced with a continuous slider with labelled reference points (Off → Subtle → Moderate → Rich), supplemented by a preview thumbnail showing what the current intensity setting would look like before committing. This converted an abstract control into a concrete, previsualisable decision.
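One way to relate a continuous slider value to the four labelled reference points is a nearest-point lookup. The sketch below rests on assumptions that are not stated in the case study: the slider is modelled as a value from 0.0 to 1.0, and the reference points are spaced evenly along it. The name `nearest_label` is hypothetical.

```python
# Labels come from the case study; their positions on the slider
# are an assumption (evenly spaced across a 0.0-1.0 range).
REFERENCE_POINTS = [
    (0.0, "Off"),
    (1 / 3, "Subtle"),
    (2 / 3, "Moderate"),
    (1.0, "Rich"),
]


def nearest_label(intensity: float) -> str:
    """Clamp the slider value to [0.0, 1.0] and return the label
    of the closest reference point."""
    clamped = min(1.0, max(0.0, intensity))
    position, label = min(
        REFERENCE_POINTS, key=lambda point: abs(point[0] - clamped)
    )
    return label


if __name__ == "__main__":
    for value in (0.0, 0.3, 0.7, 1.0):
        print(value, nearest_label(value))
```

The slider itself stays continuous. The labels only describe where the current value sits, which is what lets a preview thumbnail and a named reference point agree with each other.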
The Lo-Fi prototype had a "generate visual" trigger but no visible "dismiss" or "regenerate" control. These were in the backlog but not yet surfaced in the UI. Participants immediately and instinctively looked for a way to remove a visual they didn't like or that felt inaccurate — a behaviour that hadn't been explicitly designed for.
A persistent two-action control bar — "Regenerate" and "Dismiss" — always visible beneath every generated visual. The "Save" option appeared only after a user had kept the visual for 5 seconds, preventing accidental saves while supporting intentional library-building. Trust is built through control.
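The control bar behaviour can be summarised as a function of how long a visual has stayed on screen. This is an illustrative sketch only: `available_actions` and `SAVE_DELAY_SECONDS` are hypothetical names, and treating exactly 5 seconds as enough to show "Save" is an assumption, since the case study does not say whether the boundary is inclusive.

```python
from typing import List

# Delay taken from the case study; the constant name is illustrative.
SAVE_DELAY_SECONDS = 5.0


def available_actions(seconds_on_screen: float) -> List[str]:
    """Return the actions shown beneath a generated visual.

    "Regenerate" and "Dismiss" are always present. "Save" is added
    once the visual has been kept for the delay period
    (boundary assumed inclusive).
    """
    actions = ["Regenerate", "Dismiss"]
    if seconds_on_screen >= SAVE_DELAY_SECONDS:
        actions.append("Save")
    return actions


if __name__ == "__main__":
    print(available_actions(1.0))   # override controls only
    print(available_actions(6.0))   # "Save" joins after the delay
```

The two override actions never depend on timing, which reflects the finding that readers need to see the way out from the moment a visual appears.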
Canvas Image: early canvas showing initial assumptions, solutions, and business outcomes before hypothesis testing.
Canvas Image: refined canvas with validated hypotheses, updated solutions, and sharpened business outcomes.
08 — Outcomes & Reflections
Three of four usability tasks passed. One failed — and produced the most valuable insight of the entire project.
Optional, user-triggered AI visuals improve perceived engagement without significantly raising cognitive load. Supported by three successful task completions and consistent qualitative feedback across all testing rounds.
User control over visual intensity and dismissal is essential for trust. Confirmed by how immediately, and without prompting, participants reached for dismissal controls in every single testing session.
A text-first, mode-based architecture supports diverse reading contexts. Validated by how naturally participants switched between Read and Visualise modes without confusion or instruction.
A sample of 4 participants is sufficient for hypothesis validation but insufficient for statistical confidence. Wider demographic testing, for example with children and non-native readers, would strengthen findings substantially.
Real-world AR/VR performance, latency, and spatial accuracy introduce usability dimensions that prototyping cannot simulate. The Immersive Mode findings are directional, not conclusive.
Certain early-stage assumptions were not formally retested after mid-fi iterations — a gap in the validation cycle that future work should close with modular feature testing.
"Imagine As You Read" is not a reading app that uses AI. It is a reading experience that uses AI only when the reader asks it to, and that distinction is everything.
Ganesh Suresh Javhare, CI7801 UX Major Project, Kingston University London, 2026