
Year: 2025
Type: Master's Graduation Project
Timeline: Jan - Apr (4 Months)
Team size: Individual + 2 Supervisors
Keywords: Generative AI, Generative UI, Design Futures, Design Fiction, Multi-Agent Systems
The Context
What if your homescreen built itself around you instead of the other way around?
That question started as a nagging frustration and turned into a Master's thesis. Genie is the culminating project of my MSc in Interactive Arts and Technology at Simon Fraser University, and it's my attempt at a real answer.
The core idea: instead of hunting through a grid of apps to complete a task, you simply say what you need and the interface generates itself around your intent, in real time, powered by a network of AI agents working in coordination.
Objectives
This project's objective was to move beyond theory and demonstrate a tangible alternative to the static, app-based interface. Employing a Design Futures methodology along with Design Fiction, my goal was to bridge the gap between current technological limitations and a preferred future for user interaction. This dual-focused approach led to two primary deliverables:
A functional proof-of-concept: A working, voice-driven prototype that validates the core interaction model by interpreting user intent and dynamically generating a UI using current technology.
A diegetic prototype video: A key Design Fiction artifact that communicates the long-term vision of a seamless, instantaneous, and truly adaptive interface, unhindered by present-day constraints.
Is this the best we can do?

Home-screen interfaces have remained fundamentally unchanged since Xerox built the Alto and Apple shipped System 1. The underlying OS has evolved enormously, but how we actually interact with it? Not so much.
We still navigate a static grid of icons. We still manually hunt for the right app. We still adapt our needs to the system's structure, rather than the other way around.
During my master's program, this realization kept nagging at me. And the more I looked, the more I saw a crisis of imagination.
Identifying the problem

The problem runs deeper than inconvenience. The app-centric model assumes users know exactly which tools to use, how to use them, and in what order. That assumption excludes people who aren't tech-savvy, or who have cognitive or physical disabilities. It also fragments every task across multiple tools, many of which do the same thing slightly differently, many of which are locked behind subscriptions you have to compare and download before you can decide.
Designing for the future?

Design Futures merges Design Thinking's problem-solving with Futures Thinking's foresight. The (modified) diagram above, by Elliott P. Montgomery, shows where these methods lie on the spectrum between Unconstrained/Imaginary and Constrained/Applied design practices.
To rethink interfaces, I needed a method built for imagining futures: one that doesn't just try to fix what’s broken, but helps me imagine what should be and what could be. I employed a dual methodological approach, using Design Futures and Design Fiction.
Design futures

A modified version of the “Future Cone” adapted from Hancock and Bezold (1994).
Design Futures is a hybrid speculative methodology that sits at the intersection of Design Thinking's problem-solving rigor and Futures Thinking's long-view perspective. As defined by Jod Kaftan at Oracle, it works by identifying drivers, the deep forces shaping how technology evolves, and signals, the early indicators of what might be coming.
The two primary drivers I identified for this project were:
Generative AI advancement: the shift from content creation to functional interface generation
Multi-agent systems: AI frameworks that enable complex, collaborative task execution without human orchestration
The signals confirming these directions were already visible: Vercel's v0 demonstrated that functional UIs could be generated directly from text. AutoGen from Microsoft and multi-agent frameworks from Google showed that AI systems could collaborate to solve problems.
I focused on two futures:
Probable Future, where I operate within today's technical constraints to demonstrate the core value.
Preferred Future, where interfaces truly serve human intentions rather than force users to navigate rigid app structures.
Design fiction
Designing only within today's technical boundaries would limit what this project could say about the technology's true potential.
Following Dunne and Raby's approach in Speculative Everything, I created speculative artifacts as tools for critical inquiry — using design to interrogate possible futures rather than simply solve present problems. For this project, that meant building a diegetic prototype: what David A. Kirby describes in Lab Coats in Hollywood as a prototype existing within a fictional narrative context.
The resulting vision video situates the technology within a plausible future ecosystem, similar to how Apple's 1987 "Knowledge Navigator" helped audiences imagine computing's possibilities long before the hardware existed to support it. The goal wasn't just to show what Genie could do, but to ask how it might reshape our relationship with digital devices entirely.
What could this mean for user interfaces?
I can't be the first one to try, right?
Before starting on my own solution, I analyzed tools and devices that have attempted to change how we interact with interfaces. I chose three products that target different needs, to understand the strengths they offer and the limitations they face.

Vercel’s v0 can transform plain-text descriptions into working Next.js components, making it a powerful tool for developers. Yet its focus remains front-end prototyping, not complete end-to-end experiences, and it still assumes a developer in the loop.

Natural by Brain Technologies Inc. takes a more radical stance, abandoning the app grid entirely in favor of dynamically generated interfaces from natural language. But its closed architecture and limited extensibility make it feel less like a platform and more like a specialized search engine.

Rabbit R1 positions itself as an AI-powered personal assistant with a “Generative UI” that adapts to the user. It’s impressive, but its scope is bound to a fixed set of predefined tasks, leaving no room for dynamic, extensible orchestration.
Taken together, these tools hint at something larger: a future where the interface itself is ephemeral, summoned only when needed. But none of them bridges the gap between intent and execution.
Creating Genie

Squashing early assumptions

Grounded in a Research Through Design approach, I started by trying to break my own idea. My supervisor uses an exercise called "5 Intelligent Failures" — the goal is to rigorously stress-test core assumptions before writing a line of production code. I deconstructed the project into its five key components: the agentic framework, voice input, prompt engineering, frontend, and backend. Then I designed specific failure scenarios for each and prototyped solutions to test them.
Early Development

The initial design phase established the system's architecture by mapping out three key components:
A Supervisor Agent to interpret user intent and delegate tasks.
Specialized Worker Agents (e.g., weather, stocks) to retrieve data from public APIs.
A UI Generation Process to transform the retrieved data into visual components.
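The three components above can be sketched as a small pipeline. This is a minimal illustration of the data flow, not Genie's actual n8n workflow: every name here (`supervise`, `workers`, `generateWidget`, `handleUtterance`) is hypothetical, a keyword match stands in for the Supervisor's LLM-based intent parsing, and canned data stands in for the public API calls the real Worker Agents make.

```typescript
// Hypothetical names throughout; a sketch of the data flow, not Genie's real code.
type WorkerName = "weather" | "stocks";

interface WorkerResult {
  worker: WorkerName;
  data: Record<string, string | number>;
}

interface Widget {
  title: string;
  fields: string[];
}

// Stand-in for the Supervisor Agent: a keyword match replaces LLM intent parsing.
function supervise(utterance: string): WorkerName[] {
  const selected: WorkerName[] = [];
  if (/weather/i.test(utterance)) selected.push("weather");
  if (/stock/i.test(utterance)) selected.push("stocks");
  return selected;
}

// Stand-ins for Worker Agents: canned data replaces calls to public APIs.
const workers: Record<WorkerName, (utterance: string) => WorkerResult> = {
  weather: () => ({ worker: "weather", data: { city: "Vancouver", tempC: 12 } }),
  stocks: () => ({ worker: "stocks", data: { symbol: "ACME", price: 101.5 } }),
};

// Stand-in for the UI Generation Process: structured data in, widget description out.
function generateWidget(result: WorkerResult): Widget {
  return {
    title: result.worker,
    fields: Object.keys(result.data).map((key) => `${key}: ${result.data[key]}`),
  };
}

// Full flow: intent -> delegated workers -> generated widgets.
function handleUtterance(utterance: string): Widget[] {
  return supervise(utterance).map((name) => generateWidget(workers[name](utterance)));
}
```

Calling `handleUtterance("What's the weather in Vancouver?")` selects only the weather worker and returns a single widget description. An utterance that matches no worker returns an empty list.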
Preprototyping
tldraw sketches that explored layout and interaction states (how a generated UI collapses, expands, or hands control back to the user)
n8n workflow prototypes that encoded voice → action sequences.
The tldraw frames validated layout affordances. The n8n prototypes validated the multi-agent architecture logic: voice-to-intent parsing, worker sequencing, and the structured JSON the frontend would eventually consume. Both artifacts reduced risk before committing to the full stack.
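To make the structured JSON handoff concrete, the sketch below shows one possible payload shape and a runtime check a frontend could apply before rendering. The field names (`widgetType`, `title`, `data`) are assumptions made for this example; Genie's actual schema is not specified here.

```typescript
// Hypothetical payload shape; Genie's real schema is not documented in this write-up.
interface WidgetPayload {
  widgetType: string;
  title: string;
  data: Record<string, unknown>;
}

// Type guard: confirms that parsed JSON has the fields a renderer would expect.
function isWidgetPayload(value: unknown): value is WidgetPayload {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.widgetType === "string" &&
    typeof candidate.title === "string" &&
    typeof candidate.data === "object" &&
    candidate.data !== null &&
    !Array.isArray(candidate.data)
  );
}

// Parses a raw response body, returning null instead of throwing on bad input.
function parseWidgetPayload(body: string): WidgetPayload | null {
  try {
    const parsed: unknown = JSON.parse(body);
    return isWidgetPayload(parsed) ? parsed : null;
  } catch {
    return null;
  }
}
```

Returning `null` rather than throwing lets the caller fall back to a generic error state when a workflow emits something malformed.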


Development Phase
Genie is built with:
n8n, implementing LangChain-style workflows for agent coordination
OpenAI, for natural language understanding and task parsing
Supabase, as the backend to store data
React, to generate real UI components in response to user intent

n8n
Multi-agent AI workflow. n8n's AI nodes are built on LangChain JS.

Supabase
A PostgreSQL-based database to store widget metadata

OpenAI
Large Language Models used: o3-mini and GPT-4o.

React
React.js library to generate widgets.
System Architecture
Genie operates on a multi-agent architecture where a central Supervisor coordinates a network of specialized Worker agents. The system is built on three core components: the Supervisor Agent, the specialized Worker Agents, and the UI generation process.
This entire orchestration is managed visually in n8n, a tool whose node-based editor makes the system's logic explicit and easy to modify.

Frontend Architecture
The frontend is built in React and uses the Web Speech API for real-time voice transcription as you speak, so the system feels responsive before you've even finished your sentence. Once speech input completes, the final transcription is dispatched to n8n via a webhook for processing.
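A minimal sketch of that transcription flow follows. It assumes the recognizer runs with interim results enabled, so each result event carries both confirmed and still-changing text. The small interfaces mirror only the parts of the Web Speech API result event used here. The helper names and the `transcript` field in the webhook body are hypothetical.

```typescript
// Structural types mirroring the parts of a Web Speech API result event used below.
interface SpeechAlternative {
  transcript: string;
}
interface SpeechResult {
  isFinal: boolean;
  0: SpeechAlternative;
}
interface SpeechResultEvent {
  results: ArrayLike<SpeechResult>;
}

interface TranscriptState {
  finalText: string;
  interimText: string;
}

// Splits the recognizer's results into confirmed text and still-changing interim text,
// so the UI can show words as they are spoken.
function readTranscript(event: SpeechResultEvent): TranscriptState {
  let finalText = "";
  let interimText = "";
  for (let i = 0; i < event.results.length; i++) {
    const result = event.results[i];
    if (result.isFinal) {
      finalText += result[0].transcript;
    } else {
      interimText += result[0].transcript;
    }
  }
  return { finalText: finalText.trim(), interimText: interimText.trim() };
}

// Builds the JSON body to POST to the n8n webhook once speech input completes.
// The `transcript` field name is an assumption for this sketch.
function buildWebhookBody(state: TranscriptState): string {
  return JSON.stringify({ transcript: state.finalText });
}

// Browser wiring, shown as comments because it needs a real microphone and page:
//   recognition.interimResults = true;
//   recognition.onresult = (event) => showLiveText(readTranscript(event));
//   recognition.onend = () => postToWebhook(buildWebhookBody(latestState));
// `showLiveText`, `postToWebhook`, and `latestState` are placeholders.
```

Keeping the parsing logic separate from the browser event wiring makes it possible to test without a microphone.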


Backend and Multi-Agent Architecture
The multi-agent architecture in Genie is built entirely within n8n. The Supervisor workflow receives the parsed user voice input and decides which Workers to activate. For example, a travel-related request might trigger Calendar, Flight Search, and Weather Workers in sequence.

Genie employs specialized worker agents that interact with specific third-party APIs. Due to time constraints during development, only the following agents were fully implemented:
Calendar Agent: Manages Google Calendar interactions, allowing users to create and retrieve calendar events.
Stock Agent: Utilizes the Alpha Vantage API to fetch real-time stock information, providing timely financial insights.
Weather Agent: Fetches current weather information via OpenWeatherMap API.
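As an illustration of what a worker might hand back, the sketch below reduces an OpenWeatherMap current-weather response to the few fields a weather widget needs. It assumes the API is called with default units, in which case temperatures arrive in Kelvin. The input interface covers only the response fields used here, and the output shape `WeatherWidgetData` is a hypothetical name, not Genie's documented format.

```typescript
// Subset of OpenWeatherMap's current-weather response used by this sketch.
interface OpenWeatherResponse {
  name: string;
  main: { temp: number; humidity: number };
  weather: { description: string }[];
}

// Hypothetical shape handed on to the UI generation step.
interface WeatherWidgetData {
  city: string;
  tempC: number;
  humidity: number;
  summary: string;
}

// Converts Kelvin (the API's default unit) to Celsius, rounded to one decimal place.
function kelvinToCelsius(kelvin: number): number {
  return Math.round((kelvin - 273.15) * 10) / 10;
}

// Reduces the raw API response to the fields a widget would display.
function toWeatherWidgetData(raw: OpenWeatherResponse): WeatherWidgetData {
  const summary = raw.weather.length > 0 ? raw.weather[0].description : "unknown";
  return {
    city: raw.name,
    tempC: kelvinToCelsius(raw.main.temp),
    humidity: raw.main.humidity,
    summary,
  };
}
```

Normalizing at the worker boundary keeps API-specific details, such as units and nesting, out of the code-generation step.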
Once the Supervisor and Worker agents complete their tasks, the Coder Agent (powered by o3-mini) takes the structured output and generates the corresponding React widget code. It follows a standardized structure throughout, using Lucide icons and TailwindCSS for consistent, visually coherent results. The finished component is automatically committed to GitHub, triggering a frontend update.
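One way to enforce a consistent structure is to assemble the Coder Agent's prompt from a fixed rule list plus the worker data. The sketch below is purely illustrative: the actual prompt used in Genie is not shown in this write-up, and `buildCoderPrompt`, `STYLE_RULES`, and the rule wording are all hypothetical.

```typescript
// Hypothetical prompt builder; the real Coder Agent prompt is not shown in the source.
interface CoderInput {
  componentName: string;
  data: Record<string, unknown>;
}

// Fixed rules reflecting the constraints described above (Lucide icons, TailwindCSS).
const STYLE_RULES: string[] = [
  "Return a single React function component as the default export.",
  "Use Lucide icons (lucide-react) for all iconography.",
  "Style with TailwindCSS utility classes only.",
];

// Combines the fixed rules with the structured worker output into one prompt string.
function buildCoderPrompt(input: CoderInput): string {
  const numberedRules = STYLE_RULES.map((rule, index) => `${index + 1}. ${rule}`);
  const lines = [
    `Generate a React widget named ${input.componentName}.`,
    "Follow these rules:",
  ]
    .concat(numberedRules)
    .concat(["Render this data:", JSON.stringify(input.data, null, 2)]);
  return lines.join("\n");
}
```

Because the rules live in one list, every generated widget receives the same constraints regardless of which worker produced the data.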
What worked
The proof-of-concept works. Say "what's the weather in Vancouver?" and Genie generates a weather widget on the fly. Ask about a stock, get a live card with real data. Request your next calendar event, and it surfaces.
The interaction isn't only retrieval but also generation. And seeing it work for the first time, however rough around the edges, confirmed the core thesis: the paradigm shift is technically possible today.
A working demo of Genie is shown in the video below.
What didn't
Latency is the honest limitation. Generating a widget takes a few seconds, which is fast for what's happening under the hood but feels slow compared to tapping an app icon. The interaction model needs to account for this: feedback states, loading affordances, and progressive rendering would all help close the gap.
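The feedback states mentioned above could be modeled as a small state machine that the UI reads to decide what to show while a widget is being generated. This is a sketch of one possible design, not something implemented in the prototype. The phase names, event names, and status strings are all hypothetical.

```typescript
// Hypothetical phases for a single voice request; names are illustrative only.
type Phase = "idle" | "listening" | "routing" | "generating" | "ready" | "error";

type GenieEvent =
  | { type: "speechStarted" }
  | { type: "speechEnded" }
  | { type: "workersDone" }
  | { type: "widgetReady" }
  | { type: "failed" };

// Pure transition function: events that arrive out of order leave the phase unchanged.
function nextPhase(phase: Phase, event: GenieEvent): Phase {
  switch (event.type) {
    case "speechStarted":
      return "listening";
    case "speechEnded":
      return phase === "listening" ? "routing" : phase;
    case "workersDone":
      return phase === "routing" ? "generating" : phase;
    case "widgetReady":
      return phase === "generating" ? "ready" : phase;
    case "failed":
      return "error";
    default:
      return phase;
  }
}

// Status text a loading affordance could display for each phase.
const statusText: Record<Phase, string> = {
  idle: "Say what you need",
  listening: "Listening...",
  routing: "Finding the right tools...",
  generating: "Building your widget...",
  ready: "Done",
  error: "Something went wrong",
};
```

Giving each stage of the pipeline its own visible state turns a few seconds of silence into a sequence of short, legible steps.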
Although LLMs excel at generating functional code, aligning their outputs with established design standards or style guides required extensive manual intervention and iterative refinement. This is where I'd invest significant energy in a next iteration.
The rapid evolution and relative immaturity of frameworks like n8n and LangChain introduced frequent capability shifts, incomplete documentation, and unstable integrations, complicating development.
The goal was never to build a finished product. It was to demonstrate that the paradigm is ready to shift and to show what that shift might feel like.
The vision for the future
Sources:
Corbett, E., & Weber, A. (2016, September). What can I say? Addressing user experience challenges of a mobile voice user interface for accessibility. In Proceedings of the 18th International Conference on Human-Computer Interaction with Mobile Devices and Services (pp. 72-82).
Dunne, A., & Raby, F. (2013). Speculative everything: Design, fiction, and social dreaming. The MIT Press.
Hancock, T., & Bezold, C. (1994, March). Possible futures, preferable futures. In The Healthcare Forum Journal (Vol. 37, No. 2, pp. 23-29).
Li, X., Wang, S., Zeng, S., Wu, Y., & Yang, Y. (2024). A survey on LLM-based multi-agent systems: Workflow, infrastructure, and challenges. Vicinagearth, 1(1), 9.
Myers, C., Furqan, A., Nebolsky, J., Caro, K., & Zhu, J. (2018, April). Patterns for how users overcome obstacles in voice user interfaces. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (pp. 1-7).
Yuan, M., Chen, J., & Quigley, A. (2024). MAxPrototyper: A multi-agent generation system for interactive user interface prototyping. arXiv preprint arXiv:2405.07131.








