year

2025

type

Master's Graduation Project

timeline

Jan - Apr (4 Months)

team size

Individual + 2 Supervisors

keywords

Generative AI, Generative UI, Design Futures, Design Fiction, Multi-Agent Systems.

The Context

What if your homescreen built itself around you instead of the other way around?

That question started as a nagging frustration and turned into a Master's thesis. Genie is the culminating project of my MSc in Interactive Arts and Technology at Simon Fraser University, and it's my attempt at a real answer.

The core idea: instead of hunting through a grid of apps to complete a task, you simply say what you need and the interface generates itself around your intent, in real time, powered by a network of AI agents working in coordination.

Genie is a reimagining of the home screen where multi-agent AI systems dynamically generate adaptive interfaces in direct response to user intent.

Objectives

This project's objective was to move beyond theory and demonstrate a tangible alternative to the static, app-based interface. Employing a Design Futures methodology along with Design Fiction, my goal was to bridge the gap between current technological limitations and a preferred future for user interaction. This dual-focused approach led to two primary deliverables:

  • A functional proof-of-concept: A working, voice-driven prototype that validates the core interaction model by interpreting user intent and dynamically generating a UI using current technology.

  • A diegetic prototype video: A key Design Fiction artifact that communicates the long-term vision of a seamless, instantaneous, and truly adaptive interface, unhindered by present-day constraints.

Is this the best we can do?

For decades, home-screen interfaces have remained fundamentally unchanged: a passive, app-centric model that forces users to adapt to the system.

Home-screen interfaces have remained fundamentally unchanged since Xerox built the Alto and Apple shipped System 1. The underlying OS has evolved enormously, but how we actually interact with it? Not so much.

We still navigate a static grid of icons. We still manually hunt for the right app. We still adapt our needs to the system's structure, rather than the other way around.

During my master's program this realization kept nagging at me. And the more I looked, the more I saw a crisis of imagination.

Identifying the problem

This approach works for many good reasons, which is why we've used it for 40 years. But it assumes users know exactly which apps to use, how to use them, and in what order.

The current paradigm led me to believe that there is a "crisis of imagination": a stagnation in design where we've accepted a passive, fragmented user experience. Performing a simple task like checking flight times or creating a calendar event requires multiple manual steps and navigation across different apps, creating unnecessary cognitive load.

The problem runs deeper than inconvenience. By assuming users know exactly which tools to use, how to use them, and in what order, the app-centric model excludes people who aren't tech-savvy, or who have cognitive or physical disabilities. It also fragments every task across multiple tools, many of which do the same thing slightly differently, and many of which are locked behind subscriptions you have to compare and download before you can even decide.

Designing for the future?

So then how do we go beyond incremental improvements and rethink the way we interact with our devices?

Design Futures merges Design Thinking's problem-solving with Futures Thinking's foresight. The (modified) diagram above by Elliot P. Montgomery shows where these methods lie on the spectrum between Unconstrained/Imaginary and Constrained/Applied design practices.

To rethink interfaces, I needed a method built for imagining futures. A method that doesn't try to fix what’s broken, but one that could help me imagine what should be and what could be. I employed a dual methodological approach, using Design Futures and Design Fiction.

Design futures

A modified version of the “Future Cone” adapted from Hancock and Bezold (1994).

Design Futures is a hybrid speculative methodology that sits at the intersection of Design Thinking's problem-solving rigor and Futures Thinking's long-view perspective. As defined by Jod Kaftan at Oracle, it works by identifying drivers (the deep forces shaping how technology evolves) and signals (early indicators of what might be coming).

The two primary drivers I identified for this project were:

  • Generative AI advancement: the shift from content creation to functional interface generation

  • Multi-agent systems: AI frameworks that enable complex, collaborative task execution without human orchestration

The signals confirming these directions were already visible: Vercel's v0 demonstrated that functional UIs could be generated directly from text. AutoGen from Microsoft and multi-agent frameworks from Google showed that AI systems could collaborate to solve problems.

I focused on two futures:

  • Probable Future, where I operate within today's technical constraints to demonstrate the core value.

  • Preferred Future, where interfaces truly serve human intentions rather than force users to navigate rigid app structures.

Design fiction

Designing only within today's technical boundaries would limit what this project could say about the technology's true potential.

Following Dunne and Raby's approach in Speculative Everything, I created speculative artifacts as tools for critical inquiry — using design to interrogate possible futures rather than simply solve present problems. For this project, that meant building a diegetic prototype: what David A. Kirby describes in Lab Coats in Hollywood as a prototype existing within a fictional narrative context.

The resulting vision video situates the technology within a plausible future ecosystem, similar to how Apple's 1987 "Knowledge Navigator" helped audiences imagine computing's possibilities long before the hardware existed to support it. The goal wasn't just to show what Genie could do, but to ask how it might reshape our relationship with digital devices entirely.

What could this mean for user interfaces?

My research identified a convergence of key technological drivers. Academic literature shows that multi-agent systems are shifting development from coding to orchestration (Li et al., 2024), with AI agents already collaborating to generate coherent UI prototypes (Yuan et al., 2024). At the same time, LLMs are finally overcoming the documented failures of early voice interfaces (Corbett & Weber, 2016; Myers et al., 2018) by enabling flexible, conversational interaction. This led to my core insight:


LLMs provide the reasoning, agents provide the structure, and voice lowers the barrier to state what you need.


I can't be the first one to try, right?

Before starting on my solution, I analyzed tools and devices that have attempted to change how we interact with interfaces. I chose three different products that target different needs, to understand the strengths they offer and the limitations they face.

Vercel’s v0 can transform plain-text descriptions into working Next.js components, making it a powerful tool for developers. Yet its focus remains front-end prototyping, not complete end-to-end experiences, and it still assumes a developer in the loop.

Natural by Brain Technologies Inc. takes a more radical stance, abandoning the app grid entirely in favor of dynamically generated interfaces from natural language. But its closed architecture and limited extensibility make it feel less like a platform and more like a specialized search engine.

Rabbit R1 positions itself as an AI-powered personal assistant with a “Generative UI” that adapts to the user. It’s impressive, but its scope is bound to a fixed set of predefined tasks, leaving no room for dynamic, extensible orchestration.

Taken together, these tools hint at something larger: a future where the interface itself is ephemeral, summoned only when needed. But none of them bridge the gap between intent and execution.

Genie explores how Generative AI and multi-agent systems can translate user input into interfaces.

Creating Genie

Genie is a proof-of-concept that reimagines the home screen as a dynamic, responsive environment. Instead of hunting for the right app, users simply state their needs in natural language, and Genie instantly generates a tailored, context-specific interface. This project was a deep dive into how generative AI and multi-agent systems could help achieve this goal.

Squashing early assumptions

The objective was to challenge preconceived notions, refine the project direction, and build a stronger foundation before moving into development.

Grounded in a Research Through Design approach, I started by trying to break my own idea. My supervisor uses an exercise called "5 Intelligent Failures" — the goal is to rigorously stress-test core assumptions before writing a line of production code. I deconstructed the project into its five key components: the agentic framework, voice input, prompt engineering, frontend, and backend. Then I designed specific failure scenarios for each and prototyped solutions to test them.

Early Development

Early ideation focused on establishing a multi-agent architecture to understand the sequence and agent requirements.

The initial design phase established the system's architecture by mapping out three key components:


  • A Supervisor Agent to interpret user intent and delegate tasks.

  • Specialized Worker Agents (e.g., weather, stocks) to retrieve data from public APIs.

  • A UI Generation Process to transform the retrieved data into visual components.

Preprototyping
  1. tldraw sketches that explored layout and interaction states (how a generated UI collapses, expands, or hands control back to the user)

  2. n8n workflow prototypes that encoded voice → action sequences.

The tldraw frames validated layout affordances and interaction states; the n8n prototypes validated voice-to-intent parsing, worker sequencing, and the structured JSON the frontend would eventually consume. Both artifacts reduced risk before committing to the full stack.
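To make that contract concrete, here is a sketch of what such a structured widget spec might look like. The field names and types are my illustrative assumptions, not Genie's exact schema:

```typescript
// Hypothetical shape of the widget spec the orchestration layer returns
// and the React renderer consumes. All field names are assumptions made
// for illustration, not Genie's actual schema.
interface WidgetSpec {
  id: string; // unique widget identifier
  type: "weather" | "stocks" | "calendar"; // selects the React component
  title: string; // header shown on the widget card
  data: Record<string, unknown>; // worker agent's structured payload
  layout: { x: number; y: number; w: number; h: number }; // grid cell
}

// Example: what a Weather Worker's output might produce
const weatherWidget: WidgetSpec = {
  id: "weather-vancouver",
  type: "weather",
  title: "Weather in Vancouver",
  data: { tempC: 11, condition: "Rain" },
  layout: { x: 0, y: 0, w: 2, h: 2 },
};
```

A typed contract like this is what lets the frontend render agent output it has never seen before: any worker that emits a valid spec gets a widget for free.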

Development Phase

Genie is built with:

  • n8n, implementing LangChain-style workflows for agent coordination

  • OpenAI, for natural language understanding and task parsing

  • Supabase, as the backend to store data

  • React, to generate real UI components in response to user intent

n8n

Multi-Agent AI workflow. n8n implements LangChain JS.

Supabase

A PostgreSQL based database to store widget metadata

OpenAI

Large Language Models used: o3-mini & GPT-4o.

React

React.js framework to generate widgets.

System Architecture

Genie operates on a multi-agent architecture where a central Supervisor coordinates a network of specialized Worker agents. The system is built on three core components:

  • The Supervisor Agent, powered by GPT-4o, parses user intent and manages conversational memory.

  • Specialized Worker Agents (e.g., Weather, Stocks) execute specific tasks by making API calls and returning structured JSON.

  • A unique Coder Agent translates this JSON data into functional React components.

This entire orchestration is managed visually in n8n, a tool whose node-based editor makes the system's logic explicit and easy to modify.

Frontend Architecture

The frontend is built in React, using the WebSpeech API for real-time voice transcription as you speak, so the system feels responsive before you've even finished your sentence. Once speech input completes, the final transcription is dispatched to n8n via webhook for processing.
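The dispatch step can be sketched as follows. The payload fields and the endpoint path are my assumptions for illustration, not Genie's actual wiring:

```typescript
// Hypothetical payload builder for the n8n webhook call. The endpoint
// path and field names below are assumptions, not Genie's real API.
interface VoicePayload {
  transcript: string;
  timestamp: string; // ISO 8601, lets the workflow order requests
}

function buildVoicePayload(transcript: string, now: Date = new Date()): VoicePayload {
  return { transcript: transcript.trim(), timestamp: now.toISOString() };
}

// The frontend would then POST it once transcription completes, e.g.:
// fetch("https://<n8n-host>/webhook/genie-intent", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(buildVoicePayload(finalTranscript)),
// });
```

Keeping the payload this small pushes all interpretation work to the orchestration layer, so the frontend never needs to understand the user's intent itself.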

The interface itself is intentionally minimal: a clean dashboard with a central clock, closer to a home screen than an app. GridStack.js handles layout, resize, and drag-and-drop. When a widget is generated, the renderer interprets the incoming JSON spec and maps each component definition to its corresponding React widget, with Supabase to store positions and configurations between sessions.
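The renderer's mapping step can be sketched as a simple registry lookup. The component names and registry shape here are my assumptions; the real frontend would resolve to actual React components rather than strings:

```typescript
// Hypothetical renderer dispatch: maps a widget spec's "type" field to
// the React component that renders it. Returning names (strings) keeps
// this sketch self-contained; the real code would return components.
const componentRegistry: Record<string, string> = {
  weather: "WeatherWidget",
  stocks: "StockWidget",
  calendar: "CalendarWidget",
};

function resolveComponent(specType: string): string {
  const name = componentRegistry[specType];
  if (!name) throw new Error(`No widget registered for type "${specType}"`);
  return name;
}
```

Failing loudly on an unknown type is a deliberate choice in this sketch: a malformed spec from the orchestration layer should surface as an error state, not a silently blank widget.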

Backend and Multi-Agent Architecture

The multi-agent architecture in Genie is built entirely within n8n. The Supervisor workflow receives the parsed user voice input and decides which Workers to activate. For example, a travel-related request might trigger Calendar, Flight Search, and Weather Workers in sequence.
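A toy sketch of that delegation logic follows. Keyword matching is my simplification for illustration; in Genie the routing decision is made by the GPT-4o-powered Supervisor, not string matching:

```typescript
// Simplified, keyword-based stand-in for the Supervisor's routing.
// Genie's actual Supervisor uses an LLM to parse intent; this sketch
// only illustrates the intent -> worker fan-out pattern.
type Worker = "calendar" | "flights" | "weather" | "stocks";

function routeIntent(utterance: string): Worker[] {
  const text = utterance.toLowerCase();
  const workers: Worker[] = [];
  // A travel request fans out to several workers in sequence
  if (/\b(trip|travel|flight)\b/.test(text)) workers.push("calendar", "flights", "weather");
  if (/\bweather\b/.test(text) && !workers.includes("weather")) workers.push("weather");
  if (/\b(stock|share price)\b/.test(text)) workers.push("stocks");
  if (/\b(calendar|event|meeting)\b/.test(text) && !workers.includes("calendar")) workers.push("calendar");
  return workers;
}
```

For example, `routeIntent("Plan my trip to Tokyo")` fans out to the calendar, flight, and weather workers, while `routeIntent("what's the weather in Vancouver?")` activates only the weather worker.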

Genie employs specialized worker agents that interact with specific third-party APIs. Due to time constraints during development, only the following agents were fully implemented:

  1. Calendar Agent: Manages Google Calendar interactions, allowing users to create and retrieve calendar events.

  2. Stock Agent: Utilizes the Alpha Vantage API to fetch real-time stock information, providing timely financial insights.

  3. Weather Agent: Fetches current weather information via OpenWeatherMap API.

Once the Supervisor and Worker agents complete their tasks, the Coder Agent (powered by o3-mini) takes the structured output and generates the corresponding React widget code. It follows a standardized structure throughout, using Lucide icons and TailwindCSS for consistent, visually coherent results. The finished component is automatically committed to GitHub, triggering a frontend update.

What worked

The proof-of-concept works. Say "what's the weather in Vancouver?" and Genie generates a weather widget on the fly. Ask about a stock, get a live card with real data. Request your next calendar event, and it surfaces.

The interaction isn't only retrieval but also generation. And seeing it work for the first time, however rough around the edges, confirmed the core thesis: the paradigm shift is technically possible today.

A working demo of Genie is demonstrated in the video below.

What didn't

Latency is the honest limitation. Generating a widget takes a few seconds, which is fast for what's happening under the hood but feels slow compared to tapping an app icon. The interaction model needs to account for this: feedback states, loading affordances, and progressive rendering would all help close the gap.

Although LLMs excel at generating functional code, aligning their outputs with established design standards or style guides required extensive manual intervention and iterative refinement. This is where I'd invest significant energy in a next iteration.

The rapid evolution and relative immaturity of frameworks like n8n and LangChain introduced frequent capability shifts, incomplete documentation, and unstable integrations, complicating development.

The goal was never to build a finished product. It was to demonstrate that the paradigm is ready to shift and to show what that shift might feel like.

Genie is one attempt at that. It's a proof of concept, a design fiction, and a provocation. The home screen has looked the same for forty years. It doesn't have to look the same for the next forty.

The vision for the future

Given these practical limitations, it became necessary to shift perspective toward speculation, narrative, and design fiction as a way of imagining what Genie might become once freed from current constraints. Following Dunne and Raby’s (2013) approach to speculative design, the design fiction video serves as a provocation, encouraging reflection on future possibilities in user experience and broader societal implications of adaptive generative UIs.

Sources:

  1. Corbett, E., & Weber, A. (2016, September). What can I say? addressing user experience challenges of a mobile voice user interface for accessibility. In Proceedings of the 18th international conference on human-computer interaction with mobile devices and services (pp. 72-82).

  2. Dunne, A., & Raby, F. (2013). Speculative Everything: Design, Fiction, and Social Dreaming. The MIT Press.

  3. Hancock, T., & Bezold, C. (1994, March). Possible futures, preferable futures. In The Healthcare Forum Journal (Vol. 37, No. 2, pp. 23-29).

  4. Li, X., Wang, S., Zeng, S., Wu, Y., & Yang, Y. (2024). A survey on LLM-based multi-agent systems: workflow, infrastructure, and challenges. Vicinagearth, 1(1), 9.

  5. Myers, C., Furqan, A., Nebolsky, J., Caro, K., & Zhu, J. (2018, April). Patterns for how users overcome obstacles in voice user interfaces. In Proceedings of the 2018 CHI conference on human factors in computing systems (pp. 1-7).

  6. Yuan, M., Chen, J., & Quigley, A. (2024). MAxPrototyper: A Multi-Agent Generation System for Interactive User Interface Prototyping. arXiv preprint arXiv:2405.07131.

navin thomsy

I'm a multidisciplinary product designer with a unique perspective shaped by my background as an Indian third-culture kid who grew up in Muscat, Oman. This upbringing in a multicultural environment instilled in me a deep-seated empathy and a curiosity for understanding diverse human needs, which I now channel into my design practice. My work is inherently forward-looking, driven by a passion for exploring and building with frontier technologies like generative AI and extended reality.

© 2026 Navin Thomsy
