Genie

Reimagining homescreen interfaces using generative AI


Type

Master's Graduation Project

Year

2025

Timeline

Jan - Apr (4 Months)

Team Size

Individual + 2 Supervisors

Keywords

Generative AI, Generative UI, Design Futures, Design Fiction, Multi-Agent Systems.

The Context

Genie is the culminating project for my Master of Science in Interactive Arts and Technology at Simon Fraser University. For decades, we've been stuck with the same static, app-centric home screens on our phones and computers. This rigid grid of icons forces us to manually hunt and peck through different applications to complete even simple tasks, creating a fragmented and inefficient experience. My research challenged this outdated paradigm by exploring a fundamental question: What if our interfaces could build themselves around our needs in real-time? Genie is a reimagining of the home screen where multi-agent AI systems dynamically generate adaptive interfaces in direct response to user intent.

Objectives

This project's objective was to move beyond theory and demonstrate a tangible alternative to the static, app-based interface. Employing a Design Futures methodology along with Design Fiction, my goal was to bridge the gap between current technological limitations and a preferred future for user interaction. This dual-focused approach led to two primary deliverables:


  • A Functional Proof-of-Concept: A working, voice-driven prototype that validates the core interaction model by interpreting user intent and dynamically generating a UI using current technology.


  • A Diegetic Prototype Video: A key Design Fiction artifact that communicates the long-term vision of a seamless, instantaneous, and truly adaptive interface, unhindered by present-day constraints.

Is this the best we can do?

For decades, home-screen interfaces have remained fundamentally unchanged: a passive, app-centric model that forces users to adapt to the system.

For over forty years, the way we interact with our digital devices has remained largely the same. We navigate through a static grid of applications, forcing our needs to conform to a rigid, app-centric model.


In fact, our home screens have looked essentially the same since Xerox introduced the graphical user interface with the Alto and Apple followed with System 1. Yes, we have come a long way in terms of what our operating systems can do. What hasn’t changed much, however, is how we interact with them.


This realization hit me during my master's program at Simon Fraser University, leading to Genie, my exploration of what interfaces could become. This project explores an alternative by reimagining the home-screen as a dynamic, responsive environment.


Identifying the problem

This approach works for many good reasons, which is why we've used it for 40 years. But it assumes users know exactly which apps to use, how to use them, and in what order.

The current paradigm led me to believe that there is a "crisis of imagination", a stagnation in design where we've accepted a passive, fragmented user experience. The process of performing a simple task like checking flight times or creating a calendar event requires multiple manual steps and navigation across different apps, creating unnecessary cognitive load.


It’s also inaccessible for people who aren’t tech-savvy or who have physical or cognitive disabilities, many of whom find these systems rigid and unforgiving.


And even when users know what they’re doing, there are so many options: multiple apps that all do the same thing slightly differently. You end up spending time downloading and comparing tools to find the right one, only to discover it is locked behind a subscription.


Designing for the future?

So then how do we go beyond incremental improvements and rethink the way we interact with our devices?

Acknowledging this crisis of imagination led me to ask a broader question:


How do we go beyond incremental improvements and rethink the way we interact with our devices?


To tackle that, I needed a methodology that wasn’t just about fixing what’s broken, but one that could help me imagine what should be and what could be. I employed a dual methodological approach, using Design Futures and Design Fiction.


Design Futures merges Design Thinking's problem-solving with Futures Thinking's foresight. The (modified) diagram above by Elliott P. Montgomery shows where these methods lie on the spectrum between Unconstrained & Imaginary and Constrained & Applied design practices.


I looked at what is making headlines today, AI, LLMs, and agentic frameworks and tools, and designed for probable futures, i.e. scenarios likely to be enabled by near-term technological advancements. In other words: what could we realistically build now?


However, designing only within today's technical boundaries would limit our understanding of this technology's true potential.


Following Dunne and Raby's approach in Speculative Everything: Design, Fiction and Social Dreaming, I created speculative artifacts as tools for critical inquiry. Their work demonstrates how design can interrogate possible futures rather than simply solve present problems. For this project, I developed a diegetic prototype, what David A. Kirby in Lab Coats in Hollywood describes as a prototype existing within a fictional narrative context.


This speculative vision video explores how intent-driven interfaces might evolve in a more technologically mature ecosystem, similar to how Apple's 1987 "Knowledge Navigator" concept helped audiences imagine computing's future possibilities. By situating the technology within a plausible future scenario, the video becomes a platform for examining not just what this technology could do, but how it might reshape our relationship with digital devices entirely.


Design × Futures

Design Futures is a fairly new industry practice; it's a hybrid speculative design methodology combining elements of Design Thinking and Futures Thinking. Design Futures, as defined by Jod Kaftan (Head of Product Design & Research at Oracle), begins with identifying drivers and signals that shape our technological landscape. Drivers are "tides": underlying forces in the present with deep roots in historic patterns of change. These can be cultural, economic, technological, or environmental. Signals are "waves": early indicators, small innovations (a new product, service, initiative, policy, dataset, social convention, or technology) happening today that hint at larger transformations ahead.


For this project, I identified two primary drivers reshaping interface design:


  • Generative AI advancement: Moving from content creation to functional interface generation

  • Multi-agent systems: Enabling AI to perform complex, collaborative tasks autonomously


My signals came from cutting-edge AI that, while still experimental, pointed to a clear future. Products like Vercel's v0 proved functional UIs could be generated from text, and collaborative AI frameworks like AutoGen showed how complex problems could be solved. Seeing major corporations like Microsoft and Google pushing these technologies confirmed that intent-driven interfaces were not just possible, but inevitable.


Phil Balagtas, Experience Design Director at McKinsey and founder of the Design Futures Initiative, emphasizes that the future should not be seen as a straight line but as a cone of possibilities (many "futures"), encompassing probable, plausible, and possible outcomes. I chose to focus on two:


  • The Probable Future: What's most likely to happen given current technological constraints and adoption patterns. My functional proof-of-concept operates within today's technical limitations while demonstrating the core value proposition of intent-driven interfaces.

  • The Preferred Future: The ideal outcome we should strive toward, guided by user values and needs rather than just technical possibility. This vision pushes beyond current limitations to imagine interfaces that truly serve human intentions rather than forcing users to navigate rigid app structures.


What could this mean for user interfaces?

My research identified a convergence of key technological drivers. Academic literature shows that multi-agent systems are shifting development from coding to orchestration (Li et al., 2024), with AI agents already collaborating to generate coherent UI prototypes (Yuan et al., 2024a). At the same time, LLMs are finally overcoming the documented failures of early voice interfaces (Corbett & Weber, 2016; Myers et al., 2018) by enabling flexible, conversational interaction.


This led to my core insight:


LLMs provide the reasoning, agents provide the structure, and voice lowers the barrier to state what you need.


I can't be the first one to try, right?

Before starting work on my own solution, I analyzed tools and devices that have attempted to change how we interact with interfaces. I chose three different products that target different needs to understand the strengths they offer and the limitations they faced.

Vercel’s v0 can transform plain-text descriptions into working React components, making it a powerful tool for developers. Yet its focus remains front-end prototyping, not complete end-to-end experiences, and it still assumes a developer in the loop.

Natural by Brain Technologies Inc. takes a more radical stance, abandoning the app grid entirely in favor of dynamically generated interfaces from natural language. But its closed architecture and limited extensibility make it feel less like a platform and more like a specialized search engine.

Rabbit R1 positions itself as an AI-powered personal assistant with a “Generative UI” that adapts to the user. It’s impressive, but its scope is bound to a fixed set of predefined tasks, leaving no room for dynamic, extensible orchestration.

Taken together, these tools hint at something larger: a future where the interface itself is ephemeral, summoned only when needed. But none fully bridges the gap between intent and execution.

Genie explores how Generative AI and multi-agent systems can translate user input into interfaces.


Creating Genie

Genie is a proof-of-concept that reimagines the home screen as a dynamic, responsive environment. Instead of hunting for the right app, users simply state their needs in natural language, and Genie instantly generates a tailored, context-specific interface. This project was a deep dive into how generative AI and multi-agent systems could help achieve this goal.


Squashing early assumptions

The objective was to challenge preconceived notions, refine the project direction, and build a stronger foundation before moving into development.

Grounded in a Research Through Design approach, this project utilized the "5 Intelligent Failures", an exercise developed by my supervisor to rigorously test core assumptions. My process involved:


  • Deconstructing the idea into five key components: the agentic framework, voice input, prompt engineering, frontend, and backend.

  • Challenging the assumptions of each component by designing specific failure scenarios.

  • Prototyping solutions using sketches, mockups, and simple code to test these scenarios.

  • Validating each test with peers and supervisors to gather feedback and iterate quickly, which confirmed and strengthened the final design.


Early Development

Early ideation focused on establishing a multi-agent architecture to map the task sequence and determine which agents would be required.

The initial design phase established the system's architecture by mapping out three key components; a minimal sketch of the Supervisor's routing step follows the list:


  • A Supervisor Agent to interpret user intent and delegate tasks.

  • Specialized Worker Agents (e.g., weather, stocks) to retrieve data from public APIs.

  • A UI Generation Process to transform the retrieved data into visual components.
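
The production orchestration lives in n8n, but a minimal TypeScript sketch helps make the Supervisor's delegation step concrete. The prompt wording, the worker registry, and the returned JSON shape below are illustrative assumptions, not the exact configuration used in the project.

```ts
// Minimal sketch of the Supervisor's routing step (illustrative only; the
// project's real orchestration is an n8n workflow). Assumes OPENAI_API_KEY is set.
import OpenAI from "openai";

const client = new OpenAI();

// Hypothetical worker registry; Genie shipped Calendar, Stock and Weather workers.
const WORKERS = ["calendar", "stocks", "weather"] as const;

export async function routeIntent(utterance: string) {
  const response = await client.chat.completions.create({
    model: "gpt-4o",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          "You are a supervisor agent. Given a user request, respond with JSON " +
          `{"workers": [...], "parameters": {...}}, choosing workers only from: ${WORKERS.join(", ")}.`,
      },
      { role: "user", content: utterance },
    ],
  });

  // e.g. { "workers": ["weather"], "parameters": { "city": "Vancouver" } }
  return JSON.parse(response.choices[0].message.content ?? "{}");
}
```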

Preprototyping

Before writing production code, I built two parallel, low-cost artifacts:


  1. tldraw sketches that explored layout and interaction states (how a generated UI collapses, expands, or hands control back to the user)

  2. n8n workflow prototypes that encoded voice → action sequences.


The tldraw frames validated layout affordances and the basic interaction states; the n8n prototypes validated the multi-agent logic: voice input to intent parsing, worker sequencing, and the structured JSON the frontend would consume (a minimal sketch of that contract follows). These artifacts reduced risk before investing in the full stack.
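
The schema kept evolving throughout the project, but the contract the prototypes converged on looked roughly like the hypothetical TypeScript types below; the field names are illustrative rather than the project's final shape.

```ts
// Hypothetical shape of the structured JSON the frontend consumes.
// Field names are illustrative, not the project's final schema.
interface WidgetSpec {
  id: string;                                   // stable key used for persistence
  type: "weather" | "stocks" | "calendar";      // which worker produced it
  title: string;                                // label shown in the widget header
  layout: { x: number; y: number; w: number; h: number }; // grid units
  data: Record<string, unknown>;                // worker output rendered by the widget
}

interface DashboardSpec {
  utterance: string;        // the transcribed voice request
  widgets: WidgetSpec[];    // one entry per generated widget
}
```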

Development Phase

Genie is built with:

  • n8n, implementing LangChain-style workflows for agent coordination

  • OpenAI, for natural language understanding and task parsing

  • Supabase, as the backend to store data

  • React, to generate real UI components in response to user intent

n8n

Multi-Agent AI workflow. n8n implements LangChain JS.

Supabase

A PostgreSQL-based database to store widget metadata.

OpenAI

Large language models used: o3-mini & GPT-4o.

React

React.js framework to generate widgets.

System Architecture

Genie operates on a multi-agent architecture where a central Supervisor coordinates a network of specialized Worker agents. The system is built on three core components:


  • The Supervisor Agent, powered by GPT-4o, parses user intent and manages conversational memory.

  • Specialized Worker Agents (e.g., Weather, Stocks) execute specific tasks by making API calls and returning structured JSON.

  • A dedicated Coder Agent translates this JSON data into functional React components.


This entire orchestration is managed visually in n8n, a tool whose node-based editor makes the system's logic explicit and easy to modify.


The real-time workflow for generating UI unfolds in a clear sequence (a sketch of the commit step follows the list):


  1. The Supervisor analyzes a user's request.

  2. It triggers the relevant Worker agent(s) to perform the task.

  3. The Worker returns data to the system as structured JSON.

  4. The Coder agent converts this JSON into a new React component.

  5. The component is automatically pushed to a GitHub repository.

  6. Finally, the frontend fetches the component from GitHub and renders it for the user.
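
To make the commit step (5) concrete, here is a hedged sketch of how a generated component can be pushed with Octokit. The owner, repository, and path are placeholders, and in the project this step runs inside an n8n node rather than as standalone code.

```ts
// Sketch of committing a generated widget to GitHub (step 5). The owner, repo
// and path are placeholders; in Genie this runs inside an n8n node.
import { Octokit } from "@octokit/rest";

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });
const OWNER = "example-user";      // placeholder
const REPO = "genie-widgets";      // placeholder

export async function commitWidget(name: string, source: string) {
  const path = `src/widgets/${name}.jsx`;

  // Look up the existing file's SHA so repeated commits update instead of failing.
  let sha: string | undefined;
  try {
    const { data } = await octokit.repos.getContent({ owner: OWNER, repo: REPO, path });
    if (!Array.isArray(data)) sha = data.sha;
  } catch {
    // File does not exist yet; it will be created below.
  }

  await octokit.repos.createOrUpdateFileContents({
    owner: OWNER,
    repo: REPO,
    path,
    message: `feat: add generated widget ${name}`,
    content: Buffer.from(source).toString("base64"),
    sha,
  });
}
```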


Frontend Architecture

Genie's user experience centers on intuitive voice interactions. The system employs the Web Speech API for speech transcription. The speech recognition component is configured to display interim transcriptions, giving users real-time feedback as they speak. Upon completion of speech input or manual termination, the final transcription is sent to an external webhook (n8n) for further processing.
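
A stripped-down sketch of that capture-and-forward flow is shown below. The webhook URL and the transcript element are placeholders, error handling is omitted, and the Web Speech API requires a browser that exposes SpeechRecognition or webkitSpeechRecognition (e.g. Chrome).

```ts
// Minimal sketch of Genie's voice capture: show interim feedback while the user
// speaks, then POST the final transcript to the n8n webhook (URL is a placeholder).
const SpeechRecognitionImpl =
  (window as any).SpeechRecognition || (window as any).webkitSpeechRecognition;

const recognition = new SpeechRecognitionImpl();
recognition.continuous = true;
recognition.interimResults = true; // stream partial transcriptions for live feedback
recognition.lang = "en-US";

let finalTranscript = "";

function renderTranscript(text: string) {
  // Hypothetical helper: mirror the live transcript in the UI.
  const el = document.getElementById("transcript");
  if (el) el.textContent = text;
}

recognition.onresult = (event: any) => {
  let interim = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const chunk = event.results[i][0].transcript;
    if (event.results[i].isFinal) finalTranscript += chunk;
    else interim += chunk;
  }
  renderTranscript(finalTranscript + interim);
};

recognition.onend = async () => {
  // Hand the finished utterance to the orchestration layer.
  await fetch("https://example.app.n8n.cloud/webhook/genie", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ utterance: finalTranscript }),
  });
};

recognition.start();
```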


The frontend interface of Genie adopts a minimalistic approach, featuring an empty dashboard screen with a central clock to mimic familiar desktop home screens. The interface is rendered in React, using GridStack.js to manage layout, resizing, and drag-and-drop interactions. The renderer interprets the JSON specification coming from the orchestration layer and maps each component definition to a corresponding widget. The Supabase backend is used to store widget positions and configurations between sessions.
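
The sketch below condenses the persistence side of that renderer: GridStack reports layout changes, and each widget's position is upserted into Supabase. The `widgets` table, its columns, and the credentials are placeholder assumptions; mapping each JSON component definition to its widget is omitted for brevity.

```ts
// Sketch: persist widget layout changes from GridStack into Supabase.
// The table name, columns, and credentials are placeholder assumptions.
import { GridStack } from "gridstack";
import { createClient } from "@supabase/supabase-js";

const supabase = createClient("https://example.supabase.co", "public-anon-key");

// Initialize the dashboard grid (drag, resize, float freely).
const grid = GridStack.init({ cellHeight: 80, margin: 8, float: true });

// Whenever a widget is moved or resized, upsert its position and size.
grid.on("change", async (_event: Event, items: any[]) => {
  const rows = items.map((item) => ({
    id: item.id,                       // widget id from the JSON spec
    x: item.x, y: item.y, w: item.w, h: item.h,
  }));
  await supabase.from("widgets").upsert(rows);
});
```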


Backend and Multi-Agent Architecture

The multi-agent architecture in Genie is built entirely within n8n. The Supervisor workflow receives the parsed user voice input and decides which Workers to activate. For example, a travel-related request might trigger Calendar, Flight Search, and Weather Workers in sequence.

Genie employs specialized worker agents that interact with specific third-party APIs. Due to time constraints during development, only the following agents were fully implemented:

  1. Calendar Agent: Manages Google Calendar interactions, allowing users to create and retrieve calendar events.

  2. Stock Agent: Utilizes the Alpha Vantage API to fetch real-time stock information, providing timely financial insights.

  3. Weather Agent: Fetches current weather information via the OpenWeatherMap API (a sketch of this worker follows the list).
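
To make the worker pattern concrete, here is a hedged sketch of what the Weather Agent's call amounts to. In the project it is implemented as an n8n HTTP Request node; the normalized output fields shown here are illustrative.

```ts
// Sketch of the Weather worker: call OpenWeatherMap and return the structured
// JSON handed to the Coder Agent. The output field names are illustrative.
export async function weatherWorker(city: string) {
  const key = process.env.OPENWEATHER_API_KEY;
  const url =
    "https://api.openweathermap.org/data/2.5/weather" +
    `?q=${encodeURIComponent(city)}&units=metric&appid=${key}`;

  const res = await fetch(url);
  if (!res.ok) throw new Error(`OpenWeatherMap error: ${res.status}`);
  const raw = await res.json();

  // Normalize the response into the shape the rest of the pipeline expects.
  return {
    type: "weather",
    city: raw.name,
    temperature: raw.main.temp,           // °C because of units=metric
    condition: raw.weather[0].description,
    humidity: raw.main.humidity,
  };
}
```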

Following task completion by the Supervisor and worker agents, the Coder Agent, powered by OpenAI’s o3-mini model, receives the structured task outputs and generates the appropriate React widget code. The generated code adheres to a standardized structure, supporting dynamic content and interactive elements such as Lucide icons, with consistent styling via TailwindCSS. After code generation, the Coder Agent automatically commits the new widget files directly to the GitHub repository, triggering frontend updates.
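
For illustration, a widget in the style the Coder Agent is prompted to produce might look like the hand-written stand-in below (not actual model output); its props mirror the Weather worker's hypothetical JSON.

```tsx
// Hand-written stand-in for a Coder Agent output: a weather widget styled with
// TailwindCSS and Lucide icons. Props mirror the Weather worker's JSON output.
import { Cloud, Droplets } from "lucide-react";

interface WeatherWidgetProps {
  city: string;
  temperature: number;
  condition: string;
  humidity: number;
}

export default function WeatherWidget({ city, temperature, condition, humidity }: WeatherWidgetProps) {
  return (
    <div className="rounded-2xl bg-slate-900 p-4 text-white shadow-lg">
      <div className="flex items-center justify-between">
        <h2 className="text-lg font-semibold">{city}</h2>
        <Cloud className="h-6 w-6 text-sky-400" />
      </div>
      <p className="mt-2 text-4xl font-bold">{Math.round(temperature)}°C</p>
      <p className="text-sm capitalize text-slate-300">{condition}</p>
      <div className="mt-3 flex items-center gap-1 text-sm text-slate-300">
        <Droplets className="h-4 w-4" />
        <span>{humidity}% humidity</span>
      </div>
    </div>
  );
}
```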

Results and Limitations

While the Design Futures methodology enabled the creation of a functional proof-of-concept and clarified practical avenues for generative AI-driven interfaces, several critical limitations emerged during the development process:


  • Performance constraints: The current technological infrastructure for running large language models (LLMs) and multi-agent systems (MAS) introduced latency issues. Genie’s dependence on cloud-based processing results in significant delays of 15 to 40 seconds per task, detracting from the envisioned fluid user experience.

  • Consistency in UI design: Although LLMs excel at generating functional code, aligning their outputs with established design standards or style guides required extensive manual intervention and iterative refinement.

  • Framework maturity and stability: The rapid evolution and relative immaturity of frameworks like n8n and LangChain introduced frequent capability shifts, incomplete documentation, and unstable integrations, complicating development.


A working demo of Genie is shown in the video below.

Design Fiction

Given these practical limitations, it became necessary to shift perspective toward speculation, narrative, and design fiction as a way of imagining what Genie might become once freed from current constraints. Following Dunne and Raby’s (2013) approach to speculative design, the design fiction video serves as a provocation, encouraging reflection on future possibilities in user experience and broader societal implications of adaptive generative UIs. It challenges conventional interaction paradigms and prompts viewers to critically assess what desirable computing futures should look like, ensuring these visions align with user needs.


Ultimately, this approach aims not merely to illustrate technological capabilities but to provoke deeper dialogues about the role of technology in everyday life and the principles that should guide future computing systems.


I'm Navin Thomsy, a multidisciplinary product designer with a unique perspective shaped by my background as an Indian third-culture kid who grew up in Muscat, Oman. This upbringing in a multicultural environment instilled in me a deep-seated empathy and a curiosity for understanding diverse human needs, which I now channel into my design practice. My work is inherently forward-looking, driven by a passion for exploring and building with frontier technologies like generative AI and extended reality.

© 2025 Navin Thomsy
