Aria Xiying Bao, Yubo Zhao
We collaboratively developed the concept and interaction flow. I architected and implemented the full software stack, including AI integration, hardware control, and real-time synchronization.
Narratron is an interactive projector that augments hand shadow puppetry with AI-generated storytelling. Designed for all ages, this product transforms traditional physical shadow plays into a truly immersive and intelligent phygital storytelling experience.
The history of hand shadow play is nearly untraceable which was prevalently practiced long before the existence of Greek shadow show Karagiozis or Chinese shadow puppetry Pi Ying Xi. It is a prelinguistic and transcultural form of storytelling that entertains, educates, and instills moral values in the younger generation; it is also a stimuli of creative production, by mimicking the things we see, and by telling the stories we relate. Narratron, in that sense, has deeply embedded AI into this intelligent collective effort of hands, eyes, and brains as a true “fairytale copilot”. Its nature of multimodality that combines visual, auditory, tactile, and textual I/O, supported by the collaboration of LLM, image classifier, speech synthesizer and diffusion models, demonstrates how seamlessly we are able to make bodily interactions with AI. Through bridging the digital and the physical, we are now bridging the ancient and the future.
Narratron initializes with a clear instruction screen that guides users through the interaction paradigm. The system prompts users to create hand shadows using the integrated projection surface.
Users experiment freely with shadow formations while the webcam continuously monitors the projection area. When users achieve a desired shadow shape, they press the capture button to freeze and process the gesture.
Make your hand shadow to start the story
Press the button to capture shadow
Spin the knob to reveal the story
The system employs computer vision algorithms to classify captured shadows into predefined categories. These classifications trigger GPT-3.5 to generate original narratives with structured plot development and thematic coherence.
Concurrently, our LoRA-adapted Stable Diffusion model generates environmental backgrounds that contextualize the narrative without rendering the protagonist. This design preserves the user's shadow as the active character while AI provides atmospheric staging.
The resulting experience synchronizes AI-generated narration with projected visuals. Users navigate chapters via rotary encoder—a deliberate interface choice that provides tactile control over narrative pacing. This physical interaction mechanism ensures users maintain agency throughout the storytelling experience, advancing through generated content at their preferred pace.
Narratron employs a minimalist industrial design with deliberate component placement and material selection. The enclosure features a clean white finish that doubles as the projection surface, with strategic yellow accents highlighting interactive elements.
The interface consists of two primary controls: a top-mounted button for shadow capture and a front-facing rotary encoder for story navigation. We positioned these controls based on ergonomic studies and user testing to ensure intuitive operation across age groups and abilities.
The oversized button and knob serve both functional and aesthetic purposes. Their scale enhances discoverability and encourages interaction while accommodating users with varying motor abilities. The rotary encoder's industrial design references vintage projection equipment, creating a familiar interaction paradigm that requires no instruction.
Narratron's industrial design prioritizes user-centered functionality and ergonomic considerations. The placement of the button and knob is carefully optimized for intuitive operation, ensuring a seamless and enjoyable user experience. The product's clean and uncluttered design eliminates unnecessary distractions, allowing users to focus on the immersive storytelling journey.
Narratron’s tech stack combines a variety of powerful technologies and frameworks. Teachable Machine is used for hand gesture recognition, GPT-3.5 for story generation, Stable Diffusion for image generation, and Replicate for custom model training. This comprehensive tech stack enables Narratron to deliver an interactive storytelling experience that blends user creativity with AI-generated narratives and visuals.
Teachable Machine: To enable hand gesture recognition, Narratron utilizes Teachable Machine, an online platform that allows for the training of custom machine learning models. The hand gesture recognition model is trained using Teachable Machine's interface, which captures and analyzes various hand shadow shapes to recognize and interpret user movements accurately.
GPT-3.5-turbo: Narratron employs the powerful GPT-3.5-turbo model developed by OpenAI for generating dynamic and engaging stories. The model is trained on a vast corpus of text, including literary works, articles, and other narrative sources. To specifically train the story generation model for Narratron, The Ugly Duckling story might have been used as a training dataset, providing a foundation for generating narratives based on user inputs and animal keywords.
Stable Diffusion: For image generation, Narratron utilizes the Stable Diffusion algorithm. This algorithm leverages deep generative models to produce visually appealing and contextually relevant images. The image generation model has been trained using a custom dataset created specifically for Narratron. The training data likely includes various animal images and visual elements related to storytelling.
Replicate: In order to train the image generation model, Narratron employs the Replicate framework. This framework allows for the development and training of custom deep learning models. Using Replicate, the image generation model is trained on the custom dataset, incorporating techniques such as conditional image generation to ensure that the generated images correspond to the user's hand shadow animal keywords.
1. Persistent Narrative Memory
Narratron retains session history, opens with a succinct “Previously …” recap, and extends established plotlines and character arcs. The result is a continuously evolving, household‑specific story universe that deepens emotional engagement through meaningful callbacks and character growth.
2. Voice‑Branching Choice Moments
At pivotal story beats, Narratron offers two or more narrative paths and listens for the player’s spoken choice (e.g., “the glowing door”). On‑device voice recognition then steers the plot down the selected branch, reinforcing agency while keeping the interaction hands‑free and immersive.
3. Modular Shadow Props
A set of magnetically attached silhouette frames—dragon wings, castle spires, star portals, and more—snap over the light source in seconds, instantly expanding the shadow vocabulary. These swappable physical add‑ons enrich imaginative play while preserving Narratron’s screen‑free, hands‑on ethos.