Build a Multimodal Agent That Sees, Hears & Talks Back

Multimodal AI, meaning systems that can understand and respond through text, images, and voice, is redefining how we interact with technology. In this lab, we’ll show how to build a multimodal agent that combines language, vision, and voice inputs and speaks its answers back, for a natural human-AI conversation.

✅ What You’ll Learn
  • What true multimodal AI means — and why you can’t ignore it

  • How to architect agents that blend language, vision, and voice inputs/outputs

  • The must-have tools: vision models, speech-to-text, text-to-speech, and orchestration frameworks

  • Real-world use cases across support, healthcare, education, and retail

  • Why voice response is the next big leap in natural, human-like AI
🚀 Live Demo
  • 👁️ See — Analyze a screenshot of an error

  • 🎙️ Hear — Understand a voice message describing the issue

  • 💬 Chat — Ask clarifying questions in real time

  • 🗣️ Talk Back — Deliver the solution using AI-generated speech
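
The four demo steps above can be read as one pipeline: turn the image and the audio into text, hand everything to a language model, then speak the reply. Below is a minimal sketch of that flow, not the implementation used in the lab. Every name in it (`Turn`, `MultimodalAgent`, `build_demo_agent`) is hypothetical, and the vision, speech-to-text, language, and text-to-speech components are stub functions standing in for real models.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Turn:
    """One user turn; any combination of modalities may be present."""
    text: Optional[str] = None
    image: Optional[bytes] = None  # e.g. a screenshot of an error
    audio: Optional[bytes] = None  # e.g. a recorded voice message


@dataclass
class MultimodalAgent:
    """Hypothetical orchestrator; each field is a pluggable callable."""
    describe_image: Callable[[bytes], str]  # vision model (assumed interface)
    transcribe: Callable[[bytes], str]      # speech-to-text (assumed interface)
    generate_reply: Callable[[str], str]    # language model (assumed interface)
    synthesize: Callable[[str], bytes]      # text-to-speech (assumed interface)
    history: List[str] = field(default_factory=list)

    def respond(self, turn: Turn) -> Tuple[str, bytes]:
        """Fuse all inputs into text, query the language model, speak the reply."""
        parts = []
        if turn.image is not None:
            parts.append("[image] " + self.describe_image(turn.image))
        if turn.audio is not None:
            parts.append("[voice] " + self.transcribe(turn.audio))
        if turn.text:
            parts.append("[text] " + turn.text)
        self.history.append("user: " + " ".join(parts))
        # The whole conversation so far is the prompt, so follow-up
        # questions keep the context of earlier turns.
        reply = self.generate_reply("\n".join(self.history))
        self.history.append("agent: " + reply)
        return reply, self.synthesize(reply)


def build_demo_agent() -> MultimodalAgent:
    """Wire the agent with stubs; real models would replace these lambdas."""
    return MultimodalAgent(
        describe_image=lambda img: f"screenshot of {len(img)} bytes",
        transcribe=lambda aud: f"voice message of {len(aud)} bytes",
        generate_reply=lambda prompt: (
            f"Reply based on {len(prompt.splitlines())} line(s) of context"
        ),
        synthesize=lambda text: text.encode("utf-8"),
    )


if __name__ == "__main__":
    agent = build_demo_agent()
    reply, speech = agent.respond(Turn(image=b"...", audio=b"..."))
    print(reply, len(speech))
```

Keeping each capability behind a plain callable means a component can be swapped (for example, a different speech-to-text service) without touching the orchestration logic.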
👥 Who Should Attend
  • Developers & ML engineers adding voice/vision to AI

  • Product & CX leads rethinking support with agents

  • Business leaders automating workflows with AI

Omar Shanti

CTO @ HatchWorks AI

Omar Shanti, the new Chief Technology Officer at HatchWorks AI, is an award-winning innovator with extensive experience in AI and data architecture. Drawing on his background at Kin + Carta, Omar has led projects in Generative AI, MLOps, and Blockchain. At HatchWorks AI, he will drive the Generative-Driven Development™ methodology, focusing on AI-native solutions that enhance software development, boost operational efficiency, and unlock new revenue opportunities for clients.

Matt Paige

VP Strategy @ HatchWorks AI

Matt Paige is VP of Strategy & Marketing at HatchWorks AI, where he leads go-to-market strategy, education, and community around AI-native software. As host of the Talking AI podcast and founder of HatchWorks AI Labs, he’s known for turning complex AI concepts into practical playbooks. With a background in business strategy and a passion for hands-on training, Matt helps teams, from startups to enterprises, build real capability in the age of intelligent software.

David Berrio

Sr. AI/ML Engineer @ HatchWorks AI

David Berrio is a seasoned professional in AI and Data Science with extensive expertise in developing and deploying machine learning algorithms, backed by a strong command of MLOps and microservices. His career features significant experience in Computer Vision, Machine Learning, Azure Cloud, and state-of-the-art Deep Learning architectures. A lifelong learner, David believes in the power of knowledge to drive innovation and deliver impactful solutions, a belief that guides his professional work.
