Daniel Madalitso Phiri
Vision for Websites: Training Your Frontend to See
#1 about 1 minute
Defining vision as the ability to deduce and understand
The concept of vision for websites is redefined from simply seeing to the ability to deduce, understand, and act on information.
#2 about 4 minutes
Demo of a multimodal e-commerce search application
A live demonstration showcases an e-commerce store where users can search for products using both text queries and by uploading images.
#3 about 2 minutes
What is multimodality in artificial intelligence?
Multimodality enables search queries to use multiple media types like text, images, and audio to capture more context and improve user interaction.
#4 about 2 minutes
Why multimodal AI creates richer user experiences
Multimodal interfaces provide more natural and context-aware interactions, moving beyond simple keyword searches to a more intuitive experience.
#5 about 4 minutes
Differentiating generative AI from embedding models
Embedding models encode information as numerical representations (vectors), whereas generative models create new data.
#6 about 4 minutes
How vector search works by measuring distance
Vector search operates by converting a query into an embedding and finding the closest, most semantically similar items in a multidimensional space.
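To make the distance idea concrete, here is a minimal in-memory sketch of ranking items by cosine similarity to a query vector. The three-dimensional vectors and product ids are invented for illustration; real embeddings come from a model and typically have hundreds of dimensions, and a vector database would use an index rather than a full scan.

```typescript
// Toy vector search: rank catalog items by cosine similarity to a query.
// All vectors and ids below are made-up illustration data.

type Item = { id: string; vector: number[] };

// Cosine similarity: 1 means same direction, 0 means unrelated (orthogonal).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have equal length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every item against the query and keep the `limit` closest ones.
function nearestItems(query: number[], items: Item[], limit: number): Item[] {
  return items
    .map((item) => ({ item, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((scored) => scored.item);
}

const catalog: Item[] = [
  { id: "red-sneaker", vector: [0.9, 0.1, 0.0] },
  { id: "blue-jacket", vector: [0.1, 0.9, 0.1] },
  { id: "red-boot", vector: [0.8, 0.2, 0.1] },
];

// A query vector pointing along the first axis ranks both "red" items first.
const topMatches = nearestItems([1, 0, 0], catalog, 2).map((item) => item.id);
```

In this toy data the query sits closest to the two "red" items, so they outrank the jacket even though no keywords are compared.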
#7 about 2 minutes
Creating a unified space for multimodal search
Different data types such as text, images, and audio are each processed by a modality-specific encoder and mapped into a single, unified vector space, which makes cross-modal queries possible.
#8 about 9 minutes
Implementing text-based image search with Weaviate
A code walkthrough demonstrates how to build a text-to-image search feature using a Next.js frontend and a Weaviate backend with a `nearText` query.
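The talk's own code is not reproduced on this page, so the following is only a sketch of what a server-side helper around a `nearText` query might look like. The `TextSearchableCollection` interface, the `Product` shape, and the `searchProductsByText` name are hypothetical; the interface is modeled on a `collection.query.nearText(query, { limit })` call and may not match the client version used in the talk.

```typescript
// Minimal structural type for the part of a collection this sketch needs.
// ASSUMPTION: modeled on a `query.nearText(text, { limit })` call; the real
// Weaviate client may expose a different or richer shape.
interface TextSearchableCollection<P> {
  query: {
    nearText(
      query: string,
      options: { limit: number }
    ): Promise<{ objects: { properties: P }[] }>;
  };
}

// Hypothetical product shape for the e-commerce demo.
type Product = { name: string; imageUrl: string };

// Run a text query against the collection and return plain product records.
async function searchProductsByText(
  collection: TextSearchableCollection<Product>,
  text: string,
  limit = 5
): Promise<Product[]> {
  const trimmed = text.trim();
  if (trimmed === "") return []; // nothing to embed, skip the round trip
  const result = await collection.query.nearText(trimmed, { limit });
  return result.objects.map((object) => object.properties);
}
```

Typing the helper against a small interface rather than a concrete client keeps it easy to call from a Next.js route handler and to exercise with a stub in tests.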
#9 about 4 minutes
Implementing visual search with an image query
The code for an image-to-image search is explained, showing how a base64 image is sent to the backend to perform a `nearImage` vector search.
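As a rough illustration of the image path, the sketch below prepares a base64 string and passes it to a `nearImage` query. It assumes the browser produced a data URL (for example via `FileReader.readAsDataURL`) and that the backend wants the raw base64 payload without the `data:` prefix. The `ImageSearchableCollection` interface and both function names are hypothetical and only modeled on a `collection.query.nearImage(image, { limit })` call.

```typescript
// Minimal structural type for the part of a collection this sketch needs.
// ASSUMPTION: modeled on a `query.nearImage(base64, { limit })` call; the
// real Weaviate client may expose a different or richer shape.
interface ImageSearchableCollection<P> {
  query: {
    nearImage(
      image: string,
      options: { limit: number }
    ): Promise<{ objects: { properties: P }[] }>;
  };
}

// Turn "data:image/png;base64,AAAA" into "AAAA"; pass raw base64 through.
function stripDataUrlPrefix(dataUrlOrBase64: string): string {
  const marker = ";base64,";
  const index = dataUrlOrBase64.indexOf(marker);
  return index === -1
    ? dataUrlOrBase64
    : dataUrlOrBase64.slice(index + marker.length);
}

// Run an image query and return the matching objects' properties.
async function searchByImage<P>(
  collection: ImageSearchableCollection<P>,
  image: string,
  limit = 5
): Promise<P[]> {
  const base64 = stripDataUrlPrefix(image);
  if (base64 === "") throw new Error("empty image payload");
  const result = await collection.query.nearImage(base64, { limit });
  return result.objects.map((object) => object.properties);
}
```

Keeping the prefix handling in its own function means the frontend can send either a data URL or raw base64 and the backend normalizes it in one place.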
#10 about 2 minutes
Expanding vision to other creative applications
Beyond e-commerce, multimodal vision can be applied to creative use cases like movie recommenders, educational tools, and map navigation.