The MultiModal Real Estate AI Agent is a specialized assistant that takes four kinds of input (textual listings, photographs, floorplans, and location maps) and produces a property analysis. It uses computer vision to extract features from the images and an LLM to interpret listing descriptions and neighborhood data. From these signals the agent estimates property value, flags investment potential, and makes suggestions tailored to the user's stated preferences. Through an interactive chat interface, users can ask follow-up questions, request comparisons between listings, and receive visual annotations on floorplans. The result is an end-to-end tool for real estate search and decision-making that pairs data-driven insights with conversational guidance.
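
As a rough illustration of how image-derived and text-derived features might be combined and matched against user preferences, here is a minimal Python sketch. All names (`Listing`, `merge_features`, `preference_score`, `rank_listings`), the feature keys, and the weighted-average scoring rule are assumptions made for illustration; the description above does not specify how the agent represents or scores listings, and the feature values are placeholders for what a vision model or an LLM would produce.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Listing:
    """Hypothetical container for one listing's multimodal signals."""
    listing_id: str
    description: str
    # Placeholder for vision-model output, each value in [0, 1].
    image_features: dict[str, float] = field(default_factory=dict)
    # Placeholder for LLM-extracted attributes, each value in [0, 1].
    text_features: dict[str, float] = field(default_factory=dict)


def merge_features(listing: Listing) -> dict[str, float]:
    """Combine both modalities; when both report a feature, keep the higher value."""
    merged = dict(listing.text_features)
    for name, value in listing.image_features.items():
        merged[name] = max(merged.get(name, 0.0), value)
    return merged


def preference_score(features: dict[str, float], preferences: dict[str, float]) -> float:
    """Weighted average of feature values, using the user's weights (assumed rule)."""
    total_weight = sum(preferences.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(w * features.get(name, 0.0) for name, w in preferences.items())
    return weighted / total_weight


def rank_listings(listings: list[Listing], preferences: dict[str, float]) -> list[Listing]:
    """Order listings from best to worst match for the given preferences."""
    return sorted(
        listings,
        key=lambda item: preference_score(merge_features(item), preferences),
        reverse=True,
    )


# Example usage with made-up feature values.
sample_listings = [
    Listing("B-2", "Compact flat near the station", image_features={"natural_light": 0.2}),
    Listing(
        "A-1",
        "Bright house with a garden",
        image_features={"natural_light": 0.9},
        text_features={"garden": 1.0},
    ),
]
sample_preferences = {"natural_light": 2.0, "garden": 1.0}
ranked_ids = [item.listing_id for item in rank_listings(sample_listings, sample_preferences)]
```

Keeping feature extraction separate from scoring means the same ranking step can serve both the initial suggestions and later comparison requests in the chat interface; a real system would replace the hard-coded feature dictionaries with model outputs.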