- Step 1: Clone the repository and install the Python dependencies.
- Step 2: Set your OPENAI_API_KEY and configure the Whisper settings.
- Step 3: Run the agent script in CLI mode.
- Step 4: Upload or specify the target document (PDF, DOCX, TXT, or image).
- Step 5: Speak your query into the microphone.
- Step 6: The agent transcribes your speech and processes the document.
- Step 7: Receive the AI-generated answer or summary in the terminal.
- Step 8: Adjust the prompts or upload a different file as needed.
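
Before starting the agent, it can help to confirm that the API key is set and that the target document is one of the supported kinds. The sketch below is a rough illustration of such a pre-flight check, not part of the agent itself. It assumes that OPENAI_API_KEY is read from the environment, and the names `check_api_key`, `classify_document`, and `SUPPORTED_TYPES` are hypothetical. The image extensions are also an assumption, since the steps only say "image".

```python
import os
from pathlib import Path

# Hypothetical mapping from file extension to the document kinds named in
# the steps. The image extensions listed here are an assumption.
SUPPORTED_TYPES = {
    ".pdf": "pdf",
    ".docx": "docx",
    ".txt": "txt",
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
}


def check_api_key(env=None):
    """Return True if OPENAI_API_KEY is present and non-empty.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


def classify_document(path):
    """Return the document kind for `path`, or raise ValueError if unsupported."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported document type: {suffix or '(no extension)'}")
    return SUPPORTED_TYPES[suffix]


if __name__ == "__main__":
    if not check_api_key():
        raise SystemExit("OPENAI_API_KEY is not set.")
    # "report.pdf" is a placeholder file name used only for illustration.
    print(classify_document("report.pdf"))
```

Checking the extension case-insensitively means a file named `REPORT.PDF` is treated the same as `report.pdf`. The check looks only at the file name, so it does not verify that the file exists or that its contents match the extension.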