An intelligent AI receptionist system that handles voice-based customer interactions, maintains conversation context, extracts metadata, and schedules appointments with email confirmations.
This AI receptionist serves Gloss & Glow Hair Salon, a fictional hair salon offering:
- Services: Haircuts, Hair Coloring, Styling, and Spa Treatments
- Stylists: Riya (Haircuts & Styling), Maya (Coloring & Highlights), Sarah (Spa Treatments), Alex (Creative Cuts & Color)
- Hours: Monday-Saturday, 10 AM - 7 PM
- Voice-to-Voice Interaction: Real-time speech-to-text and text-to-speech
- Context-Aware Conversations: Maintains memory across the conversation
- Metadata Extraction: Automatically extracts customer name, service preferences, date, time, stylist, and email
- Appointment Scheduling: Books appointments and generates meeting links
- Email Confirmations: Sends appointment confirmation emails with details
- Multi-Modal Interface: Supports both voice and text input
Backend (FastAPI)
- Framework: FastAPI with WebSocket support
- STT Model: OpenAI Whisper-1 (Speech-to-Text)
- LLM: GPT-4o-mini (Conversational AI & Metadata Extraction)
- TTS Model: OpenAI TTS-1 with Nova voice (Text-to-Speech)
- Email: aiosmtplib for async email delivery
- Architecture: Modular route structure with service injection
Frontend (Streamlit)
- Framework: Streamlit 1.51.0+
- Audio Recording: audio-recorder-streamlit
- WebSocket Client: websockets 12.0
- Real-time Communication: Async WebSocket connections
User Voice Input
↓
[STT] Whisper-1 converts speech → text
↓
[Memory Service] Uses LLM (GPT-4o-mini) to extract metadata intelligently
↓
[LLM] GPT-4o-mini generates contextual response
↓
[TTS] OpenAI TTS-1 converts response → audio
↓
User receives voice + text response
↓
[If booking detected] → Schedule appointment → Send email
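The flow above can be sketched as one function that chains the stages. This is a minimal illustration, not the repository's code: `handle_voice_turn`, `TurnResult`, `REQUIRED_FIELDS`, and the four callables are hypothetical names, and in the real backend the stages would call Whisper-1, GPT-4o-mini, and TTS-1.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TurnResult:
    transcript: str
    metadata: Dict[str, str]
    reply_text: str
    reply_audio: bytes
    booking_ready: bool


# Assumption for this sketch: a booking is attempted only once these fields are known.
REQUIRED_FIELDS = ("name", "service", "date", "time", "email")


def handle_voice_turn(
    audio: bytes,
    stt: Callable[[bytes], str],
    extract: Callable[[str], Dict[str, str]],
    respond: Callable[[str, Dict[str, str]], str],
    tts: Callable[[str], bytes],
) -> TurnResult:
    """Run one voice turn: STT -> metadata extraction -> LLM reply -> TTS."""
    transcript = stt(audio)
    metadata = extract(transcript)
    reply_text = respond(transcript, metadata)
    reply_audio = tts(reply_text)
    # The caller schedules the appointment and sends the email when this is True.
    booking_ready = all(metadata.get(field) for field in REQUIRED_FIELDS)
    return TurnResult(transcript, metadata, reply_text, reply_audio, booking_ready)
```

Passing the stages in as callables keeps the function testable with stubs and mirrors the service injection used by the backend.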
speedchain-assignment/
│
├── backend/
│   ├── main.py                     # FastAPI app entry point with service injection
│   ├── requirements.txt            # Python dependencies
│   ├── .env.example                # Environment variables template
│   ├── routes/
│   │   ├── __init__.py             # Route package init
│   │   ├── appointments.py         # Appointment scheduling endpoints
│   │   ├── conversation.py         # Conversation history endpoints
│   │   └── websocket.py            # WebSocket handler (voice/text communication)
│   └── services/
│       ├── voice_service.py        # STT & TTS using OpenAI
│       ├── llm_service.py          # LLM conversation & intelligent metadata extraction
│       ├── memory_service.py       # Conversation memory & context management
│       └── appointment_service.py  # Scheduling & email notifications
│
├── frontend/
│   ├── app.py                      # Streamlit UI application
│   └── requirements.txt            # Frontend dependencies
│
├── data/
│   ├── conversations.json          # Stored conversation history
│   └── appointments.json           # Appointment records
│
├── .gitignore
└── README.md                       # This file
- Python 3.9+
- OpenAI API Key
- Gmail account (for email notifications)
git clone https://github.com/shryesth/speedchain-assignment.git
cd speedchain-assignment
cd backend
# Create virtual environment
python -m venv .venv
# Activate virtual environment
# Windows:
.venv\Scripts\activate
# Linux/Mac:
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Configure environment variables
cp .env.example .env
# Edit .env and add:
# OPENAI_API_KEY=your_openai_api_key
# GMAIL_USER=your_email@gmail.com
# GMAIL_PASSWORD=your_app_password
cd ../frontend
# Create virtual environment
python -m venv .venv
# Activate virtual environment
.venv\Scripts\activate # Windows
# source .venv/bin/activate # Linux/Mac
# Install dependencies
pip install -r requirements.txt
Terminal 1 - Backend:
cd backend
python main.py
# Backend runs on http://localhost:8000
Terminal 2 - Frontend:
cd frontend
streamlit run app.py
# Frontend runs on http://localhost:8501
- Open Frontend: Navigate to http://localhost:8501
- Voice Interaction:
  - Click the microphone button to record your voice
  - Speak your query (e.g., "Hi, I'd like to book a haircut")
  - The AI will respond with both voice and text
- Text Interaction:
  - Type your message in the text input field
  - Click "Send" to get a text response
- Quick Booking:
  - Use the right-side form to directly book an appointment
- View History:
  - All conversations are displayed with playable audio for both user and assistant
User: "Hello, I'd like to book an appointment"
AI: "Hi! I'd be happy to help you book an appointment. What service are you interested in?"
User: "I want a haircut with Riya at 3 PM tomorrow"
AI: "Great choice! Riya is excellent with haircuts. Can I have your name and email to confirm the booking?"
User: "My name is John and my email is john@example.com"
AI: "Perfect, John! I've scheduled your haircut with Riya for tomorrow at 3 PM. You'll receive a confirmation email with the meeting link shortly."
Speech-to-Text (Whisper-1)
- Why: High accuracy, multi-language support, robust to accents
- Performance: Fast transcription with good quality
LLM (GPT-4o-mini)
- Why: Cost-effective, fast responses, good conversational abilities
- Context: Maintains conversation history for coherent interactions
- Dual Role: Both conversation generation AND intelligent metadata extraction
- Extraction: Uses structured JSON output to extract booking details from natural language
Text-to-Speech (TTS-1)
- Why: Natural-sounding voice, low latency
- Voice Choice: Nova, a friendly and professional tone suitable for a receptionist
Conversation Memory
- Stores complete message history per user session
- Maintains context across multiple interactions
- Persists to data/conversations.json
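As a rough illustration of per-session history persisted to a JSON file, the sketch below keeps one message list per user and rewrites the file on every change. The class name `ConversationMemory`, its methods, and the record layout are assumptions made for this example; the actual `memory_service.py` may be organized differently.

```python
import json
from pathlib import Path
from typing import Dict, List


class ConversationMemory:
    """Per-user message history persisted to a JSON file (illustrative sketch)."""

    def __init__(self, path: str = "data/conversations.json") -> None:
        self.path = Path(path)
        self.history: Dict[str, List[dict]] = {}
        if self.path.exists():
            # Reload earlier sessions so context survives a restart.
            self.history = json.loads(self.path.read_text(encoding="utf-8"))

    def add_message(self, user_id: str, role: str, content: str) -> None:
        """Append one message to the user's history and persist it."""
        self.history.setdefault(user_id, []).append({"role": role, "content": content})
        self._save()

    def recent(self, user_id: str, limit: int = 10) -> List[dict]:
        """Return the last `limit` messages, e.g. as context for the LLM."""
        return self.history.get(user_id, [])[-limit:]

    def _save(self) -> None:
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.history, indent=2), encoding="utf-8")
```

Rewriting the whole file on each message is simple and adequate for a demo; a database would be a better fit under concurrent load.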
Uses LLM-based intelligent extraction (GPT-4o-mini) with regex fallback:
- Customer Name: Extracted from conversation context using NLP
- Service Type: Haircut, Coloring, Styling, Spa Treatment (handles multiple services)
- Stylist Preference: Riya, Maya, Sarah, Alex
- Date: Today, Tomorrow, or specific weekdays
- Time: Time slots from 10 AM to 7 PM
- Email: Validates and auto-completes domains (e.g., "gmail" → "gmail.com")
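The Date and Time fields above still have to be turned into a concrete slot. The sketch below shows one way to resolve "today", "tomorrow", or a weekday name and to check the slot against the salon hours (Monday-Saturday, 10 AM - 7 PM). The function names are hypothetical, and two behaviors are assumptions: a weekday name means its next occurrence strictly after today, and a slot starting exactly at 7 PM is rejected.

```python
from datetime import date, timedelta
from typing import Optional

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]
OPEN_HOUR, CLOSE_HOUR = 10, 19  # 10 AM - 7 PM


def resolve_date(text: str, today: date) -> Optional[date]:
    """Map 'today', 'tomorrow', or a weekday name to a calendar date."""
    word = text.strip().lower()
    if word == "today":
        return today
    if word == "tomorrow":
        return today + timedelta(days=1)
    if word in WEEKDAYS:
        # Days until the next occurrence of that weekday (1 to 7, never 0).
        delta = (WEEKDAYS.index(word) - today.weekday() - 1) % 7 + 1
        return today + timedelta(days=delta)
    return None


def is_bookable(day: date, hour: int) -> bool:
    """True if the slot starts inside opening hours on Monday-Saturday."""
    return day.weekday() != 6 and OPEN_HOUR <= hour < CLOSE_HOUR
```

Passing `today` explicitly, rather than calling `date.today()` inside the function, keeps the logic deterministic in tests.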
Key Features:
- Handles speech-to-text variations: "at the rate" → "@", "dot" → "."
- Accumulates information across conversation turns (uses last 10 messages for context)
- Smart email domain completion for incomplete addresses
- Robust to typos and speech recognition errors
- Structured JSON output with field validation
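The speech-variation handling and domain completion listed above are the kind of work a regex fallback could do. The following is a minimal sketch under stated assumptions: `normalize_spoken_email` is a hypothetical helper, and the `KNOWN_DOMAINS` table and the validation pattern are illustrative choices, not the project's actual rules.

```python
import re
from typing import Optional

# Bare provider names mapped to full domains; this table is an assumption.
KNOWN_DOMAINS = {"gmail": "gmail.com", "yahoo": "yahoo.com", "outlook": "outlook.com"}
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")


def normalize_spoken_email(text: str) -> Optional[str]:
    """Turn a transcribed address such as 'john at the rate gmail' into an email."""
    s = text.strip().lower()
    s = re.sub(r"\s+at the rate\s+", "@", s)  # spoken "@"
    s = re.sub(r"\s+dot\s+", ".", s)          # spoken "."
    s = s.replace(" ", "")
    if "@" not in s:
        return None
    local, _, domain = s.partition("@")
    # Complete a bare provider name, e.g. "gmail" -> "gmail.com".
    domain = KNOWN_DOMAINS.get(domain, domain)
    candidate = f"{local}@{domain}"
    return candidate if EMAIL_RE.match(candidate) else None
```

Returning `None` for anything that fails validation lets the assistant ask the customer to repeat the address instead of booking with a malformed one.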
Appointment Scheduling
- Generates unique appointment IDs
- Creates Google Meet links (demo format)
- Sends confirmation emails with appointment details
- Stores appointments in data/appointments.json
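A possible shape for the booking step is sketched below: assign an ID, attach a demo link, and append the record to the JSON store. `create_appointment`, the 8-character ID format, and the link pattern are all assumptions for illustration; the link is a placeholder and does not create a real Google Meet room.

```python
import json
import uuid
from pathlib import Path


def create_appointment(details: dict, store: str = "data/appointments.json") -> dict:
    """Assign an ID and a demo meeting link, then append the record to the store."""
    appointment_id = uuid.uuid4().hex[:8].upper()
    record = {
        **details,
        "appointment_id": appointment_id,
        # Placeholder link only; the format is hypothetical.
        "meeting_link": f"https://meet.google.com/demo-{appointment_id.lower()}",
    }
    path = Path(store)
    records = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    records.append(record)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return record
```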
To enable email confirmations:
- Use a Gmail account
- Generate an App Password:
  - Go to Google Account → Security → 2-Step Verification → App Passwords
  - Generate password for "Mail"
- Add to .env:
GMAIL_USER=your_email@gmail.com
GMAIL_PASSWORD=your_app_password
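With those credentials in place, sending a confirmation could look roughly like the sketch below. It assumes `GMAIL_USER` and `GMAIL_PASSWORD` are already loaded into the process environment and that Gmail's SMTP endpoint on port 587 with STARTTLS is used. The function names and the appointment dictionary keys are hypothetical.

```python
import os
from email.message import EmailMessage


def build_confirmation(appointment: dict, sender: str) -> EmailMessage:
    """Compose a plain-text confirmation email for one appointment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = appointment["email"]
    msg["Subject"] = (
        f"Appointment confirmed: {appointment['service']} with {appointment['stylist']}"
    )
    msg.set_content(
        f"Hi {appointment['name']},\n\n"
        f"Your {appointment['service']} with {appointment['stylist']} is booked for "
        f"{appointment['date']} at {appointment['time']}.\n"
        f"Meeting link: {appointment['meeting_link']}\n"
    )
    return msg


async def send_confirmation(appointment: dict) -> None:
    """Send the email through Gmail SMTP using the App Password."""
    import aiosmtplib  # third-party; imported here so build_confirmation works without it

    sender = os.environ["GMAIL_USER"]
    await aiosmtplib.send(
        build_confirmation(appointment, sender),
        hostname="smtp.gmail.com",
        port=587,
        start_tls=True,
        username=sender,
        password=os.environ["GMAIL_PASSWORD"],
    )
```

Separating message construction from delivery makes the email content easy to check without contacting an SMTP server.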
- ws://localhost:8000/ws/{client_id} - Real-time voice/text communication
- GET / - Health check
- POST /schedule-appointment - Direct appointment booking
- GET /appointments - List all appointments
- GET /conversation-history/{user_id} - Get user conversation history
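For quick manual checks of these endpoints, a small standard-library client can be enough. The sketch below is illustrative: the helper names are hypothetical, and the JSON body of `POST /schedule-appointment` is not documented here, so any payload fields you pass are assumptions to verify against `routes/appointments.py`.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def ws_url(client_id: str) -> str:
    """WebSocket URL for a given client."""
    return f"ws://localhost:8000/ws/{client_id}"


def build_booking_request(payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /schedule-appointment."""
    return urllib.request.Request(
        f"{BASE_URL}/schedule-appointment",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def list_appointments() -> object:
    """Call GET /appointments; requires the backend to be running."""
    with urllib.request.urlopen(f"{BASE_URL}/appointments", timeout=10) as resp:
        return json.load(resp)
```

A prepared request is sent with `urllib.request.urlopen(build_booking_request(payload))` once the backend is up.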
Modular Route Structure:
- Routes separated into dedicated files (appointments.py, conversation.py, websocket.py)
- Service injection pattern for dependency management
- Clean separation of concerns
Service Layer:
- VoiceService: Handles STT/TTS operations
- LLMService: Manages conversations AND metadata extraction
- MemoryService: Conversation context and persistence
- AppointmentService: Booking logic and email notifications
To add a service, edit backend/services/memory_service.py and extend the service keywords:
services = ["haircut", "coloring", "your_new_service"]
To add a stylist, add the name to the stylists list in memory_service.py and update the frontend display.
To change the voice, modify backend/services/voice_service.py:
voice="nova"  # Options: alloy, echo, fable, onyx, nova, shimmer
WebSocket Connection Issues:
- Ensure backend is running on port 8000
- Check firewall settings
- Verify OPENAI_API_KEY is set
Audio Not Playing:
- Check browser audio permissions
- Ensure audio format compatibility (wav for user, mp3 for AI)
Email Not Sending:
- Verify Gmail credentials in .env
- Check that the App Password was generated correctly
- Ensure 2FA is enabled on Gmail account
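Several of the checks above come down to missing configuration. A small helper like the one below can report which of the variables named in this README are unset. It is only a sketch: `missing_env_vars` is a hypothetical name, and it reads the process environment, so the values from .env must already be loaded or exported.

```python
import os
from typing import List, Mapping

REQUIRED_VARS = ("OPENAI_API_KEY", "GMAIL_USER", "GMAIL_PASSWORD")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    print("Missing: " + ", ".join(missing) if missing else "All required variables are set.")
```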
Name: Shryesth Pandey
Repository: speedchain-assignment