A real-time face recognition desktop application built with Python and PyQt6, designed for the School of Information Technology (SIT). Features photobooth functionality with frame overlays, voice greetings, and server integration for photo storage.
- Real-time Face Detection & Recognition - Uses InsightFace for accurate face detection and recognition
- Voice Greetings - Automatic offline TTS greetings when recognizing known faces
- Photobooth Mode - Capture photos with customizable PNG frame overlays
- Server Upload - Upload captured photos to a FastAPI server with QR code generation
- Live Camera Preview - High-quality 720p camera feed with face bounding boxes
- Countdown Overlay - Visual countdown timer displayed on the camera preview before capture
- QR Code Generation - Instant QR codes for uploaded photos for easy sharing
- Recognition Controls - Adjustable confidence threshold and toggle recognition on/off
- Server Status Panel - Monitor server connectivity with ping functionality
- Frame Selection - Choose from 4 customizable photobooth frames
- Precomputed Embeddings - Fast recognition using pre-generated face embeddings
- Multi-threaded Processing - Separate threads for camera, recognition, and TTS
- Clean Architecture - Strict layered architecture with one-directional dependencies
- Cross-platform - Works on Windows, macOS, and Linux
The application follows a strict 5-layer architecture with one-directional dependencies:
┌─────────────────────────────────────────────────────────────┐
│                          UI Layer                           │
│     PyQt6 widgets, main window, panels, camera preview      │
└─────────────────────────┬───────────────────────────────────┘
                          │ uses
┌─────────────────────────▼───────────────────────────────────┐
│                          App Layer                          │
│        Controllers, thread management, orchestration        │
└─────────────────────────┬───────────────────────────────────┘
                          │ uses
┌─────────────────────────▼───────────────────────────────────┐
│                        Domain Layer                         │
│     Face matching logic, business rules, domain models      │
└─────────────────────────┬───────────────────────────────────┘
                          │ uses
┌─────────────────────────▼───────────────────────────────────┐
│                    Infrastructure Layer                     │
│   Camera, InsightFace, TTS, HTTP client, frame compositor   │
└─────────────────────────┬───────────────────────────────────┘
                          │ uses
┌─────────────────────────▼───────────────────────────────────┐
│                         Data Layer                          │
│       Configuration, dataset loader, embeddings store       │
└─────────────────────────────────────────────────────────────┘

See PROJECT_STRUCTURE.md for detailed file organization.
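The one-directional rule can be made concrete with a small check. This is only a sketch: the layer names mirror the diagram, but `LAYERS` and `is_allowed_dependency` are hypothetical helpers, not part of the repository. It assumes a layer may use its own modules and any layer below it, never one above it.

```python
# Layers ordered from top (UI) to bottom (Data), as in the architecture diagram.
LAYERS = ["ui", "app", "domain", "infrastructure", "data"]


def is_allowed_dependency(source: str, target: str) -> bool:
    """Return True if code in `source` may depend on code in `target`.

    Assumption: dependencies may point to the same layer or to any
    lower layer, but never upward.
    """
    return LAYERS.index(source) <= LAYERS.index(target)


if __name__ == "__main__":
    print(is_allowed_dependency("ui", "app"))        # UI uses App
    print(is_allowed_dependency("data", "domain"))   # Data must not use Domain
```

A check like this could run in a test that walks the import statements of each package, which keeps the layering rule from eroding over time.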
- Python 3.10 or higher
- Webcam/Camera
- Windows/macOS/Linux
git clone https://github.com/PhilixTheExplorer/sit-face-recog.git
cd sit-face-recog

python -m venv venv
# Windows
venv\Scripts\activate
# Linux/Mac
source venv/bin/activate

pip install -r requirements.txt

Note: InsightFace is the primary face recognition library. If you have issues installing it, you can use face_recognition as a fallback by uncommenting those lines in requirements.txt.
Create a faces_dataset/ directory with subdirectories for each person:
faces_dataset/
├── Person1/
│   ├── photo1.jpg
│   ├── photo2.jpg
│   └── photo3.png
├── Person2/
│   ├── image1.jpg
│   └── image2.jpg
└── ... (add more people)

Tips for best results:
- Use 2-5 clear face images per person
- Images should have good lighting
- Face should be clearly visible and frontal
- Supported formats: .jpg, .jpeg, .png, .bmp
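To sanity-check a dataset before generating embeddings, a short script can list which images would be picked up for each person. The sketch below is an illustration under assumptions: `list_dataset_images` and `SUPPORTED_EXTENSIONS` are hypothetical names, and the project's own dataset loader may behave differently.

```python
from pathlib import Path

# Extensions listed as supported in the tips above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_dataset_images(root: Path) -> dict[str, list[Path]]:
    """Map each person's folder name to the supported image files inside it."""
    people: dict[str, list[Path]] = {}
    for person_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        images = sorted(
            f for f in person_dir.iterdir()
            if f.is_file() and f.suffix.lower() in SUPPORTED_EXTENSIONS
        )
        if images:  # skip folders that contain no usable images
            people[person_dir.name] = images
    return people


if __name__ == "__main__":
    dataset = Path("faces_dataset")
    if dataset.is_dir():
        for name, images in list_dataset_images(dataset).items():
            print(f"{name}: {len(images)} image(s)")
```

Running it from the project root prints one line per person, which makes it easy to spot folders with fewer than the recommended 2-5 images.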
Place 4 PNG frame templates with transparency in pb_frames/:
pb_frames/
├── frame_1.png
├── frame_2.png
├── frame_3.png
└── frame_4.png

Frame requirements:
- PNG format with transparency (alpha channel)
- Recommended resolution: 1280x720 or 1920x1080
- The transparent area is where the captured photo will appear
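To illustrate why the alpha channel matters, the sketch below applies standard "over" alpha compositing, pixel by pixel. It is not the project's frame compositor: `composite_pixel` and `composite` are hypothetical names, and pure Python lists stand in for image arrays for the sake of clarity. A real implementation would normally use an imaging library.

```python
def composite_pixel(frame_rgba, photo_rgb):
    """Blend one frame pixel over one photo pixel using the frame's alpha."""
    r, g, b, a = frame_rgba
    alpha = a / 255.0  # 0.0 = fully transparent frame, 1.0 = fully opaque
    return tuple(
        round(alpha * f + (1.0 - alpha) * p)
        for f, p in zip((r, g, b), photo_rgb)
    )


def composite(frame, photo):
    """Composite a frame (rows of RGBA tuples) over a same-sized photo (rows of RGB tuples)."""
    return [
        [composite_pixel(fp, pp) for fp, pp in zip(frame_row, photo_row)]
        for frame_row, photo_row in zip(frame, photo)
    ]


if __name__ == "__main__":
    frame = [[(255, 0, 0, 255), (0, 0, 0, 0)]]   # opaque red, then transparent
    photo = [[(10, 20, 30), (10, 20, 30)]]
    print(composite(frame, photo))
```

Where the frame is fully transparent the photo pixel passes through unchanged, and where it is fully opaque the frame pixel replaces it.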
python precompute_embeddings.py

This generates embeddings.npz containing face embeddings for all people in the dataset. Re-run this command whenever you add new faces to the dataset.
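Precomputed embeddings make recognition a matter of comparing vectors rather than re-processing dataset images. The sketch below shows one common way to do that comparison, cosine similarity with a threshold. It is an assumption, not the project's documented matching logic: `cosine_similarity` and `best_match` are hypothetical names, plain Python lists stand in for the stored arrays, and the default threshold of 0.5 simply mirrors the default `confidence_threshold`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, known, threshold=0.5):
    """Return (name, score) for the most similar known embedding.

    The name is None when no known embedding reaches the threshold.
    """
    best_name, best_score = None, -1.0
    for name, embedding in known.items():
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


if __name__ == "__main__":
    known = {"Alice": [1.0, 0.0], "Bob": [0.0, 1.0]}
    print(best_match([0.9, 0.1], known))
```

Raising the threshold reduces false matches at the cost of more "unknown" results, which is the trade-off the adjustable confidence threshold in the UI exposes.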
python main.py

In a separate terminal, start the FastAPI server for photo uploads:

python server.py

Or with uvicorn for development:
uvicorn server:app --reload --host 0.0.0.0 --port 8000

┌──────────────────────────────────────────────────────────────────────┐
│                     SIT Face Recognition System                      │
├─────────────────────────────────────────┬────────────────────────────┤
│                                         │  ┌────────────────────────┐│
│                                         │  │       Photobooth       ││
│                                         │  │  ┌────────┬────────┐   ││
│                                         │  │  │ Frame 1│ Frame 2│   ││
│           Live Camera Preview           │  │  ├────────┼────────┤   ││
│        (720p, 16:9 aspect ratio)        │  │  │ Frame 3│ Frame 4│   ││
│                                         │  │  └────────┴────────┘   ││
│         [Face boxes with names]         │  │  [📸 Capture Photo]    ││
│              [FPS counter]              │  └────────────────────────┘│
│     [Countdown overlay on capture]      │  ┌────────────────────────┐│
│                                         │  │      Last Upload       ││
│                                         │  │ [QR Code] [Thumbnail]  ││
│                                         │  └────────────────────────┘│
│                                         │  ┌────────────────────────┐│
│                                         │  │  Recognition Controls  ││
│                                         │  │ Threshold: [====] 50%  ││
│                                         │  │ Recognition: [ON/OFF]  ││
│                                         │  │ Detected: John Doe     ││
│                                         │  └────────────────────────┘│
│                                         │  ┌────────────────────────┐│
│                                         │  │     Server Status      ││
│                                         │  │ Status: Connected ✓    ││
│                                         │  │ [Ping Server]          ││
│                                         │  └────────────────────────┘│
├─────────────────────────────────────────┴────────────────────────────┤
│ Ready                                                                │
└──────────────────────────────────────────────────────────────────────┘

- Select a Frame - Click on one of the 4 frame thumbnails
- Click Capture - A 3-second countdown appears as an overlay on the camera
- Smile! - Photo is captured when countdown reaches zero
- Auto Upload - Photo is automatically composed with the frame and uploaded
- Share - Scan the QR code to view/download the photo
Note: Face recognition is automatically disabled during the countdown for a cleaner capture experience.
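The capture flow above can be summarized as an ordered sequence of events. The generator below is a sketch only: `capture_sequence` and the event names are hypothetical, and the point at which recognition is re-enabled (right after capture) is an assumption. In the actual PyQt6 application the countdown would more likely be driven by a timer such as `QTimer` than by a plain loop.

```python
from typing import Iterator, Optional, Tuple

Event = Tuple[str, Optional[object]]


def capture_sequence(countdown_seconds: int = 3) -> Iterator[Event]:
    """Yield the photobooth steps in order, one (event, value) pair each."""
    yield ("recognition_enabled", False)        # pause recognition for a clean preview
    for remaining in range(countdown_seconds, 0, -1):
        yield ("countdown", remaining)          # value shown in the overlay
    yield ("capture", None)                     # countdown reached zero
    yield ("recognition_enabled", True)         # assumed: resume right after capture
    yield ("compose_and_upload", None)          # apply frame, send to server
    yield ("show_qr_code", None)                # share link for the uploaded photo


if __name__ == "__main__":
    for event, value in capture_sequence():
        print(event, value)
```

Modeling the flow as data like this keeps the ordering easy to test without a camera or a running UI.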
| Endpoint | Method | Description |
|---|---|---|
| / | GET | Server info and available endpoints |
| /health | GET | Health check with timestamp |
| /photobooth/upload | POST | Upload photobooth image |
| /photobooth/list | GET | List all stored images |
| /photobooth/list/{person_name} | GET | List images for a specific person |
| /photobooth/stats | GET | Get storage statistics |
| /photobooth/image/{person_name}/{filename} | GET | Download specific image |
| /photobooth/gallery | GET | HTML gallery of all photos |
| /photobooth/gallery/{person_name} | GET | HTML gallery for specific person |
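When linking to photos from other tools, the path parameters need URL-encoding because person names and filenames may contain spaces. The helpers below are a hypothetical sketch built from the endpoint table and the default server address `http://localhost:8000`. They are not part of the repository, and the response formats of the endpoints are not assumed here.

```python
from typing import Optional
from urllib.parse import quote

BASE_URL = "http://localhost:8000"  # default server address; adjust to your setup


def image_url(person_name: str, filename: str, base_url: str = BASE_URL) -> str:
    """Build the download URL for one stored photobooth image."""
    person = quote(person_name, safe="")
    name = quote(filename, safe="")
    return f"{base_url.rstrip('/')}/photobooth/image/{person}/{name}"


def gallery_url(person_name: Optional[str] = None, base_url: str = BASE_URL) -> str:
    """Build the gallery URL for everyone, or for one person if given."""
    url = f"{base_url.rstrip('/')}/photobooth/gallery"
    if person_name is not None:
        url += "/" + quote(person_name, safe="")
    return url


if __name__ == "__main__":
    print(image_url("John Doe", "photo 1.png"))
    print(gallery_url("John Doe"))
```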
Edit src/data/config.py to customize settings:
@dataclass
class CameraConfig:
    device_index: int = 0      # Camera device (0, 1, 2...)
    frame_width: int = 1280    # Capture width
    frame_height: int = 720    # Capture height
    fps: int = 30              # Target FPS

@dataclass
class RecognitionConfig:
    confidence_threshold: float = 0.5  # Min confidence (0-1)
    min_face_size: int = 50            # Min face size in pixels
    detection_scale: float = 0.5       # Scale for faster detection

@dataclass
class GreetingConfig:
    enabled: bool = True
    cooldown_seconds: float = 15.0     # Re-greet cooldown
    greeting_template: str = "Hello {name}"

@dataclass
class ServerConfig:
    base_url: str = "http://localhost:8000"
    upload_endpoint: str = "/photobooth/upload"
    timeout_seconds: float = 10.0

- Check camera connection and permissions
- Try different device_index values (0, 1, 2...) in config
- On Linux, ensure your user is in the video group

- Run python precompute_embeddings.py to regenerate embeddings
- Ensure dataset images have clear, frontal faces
- Check that InsightFace or face_recognition is properly installed
- Try lowering the confidence threshold

- Ensure pyttsx3 is installed: pip install pyttsx3
- On Linux, install espeak: sudo apt-get install espeak
- On macOS, the built-in TTS should work automatically

- Ensure server is running: python server.py
- Check the base_url in config matches your server
- Verify firewall allows connections on port 8000
- For ngrok, update the URL in config after each restart

- Try: pip install insightface onnxruntime
- On Windows, you may need Visual C++ Build Tools
- Alternative: Use face_recognition library (see requirements.txt)
See PROJECT_STRUCTURE.md for the complete project structure and file descriptions.
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- InsightFace - Face detection and recognition
- PyQt6 - Desktop UI framework
- FastAPI - Server framework
- pyttsx3 - Offline text-to-speech