SIT Face Recognition System

A real-time face recognition desktop application built with Python and PyQt6, designed for the School of Information Technology (SIT). It includes a photobooth mode with frame overlays, voice greetings, and server integration for photo storage.

✨ Features

Core Features

Real-time Face Detection & Recognition - Uses InsightFace for accurate face detection and recognition
Voice Greetings - Automatic offline TTS greetings when recognizing known faces
Photobooth Mode - Capture photos with customizable PNG frame overlays
Server Upload - Upload captured photos to a FastAPI server with QR code generation
Live Camera Preview - High-quality 720p camera feed with face bounding boxes

UI Features

Countdown Overlay - Visual countdown timer displayed on the camera preview before capture
QR Code Generation - Instant QR codes for uploaded photos for easy sharing
Recognition Controls - Adjustable confidence threshold and toggle recognition on/off
Server Status Panel - Monitor server connectivity with ping functionality
Frame Selection - Choose from 4 customizable photobooth frames

Technical Features

Precomputed Embeddings - Fast recognition using pre-generated face embeddings
Multi-threaded Processing - Separate threads for camera, recognition, and TTS
Clean Architecture - Strict layered architecture with one-directional dependencies
Cross-platform - Works on Windows, macOS, and Linux
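
As an illustration of the multi-threaded design, the sketch below hands frames from a camera thread to a recognition thread through a bounded queue. The function and worker names are hypothetical and plain strings stand in for real frames; the actual thread layout in this project may differ.

```python
import queue
import threading


def run_pipeline(frames):
    """Pass frames from a 'camera' thread to a 'recognition' thread via a queue."""
    frame_queue = queue.Queue(maxsize=2)  # bounded, so the camera cannot run far ahead
    results = []
    stop = object()  # sentinel that tells the consumer to exit

    def camera_worker():
        for frame in frames:
            frame_queue.put(frame)  # blocks while the queue is full
        frame_queue.put(stop)

    def recognition_worker():
        while True:
            frame = frame_queue.get()
            if frame is stop:
                break
            results.append(f"processed {frame}")  # placeholder for real recognition

    threads = [
        threading.Thread(target=camera_worker),
        threading.Thread(target=recognition_worker),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A bounded queue is one simple way to keep a slow consumer from accumulating stale frames in memory; a real application might instead drop old frames and keep only the newest.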

🏗️ Architecture

The application follows a strict 5-layer architecture with one-directional dependencies:

┌─────────────────────────────────────────────────────────────┐
│                        UI Layer                             │
│   PyQt6 widgets, main window, panels, camera preview       │
└─────────────────────────┬───────────────────────────────────┘
                          │ uses
┌─────────────────────────▼───────────────────────────────────┐
│                       App Layer                             │
│   Controllers, thread management, orchestration            │
└─────────────────────────┬───────────────────────────────────┘
                          │ uses
┌─────────────────────────▼───────────────────────────────────┐
│                     Domain Layer                            │
│   Face matching logic, business rules, domain models       │
└─────────────────────────┬───────────────────────────────────┘
                          │ uses
┌─────────────────────────▼───────────────────────────────────┐
│                  Infrastructure Layer                       │
│   Camera, InsightFace, TTS, HTTP client, frame compositor  │
└─────────────────────────┬───────────────────────────────────┘
                          │ uses
┌─────────────────────────▼───────────────────────────────────┐
│                      Data Layer                             │
│   Configuration, dataset loader, embeddings store          │
└─────────────────────────────────────────────────────────────┘
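
The one-directional rule can be expressed as a small check. This is only a sketch: the lowercase layer names are hypothetical labels taken from the diagram, and it assumes a layer may use any layer below it (the diagram itself only shows each layer using the next one down).

```python
# Layer labels follow the diagram above, ordered from top to bottom.
LAYERS = ["ui", "app", "domain", "infrastructure", "data"]


def is_allowed_dependency(importer: str, imported: str) -> bool:
    """Return True if code in `importer` may depend on code in `imported`.

    Assumption: dependencies may point to the same layer or to any
    lower layer, and never upward.
    """
    return LAYERS.index(importer) <= LAYERS.index(imported)
```

For example, `is_allowed_dependency("ui", "app")` is allowed, while `is_allowed_dependency("data", "domain")` is not.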

See PROJECT_STRUCTURE.md for detailed file organization.

📋 Requirements

Python 3.10 or higher
Webcam/Camera
Windows/macOS/Linux

🚀 Installation

1. Clone the Repository

git clone https://github.com/PhilixTheExplorer/sit-face-recog.git
cd sit-face-recog

2. Create Virtual Environment (Recommended)

python -m venv venv

# Windows
venv\Scripts\activate

# Linux/macOS
source venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

Note: InsightFace is the primary face recognition library. If you have trouble installing it, you can use face_recognition as a fallback by uncommenting the corresponding lines in requirements.txt.

4. Prepare Face Dataset

Create a faces_dataset/ directory with subdirectories for each person:

faces_dataset/
├── Person1/
│   ├── photo1.jpg
│   ├── photo2.jpg
│   └── photo3.png
├── Person2/
│   ├── image1.jpg
│   └── image2.jpg
└── ... (add more people)

Tips for best results:

Use 2-5 clear face images per person
Images should have good lighting
Face should be clearly visible and frontal
Supported formats: .jpg, .jpeg, .png, .bmp
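
Before precomputing embeddings, it can help to verify the dataset layout. The following is a minimal sketch that walks a dataset folder laid out as described above and flags people with too few images; `scan_dataset` and `report_problems` are hypothetical helper names, not part of this project.

```python
from pathlib import Path

# Extensions listed in the tips above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def scan_dataset(root):
    """Map each person's folder name to a sorted list of image file names."""
    people = {}
    for person_dir in sorted(Path(root).iterdir()):
        if not person_dir.is_dir():
            continue  # ignore stray files next to the person folders
        people[person_dir.name] = sorted(
            p.name
            for p in person_dir.iterdir()
            if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
        )
    return people


def report_problems(people, minimum=2):
    """Return the names of people with fewer than `minimum` usable images."""
    return [name for name, images in people.items() if len(images) < minimum]
```

Calling `report_problems(scan_dataset("faces_dataset"))` would list anyone who has fewer than the recommended two images.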

5. Add Photobooth Frames

Place 4 PNG frame templates with transparency in pb_frames/:

pb_frames/
├── frame_1.png
├── frame_2.png
├── frame_3.png
└── frame_4.png

Frame requirements:

PNG format with transparency (alpha channel)
Recommended resolution: 1280x720 or 1920x1080
The transparent area is where the captured photo will appear
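
To show why the alpha channel matters, here is a pure-Python sketch of standard alpha blending of a frame over a photo. It is an illustration only: the project's frame compositor most likely uses an imaging library, and `blend_pixel` and `composite` are hypothetical names.

```python
def blend_pixel(frame_rgba, photo_rgb):
    """Alpha-blend one RGBA frame pixel over one RGB photo pixel."""
    r, g, b, a = frame_rgba
    alpha = a / 255.0  # 0.0 = fully transparent frame, 1.0 = fully opaque frame
    return tuple(
        round(f * alpha + p * (1.0 - alpha))
        for f, p in zip((r, g, b), photo_rgb)
    )


def composite(frame, photo):
    """Blend a frame over a photo; both are equal-sized 2D lists of pixels."""
    return [
        [blend_pixel(fp, pp) for fp, pp in zip(frame_row, photo_row)]
        for frame_row, photo_row in zip(frame, photo)
    ]
```

Where the frame's alpha is 0 the photo pixel passes through unchanged, and where it is 255 the frame pixel fully covers the photo.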

6. Precompute Face Embeddings

python precompute_embeddings.py

This generates embeddings.npz containing face embeddings for all people in the dataset. Re-run this command whenever you add new faces to the dataset.
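
The sketch below illustrates the general idea behind precomputed embeddings: each person's reference vector is computed once, and a live face is then matched by cosine similarity against the stored vectors. It makes several assumptions: the averaging step, the helper names, and the use of cosine similarity as the confidence score are illustrative, plain lists stand in for NumPy arrays, and the actual layout of embeddings.npz is not shown here.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def mean_embedding(embeddings):
    """Average several embeddings of one person, then normalize the result."""
    dim = len(embeddings[0])
    mean = [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
    return normalize(mean)


def best_match(query, store, threshold=0.5):
    """Return (name, score) for the closest stored embedding.

    The name is None when the best score falls below `threshold`.
    """
    query = normalize(query)
    best_name, best_score = None, -1.0
    for name, ref in store.items():
        # Dot product of unit vectors equals their cosine similarity.
        score = sum(q * r for q, r in zip(query, ref))
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score
```

Because the reference vectors are computed ahead of time, matching at runtime reduces to one dot product per known person.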

💻 Usage

Start the Desktop Application

python main.py

Start the Server (Optional)

In a separate terminal, start the FastAPI server for photo uploads:

python server.py

Or with uvicorn for development:

uvicorn server:app --reload --host 0.0.0.0 --port 8000

🖥️ User Interface

┌──────────────────────────────────────────────────────────────────────┐
│                    SIT Face Recognition System                        │
├─────────────────────────────────────────┬────────────────────────────┤
│                                         │  ┌────────────────────────┐│
│                                         │  │     Photobooth        ││
│                                         │  │  ┌────────┬────────┐  ││
│                                         │  │  │ Frame 1│ Frame 2│  ││
│       Live Camera Preview               │  │  ├────────┼────────┤  ││
│       (720p, 16:9 aspect ratio)         │  │  │ Frame 3│ Frame 4│  ││
│                                         │  │  └────────┴────────┘  ││
│       [Face boxes with names]           │  │  [📸 Capture Photo]   ││
│       [FPS counter]                     │  └────────────────────────┘│
│       [Countdown overlay on capture]    │  ┌────────────────────────┐│
│                                         │  │     Last Upload        ││
│                                         │  │  [QR Code] [Thumbnail] ││
│                                         │  └────────────────────────┘│
│                                         │  ┌────────────────────────┐│
│                                         │  │  Recognition Controls  ││
│                                         │  │  Threshold: [====] 50% ││
│                                         │  │  Recognition: [ON/OFF] ││
│                                         │  │  Detected: John Doe    ││
│                                         │  └────────────────────────┘│
│                                         │  ┌────────────────────────┐│
│                                         │  │    Server Status       ││
│                                         │  │  Status: Connected ✓   ││
│                                         │  │  [Ping Server]         ││
│                                         │  └────────────────────────┘│
├─────────────────────────────────────────┴────────────────────────────┤
│ Ready                                                                 │
└──────────────────────────────────────────────────────────────────────┘

Photobooth Workflow

1. Select a Frame - Click one of the 4 frame thumbnails
2. Click Capture - A 3-second countdown appears as an overlay on the camera preview
3. Smile! - The photo is captured when the countdown reaches zero
4. Auto Upload - The photo is automatically composited with the frame and uploaded
5. Share - Scan the QR code to view or download the photo

Note: Face recognition is automatically disabled during the countdown for a cleaner capture experience.
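
The countdown behaviour can be modelled as a small state object. This is a hedged sketch: `CaptureSession` is a hypothetical class, and re-enabling recognition once the photo is taken is an assumption, since only the pause during the countdown is documented.

```python
class CaptureSession:
    """Tracks a capture countdown and pauses recognition while it runs."""

    def __init__(self, seconds=3):
        self.seconds = seconds
        self.remaining = 0
        self.recognition_enabled = True

    def start(self):
        """Begin the countdown and pause recognition."""
        self.remaining = self.seconds
        self.recognition_enabled = False

    def tick(self):
        """Advance one second; return True when the photo should be taken."""
        if self.remaining == 0:
            return False  # no countdown in progress
        self.remaining -= 1
        if self.remaining == 0:
            self.recognition_enabled = True  # assumption: resume after capture
            return True
        return False
```

In a Qt application, `tick` would typically be driven by a one-second timer, with `remaining` rendered as the overlay text.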

🔌 API Endpoints (Server)

| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Server info and available endpoints |
| `/health` | GET | Health check with timestamp |
| `/photobooth/upload` | POST | Upload photobooth image |
| `/photobooth/list` | GET | List all stored images |
| `/photobooth/list/{person_name}` | GET | List images for a specific person |
| `/photobooth/stats` | GET | Get storage statistics |
| `/photobooth/image/{person_name}/{filename}` | GET | Download a specific image |
| `/photobooth/gallery` | GET | HTML gallery of all photos |
| `/photobooth/gallery/{person_name}` | GET | HTML gallery for a specific person |
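
A connectivity check against the `/health` endpoint could look like the sketch below, which uses only the standard library. `build_url` and `ping_server` are hypothetical helpers, and the shape of the response body is not assumed; only the HTTP status is inspected.

```python
import urllib.error
import urllib.request


def build_url(base_url, endpoint):
    """Join a base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + endpoint.lstrip("/")


def ping_server(base_url="http://localhost:8000", timeout=10.0):
    """Return True if GET /health answers with HTTP 200, otherwise False."""
    try:
        url = build_url(base_url, "/health")
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Covers refused connections, timeouts, HTTP errors, and malformed URLs.
        return False
```

Returning a plain boolean keeps the caller simple; a status panel could show "Connected" or "Disconnected" based on the result.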

⚙️ Configuration

Edit src/data/config.py to customize settings:

Camera Settings

from dataclasses import dataclass

@dataclass
class CameraConfig:
    device_index: int = 0        # Camera device (0, 1, 2...)
    frame_width: int = 1280      # Capture width
    frame_height: int = 720      # Capture height
    fps: int = 30                # Target FPS

Recognition Settings

@dataclass
class RecognitionConfig:
    confidence_threshold: float = 0.5   # Min confidence (0-1)
    min_face_size: int = 50             # Min face size in pixels
    detection_scale: float = 0.5        # Scale for faster detection
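
One common way such settings are used is to run detection on a downscaled frame and map the resulting boxes back to full resolution. The sketch below illustrates that pattern only; whether this project applies `detection_scale` and `min_face_size` in exactly this way is an assumption, and both helper names are hypothetical.

```python
def scale_box_to_full_frame(box, detection_scale=0.5):
    """Convert (x1, y1, x2, y2) from the downscaled frame to full-frame pixels."""
    return tuple(round(v / detection_scale) for v in box)


def is_large_enough(box, min_face_size=50):
    """Check that the shorter side of the box is at least `min_face_size` pixels."""
    x1, y1, x2, y2 = box
    return min(x2 - x1, y2 - y1) >= min_face_size
```

With a scale of 0.5, a box detected at (100, 50, 140, 100) on the small frame corresponds to (200, 100, 280, 200) on the full frame.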

Greeting Settings

@dataclass
class GreetingConfig:
    enabled: bool = True
    cooldown_seconds: float = 15.0      # Re-greet cooldown
    greeting_template: str = "Hello {name}"
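
The cooldown setting implies some per-person bookkeeping so that the same face is not greeted on every frame. A minimal sketch of that logic follows; `GreetingCooldown` is a hypothetical class, and the injectable `clock` parameter exists only to make the behaviour easy to test.

```python
import time


class GreetingCooldown:
    """Decides whether a recognized person should be greeted again."""

    def __init__(self, cooldown_seconds=15.0, template="Hello {name}",
                 clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self.template = template
        self.clock = clock
        self._last_greeted = {}  # name -> time of the most recent greeting

    def greeting_for(self, name):
        """Return the greeting text, or None if `name` was greeted too recently."""
        now = self.clock()
        last = self._last_greeted.get(name)
        if last is not None and now - last < self.cooldown_seconds:
            return None
        self._last_greeted[name] = now
        return self.template.format(name=name)
```

The returned string would then be passed to the TTS engine; a `None` result means the person stays silent until the cooldown expires.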

Server Settings

@dataclass
class ServerConfig:
    base_url: str = "http://localhost:8000"
    upload_endpoint: str = "/photobooth/upload"
    timeout_seconds: float = 10.0

🔧 Troubleshooting

Camera not detected

Check camera connection and permissions
Try different device_index values (0, 1, 2...) in config
On Linux, ensure your user is in the video group

Face recognition not working

Run python precompute_embeddings.py to regenerate embeddings
Ensure dataset images have clear, frontal faces
Check that InsightFace or face_recognition is properly installed
Try lowering the confidence threshold

TTS not speaking

Ensure pyttsx3 is installed: pip install pyttsx3
On Linux, install espeak: sudo apt-get install espeak
On macOS, the built-in TTS should work automatically

Server connection failed

Ensure server is running: python server.py
Check the base_url in config matches your server
Verify firewall allows connections on port 8000
If you expose the server through ngrok, update base_url in the config after each ngrok restart, since the tunnel URL typically changes

InsightFace installation issues

Try: pip install insightface onnxruntime
On Windows, you may need Visual C++ Build Tools
Alternative: Use face_recognition library (see requirements.txt)

📁 File Structure

See PROJECT_STRUCTURE.md for the complete project structure and file descriptions.

🤝 Contributing

1. Fork the repository
2. Create a feature branch: git checkout -b feature/amazing-feature
3. Commit changes: git commit -m 'Add amazing feature'
4. Push to branch: git push origin feature/amazing-feature
5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

InsightFace - Face detection and recognition
PyQt6 - Desktop UI framework
FastAPI - Server framework
pyttsx3 - Offline text-to-speech
