CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is the e6data Python Connector, a DB-API 2.0 compliant database connector for the e6data distributed SQL engine. The connector communicates with e6data clusters over gRPC and provides a SQLAlchemy dialect.

Key Features

DB-API 2.0 compliant interface
gRPC-based communication with SSL/TLS support
SQLAlchemy dialect integration
Blue-green deployment strategy support with automatic failover
Thread-safe and process-safe operation
Automatic retry and re-authentication logic

Common Development Commands

Building and Installing

# Install development dependencies
pip install -r requirements.txt

# Install the package in development mode
pip install -e .

# Build distribution packages
python setup.py sdist bdist_wheel

# Upload to PyPI (requires credentials)
twine upload dist/*

Running Tests

# Run tests using unittest (requires environment variables)
# Set these environment variables first:
# - ENGINE_IP: IP address of the e6data engine
# - DB_NAME: Database name
# - EMAIL: Your e6data email
# - PASSWORD: Access token from e6data console
# - CATALOG: Catalog name
# - PORT: Port number (default: 80)

# Run all tests
python -m unittest tests.py tests_grpc.py

# Run specific test file
python -m unittest tests.py
python -m unittest tests_grpc.py
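
Because the tests depend on the environment variables listed above, it can help to validate them before any test opens a connection. The following is a minimal sketch of such a check; the `EngineSettings` class and `load_settings` helper are hypothetical names introduced here for illustration and are not part of the repository.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EngineSettings:
    """Values read from the environment variables listed above."""
    engine_ip: str
    db_name: str
    email: str
    password: str
    catalog: str
    port: int = 80


def load_settings(env: Optional[Mapping[str, str]] = None) -> EngineSettings:
    """Build settings from the environment, failing fast if a variable is missing."""
    env = os.environ if env is None else env
    required = ("ENGINE_IP", "DB_NAME", "EMAIL", "PASSWORD", "CATALOG")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return EngineSettings(
        engine_ip=env["ENGINE_IP"],
        db_name=env["DB_NAME"],
        email=env["EMAIL"],
        password=env["PASSWORD"],
        catalog=env["CATALOG"],
        port=int(env.get("PORT", "80")),  # PORT falls back to the documented default
    )
```

Calling `load_settings()` with no argument reads `os.environ`; passing a plain dict makes the helper easy to exercise without touching the real environment.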

Protocol Buffer Compilation

# Install grpcio-tools (bundles the protobuf compiler and the gRPC Python plugin)
pip install grpcio-tools

# Regenerate gRPC code from proto files (if proto files change)
python -m grpc_tools.protoc -I. --python_out=e6data_python_connector/server --grpc_python_out=e6data_python_connector/server e6x_engine.proto
python -m grpc_tools.protoc -I. --python_out=e6data_python_connector/cluster_server --grpc_python_out=e6data_python_connector/cluster_server cluster.proto

Testing Blue-Green Strategy

# Start mock server (in one terminal)
python mock_grpc_server.py

# Run test client (in another terminal)
python test_mock_server.py

# Or use the convenience script
./run_mock_test.sh

Architecture Overview

Core Components

Connection Management (e6data_grpc.py)
- Main Connection class implementing DB-API 2.0 interface
- Handles gRPC channel creation (secure/insecure)
- Authentication using an email address and an access token (supplied as the password)
- Connection pooling and retry logic
Cursor Implementation (e6data_grpc.py)
- GRPCCursor class for query execution
- Supports parameterized queries using pyformat style
- Fetch operations: fetchone(), fetchmany(), fetchall(), fetchall_buffer()
- Query analysis with explain_analyse()
gRPC Services
- Query Engine Service (server/): Main query execution interface
- Cluster Service (cluster_server/): Cluster management operations
- Both use Protocol Buffers for message serialization
SQLAlchemy Integration (dialect.py)
- Custom dialect registered as e6data+e6data_python_connector
- Enables use with SQLAlchemy ORM and query builder
Type System
- typeId.py: Type mapping between e6data and Python types
- date_time_utils.py: Date/time handling utilities
- datainputstream.py: Binary data deserialization
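
Of the fetch operations listed above, a generator-style fetch such as fetchall_buffer() is the one that keeps memory use flat on large results. The sketch below illustrates the general idea only and is not the connector's implementation: `iter_rows`, `fetch_batch`, and `make_fake_source` are hypothetical names, and the fake source stands in for batches arriving from the server.

```python
from typing import Callable, Iterator, List, Optional, Sequence

Row = Sequence[object]


def iter_rows(fetch_batch: Callable[[], Optional[List[Row]]]) -> Iterator[Row]:
    """Yield rows one at a time, pulling a new batch only when the last one is used up.

    `fetch_batch` is a hypothetical callable returning the next batch of rows,
    or None / an empty list once the result set is exhausted.
    """
    while True:
        batch = fetch_batch()
        if not batch:
            return
        yield from batch


def make_fake_source(batches: List[List[Row]]) -> Callable[[], Optional[List[Row]]]:
    """Stand-in for a server stream: hands out pre-built batches, then None."""
    pending = iter(batches)
    return lambda: next(pending, None)


if __name__ == "__main__":
    for row in iter_rows(make_fake_source([[(1, "a"), (2, "b")], [(3, "c")]])):
        print(row)
```

Only one batch is held in memory at a time, which is the property that makes a generator preferable to fetchall() for large datasets.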

Key Design Patterns

Error Handling: Automatic retry with re-authentication for gRPC errors
Resource Management: Proper cleanup with the clear() and close() methods
Memory Efficiency: fetchall_buffer() returns a generator for large datasets
Security: SSL/TLS support for secure connections
Blue-Green Deployment:
- Automatic strategy detection and switching
- Graceful transitions without query interruption
- Thread-safe and process-safe strategy caching
- Handling of 456 errors raised on strategy mismatches
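
The error-handling pattern above (retry with re-authentication) can be sketched roughly as follows. This is an illustration under stated assumptions, not the connector's code: `AuthExpiredError` is a hypothetical stand-in for whichever gRPC error signals an expired session, and `call_with_reauth`, `operation`, and `authenticate` are names invented for this example.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class AuthExpiredError(Exception):
    """Hypothetical stand-in for a gRPC error signalling an expired session."""


def call_with_reauth(
    operation: Callable[[str], T],
    authenticate: Callable[[], str],
    session_id: str,
    max_retries: int = 3,
) -> T:
    """Run operation(session_id); on auth failure, refresh the session and retry."""
    for attempt in range(max_retries + 1):
        try:
            return operation(session_id)
        except AuthExpiredError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the original error
            session_id = authenticate()
    raise AssertionError("unreachable")  # the loop always returns or raises
```

Bounding the number of retries keeps a persistently failing login from looping forever, and re-raising the last error preserves the original failure for the caller.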

Configuration Options

The connector supports extensive gRPC configuration through grpc_options:

Message size limits
Keepalive settings
Timeout configurations
HTTP/2 ping settings

See TECH_DOC.md for detailed gRPC options documentation.
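
As a rough illustration of the four categories above, the sketch below merges user overrides into a set of gRPC channel arguments. The argument names are standard gRPC core channel arguments, but the default values, the `build_channel_args` helper, and the convention of accepting keys without the `grpc.` prefix are assumptions made for this example; TECH_DOC.md remains the reference for what grpc_options actually accepts. Timeout configuration is not shown here.

```python
from typing import Dict, List, Optional, Tuple

# Illustrative defaults only; these values are assumptions, not the connector's own.
DEFAULT_CHANNEL_ARGS: Dict[str, int] = {
    "grpc.max_receive_message_length": -1,   # message size limit (-1 means unlimited)
    "grpc.max_send_message_length": 300 * 1024 * 1024,
    "grpc.keepalive_time_ms": 30_000,        # keepalive: interval between pings
    "grpc.keepalive_timeout_ms": 10_000,     # keepalive: wait for a ping ack
    "grpc.http2.max_pings_without_data": 0,  # HTTP/2 pings: 0 means no limit
}


def build_channel_args(overrides: Optional[Dict[str, int]] = None) -> List[Tuple[str, int]]:
    """Merge overrides into the defaults and return (name, value) pairs."""
    merged = dict(DEFAULT_CHANNEL_ARGS)
    for key, value in (overrides or {}).items():
        # Accept both "keepalive_time_ms" and "grpc.keepalive_time_ms" (assumed convention).
        full_key = key if key.startswith("grpc.") else "grpc." + key
        merged[full_key] = value
    return sorted(merged.items())


if __name__ == "__main__":
    for name, value in build_channel_args({"keepalive_timeout_ms": 900_000}):
        print(name, value)
```

A list of `(name, value)` pairs in this shape is what the `options` parameter of `grpc.insecure_channel` and `grpc.secure_channel` expects.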

Important Notes

Always use environment variables for credentials in tests
The connector requires network access to e6data clusters
Port 80 (the default) must be open for inbound connections on the engine cluster
Tests require a running e6data cluster with valid credentials
When modifying proto files, regenerate the Python code
Follow DB-API 2.0 specification for any API changes
The blue-green strategy is handled automatically; no code changes are required
All API responses now include an optional new_strategy field
Strategy transitions take effect after query completion (on clear/cancel)
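
The last two notes describe a deferred switch: a response may announce a new strategy, but it only takes effect once the current query has been cleared or cancelled. A minimal sketch of that bookkeeping follows; the `StrategyState` class and its method names are hypothetical, and the values "blue" and "green" are assumed from the feature's name.

```python
from typing import Optional


class StrategyState:
    """Tracks the active strategy plus a pending one announced by the server."""

    def __init__(self, active: str) -> None:
        self.active = active
        self.pending: Optional[str] = None

    def on_response(self, new_strategy: Optional[str]) -> None:
        """Record a new_strategy value from a response without switching yet."""
        if new_strategy and new_strategy != self.active:
            self.pending = new_strategy

    def on_query_finished(self) -> None:
        """Apply the pending strategy once the query is cleared or cancelled."""
        if self.pending is not None:
            self.active, self.pending = self.pending, None


if __name__ == "__main__":
    state = StrategyState("blue")
    state.on_response("green")   # announced mid-query; still using "blue"
    print(state.active, state.pending)
    state.on_query_finished()    # the switch happens here
    print(state.active, state.pending)
```

Keeping the pending value separate from the active one is what lets an in-flight query finish against the strategy it started with.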

Blue-Green Deployment Strategy

The connector automatically handles blue-green deployments:

Initial Detection: On the first connection, the connector tries both strategies
Header Injection: Adds a "strategy" header to all gRPC requests
Graceful Transition: Running queries complete with the old strategy
Automatic Failover: Handles 456 errors by retrying with a different strategy
Caching: The detected strategy is cached with a 5-minute timeout for performance

See BLUE_GREEN_STRATEGY.md for detailed documentation.
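
To make the detection, header, failover, and caching steps above concrete, here is a simplified sketch. It is not the connector's implementation: `StrategyMismatch` is a hypothetical stand-in for the 456 error, `StrategyCache` and `call_with_strategy` are invented names, the values "blue" and "green" are assumed, and the cache shown is thread-safe only (sharing it across processes would need additional machinery).

```python
import threading
import time
from typing import Callable, List, Optional, Tuple, TypeVar

T = TypeVar("T")
Metadata = Tuple[Tuple[str, str], ...]
CACHE_TTL_SECONDS = 5 * 60  # the 5-minute timeout mentioned above


class StrategyMismatch(Exception):
    """Hypothetical stand-in for the 456 error returned on a strategy mismatch."""


class StrategyCache:
    """Thread-safe holder for the last strategy that worked, with expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._lock = threading.Lock()
        self._clock = clock
        self._value: Optional[str] = None
        self._stored_at = 0.0

    def get(self) -> Optional[str]:
        with self._lock:
            expired = self._clock() - self._stored_at > CACHE_TTL_SECONDS
            return None if self._value is None or expired else self._value

    def set(self, value: str) -> None:
        with self._lock:
            self._value, self._stored_at = value, self._clock()


def call_with_strategy(cache: StrategyCache, rpc: Callable[[Metadata], T]) -> T:
    """Call rpc(metadata), trying the cached strategy first and the other on mismatch."""
    cached = cache.get()
    candidates: List[str] = [cached] if cached else []
    candidates += [s for s in ("blue", "green") if s != cached]
    for strategy in candidates:
        try:
            result = rpc((("strategy", strategy),))  # header injected as metadata
        except StrategyMismatch:
            continue  # failover: try the next candidate
        cache.set(strategy)
        return result
    raise StrategyMismatch("no strategy was accepted by the server")
```

Injecting the clock makes the expiry logic testable without waiting five minutes, and caching only a strategy that succeeded keeps a stale value from being reinforced.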
