nebula-cu/An-Expert-In-Residence
Barebones OS Parameter Optimizer

A modular framework for optimizing Linux kernel scheduler and system parameters through automated tuning, with strategies ranging from fixed baselines to LLM-based tuners.

Overview

This framework optimizes Linux kernel scheduler parameters (e.g., min_granularity_ns, latency_ns, wakeup_granularity_ns) to improve database workload performance. It supports multiple tuning strategies:

  • Fixed Tuner: Uses fixed parameter values (baseline)
  • LLM Tuner: Uses Large Language Models (Gemini) for parameter suggestions
  • Human Tuner: Interactive tuning with human input
  • Q-Learning Tuner: Reinforcement learning with Q-tables
  • Bayesian Optimizer: SMAC3-based Bayesian optimization
  • DQN Tuner: Deep Q-Network for parameter tuning
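
All six strategies plug into the same optimization loop, so each tuner exposes the same small interface: suggest parameter values, then observe the resulting metric. The sketch below is illustrative only; the class and method names are assumptions, not the actual API in src/barebones_optimizer/tuners.py.

```python
# Hypothetical sketch of a common tuner interface; names are illustrative,
# not the actual classes in src/barebones_optimizer/tuners.py.

class FixedTuner:
    """Baseline tuner: always suggests the same parameter values."""

    def __init__(self, fixed_values):
        self.fixed_values = dict(fixed_values)

    def suggest(self, history):
        # Ignores history; returns the fixed baseline every iteration.
        return dict(self.fixed_values)

    def observe(self, params, metric):
        # The baseline tuner has no state to update.
        pass


tuner = FixedTuner({"min_granularity_ns": 3_000_000})
print(tuner.suggest(history=[]))  # {'min_granularity_ns': 3000000}
```

The LLM, Q-learning, Bayesian, and DQN tuners would differ only in how `suggest` and `observe` are implemented, which is what makes them interchangeable behind a single config key.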

Repository Structure

os-param-tuning/
├── src/barebones_optimizer/      # Main optimizer code
│   ├── main.py                   # Entry point
│   ├── optimizer.py              # Optimization loop
│   ├── config.py                 # Configuration management
│   ├── tuners.py                 # Tuner implementations
│   ├── parameter_manager.py      # OS parameter management
│   └── benchmarks/               # Benchmark implementations
│       ├── benchbase.py         # BenchBase/TPCC benchmark
│       └── benchmark_registry.py # Benchmark registry
├── config/
│   ├── barebones_optimizer/      # Optimizer configuration files
│   └── benchbase/
│       └── postgres/
│           └── sample_tpcc_config.xml  # BenchBase TPCC config
├── scripts/
│   ├── run_all_benchmarks.sh     # Run all benchmarks script
│   ├── plot_parameter_comparison.py  # Parameter comparison plots
│   └── plot_violin_p99_comparison.py # Violin plot comparisons
└── results/                      # Output directory (created at runtime)

Prerequisites

  1. System Requirements:

    • Linux with kernel 5.15 (e.g., Ubuntu 22.04)
    • Python 3.8+
    • sudo access (required for setting kernel parameters)
    • PostgreSQL database server
  2. Python Dependencies:

    pip install -r requirements.txt
  3. Install the Package (Optional but recommended):

    # Install in development mode (no PYTHONPATH needed)
    pip install -e .
    
    # Or install system-wide (requires sudo)
    sudo pip install -e .

    If you don't install the package, you'll need to set PYTHONPATH when running commands (see below).

  4. BenchBase Setup:

    • BenchBase JAR file must be built and available at deps/benchbase/target/benchbase-postgres/benchbase.jar
    • PostgreSQL database must be running and accessible
  5. Google API Key (for LLM tuner):

    • Set GEMINI_API_KEY environment variable, or
    • Set llm_api_key in config files
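
The two options above imply a lookup order: the environment variable wins, and the config file is the fallback. A minimal sketch of that precedence, assuming the config key is `llm_api_key` as documented (the helper name is illustrative, not the framework's actual function):

```python
import json
import os

def resolve_api_key(config_path):
    """Resolve the Gemini API key: GEMINI_API_KEY env var first,
    then the llm_api_key field of the JSON config file.

    Illustrative helper; the framework's actual lookup lives in its
    config-loading code and may differ in detail.
    """
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        return key
    with open(config_path) as f:
        config = json.load(f)
    return config.get("llm_api_key")
```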

Quick Start

1. Setup Environment

Simply run the setup script:

sudo ./scripts/setup.sh

and install the package:

pip install -e .

2. Run a Single Configuration

If you installed the package (recommended):

# Run with a specific config file
sudo python3 -m barebones_optimizer.main \
    -c config/barebones_optimizer/tpcc_default.json

# Run with LLM tuner
sudo python3 -m barebones_optimizer.main \
    -c config/barebones_optimizer/tpcc_llm_p99_min_granularity_only_flash_lite.json

If you didn't install the package, set PYTHONPATH:

# Run with a specific config file
sudo PYTHONPATH=$(pwd)/src:$PYTHONPATH python3 -m barebones_optimizer.main \
    -c config/barebones_optimizer/tpcc_default.json

# Run with LLM tuner
sudo PYTHONPATH=$(pwd)/src:$PYTHONPATH python3 -m barebones_optimizer.main \
    -c config/barebones_optimizer/tpcc_llm_p99_min_granularity_only_flash_lite.json

3. Run All Benchmarks

# Run all configurations (excluding human tuner)
sudo ./scripts/run_all_benchmarks.sh

This script will:

  • Run all benchmark configurations in order
  • Log all output to results/run_all_benchmarks_YYYYMMDD_HHMMSS.log

Configuration Files

All configuration files are in config/barebones_optimizer/. Each file specifies:

  • Benchmark: Currently only tpcc (BenchBase TPCC)
  • Tuner Type: fixed, llm, human, qlearning, bayesian, dqn
  • Parameter Ranges: Tunable parameters and their valid ranges
  • Fixed Parameters: Parameters that are set once at start and never changed
  • Optimization Goal: minimize or maximize a specific metric
  • Iterations: Number of optimization iterations
  • Window Duration: Duration of each benchmark window in seconds

Example Configuration

{
  "benchmark": "tpcc",
  "pin_to_cores": "0-3",
  "benchbase_jar_path": "deps/benchbase/target/benchbase-postgres/benchbase.jar",
  "benchbase_config_file": "config/benchbase/postgres/sample_tpcc_config.xml",
  "tuner_type": "llm",
  "parameter_ranges": {
    "min_granularity_ns": [100000, 50000000]
  },
  "fixed_parameters": {
    "latency_ns": 1000,
    "wakeup_granularity_ns": 500000
  },
  "optimization_metric": "p_99_latency",
  "optimization_goal": "minimize",
  "max_iterations": 200,
  "window_duration": 8,
  "results_dir": "results",
  "llm_model_name": "gemini-2.5-flash-lite"
}
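
A config like the one above can be loaded and sanity-checked in a few lines. This is an illustrative sketch only; src/barebones_optimizer/config.py is authoritative for what is actually validated.

```python
import json

def load_optimizer_config(path):
    """Load an optimizer config and sanity-check the fields shown above.

    Illustrative validation only; the framework's config.py is the
    authoritative loader and may check more (or different) invariants.
    """
    with open(path) as f:
        cfg = json.load(f)
    # Each tunable parameter maps to a [min, max] range.
    for name, (lo, hi) in cfg.get("parameter_ranges", {}).items():
        assert lo < hi, f"{name}: range must be [min, max] with min < max"
    assert cfg.get("optimization_goal") in ("minimize", "maximize")
    return cfg
```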

Adjusting BenchBase Parameters

The BenchBase TPCC configuration is in config/benchbase/postgres/sample_tpcc_config.xml. You can modify:

Database Connection

<type>POSTGRES</type>
<url>jdbc:postgresql://localhost:5432/benchbase?sslmode=disable</url>
<username>admin</username>
<password>password</password>

Workload Configuration

<!-- Scale factor (number of warehouses) -->
<scalefactor>4</scalefactor>

<!-- Number of client terminals -->
<terminals>16</terminals>

<!-- Workload specification -->
<works>
    <work>
        <time>60000</time>        <!-- Duration in seconds -->
        <rate>1100</rate>         <!-- Target transaction rate -->
        <weights>45,43,4,4,4</weights>  <!-- Transaction type weights -->
    </work>
</works>

Dynamic Workload Changes

You can also specify workload changes in the optimizer config file:

{
  "changes": {
    "75": [["rate", "300"]],
    "150": [["rate", "1100"]]
  }
}

Here, the transaction rate is changed to 300 at iteration 75 and back to 1100 at iteration 150. (Comments are shown outside the JSON because JSON itself does not allow them.)

Supported changes:

  • rate: Transaction rate (transactions per second)
  • terminals: Number of client terminals
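
The "changes" mapping keys iterations (as strings, since JSON object keys are strings) to lists of [key, value] pairs. A sketch of how the optimizer loop might look up the changes due at each iteration (the helper name is an assumption, not the framework's actual function):

```python
def changes_for_iteration(changes, iteration):
    """Return the (key, value) workload changes scheduled for an iteration.

    `changes` mirrors the config's "changes" mapping: iteration number
    (as a string, per JSON) -> list of [key, value] pairs.
    Illustrative sketch only.
    """
    return [tuple(pair) for pair in changes.get(str(iteration), [])]


changes = {"75": [["rate", "300"]], "150": [["rate", "1100"]]}
```

At iterations with no scheduled change, the lookup simply returns an empty list and the workload runs unmodified.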

Recreating Results

1. Run All Configurations

sudo ./scripts/run_all_benchmarks.sh

This will generate history files in results/ with names like:

optimization_history_tpcc_YYYYMMDD_HHMMSS.json

2. Generate Comparison Plots

# Parameter comparison plot (parameter values over iterations)
python3 scripts/plot_parameter_comparison.py \
    -d results \
    --output parameter_comparison.png

# Parameter comparison with iteration filtering
python3 scripts/plot_parameter_comparison.py \
    -d results \
    --start 10 --end 100 \
    --output parameter_comparison_filtered.png

# Parameter comparison with custom axis limits
python3 scripts/plot_parameter_comparison.py \
    -d results \
    --ylim 0.1 100 --xlim 0 200 \
    --output parameter_comparison_custom.png

# Violin plot comparison (P99 latency distributions)
python3 scripts/plot_violin_p99_comparison.py \
    -d results \
    --output violin_p99_comparison.png

# Violin plot with iteration filtering
python3 scripts/plot_violin_p99_comparison.py \
    -d results \
    --start 10 --end 100 --trim \
    --output violin_p99_comparison_filtered.png

Parameter Behavior

Fixed Parameters

Fixed parameters are set once at the start of optimization and never changed during tuning. They override default OS values.

Example:

{
  "fixed_parameters": {
    "latency_ns": 1000,
    "wakeup_granularity_ns": 500000
  }
}

Tunable Parameters

Only parameters in parameter_ranges are tuned. All other parameters remain at their fixed or default values.

Auto-Synchronization

When min_granularity_ns is set, wakeup_granularity_ns is automatically synchronized to the same value. This happens regardless of whether wakeup_granularity_ns is fixed or tunable.
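
The rule is small enough to state as code. A sketch, assuming the synchronization happens on the candidate parameter dict before it is applied (the function name is illustrative):

```python
def apply_with_sync(params):
    """Mirror the auto-synchronization rule: whenever min_granularity_ns
    is set, force wakeup_granularity_ns to the same value, overriding any
    fixed or tuned value it had. Illustrative sketch only."""
    params = dict(params)
    if "min_granularity_ns" in params:
        params["wakeup_granularity_ns"] = params["min_granularity_ns"]
    return params
```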

Parameter Restoration

On exit (normal, interrupt, or error), all parameters are automatically restored to OS defaults:

  • latency_ns: 24000000 (24ms)
  • min_granularity_ns: 3000000 (3ms)
  • wakeup_granularity_ns: 500000 (0.5ms)
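
Restoring on every exit path is the kind of guarantee a try/finally gives. The sketch below shows one way to get that behavior; the real parameter_manager.py may use signal handlers or atexit instead, and `set_param` stands in for the actual kernel-parameter writer (which needs sudo).

```python
# Defaults from the table above.
OS_DEFAULTS = {
    "latency_ns": 24_000_000,
    "min_granularity_ns": 3_000_000,
    "wakeup_granularity_ns": 500_000,
}

def run_with_restore(set_param, optimize):
    """Run `optimize`, then restore defaults even on error or interrupt.

    `set_param(name, value)` is a hypothetical stand-in for the actual
    kernel-parameter writer; this is a sketch, not the framework's code.
    """
    try:
        optimize()
    finally:
        # Runs on normal return, exceptions, and KeyboardInterrupt alike.
        for name, value in OS_DEFAULTS.items():
            set_param(name, value)
```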

Output Files

History Files

Each run generates a history file in results/:

  • Format: optimization_history_{benchmark}_{timestamp}.json
  • Contains: Complete optimization history, configuration, and results
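
Since the history is plain JSON, post-hoc analysis is a short script. The sketch below assumes a top-level "history" list whose entries carry the configured metric; the exact schema is defined by the optimizer and may differ.

```python
import json

def metric_series(history_path):
    """Extract the per-iteration optimization metric from a history file.

    Assumes (illustratively) a top-level "optimization_metric" field and a
    "history" list of per-iteration records keyed by that metric; check an
    actual history file for the real schema.
    """
    with open(history_path) as f:
        data = json.load(f)
    metric = data.get("optimization_metric", "p_99_latency")
    return [entry[metric] for entry in data.get("history", [])]
```

This is essentially what the plotting scripts in scripts/ do before drawing the comparison and violin plots.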

Log Files

  • Console output: Real-time logging to stdout
  • Log file: barebones_optimizer.log (if configured)
  • Batch run log: results/run_all_benchmarks_YYYYMMDD_HHMMSS.log

Citation

If you use this framework in your research, please cite:

@inproceedings{expertinresidence2025,
  title={An expert in residence: LLM agents for always-on operating system tuning},
  author={Liargkovas, Georgios and Jabrayilov, Vahab and Franke, Hubertus and Kaffes, Kostis},
  booktitle={Machine Learning for Systems 2025},
  year={2025}
}

License

MIT License. Please refer to LICENSE for details.

Contact

Georgios Liargkovas gl2902@columbia.edu