DeepDriveSim
Deep learning-driven Adaptive Simulations
DeepDriveSim is a toolkit developed by Brookhaven National Laboratory (BNL) and the RADICAL Laboratory at Rutgers University, in collaboration with Argonne National Laboratory. It implements an AI-steered ensemble simulation workflow in which deep learning models guide and optimize simulations in real time.
Features
- Adaptive Simulation Management: Dynamically manages molecular simulations based on ML predictions
- Active Learning Loop: Implements a simulation → training → prediction → cancellation → re-submission cycle
- Multiple Execution Backends: Supports local execution, RHAPSODY (HPC), and Dragon distributed computing
- Resource-Aware Scheduling: Automatically balances resources between simulations and training
- GPU Support: Automatic GPU detection and utilization
- Extensible Architecture: Easy to customize for different simulation types and ML models
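
The active learning loop listed above can be illustrated with a small, self-contained sketch built on `asyncio` alone. It does not use DeepDriveSim itself: `simulate`, `train`, `predict`, and `adaptive_loop` are hypothetical stand-ins, and the scoring rule (even ids score high, odd ids score low) is an assumption chosen only to make the cancellation path visible.

```python
import asyncio
from collections import deque


async def simulate(sim_id, steps=50):
    """Stand-in simulation: yields to the event loop once per step."""
    for _ in range(steps):
        await asyncio.sleep(0)
    return sim_id


def train(model, finished):
    """Stand-in training: record how many finished results the model has seen."""
    model["seen"] = len(finished)


def predict(model, sim_id):
    """Stand-in prediction: even ids score high, odd ids score low."""
    return 0.9 if sim_id % 2 == 0 else 0.1


async def adaptive_loop(sim_ids, max_sim_batch=2, prediction_threshold=0.5):
    queue = deque(sim_ids)      # simulation inputs waiting for a free slot
    running = {}                # sim_id -> asyncio.Task
    finished, canceled = [], []
    model = {"seen": 0}

    while queue or running:
        # Submission / re-submission: fill free slots from the queue.
        while queue and len(running) < max_sim_batch:
            sim_id = queue.popleft()
            running[sim_id] = asyncio.create_task(simulate(sim_id))

        # Simulation: let the running tasks advance.
        await asyncio.sleep(0)

        # Collect simulations that ran to completion.
        for sim_id, task in list(running.items()):
            if task.done():
                finished.append(task.result())
                del running[sim_id]

        # Training, then prediction and cancellation of unpromising runs.
        train(model, finished)
        for sim_id, task in list(running.items()):
            if predict(model, sim_id) < prediction_threshold:
                task.cancel()
                canceled.append(sim_id)
                del running[sim_id]

    return finished, canceled


if __name__ == "__main__":
    finished, canceled = asyncio.run(adaptive_loop(range(6)))
    print("finished:", finished)
    print("canceled:", canceled)
```

In this sketch a canceled simulation frees its slot, so the next queued input is submitted on the following pass. That is the re-submission step of the cycle; the real toolkit delegates task execution to its backend rather than to bare `asyncio` tasks.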
Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                        DDMD Manager                         │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │ Simulation  │  │  Training   │  │     Prediction      │  │
│  │   Queue     │──│   Module    │──│       Module        │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
│         │                │                    │             │
│         ▼                ▼                    ▼             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │               ROSE / RADICAL-AsyncFlow                │  │
│  │            (Execution Backend Abstraction)            │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
```
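
The bottom layer of the diagram is an execution backend abstraction: the modules above it submit work through a common interface instead of targeting one executor. The sketch below illustrates that idea with the standard library's `concurrent.futures.Executor` interface only. It is not ROSE or RADICAL-AsyncFlow code; `InlineExecutor`, `run_stage`, and `square` are hypothetical names introduced for illustration.

```python
from concurrent.futures import Executor, Future, ThreadPoolExecutor


class InlineExecutor(Executor):
    """Hypothetical backend that runs each task immediately in the caller's thread."""

    def submit(self, fn, /, *args, **kwargs):
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future


def square(x: int) -> int:
    """Hypothetical unit of work standing in for a simulation or training task."""
    return x * x


def run_stage(backend: Executor, inputs: list[int]) -> list[int]:
    """Submit work through the Executor interface only; the caller picks the backend."""
    futures = [backend.submit(square, x) for x in inputs]
    return [future.result() for future in futures]


if __name__ == "__main__":
    # The same stage runs unchanged on two different backends.
    print(run_stage(InlineExecutor(), [1, 2, 3]))
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(run_stage(pool, [1, 2, 3]))
```

Because `run_stage` depends only on the interface, switching backends is a change at the call site, which is the property the diagram attributes to the backend layer.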
1. Basic Usage with DummyWorkflow (for testing)
```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

from radical.asyncflow import ConcurrentExecutionBackend, WorkflowEngine

from ddmd import DummyWorkflow


async def main():
    # Create the execution backend (a local thread pool) and the workflow engine
    backend = await ConcurrentExecutionBackend(ThreadPoolExecutor())
    asyncflow = await WorkflowEngine.create(backend)

    # Initialize the workflow
    workflow = DummyWorkflow(
        asyncflow=asyncflow,
        max_sim_batch=4,
        training_cores=1,
        num_files=10,
    )

    # Run the adaptive learning loop, then shut the workflow down
    await workflow.start()
    await workflow.close()


asyncio.run(main())
```
2. Creating a Custom Workflow
Extend DDSimManager to create your own workflow:
```python
from ddmd import DDSimManager


class MyWorkflow(DDSimManager):
    def __init__(self, asyncflow, **kwargs):
        super().__init__(asyncflow)
        # Your initialization code
        self._register_learner_tasks()

    def _register_learner_tasks(self):
        @self.learner.simulation_task(as_executable=False)
        async def simulation(*args, **kwargs):
            # Your simulation logic
            pass

        self.simulation = simulation

    def stop_simulation(self, prediction):
        # Return True to cancel simulation based on prediction
        return prediction < 0.5

    async def init_sim_queue(self):
        # Populate self.sim_task_queue with simulation inputs
        pass

    async def check_train_data(self):
        # Return True when ready to start training
        return True

    async def train_model(self):
        # Your training logic
        pass

    async def clean_sim_data(self, sim_ind):
        # Cleanup files for canceled simulations
        pass
```
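
As an illustration of what the hook bodies might contain, the sketch below implements file-based versions of `check_train_data`, `clean_sim_data`, and `stop_simulation`. They are written as free functions so the example runs without `ddmd`. The `data_dir`, `min_files`, and `threshold` parameters and the `sim_<index>.npy` file naming are assumptions made for this example, not part of the toolkit; in a subclass the same logic would live in the corresponding methods.

```python
import asyncio
import tempfile
from pathlib import Path


async def check_train_data(data_dir: Path, min_files: int) -> bool:
    """Ready to train once at least `min_files` simulation outputs exist."""
    return sum(1 for _ in data_dir.glob("sim_*.npy")) >= min_files


async def clean_sim_data(data_dir: Path, sim_ind: int) -> None:
    """Remove the output file of a canceled simulation, if one was written."""
    (data_dir / f"sim_{sim_ind}.npy").unlink(missing_ok=True)


def stop_simulation(prediction: float, threshold: float = 0.5) -> bool:
    """Cancel a simulation when its predicted score falls below the threshold."""
    return prediction < threshold


async def demo() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp)
        for sim_ind in range(3):
            (data_dir / f"sim_{sim_ind}.npy").touch()

        print("ready:", await check_train_data(data_dir, min_files=3))
        if stop_simulation(prediction=0.2):
            await clean_sim_data(data_dir, sim_ind=1)
        print("ready after cleanup:", await check_train_data(data_dir, min_files=3))


if __name__ == "__main__":
    asyncio.run(demo())
```

Using `unlink(missing_ok=True)` keeps the cleanup idempotent, which matters if a simulation is canceled before it has written any output.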
Configuration Options
| Parameter | Description | Default |
|---|---|---|
| `max_sim_batch` | Maximum concurrent simulations | 4 |
| `training_cores` | CPU cores reserved for training | 1 |
| `training_threshold` | Accuracy threshold for training | 0.5 |
| `prediction_threshold` | Score threshold for cancellation | 0.5 |
| `force_start_training` | Skip waiting for data threshold | False |
| `clean_unregistered_sims` | Delete files from canceled sims | True |
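
One way to keep these options together is a small dataclass whose defaults mirror the table, converted to keyword arguments when the workflow is constructed. This is a sketch: `WorkflowConfig` is a hypothetical helper, not part of the toolkit. The check that `max_sim_batch` is at least 1 is an assumption, and so is the idea that a workflow constructor accepts all six names as keywords (the basic usage example passes only `max_sim_batch` and `training_cores` this way).

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class WorkflowConfig:
    """Hypothetical container; defaults copied from the options table."""

    max_sim_batch: int = 4
    training_cores: int = 1
    training_threshold: float = 0.5
    prediction_threshold: float = 0.5
    force_start_training: bool = False
    clean_unregistered_sims: bool = True

    def __post_init__(self):
        # Assumed sanity check: at least one simulation must be able to run.
        if self.max_sim_batch < 1:
            raise ValueError("max_sim_batch must be at least 1")


if __name__ == "__main__":
    config = WorkflowConfig(max_sim_batch=8, force_start_training=True)
    kwargs = asdict(config)
    print(kwargs)
    # Hypothetical use, assuming the constructor accepts these keywords:
    # workflow = MyWorkflow(asyncflow, **kwargs)
```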