Stagehand now supports using open-source Hugging Face models for local inference! This allows you to run web automation tasks without relying on external API services.
The following Hugging Face models are pre-configured and ready to use:
- Llama 2 7B Chat (`huggingface/meta-llama/Llama-2-7b-chat-hf`)
- Llama 2 13B Chat (`huggingface/meta-llama/Llama-2-13b-chat-hf`)
- Mistral 7B Instruct (`huggingface/mistralai/Mistral-7B-Instruct-v0.1`)
- Zephyr 7B Beta (`huggingface/HuggingFaceH4/zephyr-7b-beta`)
- CodeGen 2B Mono (`huggingface/Salesforce/codegen-2B-mono`)
- StarCoder2 7B (`huggingface/bigcode/starcoder2-7b`)
Hardware requirements:

- GPU: CUDA-compatible GPU with at least 8 GB of VRAM (recommended)
- RAM: at least 16 GB of system RAM
- Storage: 20 GB+ of free space for model downloads

Software requirements:

- Python 3.9+
- CUDA toolkit (for GPU acceleration)
- PyTorch with CUDA support
Install the required dependencies:

```bash
pip install transformers torch accelerate bitsandbytes
```

For GPU support, make sure you have the appropriate CUDA version installed.
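Before downloading any model weights, you can confirm that the installed PyTorch build actually sees your GPU. This check uses plain PyTorch and nothing Stagehand-specific:

```python
import torch

# Verify that PyTorch was built with CUDA support and can see a GPU
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GiB")
```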
Basic usage:

```python
import asyncio

from stagehand import Stagehand, StagehandConfig
from stagehand.schemas import AvailableModel


async def main():
    # Configure Stagehand to use a Hugging Face model
    config = StagehandConfig(
        env="LOCAL",
        model_name=AvailableModel.HUGGINGFACE_ZEPHYR_7B,
        verbose=2,
        use_api=False,
    )

    stagehand = Stagehand(config=config)

    try:
        await stagehand.init()
        await stagehand.navigate("https://example.com")

        # Extract data using the Hugging Face model
        result = await stagehand.extract(
            instruction="Extract the main heading from this page"
        )
        print(f"Extracted: {result.data}")
    finally:
        await stagehand.close()


asyncio.run(main())
```

For systems with limited GPU memory, you can use quantization:
```python
config = StagehandConfig(
    env="LOCAL",
    model_name=AvailableModel.HUGGINGFACE_LLAMA_2_7B,
    use_api=False,
    model_client_options={
        "device": "cuda",
        "quantization_config": {
            "load_in_4bit": True,
            "bnb_4bit_compute_dtype": "float16",
            "bnb_4bit_use_double_quant": True,
        },
    },
)
```

You can also use any Hugging Face model by specifying the full model name:
```python
config = StagehandConfig(
    env="LOCAL",
    model_name="huggingface/your-username/your-model",
    use_api=False,
)
```

Tips for better performance:

- Use GPU: Always use CUDA if available for significantly faster inference
- Quantization: Use 4-bit or 8-bit quantization to reduce memory usage
- Model Size: Start with smaller models (7B parameters) for testing
- Batch Processing: Process multiple tasks in sequence rather than in parallel (see the sketch after this list)
- Memory Management: Close other GPU applications when running large models
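The batch-processing tip means awaiting one task at a time instead of scheduling them concurrently with something like `asyncio.gather`. A minimal sketch (the helper name and URLs are placeholders), meant to be called from an async context with an initialized Stagehand instance such as the one in the basic example above:

```python
# Sketch: process pages one at a time so that only a single inference pass
# holds GPU memory at any given moment
async def process_pages(stagehand, urls):
    results = []
    for url in urls:
        await stagehand.navigate(url)
        result = await stagehand.extract(
            instruction="Extract the main heading from this page"
        )
        results.append(result.data)
    return results
```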
If you run out of GPU memory:

- Use quantization (`load_in_4bit=True`)
- Try a smaller model
- Close other GPU applications
- Use CPU mode (slower but uses less memory); see the sketch below
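A CPU-mode configuration can mirror the quantization example above, swapping the `device` value. This is a sketch, not a tested configuration; whether a given model needs additional options on CPU may vary:

```python
from stagehand import StagehandConfig
from stagehand.schemas import AvailableModel

# Run the model on the CPU: much slower, but not limited by GPU VRAM
config = StagehandConfig(
    env="LOCAL",
    model_name=AvailableModel.HUGGINGFACE_ZEPHYR_7B,
    use_api=False,
    model_client_options={"device": "cpu"},
)
```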
If inference is slow:

- Ensure CUDA is properly installed
- Use GPU instead of CPU
- Try a smaller model
- Check if other processes are using GPU
If model downloads fail:

- Check your internet connection
- Ensure sufficient disk space
- Try downloading manually from Hugging Face Hub
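One way to download a model manually is with the `huggingface_hub` client, which is installed as a dependency of `transformers`; the Zephyr repo id below is just an example:

```python
from huggingface_hub import snapshot_download

# Pre-download the weights into the local Hugging Face cache so the model
# can be loaded later without re-downloading
snapshot_download(repo_id="HuggingFaceH4/zephyr-7b-beta")
```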
See `examples/example_huggingface.py` for comprehensive examples, including:
- Basic usage with different models
- Memory-efficient configurations
- Form filling and data extraction
- Error handling and troubleshooting
A few things to keep in mind:

- First Run: Models are downloaded on first use (5-15 GB)
- Memory: Large models require significant GPU memory
- Speed: Local inference is slower than API calls
- Model Quality: Some models may not perform as well as commercial APIs
To add support for new Hugging Face models:
- Add the model to the `AvailableModel` enum in `schemas.py` (see the sketch after this list)
- Test the model with various web automation tasks
- Update this documentation
- Add tests for the new model
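As a rough illustration of the first step (the actual layout of the enum in `schemas.py` may differ, and the Mixtral entry below is purely a hypothetical example):

```python
from enum import Enum

class AvailableModel(str, Enum):
    # Existing entry, mirroring the pre-configured model list above
    HUGGINGFACE_ZEPHYR_7B = "huggingface/HuggingFaceH4/zephyr-7b-beta"
    # New model being added (hypothetical example)
    HUGGINGFACE_MIXTRAL_8X7B = "huggingface/mistralai/Mixtral-8x7B-Instruct-v0.1"
```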
For issues related to Hugging Face model support:
- Check the Hugging Face documentation
- Review the example file for usage patterns
- Open an issue on the GitHub repository