---
language: en
license: apache-2.0
tags:
- depth-estimation
- computer-vision
- pytorch
- absolute depth
pipeline_tag: depth-estimation
library_name: transformers
---

# Depth-CHM Model

A fine-tuned Depth Anything V2 model for depth estimation, trained on forest canopy height data.

## Model Description

This model is based on [Depth-Anything-V2-Metric-Indoor-Base](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-Indoor-Base-hf) and fine-tuned for estimating depth/canopy height from aerial imagery.

### Training Details

- **Base Model**: depth-anything/Depth-Anything-V2-Metric-Indoor-Base-hf
- **Max Depth**: 40.0 meters
- **Loss Function**: SiLog + 0.1 * L1 Loss
- **Hyperparameter Tuning**: Optuna (50 trials)

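
The loss listed above combines a scale-invariant log (SiLog) term with a small L1 term. The following is a minimal NumPy sketch of one common formulation, not the training code used for this model. The function name `silog_l1_loss`, the variance weight `lambd=0.5`, the validity mask (`target > eps`), and `eps` are all assumptions made for illustration; only the `0.1` weight on the L1 term comes from the model card.

```python
import numpy as np


def silog_l1_loss(pred, target, lambd=0.5, l1_weight=0.1, eps=1e-6):
    """SiLog + l1_weight * L1, computed over pixels with a valid target.

    lambd, eps, and the validity mask are illustrative assumptions.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)

    # Keep only pixels that have a positive ground-truth value.
    valid = target > eps
    p = np.clip(pred[valid], eps, None)  # avoid log(0)
    t = target[valid]

    # Scale-invariant log term: sqrt(mean(d^2) - lambd * mean(d)^2),
    # where d is the per-pixel difference of log depths.
    diff_log = np.log(t) - np.log(p)
    variance_like = np.mean(diff_log ** 2) - lambd * np.mean(diff_log) ** 2
    silog = np.sqrt(max(variance_like, 0.0))  # guard against round-off

    # Plain mean absolute error in the same units as the inputs.
    l1 = np.mean(np.abs(p - t))
    return float(silog + l1_weight * l1)
```

A perfect prediction gives a loss of zero. A prediction that is off by a constant factor is penalized only partly by the SiLog term (because `lambd < 1`) and in full by the L1 term.
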
## Installation

```bash
pip install transformers torch pillow numpy
```

## Usage

### Method 1: Using Pipeline (Recommended)

The simplest way to use the model:

```python
from transformers import pipeline
from PIL import Image
import numpy as np

# Load pipeline
pipe = pipeline(task="depth-estimation", model="Boxiang/depth_chm")

# Load image
image = Image.open("your_image.png").convert("RGB")

# Run inference
result = pipe(image)
depth_image = result["depth"]  # PIL Image (8-bit, normalized to 0-255 per image)

# Convert to a numpy array and rescale to the 0-40 m range
max_depth = 40.0
depth = np.array(depth_image).astype(np.float32) / 255.0 * max_depth

print(f"Depth shape: {depth.shape}")
print(f"Depth range: [{depth.min():.2f}, {depth.max():.2f}] meters")
```

### Method 2: Using AutoImageProcessor + Model

For more control over the inference process:

```python
import torch
import torch.nn.functional as F
from transformers import AutoImageProcessor, DepthAnythingForDepthEstimation
from PIL import Image
import numpy as np

# Configuration
model_id = "Boxiang/depth_chm"
max_depth = 40.0

# Load model and processor
processor = AutoImageProcessor.from_pretrained(model_id)
model = DepthAnythingForDepthEstimation.from_pretrained(model_id)

# Use GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
model.eval()

# Load and process image
image = Image.open("your_image.png").convert("RGB")
original_size = image.size  # (width, height)

# Prepare input
inputs = processor(images=image, return_tensors="pt")
pixel_values = inputs["pixel_values"].to(device)

# Run inference
with torch.no_grad():
    outputs = model(pixel_values)
    predicted_depth = outputs.predicted_depth  # shape: (batch, H, W)

# Scale by max_depth
pred_scaled = predicted_depth * max_depth

# Resize to original image size
depth = F.interpolate(
    pred_scaled.unsqueeze(1),  # add a channel dimension: (batch, 1, H, W)
    size=(original_size[1], original_size[0]),  # (height, width)
    mode="bilinear",
    align_corners=True,
).squeeze().cpu().numpy()

print(f"Depth shape: {depth.shape}")
print(f"Depth range: [{depth.min():.2f}, {depth.max():.2f}] meters")
```

### Method 3: Local Model Path

If you have the model saved locally:

```python
from transformers import AutoImageProcessor, DepthAnythingForDepthEstimation

# Load from local path
model_path = "./depth_chm_trained"
processor = AutoImageProcessor.from_pretrained(model_path, local_files_only=True)
model = DepthAnythingForDepthEstimation.from_pretrained(model_path, local_files_only=True)
```

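## Output Format

- **Pipeline output**: Returns a dict whose `depth` entry is a PIL Image with normalized depth values (0-255). Multiply by `max_depth / 255.0` to get depth in meters. In current `transformers` releases the pipeline normalizes each image independently, so treat values recovered this way as approximate; the raw tensor is also returned as `predicted_depth`.
- **Model output**: Returns a `predicted_depth` tensor with values in the range [0, 1]. Multiply by `max_depth` (40.0) to get depth in meters.
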
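
The two conversions described above can be written as small helpers. This is a sketch under the value ranges stated in this section (0-255 for the pipeline image, [0, 1] for the model tensor). The helper names `pipeline_image_to_meters` and `model_output_to_meters` are hypothetical and not part of `transformers`.

```python
import numpy as np

MAX_DEPTH = 40.0  # meters, from the training details


def pipeline_image_to_meters(depth_u8, max_depth=MAX_DEPTH):
    """Rescale an 8-bit depth image (0-255) to meters."""
    return np.asarray(depth_u8, dtype=np.float32) / 255.0 * max_depth


def model_output_to_meters(pred01, max_depth=MAX_DEPTH):
    """Rescale a [0, 1] prediction (already moved to CPU/numpy) to meters."""
    return np.asarray(pred01, dtype=np.float32) * max_depth
```

Because the pipeline image is 8-bit, depth recovered from it is quantized in steps of `40.0 / 255`, roughly 0.16 m. The model tensor has no such quantization.
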
## Depth vs Height Conversion

The model outputs **depth** (distance from the camera). To convert it to **height**, as in a Canopy Height Model (CHM), subtract the depth from `max_depth`:

```python
height = max_depth - depth
```

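
Applied to a whole depth map, the same conversion might look like the sketch below. The helper name `depth_to_height` is hypothetical. Clipping to `[0, max_depth]` is an added safeguard against predictions that slightly exceed the training range. It is an assumption and not something the model card prescribes.

```python
import numpy as np


def depth_to_height(depth, max_depth=40.0):
    """Convert a depth map in meters to a height map in meters."""
    depth = np.asarray(depth, dtype=np.float32)
    height = max_depth - depth
    # Assumption: keep heights inside the valid range.
    return np.clip(height, 0.0, max_depth)


# Small worked example: the 42 m reading is clipped to a height of 0 m.
example = depth_to_height([[40.0, 25.0], [0.0, 42.0]])
```
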
## Model Files

- `model.safetensors` - Model weights
- `config.json` - Model configuration
- `preprocessor_config.json` - Image processor configuration
- `training_info.json` - Training hyperparameters

## Citation

If you use this model, please cite:

```bibtex
@misc{depth_chm_2024,
  title={Depth-CHM: Fine-tuned Depth Anything V2 for Canopy Height Estimation},
  author={Boxiang},
  year={2024},
  url={https://huggingface.co/Boxiang/depth_chm}
}
```

## License

This model inherits the license from the base Depth Anything V2 model.