A DynamicVLA policy trained on the DOM dataset (hzxie/DOM) for dynamic-object manipulation.
⚠️ Mid-training checkpoint (~epoch 16, train loss ≈ 0.0007–0.002). Self-contained and eval-ready (includes normalization buffers), but optimizer/scheduler state is not included (cannot resume optimizer momentum from this file).
- model.safetensors + config.json (root): latest checkpoint (~epoch 16, a mid-epoch step snapshot, refreshed as training proceeds).
- epoch0005/, epoch0010/: clean epoch-milestone checkpoints (saved at the end of those epochs; load with subfolder="epoch0005" etc.). Note that the folder name uses the internal epoch_idx, which is one less than the epoch number printed in the log: folder index N corresponds to the log's "Epoch N+1" (e.g. epoch0010 = the completed "Epoch 11").

- SmolLM2-360M VLM backbone (16 layers) + FastViT vision encoder.
- freeze_* = False: all 430M parameters are trainable (the stock config freezes the backbone and trains only ~99M).
- Camera inputs: opst_cam + wrist_cam.

from policies.dynamicvla.modeling_dynamicvla import DynamicVLAPolicy
policy = DynamicVLAPolicy.from_pretrained("mickeykang/dynamic-vla-DOM") # latest (~epoch 16)
# policy = DynamicVLAPolicy.from_pretrained("mickeykang/dynamic-vla-DOM", subfolder="epoch0010")
policy.eval().cuda()
from_pretrained restores the normalization buffers from model.safetensors, so no dataset is
needed to load/infer. For the DOM benchmark, serve with scripts/inference.py -p <dir> against the
Isaac Lab simulations/evaluate.py eval server.
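
The off-by-one between checkpoint folder names and the epoch numbers in the training log is easy to get wrong when picking a `subfolder`. The following is a minimal sketch of that mapping. The helper names `folder_to_log_epoch` and `log_epoch_to_folder` are hypothetical (they are not part of the DynamicVLA code base), and the four-digit zero padding is an assumption inferred from the folder names epoch0005 and epoch0010.

```python
import re


def folder_to_log_epoch(folder: str) -> int:
    """Map a checkpoint folder such as 'epoch0010' to the epoch number in the log.

    Hypothetical helper: the folder name carries the internal 0-based
    epoch_idx, while the log prints "Epoch N+1".
    """
    match = re.fullmatch(r"epoch(\d+)/?", folder)
    if match is None:
        raise ValueError(f"not an epoch folder: {folder!r}")
    epoch_idx = int(match.group(1))  # internal index taken from the folder name
    return epoch_idx + 1  # the log counts completed epochs starting at 1


def log_epoch_to_folder(log_epoch: int) -> str:
    """Inverse mapping; assumes four-digit zero padding in folder names."""
    if log_epoch < 1:
        raise ValueError("log epochs start at 1")
    return f"epoch{log_epoch - 1:04d}"


print(folder_to_log_epoch("epoch0010"))  # 11
print(log_epoch_to_folder(6))  # epoch0005
```

So to load the weights saved after the log reports "Epoch 11" as complete, the value to pass is subfolder="epoch0010".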
A utils/datasets.py resilience patch (substituting a valid sample on any decode error) is needed to train on the full set, but not to load or evaluate this checkpoint.
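
For readers who want to reproduce training, the idea behind such a patch can be sketched as a thin wrapper around a map-style dataset. This is an illustrative sketch only, not the actual change made to utils/datasets.py: the class names `ResilientDataset` and `ToyDataset`, the `max_retries` and `seed` parameters, and the random-replacement strategy are all assumptions.

```python
import random


class ResilientDataset:
    """Wrap a map-style dataset; if reading a sample fails, return another one.

    Hypothetical sketch: the real patch may choose the substitute differently.
    """

    def __init__(self, dataset, max_retries=10, seed=0):
        self.dataset = dataset
        self.max_retries = max_retries
        self.rng = random.Random(seed)

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        if not 0 <= index < len(self.dataset):
            raise IndexError(index)  # keep normal out-of-range behaviour
        candidate = index
        for _ in range(self.max_retries + 1):
            try:
                return self.dataset[candidate]
            except Exception:
                # Broad on purpose: decode failures differ between backends.
                candidate = self.rng.randrange(len(self.dataset))
        raise RuntimeError(f"no readable sample found after {self.max_retries} retries")


class ToyDataset:
    """Stand-in dataset whose 'corrupt' indices raise, mimicking decode errors."""

    def __init__(self, size, corrupt):
        self.size = size
        self.corrupt = set(corrupt)

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        if index in self.corrupt:
            raise ValueError(f"decode error at index {index}")
        return {"index": index}


toy = ToyDataset(size=5, corrupt={2})
wrapped = ResilientDataset(toy, seed=0)
print(wrapped[0])  # a readable sample is returned unchanged
print(wrapped[2])  # a corrupt sample is replaced by some readable one
```

Substituting samples keeps long training runs alive, at the cost of slightly over-representing the readable samples. Logging each substitution would make the number of corrupt samples visible.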