🏷️ Labeling Strategy

Because manual labeling is expensive, training labels were assigned with weak supervision rules based on URL structure:

Content pages:

  • Deep URL paths
  • Article-like slugs
  • Presence of IDs or long titles
  • News/story patterns

Section pages:

  • Short paths
  • Category URLs
  • Homepage or listing pages
  • Trailing slash URLs
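
The rules above can be sketched as a small voting function over URL features. This is only an illustrative sketch, not the labeling code actually used for this model: the function name `weak_label`, the thresholds, and the keyword lists are all assumptions, since the exact rules and cut-offs are not stated here.

```python
import re
from urllib.parse import urlparse

# Hypothetical thresholds and keyword lists; the original rules' exact values are unknown.
DEEP_PATH_SEGMENTS = 3          # "deep URL path" = at least this many path segments
LONG_SLUG_WORDS = 4             # "article-like slug" = at least this many hyphen-separated words
NEWS_HINTS = ("news", "story", "article")
CATEGORY_HINTS = ("category", "tag", "topics", "section")


def weak_label(url: str) -> str:
    """Return 'content' or 'section' for a URL using simple heuristic votes."""
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    last = segments[-1] if segments else ""

    content_votes = 0
    section_votes = 0

    # Content-page signals
    if len(segments) >= DEEP_PATH_SEGMENTS:
        content_votes += 1                      # deep URL path
    if len(last.split("-")) >= LONG_SLUG_WORDS:
        content_votes += 1                      # article-like slug / long title
    if re.search(r"\d{4,}", last):
        content_votes += 1                      # numeric ID in the final segment
    if any(s.lower() in NEWS_HINTS for s in segments[:-1]):
        content_votes += 1                      # news/story pattern in a parent segment

    # Section-page signals
    if not segments:
        section_votes += 2                      # homepage
    if 0 < len(segments) <= 2:
        section_votes += 1                      # short path
    if any(s.lower() in CATEGORY_HINTS for s in segments):
        section_votes += 1                      # category URL
    if segments and path.endswith("/"):
        section_votes += 1                      # trailing slash

    # Ties fall back to 'section' (an arbitrary choice for this sketch).
    return "content" if content_votes > section_votes else "section"


if __name__ == "__main__":
    for u in (
        "https://example.com/",
        "https://example.com/category/sports/",
        "https://example.com/news/2024/05/city-council-approves-new-budget-12345",
    ):
        print(u, "->", weak_label(u))
```

Labels produced this way are noisy by design. Counting votes rather than applying the first matching rule lets several weak signals outweigh a single misleading one, such as a trailing slash on an article URL.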

⚙️ Usage

Install dependencies

pip install transformers torch