+---
+license: apache-2.0
+---
+# FloydARC (ARC-AGI Reasoning)
+## Model Summary
+**FloydARC** is a neural algorithmic reasoning model adapted from FloydNet for the **ARC-AGI** benchmark.
+This checkpoint is trained primarily on ARC-style synthetic and curated data. It is designed to solve ARC tasks through **iterative refinement and test-time adaptation** rather than by relying on large-scale web pretraining.
+Among models trained mainly on ARC-like data, FloydARC achieves **state-of-the-art performance**, scoring 70.5 on ARC-AGI-1 and 15.3 on ARC-AGI-2 and narrowing the gap to very large proprietary models.
+
+---
+## Performance
+Under standard evaluation protocols, FloydARC outscores the other listed models by 9.3 points on ARC-AGI-1 and 4.2 points on ARC-AGI-2.
+**ARC-AGI benchmark results (scores in %):**
+| Model        | #Params | ARC-AGI-1 | ARC-AGI-2 |
+| ------------ | ------: | --------: | --------: |
+| VARC         |     73M |      60.4 |      11.1 |
+| Loop-ViT     |   11.2M |      61.2 |      10.3 |
+| HRM          |     27M |      40.3 |       5.0 |
+| **FloydARC** |  153.7M |  **70.5** |  **15.3** |
+---
+## Model Details
+* **Model ID**: `ocxlabs/FloydARC`
+* **Task**: Abstraction and Reasoning Corpus (ARC-AGI)
+* **Architecture**: FloydNet-based global relational reasoning with looped refinement
+* **Input / Output**: ARC grid-based visual reasoning (query canvas → predicted answer canvas)
+* **License**: Apache 2.0
+---
+## Usage: Inference & Evaluation
+This checkpoint is intended for **research and evaluation use** on ARC-AGI. Full reproduction of reported results requires multi-GPU inference with test-time training.
+### 1. Download checkpoint
+Download the pretrained checkpoint from Hugging Face:
+```
+https://huggingface.co/ocxlabs/FloydARC
+```
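As an alternative to downloading through the browser, the repository can be fetched programmatically. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function; the `download_checkpoint` helper and the `./checkpoints/FloydARC` target directory are illustrative names, not part of this repository.

```python
from pathlib import Path


def download_checkpoint(local_dir: str = "./checkpoints/FloydARC") -> Path:
    """Fetch every file of the ocxlabs/FloydARC repo into local_dir.

    The default directory is an arbitrary choice for this sketch.
    """
    # Imported lazily so the helper can be defined even where
    # huggingface_hub is not installed yet.
    from huggingface_hub import snapshot_download

    # snapshot_download returns the path of the downloaded folder.
    path = snapshot_download(repo_id="ocxlabs/FloydARC", local_dir=local_dir)
    return Path(path)
```

Calling `download_checkpoint()` returns the local folder, which is the value to pass as `--ckpt_path`.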
+Place the downloaded folder anywhere on disk and pass its path via `--ckpt_path`.
+
+---
+### 2. Prepare ARC evaluation data
+Place the original ARC JSON files under `rawdata/`, then preprocess:
+```bash
+python -m scripts.process_data \
+  --input_dir ./rawdata/ARC-AGI-1_evaluation/ \
+  --output_dir ./preprocessed/arc1 \
+  --split test
+```
+For ARC-AGI-2, repeat with `--input_dir ./rawdata/ARC-AGI-2_evaluation/` and a separate `--output_dir`.
+
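It can help to sanity-check the raw files before running the preprocessing step. The sketch below assumes each file follows the public ARC task format: a JSON object with `train` and `test` lists of `{"input": grid, "output": grid}` pairs, where a grid is a rectangular list of rows holding integers 0 to 9 and is at most 30×30. The helper names and the one-task-per-file layout are assumptions made for illustration; `scripts.process_data` may expect or check something different.

```python
import json
from pathlib import Path

MAX_SIDE = 30  # public ARC grids are at most 30x30


def check_grid(grid) -> None:
    """Raise ValueError unless grid is a rectangular list of rows of ints 0..9."""
    if not isinstance(grid, list) or not grid:
        raise ValueError("grid must be a non-empty list of rows")
    if not all(isinstance(row, list) and row for row in grid):
        raise ValueError("each row must be a non-empty list")
    width = len(grid[0])
    if len(grid) > MAX_SIDE or width > MAX_SIDE:
        raise ValueError("grid exceeds 30x30")
    for row in grid:
        if len(row) != width:
            raise ValueError("grid is not rectangular")
        # type(...) is int also rejects booleans, which isinstance would accept.
        if not all(type(cell) is int and 0 <= cell <= 9 for cell in row):
            raise ValueError("cells must be integers in 0..9")


def check_task(task) -> int:
    """Validate one task object and return the number of grids checked."""
    if not isinstance(task, dict):
        raise ValueError("task must be a JSON object")
    checked = 0
    for split in ("train", "test"):
        pairs = task.get(split)
        if not pairs:
            raise ValueError(f"task has no '{split}' pairs")
        for pair in pairs:
            check_grid(pair["input"])
            checked += 1
            # Test outputs are withheld in some distributions, so they are optional.
            if split == "train" or "output" in pair:
                check_grid(pair["output"])
                checked += 1
    return checked


def check_dir(input_dir: str) -> dict:
    """Validate every *.json file in input_dir; map file name to error message."""
    errors = {}
    for path in sorted(Path(input_dir).glob("*.json")):
        try:
            check_task(json.loads(path.read_text()))
        except (ValueError, KeyError, TypeError) as exc:
            errors[path.name] = str(exc)
    return errors
```

An empty result from `check_dir("./rawdata/ARC-AGI-1_evaluation/")` means every file passed these basic checks.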
+---
+### 3. Run inference with Test-Time Training (recommended)
+```bash
+python -m scripts.TTT \
+  --ckpt_path /path/to/floydarc_ckpt \
+  --subset arc1 \
+  --output_dir ./output/TTT_results
+```
+Notes:
+* Default configuration uses **8 GPUs on a single node**
+* LoRA-based TTT is enabled by default and recommended
+* For ARC-AGI-2, set `--subset arc2`
+---
+### 4. Ensembling & visualization
+To ensemble predictions and inspect them qualitatively, run the analysis script, which writes an HTML report to the `--out-html` path:
+```bash
+python -m scripts.analyze \
+  --result-folder ./output/TTT_results \
+  --subset arc1 \
+  --out-html output/arc1_results.html
+```
+Multiple result folders can be passed to enable max-voting ensembling.
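Max-voting can be read as majority voting over candidate answer grids: predictions from several runs are pooled per test input, identical grids are counted, and the most frequent ones are kept. The sketch below illustrates that general idea only and is not the logic of `scripts.analyze`; the function names, the in-memory result format, and the choice of keeping the top two candidates (ARC evaluation commonly allows two attempts per test input) are all assumptions.

```python
from collections import Counter

Grid = list[list[int]]


def grid_key(grid: Grid) -> tuple:
    """Convert a grid to a hashable tuple of tuples so identical grids can be counted."""
    return tuple(tuple(row) for row in grid)


def max_vote(candidates: list[Grid], top_k: int = 2) -> list[Grid]:
    """Return up to top_k distinct grids, most frequently predicted first."""
    counts = Counter(grid_key(g) for g in candidates)
    # most_common orders by count; equal counts keep first-seen order.
    return [[list(row) for row in key] for key, _ in counts.most_common(top_k)]


def ensemble(runs: list[dict[str, list[Grid]]], top_k: int = 2) -> dict[str, list[Grid]]:
    """Pool candidates per test-input id across runs, then vote on each id."""
    pooled: dict[str, list[Grid]] = {}
    for run in runs:
        for test_id, grids in run.items():
            pooled.setdefault(test_id, []).extend(grids)
    return {test_id: max_vote(grids, top_k) for test_id, grids in pooled.items()}
```

Here each run is a hypothetical mapping from a test-input id to the grids that run proposed; with three runs where two agree, `ensemble` ranks the agreed grid first and the dissenting grid second.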