The AMD BC-250 is a repurposed cryptocurrency mining board with a Cyan Skillfish APU (Zen 2 CPU, GFX1013 GPU, 24 CUs) and 16 GB of unified GDDR6 memory. Because AMD’s ROCm libraries do not support GFX1013, Vulkan via the Mesa RADV driver was the only GPU compute path verified as functional in this work. This study documents the Linux kernel tuning (ttm.pages_limit) that enables access to the full 16 GB for inference, and reports benchmark results for 31 large language models (3B–35B parameters) covering generation speed, filled-context scaling with real-token payloads, output quality across five task types, and cold-start latency. A 35-billion-parameter Mixture-of-Experts model with 3B active parameters reaches 37.5 tok/s. Of the 31 models tested, 24 reach a verified 64K filled-context ceiling. Silent context truncation by the Ollama runtime is documented and shown to affect benchmark validity. As a supplementary evaluation, the same Vulkan path is also used for image generation, vision inference, and video generation, though these modalities were tested with less rigour.
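The ttm.pages_limit parameter is expressed in memory pages rather than bytes, so a target size has to be converted before it can be set. The following is a minimal sketch of that arithmetic, not the value used in the study: it assumes 4 KiB pages (the usual x86-64 base page size) and uses 16 GiB purely as an illustrative target. The helper names `ttm_pages_limit` and `kernel_arg` are hypothetical.

```python
# Sketch: convert a target memory size into a page count for ttm.pages_limit.
# Assumption: 4 KiB base pages; the study's actual setting is not stated here.
PAGE_SIZE_BYTES = 4096


def ttm_pages_limit(target_gib: float, page_size: int = PAGE_SIZE_BYTES) -> int:
    """Return the number of whole pages that fit in target_gib GiB."""
    if target_gib <= 0 or page_size <= 0:
        raise ValueError("target_gib and page_size must be positive")
    return int(target_gib * 1024**3) // page_size


def kernel_arg(target_gib: float) -> str:
    """Format the page count as a module parameter in kernel command-line form."""
    return f"ttm.pages_limit={ttm_pages_limit(target_gib)}"


if __name__ == "__main__":
    # Illustrative target only; a real system needs some memory left for the OS.
    print(kernel_arg(16))
```

In practice the limit would likely be set somewhat below the physical total so that the CPU side of the unified memory is not starved; the exact figure depends on the workload.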
Artur Andrzejczak
Artur Andrzejczak (Wed,) studied this question.
www.synapsesocial.com/papers/69d895a86c1944d70ce06c00 — DOI: https://doi.org/10.5281/zenodo.19476017