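<p>Over the past month, more and more people have started running local AI large models on the Mac, for example by running models through Ollama and calling them from OpenCat or the Ollama desktop client. But many of them share the same painful experience: slow speeds, stuttering inference, and throughput of only single-digit tokens per second.
<img src="https://www.foxnan.com/wp-content/uploads/2026/06/61d77eb571ff3bb.webp" alt="2026 03 14 22 21 35.00 00 12 17.Still001 scaled" width="2560" height="1440" class="alignnone size-full wp-image-23371" decoding="async"></p>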
<p data-start="290" data-end="334">The problem is especially noticeable on a Mac Mini or any machine with 16GB of memory. Today I want to introduce a killer tool for speeding up local models on the Mac: OMLX.</p>
<p data-start="410" data-end="467">It can make local model <strong data-start="419" data-end="436">inference more than 10 times faster</strong>, so even a <strong data-start="441" data-end="456">base-model Mac Mini</strong> can run large models with ease.</p>
<p data-start="469" data-end="492">Below I'll walk you through a full hands-on test plus a deployment tutorial.</p>
<div class="video-container"></div>
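To put a number on the "single-digit tokens per second" complaint, here is a minimal Python sketch that asks a locally running Ollama server for one completion and derives the generation speed from the `eval_count` and `eval_duration` fields of its `/api/generate` response. It assumes Ollama is listening on its default port 11434; the helper names `tokens_per_second` and `measure` are made up for illustration, and the model name you pass in is whatever you have already pulled.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(response: dict) -> float:
    """Compute generation speed from an Ollama /api/generate response.

    eval_count is the number of generated tokens and eval_duration is the
    time spent generating them, reported in nanoseconds.
    """
    eval_count = response["eval_count"]
    eval_duration_ns = response["eval_duration"]
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    return eval_count / (eval_duration_ns / 1e9)


def measure(model: str, prompt: str, url: str = OLLAMA_URL) -> float:
    """Send one non-streaming request and return tokens per second.

    Hypothetical helper: `model` must be a model already pulled into Ollama.
    """
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return tokens_per_second(json.load(reply))
```

Calling `measure` with a model name and a short prompt returns a single float. If it comes back below 10, you are seeing exactly the slowdown described above, and the same call gives you a baseline for any before-and-after comparison.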
1. Why are local models so slow on the Mac?
Many people who run local models on a Mac typically use an architecture like this:
安卿辰博客






