<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>OpenAI API on Producthunt daily</title>
        <link>https://producthunt.programnotes.cn/en/tags/openai-api/</link>
        <description>Recent content in OpenAI API on Producthunt daily</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <lastBuildDate>Tue, 14 Oct 2025 15:29:21 +0800</lastBuildDate><atom:link href="https://producthunt.programnotes.cn/en/tags/openai-api/index.xml" rel="self" type="application/rss+xml" /><item>
        <title>llama.cpp</title>
        <link>https://producthunt.programnotes.cn/en/p/llama.cpp/</link>
        <pubDate>Tue, 14 Oct 2025 15:29:21 +0800</pubDate>
        
        <guid>https://producthunt.programnotes.cn/en/p/llama.cpp/</guid>
        <description>&lt;img src="https://images.unsplash.com/photo-1641738876363-a0728bf25a8d?ixid=M3w0NjAwMjJ8MHwxfHJhbmRvbXx8fHx8fHx8fDE3NjA0MjY4ODR8&amp;ixlib=rb-4.1.0" alt="Featured image of post llama.cpp" /&gt;&lt;h1 id=&#34;ggml-orgllamacpp&#34;&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ggml-org/llama.cpp&lt;/a&gt;
&lt;/h1&gt;&lt;h1 id=&#34;llamacpp&#34;&gt;llama.cpp
&lt;/h1&gt;&lt;p&gt;&lt;img src=&#34;https://user-images.githubusercontent.com/1991296/230134379-7181e485-c521-4d23-a0d6-f7b3b61ba524.png&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;llama&#34;
	
	
&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://opensource.org/licenses/MIT&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;&lt;img src=&#34;https://img.shields.io/badge/license-MIT-blue.svg&#34; loading=&#34;lazy&#34; alt=&#34;License: MIT&#34;&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/releases&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;&lt;img src=&#34;https://img.shields.io/github/v/release/ggml-org/llama.cpp&#34; loading=&#34;lazy&#34; alt=&#34;Release&#34;&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/actions/workflows/server.yml&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;&lt;img src=&#34;https://github.com/ggml-org/llama.cpp/actions/workflows/server.yml/badge.svg&#34; loading=&#34;lazy&#34; alt=&#34;Server&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/discussions/205&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Manifesto&lt;/a&gt; / &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/ggml&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ggml&lt;/a&gt; / &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/blob/master/docs/ops.md&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ops&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;LLM inference in C/C++&lt;/p&gt;
&lt;h2 id=&#34;recent-api-changes&#34;&gt;Recent API changes
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/issues/9289&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Changelog for &lt;code&gt;libllama&lt;/code&gt; API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/issues/9291&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Changelog for &lt;code&gt;llama-server&lt;/code&gt; REST API&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;hot-topics&#34;&gt;Hot topics
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/discussions/15396&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;guide : running gpt-oss with llama.cpp&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/discussions/15313&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;[FEEDBACK] Better packaging for llama.cpp to support downstream consumers 🤗&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Support for the &lt;code&gt;gpt-oss&lt;/code&gt; model with native MXFP4 format has been added | &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pull/15091&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;PR&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;https://blogs.nvidia.com/blog/rtx-ai-garage-openai-oss&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Collaboration with NVIDIA&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/discussions/15095&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Comment&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Hot PRs: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pulls?q=is%3Apr&amp;#43;label%3Ahot&amp;#43;&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;All&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pulls?q=is%3Apr&amp;#43;label%3Ahot&amp;#43;is%3Aopen&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Open&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Multimodal support arrived in &lt;code&gt;llama-server&lt;/code&gt;: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pull/12898&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;#12898&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./docs/multimodal.md&#34; &gt;documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;VS Code extension for FIM completions: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.vscode&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://github.com/ggml-org/llama.vscode&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Vim/Neovim plugin for FIM completions: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.vim&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://github.com/ggml-org/llama.vim&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Introducing GGUF-my-LoRA &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/discussions/10123&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://github.com/ggml-org/llama.cpp/discussions/10123&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Hugging Face Inference Endpoints now support GGUF out of the box! &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/discussions/9669&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://github.com/ggml-org/llama.cpp/discussions/9669&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Hugging Face GGUF editor: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/discussions/9268&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;discussion&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/spaces/CISCai/gguf-editor&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;tool&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id=&#34;quick-start&#34;&gt;Quick start
&lt;/h2&gt;&lt;p&gt;Getting started with llama.cpp is straightforward. Here are several ways to install it on your machine:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Install &lt;code&gt;llama.cpp&lt;/code&gt; using &lt;a class=&#34;link&#34; href=&#34;docs/install.md&#34; &gt;brew, nix or winget&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Run with Docker - see our &lt;a class=&#34;link&#34; href=&#34;docs/docker.md&#34; &gt;Docker documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Download pre-built binaries from the &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/releases&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;releases page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Build from source by cloning this repository - check out &lt;a class=&#34;link&#34; href=&#34;docs/build.md&#34; &gt;our build guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once installed, you&amp;rsquo;ll need a model to work with. Head to the &lt;a class=&#34;link&#34; href=&#34;#obtaining-and-quantizing-models&#34; &gt;Obtaining and quantizing models&lt;/a&gt; section to learn more.&lt;/p&gt;
&lt;p&gt;Example command:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# Use a local model file&lt;/span&gt;
llama-cli -m my_model.gguf

&lt;span class=&#34;c1&#34;&gt;# Or download and run a model directly from Hugging Face&lt;/span&gt;
llama-cli -hf ggml-org/gemma-3-1b-it-GGUF

&lt;span class=&#34;c1&#34;&gt;# Launch OpenAI-compatible API server&lt;/span&gt;
llama-server -hf ggml-org/gemma-3-1b-it-GGUF&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;&lt;h2 id=&#34;description&#34;&gt;Description
&lt;/h2&gt;&lt;p&gt;The main goal of &lt;code&gt;llama.cpp&lt;/code&gt; is to enable LLM inference with minimal setup and state-of-the-art performance on a wide
range of hardware - locally and in the cloud.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Plain C/C++ implementation without any dependencies&lt;/li&gt;
&lt;li&gt;Apple silicon is a first-class citizen - optimized via ARM NEON, Accelerate and Metal frameworks&lt;/li&gt;
&lt;li&gt;AVX, AVX2, AVX512 and AMX support for x86 architectures&lt;/li&gt;
&lt;li&gt;1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use&lt;/li&gt;
&lt;li&gt;Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP and Moore Threads GPUs via MUSA)&lt;/li&gt;
&lt;li&gt;Vulkan and SYCL backend support&lt;/li&gt;
&lt;li&gt;CPU+GPU hybrid inference to partially accelerate models larger than the total VRAM capacity&lt;/li&gt;
&lt;/ul&gt;
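As a concrete illustration of the CPU+GPU hybrid inference mentioned above: on a GPU-enabled build, the `-ngl` (`--n-gpu-layers`) flag sets how many model layers are offloaded to the GPU, and the remaining layers run on the CPU. A minimal sketch (the model filename and layer count are placeholders to tune for your hardware):

```shell
# Hybrid CPU+GPU inference sketch (assumes a GPU-enabled build of llama.cpp).
# -ngl / --n-gpu-layers sets how many model layers are offloaded to the GPU;
# the remaining layers run on the CPU, so models larger than VRAM still work.
NGL=20                                    # layers to offload; tune to your VRAM
CMD="llama-cli -m my_model.gguf -ngl $NGL"
if command -v llama-cli >/dev/null; then
  $CMD                                    # run for real when llama-cli is installed
else
  echo "llama-cli not installed; would run: $CMD"
fi
```

With `-ngl 0` everything stays on the CPU; a large value such as `-ngl 99` offloads as many layers as fit in VRAM.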
&lt;p&gt;The &lt;code&gt;llama.cpp&lt;/code&gt; project is the main playground for developing new features for the &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/ggml&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ggml&lt;/a&gt; library.&lt;/p&gt;
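The `llama-server` binary shown in the quick start exposes an OpenAI-compatible REST API, so existing OpenAI clients can point at it. A minimal sketch of a chat-completions request with `curl` (assumes a server already running on the default port 8080; the prompt and `max_tokens` value are placeholders):

```shell
# Sketch: query llama-server's OpenAI-compatible chat-completions endpoint.
URL="http://localhost:8080/v1/chat/completions"
BODY='{"messages":[{"role":"user","content":"Hello"}],"max_tokens":32}'
if command -v curl >/dev/null; then
  # --max-time keeps the sketch from hanging when no server is running.
  curl -s --max-time 2 "$URL" -H "Content-Type: application/json" -d "$BODY" || echo "no server at $URL"
else
  echo "curl not installed"
fi
```

The response follows the OpenAI chat-completions JSON shape, which is why the server works as a drop-in local backend for OpenAI-API tooling.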
&lt;details&gt;
&lt;summary&gt;Models&lt;/summary&gt;
&lt;p&gt;Typically finetunes of the base models below are supported as well.&lt;/p&gt;
&lt;p&gt;Instructions for adding support for new models: &lt;a class=&#34;link&#34; href=&#34;docs/development/HOWTO-add-model.md&#34; &gt;HOWTO-add-model.md&lt;/a&gt;&lt;/p&gt;
&lt;h4 id=&#34;text-only&#34;&gt;Text-only
&lt;/h4&gt;&lt;ul&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; LLaMA 🦙&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; LLaMA 2 🦙🦙&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; LLaMA 3 🦙🦙🦙&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/mistralai/Mistral-7B-v0.1&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Mistral 7B&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=mistral-ai/Mixtral&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Mixtral MoE&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/databricks/dbrx-instruct&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;DBRX&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=tiiuae/falcon&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Falcon&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/ymcui/Chinese-LLaMA-Alpaca&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Chinese LLaMA / Alpaca&lt;/a&gt; and &lt;a class=&#34;link&#34; href=&#34;https://github.com/ymcui/Chinese-LLaMA-Alpaca-2&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Chinese LLaMA-2 / Alpaca-2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/bofenghuang/vigogne&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Vigogne (French)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pull/5423&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;BERT&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://bair.berkeley.edu/blog/2023/04/03/koala/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Koala&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=baichuan-inc/Baichuan&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Baichuan 1 &amp;amp; 2&lt;/a&gt; + &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/hiyouga/baichuan-7b-sft&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;derivations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=BAAI/Aquila&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Aquila 1 &amp;amp; 2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pull/3187&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Starcoder models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/smallcloudai/Refact-1_6B-fim&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Refact&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pull/3417&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;MPT&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pull/3553&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Bloom&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=01-ai/Yi&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Yi models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/stabilityai&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;StableLM models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=deepseek-ai/deepseek&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Deepseek models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=Qwen/Qwen&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Qwen models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pull/3557&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;PLaMo-13B&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=microsoft/phi&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Phi models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pull/11003&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;PhiMoE&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/gpt2&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GPT-2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pull/5118&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Orion 14B&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=internlm2&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;InternLM2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/WisdomShell/codeshell&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;CodeShell&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://ai.google.dev/gemma&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Gemma&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/state-spaces/mamba&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Mamba&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/keyfan/grok-1-hf&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Grok-1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=xverse&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Xverse&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=CohereForAI/c4ai-command-r&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Command-R models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=sea-lion&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;SEA-LION&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/GritLM/GritLM-7B&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GritLM-7B&lt;/a&gt; + &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/GritLM/GritLM-8x7B&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GritLM-8x7B&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://allenai.org/olmo&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;OLMo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://allenai.org/olmo&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;OLMo 2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/allenai/OLMoE-1B-7B-0924&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;OLMoE&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/ibm-granite/granite-code-models-6624c5cec322e4c148c8b330&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Granite models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/EleutherAI/gpt-neox&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GPT-NeoX&lt;/a&gt; + &lt;a class=&#34;link&#34; href=&#34;https://github.com/EleutherAI/pythia&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Pythia&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/Snowflake/arctic-66290090abe542894a5ac520&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Snowflake-Arctic MoE&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=Smaug&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Smaug&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/LumiOpen/Poro-34B&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Poro 34B&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/1bitLLM&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Bitnet b1.58 models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=flan-t5&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Flan T5&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/apple/openelm-instruct-models-6619ad295d7ae9f868b759ca&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;OpenELM models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/THUDM/chatglm3-6b&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ChatGLM3-6b&lt;/a&gt; + &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/THUDM/glm-4-9b&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ChatGLM4-9b&lt;/a&gt; + &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/THUDM/glm-edge-1.5b-chat&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GLMEdge-1.5b&lt;/a&gt; + &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/THUDM/glm-edge-4b-chat&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GLMEdge-4b&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/THUDM/glm-4-0414-67f3cbcb34dd9d252707cb2e&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GLM-4-0414&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/HuggingFaceTB/smollm-6695016cad7167254ce15966&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;SmolLM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;EXAONE-3.0-7.8B-Instruct&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;FalconMamba Models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/inceptionai/jais-13b-chat&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Jais&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/speakleash/bielik-11b-v23-66ee813238d9b526a072408a&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Bielik-11B-v2.3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/BlinkDL/RWKV-LM&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;RWKV-6&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/recursal/QRWKV6-32B-Instruct-Preview-v0.1&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;QRWKV-6&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/ai-sage/GigaChat-20B-A3B-instruct&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GigaChat-20B-A3B&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/trillionlabs/Trillion-7B-preview&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Trillion-7B-preview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/inclusionAI/ling-67c51c85b34a7ea0aba94c32&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Ling models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/LiquidAI/lfm2-686d721927015b2ad73eaa38&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LFM2 models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/tencent/hunyuan-dense-model-6890632cda26b19119c9c5e7&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Hunyuan models&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&#34;multimodal&#34;&gt;Multimodal
&lt;/h4&gt;&lt;ul&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/liuhaotian/llava-15-653aac15d994e992e2677a7e&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LLaVA 1.5 models&lt;/a&gt;, &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/liuhaotian/llava-16-65b9e40155f60fd046a5ccf2&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LLaVA 1.6 models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=SkunkworksAI/Bakllava&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;BakLLaVA&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/NousResearch/Obsidian-3B-V0.5&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Obsidian&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=Lin-Chen/ShareGPT4V&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ShareGPT4V&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=mobileVLM&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;MobileVLM 1.7B/3B models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=Yi-VL&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Yi-VL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=MiniCPM&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;MiniCPM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/vikhyatk/moondream2&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Moondream&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/BAAI-DCAI/Bunny&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Bunny&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?search=glm-edge&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GLM-EDGE&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/Qwen/qwen2-vl-66cee7455501d7126940800d&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Qwen2-VL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/LiquidAI/lfm2-vl-68963bbc84a610f7638d5ffa&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LFM2-VL&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/details&gt;
&lt;details&gt;
&lt;summary&gt;Bindings&lt;/summary&gt;
&lt;ul&gt;
&lt;li&gt;Python: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ddh0/easy-llama&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ddh0/easy-llama&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Python: &lt;a class=&#34;link&#34; href=&#34;https://github.com/abetlen/llama-cpp-python&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;abetlen/llama-cpp-python&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Go: &lt;a class=&#34;link&#34; href=&#34;https://github.com/go-skynet/go-llama.cpp&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;go-skynet/go-llama.cpp&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Node.js: &lt;a class=&#34;link&#34; href=&#34;https://github.com/withcatai/node-llama-cpp&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;withcatai/node-llama-cpp&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;JS/TS (llama.cpp server client): &lt;a class=&#34;link&#34; href=&#34;https://modelfusion.dev/integration/model-provider/llamacpp&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;lgrammel/modelfusion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;JS/TS (Programmable Prompt Engine CLI): &lt;a class=&#34;link&#34; href=&#34;https://github.com/offline-ai/cli&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;offline-ai/cli&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;JavaScript/Wasm (works in browser): &lt;a class=&#34;link&#34; href=&#34;https://github.com/tangledgroup/llama-cpp-wasm&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;tangledgroup/llama-cpp-wasm&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Typescript/Wasm (nicer API, available on npm): &lt;a class=&#34;link&#34; href=&#34;https://github.com/ngxson/wllama&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ngxson/wllama&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Ruby: &lt;a class=&#34;link&#34; href=&#34;https://github.com/yoshoku/llama_cpp.rb&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;yoshoku/llama_cpp.rb&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Rust (more features): &lt;a class=&#34;link&#34; href=&#34;https://github.com/edgenai/llama_cpp-rs&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;edgenai/llama_cpp-rs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Rust (nicer API): &lt;a class=&#34;link&#34; href=&#34;https://github.com/mdrokz/rust-llama.cpp&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;mdrokz/rust-llama.cpp&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Rust (more direct bindings): &lt;a class=&#34;link&#34; href=&#34;https://github.com/utilityai/llama-cpp-rs&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;utilityai/llama-cpp-rs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Rust (automated build from crates.io): &lt;a class=&#34;link&#34; href=&#34;https://github.com/ShelbyJenkins/llm_client&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ShelbyJenkins/llm_client&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;C#/.NET: &lt;a class=&#34;link&#34; href=&#34;https://github.com/SciSharp/LLamaSharp&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;SciSharp/LLamaSharp&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;C#/VB.NET (more features - community license): &lt;a class=&#34;link&#34; href=&#34;https://docs.lm-kit.com/lm-kit-net/index.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LM-Kit.NET&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Scala 3: &lt;a class=&#34;link&#34; href=&#34;https://github.com/donderom/llm4s&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;donderom/llm4s&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Clojure: &lt;a class=&#34;link&#34; href=&#34;https://github.com/phronmophobic/llama.clj&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;phronmophobic/llama.clj&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;React Native: &lt;a class=&#34;link&#34; href=&#34;https://github.com/mybigday/llama.rn&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;mybigday/llama.rn&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Java: &lt;a class=&#34;link&#34; href=&#34;https://github.com/kherud/java-llama.cpp&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;kherud/java-llama.cpp&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Java: &lt;a class=&#34;link&#34; href=&#34;https://github.com/QuasarByte/llama-cpp-jna&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;QuasarByte/llama-cpp-jna&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Zig: &lt;a class=&#34;link&#34; href=&#34;https://github.com/Deins/llama.cpp.zig&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;deins/llama.cpp.zig&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Flutter/Dart: &lt;a class=&#34;link&#34; href=&#34;https://github.com/netdur/llama_cpp_dart&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;netdur/llama_cpp_dart&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Flutter: &lt;a class=&#34;link&#34; href=&#34;https://github.com/xuegao-tzx/Fllama&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;xuegao-tzx/Fllama&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;PHP (API bindings and features built on top of llama.cpp): &lt;a class=&#34;link&#34; href=&#34;https://github.com/distantmagic/resonance&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;distantmagic/resonance&lt;/a&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pull/6326&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;(more info)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Guile Scheme: &lt;a class=&#34;link&#34; href=&#34;https://savannah.nongnu.org/projects/guile-llama-cpp&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;guile_llama_cpp&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Swift: &lt;a class=&#34;link&#34; href=&#34;https://github.com/srgtuszy/llama-cpp-swift&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;srgtuszy/llama-cpp-swift&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Swift: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ShenghaiWang/SwiftLlama&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ShenghaiWang/SwiftLlama&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Delphi: &lt;a class=&#34;link&#34; href=&#34;https://github.com/Embarcadero/llama-cpp-delphi&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Embarcadero/llama-cpp-delphi&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/details&gt;
&lt;details&gt;
&lt;summary&gt;UIs&lt;/summary&gt;
&lt;p&gt;&lt;em&gt;(to have a project listed here, it should clearly state that it depends on &lt;code&gt;llama.cpp&lt;/code&gt;)&lt;/em&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/yaroslavyaroslav/OpenAI-sublime-text&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;AI Sublime Text plugin&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/cztomsik/ava&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;cztomsik/ava&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/alexpinel/Dot&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Dot&lt;/a&gt; (GPL)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ylsdamxssjxxdd/eva&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;eva&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/iohub/coLLaMA&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;iohub/collama&lt;/a&gt; (Apache-2.0)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/janhq/jan&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;janhq/jan&lt;/a&gt; (AGPL)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/johnbean393/Sidekick&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;johnbean393/Sidekick&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/zhouwg/kantv?tab=readme-ov-file&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;KanTV&lt;/a&gt; (Apache-2.0)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/firatkiral/kodibot&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;KodiBot&lt;/a&gt; (GPL)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.vim&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;llama.vim&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/abgulati/LARS&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LARS&lt;/a&gt; (AGPL)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/vietanhdev/llama-assistant&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Llama Assistant&lt;/a&gt; (GPL)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/guinmoon/LLMFarm?tab=readme-ov-file&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LLMFarm&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/undreamai/LLMUnity&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LLMUnity&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://lmstudio.ai/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LMStudio&lt;/a&gt; (proprietary)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/mudler/LocalAI&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LocalAI&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/LostRuins/koboldcpp&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LostRuins/koboldcpp&lt;/a&gt; (AGPL)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://mindmac.app&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;MindMac&lt;/a&gt; (proprietary)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/MindWorkAI/AI-Studio&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;MindWorkAI/AI-Studio&lt;/a&gt; (FSL-1.1-MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/Mobile-Artificial-Intelligence/maid&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Mobile-Artificial-Intelligence/maid&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/Mozilla-Ocho/llamafile&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Mozilla-Ocho/llamafile&lt;/a&gt; (Apache-2.0)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/nat/openplayground&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;nat/openplayground&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/nomic-ai/gpt4all&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;nomic-ai/gpt4all&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ollama/ollama&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ollama/ollama&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/oobabooga/text-generation-webui&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;oobabooga/text-generation-webui&lt;/a&gt; (AGPL)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/a-ghorbani/pocketpal-ai&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;PocketPal AI&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/psugihara/FreeChat&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;psugihara/FreeChat&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ptsochantaris/emeltal&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ptsochantaris/emeltal&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/pythops/tenere&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;pythops/tenere&lt;/a&gt; (AGPL)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/containers/ramalama&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ramalama&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/semperai/amica&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;semperai/amica&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/withcatai/catai&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;withcatai/catai&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/blackhole89/autopen&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Autopen&lt;/a&gt; (GPL)&lt;/li&gt;
&lt;/ul&gt;
&lt;/details&gt;
&lt;details&gt;
&lt;summary&gt;Tools&lt;/summary&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/akx/ggify&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;akx/ggify&lt;/a&gt; – download PyTorch models from HuggingFace Hub and convert them to GGML&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/akx/ollama-dl&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;akx/ollama-dl&lt;/a&gt; – download models from the Ollama library to be used directly with llama.cpp&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/crashr/gppm&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;crashr/gppm&lt;/a&gt; – launch llama.cpp instances utilizing NVIDIA Tesla P40 or P100 GPUs with reduced idle power consumption&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/gpustack/gguf-parser-go/tree/main/cmd/gguf-parser&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;gpustack/gguf-parser&lt;/a&gt; - review/check the GGUF file and estimate the memory usage&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://marketplace.unity.com/packages/tools/generative-ai/styled-lines-llama-cpp-model-292902&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Styled Lines&lt;/a&gt; (proprietary licensed, async wrapper of inference part for game development in Unity3d with pre-built Mobile and Web platform wrappers and a model example)&lt;/li&gt;
&lt;/ul&gt;
&lt;/details&gt;
&lt;details&gt;
&lt;summary&gt;Infrastructure&lt;/summary&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/intentee/paddler&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Paddler&lt;/a&gt; - Open-source LLMOps platform for hosting and scaling AI in your own infrastructure&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/gpustack/gpustack&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GPUStack&lt;/a&gt; - Manage GPU clusters for running LLMs&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/onicai/llama_cpp_canister&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;llama_cpp_canister&lt;/a&gt; - llama.cpp as a smart contract on the Internet Computer, using WebAssembly&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/mostlygeek/llama-swap&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;llama-swap&lt;/a&gt; - transparent proxy that adds automatic model switching with llama-server&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/kalavai-net/kalavai-client&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Kalavai&lt;/a&gt; - Crowdsource end-to-end LLM deployment at any scale&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/InftyAI/llmaz&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;llmaz&lt;/a&gt; - ☸️ Easy, advanced inference platform for large language models on Kubernetes.&lt;/li&gt;
&lt;/ul&gt;
&lt;/details&gt;
&lt;details&gt;
&lt;summary&gt;Games&lt;/summary&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/MorganRO8/Lucys_Labyrinth&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Lucy&amp;rsquo;s Labyrinth&lt;/a&gt; - A simple maze game where agents controlled by an AI model will try to trick you.&lt;/li&gt;
&lt;/ul&gt;
&lt;/details&gt;
&lt;h2 id=&#34;supported-backends&#34;&gt;Supported backends
&lt;/h2&gt;&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Backend&lt;/th&gt;
          &lt;th&gt;Target devices&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;docs/build.md#metal-build&#34; &gt;Metal&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;Apple Silicon&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;docs/build.md#blas-build&#34; &gt;BLAS&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;All&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;docs/backend/BLIS.md&#34; &gt;BLIS&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;All&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;docs/backend/SYCL.md&#34; &gt;SYCL&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;Intel and Nvidia GPU&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;docs/build.md#musa&#34; &gt;MUSA&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;Moore Threads GPU&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;docs/build.md#cuda&#34; &gt;CUDA&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;Nvidia GPU&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;docs/build.md#hip&#34; &gt;HIP&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;AMD GPU&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;docs/build.md#vulkan&#34; &gt;Vulkan&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;GPU&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;docs/build.md#cann&#34; &gt;CANN&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;Ascend NPU&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;docs/backend/OPENCL.md&#34; &gt;OpenCL&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;Adreno GPU&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;docs/backend/zDNN.md&#34; &gt;IBM zDNN&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;IBM Z &amp;amp; LinuxONE&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;docs/build.md#webgpu&#34; &gt;WebGPU [In Progress]&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;All&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/tree/master/tools/rpc&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;RPC&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;All&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;obtaining-and-quantizing-models&#34;&gt;Obtaining and quantizing models
&lt;/h2&gt;&lt;p&gt;The &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Hugging Face&lt;/a&gt; platform hosts a &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?library=gguf&amp;amp;sort=trending&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;number of LLMs&lt;/a&gt; compatible with &lt;code&gt;llama.cpp&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?library=gguf&amp;amp;sort=trending&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Trending&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/models?sort=trending&amp;amp;search=llama&amp;#43;gguf&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LLaMA&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can either manually download the GGUF file or directly use any &lt;code&gt;llama.cpp&lt;/code&gt;-compatible models from &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Hugging Face&lt;/a&gt; or other model hosting sites, such as &lt;a class=&#34;link&#34; href=&#34;https://modelscope.cn/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ModelScope&lt;/a&gt;, by using this CLI argument: &lt;code&gt;-hf &amp;lt;user&amp;gt;/&amp;lt;model&amp;gt;[:quant]&lt;/code&gt;. For example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-cli -hf ggml-org/gemma-3-1b-it-GGUF
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;By default, the CLI downloads from Hugging Face; you can switch to another source with the environment variable &lt;code&gt;MODEL_ENDPOINT&lt;/code&gt;. For example, to download model checkpoints from ModelScope or another model-sharing community, set &lt;code&gt;MODEL_ENDPOINT=https://www.modelscope.cn/&lt;/code&gt;.&lt;/p&gt;
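As a sketch of how the endpoint override combines with the `-hf` flag (assuming the same model name is published on the alternate endpoint):

```shell
# Point the downloader at ModelScope instead of the default Hugging Face endpoint
MODEL_ENDPOINT=https://www.modelscope.cn/ llama-cli -hf ggml-org/gemma-3-1b-it-GGUF
```

The variable only changes where the model file is fetched from; everything after the download behaves identically.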
&lt;p&gt;After downloading a model, use the CLI tools described below to run it locally.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;llama.cpp&lt;/code&gt; requires the model to be stored in the &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/ggml/blob/master/docs/gguf.md&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GGUF&lt;/a&gt; file format. Models in other data formats can be converted to GGUF using the &lt;code&gt;convert_*.py&lt;/code&gt; Python scripts in this repo.&lt;/p&gt;
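A conversion run might look like the following sketch (the model directory path is a placeholder, and the exact script name and flags can differ between llama.cpp versions):

```shell
# Convert a locally downloaded Hugging Face model directory to GGUF
# (the convert_*.py scripts live in the llama.cpp repository root)
python convert_hf_to_gguf.py ./models/my-model --outfile my-model-f16.gguf --outtype f16
```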
&lt;p&gt;The Hugging Face platform provides a variety of online tools for converting, quantizing and hosting models with &lt;code&gt;llama.cpp&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use the &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/spaces/ggml-org/gguf-my-repo&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GGUF-my-repo space&lt;/a&gt; to convert to GGUF format and quantize model weights to smaller sizes&lt;/li&gt;
&lt;li&gt;Use the &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/spaces/ggml-org/gguf-my-lora&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GGUF-my-LoRA space&lt;/a&gt; to convert LoRA adapters to GGUF format (more info: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/discussions/10123&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://github.com/ggml-org/llama.cpp/discussions/10123&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Use the &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/spaces/CISCai/gguf-editor&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GGUF-editor space&lt;/a&gt; to edit GGUF meta data in the browser (more info: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/discussions/9268&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://github.com/ggml-org/llama.cpp/discussions/9268&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Use the &lt;a class=&#34;link&#34; href=&#34;https://ui.endpoints.huggingface.co/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Inference Endpoints&lt;/a&gt; to directly host &lt;code&gt;llama.cpp&lt;/code&gt; in the cloud (more info: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/discussions/9669&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://github.com/ggml-org/llama.cpp/discussions/9669&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To learn more about model quantization, &lt;a class=&#34;link&#34; href=&#34;tools/quantize/README.md&#34; &gt;read this documentation&lt;/a&gt;.&lt;/p&gt;
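For a local quantization sketch with the bundled `llama-quantize` tool (file names are placeholders; see the linked documentation for the full list of quantization types):

```shell
# Re-quantize an f16 GGUF model down to 4-bit (Q4_K_M) to cut memory usage
llama-quantize my-model-f16.gguf my-model-q4_k_m.gguf Q4_K_M
```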
&lt;h2 id=&#34;llama-cli&#34;&gt;&lt;a class=&#34;link&#34; href=&#34;tools/main&#34; &gt;&lt;code&gt;llama-cli&lt;/code&gt;&lt;/a&gt;
&lt;/h2&gt;&lt;h4 id=&#34;a-cli-tool-for-accessing-and-experimenting-with-most-of-llamacpps-functionality&#34;&gt;A CLI tool for accessing and experimenting with most of &lt;code&gt;llama.cpp&lt;/code&gt;&amp;rsquo;s functionality.
&lt;/h4&gt;&lt;ul&gt;
&lt;li&gt;
&lt;details open&gt;
  &lt;summary&gt;Run in conversation mode&lt;/summary&gt;
&lt;p&gt;Models with a built-in chat template will automatically activate conversation mode. If this doesn&amp;rsquo;t occur, you can manually enable it by adding &lt;code&gt;-cnv&lt;/code&gt; and specifying a suitable chat template with &lt;code&gt;--chat-template NAME&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;7
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-cli -m model.gguf
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# &amp;gt; hi, who are you?&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# Hi there! I&amp;#39;m your helpful assistant! I&amp;#39;m an AI-powered chatbot designed to assist and provide information to users like you. I&amp;#39;m here to help answer your questions, provide guidance, and offer support on a wide range of topics. I&amp;#39;m a friendly and knowledgeable AI, and I&amp;#39;m always happy to help with anything you need. What&amp;#39;s on your mind, and how can I assist you today?&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# &amp;gt; what is 1+1?&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# Easy peasy! The answer to 1+1 is... 2!&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;details&gt;
  &lt;summary&gt;Run in conversation mode with custom chat template&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;5
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# use the &amp;#34;chatml&amp;#34; template (use -h to see the list of supported templates)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-cli -m model.gguf -cnv --chat-template chatml
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# use a custom template&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-cli -m model.gguf -cnv --in-prefix &lt;span class=&#34;s1&#34;&gt;&amp;#39;User: &amp;#39;&lt;/span&gt; --reverse-prompt &lt;span class=&#34;s1&#34;&gt;&amp;#39;User:&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;details&gt;
  &lt;summary&gt;Run simple text completion&lt;/summary&gt;
&lt;p&gt;To disable conversation mode explicitly, use &lt;code&gt;-no-cnv&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-cli -m model.gguf -p &lt;span class=&#34;s2&#34;&gt;&amp;#34;I believe the meaning of life is&amp;#34;&lt;/span&gt; -n &lt;span class=&#34;m&#34;&gt;128&lt;/span&gt; -no-cnv
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# I believe the meaning of life is to find your own truth and to live in accordance with it. For me, this means being true to myself and following my passions, even if they don&amp;#39;t align with societal expectations. I think that&amp;#39;s what I love about yoga – it&amp;#39;s not just a physical practice, but a spiritual one too. It&amp;#39;s about connecting with yourself, listening to your inner voice, and honoring your own unique journey.&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;details&gt;
  &lt;summary&gt;Constrain the output with a custom grammar&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-cli -m model.gguf -n &lt;span class=&#34;m&#34;&gt;256&lt;/span&gt; --grammar-file grammars/json.gbnf -p &lt;span class=&#34;s1&#34;&gt;&amp;#39;Request: schedule a call at 8pm; Command:&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# {&amp;#34;appointmentTime&amp;#34;: &amp;#34;8pm&amp;#34;, &amp;#34;appointmentDetails&amp;#34;: &amp;#34;schedule a a call&amp;#34;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;The &lt;a class=&#34;link&#34; href=&#34;grammars/&#34; &gt;grammars/&lt;/a&gt; folder contains a handful of sample grammars. To write your own, check out the &lt;a class=&#34;link&#34; href=&#34;grammars/README.md&#34; &gt;GBNF Guide&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For authoring more complex JSON grammars, check out &lt;a class=&#34;link&#34; href=&#34;https://grammar.intrinsiclabs.ai/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://grammar.intrinsiclabs.ai/&lt;/a&gt;&lt;/p&gt;
  &lt;/details&gt;
&lt;/li&gt;
&lt;/ul&gt;
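As a minimal sketch of the GBNF format mentioned above (the grammar below is illustrative, not one of the bundled samples), a grammar file is just production rules expanding from a root symbol:

```shell
# Write a tiny illustrative grammar that restricts output to "yes" or "no".
printf '%s\n' 'root ::= "yes" | "no"' > yesno.gbnf
cat yesno.gbnf

# Pass it to llama-cli (model path is a placeholder):
# llama-cli -m model.gguf --grammar-file yesno.gbnf -p 'Is the sky blue? Answer:' -n 4
```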
&lt;h2 id=&#34;llama-server&#34;&gt;&lt;a class=&#34;link&#34; href=&#34;tools/server&#34; &gt;&lt;code&gt;llama-server&lt;/code&gt;&lt;/a&gt;
&lt;/h2&gt;&lt;h4 id=&#34;a-lightweight-openai-api-compatible-http-server-for-serving-llms&#34;&gt;A lightweight &lt;a class=&#34;link&#34; href=&#34;https://github.com/openai/openai-openapi&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;OpenAI API&lt;/a&gt;-compatible HTTP server for serving LLMs.
&lt;/h4&gt;&lt;ul&gt;
&lt;li&gt;
&lt;details open&gt;
  &lt;summary&gt;Start a local HTTP server with default configuration on port 8080&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-server -m model.gguf --port &lt;span class=&#34;m&#34;&gt;8080&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# Basic web UI can be accessed via browser: http://localhost:8080&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# Chat completion endpoint: http://localhost:8080/v1/chat/completions&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;details&gt;
  &lt;summary&gt;Support multiple users and parallel decoding&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# up to 4 concurrent requests, each with 4096 max context&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-server -m model.gguf -c &lt;span class=&#34;m&#34;&gt;16384&lt;/span&gt; -np &lt;span class=&#34;m&#34;&gt;4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;details&gt;
  &lt;summary&gt;Enable speculative decoding&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# the draft.gguf model should be a small variant of the target model.gguf&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-server -m model.gguf -md draft.gguf
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;details&gt;
  &lt;summary&gt;Serve an embedding model&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# use the /embedding endpoint&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-server -m model.gguf --embedding --pooling cls -ub &lt;span class=&#34;m&#34;&gt;8192&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;details&gt;
  &lt;summary&gt;Serve a reranking model&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# use the /reranking endpoint&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-server -m model.gguf --reranking
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;details&gt;
  &lt;summary&gt;Constrain all outputs with a grammar&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;5
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# custom grammar&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-server -m model.gguf --grammar-file grammar.gbnf
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# JSON&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-server -m model.gguf --grammar-file grammars/json.gbnf
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;/ul&gt;
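Since the server speaks the OpenAI chat-completions protocol, any OpenAI-style client can talk to it; a minimal sketch of a request (assuming a server already running on localhost:8080 as above):

```shell
# Request body for POST /v1/chat/completions (field names follow the OpenAI API;
# the server answers with its loaded model).
printf '%s' '{"messages":[{"role":"user","content":"Hello!"}],"max_tokens":64}' > body.json

# With the server running, send it (commented out here):
# curl -s http://localhost:8080/v1/chat/completions -H 'Content-Type: application/json' -d @body.json
```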
&lt;h2 id=&#34;llama-perplexity&#34;&gt;&lt;a class=&#34;link&#34; href=&#34;tools/perplexity&#34; &gt;&lt;code&gt;llama-perplexity&lt;/code&gt;&lt;/a&gt;
&lt;/h2&gt;&lt;h4 id=&#34;a-tool-for-measuring-the-perplexity--and-other-quality-metrics-of-a-model-over-a-given-text&#34;&gt;A tool for measuring the &lt;a class=&#34;link&#34; href=&#34;tools/perplexity/README.md&#34; &gt;perplexity&lt;/a&gt; &lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; (and other quality metrics) of a model over a given text.
&lt;/h4&gt;&lt;ul&gt;
&lt;li&gt;
&lt;details open&gt;
  &lt;summary&gt;Measure the perplexity over a text file&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-perplexity -m model.gguf -f file.txt
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# [1]15.2701,[2]5.4007,[3]5.3073,[4]6.2965,[5]5.8940,[6]5.6096,[7]5.7942,[8]4.9297, ...&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# Final estimate: PPL = 5.4007 +/- 0.67339&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;details&gt;
  &lt;summary&gt;Measure KL divergence&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# TODO&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;/ul&gt;
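For intuition, perplexity is the exponential of the mean negative log-likelihood per token, so lower is better; a toy calculation with made-up per-token log-probabilities:

```shell
# Three hypothetical per-token natural-log probabilities; PPL = exp(-mean).
printf '%s\n' -2.1 -0.4 -1.3 | awk '{ s += $1; n++ } END { printf "PPL = %.4f\n", exp(-s/n) }'
```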
&lt;h2 id=&#34;llama-bench&#34;&gt;&lt;a class=&#34;link&#34; href=&#34;tools/llama-bench&#34; &gt;&lt;code&gt;llama-bench&lt;/code&gt;&lt;/a&gt;
&lt;/h2&gt;&lt;h4 id=&#34;benchmark-the-performance-of-the-inference-for-various-parameters&#34;&gt;Benchmark inference performance across various parameters.
&lt;/h4&gt;&lt;ul&gt;
&lt;li&gt;
&lt;details open&gt;
  &lt;summary&gt;Run default benchmark&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;9
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-bench -m model.gguf
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# Output:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# | model               |       size |     params | backend    | threads |          test |                  t/s |&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# | ------------------- | ---------: | ---------: | ---------- | ------: | ------------: | -------------------: |&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# | qwen2 1.5B Q4_0     | 885.97 MiB |     1.54 B | Metal,BLAS |      16 |         pp512 |      5765.41 ± 20.55 |&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# | qwen2 1.5B Q4_0     | 885.97 MiB |     1.54 B | Metal,BLAS |      16 |         tg128 |        197.71 ± 0.81 |&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# build: 3e0ba0e60 (4229)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;llama-run&#34;&gt;&lt;a class=&#34;link&#34; href=&#34;tools/run&#34; &gt;&lt;code&gt;llama-run&lt;/code&gt;&lt;/a&gt;
&lt;/h2&gt;&lt;h4 id=&#34;a-comprehensive-example-for-running-llamacpp-models-useful-for-inferencing-used-with-ramalama-&#34;&gt;A comprehensive example for running &lt;code&gt;llama.cpp&lt;/code&gt; models. Useful for inference. Used with RamaLama &lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;.
&lt;/h4&gt;&lt;ul&gt;
&lt;li&gt;
&lt;details&gt;
  &lt;summary&gt;Run a model with a specific prompt (by default the model is pulled from the Ollama registry)&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-run granite-code
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;llama-simple&#34;&gt;&lt;a class=&#34;link&#34; href=&#34;examples/simple&#34; &gt;&lt;code&gt;llama-simple&lt;/code&gt;&lt;/a&gt;
&lt;/h2&gt;&lt;h4 id=&#34;a-minimal-example-for-implementing-apps-with-llamacpp-useful-for-developers&#34;&gt;A minimal example for implementing apps with &lt;code&gt;llama.cpp&lt;/code&gt;. Useful for developers.
&lt;/h4&gt;&lt;ul&gt;
&lt;li&gt;
&lt;details&gt;
  &lt;summary&gt;Basic text completion&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;llama-simple -m model.gguf
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# Hello my name is Kaitlyn and I am a 16 year old girl. I am a junior in high school and I am currently taking a class called &amp;#34;The Art of&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;  &lt;/details&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;contributing&#34;&gt;Contributing
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Contributors can open PRs&lt;/li&gt;
&lt;li&gt;Collaborators will be invited based on contributions&lt;/li&gt;
&lt;li&gt;Maintainers can push to branches in the &lt;code&gt;llama.cpp&lt;/code&gt; repo and merge PRs into the &lt;code&gt;master&lt;/code&gt; branch&lt;/li&gt;
&lt;li&gt;Any help with managing issues, PRs and projects is very appreciated!&lt;/li&gt;
&lt;li&gt;See &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/issues?q=is%3Aissue&amp;#43;is%3Aopen&amp;#43;label%3A%22good&amp;#43;first&amp;#43;issue%22&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;good first issues&lt;/a&gt; for tasks suitable for first contributions&lt;/li&gt;
&lt;li&gt;Read the &lt;a class=&#34;link&#34; href=&#34;CONTRIBUTING.md&#34; &gt;CONTRIBUTING.md&lt;/a&gt; for more information&lt;/li&gt;
&lt;li&gt;Make sure to read this: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/discussions/205&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Inference at the edge&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A bit of backstory for those who are interested: &lt;a class=&#34;link&#34; href=&#34;https://changelog.com/podcast/532&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Changelog podcast&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;other-documentation&#34;&gt;Other documentation
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;tools/main/README.md&#34; &gt;main (cli)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;tools/server/README.md&#34; &gt;server&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;grammars/README.md&#34; &gt;GBNF grammars&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&#34;development-documentation&#34;&gt;Development documentation
&lt;/h4&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;docs/build.md&#34; &gt;How to build&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;docs/docker.md&#34; &gt;Running on Docker&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;docs/android.md&#34; &gt;Build on Android&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;docs/development/token_generation_performance_tips.md&#34; &gt;Performance troubleshooting&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/wiki/GGML-Tips-&amp;amp;-Tricks&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GGML tips &amp;amp; tricks&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&#34;seminal-papers-and-background-on-the-models&#34;&gt;Seminal papers and background on the models
&lt;/h4&gt;&lt;p&gt;If your issue is with model generation quality, then please at least scan the following links and papers to understand the limitations of LLaMA models. This is especially important when choosing an appropriate model size and appreciating both the significant and subtle differences between LLaMA models and ChatGPT:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;LLaMA:
&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://ai.facebook.com/blog/large-language-model-llama-meta-ai/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Introducing LLaMA: A foundational, 65-billion-parameter large language model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://arxiv.org/abs/2302.13971&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LLaMA: Open and Efficient Foundation Language Models&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;GPT-3
&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://arxiv.org/abs/2005.14165&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Language Models are Few-Shot Learners&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;GPT-3.5 / InstructGPT / ChatGPT:
&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://openai.com/research/instruction-following&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Aligning language models to follow instructions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://arxiv.org/abs/2203.02155&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Training language models to follow instructions with human feedback&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;xcframework&#34;&gt;XCFramework
&lt;/h2&gt;&lt;p&gt;The XCFramework is a precompiled version of the library for iOS, visionOS, tvOS,
and macOS. It can be used in Swift projects without the need to compile the
library from source. For example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;12
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;13
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;14
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;15
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;16
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;17
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;18
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;19
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;20
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-swift&#34; data-lang=&#34;swift&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;// swift-tools-version: 5.10&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;// The swift-tools-version declares the minimum version of Swift required to build this package.&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;kd&#34;&gt;import&lt;/span&gt; &lt;span class=&#34;nc&#34;&gt;PackageDescription&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;kd&#34;&gt;let&lt;/span&gt; &lt;span class=&#34;nv&#34;&gt;package&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;Package&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;MyLlamaPackage&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;targets&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;        &lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;executableTarget&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;            &lt;span class=&#34;n&#34;&gt;name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;MyLlamaPackage&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;            &lt;span class=&#34;n&#34;&gt;dependencies&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;                &lt;span class=&#34;s&#34;&gt;&amp;#34;LlamaFramework&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;            &lt;span class=&#34;p&#34;&gt;]),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;        &lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;binaryTarget&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;            &lt;span class=&#34;n&#34;&gt;name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;LlamaFramework&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;            &lt;span class=&#34;n&#34;&gt;url&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;https://github.com/ggml-org/llama.cpp/releases/download/b5046/llama-b5046-xcframework.zip&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;            &lt;span class=&#34;n&#34;&gt;checksum&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;c19be78b5f00d8d29a25da41042cb7afa094cbf6280a225abe614b03b20029ab&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;        &lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;p&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;The above example uses the intermediate build &lt;code&gt;b5046&lt;/code&gt; of the library; to use
a different version, change the URL and checksum.&lt;/p&gt;
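When pinning a different release, the binaryTarget checksum is the SHA-256 of the downloaded zip; a sketch of how it might be computed (the release filename is a placeholder):

```shell
# After downloading e.g. llama-bXXXX-xcframework.zip, either of these yields
# the checksum value for Package.swift:
#   swift package compute-checksum llama-bXXXX-xcframework.zip
#   sha256sum llama-bXXXX-xcframework.zip

# Demonstrate the command shape on a stand-in file:
printf 'demo' > demo.zip
sha256sum demo.zip
```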
&lt;h2 id=&#34;completions&#34;&gt;Completions
&lt;/h2&gt;&lt;p&gt;Command-line completion is available for some environments.&lt;/p&gt;
&lt;h4 id=&#34;bash-completion&#34;&gt;Bash Completion
&lt;/h4&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;$ build/bin/llama-cli --completion-bash &amp;gt; ~/.llama-completion.bash
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;$ &lt;span class=&#34;nb&#34;&gt;source&lt;/span&gt; ~/.llama-completion.bash
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Optionally, add this to your &lt;code&gt;.bashrc&lt;/code&gt; or &lt;code&gt;.bash_profile&lt;/code&gt; so it loads
automatically. For example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-console&#34; data-lang=&#34;console&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;gp&#34;&gt;$&lt;/span&gt; &lt;span class=&#34;nb&#34;&gt;echo&lt;/span&gt; &lt;span class=&#34;s2&#34;&gt;&amp;#34;source ~/.llama-completion.bash&amp;#34;&lt;/span&gt; &amp;gt;&amp;gt; ~/.bashrc
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;h2 id=&#34;dependencies&#34;&gt;Dependencies
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/yhirose/cpp-httplib&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;yhirose/cpp-httplib&lt;/a&gt; - Single-header HTTP server, used by &lt;code&gt;llama-server&lt;/code&gt; - MIT license&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/nothings/stb&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;stb-image&lt;/a&gt; - Single-header image format decoder, used by multimodal subsystem - Public domain&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/nlohmann/json&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;nlohmann/json&lt;/a&gt; - Single-header JSON library, used by various tools/examples - MIT License&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/google/minja&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;minja&lt;/a&gt; - Minimal Jinja parser in C++, used by various tools/examples - MIT License&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;./tools/run/linenoise.cpp/linenoise.cpp&#34; &gt;linenoise.cpp&lt;/a&gt; - C++ library that provides readline-like line editing capabilities, used by &lt;code&gt;llama-run&lt;/code&gt; - BSD 2-Clause License&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://curl.se/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;curl&lt;/a&gt; - Client-side URL transfer library, used by various tools/examples - &lt;a class=&#34;link&#34; href=&#34;https://curl.se/docs/copyright.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;CURL License&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/mackron/miniaudio&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;miniaudio.h&lt;/a&gt; - Single-header audio format decoder, used by multimodal subsystem - Public domain&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/docs/transformers/perplexity&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://huggingface.co/docs/transformers/perplexity&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/containers/ramalama&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;RamaLama&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
        </item>
        <item>
        <title>generative-ai-for-beginners</title>
        <link>https://producthunt.programnotes.cn/en/p/generative-ai-for-beginners/</link>
        <pubDate>Tue, 09 Sep 2025 15:28:49 +0800</pubDate>
        
        <guid>https://producthunt.programnotes.cn/en/p/generative-ai-for-beginners/</guid>
        <description>&lt;img src="https://images.unsplash.com/photo-1657870329074-e5c29e668d2d?ixid=M3w0NjAwMjJ8MHwxfHJhbmRvbXx8fHx8fHx8fDE3NTc0MDI4NTN8&amp;ixlib=rb-4.1.0" alt="Featured image of post generative-ai-for-beginners" /&gt;&lt;h1 id=&#34;microsoftgenerative-ai-for-beginners&#34;&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/microsoft/generative-ai-for-beginners&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;microsoft/generative-ai-for-beginners&lt;/a&gt;
&lt;/h1&gt;&lt;h3 id=&#34;21-lessons-teaching-everything-you-need-to-know-to-start-building-generative-ai-applications&#34;&gt;21 Lessons teaching everything you need to know to start building Generative AI applications
&lt;/h3&gt;&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/microsoft/Generative-AI-For-Beginners/blob/master/LICENSE?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/github/license/microsoft/Generative-AI-For-Beginners.svg&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;GitHub license&#34;
	
	
&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;https://GitHub.com/microsoft/Generative-AI-For-Beginners/graphs/contributors/?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/github/contributors/microsoft/Generative-AI-For-Beginners.svg&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;GitHub contributors&#34;
	
	
&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;https://GitHub.com/microsoft/Generative-AI-For-Beginners/issues/?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/github/issues/microsoft/Generative-AI-For-Beginners.svg&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;GitHub issues&#34;
	
	
&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;https://GitHub.com/microsoft/Generative-AI-For-Beginners/pulls/?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/github/issues-pr/microsoft/Generative-AI-For-Beginners.svg&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;GitHub pull-requests&#34;
	
	
&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;http://makeapullrequest.com?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;PRs Welcome&#34;
	
	
&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://GitHub.com/microsoft/Generative-AI-For-Beginners/watchers/?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/github/watchers/microsoft/Generative-AI-For-Beginners.svg?style=social&amp;amp;label=Watch&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;GitHub watchers&#34;
	
	
&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;https://GitHub.com/microsoft/Generative-AI-For-Beginners/network/?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/github/forks/microsoft/Generative-AI-For-Beginners.svg?style=social&amp;amp;label=Fork&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;GitHub forks&#34;
	
	
&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;https://GitHub.com/microsoft/Generative-AI-For-Beginners/stargazers/?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/github/stars/microsoft/Generative-AI-For-Beginners.svg?style=social&amp;amp;label=Star&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;GitHub stars&#34;
	
	
&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-discord?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://dcbadge.limes.pink/api/server/ByRwuEEgH4&#34;
	
	
	
	loading=&#34;lazy&#34;
	
	
&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&#34;-multi-language-support&#34;&gt;🌐 Multi-Language Support
&lt;/h3&gt;&lt;h4 id=&#34;supported-via-github-action-automated--always-up-to-date&#34;&gt;Supported via GitHub Action (Automated &amp;amp; Always Up-to-Date)
&lt;/h4&gt;&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;./translations/fr/README.md&#34; &gt;French&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/es/README.md&#34; &gt;Spanish&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/de/README.md&#34; &gt;German&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/ru/README.md&#34; &gt;Russian&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/ar/README.md&#34; &gt;Arabic&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/fa/README.md&#34; &gt;Persian (Farsi)&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/ur/README.md&#34; &gt;Urdu&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/zh/README.md&#34; &gt;Chinese (Simplified)&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/mo/README.md&#34; &gt;Chinese (Traditional, Macau)&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/hk/README.md&#34; &gt;Chinese (Traditional, Hong Kong)&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/tw/README.md&#34; &gt;Chinese (Traditional, Taiwan)&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/ja/README.md&#34; &gt;Japanese&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/ko/README.md&#34; &gt;Korean&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/hi/README.md&#34; &gt;Hindi&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/bn/README.md&#34; &gt;Bengali&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/mr/README.md&#34; &gt;Marathi&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/ne/README.md&#34; &gt;Nepali&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/pa/README.md&#34; &gt;Punjabi (Gurmukhi)&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/pt/README.md&#34; &gt;Portuguese (Portugal)&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/br/README.md&#34; &gt;Portuguese 
(Brazil)&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/it/README.md&#34; &gt;Italian&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/lt/README.md&#34; &gt;Lithuanian&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/pl/README.md&#34; &gt;Polish&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/tr/README.md&#34; &gt;Turkish&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/el/README.md&#34; &gt;Greek&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/th/README.md&#34; &gt;Thai&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/sv/README.md&#34; &gt;Swedish&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/da/README.md&#34; &gt;Danish&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/no/README.md&#34; &gt;Norwegian&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/fi/README.md&#34; &gt;Finnish&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/nl/README.md&#34; &gt;Dutch&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/he/README.md&#34; &gt;Hebrew&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/vi/README.md&#34; &gt;Vietnamese&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/id/README.md&#34; &gt;Indonesian&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/ms/README.md&#34; &gt;Malay&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/tl/README.md&#34; &gt;Tagalog (Filipino)&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/sw/README.md&#34; &gt;Swahili&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/hu/README.md&#34; &gt;Hungarian&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/cs/README.md&#34; &gt;Czech&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/sk/README.md&#34; &gt;Slovak&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/ro/README.md&#34; 
&gt;Romanian&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/bg/README.md&#34; &gt;Bulgarian&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/sr/README.md&#34; &gt;Serbian (Cyrillic)&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/hr/README.md&#34; &gt;Croatian&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/sl/README.md&#34; &gt;Slovenian&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/uk/README.md&#34; &gt;Ukrainian&lt;/a&gt; | &lt;a class=&#34;link&#34; href=&#34;./translations/my/README.md&#34; &gt;Burmese (Myanmar)&lt;/a&gt;&lt;/p&gt;
&lt;h1 id=&#34;generative-ai-for-beginners-version-3---a-course&#34;&gt;Generative AI for Beginners (Version 3) - A Course
&lt;/h1&gt;&lt;p&gt;Learn the fundamentals of building Generative AI applications with our 21-lesson comprehensive course by Microsoft Cloud Advocates.&lt;/p&gt;
&lt;h2 id=&#34;-getting-started&#34;&gt;🌱 Getting Started
&lt;/h2&gt;&lt;p&gt;This course has 21 lessons. Each lesson covers its own topic, so start wherever you like!&lt;/p&gt;
&lt;p&gt;Lessons are labeled as either &amp;ldquo;Learn&amp;rdquo; lessons, which explain a Generative AI concept, or &amp;ldquo;Build&amp;rdquo; lessons, which explain a concept and include code examples in both &lt;strong&gt;Python&lt;/strong&gt; and &lt;strong&gt;TypeScript&lt;/strong&gt; where possible.&lt;/p&gt;
&lt;p&gt;For .NET developers, check out &lt;a class=&#34;link&#34; href=&#34;https://github.com/microsoft/Generative-AI-for-beginners-dotnet?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Generative AI for Beginners (.NET Edition)&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;Each lesson also includes a &amp;ldquo;Keep Learning&amp;rdquo; section with additional learning tools.&lt;/p&gt;
&lt;h2 id=&#34;what-you-need&#34;&gt;What You Need
&lt;/h2&gt;&lt;h3 id=&#34;to-run-the-code-of-this-course-you-can-use-either&#34;&gt;To run the code of this course, you can use any of the following:
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-beginners/azure-open-ai?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Azure OpenAI Service&lt;/a&gt; - &lt;strong&gt;Lessons:&lt;/strong&gt; &amp;ldquo;aoai-assignment&amp;rdquo;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-beginners/gh-models?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GitHub Marketplace Model Catalog&lt;/a&gt; - &lt;strong&gt;Lessons:&lt;/strong&gt; &amp;ldquo;githubmodels&amp;rdquo;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-beginners/open-ai?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;OpenAI API&lt;/a&gt; - &lt;strong&gt;Lessons:&lt;/strong&gt; &amp;ldquo;oai-assignment&amp;rdquo;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Basic knowledge of Python or TypeScript is helpful - for absolute beginners, check out these &lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-beginners/python?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Python&lt;/a&gt; and &lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-beginners/typescript?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;TypeScript&lt;/a&gt; courses&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A GitHub account to &lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-beginners/github?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;fork this entire repo&lt;/a&gt; to your own GitHub account&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
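&lt;p&gt;The three services above all expose an OpenAI-compatible chat endpoint, so the course samples can often target any of them by switching the base URL the client points at. A minimal sketch (the helper name and endpoint values are illustrative assumptions; verify them against each provider&amp;rsquo;s documentation):&lt;/p&gt;

```python
# Minimal sketch: map each course-supported provider to the base URL an
# OpenAI-compatible client would use. Endpoint values are assumptions to
# verify against each provider's documentation; "YOUR-RESOURCE" is a
# placeholder for an Azure OpenAI resource name.
def provider_base_url(provider: str) -> str:
    endpoints = {
        "openai": "https://api.openai.com/v1",
        "github-models": "https://models.inference.ai.azure.com",
        "azure-openai": "https://YOUR-RESOURCE.openai.azure.com",
    }
    if provider not in endpoints:
        raise ValueError(f"unknown provider: {provider}")
    return endpoints[provider]
```

&lt;p&gt;For Azure OpenAI specifically, the official SDKs provide a dedicated client with its own authentication flow, so treat the single-base-URL pattern above as a simplification.&lt;/p&gt;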
&lt;p&gt;We have created a &lt;strong&gt;&lt;a class=&#34;link&#34; href=&#34;./00-course-setup/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Course Setup&lt;/a&gt;&lt;/strong&gt; lesson to help you set up your development environment.&lt;/p&gt;
&lt;p&gt;Don&amp;rsquo;t forget to &lt;a class=&#34;link&#34; href=&#34;https://docs.github.com/en/get-started/exploring-projects-on-github/saving-repositories-with-stars?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;star (🌟) this repo&lt;/a&gt; so you can easily find it again later.&lt;/p&gt;
&lt;h2 id=&#34;-ready-to-deploy&#34;&gt;🧠 Ready to Deploy?
&lt;/h2&gt;&lt;p&gt;If you are looking for more advanced code samples, check out our &lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-beg-code?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;collection of Generative AI Code Samples&lt;/a&gt; in both &lt;strong&gt;Python&lt;/strong&gt; and &lt;strong&gt;TypeScript&lt;/strong&gt;.&lt;/p&gt;
&lt;h2 id=&#34;-meet-other-learners-get-support&#34;&gt;🗣️ Meet Other Learners, Get Support
&lt;/h2&gt;&lt;p&gt;Join our &lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-discord?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;official Azure AI Foundry Discord server&lt;/a&gt; to meet and network with other learners taking this course and get support.&lt;/p&gt;
&lt;p&gt;Ask questions or share product feedback in our &lt;a class=&#34;link&#34; href=&#34;https://aka.ms/azureaifoundry/forum&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Azure AI Foundry Developer Forum&lt;/a&gt; on GitHub.&lt;/p&gt;
&lt;h2 id=&#34;-building-a-startup&#34;&gt;🚀 Building a Startup?
&lt;/h2&gt;&lt;p&gt;Visit &lt;a class=&#34;link&#34; href=&#34;https://www.microsoft.com/startups&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Microsoft for Startups&lt;/a&gt; to find out how to get started building with Azure credits today.&lt;/p&gt;
&lt;h2 id=&#34;-want-to-help&#34;&gt;🙏 Want to help?
&lt;/h2&gt;&lt;p&gt;Do you have suggestions, or have you found spelling or code errors? &lt;a class=&#34;link&#34; href=&#34;https://github.com/microsoft/generative-ai-for-beginners/issues?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Raise an issue&lt;/a&gt; or &lt;a class=&#34;link&#34; href=&#34;https://github.com/microsoft/generative-ai-for-beginners/pulls?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;create a pull request&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;-each-lesson-includes&#34;&gt;📂 Each lesson includes:
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;A short video introduction to the topic&lt;/li&gt;
&lt;li&gt;A written lesson located in the README&lt;/li&gt;
&lt;li&gt;Python and TypeScript code samples supporting Azure OpenAI and OpenAI API&lt;/li&gt;
&lt;li&gt;Links to extra resources to continue your learning&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;-lessons&#34;&gt;🗃️ Lessons
&lt;/h2&gt;&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;#&lt;/th&gt;
          &lt;th&gt;&lt;strong&gt;Lesson Link&lt;/strong&gt;&lt;/th&gt;
          &lt;th&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;/th&gt;
          &lt;th&gt;&lt;strong&gt;Video&lt;/strong&gt;&lt;/th&gt;
          &lt;th&gt;&lt;strong&gt;Extra Learning&lt;/strong&gt;&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;00&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./00-course-setup/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Course Setup&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; How to Set Up Your Development Environment&lt;/td&gt;
          &lt;td&gt;Video Coming Soon&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;01&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./01-introduction-to-genai/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Introduction to Generative AI and LLMs&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; Understanding what Generative AI is and how Large Language Models (LLMs) work.&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson-1-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;02&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./02-exploring-and-comparing-different-llms/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Exploring and comparing different LLMs&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; How to select the right model for your use case&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson2-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;03&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./03-using-generative-ai-responsibly/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Using Generative AI Responsibly&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; How to build Generative AI Applications responsibly&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson3-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;04&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./04-prompt-engineering-fundamentals/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Understanding Prompt Engineering Fundamentals&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; Hands-on Prompt Engineering Best Practices&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson4-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;05&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./05-advanced-prompts/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Creating Advanced Prompts&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; How to apply prompt engineering techniques that improve the outcome of your prompts.&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson5-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;06&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./06-text-generation-apps/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Building Text Generation Applications&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Build:&lt;/strong&gt; A text generation app using Azure OpenAI / OpenAI API&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson6-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;07&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./07-building-chat-applications/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Building Chat Applications&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Build:&lt;/strong&gt; Techniques for efficiently building and integrating chat applications.&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lessons7-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;08&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./08-building-search-applications/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Building Search Apps &amp;amp; Vector Databases&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Build:&lt;/strong&gt; A search application that uses Embeddings to search for data.&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson8-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;09&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./09-building-image-applications/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Building Image Generation Applications&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Build:&lt;/strong&gt; An image generation application&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson9-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;10&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./10-building-low-code-ai-applications/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Building Low Code AI Applications&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Build:&lt;/strong&gt; A Generative AI application using Low Code tools&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson10-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;11&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./11-integrating-with-function-calling/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Integrating External Applications with Function Calling&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Build:&lt;/strong&gt; What function calling is and its use cases for applications&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson11-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;12&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./12-designing-ux-for-ai-applications/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Designing UX for AI Applications&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; How to apply UX design principles when developing Generative AI Applications&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson12-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;13&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./13-securing-ai-applications/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Securing Your Generative AI Applications&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; The threats and risks to AI systems and methods to secure these systems.&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson13-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;14&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./14-the-generative-ai-application-lifecycle/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;The Generative AI Application Lifecycle&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; The tools and metrics to manage the LLM Lifecycle and LLMOps&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson14-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;15&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./15-rag-and-vector-databases/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Retrieval Augmented Generation (RAG) and Vector Databases&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Build:&lt;/strong&gt; An application using a RAG Framework to retrieve embeddings from a Vector Database&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson15-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;16&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./16-open-source-models/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Open Source Models and Hugging Face&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Build:&lt;/strong&gt; An application using open source models available on Hugging Face&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson16-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;17&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./17-ai-agents/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;AI Agents&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Build:&lt;/strong&gt; An application using an AI Agent Framework&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson17-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;18&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./18-fine-tuning/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Fine-Tuning LLMs&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; The what, why and how of fine-tuning LLMs&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/gen-ai-lesson18-gh?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Video&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;19&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./19-slm/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Building with SLMs&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; The benefits of building with Small Language Models&lt;/td&gt;
          &lt;td&gt;Video Coming Soon&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;20&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./20-mistral/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Building with Mistral Models&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; The features and differences of the Mistral Family Models&lt;/td&gt;
          &lt;td&gt;Video Coming Soon&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;21&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;./21-meta/README.md?WT.mc_id=academic-105485-koreyst&#34; &gt;Building with Meta Models&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Learn:&lt;/strong&gt; The features and differences of the Meta Family Models&lt;/td&gt;
          &lt;td&gt;Video Coming Soon&lt;/td&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learn More&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id=&#34;-special-thanks&#34;&gt;🌟 Special thanks
&lt;/h3&gt;&lt;p&gt;Special thanks to &lt;a class=&#34;link&#34; href=&#34;https://www.linkedin.com/in/john0isaac/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;strong&gt;John Aziz&lt;/strong&gt;&lt;/a&gt; for creating all of the GitHub Actions and workflows.&lt;/p&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://www.linkedin.com/in/bernhard-merkle-738b73/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;strong&gt;Bernhard Merkle&lt;/strong&gt;&lt;/a&gt; for making key contributions to each lesson to improve the learner and code experience.&lt;/p&gt;
&lt;h2 id=&#34;-other-courses&#34;&gt;🎒 Other Courses
&lt;/h2&gt;&lt;p&gt;Our team produces other courses! Check out:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/microsoft/mcp-for-beginners&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;strong&gt;NEW&lt;/strong&gt; Model Context Protocol for Beginners&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/microsoft/ai-agents-for-beginners&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;AI Agents for Beginners&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/microsoft/Generative-AI-for-beginners-dotnet&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Generative AI for Beginners using .NET&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genai-js-course&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Generative AI for Beginners using JavaScript&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/genaijava&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Generative AI for Beginners using Java&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/ml-beginners&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ML for Beginners&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/datascience-beginners&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Data Science for Beginners&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/ai-beginners&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;AI for Beginners&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/microsoft/Security-101&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Cybersecurity for Beginners&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/webdev-beginners&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Web Dev for Beginners&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/iot-beginners&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;IoT for Beginners&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/microsoft/xr-development-for-beginners&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;XR Development for Beginners&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://aka.ms/GitHubCopilotAI&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Mastering GitHub Copilot for AI Paired Programming&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/microsoft/mastering-github-copilot-for-dotnet-csharp-developers&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Mastering GitHub Copilot for C#/.NET Developers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/microsoft/CopilotAdventures&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Choose Your Own Copilot Adventure&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        </item>
        <item>
        <title>KrillinAI</title>
        <link>https://producthunt.programnotes.cn/en/p/krillinai/</link>
        <pubDate>Wed, 16 Apr 2025 15:29:17 +0800</pubDate>
        
        <guid>https://producthunt.programnotes.cn/en/p/krillinai/</guid>
        <description>&lt;img src="https://images.unsplash.com/photo-1727175401108-6e8bf73ca114?ixid=M3w0NjAwMjJ8MHwxfHJhbmRvbXx8fHx8fHx8fDE3NDQ3ODg0NTh8&amp;ixlib=rb-4.0.3" alt="Featured image of post KrillinAI" /&gt;&lt;h1 id=&#34;krillinaikrillinai&#34;&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/krillinai/KrillinAI&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;krillinai/KrillinAI&lt;/a&gt;
&lt;/h1&gt;&lt;div align=&#34;center&#34;&gt;
  &lt;img src=&#34;./docs/images/logo.png&#34; alt=&#34;KrillinAI&#34; height=&#34;90&#34;&gt;
&lt;h1 id=&#34;ai-audiovideo-translation-and-dubbing-tool&#34;&gt;AI Audio&amp;amp;Video Translation and Dubbing Tool
&lt;/h1&gt;&lt;p&gt;&lt;a href=&#34;https://trendshift.io/repositories/13360&#34; target=&#34;_blank&#34;&gt;&lt;img src=&#34;https://trendshift.io/api/badge/repositories/13360&#34; alt=&#34;krillinai%2FKrillinAI | Trendshift&#34; style=&#34;width: 250px; height: 55px;&#34; width=&#34;250&#34; height=&#34;55&#34;/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a class=&#34;link&#34; href=&#34;./README.md&#34; &gt;English&lt;/a&gt;｜&lt;a class=&#34;link&#34; href=&#34;./docs/README_zh.md&#34; &gt;简体中文&lt;/a&gt;｜&lt;a class=&#34;link&#34; href=&#34;./docs/README_jp.md&#34; &gt;日本語&lt;/a&gt;｜&lt;a class=&#34;link&#34; href=&#34;./docs/README_kr.md&#34; &gt;한국어&lt;/a&gt;｜&lt;a class=&#34;link&#34; href=&#34;./docs/README_fr.md&#34; &gt;Français&lt;/a&gt;｜&lt;a class=&#34;link&#34; href=&#34;./docs/README_de.md&#34; &gt;Deutsch&lt;/a&gt;｜&lt;a class=&#34;link&#34; href=&#34;./docs/README_es.md&#34; &gt;Español&lt;/a&gt;｜&lt;a class=&#34;link&#34; href=&#34;./docs/README_pt.md&#34; &gt;Português&lt;/a&gt;｜&lt;a class=&#34;link&#34; href=&#34;./docs/README_rus.md&#34; &gt;Русский&lt;/a&gt;｜&lt;a class=&#34;link&#34; href=&#34;./docs/README_ar.md&#34; &gt;اللغة العربية&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://x.com/KrillinAI&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/badge/Twitter-KrillinAI-orange?logo=twitter&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;Twitter&#34;
	
	
&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;https://space.bilibili.com/242124650&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/badge/dynamic/json?label=Bilibili&amp;amp;query=%24.data.follower&amp;amp;suffix=%20followers&amp;amp;url=https%3A%2F%2Fapi.bilibili.com%2Fx%2Frelation%2Fstat%3Fvmid%3D242124650&amp;amp;logo=bilibili&amp;amp;color=00A1D6&amp;amp;labelColor=FE7398&amp;amp;logoColor=FFFFFF&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;Bilibili&#34;
	
	
&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;https://jq.qq.com/?_wv=1027&amp;amp;k=754069680&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/badge/QQ%20%e7%be%a4-754069680-green?logo=tencent-qq&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;QQ 群&#34;
	
	
&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;h3 id=&#34;-new-release-for-win--mac-desktop-version--welcome-to-test-and-provide-feedback&#34;&gt;📢 New Release for Win &amp;amp; Mac Desktop Version – Welcome to Test and Provide Feedback
&lt;/h3&gt;&lt;h2 id=&#34;overview&#34;&gt;Overview
&lt;/h2&gt;&lt;p&gt;Krillin AI is an all-in-one solution for effortless video localization and enhancement. This minimalist yet powerful tool handles everything from translation, dubbing, and voice cloning to formatting, seamlessly converting videos between landscape and portrait modes for optimal display across all content platforms (YouTube, TikTok, Bilibili, Douyin, WeChat Channel, RedNote, Kuaishou). With its end-to-end workflow, Krillin AI transforms raw footage into polished, platform-ready content in just a few clicks.&lt;/p&gt;
&lt;h2 id=&#34;key-features&#34;&gt;Key Features:
&lt;/h2&gt;&lt;p&gt;🎯 &lt;strong&gt;One-Click Start&lt;/strong&gt; - Launch your workflow instantly. The new desktop version is even easier to use!&lt;/p&gt;
&lt;p&gt;📥 &lt;strong&gt;Video Download&lt;/strong&gt; - yt-dlp downloads and local file uploads supported&lt;/p&gt;
&lt;p&gt;📜 &lt;strong&gt;Precise Subtitles&lt;/strong&gt; - Whisper-powered high-accuracy recognition&lt;/p&gt;
&lt;p&gt;🧠 &lt;strong&gt;Smart Segmentation&lt;/strong&gt; - LLM-based subtitle chunking &amp;amp; alignment&lt;/p&gt;
&lt;p&gt;🌍 &lt;strong&gt;Professional Translation&lt;/strong&gt; - Paragraph-level translation for consistency&lt;/p&gt;
&lt;p&gt;🔄 &lt;strong&gt;Term Replacement&lt;/strong&gt; - One-click domain-specific vocabulary swap&lt;/p&gt;
&lt;p&gt;🎙️ &lt;strong&gt;Dubbing and Voice Cloning&lt;/strong&gt; - Choose a CosyVoice preset voice or clone your own&lt;/p&gt;
&lt;p&gt;🎬 &lt;strong&gt;Video Composition&lt;/strong&gt; - Auto-formatting for horizontal/vertical layouts&lt;/p&gt;
&lt;h2 id=&#34;showcase&#34;&gt;Showcase
&lt;/h2&gt;&lt;p&gt;The following picture shows the result after a subtitle file, generated with a single click from an imported 46-minute local video, was inserted into the track, with no manual adjustment at all. There are no missing or overlapping subtitles, the sentence segmentation is natural, and the translation quality is high.&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;
&lt;td width=&#34;33%&#34;&gt;
&lt;h3 id=&#34;subtitle-translation&#34;&gt;Subtitle Translation
&lt;/h3&gt;&lt;hr&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/user-attachments/assets/bba1ac0a-fe6b-4947-b58d-ba99306d0339&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://github.com/user-attachments/assets/bba1ac0a-fe6b-4947-b58d-ba99306d0339&lt;/a&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td width=&#34;33%&#34;&gt;
&lt;h3 id=&#34;dubbing&#34;&gt;Dubbing
&lt;/h3&gt;&lt;hr&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/user-attachments/assets/0b32fad3-c3ad-4b6a-abf0-0865f0dd2385&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://github.com/user-attachments/assets/0b32fad3-c3ad-4b6a-abf0-0865f0dd2385&lt;/a&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td width=&#34;33%&#34;&gt;
&lt;h3 id=&#34;portrait&#34;&gt;Portrait
&lt;/h3&gt;&lt;hr&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/user-attachments/assets/c2c7b528-0ef8-4ba9-b8ac-f9f92f6d4e71&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://github.com/user-attachments/assets/c2c7b528-0ef8-4ba9-b8ac-f9f92f6d4e71&lt;/a&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;
&lt;h2 id=&#34;-speech-recognition-support&#34;&gt;🔍 Speech Recognition Support
&lt;/h2&gt;&lt;p&gt;&lt;em&gt;&lt;strong&gt;All local models in the table below support automatic installation of executable files + model files. Just make your selection, and KrillinAI will handle everything else for you.&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Service&lt;/th&gt;
          &lt;th&gt;Supported Platforms&lt;/th&gt;
          &lt;th&gt;Model Options&lt;/th&gt;
          &lt;th&gt;Local/Cloud&lt;/th&gt;
          &lt;th&gt;Notes&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;strong&gt;OpenAI Whisper&lt;/strong&gt;&lt;/td&gt;
          &lt;td&gt;Cross-platform&lt;/td&gt;
          &lt;td&gt;-&lt;/td&gt;
          &lt;td&gt;Cloud&lt;/td&gt;
          &lt;td&gt;Fast with excellent results&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;strong&gt;FasterWhisper&lt;/strong&gt;&lt;/td&gt;
          &lt;td&gt;Windows/Linux&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;tiny&lt;/code&gt;/&lt;code&gt;medium&lt;/code&gt;/&lt;code&gt;large-v2&lt;/code&gt; (recommend medium+)&lt;/td&gt;
          &lt;td&gt;Local&lt;/td&gt;
          &lt;td&gt;Faster speed, no cloud service overhead&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;strong&gt;WhisperKit&lt;/strong&gt;&lt;/td&gt;
          &lt;td&gt;macOS (Apple Silicon only)&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;large-v2&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;Local&lt;/td&gt;
          &lt;td&gt;Native optimization for Apple chips&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;strong&gt;Alibaba Cloud ASR&lt;/strong&gt;&lt;/td&gt;
          &lt;td&gt;Cross-platform&lt;/td&gt;
          &lt;td&gt;-&lt;/td&gt;
          &lt;td&gt;Cloud&lt;/td&gt;
          &lt;td&gt;Bypasses China mainland network issues&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;-large-language-model-support&#34;&gt;🚀 Large Language Model Support
&lt;/h2&gt;&lt;p&gt;✅ Compatible with all &lt;strong&gt;OpenAI API-compatible&lt;/strong&gt; cloud/local LLM services including but not limited to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;OpenAI&lt;/li&gt;
&lt;li&gt;DeepSeek&lt;/li&gt;
&lt;li&gt;Qwen (Tongyi Qianwen)&lt;/li&gt;
&lt;li&gt;Self-hosted open-source models&lt;/li&gt;
&lt;li&gt;Other OpenAI-format compatible API services&lt;/li&gt;
&lt;/ul&gt;
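Since any OpenAI-compatible service can be plugged in, a client only needs to produce the standard chat-completion request shape and point it at the chosen provider's base URL. A minimal Python sketch of that request body follows; the model name and subtitle prompt are illustrative placeholders, not KrillinAI's actual internals:

```python
import json


def chat_request(model: str, text: str) -> dict:
    """Build an OpenAI-compatible /chat/completions request body.

    Any provider accepting this shape (OpenAI, DeepSeek, Qwen, a
    self-hosted server, ...) can serve it; only the base URL and
    API key differ per provider.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "Translate the subtitle line."},
            {"role": "user", "content": text},
        ],
    }


# The same payload works against every provider in the list above;
# switching providers is purely a base-URL/API-key change.
body = chat_request("gpt-4o-mini", "Hello, world")
print(json.dumps(body, ensure_ascii=False))
```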
&lt;h2 id=&#34;-language-support&#34;&gt;🌍 Language Support
&lt;/h2&gt;&lt;p&gt;Input languages: Chinese, English, Japanese, German, and Turkish are supported (more languages being added)&lt;br&gt;
Translation languages: 56 languages supported, including English, Chinese, Russian, Spanish, French, etc.&lt;/p&gt;
&lt;h2 id=&#34;interface-preview&#34;&gt;Interface Preview
&lt;/h2&gt;&lt;h2 id=&#34;-quick-start&#34;&gt;🚀 Quick Start
&lt;/h2&gt;&lt;h3 id=&#34;basic-steps&#34;&gt;Basic Steps
&lt;/h3&gt;&lt;p&gt;First, download the Release executable that matches your device&amp;rsquo;s operating system. Follow the instructions below to choose between the desktop and non-desktop versions, then place the software in an empty folder. Running the program will generate some directories, so keeping it in an empty folder makes management easier.&lt;/p&gt;
&lt;p&gt;[For the desktop version (release files with &amp;ldquo;desktop&amp;rdquo; in the name), refer here]&lt;br&gt;
&lt;em&gt;The desktop version is newly released to address the difficulty beginners face in editing configuration files correctly. It still has some bugs and is being continuously updated.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Double-click the file to start using it.&lt;/p&gt;
&lt;p&gt;[For the non-desktop version (release files without &amp;ldquo;desktop&amp;rdquo; in the name), refer here]&lt;br&gt;
&lt;em&gt;The non-desktop version is the original release, with more complex configuration but stable functionality. It is also suitable for server deployment, as it provides a web-based UI.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Create a &lt;code&gt;config&lt;/code&gt; folder in the directory, then create a &lt;code&gt;config.toml&lt;/code&gt; file inside it. Copy the contents of the &lt;code&gt;config-example.toml&lt;/code&gt; file from the source code&amp;rsquo;s &lt;code&gt;config&lt;/code&gt; directory into your &lt;code&gt;config.toml&lt;/code&gt; and fill in your configuration details. (If you want to use OpenAI models but don’t know how to get a key, you can join the group for free trial access.)&lt;/p&gt;
&lt;p&gt;Double-click the executable or run it in the terminal to start the service.&lt;/p&gt;
&lt;p&gt;Open your browser and enter http://127.0.0.1:8888 to begin using it. (Replace 8888 with the port number you specified in the config file.)&lt;/p&gt;
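The bootstrap steps above can be sketched as a small Python script; the folder and file names follow the prose, and the path to the example file is an assumption you should adapt to wherever you checked out the source:

```python
from pathlib import Path

# Run this next to the KrillinAI executable. Per the README, config/config.toml
# is seeded from config-example.toml (found in the source repo's config/ dir).
cfg_dir = Path("config")
cfg_dir.mkdir(exist_ok=True)

cfg = cfg_dir / "config.toml"
example = Path("config-example.toml")  # copy here from the source repo first
if example.exists():
    cfg.write_text(example.read_text(encoding="utf-8"), encoding="utf-8")
else:
    cfg.touch()  # created empty; fill in your settings by hand

print(cfg)
```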
&lt;h3 id=&#34;to-macos-users&#34;&gt;To: macOS Users
&lt;/h3&gt;&lt;p&gt;[For the desktop version, i.e., release files with &amp;ldquo;desktop&amp;rdquo; in the name, refer here]&lt;br&gt;
The current packaging method for the desktop version cannot support direct double-click execution or DMG installation due to signing issues. Manual trust configuration is required as follows:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Open the directory containing the executable file (assuming the filename is KrillinAI_1.0.0_desktop_macOS_arm64) in Terminal&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Execute the following commands sequentially:&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-fallback&#34; data-lang=&#34;fallback&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;sudo xattr -cr ./KrillinAI_1.0.0_desktop_macOS_arm64  
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;sudo chmod +x ./KrillinAI_1.0.0_desktop_macOS_arm64  
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;./KrillinAI_1.0.0_desktop_macOS_arm64  
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;[For the non-desktop version, i.e., release files without &amp;ldquo;desktop&amp;rdquo; in the name, refer here]&lt;br&gt;
This software is not signed, so after completing the file configuration in the &amp;ldquo;Basic Steps,&amp;rdquo; you will need to manually trust the application on macOS. Follow these steps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Open the terminal and navigate to the directory where the executable file (assuming the file name is &lt;code&gt;KrillinAI_1.0.0_macOS_arm64&lt;/code&gt;) is located.&lt;/li&gt;
&lt;li&gt;Execute the following commands in sequence:&lt;/li&gt;
&lt;/ol&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-fallback&#34; data-lang=&#34;fallback&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;sudo xattr -rd com.apple.quarantine ./KrillinAI_1.0.0_macOS_arm64
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;sudo chmod +x ./KrillinAI_1.0.0_macOS_arm64
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;./KrillinAI_1.0.0_macOS_arm64
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;This will start the service.&lt;/p&gt;
&lt;h3 id=&#34;docker-deployment&#34;&gt;Docker Deployment
&lt;/h3&gt;&lt;p&gt;This project supports Docker deployment. Please refer to the &lt;a class=&#34;link&#34; href=&#34;./docs/docker.md&#34; &gt;Docker Deployment Instructions&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;cookie-configuration-instructions&#34;&gt;Cookie Configuration Instructions
&lt;/h3&gt;&lt;p&gt;If you encounter video download failures, please refer to the &lt;a class=&#34;link&#34; href=&#34;./docs/get_cookies.md&#34; &gt;Cookie Configuration Instructions&lt;/a&gt; to configure your cookie information.&lt;/p&gt;
&lt;h3 id=&#34;configuration-help&#34;&gt;Configuration Help
&lt;/h3&gt;&lt;p&gt;The quickest and most convenient configuration method:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Select &lt;code&gt;openai&lt;/code&gt; for both &lt;code&gt;transcription_provider&lt;/code&gt; and &lt;code&gt;llm_provider&lt;/code&gt;. Of the three major configuration categories (&lt;code&gt;openai&lt;/code&gt;, &lt;code&gt;local_model&lt;/code&gt;, and &lt;code&gt;aliyun&lt;/code&gt;), you then only need to fill in &lt;code&gt;openai.apikey&lt;/code&gt; to start subtitle translation. (Fill in &lt;code&gt;app.proxy&lt;/code&gt;, &lt;code&gt;model&lt;/code&gt;, and &lt;code&gt;openai.base_url&lt;/code&gt; as your situation requires.)&lt;/li&gt;
&lt;/ul&gt;
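The quickest setup described above corresponds to a config.toml along these lines. This is a sketch based on the option names in the prose; the repo's config-example.toml remains the authoritative template, and its exact section layout may differ:

```toml
# Quickest setup: OpenAI for both transcription and LLM.
transcription_provider = "openai"
llm_provider = "openai"

[openai]
apikey = "sk-..."      # the only key you must fill in for this scenario
# base_url = "..."     # only if routing through a proxy/compatible endpoint
# model = "..."        # adjust per your situation

# [local_model] and [aliyun] can stay untouched in this scenario.
```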
&lt;p&gt;Configuration for using a local speech recognition model (not yet supported on macOS), a choice that balances cost, speed, and quality:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Set &lt;code&gt;transcription_provider&lt;/code&gt; to &lt;code&gt;fasterwhisper&lt;/code&gt; and &lt;code&gt;llm_provider&lt;/code&gt; to &lt;code&gt;openai&lt;/code&gt;. You then only need to fill in &lt;code&gt;openai.apikey&lt;/code&gt; and &lt;code&gt;local_model.faster_whisper&lt;/code&gt; to start subtitle translation; the local model will be downloaded automatically. (&lt;code&gt;app.proxy&lt;/code&gt; and &lt;code&gt;openai.base_url&lt;/code&gt; apply as above.)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The following situations require Alibaba Cloud configuration:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If &lt;code&gt;llm_provider&lt;/code&gt; is set to &lt;code&gt;aliyun&lt;/code&gt;, Alibaba Cloud&amp;rsquo;s large model service will be used, so the &lt;code&gt;aliyun.bailian&lt;/code&gt; item must be configured.&lt;/li&gt;
&lt;li&gt;If &lt;code&gt;transcription_provider&lt;/code&gt; is set to &lt;code&gt;aliyun&lt;/code&gt;, or the &amp;ldquo;voice dubbing&amp;rdquo; function is enabled when starting a task, Alibaba Cloud&amp;rsquo;s voice service will be used, so the &lt;code&gt;aliyun.speech&lt;/code&gt; item must be filled in.&lt;/li&gt;
&lt;li&gt;If the &amp;ldquo;voice dubbing&amp;rdquo; function is enabled and local audio files are uploaded for voice timbre cloning, Alibaba Cloud&amp;rsquo;s OSS cloud storage service will also be used, so the &lt;code&gt;aliyun.oss&lt;/code&gt; item must be filled in.
Configuration Guide: &lt;a class=&#34;link&#34; href=&#34;./docs/aliyun.md&#34; &gt;Alibaba Cloud Configuration Instructions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;frequently-asked-questions&#34;&gt;Frequently Asked Questions
&lt;/h2&gt;&lt;p&gt;Please refer to &lt;a class=&#34;link&#34; href=&#34;./docs/faq.md&#34; &gt;Frequently Asked Questions&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;contribution-guidelines&#34;&gt;Contribution Guidelines
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Do not submit unnecessary files like &lt;code&gt;.vscode&lt;/code&gt;, &lt;code&gt;.idea&lt;/code&gt;, etc. Please make good use of &lt;code&gt;.gitignore&lt;/code&gt; to filter them.&lt;/li&gt;
&lt;li&gt;Do not submit &lt;code&gt;config.toml&lt;/code&gt;; instead, submit &lt;code&gt;config-example.toml&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;star-history&#34;&gt;Star History
&lt;/h2&gt;&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://star-history.com/#krillinai/KrillinAI&amp;amp;Date&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://api.star-history.com/svg?repos=krillinai/KrillinAI&amp;amp;type=Date&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;Star History Chart&#34;
	
	
&gt;&lt;/a&gt;&lt;/p&gt;
</description>
        </item>
        
    </channel>
</rss>
