<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Non-Thinking Mode on Producthunt daily</title>
        <link>https://producthunt.programnotes.cn/en/tags/non-thinking-mode/</link>
        <description>Recent content in Non-Thinking Mode on Producthunt daily</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <lastBuildDate>Fri, 25 Jul 2025 15:33:27 +0800</lastBuildDate><atom:link href="https://producthunt.programnotes.cn/en/tags/non-thinking-mode/index.xml" rel="self" type="application/rss+xml" /><item>
        <title>Qwen3</title>
        <link>https://producthunt.programnotes.cn/en/p/qwen3/</link>
        <pubDate>Fri, 25 Jul 2025 15:33:27 +0800</pubDate>
        
        <guid>https://producthunt.programnotes.cn/en/p/qwen3/</guid>
        <description>&lt;img src="https://images.unsplash.com/photo-1537987571070-9df5420854b2?ixid=M3w0NjAwMjJ8MHwxfHJhbmRvbXx8fHx8fHx8fDE3NTM0Mjg3NDh8&amp;ixlib=rb-4.1.0" alt="Featured image of post Qwen3" /&gt;&lt;h1 id=&#34;qwenlmqwen3&#34;&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/QwenLM/Qwen3&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;QwenLM/Qwen3&lt;/a&gt;
&lt;/h1&gt;&lt;h1 id=&#34;qwen3&#34;&gt;Qwen3
&lt;/h1&gt;&lt;p align=&#34;center&#34;&gt;
    &lt;img src=&#34;https://qianwen-res.oss-accelerate-overseas.aliyuncs.com/logo_qwen3.png&#34; width=&#34;400&#34;/&gt;
&lt;p&gt;
&lt;p align=&#34;center&#34;&gt;
          💜 &lt;a href=&#34;https://chat.qwen.ai/&#34;&gt;&lt;b&gt;Qwen Chat&lt;/b&gt;&lt;/a&gt;&amp;nbsp&amp;nbsp | &amp;nbsp&amp;nbsp🤗 &lt;a href=&#34;https://huggingface.co/Qwen&#34;&gt;Hugging Face&lt;/a&gt;&amp;nbsp&amp;nbsp | &amp;nbsp&amp;nbsp🤖 &lt;a href=&#34;https://modelscope.cn/organization/qwen&#34;&gt;ModelScope&lt;/a&gt;&amp;nbsp&amp;nbsp | &amp;nbsp&amp;nbsp 📑 &lt;a href=&#34;https://arxiv.org/abs/2505.09388&#34;&gt;Paper&lt;/a&gt; &amp;nbsp&amp;nbsp | &amp;nbsp&amp;nbsp 📑 &lt;a href=&#34;https://qwenlm.github.io/blog/qwen3/&#34;&gt;Blog&lt;/a&gt; &amp;nbsp&amp;nbsp ｜ &amp;nbsp&amp;nbsp📖 &lt;a href=&#34;https://qwen.readthedocs.io/&#34;&gt;Documentation&lt;/a&gt;
&lt;br&gt;
🖥️ &lt;a href=&#34;https://huggingface.co/spaces/Qwen/Qwen3-Demo&#34;&gt;Demo&lt;/a&gt;&amp;nbsp&amp;nbsp | &amp;nbsp&amp;nbsp💬 &lt;a href=&#34;https://github.com/QwenLM/Qwen/blob/main/assets/wechat.png&#34;&gt;WeChat (微信)&lt;/a&gt;&amp;nbsp&amp;nbsp | &amp;nbsp&amp;nbsp🫨 &lt;a href=&#34;https://discord.gg/CV4E9rpNSD&#34;&gt;Discord&lt;/a&gt;&amp;nbsp&amp;nbsp
&lt;/p&gt;
&lt;p&gt;Visit our Hugging Face or ModelScope organization (click links above), search checkpoints with names starting with &lt;code&gt;Qwen3-&lt;/code&gt; or visit the &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Qwen3 collection&lt;/a&gt;, and you will find all you need! Enjoy!&lt;/p&gt;
&lt;p&gt;To learn more about Qwen3, feel free to read our documentation &lt;/p&gt;
\[[EN](https://qwen.readthedocs.io/en/latest/)|[ZH](https://qwen.readthedocs.io/zh-cn/latest/)\]&lt;p&gt;. Our documentation consists of the following sections:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Quickstart: the basic usages and demonstrations;&lt;/li&gt;
&lt;li&gt;Inference: the guidance for the inference with Transformers, including batch inference, streaming, etc.;&lt;/li&gt;
&lt;li&gt;Run Locally: the instructions for running LLM locally on CPU and GPU, with frameworks like llama.cpp and Ollama;&lt;/li&gt;
&lt;li&gt;Deployment: the demonstration of how to deploy Qwen for large-scale inference with frameworks like SGLang, vLLM, TGI, etc.;&lt;/li&gt;
&lt;li&gt;Quantization: the practice of quantizing LLMs with GPTQ, AWQ, as well as the guidance for how to make high-quality quantized GGUF files;&lt;/li&gt;
&lt;li&gt;Training: the instructions for post-training, including SFT and RLHF (TODO) with frameworks like Axolotl, LLaMA-Factory, etc.&lt;/li&gt;
&lt;li&gt;Framework: the usage of Qwen with frameworks for application, e.g., RAG, Agent, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;introduction&#34;&gt;Introduction
&lt;/h2&gt;&lt;p&gt;We are excited to introduce the updated version of the &lt;strong&gt;Qwen3-235B-A22B non-thinking mode&lt;/strong&gt;, named &lt;strong&gt;Qwen3-235B-A22B-Instruct-2507&lt;/strong&gt;, featuring the following key enhancements:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Significant improvements&lt;/strong&gt; in general capabilities, including &lt;strong&gt;instruction following, logical reasoning, text comprehension, mathematics, science, coding and tool usage&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Substantial gains&lt;/strong&gt; in long-tail knowledge coverage across &lt;strong&gt;multiple languages&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Markedly better alignment&lt;/strong&gt; with user preferences in &lt;strong&gt;subjective and open-ended tasks&lt;/strong&gt;, enabling more helpful responses and higher-quality text generation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Enhanced capabilities&lt;/strong&gt; in &lt;strong&gt;256K-token long-context understanding&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&#34;https://qianwen-res.oss-accelerate.aliyuncs.com/Qwen3-235B-A22B-Instruct-2507.jpeg&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;Qwen3-235B-A22B-Instruct-2507&#34;
	
	
&gt;&lt;/p&gt;
&lt;p&gt;The updated versions of &lt;strong&gt;more Qwen3 model sizes&lt;/strong&gt; and for &lt;strong&gt;thinking mode&lt;/strong&gt; are also expected to be released very soon. Stay tuned🚀&lt;/p&gt;
&lt;details&gt;
    &lt;summary&gt;&lt;b&gt;Previous News for Qwen3 Release&lt;/b&gt;&lt;/summary&gt;
    &lt;p&gt;
    We are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models. 
    These models represent our most advanced and intelligent systems to date, improving from our experience in building QwQ and Qwen2.5.
    We are making the weights of Qwen3 available to the public, including both dense and Mixture-of-Expert (MoE) models. 
    &lt;br&gt;&lt;br&gt;
    The highlights from Qwen3 include:
        &lt;ul&gt;
            &lt;li&gt;&lt;b&gt;Dense and Mixture-of-Experts (MoE) models of various sizes&lt;/b&gt;, available in 0.6B, 1.7B, 4B, 8B, 14B, 32B and 30B-A3B, 235B-A22B.&lt;/li&gt;
            &lt;li&gt;&lt;b&gt;Seamless switching between thinking mode&lt;/b&gt; (for complex logical reasoning, math, and coding) and &lt;b&gt;non-thinking mode&lt;/b&gt; (for efficient, general-purpose chat), ensuring optimal performance across various scenarios.&lt;/li&gt;
            &lt;li&gt;&lt;b&gt;Significantly enhancement in reasoning capabilities&lt;/b&gt;, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.&lt;/li&gt;
            &lt;li&gt;&lt;b&gt;Superior human preference alignment&lt;/b&gt;, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience.&lt;/li&gt;
            &lt;li&gt;&lt;b&gt;Expertise in agent capabilities&lt;/b&gt;, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks.&lt;/li&gt;
            &lt;li&gt;&lt;b&gt;Support of 100+ languages and dialects&lt;/b&gt; with strong capabilities for &lt;b&gt;multilingual instruction following&lt;/b&gt; and &lt;b&gt;translation&lt;/b&gt;.&lt;/li&gt;
        &lt;/ul&gt;
    &lt;/p&gt;
&lt;/details&gt;
&lt;h2 id=&#34;news&#34;&gt;News
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;2025.07.21: We released the updated version of Qwen3-235B-A22B non-thinking mode, named Qwen3-235B-A22B-Instruct-2507, featuring significant enhancements over the previous version and supporting 256K-token long-context understanding. Check our &lt;a class=&#34;link&#34; href=&#34;https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;modelcard&lt;/a&gt; for more details!&lt;/li&gt;
&lt;li&gt;2025.04.29: We released the Qwen3 series. Check our &lt;a class=&#34;link&#34; href=&#34;https://qwenlm.github.io/blog/qwen3&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;blog&lt;/a&gt; for more details!&lt;/li&gt;
&lt;li&gt;2024.09.19: We released the Qwen2.5 series. This time there are 3 extra model sizes: 3B, 14B, and 32B for more possibilities. Check our &lt;a class=&#34;link&#34; href=&#34;https://qwenlm.github.io/blog/qwen2.5&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;blog&lt;/a&gt; for more!&lt;/li&gt;
&lt;li&gt;2024.06.06: We released the Qwen2 series. Check our &lt;a class=&#34;link&#34; href=&#34;https://qwenlm.github.io/blog/qwen2/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;blog&lt;/a&gt;!&lt;/li&gt;
&lt;li&gt;2024.03.28: We released the first MoE model of Qwen: Qwen1.5-MoE-A2.7B! Temporarily, only HF transformers and vLLM support the model. We will soon add the support of llama.cpp, mlx-lm, etc. Check our &lt;a class=&#34;link&#34; href=&#34;https://qwenlm.github.io/blog/qwen-moe/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;blog&lt;/a&gt; for more information!&lt;/li&gt;
&lt;li&gt;2024.02.05: We released the Qwen1.5 series.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;performance&#34;&gt;Performance
&lt;/h2&gt;&lt;p&gt;Detailed evaluation results are reported in this &lt;a class=&#34;link&#34; href=&#34;https://qwenlm.github.io/blog/qwen3/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;📑 blog&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For requirements on GPU memory and the respective throughput, see results &lt;a class=&#34;link&#34; href=&#34;https://qwen.readthedocs.io/en/latest/getting_started/speed_benchmark.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;run-qwen3&#34;&gt;Run Qwen3
&lt;/h2&gt;&lt;h3 id=&#34;-transformers&#34;&gt;🤗 Transformers
&lt;/h3&gt;&lt;p&gt;Transformers is a library of pretrained natural language processing for inference and training.
The latest version of &lt;code&gt;transformers&lt;/code&gt; is recommended and &lt;code&gt;transformers&amp;gt;=4.51.0&lt;/code&gt; is required.&lt;/p&gt;
&lt;p&gt;The following contains a code snippet illustrating how to use Qwen3-235B-A22B-Instruct-2507 to generate content based on given inputs.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;12
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;13
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;14
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;15
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;16
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;17
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;18
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;19
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;20
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;21
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;22
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;23
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;24
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;25
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;26
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;27
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;28
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;29
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;30
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;31
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;32
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;33
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;34
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;kn&#34;&gt;from&lt;/span&gt; &lt;span class=&#34;nn&#34;&gt;transformers&lt;/span&gt; &lt;span class=&#34;kn&#34;&gt;import&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;AutoTokenizer&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;model_name&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s2&#34;&gt;&amp;#34;Qwen/Qwen3-235B-A22B-Instruct-2507&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# load the tokenizer and the model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;tokenizer&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;AutoTokenizer&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;from_pretrained&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;model_name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;model&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;from_pretrained&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;model_name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;torch_dtype&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s2&#34;&gt;&amp;#34;auto&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;device_map&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s2&#34;&gt;&amp;#34;auto&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# prepare the model input&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;prompt&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s2&#34;&gt;&amp;#34;Give me a short introduction to large language model.&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;messages&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;p&#34;&gt;{&lt;/span&gt;&lt;span class=&#34;s2&#34;&gt;&amp;#34;role&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;s2&#34;&gt;&amp;#34;user&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s2&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;prompt&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;text&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;tokenizer&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;apply_chat_template&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;messages&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;tokenize&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;kc&#34;&gt;False&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;add_generation_prompt&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;kc&#34;&gt;True&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;model_inputs&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;tokenizer&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;([&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;text&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;],&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;return_tensors&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s2&#34;&gt;&amp;#34;pt&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;to&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;model&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;device&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# conduct text completion&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;generated_ids&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;model&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;generate&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;o&#34;&gt;**&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;model_inputs&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;max_new_tokens&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;mi&#34;&gt;16384&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;output_ids&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;generated_ids&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;[&lt;/span&gt;&lt;span class=&#34;mi&#34;&gt;0&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;][&lt;/span&gt;&lt;span class=&#34;nb&#34;&gt;len&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;model_inputs&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;input_ids&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;[&lt;/span&gt;&lt;span class=&#34;mi&#34;&gt;0&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;]):]&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;tolist&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;()&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;content&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;tokenizer&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;decode&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;output_ids&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;skip_special_tokens&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;kc&#34;&gt;True&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nb&#34;&gt;print&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s2&#34;&gt;&amp;#34;content:&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;content&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;blockquote&gt;
&lt;p&gt;[!Note]
The updated version of Qwen3-235B-A22B, namely &lt;strong&gt;Qwen3-235B-A22B-Instruct-2507&lt;/strong&gt; supports &lt;strong&gt;only non-thinking mode&lt;/strong&gt; and &lt;strong&gt;does not generate &lt;code&gt;&amp;lt;think&amp;gt;&amp;lt;/think&amp;gt;&lt;/code&gt; blocks&lt;/strong&gt; in its output. Meanwhile, &lt;strong&gt;specifying &lt;code&gt;enable_thinking=False&lt;/code&gt; is no longer required&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;details&gt;
    &lt;summary&gt;&lt;b&gt;Switching Thinking/Non-thinking Modes for Previous Qwen3 Hybrid Models&lt;/b&gt;&lt;/summary&gt;
    &lt;p&gt;
    By default, Qwen3 models will think before response.
    This could be controlled by
        &lt;ul&gt;
            &lt;li&gt;&lt;code&gt;enable_thinking=False&lt;/code&gt;: Passing &lt;code&gt;enable_thinking=False&lt;/code&gt; to `tokenizer.apply_chat_template` will strictly prevent the model from generating thinking content.&lt;/li&gt;
            &lt;li&gt;&lt;code&gt;/think&lt;/code&gt; and &lt;code&gt;/no_think&lt;/code&gt; instructions: Use those words in the system or user message to signify whether Qwen3 should think. In multi-turn conversations, the latest instruction is followed.&lt;/li&gt;
        &lt;/ul&gt;
    &lt;/p&gt;
&lt;/details&gt;
&lt;h3 id=&#34;modelscope&#34;&gt;ModelScope
&lt;/h3&gt;&lt;p&gt;We strongly advise users especially those in mainland China to use ModelScope.
ModelScope adopts a Python API similar to Transformers.
The CLI tool &lt;code&gt;modelscope download&lt;/code&gt; can help you solve issues concerning downloading checkpoints.&lt;/p&gt;
&lt;h3 id=&#34;llamacpp&#34;&gt;llama.cpp
&lt;/h3&gt;&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;code&gt;llama.cpp&lt;/code&gt;&lt;/a&gt; enables LLM inference with minimal setup and state-of-the-art performance on a wide range of hardware.
&lt;code&gt;llama.cpp&amp;gt;=b5092&lt;/code&gt; is required for the support of Qwen3 architecture.
&lt;code&gt;llama.cpp&amp;gt;=b5401&lt;/code&gt; is recommended for the full support of the official Qwen3 chat template.&lt;/p&gt;
&lt;p&gt;To use the CLI, run the following in a terminal:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;./llama-cli -hf Qwen/Qwen3-8B-GGUF:Q8_0 --jinja --color -ngl &lt;span class=&#34;m&#34;&gt;99&lt;/span&gt; -fa -sm row --temp 0.6 --top-k &lt;span class=&#34;m&#34;&gt;20&lt;/span&gt; --top-p 0.95 --min-p &lt;span class=&#34;m&#34;&gt;0&lt;/span&gt; -c &lt;span class=&#34;m&#34;&gt;40960&lt;/span&gt; -n &lt;span class=&#34;m&#34;&gt;32768&lt;/span&gt; --no-context-shift
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# CTRL+C to exit&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;To use the API server, run the following in a terminal:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;./llama-server -hf Qwen/Qwen3-8B-GGUF:Q8_0 --jinja --reasoning-format deepseek -ngl &lt;span class=&#34;m&#34;&gt;99&lt;/span&gt; -fa -sm row --temp 0.6 --top-k &lt;span class=&#34;m&#34;&gt;20&lt;/span&gt; --top-p 0.95 --min-p &lt;span class=&#34;m&#34;&gt;0&lt;/span&gt; -c &lt;span class=&#34;m&#34;&gt;40960&lt;/span&gt; -n &lt;span class=&#34;m&#34;&gt;32768&lt;/span&gt; --no-context-shift --port &lt;span class=&#34;m&#34;&gt;8080&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;A simple web front end will be at &lt;code&gt;http://localhost:8080&lt;/code&gt; and an OpenAI-compatible API will be at &lt;code&gt;http://localhost:8080/v1&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;For additional guides, please refer to &lt;a class=&#34;link&#34; href=&#34;https://qwen.readthedocs.io/en/latest/run_locally/llama.cpp.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;our documentation&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[!TIP]
llama.cpp adopts &amp;ldquo;rotating context management&amp;rdquo; and infinite generation is made possible by evicting earlier tokens.
It could configured by parameters and the commands above effectively disable it.
For more details, please refer to &lt;a class=&#34;link&#34; href=&#34;https://qwen.readthedocs.io/en/latest/run_locally/llama.cpp.html#llama-cli&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;our documentation&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3 id=&#34;ollama&#34;&gt;Ollama
&lt;/h3&gt;&lt;p&gt;After &lt;a class=&#34;link&#34; href=&#34;https://ollama.com/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;installing Ollama&lt;/a&gt;, you can initiate the Ollama service with the following command (Ollama v0.6.6 or higher is required):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;ollama serve
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# You need to keep this service running whenever you are using ollama&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;To pull a model checkpoint and run the model, use the &lt;code&gt;ollama run&lt;/code&gt; command. You can specify a model size by adding a suffix to &lt;code&gt;qwen3&lt;/code&gt;, such as &lt;code&gt;:8b&lt;/code&gt; or &lt;code&gt;:30b-a3b&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;ollama run qwen3:8b
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# Setting parameters, type &amp;#34;/set parameter num_ctx 40960&amp;#34; and &amp;#34;/set parameter num_predict 32768&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;# To exit, type &amp;#34;/bye&amp;#34; and press ENTER&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;You can also access the Ollama service via its OpenAI-compatible API.
Please note that you need to (1) keep &lt;code&gt;ollama serve&lt;/code&gt; running while using the API, and (2) execute &lt;code&gt;ollama run qwen3:8b&lt;/code&gt; before utilizing this API to ensure that the model checkpoint is prepared.
The API is at &lt;code&gt;http://localhost:11434/v1/&lt;/code&gt; by default.&lt;/p&gt;
&lt;p&gt;For additional details, please visit &lt;a class=&#34;link&#34; href=&#34;https://ollama.com/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;ollama.ai&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[!TIP]
Ollama adopts the same &amp;ldquo;rotating context management&amp;rdquo; with llama.cpp.
However, its default settings (&lt;code&gt;num_ctx&lt;/code&gt; 2048 and &lt;code&gt;num_predict&lt;/code&gt; -1), suggesting infinite generation with a 2048-token context,
could lead to trouble for Qwen3 models.
We recommend setting &lt;code&gt;num_ctx&lt;/code&gt; and &lt;code&gt;num_predict&lt;/code&gt; properly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3 id=&#34;lmstudio&#34;&gt;LMStudio
&lt;/h3&gt;&lt;p&gt;Qwen3 has already been supported by &lt;a class=&#34;link&#34; href=&#34;https://lmstudio.ai/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;lmstudio.ai&lt;/a&gt;. You can directly use LMStudio with our GGUF files.&lt;/p&gt;
&lt;h3 id=&#34;executorch&#34;&gt;ExecuTorch
&lt;/h3&gt;&lt;p&gt;To export and run on ExecuTorch (iOS, Android, Mac, Linux, and more), please follow this &lt;a class=&#34;link&#34; href=&#34;https://github.com/pytorch/executorch/blob/main/examples/models/qwen3/README.md&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;example&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;mnn&#34;&gt;MNN
&lt;/h3&gt;&lt;p&gt;To export and run on MNN, which supports Qwen3 on mobile devices, please visit &lt;a class=&#34;link&#34; href=&#34;https://github.com/alibaba/MNN&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Alibaba MNN&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;mlx-lm&#34;&gt;MLX LM
&lt;/h3&gt;&lt;p&gt;If you are running on Apple Silicon, &lt;a class=&#34;link&#34; href=&#34;https://github.com/ml-explore/mlx-lm&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;code&gt;mlx-lm&lt;/code&gt;&lt;/a&gt; also supports Qwen3 (&lt;code&gt;mlx-lm&amp;gt;=0.24.0&lt;/code&gt;).
Look for models ending with MLX on Hugging Face Hub.&lt;/p&gt;
&lt;h3 id=&#34;openvino&#34;&gt;OpenVINO
&lt;/h3&gt;&lt;p&gt;If you are running on Intel CPU or GPU, &lt;a class=&#34;link&#34; href=&#34;https://github.com/openvinotoolkit&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;OpenVINO toolkit&lt;/a&gt; supports Qwen3.
You can follow this &lt;a class=&#34;link&#34; href=&#34;https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-chatbot/llm-chatbot.ipynb&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;chatbot example&lt;/a&gt;.&lt;/p&gt;
&lt;!-- ### Text generation web UI

You can directly use [`text-generation-webui`](https://github.com/oobabooga/text-generation-webui) for creating a web UI demo. If you use GGUF, remember to install the latest wheel of `llama.cpp` with the support of Qwen2.5. --&gt;
&lt;!-- ### llamafile

Clone [`llamafile`](https://github.com/Mozilla-Ocho/llamafile), run source install, and then create your own llamafile with the GGUF file following the guide [here](https://github.com/Mozilla-Ocho/llamafile?tab=readme-ov-file#creating-llamafiles). You are able to run one line of command, say `./qwen.llamafile`, to create a demo. --&gt;
&lt;h2 id=&#34;deploy-qwen3&#34;&gt;Deploy Qwen3
&lt;/h2&gt;&lt;p&gt;Qwen3 is supported by multiple inference frameworks.
Here we demonstrate the usage of &lt;code&gt;SGLang&lt;/code&gt;, &lt;code&gt;vLLM&lt;/code&gt; and &lt;code&gt;TensorRT-LLM&lt;/code&gt;.
You can also find Qwen3 models from various inference providers, e.g., &lt;a class=&#34;link&#34; href=&#34;https://www.alibabacloud.com/en/product/modelstudio&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Alibaba Cloud Model Studio&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;sglang&#34;&gt;SGLang
&lt;/h3&gt;&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/sgl-project/sglang&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;SGLang&lt;/a&gt; is a fast serving framework for large language models and vision language models.
SGLang could be used to launch a server with OpenAI-compatible API service.
&lt;code&gt;sglang&amp;gt;=0.4.6.post1&lt;/code&gt; is required.
It is as easy as&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;python -m sglang.launch_server --model-path Qwen/Qwen3-8B --port &lt;span class=&#34;m&#34;&gt;30000&lt;/span&gt; --reasoning-parser qwen3
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;An OpenAI-compatible API will be available at &lt;code&gt;http://localhost:30000/v1&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id=&#34;vllm&#34;&gt;vLLM
&lt;/h3&gt;&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/vllm-project/vllm&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;vLLM&lt;/a&gt; is a high-throughput and memory-efficient inference and serving engine for LLMs.
&lt;code&gt;vllm&amp;gt;=0.8.5&lt;/code&gt; is recommended.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;vllm serve Qwen/Qwen3-8B --port &lt;span class=&#34;m&#34;&gt;8000&lt;/span&gt; --enable-reasoning --reasoning-parser deepseek_r1
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;An OpenAI-compatible API will be available at &lt;code&gt;http://localhost:8000/v1&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id=&#34;tensorrt-llm&#34;&gt;TensorRT-LLM
&lt;/h3&gt;&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/NVIDIA/TensorRT-LLM&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;TensorRT-LLM&lt;/a&gt; is an open-source LLM inference engine from NVIDIA, which provides optimizations including custom attention kernels, quantization and more on NVIDIA GPUs. Qwen3 is supported in its re-architected &lt;a class=&#34;link&#34; href=&#34;https://nvidia.github.io/TensorRT-LLM/torch.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;PyTorch backend&lt;/a&gt;. &lt;code&gt;tensorrt_llm&amp;gt;=0.20.0rc3&lt;/code&gt; is recommended. Please refer to the &lt;a class=&#34;link&#34; href=&#34;https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/models/core/qwen/README.md#qwen3&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;README&lt;/a&gt; page for more details.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;trtllm-serve Qwen/Qwen3-8B --host localhost --port &lt;span class=&#34;m&#34;&gt;8000&lt;/span&gt; --backend pytorch
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;An OpenAI-compatible API will be available at &lt;code&gt;http://localhost:8000/v1&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id=&#34;mindie&#34;&gt;MindIE
&lt;/h3&gt;&lt;p&gt;For deployment on Ascend NPUs, please visit &lt;a class=&#34;link&#34; href=&#34;https://modelers.cn/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Modelers&lt;/a&gt; and search for Qwen3.&lt;/p&gt;
&lt;!-- 
### OpenLLM

[OpenLLM](https://github.com/bentoml/OpenLLM) allows you to easily run Qwen2.5 as OpenAI-compatible APIs. You can start a model server using `openllm serve`. For example:

```bash
openllm serve qwen2.5:7b
```

The server is active at `http://localhost:3000/`, providing OpenAI-compatible APIs. You can create an OpenAI client to call its chat API. For more information, refer to [our documentation](https://qwen.readthedocs.io/en/latest/deployment/openllm.html). --&gt;
&lt;h2 id=&#34;build-with-qwen3&#34;&gt;Build with Qwen3
&lt;/h2&gt;&lt;h3 id=&#34;tool-use&#34;&gt;Tool Use
&lt;/h3&gt;&lt;p&gt;For tool use capabilities, we recommend taking a look at &lt;a class=&#34;link&#34; href=&#34;https://github.com/QwenLM/Qwen-Agent&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Qwen-Agent&lt;/a&gt;, which provides a wrapper around these APIs to support tool use or function calling with MCP support.
Tool use with Qwen3 can also be conducted with SGLang, vLLM, Transformers, llama.cpp, Ollama, etc.
Follow guides in our documentation to see how to enable the support.&lt;/p&gt;
&lt;h3 id=&#34;finetuning&#34;&gt;Finetuning
&lt;/h3&gt;&lt;p&gt;We advise you to use training frameworks, including &lt;a class=&#34;link&#34; href=&#34;https://github.com/OpenAccess-AI-Collective/axolotl&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Axolotl&lt;/a&gt;, &lt;a class=&#34;link&#34; href=&#34;https://github.com/unslothai/unsloth&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;UnSloth&lt;/a&gt;, &lt;a class=&#34;link&#34; href=&#34;https://github.com/modelscope/swift&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Swift&lt;/a&gt;, &lt;a class=&#34;link&#34; href=&#34;https://github.com/hiyouga/LLaMA-Factory&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Llama-Factory&lt;/a&gt;, etc., to finetune your models with SFT, DPO, GRPO, etc.&lt;/p&gt;
&lt;h2 id=&#34;license-agreement&#34;&gt;License Agreement
&lt;/h2&gt;&lt;p&gt;All our open-weight models are licensed under Apache 2.0.
You can find the license files in the respective Hugging Face repositories.&lt;/p&gt;
&lt;h2 id=&#34;citation&#34;&gt;Citation
&lt;/h2&gt;&lt;p&gt;If you find our work helpful, feel free to give us a cite.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;12
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;13
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;14
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;15
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;16
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;17
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;18
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;19
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;20
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bibtex&#34; data-lang=&#34;bibtex&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nc&#34;&gt;@article&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;{&lt;/span&gt;&lt;span class=&#34;nl&#34;&gt;qwen3&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;na&#34;&gt;title&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;{Qwen3 Technical Report}&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;na&#34;&gt;author&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;{An Yang and Anfeng Li and Baosong Yang and Beichen Zhang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Gao and Chengen Huang and Chenxu Lv and Chujie Zheng and Dayiheng Liu and Fan Zhou and Fei Huang and Feng Hu and Hao Ge and Haoran Wei and Huan Lin and Jialong Tang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Yang and Jiaxi Yang and Jing Zhou and Jingren Zhou and Junyang Lin and Kai Dang and Keqin Bao and Kexin Yang and Le Yu and Lianghao Deng and Mei Li and Mingfeng Xue and Mingze Li and Pei Zhang and Peng Wang and Qin Zhu and Rui Men and Ruize Gao and Shixuan Liu and Shuang Luo and Tianhao Li and Tianyi Tang and Wenbiao Yin and Xingzhang Ren and Xinyu Wang and Xinyu Zhang and Xuancheng Ren and Yang Fan and Yang Su and Yichang Zhang and Yinger Zhang and Yu Wan and Yuqiong Liu and Zekun Wang and Zeyu Cui and Zhenru Zhang and Zhipeng Zhou and Zihan Qiu}&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;na&#34;&gt;journal&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;{arXiv preprint arXiv:2505.09388}&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;na&#34;&gt;year&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;{2025}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nc&#34;&gt;@article&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;{&lt;/span&gt;&lt;span class=&#34;nl&#34;&gt;qwen2.5&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;na&#34;&gt;title&lt;/span&gt;   &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;{Qwen2.5 Technical Report}&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;na&#34;&gt;author&lt;/span&gt;  &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;{An Yang and Baosong Yang and Beichen Zhang and Binyuan Hui and Bo Zheng and Bowen Yu and Chengyuan Li and Dayiheng Liu and Fei Huang and Haoran Wei and Huan Lin and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Yang and Jiaxi Yang and Jingren Zhou and Junyang Lin and Kai Dang and Keming Lu and Keqin Bao and Kexin Yang and Le Yu and Mei Li and Mingfeng Xue and Pei Zhang and Qin Zhu and Rui Men and Runji Lin and Tianhao Li and Tingyu Xia and Xingzhang Ren and Xuancheng Ren and Yang Fan and Yang Su and Yichang Zhang and Yu Wan and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zihan Qiu}&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;na&#34;&gt;journal&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;{arXiv preprint arXiv:2412.15115}&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;na&#34;&gt;year&lt;/span&gt;    &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;{2024}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nc&#34;&gt;@article&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;{&lt;/span&gt;&lt;span class=&#34;nl&#34;&gt;qwen2&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;na&#34;&gt;title&lt;/span&gt;   &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;{Qwen2 Technical Report}&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;na&#34;&gt;author&lt;/span&gt;  &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;{An Yang and Baosong Yang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Zhou and Chengpeng Li and Chengyuan Li and Dayiheng Liu and Fei Huang and Guanting Dong and Haoran Wei and Huan Lin and Jialong Tang and Jialin Wang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Ma and Jin Xu and Jingren Zhou and Jinze Bai and Jinzheng He and Junyang Lin and Kai Dang and Keming Lu and Keqin Chen and Kexin Yang and Mei Li and Mingfeng Xue and Na Ni and Pei Zhang and Peng Wang and Ru Peng and Rui Men and Ruize Gao and Runji Lin and Shijie Wang and Shuai Bai and Sinan Tan and Tianhang Zhu and Tianhao Li and Tianyu Liu and Wenbin Ge and Xiaodong Deng and Xiaohuan Zhou and Xingzhang Ren and Xinyu Zhang and Xipin Wei and Xuancheng Ren and Yang Fan and Yang Yao and Yichang Zhang and Yu Wan and Yunfei Chu and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zhihao Fan}&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;na&#34;&gt;journal&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;{arXiv preprint arXiv:2407.10671}&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;na&#34;&gt;year&lt;/span&gt;    &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;{2024}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;h2 id=&#34;contact-us&#34;&gt;Contact Us
&lt;/h2&gt;&lt;p&gt;If you are interested to leave a message to either our research team or product team, join our &lt;a class=&#34;link&#34; href=&#34;https://discord.gg/z3GAxXZ9Ce&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Discord&lt;/a&gt; or &lt;a class=&#34;link&#34; href=&#34;assets/wechat.png&#34; &gt;WeChat groups&lt;/a&gt;!&lt;/p&gt;
</description>
        </item>
        
    </channel>
</rss>
