<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Quantization on Producthunt daily</title>
        <link>https://producthunt.programnotes.cn/en/tags/quantization/</link>
        <description>Recent content in Quantization on Producthunt daily</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <lastBuildDate>Fri, 08 Aug 2025 15:39:19 +0800</lastBuildDate><atom:link href="https://producthunt.programnotes.cn/en/tags/quantization/index.xml" rel="self" type="application/rss+xml" /><item>
        <title>vllm</title>
        <link>https://producthunt.programnotes.cn/en/p/vllm/</link>
        <pubDate>Fri, 08 Aug 2025 15:39:19 +0800</pubDate>
        
        <guid>https://producthunt.programnotes.cn/en/p/vllm/</guid>
        <description>&lt;img src="https://images.unsplash.com/photo-1531914082256-1b9047242426?ixid=M3w0NjAwMjJ8MHwxfHJhbmRvbXx8fHx8fHx8fDE3NTQ2Mzg3MjN8&amp;ixlib=rb-4.1.0" alt="Featured image of post vllm" /&gt;&lt;h1 id=&#34;vllm-projectvllm&#34;&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/vllm-project/vllm&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;vllm-project/vllm&lt;/a&gt;
&lt;/h1&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;picture&gt;
    &lt;source media=&#34;(prefers-color-scheme: dark)&#34; srcset=&#34;https://raw.githubusercontent.com/vllm-project/vllm/main/docs/assets/logos/vllm-logo-text-dark.png&#34;&gt;
    &lt;img alt=&#34;vLLM&#34; src=&#34;https://raw.githubusercontent.com/vllm-project/vllm/main/docs/assets/logos/vllm-logo-text-light.png&#34; width=55%&gt;
  &lt;/picture&gt;
&lt;/p&gt;
&lt;h3 align=&#34;center&#34;&gt;
Easy, fast, and cheap LLM serving for everyone
&lt;/h3&gt;
&lt;p align=&#34;center&#34;&gt;
| &lt;a href=&#34;https://docs.vllm.ai&#34;&gt;&lt;b&gt;Documentation&lt;/b&gt;&lt;/a&gt; | &lt;a href=&#34;https://blog.vllm.ai/&#34;&gt;&lt;b&gt;Blog&lt;/b&gt;&lt;/a&gt; | &lt;a href=&#34;https://arxiv.org/abs/2309.06180&#34;&gt;&lt;b&gt;Paper&lt;/b&gt;&lt;/a&gt; | &lt;a href=&#34;https://x.com/vllm_project&#34;&gt;&lt;b&gt;Twitter/X&lt;/b&gt;&lt;/a&gt; | &lt;a href=&#34;https://discuss.vllm.ai&#34;&gt;&lt;b&gt;User Forum&lt;/b&gt;&lt;/a&gt; | &lt;a href=&#34;https://slack.vllm.ai&#34;&gt;&lt;b&gt;Developer Slack&lt;/b&gt;&lt;/a&gt; |
&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;em&gt;Latest News&lt;/em&gt; 🔥&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;[2025/05] We hosted the &lt;a class=&#34;link&#34; href=&#34;https://lu.ma/c1rqyf1f&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;NYC vLLM Meetup&lt;/a&gt;! Please find the meetup slides &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1_q_aW_ioMJWUImf1s1YM-ZhjXz8cUeL0IJvaquOYBeA/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2025/05] vLLM is now a hosted project under PyTorch Foundation! Please find the announcement &lt;a class=&#34;link&#34; href=&#34;https://pytorch.org/blog/pytorch-foundation-welcomes-vllm/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2025/04] We hosted the &lt;a class=&#34;link&#34; href=&#34;https://www.sginnovate.com/event/limited-availability-morning-evening-slots-remaining-inaugural-vllm-asia-developer-day&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Asia Developer Day&lt;/a&gt;! Please find the meetup slides from the vLLM team &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/19cp6Qu8u48ihB91A064XfaXruNYiBOUKrBxAmDOllOo/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2025/01] We are excited to announce the alpha release of vLLM V1: A major architectural upgrade with 1.7x speedup! Clean code, optimized execution loop, zero-overhead prefix caching, enhanced multimodal support, and more. Please check out our blog post &lt;a class=&#34;link&#34; href=&#34;https://blog.vllm.ai/2025/01/27/v1-alpha-release.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;details&gt;
&lt;summary&gt;Previous News&lt;/summary&gt;
&lt;ul&gt;
&lt;li&gt;[2025/03] We hosted &lt;a class=&#34;link&#34; href=&#34;https://lu.ma/vllm-ollama&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;vLLM x Ollama Inference Night&lt;/a&gt;! Please find the meetup slides from the vLLM team &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/16T2PDD1YwRnZ4Tu8Q5r6n53c5Lr5c73UV9Vd2_eBo4U/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2025/03] We hosted &lt;a class=&#34;link&#34; href=&#34;https://mp.weixin.qq.com/s/n77GibL2corAtQHtVEAzfg&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;the first vLLM China Meetup&lt;/a&gt;! Please find the meetup slides from the vLLM team &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1REHvfQMKGnvz6p3Fd23HhSO4c8j5WPGZV0bKYLwnHyQ/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2025/03] We hosted &lt;a class=&#34;link&#34; href=&#34;https://lu.ma/7mu4k4xx&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;the East Coast vLLM Meetup&lt;/a&gt;! Please find the meetup slides &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1NHiv8EUFF1NLd3fEYODm56nDmL26lEeXCaDgyDlTsRs/edit#slide=id.g31441846c39_0_0&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2025/02] We hosted &lt;a class=&#34;link&#34; href=&#34;https://lu.ma/h7g3kuj9&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;the ninth vLLM meetup&lt;/a&gt; with Meta! Please find the meetup slides from the vLLM team &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1jzC_PZVXrVNSFVCW-V4cFXb6pn7zZ2CyP_Flwo05aqg/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt; and from AMD &lt;a class=&#34;link&#34; href=&#34;https://drive.google.com/file/d/1Zk5qEJIkTmlQ2eQcXQZlljAx3m9s7nwn/view?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;. The slides from Meta will not be posted.&lt;/li&gt;
&lt;li&gt;[2025/01] We hosted &lt;a class=&#34;link&#34; href=&#34;https://lu.ma/zep56hui&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;the eighth vLLM meetup&lt;/a&gt; with Google Cloud! Please find the meetup slides from the vLLM team &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1epVkt4Zu8Jz_S5OhEHPc798emsYh2BwYfRuDDVEF7u4/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;, and from the Google Cloud team &lt;a class=&#34;link&#34; href=&#34;https://drive.google.com/file/d/1h24pHewANyRL11xy5dXUbvRC9F9Kkjix/view?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2024/12] vLLM joins the &lt;a class=&#34;link&#34; href=&#34;https://pytorch.org/blog/vllm-joins-pytorch&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;PyTorch ecosystem&lt;/a&gt;! Easy, Fast, and Cheap LLM Serving for Everyone!&lt;/li&gt;
&lt;li&gt;[2024/11] We hosted &lt;a class=&#34;link&#34; href=&#34;https://lu.ma/h0qvrajz&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;the seventh vLLM meetup&lt;/a&gt; with Snowflake! Please find the meetup slides from the vLLM team &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1e3CxQBV3JsfGp30SwyvS3eM_tW-ghOhJ9PAJGK6KR54/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;, and from the Snowflake team &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1qF3RkDAbOULwz9WK5TOltt2fE9t6uIc_hVNLFAaQX6A/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2024/10] We have just created a developer Slack (&lt;a class=&#34;link&#34; href=&#34;https://slack.vllm.ai&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;slack.vllm.ai&lt;/a&gt;) focused on coordinating contributions and discussing features. Please feel free to join us there!&lt;/li&gt;
&lt;li&gt;[2024/10] Ray Summit 2024 held a special track for vLLM! Please find the opening talk slides from the vLLM team &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1B_KQxpHBTRa_mDF-tR6i8rWdOU5QoTZNcEg2MKZxEHM/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;. Learn more from the &lt;a class=&#34;link&#34; href=&#34;https://www.youtube.com/playlist?list=PLzTswPQNepXl6AQwifuwUImLPFRVpksjR&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;talks&lt;/a&gt; from other vLLM contributors and users!&lt;/li&gt;
&lt;li&gt;[2024/09] We hosted &lt;a class=&#34;link&#34; href=&#34;https://lu.ma/87q3nvnh&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;the sixth vLLM meetup&lt;/a&gt; with NVIDIA! Please find the meetup slides &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1wrLGwytQfaOTd5wCGSPNhoaW3nq0E-9wqyP7ny93xRs/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2024/07] We hosted &lt;a class=&#34;link&#34; href=&#34;https://lu.ma/lp0gyjqr&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;the fifth vLLM meetup&lt;/a&gt; with AWS! Please find the meetup slides &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1RgUD8aCfcHocghoP3zmXzck9vX3RCI9yfUAB2Bbcl4Y/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2024/07] In partnership with Meta, vLLM officially supports Llama 3.1 with FP8 quantization and pipeline parallelism! Please check out our blog post &lt;a class=&#34;link&#34; href=&#34;https://blog.vllm.ai/2024/07/23/llama31.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2024/06] We hosted &lt;a class=&#34;link&#34; href=&#34;https://lu.ma/agivllm&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;the fourth vLLM meetup&lt;/a&gt; with Cloudflare and BentoML! Please find the meetup slides &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1iJ8o7V2bQEi0BFEljLTwc5G1S10_Rhv3beed5oB0NJ4/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2024/04] We hosted &lt;a class=&#34;link&#34; href=&#34;https://robloxandvllmmeetup2024.splashthat.com/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;the third vLLM meetup&lt;/a&gt; with Roblox! Please find the meetup slides &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1A--47JAK4BJ39t954HyTkvtfwn0fkqtsL8NGFuslReM/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2024/01] We hosted &lt;a class=&#34;link&#34; href=&#34;https://lu.ma/ygxbpzhl&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;the second vLLM meetup&lt;/a&gt; with IBM! Please find the meetup slides &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/12mI2sKABnUw5RBWXDYY-HtHth4iMSNcEoQ10jDQbxgA/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2023/10] We hosted &lt;a class=&#34;link&#34; href=&#34;https://lu.ma/first-vllm-meetup&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;the first vLLM meetup&lt;/a&gt; with a16z! Please find the meetup slides &lt;a class=&#34;link&#34; href=&#34;https://docs.google.com/presentation/d/1QL-XPFXiFpDBh86DbEegFXBXFXjix4v032GhShbKf3s/edit?usp=sharing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;[2023/08] We would like to express our sincere gratitude to &lt;a class=&#34;link&#34; href=&#34;https://a16z.com/2023/08/30/supporting-the-open-source-ai-community/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Andreessen Horowitz&lt;/a&gt; (a16z) for providing a generous grant to support the open-source development and research of vLLM.&lt;/li&gt;
&lt;li&gt;[2023/06] We officially released vLLM! FastChat-vLLM integration has powered &lt;a class=&#34;link&#34; href=&#34;https://chat.lmsys.org&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;LMSYS Vicuna and Chatbot Arena&lt;/a&gt; since mid-April. Check out our &lt;a class=&#34;link&#34; href=&#34;https://vllm.ai&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;blog post&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/details&gt;
&lt;hr&gt;
&lt;h2 id=&#34;about&#34;&gt;About
&lt;/h2&gt;&lt;p&gt;vLLM is a fast and easy-to-use library for LLM inference and serving.&lt;/p&gt;
&lt;p&gt;Originally developed in the &lt;a class=&#34;link&#34; href=&#34;https://sky.cs.berkeley.edu&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Sky Computing Lab&lt;/a&gt; at UC Berkeley, vLLM has evolved into a community-driven project with contributions from both academia and industry.&lt;/p&gt;
&lt;p&gt;vLLM is fast with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;State-of-the-art serving throughput&lt;/li&gt;
&lt;li&gt;Efficient management of attention key and value memory with &lt;a class=&#34;link&#34; href=&#34;https://blog.vllm.ai/2023/06/20/vllm.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;strong&gt;PagedAttention&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Continuous batching of incoming requests&lt;/li&gt;
&lt;li&gt;Fast model execution with CUDA/HIP graphs&lt;/li&gt;
&lt;li&gt;Quantizations: &lt;a class=&#34;link&#34; href=&#34;https://arxiv.org/abs/2210.17323&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GPTQ&lt;/a&gt;, &lt;a class=&#34;link&#34; href=&#34;https://arxiv.org/abs/2306.00978&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;AWQ&lt;/a&gt;, &lt;a class=&#34;link&#34; href=&#34;https://arxiv.org/abs/2309.05516&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;AutoRound&lt;/a&gt;, INT4, INT8, and FP8 (see the loading sketch after this list)&lt;/li&gt;
&lt;li&gt;Optimized CUDA kernels, including integration with FlashAttention and FlashInfer&lt;/li&gt;
&lt;li&gt;Speculative decoding&lt;/li&gt;
&lt;li&gt;Chunked prefill&lt;/li&gt;
&lt;/ul&gt;
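&lt;p&gt;As a minimal sketch of how the quantization options above are selected at load time (the AWQ checkpoint name and sampling settings below are illustrative, not taken from the vLLM docs):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;# Sketch: load a quantized checkpoint with the vLLM offline API.
# Assumes an AWQ-quantized model on the Hugging Face Hub; the name
# below is illustrative.
from vllm import LLM, SamplingParams

llm = LLM(
    model=&#34;TheBloke/Llama-2-7B-Chat-AWQ&#34;,  # illustrative AWQ checkpoint
    quantization=&#34;awq&#34;,                    # or &#34;gptq&#34;, &#34;fp8&#34;, ...
)
params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate([&#34;Quantization reduces memory because&#34;], params)
print(outputs[0].outputs[0].text)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Recent vLLM versions can usually detect the method from the checkpoint config, so the explicit &lt;code&gt;quantization&lt;/code&gt; argument is often optional.&lt;/p&gt;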
&lt;p&gt;vLLM is flexible and easy to use with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Seamless integration with popular Hugging Face models&lt;/li&gt;
&lt;li&gt;High-throughput serving with various decoding algorithms, including &lt;em&gt;parallel sampling&lt;/em&gt;, &lt;em&gt;beam search&lt;/em&gt;, and more&lt;/li&gt;
&lt;li&gt;Tensor, pipeline, data, and expert parallelism support for distributed inference&lt;/li&gt;
&lt;li&gt;Streaming outputs&lt;/li&gt;
&lt;li&gt;OpenAI-compatible API server (see the client sketch after this list)&lt;/li&gt;
&lt;li&gt;Support for NVIDIA GPUs, AMD CPUs and GPUs, Intel CPUs and GPUs, PowerPC CPUs, TPUs, and AWS Neuron&lt;/li&gt;
&lt;li&gt;Prefix caching support&lt;/li&gt;
&lt;li&gt;Multi-LoRA support&lt;/li&gt;
&lt;/ul&gt;
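&lt;p&gt;A minimal client sketch for the OpenAI-compatible server mentioned above (model name and port are illustrative; it assumes a server started with &lt;code&gt;vllm serve facebook/opt-125m&lt;/code&gt; and the official &lt;code&gt;openai&lt;/code&gt; Python client installed):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;# Sketch: query a running vLLM OpenAI-compatible server.
# Start one first, e.g.:  vllm serve facebook/opt-125m
from openai import OpenAI

# vLLM listens on port 8000 by default; the API key is unused locally.
client = OpenAI(base_url=&#34;http://localhost:8000/v1&#34;, api_key=&#34;EMPTY&#34;)
resp = client.completions.create(
    model=&#34;facebook/opt-125m&#34;,
    prompt=&#34;vLLM is&#34;,
    max_tokens=32,
)
print(resp.choices[0].text)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;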
&lt;p&gt;vLLM seamlessly supports most popular open-source models on Hugging Face, including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Transformer-like LLMs (e.g., Llama)&lt;/li&gt;
&lt;li&gt;Mixture-of-Experts LLMs (e.g., Mixtral, DeepSeek-V2 and V3)&lt;/li&gt;
&lt;li&gt;Embedding models (e.g., E5-Mistral)&lt;/li&gt;
&lt;li&gt;Multi-modal LLMs (e.g., LLaVA)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Find the full list of supported models &lt;a class=&#34;link&#34; href=&#34;https://docs.vllm.ai/en/latest/models/supported_models.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;getting-started&#34;&gt;Getting Started
&lt;/h2&gt;&lt;p&gt;Install vLLM with &lt;code&gt;pip&lt;/code&gt; or &lt;a class=&#34;link&#34; href=&#34;https://docs.vllm.ai/en/latest/getting_started/installation/gpu/index.html#build-wheel-from-source&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;from source&lt;/a&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;pip install vllm
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Visit our &lt;a class=&#34;link&#34; href=&#34;https://docs.vllm.ai/en/latest/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;documentation&lt;/a&gt; to learn more.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://docs.vllm.ai/en/latest/getting_started/installation.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Installation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://docs.vllm.ai/en/latest/getting_started/quickstart.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Quickstart&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://docs.vllm.ai/en/latest/models/supported_models.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;List of Supported Models&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
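&lt;p&gt;After installation, a minimal offline-inference sketch (the model choice is illustrative; the Quickstart linked above has the documented walkthrough):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;# Sketch: offline batch inference with the vLLM Python API.
from vllm import LLM, SamplingParams

prompts = [&#34;Hello, my name is&#34;, &#34;The capital of France is&#34;]
params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model=&#34;facebook/opt-125m&#34;)  # small illustrative model
# (tensor_parallel_size=2 would shard the model across two GPUs)
for out in llm.generate(prompts, params):
    print(out.prompt, out.outputs[0].text)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;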
&lt;h2 id=&#34;contributing&#34;&gt;Contributing
&lt;/h2&gt;&lt;p&gt;We welcome and value any contributions and collaborations.
Please check out &lt;a class=&#34;link&#34; href=&#34;https://docs.vllm.ai/en/latest/contributing/index.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Contributing to vLLM&lt;/a&gt; for how to get involved.&lt;/p&gt;
&lt;h2 id=&#34;sponsors&#34;&gt;Sponsors
&lt;/h2&gt;&lt;p&gt;vLLM is a community project. Our compute resources for development and testing are supported by the following organizations. Thank you for your support!&lt;/p&gt;
&lt;p&gt;Cash Donations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a16z&lt;/li&gt;
&lt;li&gt;Dropbox&lt;/li&gt;
&lt;li&gt;Sequoia Capital&lt;/li&gt;
&lt;li&gt;Skywork AI&lt;/li&gt;
&lt;li&gt;ZhenFund&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Compute Resources:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;AMD&lt;/li&gt;
&lt;li&gt;Anyscale&lt;/li&gt;
&lt;li&gt;AWS&lt;/li&gt;
&lt;li&gt;Crusoe Cloud&lt;/li&gt;
&lt;li&gt;Databricks&lt;/li&gt;
&lt;li&gt;DeepInfra&lt;/li&gt;
&lt;li&gt;Google Cloud&lt;/li&gt;
&lt;li&gt;Intel&lt;/li&gt;
&lt;li&gt;Lambda Lab&lt;/li&gt;
&lt;li&gt;Nebius&lt;/li&gt;
&lt;li&gt;Novita AI&lt;/li&gt;
&lt;li&gt;NVIDIA&lt;/li&gt;
&lt;li&gt;Replicate&lt;/li&gt;
&lt;li&gt;Roblox&lt;/li&gt;
&lt;li&gt;RunPod&lt;/li&gt;
&lt;li&gt;Trainy&lt;/li&gt;
&lt;li&gt;UC Berkeley&lt;/li&gt;
&lt;li&gt;UC San Diego&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Slack Sponsor: Anyscale&lt;/p&gt;
&lt;p&gt;We also have an official fundraising venue through &lt;a class=&#34;link&#34; href=&#34;https://opencollective.com/vllm&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;OpenCollective&lt;/a&gt;. We plan to use the fund to support the development, maintenance, and adoption of vLLM.&lt;/p&gt;
&lt;h2 id=&#34;citation&#34;&gt;Citation
&lt;/h2&gt;&lt;p&gt;If you use vLLM for your research, please cite our &lt;a class=&#34;link&#34; href=&#34;https://arxiv.org/abs/2309.06180&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;paper&lt;/a&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bibtex&#34; data-lang=&#34;bibtex&#34;&gt;@inproceedings{kwon2023efficient,
  title={Efficient Memory Management for Large Language Model Serving with PagedAttention},
  author={Woosuk Kwon and Zhuohan Li and Siyuan Zhuang and Ying Sheng and Lianmin Zheng and Cody Hao Yu and Joseph E. Gonzalez and Hao Zhang and Ion Stoica},
  booktitle={Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles},
  year={2023}
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;contact-us&#34;&gt;Contact Us
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;For technical questions and feature requests, please use GitHub &lt;a class=&#34;link&#34; href=&#34;https://github.com/vllm-project/vllm/issues&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Issues&lt;/a&gt; or &lt;a class=&#34;link&#34; href=&#34;https://github.com/vllm-project/vllm/discussions&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Discussions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;For discussions with fellow users, please use the &lt;a class=&#34;link&#34; href=&#34;https://discuss.vllm.ai&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;vLLM Forum&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;For coordinating contributions and development, please use &lt;a class=&#34;link&#34; href=&#34;https://slack.vllm.ai&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Slack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;For security disclosures, please use GitHub&amp;rsquo;s &lt;a class=&#34;link&#34; href=&#34;https://github.com/vllm-project/vllm/security/advisories&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Security Advisories&lt;/a&gt; feature&lt;/li&gt;
&lt;li&gt;For collaborations and partnerships, please contact us at &lt;a class=&#34;link&#34; href=&#34;mailto:vllm-questions@lists.berkeley.edu&#34; &gt;vllm-questions@lists.berkeley.edu&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;media-kit&#34;&gt;Media Kit
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;If you wish to use vLLM&amp;rsquo;s logo, please refer to &lt;a class=&#34;link&#34; href=&#34;https://github.com/vllm-project/media-kit&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;our media kit repo&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        </item>
        
    </channel>
</rss>
