<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Compiling llama-cpp-cuda - Code Tip</title>
<meta name="description" content="Title: Compiling llama-cpp-cuda; Date: 2025-07-06; Author: Marsyas">
<meta name="author" content="Marsyas">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<!-- Le HTML5 shim, for IE6-8 support of HTML elements -->
<!--[if lt IE 9]>
<script src="/theme/html5.js"></script>
<![endif]-->
<link href="/theme/css/ipython.css" rel="stylesheet">
<link href="//maxcdn.bootstrapcdn.com/bootstrap/3.2.0/css/bootstrap.min.css" rel="stylesheet">
<link href="//maxcdn.bootstrapcdn.com/font-awesome/4.1.0/css/font-awesome.min.css" rel="stylesheet">
<link href="//maxcdn.bootstrapcdn.com/bootswatch/3.2.0/simplex/bootstrap.min.css" rel="stylesheet">
<link href="/theme/css/local.css" rel="stylesheet">
<link href="/theme/css/pygments.css" rel="stylesheet">
</head>
<body>
<div class="container">
<div class="page-header">
<h1><a href="/">Code Tip</a></h1>
<br> </div>
<div class="row">
<div class="col-md-8 col-md-offset-2">
<div class="article" itemscope itemtype="http://schema.org/BlogPosting">
<div class="text-center article-header">
<h1 itemprop="name headline" class="article-title">Compiling llama-cpp-cuda</h1>
<span itemprop="author" itemscope itemtype="http://schema.org/Person">
<h4 itemprop="name">Marsyas</h4>
</span>
<time datetime="2025-07-06T20:00:00+08:00" itemprop="datePublished">Sun 06 July 2025</time>
</div>
<div>
Category:
<span itemprop="articleSection">
<a href="/category/ji-zhu.html" rel="category">Technology</a>
</span>
</div>
<div>
Tags:
<span itemprop="keywords">
<a href="/tag/llama-cpp-python.html" rel="tag">llama-cpp-python</a>
</span>
<span itemprop="keywords">
<a href="/tag/python.html" rel="tag">python</a>
</span>
<span itemprop="keywords">
<a href="/tag/llamacpp.html" rel="tag">llama.cpp</a>
</span>
</div>
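<div itemprop="articleBody" class="article-body"><p>llama-cpp-python 0.3.4 (the latest official prebuilt CUDA release for Windows at the time of writing) does not support the qwen3 and gemma3 architectures, so the only option is to build the latest llama-cpp-python with CUDA yourself.
During the build I hit the error “No CUDA toolset found.” Asking Copilot did not help; I found the fix <a href="https://github.com/abetlen/llama-cpp-python/issues/247#issuecomment-1848769561">here</a>. The error typically means that the CUDA integration files for MSBuild are missing from the Build Tools installation, which is what step 2 repairs.
The build steps are summarized below:<br>
1. Install Build Tools for Visual Studio 2022<br>
2. Copy all files from <code>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\extras\visual_studio_integration\MSBuildExtensions</code> into <code>C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\MSBuild\Microsoft\VC\v170\BuildCustomizations</code> (adjust the <code>v12.8</code> directory to match your installed CUDA version)<br>
3. Install llama-cpp-python from source with CUDA enabled (PowerShell):</p>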
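<div class="highlight"><pre><span></span><code># PowerShell: ask the CMake build to enable the CUDA backend
$env:CMAKE_ARGS="-DGGML_CUDA=on"
# Build from source; --no-cache-dir keeps pip from reusing a previously cached wheel
pip install llama-cpp-python --no-cache-dir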
</code></pre></div></div>
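<p>Step 2 above can also be scripted instead of copying the files by hand. The following is only a minimal sketch: the helper name <code>copy_cuda_msbuild_extensions</code> is hypothetical, and the two default paths are simply the ones quoted in step 2, so they assume CUDA v12.8 and the Visual Studio 2022 Build Tools in their default locations.</p>

```python
import shutil
from pathlib import Path

# Assumed default install locations, taken from the paths in step 2.
# Adjust the CUDA version and Visual Studio edition to match your machine.
CUDA_EXTENSIONS = Path(
    r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8"
    r"\extras\visual_studio_integration\MSBuildExtensions"
)
VS_CUSTOMIZATIONS = Path(
    r"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools"
    r"\MSBuild\Microsoft\VC\v170\BuildCustomizations"
)


def copy_cuda_msbuild_extensions(src: Path, dst: Path) -> list[str]:
    """Copy every file in src into dst and return the sorted names copied."""
    if not src.is_dir():
        raise FileNotFoundError(f"CUDA MSBuild extensions not found: {src}")
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(src.iterdir()):
        # Only plain files are copied; subdirectories are skipped.
        if item.is_file():
            shutil.copy2(item, dst / item.name)
            copied.append(item.name)
    return copied


if __name__ == "__main__":
    # Writing under "Program Files (x86)" usually needs an elevated prompt.
    for name in copy_cuda_msbuild_extensions(CUDA_EXTENSIONS, VS_CUSTOMIZATIONS):
        print("copied", name)
```

<p>Run the script from an elevated (administrator) prompt before step 3. Any existing files with the same names in the destination directory are overwritten.</p>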
<hr>
<h2>Comments</h2>
</div>
</div>
</div> <!-- <hr> -->
</div> <!-- /container -->
<footer class="aw-footer bg-danger">
<div class="container"> <!-- footer -->
<div class="row">
<div class="col-md-10 col-md-offset-1">
<div class="row">
<div class="col-md-3">
<h4>Navigation</h4>
<ul class="list-unstyled my-list-style">
<li><a href="">Code Tip</a></li>
</ul>
</div>
<div class="col-md-3">
<h4>Author</h4>
<ul class="list-unstyled my-list-style">
<li><a href="mailto:llxxyy217@gmail.com">Email</a></li>
</ul>
</div>
<div class="col-md-3">
<h4>Categories</h4>
<ul class="list-unstyled my-list-style">
<li><a href="/category/技术.html">Technology (8)</a></li>
</ul>
</div>
<div class="col-md-3">
<h4>Links</h4>
<ul class="list-unstyled my-list-style">
<li><a href="https://github.com/fispurring/">Github</a></li>
</ul>
</div>
</div>
</div>
</div>
</div>
</footer>
<div class="container">
<div class="row">
<div class="col-md-12 text-center center-block aw-bottom">
<p>© Marsyas 2016</p>
<p>Powered by Pelican</p>
</div>
</div>
</div>
<!-- JavaScript -->
<script src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="//maxcdn.bootstrapcdn.com/bootstrap/3.2.0/js/bootstrap.min.js"></script>
<script type="text/javascript">
jQuery(document).ready(function($) {
$("div.collapseheader").click(function () {
$header = $(this).children("span").first();
$codearea = $(this).children(".input_area");
$codearea.slideToggle(500, function () {
$header.text(function () {
return $codearea.is(":visible") ? "Collapse Code" : "Expand Code";
});
});
});
});
</script>
</body>
</html>