Tutorial: https://www.youtube.com/watch?v=doR0wcZ_nOg
Install dependencies:
pip install ollama llama-index llama-index-embeddings-ollama transformers accelerate sentence_transformers -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
Method 1: use the model directly
Use the local model at /root/workspace/gemma-2-2b-it, i.e. the directory of files downloaded straight from Hugging Face.
Code:
def test_c():
    from sentence_transformers import SentenceTransformer

    # Load the locally downloaded model; trust_remote_code is required for
    # models that ship custom modeling code.
    model_path = "/root/workspace/gemma-2-2b-it"
    model = SentenceTransformer(model_path, trust_remote_code=True)
    model.max_seq_length = 1024

    queries = [
        "Hello World",
        "What is your name",
        "I'm fine, and you?",
        "It's OK",
    ]
    for q in queries:
        embeddings = model.encode(q)  # returns a numpy array
        print(embeddings)
        print(f"type(embeddings): {type(embeddings)}, shape: {embeddings.shape}")
Method 2: use the ollama service
Note ⚠️: an ollama server must be running first. Both methods below talk to the local ollama service, which listens on port 11434 by default; the address can be overridden with the OLLAMA_HOST environment variable. Looking at the ollama source file /usr/local/lib/python3.10/dist-packages/ollama/_client.py, the client uses httpx.Client as its HTTP library.
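Before the examples, a minimal sketch of pointing the client at an explicit server address instead of relying on OLLAMA_HOST (the host value here is just the default, shown for illustration):
def test_client_host():
    import ollama

    # Construct a client bound to an explicit server address.
    client = ollama.Client(host="http://localhost:11434")
    # List the models the server currently knows about.
    print(client.list())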
Code:
def test_a():
    import ollama

    # NOTE: ollama resolves models by name, so a filesystem path only works
    # if a model was registered with the server under this exact name
    # (e.g. via a Modelfile); otherwise pass the pulled model's name here.
    model_path = "/root/workspace/gemma-2-2b-it"
    x = ollama.embeddings(model=model_path, prompt="Hello World !")
    print(x)
def test_b():
    from llama_index.embeddings.ollama import OllamaEmbedding

    model_path = "/root/workspace/gemma-2-2b-it"
    # ollama_additional_kwargs passes extra options through to ollama;
    # mirostat=0 disables mirostat sampling.
    ollama_embedding = OllamaEmbedding(
        model_name=model_path,
        base_url="http://localhost:11434",
        ollama_additional_kwargs={"mirostat": 0},
    )
    print(ollama_embedding)
    query_embedding = ollama_embedding.get_query_embedding("Hello World !")
    print(query_embedding)
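As a follow-up sketch (same model-name assumption as test_b above), llama-index embedding classes also expose batch embedding, which avoids one round trip per text:
def test_b_batch():
    from llama_index.embeddings.ollama import OllamaEmbedding

    embed_model = OllamaEmbedding(
        model_name="/root/workspace/gemma-2-2b-it",
        base_url="http://localhost:11434",
    )
    # Embed several texts in one call; returns a list of vectors.
    vectors = embed_model.get_text_embedding_batch(
        ["Hello World", "What is your name"], show_progress=True
    )
    print(len(vectors), len(vectors[0]))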
Method 3: use HuggingFaceEmbedding
Source: https://docs.llamaindex.ai/en/stable/examples/embeddings/huggingface/
References:
- https://blog.csdn.net/qq_43814415/article/details/138403394 (building a local RAG system with langchain + qwen1.5-7b-chat)
- https://blog.csdn.net/qq_23953717/article/details/136553084 (RAG with llama-index calling the Qwen model)
Install libraries:
pip install llama-index-embeddings-huggingface -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
pip install llama-index-embeddings-instructor -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
pip install llama-index -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
3.1 HuggingFaceEmbedding
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
# loads BAAI/bge-small-en
# embed_model = HuggingFaceEmbedding()
# loads BAAI/bge-small-en-v1.5
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
embeddings = embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
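To actually use this model for indexing, a minimal sketch of wiring it into llama-index's global Settings (the documented entry point for swapping the default embedder):
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# All subsequent index construction picks up this embedder by default.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")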
3.2 InstructorEmbedding
from llama_index.embeddings.instructor import InstructorEmbedding
embed_model = InstructorEmbedding(model_name="hkunlp/instructor-base")
embeddings = embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
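Instructor models prepend different task instructions to queries and documents, so the same string can embed differently depending on the call. Continuing from the embed_model above, a small sketch:
# Query and document paths apply different instruction prefixes.
query_vec = embed_model.get_query_embedding("Hello World!")
text_vec = embed_model.get_text_embedding("Hello World!")
print(query_vec[:5])
print(text_vec[:5])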
3.3 OptimumEmbedding
from llama_index.embeddings.huggingface_optimum import OptimumEmbedding
OptimumEmbedding.create_and_save_optimum_model(
"BAAI/bge-small-en-v1.5", "./bge_onnx"
)
embed_model = OptimumEmbedding(folder_name="./bge_onnx")
embeddings = embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
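The ONNX export only needs to happen once; a hedged sketch of guarding the export so later runs reuse the saved ./bge_onnx folder:
import os
from llama_index.embeddings.huggingface_optimum import OptimumEmbedding

# Export on the first run only; afterwards load the saved ONNX folder directly.
if not os.path.exists("./bge_onnx"):
    OptimumEmbedding.create_and_save_optimum_model("BAAI/bge-small-en-v1.5", "./bge_onnx")
embed_model = OptimumEmbedding(folder_name="./bge_onnx")
print(len(embed_model.get_text_embedding("Hello World!")))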