Source: https://github.com/pgvector/pgvector-python?tab=readme-ov-file#sqlalchemy
1. Install the package
Repository: https://github.com/pgvector/pgvector-python
pip install pgvector
2. Create the table
- Enable the extension
session.execute(text('CREATE EXTENSION IF NOT EXISTS vector'))
- Add a vector column
from pgvector.sqlalchemy import Vector
# Item is a declarative model; Base is the application's declarative base class
class Item(Base):
    embedding = mapped_column(Vector(3))
3. Data operations
- Insert a vector
item = Item(embedding=[1, 2, 3])
session.add(item)
session.commit()
- Get the nearest neighbors to a vector
session.scalars(select(Item).order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5))
Supported distance functions: l2_distance (Euclidean distance), max_inner_product (inner product), cosine_distance (cosine distance).
- Get the distance
session.scalars(select(Item.embedding.l2_distance([3, 1, 2])))
- Get items within a certain distance
session.scalars(select(Item).filter(Item.embedding.l2_distance([3, 1, 2]) < 5))
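To make the three distance functions concrete, here is a minimal plain-Python sketch of what each one computes for a pair of vectors. The helper names are hypothetical and only illustrate the math; pgvector evaluates these inside PostgreSQL. Note that max_inner_product maps to pgvector's `<#>` operator, which returns the negative inner product, so ascending order puts the largest inner product first.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def negative_inner_product(a, b):
    """Negative inner product: a smaller value means a larger inner product."""
    return -sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity; close to 0 for vectors pointing the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

stored = [1, 2, 3]   # the vector inserted earlier
query = [3, 1, 2]    # the query vector used in the examples
print(l2_distance(stored, query))
print(negative_inner_product(stored, query))
print(cosine_distance(stored, query))
```

For the sample row [1, 2, 3] and the query [3, 1, 2], the L2 distance is sqrt(6), roughly 2.45, so that row would pass the `< 5` filter shown above.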
- Average vectors
from sqlalchemy.sql import func
session.scalars(select(func.avg(Item.embedding))).first()
Supported aggregates: avg and sum.
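As a rough illustration, averaging and summing vectors work element-wise across rows. The sketch below mimics that in plain Python; the helper names `vector_sum` and `vector_avg` are hypothetical and not part of pgvector.

```python
def vector_sum(vectors):
    """Element-wise sum of equal-length vectors."""
    return [sum(components) for components in zip(*vectors)]

def vector_avg(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [total / n for total in vector_sum(vectors)]

rows = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
print(vector_sum(rows))  # [6, 6, 6]
print(vector_avg(rows))  # [2.0, 2.0, 2.0]
```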
4. Create an index
- Add an approximate index
index = Index(
    'my_index',
    Item.embedding,
    postgresql_using='hnsw',
    postgresql_with={'m': 16, 'ef_construction': 64},
    postgresql_ops={'embedding': 'vector_l2_ops'}
)
# or
index = Index(
    'my_index',
    Item.embedding,
    postgresql_using='ivfflat',
    postgresql_with={'lists': 100},
    postgresql_ops={'embedding': 'vector_l2_ops'}
)
index.create(engine)
Use vector_l2_ops for L2 distance, vector_ip_ops for inner product, and vector_cosine_ops for cosine distance.
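An index only helps queries that order by the operator its operator class covers, so the class passed in `postgresql_ops` should match the distance function used in queries. The lookup table below is a small sketch of that pairing; the helper `opclass_for` is hypothetical and exists only for illustration.

```python
# Maps each pgvector-python distance method to the PostgreSQL operator it
# uses and to the operator class an index needs in order to serve it.
DISTANCE_TO_INDEX_OPS = {
    'l2_distance': ('<->', 'vector_l2_ops'),
    'max_inner_product': ('<#>', 'vector_ip_ops'),
    'cosine_distance': ('<=>', 'vector_cosine_ops'),
}

def opclass_for(distance_method):
    """Return the operator class matching a distance method name."""
    try:
        return DISTANCE_TO_INDEX_OPS[distance_method][1]
    except KeyError:
        raise ValueError(f'unknown distance method: {distance_method}') from None

print(opclass_for('cosine_distance'))  # vector_cosine_ops
```

For example, if queries order by `Item.embedding.cosine_distance(...)`, the index would be declared with `postgresql_ops={'embedding': opclass_for('cosine_distance')}`.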