Skip to content

Conversation

@YassinNouh21
Copy link
Contributor

Add comprehensive PGVector tutorial with PostgreSQL integration

This commit introduces a comprehensive tutorial demonstrating the use of PostgreSQL with the pgvector extension as a vector database backend for Feast. It includes Docker setup instructions, feature definitions, sample data generation, and vector similarity search functionality. Key files added are docker-compose.yml, example_repo.py, feature_store.yaml, pgvector_example.py, README.md, and an initialization SQL script for pgvector.

What this PR does / why we need it:

This PR adds a new tutorial example demonstrating how to use PostgreSQL with the pgvector extension as a vector database backend for Feast. This is valuable because:

  1. It provides a complete end-to-end example of integrating Feast with vector search capabilities
  2. It helps users understand how to set up and configure pgvector with Feast
  3. It demonstrates vector similarity search on embeddings, which is essential for RAG and semantic search applications
  4. It includes sample code for generating embeddings using sentence-transformers

The tutorial walks users through:

  • Setting up PostgreSQL with pgvector using Docker
  • Defining feature views with vector fields
  • Generating sample data with embeddings
  • Performing vector similarity search queries

Which issue(s) this PR fixes:

This PR helps address the need for more comprehensive documentation and examples around using vector databases with Feast, particularly for applications involving embeddings and semantic search.

Misc

This tutorial complements the existing vector store implementations by providing a practical example with PostgreSQL's pgvector extension, which is becoming increasingly popular for vector search applications due to its simplicity and integration with existing PostgreSQL databases.

The example is designed to be self-contained and easy to follow, requiring minimal setup beyond Docker and Python.

This commit introduces a comprehensive tutorial demonstrating the use of PostgreSQL with the pgvector extension as a vector database backend for Feast. It includes Docker setup instructions, feature definitions, sample data generation, and vector similarity search functionality. Key files added are `docker-compose.yml`, `example_repo.py`, `feature_store.yaml`, `pgvector_example.py`, `README.md`, and an initialization SQL script for pgvector.

Signed-off-by: Yassin Nouh <70436855+YassinNouh21@users.noreply.github.com>
@YassinNouh21 YassinNouh21 requested a review from a team as a code owner April 23, 2025 10:33
Signed-off-by: Yassin Nouh <70436855+YassinNouh21@users.noreply.github.com>
@YassinNouh21 YassinNouh21 force-pushed the pgvector-example-update branch from 819b6bb to 7192e07 Compare April 23, 2025 11:06
Signed-off-by: Yassin Nouh <70436855+YassinNouh21@users.noreply.github.com>

print("\nNote: Using simulated results for display purposes.")
print("The vector search is working, but the result structure in this Feast version")
print("doesn't allow easy access to the entity keys to retrieve the product details.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this!!

Maybe we should fix this first before merging this demo?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably fix it in a follow up though.

Copy link
Member

@franciscojavierarceo franciscojavierarceo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! Thank you for this!!

@franciscojavierarceo franciscojavierarceo merged commit bb1cbea into feast-dev:master Apr 23, 2025
23 checks passed
@YassinNouh21 YassinNouh21 deleted the pgvector-example-update branch April 23, 2025 12:03
franciscojavierarceo pushed a commit that referenced this pull request Apr 29, 2025
# [0.49.0](v0.48.0...v0.49.0) (2025-04-29)

### Bug Fixes

* Adding brackets to unit tests ([c46fea3](c46fea3))
* Adding logic back for a step ([2bb240b](2bb240b))
* Adjustment for unit test action ([a6f78ae](a6f78ae))
* Allow get_historical_features with only On Demand Feature View ([#5256](#5256)) ([0752795](0752795))
* CI adjustment ([3850643](3850643))
* Embed Query configuration breaks when switching between DataFrame and SQL ([#5257](#5257)) ([32375a5](32375a5))
* Fix for proto issue in utils ([1b291b2](1b291b2))
* Fix milvus online_read ([#5233](#5233)) ([4b91f26](4b91f26))
* Fix tests ([431d9b8](431d9b8))
* Fixed Permissions object parameter in example ([#5259](#5259)) ([045c100](045c100))
* Java CI [#12](#12) ([d7e44ac](d7e44ac))
* Java PR [#15](#15) ([a5da3bb](a5da3bb))
* Java PR [#16](#16) ([e0320fe](e0320fe))
* Java PR [#17](#17) ([49da810](49da810))
* Materialization logs ([#5243](#5243)) ([4aa2f49](4aa2f49))
* Moving to custom github action for checking skip tests ([caf312e](caf312e))
* Operator - remove default replicas setting from Feast Deployment ([#5294](#5294)) ([e416d01](e416d01))
* Patch java pr [#14](#14) ([592526c](592526c))
* Patch update for test ([a3e8967](a3e8967))
* Remove conditional from steps ([995307f](995307f))
* Remove misleading HTTP prefix from gRPC endpoints in logs and doc ([#5280](#5280)) ([0ee3a1e](0ee3a1e))
* removing id ([268ade2](268ade2))
* Renaming workflow file ([5f46279](5f46279))
* Resolve `no pq wrapper` import issue ([#5240](#5240)) ([d5906f1](d5906f1))
* Update actions to remove check skip tests ([#5275](#5275)) ([b976f27](b976f27))
* Update docling demo ([446efea](446efea))
* Update java pr [#13](#13) ([fda7db7](fda7db7))
* Update java_pr ([fa138f4](fa138f4))
* Update repo_config.py ([6a59815](6a59815))
* Update unit tests workflow ([06486a0](06486a0))
* Updated docs for docling demo ([768e6cc](768e6cc))
* Updating action for unit tests ([0996c28](0996c28))
* Updating github actions to filter at job level ([0a09622](0a09622))
* Updating Java CI ([c7c3a3c](c7c3a3c))
* Updating java pr to skip tests ([e997dd9](e997dd9))
* Updating workflows ([c66bcd2](c66bcd2))

### Features

* Add date_partition_column_format for spark source ([#5273](#5273)) ([7a61d6f](7a61d6f))
* Add Milvus tutorial with Feast integration ([#5292](#5292)) ([a1388a5](a1388a5))
* Add pgvector tutorial with PostgreSQL integration ([#5290](#5290)) ([bb1cbea](bb1cbea))
* Add ReactFlow visualization for Feast registry metadata ([#5297](#5297)) ([9768970](9768970))
* Add retrieve online documents v2 method into  pgvector  ([#5253](#5253)) ([6770ee6](6770ee6))
* Compute Engine Initial Implementation ([#5223](#5223)) ([64bdafd](64bdafd))
* Enable write node for compute engine ([#5287](#5287)) ([f9baf97](f9baf97))
* Local compute engine ([#5278](#5278)) ([8e06dfe](8e06dfe))
* Make transform on writes configurable for ingestion ([#5283](#5283)) ([ecad170](ecad170))
* Offline store update pull_all_from_table_or_query to make timestampfield optional ([#5281](#5281)) ([4b94608](4b94608))
* Serialization version 2 deprecation notice ([#5248](#5248)) ([327d99d](327d99d))
* Vector length definition moved to Feature View from Config  ([#5289](#5289)) ([d8f1c97](d8f1c97))
jfw-ppi pushed a commit to jfw-ppi/feast that referenced this pull request Jun 7, 2025
* feat: Add pgvector tutorial with PostgreSQL integration

This commit introduces a comprehensive tutorial demonstrating the use of PostgreSQL with the pgvector extension as a vector database backend for Feast. It includes Docker setup instructions, feature definitions, sample data generation, and vector similarity search functionality. Key files added are `docker-compose.yml`, `example_repo.py`, `feature_store.yaml`, `pgvector_example.py`, `README.md`, and an initialization SQL script for pgvector.

Signed-off-by: Yassin Nouh <70436855+YassinNouh21@users.noreply.github.com>

* chore: Remove example_repo.py from pgvector tutorial

Signed-off-by: Yassin Nouh <70436855+YassinNouh21@users.noreply.github.com>

* update the docs

Signed-off-by: Yassin Nouh <70436855+YassinNouh21@users.noreply.github.com>

---------

Signed-off-by: Yassin Nouh <70436855+YassinNouh21@users.noreply.github.com>
Signed-off-by: Jacob Weinhold <29459386+j-wine@users.noreply.github.com>
jfw-ppi pushed a commit to jfw-ppi/feast that referenced this pull request Jun 7, 2025
# [0.49.0](feast-dev/feast@v0.48.0...v0.49.0) (2025-04-29)

### Bug Fixes

* Adding brackets to unit tests ([c46fea3](feast-dev@c46fea3))
* Adding logic back for a step ([2bb240b](feast-dev@2bb240b))
* Adjustment for unit test action ([a6f78ae](feast-dev@a6f78ae))
* Allow get_historical_features with only On Demand Feature View ([feast-dev#5256](feast-dev#5256)) ([0752795](feast-dev@0752795))
* CI adjustment ([3850643](feast-dev@3850643))
* Embed Query configuration breaks when switching between DataFrame and SQL ([feast-dev#5257](feast-dev#5257)) ([32375a5](feast-dev@32375a5))
* Fix for proto issue in utils ([1b291b2](feast-dev@1b291b2))
* Fix milvus online_read ([feast-dev#5233](feast-dev#5233)) ([4b91f26](feast-dev@4b91f26))
* Fix tests ([431d9b8](feast-dev@431d9b8))
* Fixed Permissions object parameter in example ([feast-dev#5259](feast-dev#5259)) ([045c100](feast-dev@045c100))
* Java CI [feast-dev#12](feast-dev#12) ([d7e44ac](feast-dev@d7e44ac))
* Java PR [feast-dev#15](feast-dev#15) ([a5da3bb](feast-dev@a5da3bb))
* Java PR [feast-dev#16](feast-dev#16) ([e0320fe](feast-dev@e0320fe))
* Java PR [feast-dev#17](feast-dev#17) ([49da810](feast-dev@49da810))
* Materialization logs ([feast-dev#5243](feast-dev#5243)) ([4aa2f49](feast-dev@4aa2f49))
* Moving to custom github action for checking skip tests ([caf312e](feast-dev@caf312e))
* Operator - remove default replicas setting from Feast Deployment ([feast-dev#5294](feast-dev#5294)) ([e416d01](feast-dev@e416d01))
* Patch java pr [feast-dev#14](feast-dev#14) ([592526c](feast-dev@592526c))
* Patch update for test ([a3e8967](feast-dev@a3e8967))
* Remove conditional from steps ([995307f](feast-dev@995307f))
* Remove misleading HTTP prefix from gRPC endpoints in logs and doc ([feast-dev#5280](feast-dev#5280)) ([0ee3a1e](feast-dev@0ee3a1e))
* removing id ([268ade2](feast-dev@268ade2))
* Renaming workflow file ([5f46279](feast-dev@5f46279))
* Resolve `no pq wrapper` import issue ([feast-dev#5240](feast-dev#5240)) ([d5906f1](feast-dev@d5906f1))
* Update actions to remove check skip tests ([feast-dev#5275](feast-dev#5275)) ([b976f27](feast-dev@b976f27))
* Update docling demo ([446efea](feast-dev@446efea))
* Update java pr [feast-dev#13](feast-dev#13) ([fda7db7](feast-dev@fda7db7))
* Update java_pr ([fa138f4](feast-dev@fa138f4))
* Update repo_config.py ([6a59815](feast-dev@6a59815))
* Update unit tests workflow ([06486a0](feast-dev@06486a0))
* Updated docs for docling demo ([768e6cc](feast-dev@768e6cc))
* Updating action for unit tests ([0996c28](feast-dev@0996c28))
* Updating github actions to filter at job level ([0a09622](feast-dev@0a09622))
* Updating Java CI ([c7c3a3c](feast-dev@c7c3a3c))
* Updating java pr to skip tests ([e997dd9](feast-dev@e997dd9))
* Updating workflows ([c66bcd2](feast-dev@c66bcd2))

### Features

* Add date_partition_column_format for spark source ([feast-dev#5273](feast-dev#5273)) ([7a61d6f](feast-dev@7a61d6f))
* Add Milvus tutorial with Feast integration ([feast-dev#5292](feast-dev#5292)) ([a1388a5](feast-dev@a1388a5))
* Add pgvector tutorial with PostgreSQL integration ([feast-dev#5290](feast-dev#5290)) ([bb1cbea](feast-dev@bb1cbea))
* Add ReactFlow visualization for Feast registry metadata ([feast-dev#5297](feast-dev#5297)) ([9768970](feast-dev@9768970))
* Add retrieve online documents v2 method into  pgvector  ([feast-dev#5253](feast-dev#5253)) ([6770ee6](feast-dev@6770ee6))
* Compute Engine Initial Implementation ([feast-dev#5223](feast-dev#5223)) ([64bdafd](feast-dev@64bdafd))
* Enable write node for compute engine ([feast-dev#5287](feast-dev#5287)) ([f9baf97](feast-dev@f9baf97))
* Local compute engine ([feast-dev#5278](feast-dev#5278)) ([8e06dfe](feast-dev@8e06dfe))
* Make transform on writes configurable for ingestion ([feast-dev#5283](feast-dev#5283)) ([ecad170](feast-dev@ecad170))
* Offline store update pull_all_from_table_or_query to make timestampfield optional ([feast-dev#5281](feast-dev#5281)) ([4b94608](feast-dev@4b94608))
* Serialization version 2 deprecation notice ([feast-dev#5248](feast-dev#5248)) ([327d99d](feast-dev@327d99d))
* Vector length definition moved to Feature View from Config  ([feast-dev#5289](feast-dev#5289)) ([d8f1c97](feast-dev@d8f1c97))

Signed-off-by: Jacob Weinhold <29459386+j-wine@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants