49 questions
Advice
0
votes
0
replies
44
views
What is the best text embedding model for ecommerce product search (short, noisy user queries)?
I am integrating a vector-based semantic search system into a large ecommerce platform's product search, and I want to select the right text embedding model.
Use Case
User queries are often:
Very ...
1
vote
1
answer
208
views
Hybrid search on Postgres with pgvector using vecs
I have an instance of Postgres with the pgvector extension enabled. I want to know if I can easily perform hybrid search on my data using both a vector similarity search as well as keyword matching. ...
0
votes
1
answer
125
views
Spring Boot 4 + Postgres EXTENSION vector not working correctly
I'm trying to implement the use of vectors in my application.
I raised the version of Spring Boot to 4.0.0-M1. This also gives me Hibernate 7.
Added EXTENSION vector to Postgres
Added dependency:
...
1
vote
0
answers
80
views
Using scoring profile with semantic ranker seems to be bugged when using filter
https://learn.microsoft.com/en-us/azure/search/semantic-how-to-enable-scoring-profiles
I've recently tried a new preview feature of Azure Search, where I can use a Scoring Profile with Semantic Ranker at ...
1
vote
2
answers
105
views
Vespa rank function with multiple operands
I am evaluating Vespa for our search use case. I want to understand whether I am using the rank function correctly and whether this is the right way to use it.
"default-index": "all_text",
...
1
vote
1
answer
47
views
How to send an empty string as parameter in MediaWiki's Semantic Search?
NOTE: This is an extension of my question.
The summary is as follows:
I am using Semantic Search on the Yu-Gi-Oh! MediaWiki to:
Get all Dinosaur monster cards available.
The query results must match ...
0
votes
2
answers
85
views
How to set the conditionals - using MediaWiki's Semantic Search?
In the Yu-Gi-Oh! Wikia you can use a tool called Semantic Search from MediaWiki to get certain results, i.e. cards.
I am fairly new to this tool.
The query I am working on is:
Get all Dinosaur monster ...
0
votes
0
answers
496
views
Qdrant 400 Bad Request Error When Inserting Multi-Vector Embeddings with Larger Batch Sizes via API
Context
I am working on a semantic search application and using Qdrant to store three types of embeddings per document:
Dense embeddings (from OpenAI)
Sparse embeddings (from Qdrant/BM25)
Rerank ...
0
votes
0
answers
50
views
Elasticsearch kNN semantic search with pre-stored embeddings returning the same score for every hit
I was following this documentation page from Elasticsearch:
https://www.elastic.co/guide/en/elasticsearch/reference/current/bring-your-own-vectors.html
I have stored the vectors already and tried to ...
1
vote
0
answers
169
views
ColBERT index in Elasticsearch
I need to set up a complex semantic search pipeline. This pipeline includes several retrieval sources, such as full text search, vector search, and ColBERT search. For the first two tasks there is an ...
0
votes
0
answers
92
views
OpenSearch: insufficient number of hits for nested kNN queries with efficient filter
What is the bug?
We use an index to store text documents for semantic search. Because the text is long, we chunk it into paragraphs and embed them using the all-MiniLM-L6-v2 model. Each chunk being stored in ...
2
votes
0
answers
95
views
Spark approximate N-nearest neighbor join using cosine similarity
I have two Spark DataFrames A and B with the same schema. They contain text and the embedding vector of the text pre-calculated using a model such as OpenAI ADA v2 or similar. Example:
id text ...
-1
votes
1
answer
527
views
RRF search in Elasticsearch giving problems
GET example-index/_search
{
"retriever": {
"rrf": {
"retrievers": [
{
"standard": {
...
0
votes
1
answer
146
views
How do we set the value of k in a kNN query using the Elasticsearch client?
Hi, I'm trying to implement kNN in Elasticsearch using Java (Maven), but k is not being accepted in the query builder of the Elasticsearch client.
Query knnQuery = QueryBuilders.knn(m -> m
.queryVector(...
0
votes
0
answers
164
views
Speed up semantic search for large dataset with custom search function
I am building a semantic search engine for short sentences where I want to know which sentences have the most similar sentences. For that, I need to know, based on a threshold on the similarity score, ...
1
vote
0
answers
183
views
How to improve OpenAI semantic search speed
I have a database (currently a JSON file) of keywords and their embedding data that I created with OpenAI's embeddings.
What I am trying to do is a similarity search with the input keyword.
So in my ...
0
votes
1
answer
500
views
How to get Retrieval QA to return the exact document that contains the answer from the retrieved top k document?
I'm creating a QA bot with RAG and aiming to provide the specific documents from which the answers are extracted.
Retrieval QA uses k documents which are semantically similar to the query to generate the ...
2
votes
0
answers
293
views
Create a $vectorSearch index in the MongoDB mongosh terminal
I want to create a $vectorSearch index to use MongoDB semantic search. I found some official documents that describe how to create it through the MongoDB Atlas cloud dashboard. But I need to make it on my ...
0
votes
2
answers
409
views
How to get the combined result from multiple vectors stored in Pinecone?
We have generated vector embeddings using OpenAI from our custom data file, which is in .xlsx format, and stored the vectors in Pinecone. We are now trying to query the Pinecone index and get the ...
1
vote
1
answer
584
views
Purpose of Content, Title and Keyword in semantic ranking
I am new to semantic search/semantic ranking, so I might be using the terms incorrectly. I have a few questions to understand the concept of semantic ranking, if someone could help.
Are Semantic ...
0
votes
1
answer
64
views
instantiating SemanticSettings is causing a build error in web app
The following line of code is causing a build error:
SemanticSettings semanticSettings = new SemanticSettings();
This came from Quickstart: semantic ranking - Azure AI Search | Microsoft Learn
I was ...
-1
votes
2
answers
80
views
Result format of Vespa Query
I want to see the output of the query as "Track id" and "Title" but I am only able to see the metadata as attached. What changes do I need to make in the results to get desired ...
2
votes
2
answers
3k
views
The differences between Qdrant upload_records and upsert methods?
I am new to the Qdrant vector database and its literature. As I understand, for uploading data to the Qdrant client database, we use uploading methods such as upsert and upload_records but I did not ...
0
votes
1
answer
517
views
How to register sparse encoding model in AWS OpenSearch
I'm trying to deploy the pretrained model amazon/neural-sparse/opensearch-neural-sparse-encoding-v1 on AWS OpenSearch to use it for Neural Sparse Search, but it doesn't seem to work.
The full request:
...
0
votes
1
answer
91
views
How can I do recommendations with Marqo? [closed]
I would like to use Marqo to get recommendations of a similar nature to the query from the database.
Instead of querying the index with text, I want to search with an existing document from the index. ...
2
votes
1
answer
439
views
OpenSearch: use vector search in combination with should
Suppose my index has:
A vector field called "text_encoded"
A field called "field1", that can contain one or more of the following classes: "A", "B", "C"...
3
votes
1
answer
1k
views
Semantic search with pretrained BERT models giving irrelevant results with high similarity
I'm trying to create a Semantic search system and have experimented with multiple pretrained models from the SentenceTransformers library: LaBSE, MS-MARCO etc. The system is working well in returning ...
1
vote
2
answers
606
views
Azure Cognitive Search: queryLanguage Parameter Not Affecting Semantic Search Results
I am working with Azure Cognitive Search and have set up an index with content in both English and German. I'm attempting to perform semantic search with different queryLanguage parameters to retrieve ...
1
vote
0
answers
413
views
Compiled slug size is too large (max is 500M) due to "sentence-transformers" in Heroku
I am receiving this error while deploying my application on Heroku:
error image
It is because I have added "sentence-transformers" to my application's requirements.txt file, and if I don't ...
0
votes
1
answer
257
views
How to view Vespa Embedding?
I tried the following block of code to implement a nearest neighbor search in Vespa.
https://docs.vespa.ai/en/nearest-neighbor-search-guide.html
I was able to run it successfully but was ...
1
vote
1
answer
263
views
How to run nearest neighbor search in Vespa?
Trying to fetch the closest neighbor for my given embedding using the query below:
vespa query -v 'yql=select text from VectorSearch3_content where {targetHits:10}nearestNeighbor(embedding,q)' 'hits=1' '...
1
vote
0
answers
403
views
Detecting duplicates and managing documents in Redis
Our team uses Redis as a vector store for our Langchain application. We store text chunks as hashes with indexed keys, and fields of metadata, content, and vector. The issue arises when we are trying ...
1
vote
0
answers
887
views
JSON-file-based question answering model using Vertex AI
I'm working on a document-based question answering model using Vertex AI. But here I'm using JSON files instead of documents such as PDF or DOCS. Consider thousands of JSON files which contain ...
2
votes
1
answer
367
views
Fine-tuning Sentence Transformers for a semantic product search task
The problem I have at hand is to build a product suggestion model which suggests products based on the context of a user's search query. My plan is to get a pre-trained model from the sentence-...
2
votes
0
answers
448
views
Semantic search and text expansion query with self-deployed model in Elasticsearch
I'm trying to use the Elasticsearch text expansion query to implement semantic search on a rank features field. I've read the ELSER documentation and understand the process. I'm using a local/...
0
votes
1
answer
3k
views
Azure Cognitive Search: create an indexer with skillsets to convert PDF file content to vector data and map it to the index field ContentVector
I was trying to achieve semantic vector search on my own data. The PDF file will be uploaded to blob storage; an indexer with skillsets picks up the file content from the data source and maps it to the index ...
1
vote
0
answers
324
views
Cosine similarity in Elasticsearch with multiple vectors per document
I have a "documents" index. Every document has multiple embedding vectors corresponding to chunks of text of the document. Can I run a cosine similarity script using Elasticsearch to get ...
0
votes
1
answer
661
views
Embeddings and semantic search in Spanish
I'm building an AI assistant that interacts with custom Q&A stored in a vector database.
All the examples show it as a very simple task of chunking documents (QA in this case), creating embeddings,...
1
vote
1
answer
648
views
Getting the AnswerResult from Azure Cognitive Search
Does anyone have an example of getting the Answer from the semantic search in Azure Cognitive Search using C#?
This is what I have, but I can't seem to get the AnswerResult from it.
SearchOptions ...
1
vote
0
answers
512
views
Elasticsearch | How to give more weight to semantic search than to the default term-based search when using hybrid search
I am using hybrid search in Elasticsearch. Below is an example from the ES docs. I found that in hybrid results the term-based results come at the top, and I wanted to know how the scoring works in each ...
1
vote
0
answers
238
views
How to perform semantic search on Hebrew documents using AWS OpenSearch?
I am working on a project where I need to perform semantic search on Hebrew documents using AWS OpenSearch. I am wondering if it is possible to perform semantic search on Hebrew documents using ...
0
votes
2
answers
2k
views
Azure Cognitive Search: got an unexpected keyword argument 'query_language' in python vscode
I'm trying to use semantic-search-enabled Azure Cognitive Search in my Flask app (in a Python virtual env).
When I do pip install azure.search.documents, version 11.3.0 gets installed
and I get ...
0
votes
2
answers
2k
views
How is Semantic Search using fields set up in Semantic Configuration to re rank results in Azure Cognitive Search
I have an index in azure cognitive search where I have given the details of my offerings. There are two main fields (amongst many others like Name, Availability etc.):
Bio (Edm.String)
Tags (...
1
vote
2
answers
475
views
Azure Cognitive Search - Does semantic search configuration support binding the content field to a content type in an Edm.ComplexType?
Does the Azure Cognitive Search semantic search configuration support binding the content field to a content type in an Edm.ComplexType? I have an index which has a collection of complex types. Each item in the ...
0
votes
1
answer
214
views
Can a semantic query type support advanced query forms?
I am working with Azure Cognitive Search and while going through its documentation I came across the advanced query forms like fielded search, fuzzy search, proximity search and many more, but all of ...
0
votes
0
answers
96
views
How to add a scoring function of type Freshness to a field of type Collection(Edm.ComplexType) in Azure Cognitive Search?
I have configured and indexed the data for Azure Cognitive Search. The indexed data contains a field TimeSlots of type Collection(Edm.ComplexType) which looks like
"TimeSlots": [
...
1
vote
0
answers
462
views
Use word embedding models in Elasticsearch
I am trying to use embedding models in Elasticsearch following this tutorial
When I run this command
docker run -it --rm elastic/eland \
eland_import_hub_model \
--url https://username:password@...
1
vote
1
answer
1k
views
OpenAI Rate Limit 429 Bug
I am trying to use this repository to create semantic search for youtube videos using OpenAI + Pinecone but I am hitting a 429 error on this step - "Run the command npx tsx src/bin/process-yt-...
4
votes
1
answer
4k
views
Semantic searching using Google flan-t5
I'm trying to use Google flan-t5-large to create embeddings for a simple semantic search engine. However, the cosine similarity between the generated embeddings and my query is way off. Is there something I'm ...