2 votes
0 answers
250 views
+50 bounty

I have an OpenAI model with Retrieval-Augmented Generation (RAG): import {OpenAIEmbeddingFunction} from "@chroma-core/openai"; import chromaClient from "../config/chromadb"; import {...
yeln • 757
Advice
0 votes
0 replies
110 views

I've been working on adapting Microsoft's BioGPT-Large for veterinary pharmacology using Plumb's Veterinary Drug Handbook (2023) as my domain corpus. After going through a lot of trial and error, I ...
sahil koshti
Best practices
0 votes
0 replies
51 views

I am architecting a Forensic Data Audit system (Multi-Agent RAG) to analyze fragmented, large-scale archives. A critical bottleneck is maintaining Entity Resolution (ER) across millions of ...
abdo zaalouk
Advice
0 votes
1 reply
68 views

I’m working on a streaming pipeline where data is coming from a Kafka topic, and I want to integrate LLM-based processing and RAG ingestion. I’m running into architectural challenges around latency ...
Arpan • 993
Advice
0 votes
4 replies
72 views

One of the biggest hurdles I currently face is navigating my university website to find relevant data, like the fee structure for my course (which is updated biannually) and other ...
Shehzad Khan
Advice
4 votes
8 replies
219 views

I am building a local RAG chatbot using LangChain and ChromaDB (PersistentClient). I’m encountering 'hallucinations' when the similarity search returns documents with a low relevance score. How can I ...
grace h
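A common mitigation for the question above, sketched here by the editor rather than taken from the asker's code: gate generation on a retrieval-relevance threshold and refuse to answer when nothing passes. The cutoff value and the context-joining step are hypothetical; `scored_docs` mimics the (document, distance) pairs LangChain's `similarity_search_with_score` returns for Chroma, where lower means closer.

```python
# Minimal sketch: filter low-relevance retrievals before they reach the LLM.

def filter_relevant(scored_docs, max_distance=0.4):
    """Keep only documents whose distance is at or below the cutoff."""
    return [doc for doc, score in scored_docs if score <= max_distance]

def answer(scored_docs):
    docs = filter_relevant(scored_docs)
    if not docs:
        # Refusing is usually better than letting the LLM hallucinate
        # from irrelevant context.
        return "I don't have enough information to answer that."
    return "CONTEXT: " + " ".join(docs)  # would be passed to the LLM

print(answer([("RAG basics", 0.21), ("unrelated text", 0.93)]))
# → CONTEXT: RAG basics
```

Tuning the cutoff against a handful of known-bad queries is usually enough to stop the most obvious hallucinations.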
Best practices
2 votes
2 replies
105 views

I need some help. I'm really struggling to use RAG (or some other approach) to feed large documents to a local LLM (I'm using llama-server to run gpt-oss 20b). My question is: how to implement such ...
bud • 1
1 vote
0 answers
91 views

I'm building a Streamlit app using LangChain (latest), LangGraph, and Groq with the model: llama-3.3-70b-versatile I'm using the modern create_agent() API (LangGraph-backed). The agent has two tools: ...
Ravikiran Arasur T S
5 votes
0 answers
165 views

I am using an RTX 3060 (12GB VRAM) and implementing a RAG pipeline with the BGE-M3 embedding model. Initially, I installed PyTorch with the CUDA 12.8 wheel (my NVIDIA driver supports CUDA 12.9). ...
Sujith A
0 votes
1 answer
66 views

What I am working on: I'm building a conversational RAG pipeline using LangChain JS with Ollama (local models). If I use a normal retriever created from a vector store, everything works fine and ...
Praneeth
0 votes
1 answer
86 views

import os, asyncio, json from dotenv import load_dotenv from autogen_agentchat.agents import AssistantAgent from autogen_agentchat.teams import DiGraphBuilder, GraphFlow from chromadb import ...
Sushruth Kamarushi
3 votes
0 answers
283 views

I'm using FastMCP in Python to implement an MCP server. Currently I run into a problem when it comes to streaming the generated tokens from the LLM. I don't want to wait for the completed response ...
Daniel • 313
Tooling
0 votes
0 replies
101 views

I'm trying to use metadata in RAG systems using LangChain. I see a lot of tutorials using SelfQueryRetriever, but it appears that this was deprecated in recent versions. Is this correct? I couldn't ...
Augusto Firmo
Advice
2 votes
2 replies
123 views

I’m building a tool that generates new mathematics exam problems using an internal database of past problems. My current setup uses a RAG pipeline, Pinecone as the vector database, and GPT-5 as the ...
Marc-Loïc Abena
Best practices
1 vote
2 replies
175 views

I'm building a voice-based calling system where users can create AI agents that make outbound phone calls. The agent uses Deepgram for real-time transcription and ElevenLabs/Cartesia for speech ...
Sarthak Sahu
Advice
0 votes
1 reply
63 views

I have a large set of phrases obtained via Azure Fast Transcription, and I need to group them into coherent semantic chunks (to use later in a RAG pipeline). Initially, I tried grouping phrases based ...
Daniel • 13
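For the transcription-chunking question above, one widely used heuristic (an editor's sketch, not the asker's code) is semantic chunking: embed each phrase, then start a new chunk whenever the cosine similarity between consecutive phrases drops below a threshold. The threshold and the toy two-dimensional vectors below are illustrative stand-ins for real embedding output.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def chunk_by_similarity(phrases, embeddings, threshold=0.7):
    """Start a new chunk whenever the next phrase drifts semantically."""
    chunks = [[phrases[0]]]
    for prev, cur, phrase in zip(embeddings, embeddings[1:], phrases[1:]):
        if cosine(prev, cur) >= threshold:
            chunks[-1].append(phrase)
        else:
            chunks.append([phrase])
    return [" ".join(c) for c in chunks]

phrases = ["billing issue", "refund request", "weather today"]
vecs = [[1.0, 0.1], [0.9, 0.2], [0.0, 1.0]]
print(chunk_by_similarity(phrases, vecs))
# → ['billing issue refund request', 'weather today']
```

Comparing only adjacent phrases keeps the pass linear in the number of phrases, which matters for long transcripts.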
0 votes
0 answers
50 views

I'm using LlamaIndex 0.14.7. I would like to embed document text without concatenating metadata, because I put a long text in metadata. Here's my code: table_vec_store: SimpleVectorStore = ...
Trams • 421
0 votes
0 answers
71 views

This is my embedding code, which I run once only: embeddings = OpenAIEmbeddings(model="text-embedding-3-large") vector_store = MongoDBAtlasVectorSearch.from_connection_string( ...
Mingruifu Lin
1 vote
1 answer
240 views

I’m trying to evaluate my Retrieval-Augmented Generation (RAG) pipeline using Ragas. Here’s a complete version of my code: """# RAG Evaluation""" from datasets import ...
Chandima
0 votes
1 answer
108 views

My objective is to do keyword filtering in Chroma. I have a field called keywords with a list of strings and I want to filter with it, but Chroma won't let me add lists as a field. I checked my Chroma ...
Elena López-Negrete Burón
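Chroma metadata values must be scalars (str, int, float, bool), which is why the list field above is rejected. One common workaround, sketched here by the editor: flatten the keyword list into one boolean metadata field per keyword and filter on that field. The `kw_` prefix is a hypothetical naming convention; another option is joining the keywords into a delimited string and filtering the document text with a `$contains` clause.

```python
# Sketch: represent a keyword list as scalar metadata fields.

def flatten_keywords(keywords):
    """Turn ['python', 'rag'] into {'kw_python': True, 'kw_rag': True}."""
    return {f"kw_{k}": True for k in keywords}

def where_for_keyword(keyword):
    """Build a Chroma-style equality `where` filter for one keyword."""
    return {f"kw_{keyword}": True}

meta = flatten_keywords(["python", "rag"])
print(meta)                      # → {'kw_python': True, 'kw_rag': True}
print(where_for_keyword("rag"))  # → {'kw_rag': True}
```

The `meta` dict would be passed as the document's metadata at `add()` time, and the `where` dict at query time.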
0 votes
1 answer
117 views

I built a RAG chatbot in Python with LangChain and FAISS for the vector store, and the data is stored as JSON. The chatbot sometimes refuses to answer when a question is rephrased. Here are two ...
SoftwareEngineer
0 votes
0 answers
35 views

Question: I'm building a memory-augmented AI system using RAG with persistent vector storage, but facing memory leaks and context contamination between sessions. Problem: Vector embeddings aren't ...
TensorMind
0 votes
1 answer
83 views

I am trying to create a small starter LLM RAG project using Haystack. My project packages are below (I use uv): [project] name = "llm-project" version = "0.1.0" description = "...
femi • 992
0 votes
0 answers
84 views

I am trying to use lancedb to perform FTS, but getting spurious results. Here is a minimal example: # Data generation import lancedb import polars as pl from string import ascii_lowercase words = [...
MKWL • 41
0 votes
0 answers
195 views

On the ingestion part to the graph db, I pass a json file, as an episode, custom entities (and edges), using gemini api, but I get some discrepancy on the structured output, like so: LLM generation ...
George Petropoulos
0 votes
0 answers
66 views

I am using RAGFlow connected to a Spring Boot MCP server. My agent flow is simple: Begin node → collects inputs (auth_token, tenant_id, x_request_status) Agent (gpt-4o) → connected to MCP Tool (server)...
Ishan Garg
1 vote
0 answers
101 views

I am using the python package ragas with the goal of generating a testset for a RAG application. I am defining my BaseRagasLLM as: from langchain_ollama import OllamaLLM from ragas.llms import ...
oyster • 21
1 vote
1 answer
469 views

I set up a self-hosted Firecrawl instance and I want to crawl my internal intranet site (e.g. https://intranet.xxx.gov.tr/). I can access the site directly both from the host machine and from inside ...
birdalugur
2 votes
1 answer
299 views

I'm building a document Q&A system using FAISS for vector search on an AWS EC2 t3.micro instance (1 vCPU, 1GB RAM). My FAISS index is relatively small (8.4MB .faiss + 1.4MB .pkl files), but I'm ...
user29255210
0 votes
0 answers
162 views

I'm building a RAG (Retrieval-Augmented Generation) chatbot using LangChain, Gemini API, and Qdrant, with a Streamlit frontend. I want to write unit tests for the app using pytest, and I’m trying to ...
Krishna Suthar
0 votes
1 answer
172 views

When using RAG and memory, multiple identical copies of the same information are sent to the AI when asking related questions. I have import java.util.ArrayList; import java.util.List; import dev....
MTilsted • 5,555
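When retrieval and chat memory both resurface the same passage, a cheap fix is to deduplicate the combined context before it reaches the model. The asker's code is Java; this is a generic Python sketch by the editor, with hypothetical sample data, showing the normalization-then-dedupe idea.

```python
# Sketch: drop exact-duplicate context snippets while preserving order.

def dedupe_context(snippets):
    """Remove duplicates after normalizing whitespace and case."""
    seen = set()
    unique = []
    for s in snippets:
        key = " ".join(s.split()).lower()  # normalize whitespace/case
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique

context = [
    "Invoices are due in 30 days.",
    "invoices are due in 30 days.",   # same fact, resurfaced by memory
    "Late fees are 2% per month.",
]
print(dedupe_context(context))
# → ['Invoices are due in 30 days.', 'Late fees are 2% per month.']
```

Exact-match dedupe only catches verbatim repeats; near-duplicates would need an embedding-similarity check on top.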
0 votes
1 answer
403 views

I am trying to delete all the data points that are associated with a particular email Id, but I am encountering the following error. source code: app.get('/cleanUpResources', async (req, res) => { ...
Abhishek Prasad
-1 votes
1 answer
329 views

The problem with this piece of code is that I am unable to import Client from the pinecone library. I tried uninstalling and reinstalling different versions; none of them worked. I also tried it ...
ACR • 11
-1 votes
1 answer
63 views

I'm building a LangChain RAG pipeline using the FAISS vector store. I'm merging multiple FAISS indexes — each representing one document — and then querying them to generate summaries or answers via ...
Musab • 54
1 vote
0 answers
242 views

I'm building a web application using Spring Boot 3.4.5 and Spring AI 1.0.0 with Llama3.2(Ollama) model integration. I've implemented tool calling, and because I have many tools in the application, I'm ...
Sarath Molathoti
0 votes
0 answers
120 views

I have recently been trying to build a multi-agent project that, to summarize, consists of the following: through a user input (often a query), the first agent will be dedicated to making the input more suitable for ...
PMathC • 1
-1 votes
1 answer
879 views

I am trying to call a Flask API which is already running on port 5000 on my system. I am designing agentic AI code which will invoke GET and then POST based on some condition, using google-adk. I ...
witty_minds
1 vote
0 answers
138 views

Wanted to use the pipeline api from @huggingface/transformers js for sentence-similarity - but I do not see a specific pipeline for it. The closest thing is text classification and feature extractions ...
Edv Beq • 1,020
1 vote
0 answers
82 views

I'm building a RAG-based document QA system using Python (no LangChain), LLaMA (50K context), PostgreSQL with pgvector, and Docling for parsing. Users can upload up to 10 large documents (300+ pages ...
Anton Lee
0 votes
0 answers
59 views

I'm working on a RAG pipeline using a vector database to search over a Q&A dataset. I'm using embedding-based dense retrieval to fetch relevant answers to user queries. The issue I'm facing is ...
MojtabaMAleki02
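The excerpt above is cut off, but a recurring pitfall with dense retrieval over Q&A corpora is vocabulary mismatch: user queries resemble the stored *questions* far more than the stored *answers*, so embedding the answers directly often retrieves poorly. A minimal editor's sketch of the "index questions, return answers" pattern, with word overlap standing in for embeddings and all data hypothetical:

```python
# Toy Q&A retrieval: match the query against stored questions,
# then return the paired answer.

qa_pairs = [
    ("How do I reset my password?", "Use the 'Forgot password' link on the login page."),
    ("What is the refund window?", "Refunds are accepted within 30 days of purchase."),
]

def score(query, question):
    """Jaccard word overlap; a real system would use embeddings here."""
    q = set(query.lower().split())
    d = set(question.lower().split())
    return len(q & d) / max(len(q | d), 1)

def retrieve_answer(query):
    """Rank by question similarity, return the best pair's answer."""
    best = max(qa_pairs, key=lambda pair: score(query, pair[0]))
    return best[1]

print(retrieve_answer("how can I reset my password"))
# → Use the 'Forgot password' link on the login page.
```

The same pattern carries over to a vector database: store the question embedding as the searchable vector and keep the answer as payload/metadata.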
0 votes
0 answers
77 views

I wanted to make a web app that uses llama-index to answer queries using RAG from specific documents. I have locally set up Llama3.2-1B-instruct llm and using that locally to create indexes of the ...
Utkarsh
1 vote
1 answer
865 views

I am experimenting with RAG on GCP/Vertex AI, and tried to create some simple example. Here's what I came up with, creating small dummy files locally and then uploading them one by one to a newly-...
Davide Fiocco
0 votes
0 answers
213 views

I have a RAG system using LlamaIndex. I am upgrading the library from 0.10.44 to 0.12.33 and see different behaviour now. Before, when there were no results from the vector store, it seems it called the LLM ...
Deibys • 669
0 votes
0 answers
102 views

I checked Azure's documentation on this topic here but I do not see anything related to this. My goal is to create a question and answer dataset for my RAG solution based on each chunk for a good ...
Mike B • 3,629
1 vote
1 answer
170 views

I am using this model to embed a product catalog for a RAG system. In the product catalog, there are no red shirts for men, but there are red shirts for women. How can I make sure the model doesn't output ...
Advait Shendage
0 votes
2 answers
79 views

from langchain_community.document_loaders import SitemapLoader def crawl(self): print("Starting crawler...") sitemap_url = "https://gringo.co.il/sitemap.xml" ...
Gulzar • 29k
1 vote
2 answers
1k views

I am using AWS bedrock for the first time. I have configured the data source which is S3 along with opensearch serverless cluster for embeddings. However, I do not have any control over the mappings ...
Makarand • 646
1 vote
0 answers
52 views

I'm trying to index a series of articles to use in a RAG knowledge base, but I cannot find any best practice or recommendation documented about dealing with information that changes or evolves over time. ...
weeanon • 821
0 votes
1 answer
523 views

I want to know if there are any other settings required for pgvector, or what needs to be set in the code, to enable pgvector to support higher vector dimensions. I found on the official website ...
tom • 3
0 votes
1 answer
28 views

I'm retrieving results from a Cypher query, which includes the article's date and text. After fetching the results, I'm formatting them before passing them to the LLM for response generation. ...
Yuvraj Singh Bhadauria