NLP Collective

A collective focused on NLP (natural language processing), the transformation or extraction of useful information from natural language data.
38.3k Questions
12.9k Members

Can you answer these questions?

These questions still don't have an answer

1 vote
0 answers
28 views

How to handle I/O Memory issues with HuggingFace in Kaggle (SafeTensorError)?

I got the above error when trying to load a model from HuggingFace. I was using AutoModelForCausalLM to load the model in question (this is a class directly from the transformers package). ...
Best practices
0 votes
0 replies
67 views

Best approach to generate embeddings for 10K+ documents in Spring Boot + OpenSearch (performance issue)

I am building a search system using Spring Boot and OpenSearch. Current setup: using an OpenSearch ingest pipeline with the text_embedding processor; each ZIP file contains 10K+ documents; bulk indexing is ...
Best practices
0 votes
0 replies
51 views

Implementing Deterministic Entity Resolution in a Multi-Agent RAG for Investigative Archiving

I am architecting a Forensic Data Audit system (Multi-Agent RAG) to analyze fragmented, large-scale archives. A critical bottleneck is maintaining Entity Resolution (ER) across millions of ...
3 votes
0 answers
36 views

How to convert the MLP in MoE to 4 bit quantization?

I'm doing some research on information encoding in LLMs and need a way to quantize the weights of the MLP layers (MoE) to 4 bits, and even to a customized mixed precision. Consider from ...
1 vote
0 answers
77 views

SentenceTransformer can't load all-MiniLM-L6-v2 – missing config_sentence_transformers.json

I'm trying to load the sentence-transformers/all-MiniLM-L6-v2 model, but it fails with a missing file error. Code: from sentence_transformers import SentenceTransformer; model = SentenceTransformer(&...