1 vote
1 answer
98 views

I’ve exported a fine-tuned BERT-based QA model to ONNX for faster inference, but I’m noticing that the predictions from the ONNX model are consistently less accurate than those from the original ...
asked by vinoth
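
A useful first check for the question above is whether the exported graph is numerically faithful at all: compare raw logits between the original model and ONNX Runtime on identical inputs. A minimal sketch, assuming a SQuAD-style checkpoint and an export saved as model.onnx (both placeholders) and the default Hugging Face input/output naming:

    import numpy as np
    import onnxruntime as ort
    import torch
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    name = "deepset/bert-base-cased-squad2"    # placeholder checkpoint
    tok = AutoTokenizer.from_pretrained(name)
    pt_model = AutoModelForQuestionAnswering.from_pretrained(name).eval()
    sess = ort.InferenceSession("model.onnx")  # placeholder export path

    enc = tok("Who wrote Hamlet?", "Hamlet was written by Shakespeare.",
              return_tensors="pt")
    with torch.no_grad():
        pt_out = pt_model(**enc)

    # Assumes the export kept the tokenizer's input names and the usual
    # (start_logits, end_logits) output order.
    onnx_start, onnx_end = sess.run(None, {k: v.numpy() for k, v in enc.items()})

    # Large max deviations point at the export (opset, fp16 casts, wrong
    # input names) rather than at the model itself.
    print(np.abs(pt_out.start_logits.numpy() - onnx_start).max())
    print(np.abs(pt_out.end_logits.numpy() - onnx_end).max())
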
3 votes
1 answer
241 views

Why do non-identical inputs to ProtBERT generate identical embeddings when non-whitespace-separated? I've looked at answers here etc. but they appear to be different cases where the slicing of the out....
asked by Maximilian Press
0 votes
1 answer
68 views

I am fine-tuning a CodeBERT model using my custom dataset and tokenizers. I tried to unfreeze the last 4 layers of the model. When checking whether the layers are unfrozen, it shows me which layers are not ...
asked by Ayush V Jain
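
The usual verification pattern with a Hugging Face checkpoint is to freeze everything, re-enable the last blocks by parameter name, then print only what remains trainable. A minimal sketch, assuming the standard module names of microsoft/codebert-base (a RoBERTa-style encoder with 12 layers):

    from transformers import AutoModel

    model = AutoModel.from_pretrained("microsoft/codebert-base")

    # Freeze everything first.
    for param in model.parameters():
        param.requires_grad = False

    # Un-freeze the last 4 encoder blocks (layers 8-11 of 0-11).
    for name, param in model.named_parameters():
        if any(f"encoder.layer.{i}." in name for i in (8, 9, 10, 11)):
            param.requires_grad = True

    # Verify by listing only the tensors that will actually train.
    trainable = [n for n, p in model.named_parameters() if p.requires_grad]
    print(len(trainable), trainable[:3])
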
0 votes
1 answer
286 views

I have installed the latest Anaconda and updated everything. When I try to install BERTopic or PyTorch itself, I am getting this error: InvalidArchiveError("Error with archive C:\Users\myuser\...
asked by Paul Nguyễn
1 vote
1 answer
129 views

In the paper “Using Prior Knowledge to Guide BERT’s Attention in Semantic Textual Matching Tasks”, they multiply a similarity matrix with the attention scores inside the attention layer. I want to ...
asked by Blockchain Kid
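
The mechanism in that paper amounts to injecting a token-pair prior into the pre-softmax attention scores. A framework-agnostic sketch of one attention head, where prior is a hypothetical (seq_len, seq_len) similarity matrix you supply; the paper combines the prior multiplicatively, shown here as an option:

    import math
    import torch
    import torch.nn.functional as F

    def attention_with_prior(q, k, v, prior, alpha=1.0, multiplicative=False):
        # q, k, v: (batch, seq, dim); prior: (seq, seq).
        d = q.size(-1)
        scores = q @ k.transpose(-2, -1) / math.sqrt(d)   # (batch, seq, seq)
        # Inject the prior before softmax so high-similarity token pairs
        # receive extra attention mass.
        scores = scores * prior if multiplicative else scores + alpha * prior
        return F.softmax(scores, dim=-1) @ v

    q = k = v = torch.randn(2, 5, 64)
    prior = torch.eye(5)            # toy prior: boost self-attention
    print(attention_with_prior(q, k, v, prior).shape)   # (2, 5, 64)
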
1 vote
0 answers
70 views

I don't understand the appropriate shape and values of the tensor expected for ModernBertSequentialClassification fine-tuning in Candle (Rust). Is there a formula to determine the appropriate shape and ...
asked by whitebox3
0 votes
3 answers
667 views

I have been trying to recreate this tutorial found in TensorFlow's docs. However, I've been having an error I cannot solve, and it seems to be related to the literal source code of the tutorial. Also, ...
asked by franjefriten
1 vote
1 answer
83 views

So I trained a tokenizer from scratch using Huggingface’s tokenizers library (not AutoTokenizer.from_pretrained, but actually trained a new one). Seemed to go fine, no errors. But when I try to use it ...
asked by RobGG3938
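
A frequent cause of trouble at this point is passing the raw tokenizers.Tokenizer where transformers expects a PreTrainedTokenizer; wrapping it usually bridges the gap. A minimal sketch, assuming a WordPiece-style tokenizer saved as tokenizer.json (placeholder path) and BERT-style special tokens, which must be declared explicitly because they are not inferred:

    from tokenizers import Tokenizer
    from transformers import PreTrainedTokenizerFast

    raw_tok = Tokenizer.from_file("tokenizer.json")   # placeholder path

    # Wrap so transformers models / pipelines can consume it.
    tok = PreTrainedTokenizerFast(
        tokenizer_object=raw_tok,
        unk_token="[UNK]",
        pad_token="[PAD]",
        cls_token="[CLS]",
        sep_token="[SEP]",
        mask_token="[MASK]",
    )

    print(tok("hello world", return_tensors="pt")["input_ids"])
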
0 votes
0 answers
134 views

I'm having an issue with PyTorch in a Docker container where torch.cuda.is_available() returns False, but the same PyTorch version works correctly outside the container. Environment Host: Debian 12 ...
asked by Antonio
0 votes
1 answer
315 views

I have a BERT model that I want to fine-tune. Initially, I use a training dataset, which I split into a training and validation set. During fine-tuning, I monitor the validation loss to ensure that ...
asked by Rishi Garg
2 votes
1 answer
196 views

I'm using CodeBERT to compare how similar two pieces of code are. For example:

    # Code 1
    def calculate_area(radius):
        return 3.14 * radius * radius

    # Code 2
    def compute_circle_area(r):
        return 3.14159 * ...
asked by Nep
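
One common recipe for this: embed each snippet with CodeBERT and compare pooled vectors with cosine similarity. A sketch using mean pooling (one reasonable choice among several; CodeBERT was not trained specifically for similarity, so treat absolute scores as rough):

    import torch
    import torch.nn.functional as F
    from transformers import AutoModel, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("microsoft/codebert-base")
    model = AutoModel.from_pretrained("microsoft/codebert-base").eval()

    def embed(code):
        enc = tok(code, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state   # (1, seq, 768)
        return hidden.mean(dim=1).squeeze(0)          # mean-pool over tokens

    code1 = "def calculate_area(radius): return 3.14 * radius * radius"
    code2 = "def compute_circle_area(r): return 3.14159 * r * r"
    print(F.cosine_similarity(embed(code1), embed(code2), dim=0).item())
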
0 votes
0 answers
36 views

When fine-tuning the HuBERT model to detect phonemes, I chose a fine-tuned ASR HuBERT model, removed the last two layers, and added a linear layer matching the phoneme vocab_size in the config. What is ...
asked by Ngoc Anh
0 votes
0 answers
71 views

I seek advice on a classification problem in industry. The rows in a dataset must be classified/labeled (it lacks a target column; labels have dot-separated levels like 'x.x.x.x.x.x.x') during every ...
asked by Johan
0 votes
0 answers
45 views

I need to detect words an LLM has no knowledge about, so I can add a RAG-based definition of each such word to the prompt, e.g.: What is the best way to achieve slubalisme using the new fabridocium product?, ...
asked by aguadoe
0 votes
1 answer
236 views

I’m running into a frustrating issue while training a BERT-based multi-label text classification model on an imbalanced dataset. After a few epochs, the training loss suddenly becomes NaN, and I can’t ...
asked by Erhan Arslan
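
Two usual suspects for a loss that goes NaN on imbalanced multi-label data are an unstable sigmoid+BCE combination and exploding gradients. A sketch of the standard mitigations, with placeholder shapes and weights:

    import torch
    from torch import nn

    num_labels = 6
    logits = torch.randn(8, num_labels, requires_grad=True)  # stand-in for model output
    targets = torch.randint(0, 2, (8, num_labels)).float()

    # pos_weight up-weights rare positive labels; estimate it from your
    # training-set label frequencies (these values are placeholders).
    pos_weight = torch.tensor([1.0, 3.0, 10.0, 5.0, 2.0, 8.0])

    # Feed raw logits: BCEWithLogitsLoss applies a numerically stable
    # log-sigmoid internally, unlike sigmoid() followed by BCELoss.
    loss = nn.BCEWithLogitsLoss(pos_weight=pos_weight)(logits, targets)
    loss.backward()

    # Clip gradients before optimizer.step(); occasional spikes are a
    # common source of NaNs a few epochs in.
    torch.nn.utils.clip_grad_norm_([logits], max_norm=1.0)
    print(loss.item())
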
0 votes
1 answer
277 views

I tried to adapt the mBERT model to existing code. However, I received the following issue even though I tried different solutions: torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20....
asked by MarMarhoun
0 votes
0 answers
55 views

I've just learned how BERT produces embeddings. I might not understand it fully. I was thinking of doing a project leveraging those embeddings, feeding them to an autoencoder to generate latent ...
asked by Nik Imran
0 votes
1 answer
56 views

I'm not referring to BERTScore. BERTScore uses token-level word embeddings: you compute pairwise cosine similarity of word embeddings and obtain scores using greedy matching. I'm referring to Sentence ...
asked by Yuirike
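
For sentence-level similarity, sentence-transformers returns pooled sentence vectors directly, so there is one cosine score per sentence pair rather than per-token matching as in BERTScore. A minimal sketch with a small public checkpoint:

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")
    emb = model.encode(
        ["The cat sits on the mat.", "A feline rests on a rug."],
        convert_to_tensor=True,
    )
    # One similarity score for the whole sentence pair.
    print(util.cos_sim(emb[0], emb[1]).item())
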
0 votes
0 answers
90 views

I'm working on a BERT-based model for fake news detection. While applying additional layers (since my model does not achieve good accuracy with BERT alone), like dropout and fully connected ...
asked by Abrar Hussain
1 vote
1 answer
473 views

I have been trying to run TFBertModel from Transformers, but it kept throwing this error:

    ValueError            Traceback (most recent call last)
    Cell In[9], line 1
    ----> 1 ...
asked by Faiz khan
2 votes
0 answers
350 views

I am trying to fine-tune BERT for a multi-label classification task (Jigsaw toxic comments). I created a custom dataset and DataLoader as follows:

    class CustomDataSet(Dataset):
        def __init__(...
asked by Hyppolite
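
For comparison, a minimal multi-label dataset sketch for this task: the details that most often break training are the label dtype (BCEWithLogitsLoss needs float multi-hot vectors) and tensors carrying a stray batch dimension out of the tokenizer. Names and sizes here are placeholders:

    import torch
    from torch.utils.data import DataLoader, Dataset
    from transformers import AutoTokenizer

    class ToxicCommentsDataset(Dataset):
        def __init__(self, texts, labels, tokenizer, max_len=128):
            self.texts, self.labels = texts, labels
            self.tokenizer, self.max_len = tokenizer, max_len

        def __len__(self):
            return len(self.texts)

        def __getitem__(self, idx):
            enc = self.tokenizer(
                self.texts[idx], truncation=True, padding="max_length",
                max_length=self.max_len, return_tensors="pt",
            )
            return {
                "input_ids": enc["input_ids"].squeeze(0),        # drop batch dim
                "attention_mask": enc["attention_mask"].squeeze(0),
                # Multi-hot float vector, one slot per toxicity label.
                "labels": torch.tensor(self.labels[idx], dtype=torch.float),
            }

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    ds = ToxicCommentsDataset(["you are nice"], [[0, 0, 0, 0, 0, 0]], tok)
    print(next(iter(DataLoader(ds, batch_size=1)))["labels"].shape)  # (1, 6)
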
1 vote
2 answers
78 views

I was trying to run some epochs to train my sentiment analysis model; at the very last step, training stopped with the error in the title. I attach the code here. Sentiment classifier: # Build ...
asked by Laura Valentini
3 votes
1 answer
66 views

I want to utilize BERT to assess the similarity between two pieces of text:

    from transformers import AutoTokenizer, AutoModel
    import torch
    import torch.nn.functional as F
    import numpy as np
    tokenizer ...
asked by Beitian Ma
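
The usual pitfall in this setup is pooling without the attention mask, which lets padding vectors dilute the sentence embedding when inputs are batched. A sketch of mask-aware mean pooling:

    import torch
    import torch.nn.functional as F
    from transformers import AutoModel, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased").eval()

    def sentence_embedding(text):
        enc = tok(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state          # (1, seq, 768)
        mask = enc["attention_mask"].unsqueeze(-1).float()   # (1, seq, 1)
        return (hidden * mask).sum(1) / mask.sum(1)          # average real tokens only

    a = sentence_embedding("The movie was great.")
    b = sentence_embedding("I really enjoyed the film.")
    print(F.cosine_similarity(a, b).item())
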
0 votes
2 answers
573 views

The following will print 768 weight and bias parameters for each LayerNorm layer.

    from transformers import BertModel
    model = BertModel.from_pretrained('bert-base-uncased')
    for name, param in model....
asked by Fijoy Vadakkumpadan
0 votes
0 answers
75 views

I implemented a transformer encoder-decoder (Bert2Bert) for a text summarization task. During training the loss decreases, but at prediction time it generates a repetitive token as output, for example [2,...
asked by rasoul mohammadi
1 vote
1 answer
129 views

I am trying out adapters on LIMU-BERT, which is a lightweight BERT for IMU data. I pretrained LIMU-BERT on Dataset A and planned to add adapters and tune them on Dataset B. Here is my adapter-adding code: ...
asked by 555wen
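
For reference, Houlsby-style adapters are small bottleneck MLPs with a residual connection, inserted into each transformer block while the pretrained weights stay frozen. A generic PyTorch sketch (the hidden size is a placeholder; LIMU-BERT uses a much smaller one than standard BERT):

    import torch
    from torch import nn

    class Adapter(nn.Module):
        """Bottleneck adapter: down-project, nonlinearity, up-project, residual."""

        def __init__(self, hidden_dim, bottleneck_dim=16):
            super().__init__()
            self.down = nn.Linear(hidden_dim, bottleneck_dim)
            self.up = nn.Linear(bottleneck_dim, hidden_dim)
            # Near-identity init: training starts from the pretrained behavior.
            nn.init.zeros_(self.up.weight)
            nn.init.zeros_(self.up.bias)

        def forward(self, x):
            return x + self.up(torch.relu(self.down(x)))

    adapter = Adapter(hidden_dim=72)        # placeholder hidden size
    h = torch.randn(4, 120, 72)             # (batch, seq_len, hidden)
    print(adapter(h).shape)                 # shape preserved; only adapter params train
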
0 votes
0 answers
64 views

I am trying to train the BERT model, but I haven't figured out the structure of TensorFlow yet. In the line x = self.bert_module(book) an error occurs: Exception encountered when calling layer '...
asked by AzureStrannik
1 vote
1 answer
164 views

I have been using the Named Entity Recognition (NER) model https://huggingface.co/cahya/bert-base-indonesian-NER on Indonesian text as follows:

    text = "..."
    model_name = "cahya/bert-...
asked by Mauro Escudero
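
With that checkpoint, the pipeline helper plus an aggregation strategy merges word-piece fragments back into whole entities, which is the step most raw-logit approaches miss. A minimal sketch (the example sentence is a placeholder):

    from transformers import pipeline

    ner = pipeline(
        "ner",
        model="cahya/bert-base-indonesian-NER",
        aggregation_strategy="simple",   # merge sub-word pieces into entities
    )

    for ent in ner("Joko Widodo lahir di Surakarta."):
        print(ent["entity_group"], ent["word"], round(float(ent["score"]), 3))
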
2 votes
2 answers
304 views

Following the multi-headed attention layer in a BERT encoder block, is layer normalization done separately on the embedding of each token (i.e., one mean and variance per token embedding), or on the ...
asked by Fijoy Vadakkumpadan
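
The answer can be checked empirically: nn.LayerNorm(hidden), as used in BERT encoder blocks, normalizes over the last dimension only, i.e. one mean and one variance per token embedding, not across the sequence or the batch. A short demonstration:

    import torch
    from torch import nn

    ln = nn.LayerNorm(768)                 # normalized_shape = hidden size
    x = torch.randn(2, 5, 768)             # (batch, seq_len, hidden)
    y = ln(x)

    # Every (batch, token) position is normalized independently: mean ~ 0,
    # variance ~ 1 (the learned affine parameters are identity at init).
    print(y.mean(dim=-1))                  # ~0 at each position
    print(y.var(dim=-1, unbiased=False))   # ~1 at each position
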
2 votes
0 answers
29 views

I’m experimenting with a Siamese network using triplet loss to categorize sub-classes into broader classes. My setup differs from traditional triplet loss models: It involves using the sub-class as ...
asked by LawNLP_9808
2 votes
1 answer
48 views

I am working with a question-answer dataset UCLNLP/adversarial_qa.

    from datasets import load_dataset
    ds = load_dataset("UCLNLP/adversarial_qa", "adversarialQA")

How do I map ...
asked by Jack Peng
2 votes
1 answer
72 views

When I run the Dutch sentiment analysis model RobBERTje, it outputs just positive/negative labels; the neutral label is missing in the data. https://huggingface.co/DTAI-KULeuven/robbert-v2-dutch-sentiment There are ...
asked by pjercic
1 vote
0 answers
82 views

I am trying to use TensorFlow Serving to serve a Keras BERT model, but I have a problem predicting with the REST API; below is the relevant information. Can you please help me resolve this problem? predict output (...
asked by cceasy
-1 votes
1 answer
74 views

I am trying to use keras-nlp with a pretrained masked BERT model to predict some tokens in a sequence. However the model produces inconsistent results. What could be wrong, or am I misunderstanding ...
asked by user3085693
0 votes
1 answer
56 views

I am using a pre-trained BertForTokenClassification for a nested Named Entity Recognition task. To define nested entities, I am using a multi-label method. In the output the model returns 3 lists of logits, ...
asked by Alexandr Duck
0 votes
0 answers
62 views

I have used SHAP on my Bangla dataset and plotted a bar graph with the following code:

    pred = transformers.pipeline("text-classification", model=model, tokenizer=tokenizer, device=0, ...
asked by Tanjim Taharat Aurpa
0 votes
1 answer
545 views

In "Is it possible to freeze only certain embedding weights in the embedding layer in pytorch?" I found a nice way to freeze only some indices of an embedding layer. However, while including it in a ...
asked by Mirco Ramo
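
The trick from that linked answer is a gradient hook that zeroes the rows you want frozen, so the optimizer never updates them. A self-contained sketch (the frozen indices are placeholders; note that optimizer features like weight decay can still nudge those rows unless excluded):

    import torch
    from torch import nn

    emb = nn.Embedding(100, 16)
    frozen_ids = torch.tensor([0, 1, 2])        # placeholder rows to keep fixed

    # (vocab, 1) mask: 0 for frozen rows, 1 elsewhere; broadcasts over dims.
    mask = torch.ones(emb.num_embeddings, 1)
    mask[frozen_ids] = 0.0

    # Runs on every backward pass: frozen rows always get zero gradient.
    emb.weight.register_hook(lambda grad: grad * mask)

    emb(torch.tensor([0, 5])).sum().backward()
    print(emb.weight.grad[0].abs().sum())   # tensor(0.) -> row 0 frozen
    print(emb.weight.grad[5].abs().sum())   # non-zero  -> row 5 still trains
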
0 votes
2 answers
252 views

I have just noticed that token/sentence embeddings from Transformer-based models have a strong anisotropy problem, which means most of the embeddings are close to each other in the vector ...
asked by yuu Mu
0 votes
1 answer
593 views

I have created an NER model using BERT to detect medical entities, which works great. I'm trying to add a CRF layer on top of my BERT model to enhance its performance, but I'm getting an error that I ...
asked by Akram H
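
For reference, the pytorch-crf package (imported as torchcrf) is a common way to put a CRF over BERT token logits; the shape contract that usually triggers errors is emissions of (batch, seq, num_tags) with batch_first=True and a boolean mask for real tokens. A minimal sketch:

    import torch
    from torchcrf import CRF   # pip install pytorch-crf

    num_tags = 5
    crf = CRF(num_tags, batch_first=True)

    # Stand-ins for BERT token-classification logits.
    emissions = torch.randn(2, 10, num_tags)    # (batch, seq, num_tags)
    tags = torch.randint(0, num_tags, (2, 10))
    mask = torch.ones(2, 10, dtype=torch.bool)  # True on non-pad tokens

    loss = -crf(emissions, tags, mask=mask)     # CRF returns a log-likelihood
    paths = crf.decode(emissions, mask=mask)    # Viterbi-decoded tag sequences
    print(loss.item(), paths[0])
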
0 votes
1 answer
123 views

I'm working on a project using the SBERT pre-trained models (specifically MiniLM) for a text classification task with 995 classes. I am following the steps laid out here for the most part ...
asked by SohmOuse
0 votes
0 answers
418 views

Suppose you are pretraining a BERT model with 8 layers, 768-dim hidden states, 8 attention heads, and a sub-word vocabulary of size 40k. Also, your feed-forward hidden layer is of dimension 3072. What ...
asked by Ss Dev
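
A worked estimate for this configuration, counting the standard BERT components and biases but ignoring the pooler and any task head; the 512 maximum positions are an assumption, since the question doesn't state them:

    V, H, L, F, P = 40_000, 768, 8, 3072, 512   # vocab, hidden, layers, FFN, positions

    embeddings = V * H + P * H + 2 * H + 2 * H  # token + position + segment + LayerNorm

    attention = 4 * (H * H + H)                 # Q, K, V, output projections (+ biases)
    ffn = (H * F + F) + (F * H + H)             # the two feed-forward dense layers
    layernorms = 2 * 2 * H                      # LayerNorm after attention and after FFN
    per_layer = attention + ffn + layernorms

    total = embeddings + L * per_layer
    print(f"per layer: {per_layer:,}")          # 7,087,872
    print(f"total:     {total:,}")              # 87,819,264, i.e. roughly 88M
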
-1 votes
1 answer
191 views

I'm currently working with data of customer reviews on products from Sephora. My task is to classify them into sentiments: negative, neutral, positive. A common technique of text preprocessing is to ...
asked by read data
0 votes
1 answer
257 views

I have a set of sentences which I have transformed into vectors using SBERT embeddings. I would like to cluster these vectors. When looking for information online, I keep seeing posts telling me to do ...
asked by Alex Jax
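
The canonical recipe is indeed to cluster the embeddings directly; L2-normalizing first makes Euclidean KMeans behave like cosine clustering, which usually suits sentence vectors better. A minimal sketch:

    from sentence_transformers import SentenceTransformer
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import normalize

    sentences = [
        "The stock market fell today.",
        "Shares dropped sharply this morning.",
        "I love baking sourdough bread.",
        "This bread recipe needs more flour.",
    ]

    emb = normalize(SentenceTransformer("all-MiniLM-L6-v2").encode(sentences))
    print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emb))
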
-1 votes
2 answers
1k views

I am currently trying to use the BERT language model for invoice creation. However, I am receiving the error: OSError: [WinError 126] The specified module could not be found. Error loading "C:\...
asked by Adam Ramondo
0 votes
2 answers
3k views

For my Python script (as seen below), I use the package sentence-transformers, which contains SBERT models. Even though this package is clearly listed as installed when executing "pip list" ...
asked by Nea13
-2 votes
1 answer
56 views

I'm trying to solve the problem of recommending a doctor based on a user's symptoms and location, using a hybridized collaborative-filtering and sentence-similarity-based recommender system that follows ...
asked by Sadura Akinrinwa
1 vote
2 answers
75 views

My original df looks like this - df. Note in the data frame: the headers run through row 3, and from row 4 onwards the values for those headers start. The numbers of rows & columns ...
asked by Debojit Roy
0 votes
0 answers
78 views

I thought you could use BERT embeddings to determine semantic similarity. I was trying to group some words into categories using this, but the results were very bad. E.g. here is a small example with ...
asked by mihovg93
1 vote
0 answers
139 views

I am training a BERT model using pytorch and HuggingFace's BertModel. The sequences of tokens can vary in length from 1 (just a CLS token) to 128. The model trains fine when using absolute position ...
asked by NW_liftoff
0 votes
1 answer
411 views

I use the code below to export a Bert-based PyTorch model to CoreML. Since I used dummy_input = tokenizer("A French fan", return_tensors="pt") the CoreML model only works with ...
asked by Franck Dernoncourt
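
The fixed shape comes from tracing with a single dummy input; coremltools can declare the sequence axis as a range instead, so one converted model accepts variable-length inputs. A hedged sketch (argument details vary across coremltools versions):

    import coremltools as ct
    import numpy as np
    import torch
    from transformers import AutoModel, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased", torchscript=True).eval()

    dummy = tok("A French fan", return_tensors="pt")
    traced = torch.jit.trace(model, (dummy["input_ids"], dummy["attention_mask"]))

    # Declare the sequence axis as 1..128 tokens rather than the dummy's length.
    seq = ct.RangeDim(1, 128)
    mlmodel = ct.convert(
        traced,
        inputs=[
            ct.TensorType(name="input_ids", shape=(1, seq), dtype=np.int32),
            ct.TensorType(name="attention_mask", shape=(1, seq), dtype=np.int32),
        ],
    )
    mlmodel.save("bert.mlpackage")
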
