427 questions
0
votes
0
answers
41
views
torch dataloader next-method when using multiple workers
I have a Dataset based on IterableDataset that looks like this:
class MyDataSet(torch.utils.data.IterableDataset):
    def __init__(self):
        # doing init stuff here
    def __iter__(self):
...
0
votes
1
answer
175
views
Best method to create generator for TensorFlow with list of array inputs
I am using TensorFlow/Keras to create a deep learning model. The network is built as follows:
inps = []
features = []
for i in range(number_windows):
    inp = Input(shape=(window_length,), name=f"...
3
votes
1
answer
218
views
TensorFlow data loader from generator error "Dataset had more than one element"
I am trying to implement a TensorFlow dataset from a Python generator because my model keeps consuming memory, inevitably resulting in an OOM crash (see my question on that here). ...
0
votes
0
answers
115
views
Why is my numpy-based custom data loader extremely slow and unstable when iterating over large tick data
I'm currently working on a model similar to DeepLOB, using high-frequency tick-level financial data. Due to the massive volume and the need to structure the data into time series format, it's ...
0
votes
0
answers
346
views
In HotChocolate 15, how to use projection and data loader for many to many relationship?
I'm using HotChocolate v15 with EF Core. For 1:m relationships, I use a data loader and projection to fetch only the selected fields for child items. It works as expected, for example: load products by branch.
...
1
vote
1
answer
53
views
GraphQL DataLoader fails with 400 when batching many IDs
I'm running a GraphQL server using postgraphile and dataloader to batch and load data from a backend microservice. When the number of IDs passed to the loader grows large (~100+), the request fails ...
0
votes
0
answers
60
views
Cache State Inconsistency: Thread Incorrectly Assumes Cache is Full
I'm working with a class called CachedRandomIterDataset that uses an asynchronous thread to load data into a cache from a dataset. The cache is supposed to be filled with data, shuffled, and then ...
0
votes
0
answers
46
views
Can we use DataLoader outside of a GraphQL context in Node.js?
For a GraphQL request, in our resolver, we often need to insert messages to an AWS SQS queue which then triggers a Lambda function.
So we wanted to batch all invocations to our util function (where we ...
0
votes
1
answer
317
views
The FAISS indexing and the dataset indexing don't match
I'm trying to compute the recall after performing a HNSW search in FAISS. By recall, I mean the following metric:
Recall = TP / (TP + FN)
Where I consider an image as a True Positive (TP) if it ...
1
vote
0
answers
151
views
How to Build a More Efficient DataLoader to Load Large Image Datasets?
I am trying to train a deep learning model on a very large image dataset. The model input requires a pair of images (A and B). Because my image sizes are quite large, I have resized each of them to a ...
0
votes
1
answer
124
views
Why is my DataLoader process using up to 2.6GB of virtual memory, and is there any way to reduce it?
Each DataLoader process takes up 2.6GB of virtual memory, and 4 processes take up 10.4GB.
from ...
1
vote
0
answers
29
views
Asynchronous parallel data loading with torch in R
I want to train CNNs on a big dataset via transfer learning using torch in R. Since my dataset is too big to be loaded all at once, I have to load each sample from the SSD in the dataloader. But loading ...
0
votes
2
answers
283
views
TypeError: 'DataLoader' object is not subscriptable in SuperGradients Trainer
I've created DataLoader objects for my training and validation datasets, but when I try to pass them to the trainer.train() method, I get the following error:
Log summary:
TypeError: 'DataLoader' ...
1
vote
2
answers
291
views
Setting random seed in Torch dataloader
I'm trying to get the torch dataloader to load the data in a specific order determined by the random seed 1. Here's my code:
import random
import torch.utils.data.dataset as Dataset
import torch....
0
votes
1
answer
655
views
How to run torch dataloader in a sub-process of multiprocessing.Pool?
I want to run model inference with multiprocessing. Instead of using torch.distributed, how can I use multiprocessing.Pool?
I have to use num_workers=0 in the subprocess to avoid errors like "daemonic ...
1
vote
1
answer
74
views
How to use tf.data.interleave() with tf.py_function
I am trying to build a TF data pipeline with the tf.data API. I have ~100k npz files to load, and each npz has the keys ["input"] and ["output"]. Some preprocessing is needed before ...
0
votes
1
answer
920
views
Is there a good way to BatchMapping or use Data Loaders in Spring GraphQL utilising non-exposed fields?
Consider the below minimal GQL schema:
type query {
  appointments: [Appointment!]!
}
type Appointment {
  id: ID!
  job: Job!
}
type Company {
  id: ID!
  job: Job!
}
type Job {
id: ...
1
vote
1
answer
393
views
How to pass a PyTorch DataLoader to the Hugging Face Trainer? Is that even possible?
The usual steps for using the Trainer from Hugging Face are:
Load the data
Tokenize the data
Pass tokenized data to Trainer
MWE:
data = generate_random_data(10000) # Generate 10,000 samples
...
0
votes
0
answers
71
views
Federated dataloader deprecated?
In the federated learning code below, I'm using PySyft. The goal is to distribute the FashionMNIST dataset to different clients:
federated_train_loader = syft.FederatedDataLoader(
datasets....
1
vote
1
answer
609
views
PyTorch DataLoader hangs when num_workers > 0 with custom torchvision transform
I’m using PyTorch’s DataLoader to load my dataset. I’ve noticed that my program hangs indefinitely during training when I set num_workers > 0. However, it works fine when num_workers = 0.
Here’s a ...
0
votes
1
answer
54
views
Issues between PyTorch DataLoader and Matplotlib's Imshow for Image Classification Task
I am currently working on a binary classification task involving image data. To begin, it is essential for me to inspect my dataset. However, I have encountered an issue with the DataLoader.
On the ...
-1
votes
1
answer
206
views
PyTorch: predicting without a DataLoader returns different predictions than predicting with a DataLoader
I am trying to predict on a single image without using a DataLoader, but I get a weird result.
This image is the result of my prediction.
With a DataLoader, the predicted results are consistent with the labels.
However, ...
0
votes
1
answer
745
views
How to use balanced sampler for torch Dataset/Dataloader
My simplified Dataset looks like:
class MyDataset(Dataset):
    def __init__(self) -> None:
        super().__init__()
        self.images: torch.Tensor[n, w, h, c]  # n images in memory - ...
1
vote
1
answer
158
views
How to keep user logged in on refresh of dashboard page using firebase onAuthStateChange in a react app and react router dom RouterProvider API
I am using the React Router DOM RouterProvider, which decouples fetching from rendering. In the official remix-run React Router auth-router-provider example, the README.md states that
we ...
1
vote
1
answer
451
views
PyTorch, validation step is considerably faster if I train on the validation data, why?
I am training an FCN model, and I have two dataloaders, train_loader and val_loader. As you can see in the code below, I made the model train on the validation data. I did this to debug a problem I had ...
1
vote
1
answer
147
views
Validation data without targets
I have a validation dataset of images to be classified by my CNN model. I want to load these images using PyTorch. The torchvision.datasets.ImageFolder() function doesn't work, since there are no targets, ...
1
vote
0
answers
103
views
How can I resolve this problem with dataloaders?
I'm building some dataloaders for training and testing a machine learning model.
I have a list of tuples named "array" like this:
(Data(x=[468, 2], edge_index=[2, 1322], y=0, edge_weight=[...
0
votes
0
answers
64
views
stack expects each tensor to be equal size, but got [3, 128, 128] at entry 0 and [4, 128, 128] at entry 10
I created a custom ImageFolder with the torch.utils.data.Dataset class and then wrapped it in a DataLoader, but when I try to view one of the elements of the data loader with img_custom, label_custom = next(...
1
vote
1
answer
80
views
pandas.DataFrame.to_sql intermittently loading data partially to snowflake/database
pandas.DataFrame.to_sql intermittently loads only part of the data into Snowflake.
Example: the DataFrame has 25000 rows, but the function loads only 15000 into Snowflake.
Has anyone faced this issue and ...
0
votes
1
answer
428
views
Pytorch dataset - len(train_dataset) returns zero
I am trying to create a custom dataset and dataloader in pytorch, to finetune a DONUT model. For context, my dataset is organised as follows:
dataset/
├── train/
│ ├── image1.jpg
│ ├── image2.jpg
│...
1
vote
0
answers
63
views
TF keras.utils.Sequence first batch called twice
While working on a data loader for a Keras deep learning model, I added some print statements in the get_item method of the data loader. This method is in charge of returning the n-th batch to the ...
0
votes
1
answer
178
views
PermissionError Access denied
While loading data from a network drive, a permission error occurs from time to time and the script terminates.
The error occurs in this line:
try:
data = self....
0
votes
1
answer
170
views
Modify PyTorch DataLoader to not mix files from different directories in batch
I want to load image sequences of a fixed length into batches of the same size (for example sequence length = batch size = 7).
There are multiple directories each with images from a sequence with ...
0
votes
1
answer
350
views
The Pytorch lightning finds no tuner in lr_find_results=trainer.tuner.lr_find
I'm working on using PyTorch Lightning to train a neural network with a DataLoader. I have installed PyTorch and PyTorch Lightning successfully. However, I am encountering an issue with the learning ...
0
votes
1
answer
285
views
Implementing Dynamic Data Sampling for BERT Language Model Training with PyTorch DataLoader
I'm currently in the process of building a BERT language model from scratch for educational purposes. While constructing the model itself was a smooth journey, I encountered challenges in creating the ...
0
votes
0
answers
346
views
Why does async code mess up my dataloader in a graphql resolver?
I have a dataloader that I'm using to batch requests to a service to get my user's addresses.
The loader doesn't batch requests correctly when the parent resolver uses async code.
Here's a general ...
1
vote
0
answers
270
views
Pytorch 1.13 dataloader is significantly faster than Pytorch 2.0.1
I've noticed that PyTorch 2.0.1 DataLoader is significantly slower than PyTorch 1.13 DataLoader, especially when the number of workers is set to something other than 0. I've done some research and ...
0
votes
3
answers
4k
views
Salesforce Problem with Data Loader, Java version and installing the software
I've installed the latest version of the Zulu JDK and downloaded Data Loader to set it up on my computer, but after extracting the files from the Data Loader archive and trying to run the install....
0
votes
1
answer
154
views
GraphQL Dataloader on non-id fields?
We're using NodeJS (typescript) and GraphQL for our backend.
We therefore rely heavily on dataloaders, and we have more and more field resolvers that need to be resolved on something other than IDs.
...
0
votes
2
answers
312
views
How to do counts in batch for graphql data loader?
I'm implementing a GraphQL resolver for a complex datatype. I'm trying to use a data loader to avoid the N+1 problem. I have a datatype with a counter. Therefore, in one GraphQL query I need to perform user ...
0
votes
1
answer
710
views
Dataloader/sampler/collator to create batches based on the sample contents (sequence length)
I am converting someone else's code into a neater torch-y pipeline, using datasets and dataloaders, collate functions and samplers. While I have done such work before, I am not sure how to tackle the ...
1
vote
2
answers
504
views
How to use the `shard_func` in tensorflow's `tf.data.Dataset.save`
Background:
I'm working with a large dataset saved in a non-standard format. I can write a pure python data-reader, but when called from DL dataloaders, like tf.data.Dataset, it takes forever to ...
0
votes
0
answers
125
views
Training a neural network without collapsing
I am trying to train a pytorch neural network to map from image space to 2D. I have the condition that I only want to use the ReLU activation function, linear layers, conv2d layers, and avgpool2d ...
0
votes
2
answers
264
views
Confusion in initialising GraphQL Dataloader in context
context: ({ req }) => {
  if (req) {
    return {
      ip: headers.userip,
      headers,
      userLanguage,
      decodedToken,
      dataLoaders: { seoDataLoader: createSeoDataLoader() }
    }
  }
}
Here I create a createSeoDataLoader ...
4
votes
1
answer
1k
views
HotChocolate v.13 [UseProjections] attribute does not work with DataLoaders
I have the following GraphQL query:
query {
  listTenants {
    totalCount
    items {
      tenantId
      name
      sites {
        totalCount
        items {
          siteId
          cmxName
...
0
votes
4
answers
7k
views
salesforce dataloader installation problem
I am learning Salesforce and trying to install Data Loader. I have downloaded the zip file and extracted its contents into a local folder.
But when I am trying to run the install.bat, I get this ...
2
votes
1
answer
6k
views
Iterating over PyTorch DataLoader slower than direct dataset access
I am using PyTorch to train a machine learning model, and I have encountered a significant issue where iterating over the DataLoader is noticeably slower than directly accessing the dataset. My main ...
0
votes
1
answer
52
views
PyTorch DataLoader converts items to lists and causes several errors
My source code uses PyTorch and looks like this:
def Embed(sequenceSet):
    output = []
    for s in sequenceSet:
        PseDNCSequence = Embedding.PseDNC(str(s))
        ANFSequence = Embedding.ANF(str(s))
...
2
votes
1
answer
2k
views
HotChocolate v.13 DataLoader approach with attributes and source generated code not working
I recently started experimenting with HotChocolate v.13, and I am having issues implementing the data loaders with the [DataLoader] attributes as shown in this video: Let's simplify DataLoader with ...
1
vote
0
answers
109
views
Why do you inject batch functions into context when you implement dataloader?
There are often examples of injecting the batch function into the context when implementing dataloader in Golang, but I'm not sure why.
While I understand the concept of caching data per request in ...