how to make dataloader return multiple tensors each time #17773

@zhangyu2ustc

🚀 Feature

I am working on medical image analysis, where each data file (i.e. one medical image) contains multiple training samples (data + label). How can I write my own dataset and dataloader so that every sample is extracted from each image and used individually?

Motivation

Here is my definition of the dataset:
    import os

    import pandas as pd
    import torch
    from torch.utils.data import Dataset


    class HCP_taskfmri_datasets(Dataset):
        """Custom dataset: one item per fMRI file, each file holding many trials."""

        def __init__(self, output_dir, fmri_files, confound_files, label_matrix, target_name,
                     perm=None, data_type='train', block_dura=1, transform=None):
            super().__init__()
            self.pathout = output_dir
            os.makedirs(self.pathout, exist_ok=True)
            self.fmri_files = fmri_files
            self.confound_files = confound_files
            self.label_matrix = pd.DataFrame(data=label_matrix)
            self.target_name = target_name

            self.block_dura = block_dura
            self.data_type = data_type
            self.transform = transform

        def __len__(self):
            # One item per image file, not per trial
            return len(self.fmri_files)

        def __getitem__(self, idx):
            fmri_file = self.fmri_files[idx]
            confound_file = self.confound_files[idx]
            label_trial_data = self.label_matrix.iloc[idx]

            # Helper (defined elsewhere) that extracts the trials into a 3D array
            fmri_data, label_data = self.map_load_fmri_event_block(
                fmri_file, label_trial_data, block_dura=self.block_dura)
            # Data shape: (170, 1, 360); label shape: (170,)
            # i.e. 170 samples are extracted from each image file
            print(fmri_data.shape, label_data.shape)

            # Convert every trial to a torch tensor and stack along a new first axis
            tensor_x = torch.stack([torch.Tensor(fmri_data[ii].transpose()) for ii in range(len(label_data))])
            tensor_y = torch.stack([torch.Tensor([label_data[ii]]) for ii in range(len(label_data))])
            print(tensor_x.size(), tensor_y.size())

            return tensor_x, tensor_y

The DataLoader treats all 170 samples from one file as a single tensor. How can I extract each individual sample and use it to train the model?
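One possible way to get individual samples out of a per-file dataset like the one above is a custom `collate_fn` that concatenates the per-file tensors along the first axis instead of stacking them. The sketch below is only an illustration under stated assumptions: `ToyMultiSampleDataset` and `concat_collate` are hypothetical names, and random tensors stand in for the real fMRI loading step.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyMultiSampleDataset(Dataset):
    """Hypothetical stand-in: each item holds all trials from one file."""

    def __init__(self, n_files=4, n_trials=170, n_regions=360):
        self.n_files = n_files
        self.n_trials = n_trials
        self.n_regions = n_regions

    def __len__(self):
        return self.n_files

    def __getitem__(self, idx):
        # Random data replaces the real file loading (assumption)
        tensor_x = torch.randn(self.n_trials, self.n_regions, 1)
        tensor_y = torch.full((self.n_trials, 1), float(idx))
        return tensor_x, tensor_y


def concat_collate(batch):
    """Concatenate per-file tensors along dim 0 rather than stacking them."""
    xs, ys = zip(*batch)
    return torch.cat(xs, dim=0), torch.cat(ys, dim=0)


loader = DataLoader(ToyMultiSampleDataset(), batch_size=2, collate_fn=concat_collate)
batch_x, batch_y = next(iter(loader))
# Two files of 170 trials each become one batch of 340 individual samples
print(batch_x.shape, batch_y.shape)
```

With this approach `batch_size` counts files, not trials, and all trials of a file land in the same batch. If trials from different files need to be mixed more thoroughly, a dataset indexed per trial may be a better fit.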

Pitch

Can we use a for loop inside the __getitem__ method so that it yields one sample at a time? For instance:

    for ii in range(trailNum):
        tensor_x = torch.Tensor(fmri_data[ii]) # transform to torch tensors
        tensor_y = torch.Tensor([label_data[ii]])
        sample = {'input': tensor_x, 'target': tensor_y} 
        yield sample
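A map-style `__getitem__` is expected to return one item per index, so a `yield` there would hand the DataLoader a generator object rather than a sample. The generator idea maps more naturally onto `torch.utils.data.IterableDataset`, whose `__iter__` may yield samples one by one. The following is a minimal sketch, assuming the per-file arrays are already available; `PerTrialStream` and the toy NumPy data are hypothetical and only mimic the shapes mentioned above.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, IterableDataset


class PerTrialStream(IterableDataset):
    """Hypothetical iterable dataset that yields one trial at a time."""

    def __init__(self, per_file_arrays):
        # List of (fmri_data, label_data) pairs; stands in for loading files
        self.per_file_arrays = per_file_arrays

    def __iter__(self):
        for fmri_data, label_data in self.per_file_arrays:
            for ii in range(len(label_data)):
                tensor_x = torch.as_tensor(fmri_data[ii], dtype=torch.float32)
                tensor_y = torch.as_tensor([label_data[ii]], dtype=torch.float32)
                yield {'input': tensor_x, 'target': tensor_y}


# Toy data: 2 files, 170 trials each, shaped like the arrays in the issue
files = [
    (np.random.rand(170, 1, 360).astype(np.float32), np.arange(170) % 6)
    for _ in range(2)
]

loader = DataLoader(PerTrialStream(files), batch_size=32)
batch = next(iter(loader))
# The default collate function batches dict values key by key
print(batch['input'].shape, batch['target'].shape)
```

Two caveats with this design: `shuffle=True` is not supported for iterable datasets, so any shuffling has to happen inside `__iter__`, and with `num_workers > 0` each worker would iterate over the full stream unless the files are split across workers (for example via `torch.utils.data.get_worker_info()`).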
