datasets.simula.no

A collection of open datasets published by Simula Research Laboratory and SimulaMet.

Currently, we have published the following datasets:

Medical and Biology Datasets

Cellular, A cell autophagy dataset. [project]
Depresjon, The Depresjon Dataset. [publication | project]
GastroVision, A multicenter dataset. [publication | project]
HTAD, A Home-Tasks Activities Dataset with Wrist-accelerometer and Audio Features. [publication | project]
HYPERAKTIV, A Motor Activity Database of Patients with ADHD. [publication | project]
HyperKvasir, The Largest Gastrointestinal Dataset. [publication | project]
Kvasir, A Multi-Class Image-Dataset for Computer Aided Gastrointestinal Disease Detection. [publication | project]
Kvasir Capsule, The largest gastrointestinal PillCAM dataset. [publication | project]
Kvasir Instrument, A gastrointestinal instrument Dataset. [publication | project]
Kvasir SEG, Segmented Polyp Dataset for Computer Aided Gastrointestinal Disease Detection. [publication | project]
Kvasir-VQA, A Text-Image Pair GI Tract Dataset. [publication | project]
Kvasir-VQA-x1, A Large-Scale Multi-Task Benchmark for GI Tract Visual Question Answering. [publication | project]
KvasirCapsule SEG, A Capsule Endoscopy Segmentation Dataset. [publication | project]
MedMultiPoints, A Multimodal Dataset for Object Detection, Localization, and Counting in Medical Imaging. [publication | project]
Medico Multimedia - VISEM Tracking, A sperm tracking dataset. [publication | project]
Nerthus, A Bowel Preparation Quality Video Dataset. [publication | project]
Psykose, A Motor Activity Database of Patients with Schizophrenia. [publication | project]
VISEM, A Multimodal Video Dataset of Human Spermatozoa. [publication | project]
VISEM QC, A sperm quality control dataset. [project]

Sport and Activity Datasets

Alfheim, Soccer video and player position dataset. [publication | project]
Arx, A Text-Classification Dataset Consisting of Norwegian Soccer Articles from VG and TV2. [publication | project]
ExposureEngine, Oriented Logo Detection and Sponsor Visibility Analytics in Sports Broadcasts. [project]
Heimdallr, A Dataset For Sport Analysis. [project]
HockeyAI, A Multi-Class Ice Hockey Dataset for Object Detection. [publication | project]
HockeyOrient, A Dataset for Ice Hockey Player Orientation Classification. [publication | project]
HockeyRink, A Dataset for Precise Ice Hockey Rink Keypoint Mapping and Analytics. [publication | project]
PMData, A lifelogging dataset of 16 persons during 5 months using Fitbit, Google Forms and PMSys. [publication | project]
ScopeSense, An 8.5-month sport, nutrition, and lifestyle lifelogging dataset. [project]
Soccer Summarization, Soccer game captions and summary in English for game summarization. [publication | project]
SoccerChat, A Multimodal Video-Text Dataset for Natural Language Soccer Game Understanding. [publication | project]
SoccerMon, Subjective and objective data collected over two years from two different elite women's soccer teams. [project]
SoccerNet-Echoes, A Soccer Game Audio Commentary Dataset. [publication | project]
SoccerSum, The SoccerSum Dataset for Automated Detection, Segmentation, and Tracking of Objects on the Soccer Pitch. [publication | project]
TACDEC, Dataset of Tackle Events in Soccer Game Videos. [publication | project]

Other Datasets

Anarchy Online, Server-side Network Traffic from Anarchy Online: Analysis, Statistics and Applications. [publication | project]
European Cloud Cover, A dataset containing reanalysis data from ERA5 and satellite retrievals from Meteosat Second Generation. [publication | project]
Eye Tracker, A Serious Game Based Dataset. [publication | project]
HSDPA, HSDPA-bandwidth logs for mobile HTTP streaming scenarios. [publication | project]
Image Sentiment, A dataset for image sentiment analysis. [publication | project]
Njord, A fishing boat dataset. [project]
Right Inflight, A Dataset for Exploring the Automatic Prediction of Movies Suitable for a Watching Situation. [project]
THREAT, A Large Annotated Corpus for Detection of Violent Threats. [project]
Toadstool, A Dataset for Training Emotional and Intelligent Machines Playing Super Mario Bros. [publication | project]
WICO Graph Dataset, A Labeled Dataset of Twitter Subgraphs based on Conspiracy Theory and 5G-Corona Misinformation Tweets. [publication | project]
WICO Text, A labeled dataset of conspiracy theory and 5G-corona misinformation tweets. [publication | project]

How to contribute

To add a new dataset, follow these steps:

Fork the Repository: Fork this repository to your GitHub account.
Create a Markdown File: In your forked repository, navigate to the datasets folder and create a new Markdown file (.md) for your dataset. The file name should be descriptive of the dataset.

Add Dataset Information: Copy and paste the following template into your Markdown file:

---
title: <dataset name>
desc: <dataset description>
thumbnail: <dataset thumbnail>
publication: <link to publication>
github: <link to github>
tags:
  - <list of tags>
---

Fill in the template with the appropriate information about your dataset.

Add a Dataset Thumbnail: Add a thumbnail image for the dataset; it is displayed on the main page. The thumbnail should use a 16:9 aspect ratio, such as 320 x 180 or 640 x 360 pixels, and be placed under public/thumbnails.
Update the README: Add the new dataset to this README under one of the categories above, with links to the publication, code, or other useful resources.
Create a Pull Request: Once you have added the Markdown file, the thumbnail, and the README entry, commit your changes, push them to your fork, and create a pull request to merge them into the main repository.
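
Before opening the pull request, it can help to sanity-check the two things that are easiest to get wrong: the front matter keys and the thumbnail dimensions. The following is a minimal sketch, not part of this repository. The function names are hypothetical, and treating every key in the template as required is an assumption. The thumbnail check takes the width and height as plain numbers, so you need to read them from your image viewer or editor first.

```python
import re

# Keys copied from the template above. Treating all of them as required
# is an assumption made for this sketch.
TEMPLATE_KEYS = ("title", "desc", "thumbnail", "publication", "github", "tags")


def front_matter_keys(markdown_text):
    """Return the top-level keys found between the opening and closing '---' lines."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()  # no front matter at the top of the file
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys  # closing delimiter reached
        # Top-level keys start at column 0; indented list items are skipped.
        match = re.match(r"^([A-Za-z_][\w-]*):", line)
        if match:
            keys.add(match.group(1))
    return set()  # closing '---' never found: treat as malformed


def missing_keys(markdown_text):
    """List the template keys that are absent from the file's front matter."""
    found = front_matter_keys(markdown_text)
    return [key for key in TEMPLATE_KEYS if key not in found]


def is_16_by_9(width, height):
    """True when width:height is exactly 16:9 (cross-multiplied to avoid floats)."""
    return width > 0 and height > 0 and width * 9 == height * 16


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    example = "\n".join([
        "---",
        "title: Example Dataset",
        "desc: A placeholder description.",
        "tags:",
        "  - example",
        "---",
    ])
    print("Missing keys:", missing_keys(example))
    print("320 x 180 is 16:9:", is_16_by_9(320, 180))
    print("640 x 480 is 16:9:", is_16_by_9(640, 480))
```

The key check only looks at key names, not values, so it will not catch an empty title or a broken link. It is meant as a quick guard against an incomplete template, not as a replacement for review in the pull request.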

Contact

If you have any questions or need assistance, please open an issue in the repository or contact steven@simula.no.
