iRODS research data storage

Intro

This project aims to provide a data storage solution for all research data across its entire life span (Data Life Cycle, DLC). All services are supported by a joint effort of UU-ICT and ICT-Beta. We are currently running a pilot implementation with the Cell Biology group to store research data using services provided by Research-IT and cloud storage. The pilot uses an iRODS server that provides access to cloud storage and analysis of metadata.

Access

Storage: https://science.data.uu.nl

Log in using your email address as the username (lower case only) and your Solis password:

get Cyberduck and use it to open https://science.data.uu.nl
configure Cyberduck for the iRODS protocol when transferring large files or many files:

download the Irods.cyberduckprofile
open this profile in Cyberduck (use "open with" and point it to Cyberduck)
fill in the username as PAM:j.doe@uu.nl (email address in lower case only)
close the screen and then open the science.data.uu.nl - Irods preset
enter your Solis password
navigate up one directory by pressing the triangle icon in the upper right corner

general information on using Cyberduck: http://irods.org/2015/09/howtocyberduck/

alternative (not recommended): to mount a WebDAV drive in the OS file manager (Windows Explorer / Mac Finder), see: Personal_storage
mounting the storage under Linux (Ubuntu 16):

You can connect from your file manager using the "connect to" option: davs://science.data.uu.nl
Or make a mount from the command line:

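   # install the WebDAV file system driver
   sudo apt-get install davfs2
   # create the mount point
   sudo mkdir /mnt/irods
   # mount the storage; you are prompted for your username and password
   sudo mount.davfs 'https://science.data.uu.nl' /mnt/irods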

Automatic mounting and storing your username/password: see https://wiki.archlinux.org/index.php/Davfs

User management (use a web browser, data admins only): go to https://science.yoda.uu.nl and choose group manager

Metadata

Adding metadata

Metadata is used to classify research data independently of the directory structure. This enables you to find data later on in a search-engine-like way. iRODS can be configured to extract metadata from the files themselves if the syntax is known (e.g. TIFF files), but this is not standard. Below you will find the naming scheme that can be used to add metadata to any directory. Adding a metadata.txt file to a folder adds this metadata to all data contained in the folder, including subfolders. If the classification varies further down the tree, add another metadata.txt file in the subfolder containing only the changes. You can use a metadata_SUBTITLE.txt file naming scheme for your own purposes (replace SUBTITLE with an appropriate name).
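The inheritance rule described above (a metadata.txt applies to its folder and all subfolders, and a deeper metadata.txt only lists the changes) can be illustrated with a small Python sketch. This is not how the iRODS server itself processes the files. The "name: value" line format, the function names and the example folder layout are assumptions made purely for illustration, since the on-disk syntax of metadata.txt is not specified here.

```python
from pathlib import Path

METADATA_FILENAME = "metadata.txt"


def parse_metadata(text: str) -> dict[str, str]:
    """Parse 'name: value' lines. ASSUMPTION: this line format is hypothetical."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, '#' comments and lines without a separator.
        if not line or line.startswith("#") or ":" not in line:
            continue
        name, value = line.split(":", 1)
        result[name.strip()] = value.strip()
    return result


def effective_metadata(root: Path, folder: Path) -> dict[str, str]:
    """Merge every metadata.txt from root down to folder; deeper files override."""
    root, folder = root.resolve(), folder.resolve()
    # relative_to raises ValueError if folder is not inside root.
    levels = [root]
    for part in folder.relative_to(root).parts:
        levels.append(levels[-1] / part)

    merged = {}
    for level in levels:
        candidate = level / METADATA_FILENAME
        if candidate.is_file():
            merged.update(parse_metadata(candidate.read_text(encoding="utf-8")))
    return merged
```

With a metadata.txt in the top folder that sets projectName and dataType, and a second metadata.txt in a subfolder that only sets dataType, effective_metadata for the subfolder returns the top-level projectName together with the overridden dataType.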

Naming scheme general metadata

name	type	possible values	description
ownerID	alphaNum	solisID/groupID	conform to existing IDs as much as possible
deviceID	alphaNum	deviceID	enter the ID of the device used to obtain the data
projectName	alphaNum	projectName	official name of the project
localName	alphaNum	localName	make up your own naming scheme for local purposes
startDate	date	dd/mm/yyyy	start date of project
endDate	date	dd/mm/yyyy	planned end date of project
storageReferenceID	alphaNum	create syntax for storageReferenceID	this ID can be used to refer to existing physical storage
dataType	alpha	raw / analysis / archive / re-use	classification describes the lifecycle stage of the data
deleteAfter	alphaNum	1 / 3 / 5 / 10 / never / delete	number of years data should be kept after endDate
confidentiality	alpha	basic / sensitive / critical / public	following BIV-classification this determines privacy and legal status
description	alphaNum, 250 chars	free text	any additional data you might want to add

remarks:

default values are bold
the naming scheme can be extended to adapt it to your own dataset, but any extension should always be documented so the entire research group can use it properly
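Before uploading, a metadata record could be checked against the naming scheme above. The Python sketch below covers only the fields with a fixed value list, the two date fields and the 250-character description limit from the table. The function name and the representation of a record as a dictionary of strings are assumptions for illustration, not part of the storage service.

```python
from datetime import datetime

# Value lists copied from the naming scheme table.
ALLOWED_VALUES = {
    "dataType": {"raw", "analysis", "archive", "re-use"},
    "deleteAfter": {"1", "3", "5", "10", "never", "delete"},
    "confidentiality": {"basic", "sensitive", "critical", "public"},
}
DATE_FIELDS = ("startDate", "endDate")
MAX_DESCRIPTION_LENGTH = 250


def validate_metadata(record: dict[str, str]) -> list[str]:
    """Return a list of problems found in record; an empty list means no problems."""
    problems = []

    # Fields with a fixed set of possible values.
    for name, allowed in ALLOWED_VALUES.items():
        if name in record and record[name] not in allowed:
            problems.append(f"{name}: {record[name]!r} is not one of {sorted(allowed)}")

    # Date fields must parse as day/month/year.
    for name in DATE_FIELDS:
        if name in record:
            try:
                datetime.strptime(record[name], "%d/%m/%Y")
            except ValueError:
                problems.append(f"{name}: {record[name]!r} is not a valid dd/mm/yyyy date")

    # Free-text description is limited to 250 characters.
    if len(record.get("description", "")) > MAX_DESCRIPTION_LENGTH:
        problems.append(f"description: longer than {MAX_DESCRIPTION_LENGTH} characters")

    return problems
```

Fields that are missing from the record are not reported, which matches the idea that a metadata.txt further down the tree may contain only the changes.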

I also need this!

ICT-Beta can guide you through the process of setting up a research data storage solution.

You can start by answering the following questions

how would you describe your current research data storage setup/situation? (good/not good enough/bad)

If not good: what needs to be done to improve this?

what kind of data do you want to store?

Use the Data Life Cycle to classify the kind of data:

gathering data / processing data: measurements / processing data: analysis / archiving data (consolidate) / publicizing data (results) / re-using data (follow up)

how much storage do you need (by DLC category, in TB), and what is the expected growth rate? (default: +20% per year)

remark: quality and speed of storage are derived from the DLC category

what is your first priority in properly storing research data? E.g.: current research (raw data or data analysis, used as a backup or as live data), archive of previous research, publicizing data
when would you like to start? How much data would you be able to classify and move per month? Also see:
appoint a data-steward to guide the transition process of your data
what kind of specific metadata do you use/need in order to classify your research data? (think about data retrieval later on)
how long do you want to store the data for? (e.g. 1/3/5/10 years or indefinitely)
also think about confidentiality: basic / sensitive / critical / none
what is your current financial investment in research data storage? Do you receive any specific budget for storage of research data?
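
To help answer the question about storage size and growth, the following Python sketch projects the required storage per DLC category. It assumes the default +20% per year is applied as compound yearly growth, which is an interpretation and not stated on this page. The function names and the example figures are hypothetical.

```python
def projected_storage_tb(current_tb: float, years: int, growth_rate: float = 0.20) -> float:
    """Project storage need after a number of years, ASSUMING compound yearly growth."""
    if years < 0:
        raise ValueError("years must be zero or positive")
    return current_tb * (1 + growth_rate) ** years


def projected_by_category(current: dict[str, float], years: int,
                          growth_rate: float = 0.20) -> dict[str, float]:
    """Apply the same projection to every DLC category, rounded to 0.01 TB."""
    return {
        category: round(projected_storage_tb(size_tb, years, growth_rate), 2)
        for category, size_tb in current.items()
    }


if __name__ == "__main__":
    # Hypothetical current usage in TB per DLC category.
    usage = {"raw": 10.0, "analysis": 2.5, "archive": 5.0}
    for category, size_tb in projected_by_category(usage, years=3).items():
        print(f"{category}: {size_tb} TB")
```

If a group expects a different growth rate for a specific category, call projected_storage_tb for that category with its own growth_rate instead of the default.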
