Access
Last updated on 2025-01-14 | Edit this page
Outcomes
1- To illustrate what is the communications protocol and the criteria for open and free protocol
2- To give examples of databases that uses a protocol with different authentication process
3- To interpret the usage licence associated with different data sets
Standard communication protocol
Simply put, a protocol is a method that connects two computers, the protocol ensure security, and authenticity of your data. Once the safety and authenticity of the data is verified, the transfer of data to another computer happens.
Having a protocol does not guarantee that your data are accessible. However, you can choose a protocol that is free, open and allow easy and exchange of information. One of the steps you can do is to choose the right database, so when you upload your data into database, the database executes a protocol that allows the user to load data in the user’s web browser. This protocol allows the easy access of the data but still secures the data.
Authentication process
It is the process that a protocol uses for verification. To know what authentication is, suppose we have three people named John Smith. We do not know which one submitted the data. This is through assigning a unique ID for each one that is interpreted by machines and humans so you would know who is the actual person that submitted the data. Doing so is a form of authentication. This is used by many databases like Zenodo, where you can sign-up using ORCID-ID allowing the database to identify you.
Exercise
After reading this guide on different protocol types, do you know what is the protocol used in arrayexpress? ::: solution As we explained before on how to use the RDMkit, going through the page, you will find different types of protocols explained
From this part, you can understand that the protocol used for the arrayexpress is HTTP (HyperText Transfer Protocol) (highlighted in purple)
::::
Data usage licence
This describes the legal rights on how others use your data. As you publish your data, you should describe clearly in what capacity your data can be used. Bear in mind that description of licence is important to allow machine and human reusability of your data. There are many licence that can be used e.g. MIT licence or Common creative licence. These licences provide accurate description of the rights of data reuse, Please have a look at resources in the description box to know more about these licences.
Exercise
- From the this RDMkit guideline on types of licence, what is the licence used by the following datasets: 1- A large-scale COVID-19 Twitter chatter dataset for open scientific research - an international collaboration 2- RNA-seq of circadian timeseries sampling (LL2-3) of 13-14 day old Arabidopsis thaliana Col-0 (24 h to 68 h, sampled every 4 h) ::: solution The link we provided, provided a nice explanation on types of licence and as you read the following section from RDMKit, you will find the following: From this section, you can clearly understand the type of licence used for: 1- A large-scale COVID-19 Twitter chatter dataset for open scientific research - an international collaboration is CC-BY-4 2- RNA-seq of circadian timeseries sampling (LL2-3) of 13-14 day old Arabidopsis thaliana Col-0 (24 h to 68 h, sampled every 4 h) is CC-BY-4
::::
Sensitive data
Sensitive data are data that, if made publicly available, could cause consequences for individuals, groups, nations, or ecosystems and need to be secured from unauthorised access. To determine whether your data is sensitive, you should consult national laws, which vary by country. Through the following resources, you will know more about sensitive data and what to do if your data is sensitive
Resources
- A nice recipe from FAIRcookbook on SSH protocols
- A nice explanation from RDMkit on protocols and how they will help you protect your dataProtocols and safety of data transfer
Having your work licenced does not sound simple as it seems; here are some resources to help you find the correct licence for you:
- Why should you assign licence to your protocol from RDMkit here
- A nice recipe from FAIRcookbook with step-by-step instructions for
- licence
- software licence
- Data licence
- Declaring data permitted uses
- To know more about creative common licence, check this link Creative commons licence
To get more information on sensitive data, you can have a look on these reources:
FAIR principles
This episode covers the following principles: 1- (A1) (meta)data are retrievable by their identifier using a standardised communications protocol
2- (R1.1) meta(data) are released with a clear and accessible data usage licence