Plan

Why research data management?

Research data refers to

  • digital or analog basic data and materials for scientific research,
  • further data refined from these, upon which the research findings and published research results are based, as well as
  • code and software upon which the research results build.

At the University, management, storage and reuse of research data are essential elements of research infrastructure. Finnish and international research funders as well as publishers not only appreciate but increasingly mandate openness and transparency of research data. Above all, good data management benefits the research process itself. With well-planned data management, you

  • make your research more efficient
  • comply with funders' requirements
  • comply with data protection legislation and protect your study subjects
  • agree upon data ownership and rights as well as data sharing and preservation together with your partners
  • agree upon how, when and on what terms you open your data
  • ensure that others can cite your data, giving you due scientific credit
  • ensure that necessary resources and equipment are available to you throughout your project.

JYU provides research teams with an in-house research data infrastructure. It features tools for supporting data management with appropriate data security and storage capacity, as well as related advice and guidance.

The FAIR principles for reusable data

In accordance with the FAIR principles for maximum reusability of digital research data, JYU encourages data management that leads to Findable, Accessible, Interoperable and Reusable research data and metadata. Major funders such as the Academy of Finland also require that metadata and, if possible, data produced with their funding are FAIR. To ensure that your data and/or their metadata are FAIR, follow these five steps:

  • Archive your data in an established digital repository at the end of the project
  • Choose a repository that provides your data a persistent identifier (PID), such as DOI or URN
  • Store your data in an open file format such as Rich Text Format (.rtf) or .csv; these are more interoperable and less subject to loss and obsolescence than proprietary formats
  • Create descriptive metadata for the data (see Documentation and metadata below)
  • License your data with a license that states clearly the conditions and restrictions for reuse (see ensuring further use of data below).
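
As a rough illustration of steps 3 to 5, the Python sketch below writes a small table to an open format (.csv) and stores descriptive metadata, including a license and a persistent identifier, in a JSON file next to it. The function name, the metadata field names, the file names and the placeholder DOI are all assumptions made for this example; they do not follow any particular repository's metadata schema.

```python
import csv
import json
from pathlib import Path


def write_dataset(folder, rows, metadata):
    """Write rows to data.csv and metadata to metadata.json inside `folder`."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    # Open, plain-text tabular format instead of a proprietary spreadsheet file.
    data_file = folder / "data.csv"
    with data_file.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)

    # Descriptive metadata stored alongside the data.
    meta_file = folder / "metadata.json"
    meta_file.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return data_file, meta_file


# Hypothetical example values, for illustration only.
rows = [
    {"sample_id": "S1", "value": 0.42},
    {"sample_id": "S2", "value": 0.57},
]
metadata = {
    "title": "Example measurement data",
    "creator": "Example Researcher",
    "description": "Two illustrative measurements.",
    "license": "CC-BY-4.0",
    "identifier": "doi:10.0000/placeholder",  # in practice assigned by the repository
}
```

Calling `write_dataset("my_dataset", rows, metadata)` would create both files in a folder named `my_dataset`. In a real project the repository you choose will usually define which metadata fields are required.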

Data management plan

Research funders, e.g. the Academy of Finland, require applicants and/or grantees to provide a data management plan (DMP) as part of the research plan, stating how the research data will be obtained, used, stored and protected, and how their later use will be enabled for others.

As a rule, seek to publish the data for further use. However, publishing does not necessarily mean that the data can be used by anyone for any purpose. It is therefore important that you acknowledge and document the ownership, control and terms of use for the data and include these in the descriptive metadata. The ownership and control of a dataset entail a right to decide on the purpose of its use, but also responsibility for proper management of the data. In many cases, there is a justified reason why the data themselves cannot be made available, but even in these cases, their metadata can.

A data management plan should typically describe:

1. What kind of data you will be reusing, collecting, and processing, and how?
2. What kind of ethical and legal considerations (data privacy, other ethical issues, ownership, usage rights) apply?
3. How will the data be documented, and what kind of descriptive metadata will you provide?
4. How will the data be stored and backed up?
5. At the end of the project, what parts of the data will be archived, published, and/or disposed of?
6. What kind of roles and responsibilities apply to the management of the data? What is the planned budget?

You can draft your DMP either by copying the funder's DMP template straight into Word or another word processor, or by using the DMPTuuli online tool.

Tools and guidance 

JYU guidelines on how to write your DMP
Academy of Finland's DMP tips
General Finnish DMP template and guidance (.pdf)
JYU's organisational instructions and model clauses for General Finnish DMP Template (.pdf)


Know your data

When starting on drafting your DMP, briefly describe and categorise your data (e.g., pre-existing statistical or archive data; raw data that you collect; processed analysis data). You can use a table or a listing. Name the different data types so that you can reference them later on in your DMP.

Examples:

| Data type | Source | Personal / sensitive data | File format (open formats recommended) | Estimated size |
|---|---|---|---|---|
| Analysed DNA sample | Processed from DNA sample | No | .xlsx, .csv | 2 GB |
| Statistical analysis X | Pre-existing from FSD | No, anonymised | SPSS (.por, .sav) | |
| Questionnaire | Collected from study subjects | Yes, identifiable and health information | .csv | 5 MB |
| Interview recording on video | Collected | Yes, identified personal information | .avi, .mp4 | |
| Interview transcript | Processed | No | .csv, .txt, .xlsx | >10 MB |
| Image | Collected | No | .tif, .jpeg, .gif, .raw | |
| Administrative documents | Permissions collected from study subjects | Yes | .docx | |

Source: Fuchs, S. 2020. RDM : Research Data Management Basics, Meilahti. Helsingin yliopisto. CC BY 4.0. [Retrieved on 25.2.2021.]
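
If you prefer to keep the data inventory in a machine-readable form, a listing like the one above can also be sketched as structured records. The Python example below is only an illustration: the class name, the helper functions and the short set of formats treated as open are assumptions made for this sketch, and the rows are a subset of the example table.

```python
from dataclasses import dataclass

# Assumption for this sketch: a small, non-exhaustive set of formats treated as open.
OPEN_FORMATS = {".csv", ".txt", ".rtf"}


@dataclass
class DataType:
    name: str
    source: str
    personal_data: bool
    formats: tuple


inventory = [
    DataType("Analysed DNA sample", "Processed from DNA sample", False, (".xlsx", ".csv")),
    DataType("Questionnaire", "Collected from study subjects", True, (".csv",)),
    DataType("Interview recording on video", "Collected", True, (".avi", ".mp4")),
    DataType("Interview transcript", "Processed", False, (".csv", ".txt", ".xlsx")),
]


def with_personal_data(items):
    """Names of data types that contain personal or sensitive data."""
    return [item.name for item in items if item.personal_data]


def without_open_format(items):
    """Names of data types that have no format in OPEN_FORMATS."""
    return [item.name for item in items if not OPEN_FORMATS.intersection(item.formats)]
```

Here `with_personal_data(inventory)` lists the data types that need attention in the privacy sections of the DMP, and `without_open_format(inventory)` lists those for which an open file format has not yet been chosen.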


Contracts, agreements, and licensing

To secure your legal protection and in view of the further use of data compiled in your research project, make a written agreement upon ownership of and access rights to the data within your group and with your partners as early as possible. The project's PI is responsible for making sure that all project partners sign a Transfer-of-rights agreement about transferring ownership of the data to the University before signing the project agreement, and at the latest upon starting the project. Transfer-of-rights agreements are always recommended, but must be made at least in the following cases:

  • projects funded by Business Finland
  • EU research programmes
  • commissioned research
  • as requested by the funder.

Consider these when making agreements

  • What parts of data are meant to be made available for reuse?
  • If one or more researchers bring pre-existing data (such as statistical or register data) to the project, will these be included in the future published data? Are reuse rights clear?
  • When will the dataset(s) be published?
  • For what purpose will the data be made available (free reuse with a Creative Commons license; restricted use for research/teaching/studying purposes)?
  • Who has the right to make the publishing/archiving contract?
  • If reuse will be restricted, who is authorised to grant the reuse right?

Model for data authorship & rights agreement contract

Download a ready-made model for a data authorship/usage rights agreement here.

JYU agreement models

JYU offers ready agreement templates for contracts.

Licensing 

Agreeing upon licenses for the future published data is important at this phase. Licensing means that you clearly define the reuse terms and possible restrictions to future reuse of the data. This way, you are in control of who will have rights to reuse the data, and how. The JYU policy is to use machine-readable licenses that follow international standards, preferably Creative Commons. Using them secures the maximum reusability for the data. Licensing is necessary for publishing data in the future; unlicensed datasets are unsafe to reuse.


Processing of personal data, ethical issues

Research ethics and the legal right of your study subjects to their personal information affect the ways in which you collect, store, and process research data, who is allowed to use the data and for what purpose, and how the data can be archived. When planning the collection of personal information, check the JYU data protection trainings and resources for researchers in good time. 

Do I process personal information in my research?

If your research involves the processing of any information that can be directly or indirectly linked to a natural, living person, you are processing personal information. In this case, data protection law applies to your research. Personal data is collected either directly from the subjects (interviews, surveys, measurements, observations, etc.) or indirectly from registers and archives.

Direct identifiers are information from which a person is immediately identifiable, e.g. full name, ID, personal e-mail address, face image, voice recording, fingerprint, or brain image. Indirect identifiers are information that does not directly identify an individual, but which, when combined with other information available about the person, may lead to identification. This information includes e.g. car registration number, grandparents' names, domicile, marital status, occupation, ethnic background and date of birth. For more information, see here.

Special categories of personal data and sensitive information 

Special categories of personal data include a person's ethnic origin, political opinion, religious beliefs and philosophical worldview, trade union membership, genetic data, personally identifiable biometric data, health information, and sexual behavior or orientation. Sensitive information also includes personal identification number, bank account information and criminal record information.

If you process personal data of a special category, special provisions apply to your research regarding the definition of the exception for the processing of personal data and the storage and processing of data.

 

Researcher's privacy path

1. It all starts with planning. What personal information do you need to conduct your research? Is your study a one-time or follow-up study? When you provide a privacy statement required by the GDPR, you must be able to specify a clear life cycle for the processing of personal information, including the beginning and the end. Follow the principles of data minimisation and storage limitation. Minimisation means that only the amount of personal data necessary for the purpose defined in the study plan should be collected and that identifiers that become redundant should be removed as soon as possible. According to the principle of storage limitation, you should try to define an end point in time for the storage of personal data. Note that the text of the privacy statement will legally bind you in the future.

2. Anonymisation, archiving, follow-up studies, reuse. The middle and end of the life cycle of personally identifiable information must be planned well in advance so that you can realistically describe it in the research notification and the privacy notice that you provide to your subjects.

  • At what point in data collection and processing do you plan to pseudonymise the data, and how do you ensure the security of the code key? Is anonymisation a viable option for you? Before collecting the information, familiarise yourself with what anonymisation requires of you in practice. Decide to anonymise only when you are absolutely sure that you should.

  • If you do not anonymise the data in order to open them for reuse after the research, justify in the privacy notice why identifiable data are retained after the study, for example for verification of the research results. If it is not possible for you to set an exact end date for the retention of personal data, record in the information you provide to the subjects that the retention of data for the purposes of the original study will be evaluated, for example, every year or two.

  • Archiving of identifiable data is possible under certain conditions. For example, the Finnish data archive for language data, the Language Bank of Finland, requires that a plan for archiving the data to the Language Bank be written in the information provided to the subjects. The same is required by the Finnish Social Science Data Archive. There are ready-made clauses in the template for the university's privacy statement for the different options. 


3. Select the appropriate legal basis for the processing of personal data. The primary recommended basis is scientific research in the public interest. Public interest, for example, facilitates the possible future reuse of the data. Consent is recommended only in cases where public interest is not applicable for one reason or another. If you collect special categories of personal data and use consent as a legal basis, the consent must be explicit. See more information on the choice of legal basis. Record the selected legal basis in the privacy notice.

4. Determine who will act as the controller of your personal data register. If you are carrying out research on an externally funded project or working for a university, the university acts as the controller. Often, the university and the researchers both act as controllers. In consortium projects, there is usually joint controllership between the partner organisations. Record the controller in the privacy notice and in your DMP.

5. Prior to the start of the study, clearly inform your subjects how and by whom their personal data will be processed and managed during the study. JYU's Privacy Policy advises you on how to process personal information securely and lawfully at different stages of the study. If your research setting is of such a nature (e.g. extensive register data with incomplete contact information) that informing the subjects personally is not possible, please consult the university's guidelines.

6. Always conduct at least a concise, free-form risk assessment for research that contains personal information. For more information, see here. If the risk is estimated to be high, the study is subject to a Data Protection Impact Assessment (DPIA) in accordance with the EU's General Data Protection Regulation.

The following are considered to pose a particularly high risk:

  • a large number of persons whose data are processed
  • a large amount of information about a person
  • sensitive information
  • information on vulnerable study subjects (e.g. children)
  • use of data for automated decision making
  • systematic monitoring.

See the JYU data privacy guide for researchers for detailed instructions.

7. When collecting personal data directly from your subjects, also ask them for consent to participate in the study. Consent to participate in the study is sought when the legal basis for the processing of personal data is public interest. Informed consent is a fundamental principle of research ethics, and deviating from it always requires an impact assessment.

8. When collecting personal information, strive to minimise the identifiable information, that is, avoid collecting personal information that is not necessary for your research question. Take care of the data security of the storage devices during the field study. Transfer the data to the University Nextcloud, the S: drive project folder, or the U: drive folder as soon as possible after saving.

9. When processing personal information, take care of the security of your procedures. Ensure that access to personally identifiable information is restricted to the persons or entities described in the DMP and the privacy notice. Document how you implement the security measures you have promised to subjects and how you control access to identifying information. Remember that personal data may only be processed in the manner and for the purpose of which the data subject was informed in the privacy notice before the start of the study. If a need to deviate from what is specified in the privacy notice arises during the processing of the data, inform the subjects immediately and update the privacy notice and other necessary documents.

10. At the end of the study, take care of the identifiable parts of the data according to the information you provided to the subjects. What parts of the data do you destroy? What will you possibly anonymise? What do you store post-project in identifiable form, e.g., pseudonymised, for verification of results or any follow-up that may be included in your original study? What are you archiving, where, and with what usage restrictions?

Make sure that the processing of personal data is described in a consistent manner in your data management plan and in the research notification and privacy notice you provide to subjects.
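
To make the pseudonymisation and code key mentioned in step 2 of the privacy path more concrete, the Python sketch below replaces a direct identifier with a random code and returns the code key separately. It is a minimal illustration under stated assumptions, not a JYU-endorsed procedure: the function name, the field names and the example records are hypothetical, and a real project also needs to consider indirect identifiers.

```python
import secrets


def pseudonymise(records, id_field, drop_fields=()):
    """Replace `id_field` with a random code and drop other direct identifiers.

    Returns (pseudonymised_records, code_key). The code key maps each code
    back to the original identifier and must be stored separately from the data.
    """
    code_for = {}  # original identifier -> code
    used_codes = set()
    output = []
    for record in records:
        original = record[id_field]
        if original not in code_for:
            code = "P-" + secrets.token_hex(4)
            while code in used_codes:  # regenerate in the unlikely case of a clash
                code = "P-" + secrets.token_hex(4)
            used_codes.add(code)
            code_for[original] = code
        cleaned = {
            key: value
            for key, value in record.items()
            if key != id_field and key not in drop_fields
        }
        cleaned["participant_code"] = code_for[original]
        output.append(cleaned)
    code_key = {code: original for original, code in code_for.items()}
    return output, code_key


# Hypothetical example records, for illustration only.
records = [
    {"name": "Example Person A", "email": "a@example.org", "answer": 3},
    {"name": "Example Person B", "email": "b@example.org", "answer": 5},
    {"name": "Example Person A", "email": "a@example.org", "answer": 4},
]
pseudonymised, code_key = pseudonymise(records, "name", drop_fields=("email",))
```

The same person always receives the same code, so repeated measurements can still be linked. As long as the code key exists, the data remain identifiable personal data, which is why the key should be kept apart from the research data and access to it restricted as described in your DMP and privacy notice.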


If you have questions about the processing of your data at the end of your research, please contact the Open Science Centre: researchsupport-osc@jyu.fi.

Need help with data privacy issues? Contact the university's Data Protection Officer: tietosuoja@jyu.fi.

Ethical Review in the Human Sciences

You may need a prior evaluation of your research from the University's Human Sciences Ethics Committee or the Ethics Committee of the Hospital District of Central Finland if your research set-up meets the specific criteria defined by the Finnish National Board on Research Integrity:

  • participation in the study deviates from the principle of informed consent,
  • the study intervenes in the physical integrity of the subjects,
  • the study is conducted with participants under the age of 15 without separate consent from the guardian or without informing the guardian in a way that would give the guardian the opportunity to prohibit the child from participating in the study,
  • the subjects are presented with exceptionally strong stimuli,
  • there is a risk that the research causes mental harm to the subjects or their relatives beyond the limits of normal everyday life, or
  • the conduct of the study may pose a safety threat to the subjects or the researcher or their relatives.

If needed, contact the University’s Human Sciences Ethics Committee well in advance of starting your research. Review can no longer be requested once you have started the research.