Introduction

On this page you will find answers to the tasks of the data management plan:

1) Read more on the Responsible conduct of research (RCR) page. Which RCR starting points are related to your research data? Justify your answer!

2) Describe the difference between a research plan and a data management plan.
Why make a data management plan?


Research data as part of research

Research data refers to all data produced and used in research.

  • Typical data include, for example, interviews and surveys.
  • Research results are based on research data, but the creation of data in itself can also be a significant result of the research.

Research data is a key part of research. Research data is collected to answer research questions. The research design determines what kind of data is collected and what types of data (quantitative, qualitative) are produced. On the other hand, reviewing the data may reveal new perspectives or something surprising, in which case the original research questions may be modified.

Typical data include various

  • queries
  • interviews
  • observation
  • official documents
  • archival data
  • statistics
  • photos and videos
  • forums
  • laboratory samples
  • measurement results
  • medical imaging and
  • modelling, simulations and experiments in different disciplines, as well as code.

It is also common to produce new parts of the data, such as visualizations, spreadsheets, classifications or databases, based on raw data.

Please note that research data is different from research literature/sources. For example, research articles on your topic are in most cases sources, not research data.

A data management plan is part of research planning

Data management planning is an important part of research planning.

  • Data management helps to take into account ethical, practical and legal issues, such as personal data.
  • Its purpose is to ensure that research data also complies with good scientific practice and that the data is not compromised at any stage.
  • At the same time, consideration will be given to whether it is possible to open and further use the data.

Data management is aided by a data management plan that complements the research plan.

When collecting your own research data, reserve enough time to prepare your data management plan before collecting the data. A data management plan is a kind of checklist and planning tool that helps you take the necessary issues into account. Its careful preparation makes it possible to solve issues related to the collection, processing, storage and possible opening of research data in a controlled manner. Carry out this work together with your supervisor as part of research planning.

Even if a data management plan is not mandatory for you, it is worth preparing one because it helps ensure that the data is compiled properly. At the same time, you explain how you have taken good scientific practice into account.

The data management plan answers the following questions, among others:

  • Data overview and quality assurance:
    • What kind of data is your research based on? What kind of data do you collect, produce or reuse? How do you ensure the consistency and quality of your data?
  • Rights related to the data:
    • Is there a need to make agreements on the use of the data, for example if you have access to previously collected data? Are there copyrights attached to the data?
  • Personal data:
    • What personal data do you process, and what must be taken into account when processing it? What other ethical considerations are included in the management of your data?
  • Documentation and metadata:
    • How do you document and describe your data so that it is understandable to others and the processing of the data is systematic and smooth?
  • Saving and backing up data:
    • Where is the data stored, and how do you back it up during your thesis process? How secure is your data, and what software and hardware are safe to use?
  • Opening, archiving or destroying data:
    • What part of your data can be opened and archived for further use? How is data disposed of securely?

There are several templates for a data management plan.

The course data management plan is in many respects based on FSD's Data Management Guidelines. The plan also takes into account the Academy of Finland's guidelines.

In the research world, a data management plan is often a prerequisite for research funding.

When working with your data, remember to document and back up! Keep track of what you are doing and what you have agreed with research subjects or partners. Keep the data safe so that you can return to it afterwards if necessary.
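The reminder to back up can be illustrated with a minimal sketch. It assumes a single data file and a backup folder that you are allowed to use for your data; the function names `sha256_of` and `back_up` and the file names are hypothetical, chosen only for this example. The checksum comparison is one possible way to confirm that the copy matches the original, not a required procedure.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and verify the copy by checksum."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also tries to preserve timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target


if __name__ == "__main__":
    # Demonstration with throwaway files; replace with your own paths.
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "interview_notes.txt"
        original.write_text("placeholder content", encoding="utf-8")
        copy = back_up(original, Path(tmp) / "backup")
        print(copy.name)
```

If the data contains personal data, the backup location must be as secure as the original storage location.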

Responsible conduct of research

Responsible science and research is ethically sustainable and as open as possible. Research is considered reliable only if it has been carried out in accordance with responsible conduct of research (RCR).

  • Responsible conduct of research is an ethical guideline whose strength is based on the research community's commitment to complying with it.
  • Legislation sets the framework conditions for RCR.

When research has been conducted in a manner required by responsible conduct of research, it can be considered ethically acceptable and reliable, and its results can be considered credible. The RCR guidelines have been prepared by the Finnish National Board on Research Integrity (TENK) together with the scientific community.

The openness and transparency of research are among these guidelines: in order to evaluate research, it is necessary to know how the research has been conducted. Therefore, in your thesis, clearly state which research method you have used, how you have collected the research data and how you have analysed it.

One of the principles of responsible research is that research is "as open as possible, as closed as necessary". Not everything can be opened, for example, because of the personal data contained in the data.

You can also revisit the Library Tutor to remind yourself what good scientific practice means in terms of the use and referencing of research literature.

Source: vastuullinentiede.fi

Characteristics of science

Both research literature (the theoretical basis of research) and research data (the material collected for the research) must follow the method of producing scientific knowledge and meet its characteristics, which in summary are:

  • Justification
    • Claims must be substantiated by scientific methods.
  • Publicity and joint construction and agreement (intersubjectivity)
    • The claims and their justifications must be public.
    • Science must be open to all.
    • Scientific information must be presented in understandable language.
  • Criticality, self-correction, autonomy
    • The researcher must carefully examine claims presented as truths (criticality).
    • Scientific results should be understood as preliminary and conditional (self-correction).
    • The correction of results is a matter for the scientific community (autonomy).
  • Progressiveness
    • Scientific activity takes science towards truth.
    • In the scientific context, there are different norms that are respected in the research community.

Source: Niiniluoto Ilkka, "Tieteen tuntomerkit". Tiede, filosofia ja maailmankatsomus. Otava. Helsinki 1984, 21-30. [Niiniluoto Ilkka, "Characteristics of science". Science, philosophy and worldview. Otava. Helsinki 1984, 21-30.]

What if I used archived data?

Students and researchers can use previously collected data instead of collecting data themselves.

  • Data archives provide access to data collected for earlier research and opened for others to use.
  • This can give you access to data that would be impossible to collect within the framework of a single thesis (e.g. a longitudinal study conducted over several years in early childhood education).

Students may also have access to:

  • Data from an ongoing research project, possibly by taking part in producing the project's data
  • Part of the data previously produced at their own department

Archived data can be found in the data archives. Data are archived in many different places, such as:

  • FSD
  • Language Bank
  • Publication archives of different universities
  • Zenodo

The JYX publication archive is our university's own archive. You can browse the research data in JYX. Some of the data are freely available in JYX. For some datasets, only the descriptive information (metadata) is visible.

Avoindata.fi offers, in particular, data produced by government agencies, municipalities and other public administration organisations.

Even if you have access to ready-made data, data management planning is still important.

Many aspects of the data management plan concern, above all, the collection of new data. Reusing archived data is easier in that most data management issues were already solved when the data was collected and archived. On the other hand, getting acquainted with existing data takes time. Read about others' experiences: Archived data is suitable for both quantitative and qualitative theses.

The openness of data is part of the openness of science. The data of a single research project can often support several different studies from different perspectives. If the situation allows, it makes sense to make your data available for further use and, as a student, to make use of previously collected data.

FAIR principles

With regard to research data, the principles of responsible science are put into practice by following the FAIR principles as closely as possible. FAIR stands for:

  • Findable
  • Accessible (available)
  • Interoperable
  • Reusable

To be more specific, these mean:

  • Findable
    • The data or its descriptive information is openly available.
    • Descriptive information (metadata) refers to information about your research data, such as when and how the data was collected. More on metadata later.
  • Accessible
    • The research data, or at least its metadata, is accessible. Accessibility can mean, for example, that metadata has been published in an archive through which data can be requested.
  • Interoperable
    • The research data has been compiled and documented in ways that comply with formal requirements and enable the data to be used in other contexts as well. This usually refers to the readability of files or data by various programs (machine readability).
  • Reusable
    • The research data has a clear licence enabling use, the data has been comprehensively described, and the data meets the quality requirements of the scientific community.
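As an illustration of descriptive information in a machine-readable form, the following minimal sketch writes a dataset description as JSON. The class name `DatasetDescription`, its field names and all values are hypothetical placeholders chosen for this example; they do not follow any particular metadata standard, and an archive may require different or additional fields.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetDescription:
    """Hypothetical minimal description of a thesis dataset."""
    title: str
    creator: str
    collected: str               # when the data was collected, e.g. "2024-03"
    method: str                  # how the data was collected
    licence: str                 # terms of reuse, if the data can be opened
    contains_personal_data: bool
    files: list[str] = field(default_factory=list)


def to_json(description: DatasetDescription) -> str:
    """Serialise the description as indented, human-readable JSON."""
    return json.dumps(asdict(description), indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # Placeholder values for demonstration only.
    example = DatasetDescription(
        title="Example interview dataset",
        creator="Example Student",
        collected="2024-03",
        method="Semi-structured interviews",
        licence="CC BY 4.0",
        contains_personal_data=True,
        files=["interviews_anonymised.csv"],
    )
    print(to_json(example))
```

A description like this can be stored next to the data files so that both people and programs can read when and how the data was collected.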

Be aware and use where possible: Follow the FAIR principles as far as possible, but be aware that they are an ideal. Data can meet only some of the principles and still be valuable. As a student, you should know the FAIR principles even if you can follow them only partially.

Depending on the data, some of the principles can be challenging. In particular, the interoperability section emphasises machine readability, which is not possible for all data types.

It is the researcher's (and student's) responsibility to focus primarily on compiling and documenting the data as systematically as possible.

Help with metadata, storage and publication related to findability and accessibility is available from the Open Data Centre.

The FAIR principles make it possible to open up data, but they are not just for that. The FAIR principles and good data management practices are part of good scientific conduct, and following them makes research better: it becomes more reproducible, verifiable and structured, and easier to report. If you wish, you can learn more about the FAIR principles and watch a short video.

On the choice of thesis topic

A Bachelor's or Master's thesis should not deal with data whose collection or study could put the thesis author or research subjects at risk or involve unreasonable risks.

The student should also consider whether, for example, the data is so ethically demanding that the ethical questions cannot be resolved within the framework of the thesis. These are things you should discuss with your supervisor.

Sources

Why data management and further use? (FSD)

Production of data (FSD)

Data types (FSD)