Publish, archive and dispose of data at the end of the project

Publishing metadata and datasets at the University of Jyväskylä

The University of Jyväskylä strives for the widest possible openness of research data and their metadata, following the principle "As open as possible, as closed as necessary". Accordingly, researchers, as the best experts on their own data, evaluate which parts of their datasets can be published. By default, data should be published; if they cannot be made openly available, the reasons must be justified in the data management plan. The university recognises that, depending on the nature of the data, there can be different degrees of openness.

When planning your research, consider what selections of the data you can publish, what steps it takes to publish them, and what you will do with the rest of the data (archiving, disposal). This way, your plan covers the entire data lifecycle, and you will save a lot of time and effort at the end of your project.

Regardless of which data repository you choose, it is crucial that relevant information about the dataset and its storage site also remains at JYU. Metadata should be published for all datasets, including those that remain closed and/or are disposed of. This is done in the Research Data section of the Research Information System Converis (see the step-by-step instructions). If you publish data in the University's digital repository JYX, the Open Science Centre will take care of recording the availability information for you.

What are the benefits of publishing my data?

Opening your own data makes the world a better place, but it also directly benefits your research:

  • You will find your own datasets faster and more reliably in the future for re-use - no matter how many years, new computers, new working institutions, or even new continents, you've gone through in between.
  • Have you ever struggled to remember what a particular column in one of your old datasets actually means, or to decide who deserves co-authorship in new research that uses old data? Your research becomes easier and meets higher quality standards when its metadata is managed from the outset and published when ready.
  • Your research gets cited through datasets too, and you are easier to find. This means new contacts in your field of research, more name recognition, and more opportunities to do interesting and rewarding science.
  • Many funders and research institutions recognise published datasets as a significant output when evaluating researcher merit. The weight of data as a contribution to science when awarding grants and positions is steadily increasing. Be in the vanguard and open everything you can - you will gain a competitive edge in funding applications and in other situations where your research merit is judged, for example when you apply for tenure.
  • Most importantly: the chance that your work will have an impact on the world grows considerably.

Preparing data for publishing

A prerequisite for opening data is that the metadata and the documentation of the data are up to date. One way to ensure this from the beginning of the research project is to keep the documentation of the methods, structure, content and other relevant contextual information in a subfolder called /DOCUMENTATION. General metadata is maintained in the Research Data section of the University's Converis research information system.
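As a minimal sketch of this practice, the following Python script sets up a DOCUMENTATION subfolder with placeholder files at the start of a project. Only the folder name comes from the guidance above; the file names (README.txt, codebook.csv), their column headings and the function name are illustrative assumptions, not a JYU requirement.

```python
from pathlib import Path

# Hypothetical starter files; only the folder name DOCUMENTATION comes from the guidance.
STARTER_FILES = {
    "README.txt": "Project: \nMethods: \nFolder structure: \nContact: \n",
    "codebook.csv": "variable,description,unit,allowed_values\n",
}


def init_documentation(project_dir):
    """Create <project_dir>/DOCUMENTATION and add any missing starter files.

    Existing files are left untouched, so the function is safe to re-run.
    Returns the path of the DOCUMENTATION folder.
    """
    doc_dir = Path(project_dir) / "DOCUMENTATION"
    doc_dir.mkdir(parents=True, exist_ok=True)
    for name, template in STARTER_FILES.items():
        target = doc_dir / name
        if not target.exists():
            target.write_text(template, encoding="utf-8")
    return doc_dir
```

Calling `init_documentation("my_project")` once when the project starts gives every variable a place to be described from day one, which also helps when you later fill in the Converis metadata.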

Creating a metadata record does not mean that you need to publish it yet. You can update it as your project proceeds and keep it closed until you are ready to publish it at the end of the project. An early record also allows the University of Jyväskylä to support your data management needs from the start. Moreover, you can request help from the University's data experts via Converis with a few clicks. Finally, via Converis, you can request the publication of finished metadata, or even of the dataset itself, in the University's digital repository JYX.

Choosing a repository

 

Best practices

• Publish your data primarily in a digital repository that is specific to your discipline or research area. In a field-specific archive, the data is most likely to be found by researchers in your field.
The Re3data portal is an excellent site to search for a suitable repository and to browse repositories in your field.
• If a suitable field-specific repository is not available, publish your data in the university’s JYX repository.
• Use generalist data repositories such as Zenodo or figshare only as a last resort. The data deposited in them is heterogeneous, which reduces the discoverability of the data and makes it difficult to evaluate the findings.
• Remember that publishing data on your own or your project's website does not meet the requirements of the funders or the university's expectations regarding the discoverability and accessibility of the material.

Criteria for a FAIR repository

  • Widely used by researchers in your field
  • Gives the metadata (and, if applicable, the underlying data) a permanent identifier, such as a DOI or URN
  • Publishes machine-readable metadata and uses a known metadata standard
  • Has a certificate of operational reliability, such as CoreTrustSeal or ISO 16363 certification
  • Allows you to choose the terms of use under which the material can be further used, and states them clearly as part of the metadata.
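The checklist above can be turned into a small record for comparing candidate repositories. The following Python sketch is only an illustration: the class, field and function names are hypothetical, and the example values are made up rather than an assessment of any real repository.

```python
from dataclasses import dataclass, fields


@dataclass
class RepositoryProfile:
    """Answers to the checklist for one candidate repository (illustrative field names)."""
    widely_used_in_field: bool
    assigns_persistent_identifier: bool       # e.g. a DOI or URN
    machine_readable_standard_metadata: bool
    certified_reliability: bool               # certificate of operational reliability
    clear_terms_of_use: bool


def unmet_criteria(profile):
    """Return the names of the checklist items the repository does not satisfy."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


# Made-up example values, not an assessment of any real repository.
candidate = RepositoryProfile(True, True, True, False, True)
print(unmet_criteria(candidate))  # ['certified_reliability']
```

An empty list means the candidate meets every criterion; any names returned point to what you should check or justify before depositing.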

Finnish Social Science Data Archive (FSD)

The FSD focuses on acquiring social science data. Under certain circumstances, data from other relevant fields (arts and humanities, education, and health sciences) can also be archived. Datasets deposited at the archive must meet certain technical and legal requirements. Before dissemination, archived datasets are processed and documented.

FSD promotes open access to research data as well as transparency, accumulation and efficient reuse of scientific research. FSD also responsibly implements the FAIR data principles, which aim at making data and services Findable, Accessible, Interoperable and Re-usable.

The archive is a national resource centre funded by the Ministry of Education and Culture and the University of Tampere. In addition to archiving and dissemination of data, key services include data-related information services and support for research data management. The archive operates as a separate unit of the University of Tampere.

The Language Bank of Finland

The Language Bank of Finland is a service for researchers using language resources. The Language Bank has a wide variety of text and speech corpora and tools for studying them. The corpora can be analyzed and processed with the Language Bank’s tools or downloaded.

Many corpora are publicly accessible; some require logging in. The rights to use restricted resources can be applied for electronically. Using the Language Bank is free for researchers and students.

If you are new to the Language Bank, take a look at the Language Bank introduction.


JYX

JYX is JYU's repository for publications and research data. It gives datasets permanent identifiers (DOI, URN). The metadata is sent to the national METAX catalogue, which ensures that the metadata and datasets can be found through the national ETSIN service.

Publishing a dataset in JYX is simple. Create metadata for the dataset in the Converis current research information system and make a request in Converis to publish the (meta)data.

I urgently need a DOI for my dataset, what do I do?

If you urgently need a Permanent Identifier (PID) for your data so that it can be cited, you can obtain one by publishing the metadata and/or the dataset in JYU's JYX Publication Archive:

  • Is the dataset ready for publication, and do you wish to publish it in JYX? Enter the metadata of the dataset in the Converis research information system as instructed. On the last tab of the Converis metadata form, select that you want to publish the data together with the metadata. In the "Nextcloud sharing link for publication" field, paste the sharing link to the processed and organized dataset that you have saved in the university's Nextcloud storage service. If you need advice on how to prepare the data for publication or help with transferring the data to Nextcloud, or if you want to submit the data by another method, please contact researchsupport-osc@jyu.fi. In the "More information" field of the form, enter the date by which you want the data to be published. Then save the record in the "For validation" status. After this, a data expert at the Open Science Centre checks the Converis metadata and transfers it for publication in the JYX publication archive. In JYX, the dataset receives a permanent identifier (DOI).

  • Is the dataset still incomplete, but you need a permanent identifier in advance so that it can be cited? Proceed as above, but on the last tab of the Converis form, select the option to publish only the metadata of the dataset. Before saving, make sure that all the descriptive information required in Converis is up to date, then save the metadata entry in the "To be completed" status. A Nextcloud link, or a link to the dataset in another data archive, can be added at a later stage when the dataset is ready for publication. At that point, send a separate request to complete the descriptive information to researchsupport-osc@jyu.fi, or use the Change Requests to OSC field on the last tab of the Converis form.

Archiving and disposal of data

Check with your faculty whether your discipline has a research data archiving policy, e.g. for 1-5 years after the project's conclusion.

Disposal of any sensitive data must be carefully planned. Deleting files using operating system tools, or even reformatting a hard drive, will not irretrievably destroy the data. It is important to permanently destroy any data that includes personal, confidential or sensitive information once storage is no longer necessary. Choose a suitable method of disposal according to the data format:

  • Paper data is disposed of in the grey locked data security boxes on the university campus.
  • Electronic files can be overwritten with overwriting software. Eraser and WipeFile are examples of free, open source erasing programmes. The University's Data Security Officer can help with further questions about secure data disposal via the HelpJYU portal.
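To illustrate how overwriting differs from ordinary deletion, here is a minimal Python sketch that replaces a file's contents with random bytes before removing it. It is an illustration only and not a substitute for dedicated erasing tools or the Data Security Officer's advice: on SSDs, journaling or copy-on-write file systems, cloud-synced folders and backups, copies of the data may survive an in-place overwrite. The function name and the single-pass default are assumptions made for this example.

```python
import os


def overwrite_and_delete(path, passes=1, chunk_size=1024 * 1024):
    """Overwrite a file in place with random bytes, then delete it.

    Illustration only: this cannot guarantee irrecoverable erasure on SSDs,
    copy-on-write file systems, synced folders or backups.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:          # open for in-place update, no truncation
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))    # replace the old bytes with random data
                remaining -= n
            f.flush()
            os.fsync(f.fileno())          # ask the OS to push the bytes to the device
    os.remove(path)
```

An ordinary delete only removes the directory entry and leaves the old bytes on the disk, whereas this sketch writes over them first. For real sensitive data, use the tools and procedures recommended above.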

Long-term preservation

Long-term preservation means preserving especially valuable datasets for more than 25 years. The national CSC service Fairdata-PAS preserves these datasets. In accordance with its data policy, JYU coordinates the preservation of nationally significant datasets in Fairdata-PAS. For more information, see below.