Publish, store post-project, and dispose of data

Publishing metadata and datasets at the University of Jyväskylä

The University of Jyväskylä strives for the widest possible openness of research data and their metadata, following the principle "as open as possible, as closed as necessary". Researchers, as the best experts on their own data, evaluate which parts of their datasets can be published. By default, data should be published; if they cannot be made openly available, the reasons must be justified in the data management plan. The University recognises that, depending on the nature of the data, there can be different degrees of openness:

The descriptive metadata is always published. To do so, create a metadata record in Converis and request its publication via Converis, which takes only a couple of clicks; see the step-by-step instructions in Intranet Uno. Once published, the metadata record appears in the University's JYX repository with a DOI.
Creating a metadata record does not mean that you need to publish it right away. You can update it as your project proceeds and keep it private until you are ready to publish it at the end of the project. Creating the record early lets the Open Science Centre support your data management needs from the start. Once ready, save the metadata entry in Converis in "For validation" status; the Open Science Centre will then check it and publish it in JYX. This way, you get a published FAIR metadata record with a DOI for persistent findability.
The data files themselves can be published in part or as a whole and with varying terms for reuse in an established discipline-specific data repository, in the JYX archive together with the Converis metadata, or e.g. as a data article (for data journals, see tips provided by Aalto University).

When planning your research, consider what parts of the data you can publish, what steps it takes to publish them, and what you will do with the rest of the data (temporary post-project storage for e.g. a verification period; disposal). This way, your plan covers the entire data lifecycle, and you will save a lot of time and effort at the end of your project.

Regardless of which data repository you choose, it is crucial that relevant information about the dataset and its storage site also remains at JYU. Converis metadata should be published for all datasets, including those that remain closed and/or are disposed of; see the step-by-step instructions. If you publish data in the University's digital repository JYX, the Open Science Centre will take care of recording the availability information for you.

Where should I publish my data?

Best practices

• Publish your data primarily in a discipline-specific or research-specific digital repository. In such a repository, the data will most likely be found by researchers in your field.
• The Re3data portal is an excellent site to search for a suitable repository and to browse repositories in your field.
• If a suitable field-specific repository is not available, publish your data in the university’s JYX repository. Note: The maximum size of an individual data file in JYX is currently 3 to 4 GB. The number of files you can deposit, however, is not restricted, so you can deposit as many files as needed.
• Use generalist data repositories such as Zenodo or figshare only as a last resort. The data deposited in them is heterogeneous, which lessens the discoverability of the data and makes it difficult to evaluate the findings.
• Remember that publishing data on your own or your project's website does not meet the requirements of the funders or the university's expectations regarding the discoverability and accessibility of the material.
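
Before depositing in JYX, it can help to check which files in a dataset folder exceed the per-file size limit mentioned above. The following is a minimal sketch, not an official JYX tool. The 3 GB threshold is an assumption (the conservative end of the "3 to 4 GB" range), and the function name `find_oversized_files` is hypothetical.

```python
from pathlib import Path

# Assumption: 3 GB, the conservative end of the "3 to 4 GB" per-file limit.
MAX_FILE_BYTES = 3 * 1024**3


def find_oversized_files(root, limit=MAX_FILE_BYTES):
    """Return (relative path, size in bytes) for every file larger than limit."""
    root = Path(root)
    oversized = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                oversized.append((path.relative_to(root).as_posix(), size))
    return oversized
```

For example, `find_oversized_files("my_dataset")` lists the files that may need to be split or compressed before deposit; an empty list means every file is under the assumed limit.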

Criteria for a FAIR repository

• Is widely used by researchers in your field.
• Gives the metadata (and, if applicable, the underlying data) a permanent identifier, such as a DOI or URN.
• Publishes machine-readable metadata and uses a known metadata standard.
• Has a certificate of operational reliability, such as CoreTrustSeal or ISO 16363.
• Allows you to choose the terms of use under which the material can be further used, and states them clearly as part of the metadata.

Finnish Social Science Data Archive (FSD)

The FSD focuses on acquiring social science data. Under certain circumstances, data from other relevant fields (arts and humanities, education, and health sciences) can also be archived. Datasets deposited at the archive must meet certain technical and legal requirements. Before dissemination, archived datasets are processed and documented.

FSD promotes open access to research data as well as transparency, accumulation and efficient reuse of scientific research. FSD also responsibly implements the FAIR data principles, which aim at making data and services Findable, Accessible, Interoperable and Reusable.

The archive is a national resource centre funded by the Ministry of Education and Culture and the University of Tampere. In addition to archiving and dissemination of data, key services include data-related information services and support for research data management. The archive operates as a separate unit of the University of Tampere.

The Language Bank of Finland

The Language Bank of Finland is a service for researchers using language resources. The Language Bank has a wide variety of text and speech corpora and tools for studying them. The corpora can be analyzed and processed with the Language Bank’s tools or downloaded.

Many corpora are publicly accessible; some require logging in. The rights to use restricted resources can be applied for electronically. Using the Language Bank is free for researchers and students.

If you are new to the Language Bank, take a look at the Language Bank introduction.

JYX

JYX is JYU's repository for publications and research data. It gives datasets permanent identifiers (DOI, URN). Metadata is sent to the national METAX catalogue, which ensures that the metadata and datasets can be found using the national ETSIN service.

Publishing a dataset in JYX is simple: create metadata for the dataset in the Converis current research information system and make a request in Converis to publish the (meta)data.

How do I publish my dataset in JYX?

Option 1. If you store the data in a Nextcloud group folder (JYU Groups), follow these steps:

1. Make sure that all the required files, documentation and subfolders are included in the group folder in the correct order and that the files are clearly named. It's a good idea to place the documentation file at the root of the directory, where it's easy to open first.

2. Click the link icon to the right of the folder.

3. Click the + icon next to the Share link command to display the sharing settings.

4. Change the sharing setting from the default Read only to Allow upload and editing, and in the menu deactivate the expiration date, which is active by default.

5. Click the Copy to clipboard icon and paste the link into Notepad or directly into Converis.

6. Check that your metadata entry in Converis is in "To be completed" status. Place the Nextcloud link in the “Nextcloud research data link for publishing” field in the Converis form.

7. Finally, change the status of the Converis entry to "For validation" from the Save and select status menu at the bottom right of the form. Next, the Open Science Centre checks the entry and publishes it in JYX. After that, the Open Science Centre will send you the activated DOI link.
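
If you keep a local copy of the group folder, the check in step 1 (a documentation file at the root of the directory) can be sketched in a few lines of Python. This is only an illustration: the name prefixes in `DOC_PREFIXES` and the function name `root_documentation_files` are assumptions, not a naming convention required by JYX or Converis.

```python
from pathlib import Path

# Assumption: documentation files are recognised by these name prefixes.
DOC_PREFIXES = ("readme", "documentation")


def root_documentation_files(folder):
    """Return the names of files in the folder root that look like documentation."""
    folder = Path(folder)
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file() and p.name.lower().startswith(DOC_PREFIXES)
    )
```

An empty result suggests that the documentation file is missing or sits in a subfolder, where it is harder to find first.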

Option 2. If you do not store the data in Nextcloud, follow these steps:

1. Organise the data and documentation files that you want to publish in clearly named subfolders. Place the documentation file(s) or folder in the directory root, where they can be easily found.

2. Convert the folder to a compressed .zip folder: right-click the folder you want to publish, then select Send to and Compressed (zipped) folder.

3. Send the .zip folder to the OSC data specialists at researchsupport-osc@jyu.fi; they will then store the data in Nextcloud and publish the data with the metadata. For larger packages, you can use the JYU Funet File Sender system to send them: https://filesender.funet.fi/.

4. In Converis, save the metadata entry in "For validation" status from the Save and select status menu at the bottom right corner of the form. The Open Science Centre then adds the dataset link and publishes the entry in JYX.
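
The compression in step 2 can also be done with a script instead of the Windows context menu, for example on other operating systems. The sketch below uses Python's standard `shutil.make_archive`; the function name `zip_dataset` is hypothetical. The destination directory should be outside the folder being compressed.

```python
import shutil
from pathlib import Path


def zip_dataset(folder, destination_dir):
    """Create <destination_dir>/<folder name>.zip containing the folder; return its path."""
    folder = Path(folder).resolve()
    destination_dir = Path(destination_dir).resolve()
    archive = shutil.make_archive(
        str(destination_dir / folder.name),  # archive path without the .zip suffix
        "zip",
        root_dir=folder.parent,  # paths in the archive are relative to this directory
        base_dir=folder.name,    # only this folder is included
    )
    return Path(archive)
```

For example, `zip_dataset("my_dataset", "outbox")` would produce `outbox/my_dataset.zip`, with the documentation file still at the root of the `my_dataset` folder inside the archive.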

I urgently need a DOI for my dataset, what do I do?

If you urgently need a permanent identifier (PID) for your data for reference, you can get one by publishing the metadata and/or the dataset in JYU's JYX Publication Archive:

Is the dataset ready for publication, and do you wish to publish it in JYX? Enter the metadata of the dataset in the Converis research information system as instructed. On the last tab of the Converis metadata form, select that you want to publish the data with the metadata. In the "Nextcloud sharing link for publication" field, paste the sharing link to the processed and organized dataset that you have previously saved in your project's Nextcloud group folder. If you need advice on how to prepare the data for publication or help in transferring the data to Nextcloud, or if your project does not use a Nextcloud group folder, contact researchsupport-osc@jyu.fi. In that case, the Open Science Centre will publish the data for you.
In the "More information" field of the form, enter the date by which you want the data to be published. Then save the entry in "For validation" status. After this, a data expert at the Open Science Centre checks the Converis metadata and transfers it for publication in the JYX publication archive. In JYX, the dataset receives a permanent identifier, a DOI.
Is the dataset still incomplete, but you need a permanent identifier for reference in advance? Proceed as above, but on the last tab of the Converis form, select the option to publish only the metadata of the dataset. Save the metadata entry in "To be completed" status. Before saving, make sure that all the descriptive information required in Converis is up to date. A Nextcloud group folder link, or a link to the dataset in another data archive, can be added at a later stage, when the dataset is ready for publication. When the dataset is ready, request completion of the descriptive information by emailing researchsupport-osc@jyu.fi or by using the Change Requests to OSC field on the last tab of the Converis form.

Post-project storage and disposal of data

In longitudinal and follow-up studies, it is often necessary to store data in identifiable form according to the research plan and the personal data lifespan plan stated in the data privacy notice. If you plan to store personal data after the end of the initial study phase, justify this in the data privacy notice and in your DMP. A typical justification is the need to contact the study participants again after some time: storage is necessary for follow-up measures that build directly upon the initial research and are compatible with the original research plan. Once the storage is justified, you can set a timeframe (e.g. 5 to 10 years) and name the persons responsible for re-assessing the need to keep storing the data in identifiable form. Nextcloud group folders are the recommended storage solution for non-sensitive data. For up-to-date advice on selecting a storage solution for sensitive data, please contact the Open Science Centre: researchsupport-osc@jyu.fi.

Disposal of any sensitive data must be carefully planned. Deleting files using operating system tools, or even reformatting a hard drive, will not irretrievably destroy the data. It is important to permanently destroy any data that includes personal, confidential or sensitive information once storage is no longer necessary. Choose a suitable method of disposal according to the data format:

• Paper data is disposed of in the grey, locked data security boxes on the university campus.
• Electronic files can be overwritten with overwriting software. Eraser and WipeFile are examples of free, open source erasing programmes. The University's Data Security Officer can help you with further questions about secure data disposal via the HelpJYU portal.
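
To illustrate what overwriting software does in principle, the sketch below replaces a file's contents with random bytes before deleting it. It is a teaching example only, not a replacement for the tools named above: on SSDs, on journaling or copy-on-write file systems, in cloud-synchronised folders and in backups, copies of the original data can survive an in-place overwrite. The function name `overwrite_and_delete` and its parameters are hypothetical.

```python
import os
from pathlib import Path


def overwrite_and_delete(path, passes=1, chunk_size=1024 * 1024):
    """Overwrite a file in place with random bytes, then delete it.

    Illustration only: this does not guarantee irretrievable destruction
    on every storage medium or file system.
    """
    path = Path(path)
    size = path.stat().st_size
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                handle.write(os.urandom(n))
                remaining -= n
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to push the bytes to disk
    path.unlink()
```

A plain delete only removes the directory entry and leaves the file's bytes on the disk, which is why the text above warns against relying on operating system tools.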

Long-term preservation

Long-term preservation means preserving especially valuable datasets for over 25 years. The national CSC service Fairdata-PAS preserves these datasets. In accordance with its data policy, JYU coordinates the preservation of nationally significant datasets in Fairdata-PAS. For more information, see below.