A case study from History

This best-practice model for documenting the folder structure of a qualitative research database comes from a case in Finnish History at JYU. Courtesy of Miia Kuha, PhD.

A 'Documentation' file placed in the root of the whole data directory:


THIS DOCUMENTATION.docx FILE WAS CREATED ON XXXX-XX-XX, CREATED BY: [Name]
<Example texts in brackets>

AVAILABILITY INFORMATION

1. Usage license and terms, possible usage restrictions:
2. Links to publications referring to the data:
3. If part of the data has been published open access online, links to the data:
4. How to cite the data:

DATA DESCRIPTION

5. GENERAL DESCRIPTION

<EXAMPLE: This dataset contains information about clergymen’s wives and widows in the diocese of Vyborg between 1650 and 1710. The dataset consists of a prosopographical database, “Clergymen’s Wives 1650–1710”. The database consists of .xlsx spreadsheets containing information about the wives of all pastors and chaplains within the Vyborg diocese during the period. Biographical information about each studied person, i.e., clergyman’s wife, has been collected in a separate folder for that person, provided that source material about her exists. These folders further contain:
- a text file containing biographical information about the person (in .docx format) and
- transcriptions of original sources, mainly court records and funeral sermons (in .docx format).>

6. DATA DIRECTORY DESCRIPTION

<EXAMPLE: The data is located in a folder titled XX in the personal Nextcloud directory of [data author]. Address: xxxx.
The main directory contains separate master folders for the data, administrative documents, and publications issued in the project. This DOCUMENTATION file is stored in the root of the main directory. The structure is as follows:
• Data
• Administration
• Publications
• DOCUMENTATION.

The Data folder contains the following subfolders:
• Tables_ClergymensWives
• Database_ClergymensWives>
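
A layout like the one described above can be checked automatically. The following is a minimal Python sketch, not part of the original case: the folder names are copied from the example, while the function name `missing_folders` and the temporary demo directory are illustrative assumptions.

```python
from pathlib import Path
import tempfile

# Folder names copied from the example layout above.
EXPECTED_TOP_LEVEL = ["Data", "Administration", "Publications"]
EXPECTED_DATA_SUBFOLDERS = ["Tables_ClergymensWives", "Database_ClergymensWives"]


def missing_folders(root: Path) -> list[str]:
    """Return the expected folders (relative to root) that do not exist."""
    expected = [Path(name) for name in EXPECTED_TOP_LEVEL]
    expected += [Path("Data") / name for name in EXPECTED_DATA_SUBFOLDERS]
    return [p.as_posix() for p in expected if not (root / p).is_dir()]


if __name__ == "__main__":
    # Demo on a throwaway directory; replace with the real data root.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "Data" / "Tables_ClergymensWives").mkdir(parents=True)
        (root / "Administration").mkdir()
        print(missing_folders(root))
```

Running such a check before sharing or archiving the data is one way to confirm that the directory still matches what the DOCUMENTATION file describes.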

7. DESCRIPTION OF SUBFOLDERS AND FILES

SUBFOLDERS AND FILES: <A list of all the folders OR, if the directory is small, of files + a short description of each, EXAMPLE:

Tables_ClergymensWives folder:

7.1 Base data:

ClergymensWives_all.xlsx

CONTENTS: The table contains the following information about 363 individuals: name, name and status of husband, information about becoming a widow, places of living, lifespan or estimate, “Other information”.

METHOD DESCRIPTION: The criteria for inclusion in the table are 1) certain knowledge of at least the first name and patronym or surname, based on previous research, and 2) a husband who served as a pastor or chaplain in the Vyborg diocese between 1650 and 1710 and who was married for at least part of that period. In my earlier project I compiled a database:

PastorsInVyborgDiocese_1650–1710.xlsx

which serves as a basis for creating this database.

Pastors were picked from the database in alphabetical order, and their spouses were added to this table in the same order. If no information about the spouse was available, I excluded the person from the table. I also excluded spouses who married the pastors only after 1710.>

7.2 Filtered data tables:

ClergymensWives_concise.xlsx

CONTENTS: The file contains information about 171 individuals.
METHOD DESCRIPTION: The filtering criterion was whether original sources about the individual were available.

ClergymensWives_concise_PastorsWives.xlsx

CONTENTS: This table contains information about 128 individuals who have been married to a pastor at least once. Wives of chaplains have been excluded.>
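
The two filtering steps described above can be illustrated with a short Python sketch. It is not the author's actual procedure: the record type `WifeRecord`, its field names, and the placeholder entries are hypothetical, and the real tables are .xlsx spreadsheets whose column layout is not given here. The sketch also assumes that the pastors' wives table is derived from the concise table.

```python
from dataclasses import dataclass, field


@dataclass
class WifeRecord:
    """Hypothetical row of the base table (field names are assumptions)."""
    name: str
    husband_statuses: list[str] = field(default_factory=list)
    has_original_sources: bool = False


def concise(records: list[WifeRecord]) -> list[WifeRecord]:
    """Keep only individuals for whom original sources are available."""
    return [r for r in records if r.has_original_sources]


def pastors_wives(records: list[WifeRecord]) -> list[WifeRecord]:
    """From the concise table, keep those married to a pastor at least once."""
    return [r for r in concise(records) if "pastor" in r.husband_statuses]


if __name__ == "__main__":
    # Placeholder records, not real data.
    base = [
        WifeRecord("Person A", ["pastor"], True),
        WifeRecord("Person B", ["chaplain"], True),
        WifeRecord("Person C", ["chaplain", "pastor"], False),
    ]
    print([r.name for r in concise(base)])
    print([r.name for r in pastors_wives(base)])
```

Recording the filter as code, or as an equally explicit written rule, lets a later user reproduce the filtered tables from the base table.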

SOURCES: <Add source listing>

8. Database_ClergymensWives folder

CONTENTS: In this folder, a separate subfolder has been created for each clergyman’s wife about whom biographical information is available. These individuals are listed in the ClergymensWives_concise.xlsx file.

NAMING CONVENTION: The folders are named according to the person whose information they contain. The folder of each clergyman’s wife contains a file called
Name_biography.docx.
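
A naming convention like this is easy to verify mechanically. The sketch below is a hedged illustration: it assumes that "Name" in the file name is identical to the folder name, which the original does not state explicitly, and the function names are hypothetical.

```python
from pathlib import Path
import tempfile


def biography_path(person_folder: Path) -> Path:
    """Expected biography file, assuming 'Name' equals the folder name."""
    return person_folder / f"{person_folder.name}_biography.docx"


def folders_missing_biography(database_root: Path) -> list[str]:
    """Names of person folders that lack their biography file."""
    return sorted(
        folder.name
        for folder in database_root.iterdir()
        if folder.is_dir() and not biography_path(folder).is_file()
    )


if __name__ == "__main__":
    # Demo on a throwaway directory with placeholder folder names.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "Person_A").mkdir()
        (root / "Person_A" / "Person_A_biography.docx").touch()
        (root / "Person_B").mkdir()
        print(folders_missing_biography(root))
```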

VARIABLES: Variables are organized in the following matrix:
name,
spouse,
career of spouse,
places of living,
parents,
children,
other relatives,
phase of investigation in this research,
sources.
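
To keep every biography file consistent with this list, the variables could be held in one place and used to generate an empty skeleton. This is a minimal sketch under that assumption: the variable names are copied from the list above, while the functions `empty_biography_record` and `render` are hypothetical helpers, not part of the original workflow.

```python
# Variable names copied from the list above, in the same order.
VARIABLES = [
    "name",
    "spouse",
    "career of spouse",
    "places of living",
    "parents",
    "children",
    "other relatives",
    "phase of investigation in this research",
    "sources",
]


def empty_biography_record() -> dict[str, str]:
    """Return a record with every variable present and left blank."""
    return {variable: "" for variable in VARIABLES}


def render(record: dict[str, str]) -> str:
    """Render a record as plain 'variable: value' lines."""
    return "\n".join(f"{key}: {value}" for key, value in record.items())


if __name__ == "__main__":
    print(render(empty_biography_record()))
```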

The file contains all surviving biographical information about the individual clergyman’s wife. In addition, the folder contains a transcription of each surviving source, with associated notes, in separate files. Each source document thus has its own file.

METHOD DESCRIPTION: Handwritten documents have been transcribed verbatim, preserving the original written form. Printed funeral sermons have been transcribed where necessary; the main section selected for transcription is the biography of the deceased at the end of the sermon. The Svenska Akademiens Ordbok (SAOB) database was used to interpret the text. Transcription files contain notes on the content and the transcription wherever these are needed to keep the data understandable.

SOURCES: <See the Sources section of each subfolder.>