Adding and reusing metadata

Metadata is “data about data”. Metadata serves multiple purposes in Yoda, the most important being:

To describe the contents of a dataset for a broad audience.
To inform the audience whether the data can be reused and if so, under what conditions.
To prescribe how the data should be cited and whom to acknowledge.
To inform digital archivists and IT staff about how long the data should be retained.
To facilitate finding the dataset in data catalogues.

We distinguish two types of metadata:

Structured metadata consists of information that is standardized globally and used by data catalogues. Examples are the name of the data package, its creator, the retention period of the package, etc.

When a data package is published, Yoda makes the structured metadata available for harvesting by data catalogues.

Unstructured metadata is intended to provide more detailed information about the data. This information can be in a README.TXT or other file that is included as part of the data package. The format of this file is chosen by the researcher. Users will need to open and inspect the data package to find this metadata. Unstructured metadata can include information about (for example) the experimental design, data transformation, sampling method, etc.

Adding metadata in Yoda

Yoda facilitates adding both structured and unstructured metadata to your research data. Entering structured metadata is a prerequisite for archiving a data package. If a folder is published, its structured metadata will be published as well and can be harvested by data catalogues such as DANS NARCIS and DataCite.

To add structured metadata to a folder, navigate to the folder in the Yoda portal and press the “Metadata” button.

Once you have added metadata and clicked the “Save” button, the metadata is stored in the folder in a file named “yoda-metadata.json”.

Unstructured metadata can be added as a file to the dataset, for example in a “Readme.txt” or “Codebook.pdf” file.
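
As an illustration, a README skeleton for unstructured metadata could be generated with a small script. This is only a sketch: the section headings are taken from the examples mentioned earlier on this page (experimental design, sampling method, data transformation), and the function name `write_readme` and the layout of the file are assumptions, not a Yoda requirement.

```python
import tempfile
from pathlib import Path

# Section headings borrowed from the examples of unstructured metadata
# given on this page; adjust them to your own research.
DEFAULT_SECTIONS = ["Experimental design", "Sampling method", "Data transformation"]


def write_readme(folder: Path, title: str, sections=DEFAULT_SECTIONS) -> Path:
    """Write a plain-text Readme.txt skeleton into `folder` and return its path."""
    lines = [title, "=" * len(title), ""]
    for section in sections:
        # Each section gets an underlined heading and a placeholder to fill in.
        lines += [section, "-" * len(section), "TODO: describe.", ""]
    path = folder / "Readme.txt"
    path.write_text("\n".join(lines), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Demonstration in a temporary folder, so no real data is touched.
    with tempfile.TemporaryDirectory() as tmp:
        print(write_readme(Path(tmp), "My dataset").read_text(encoding="utf-8"))
```

The resulting file can be uploaded to the folder together with the data, so that it becomes part of the data package.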

The metadata form

By default, the Yoda metadata form consists of approximately 30 fields. Please consult the metadata element list below for a detailed description of the elements.

All mandatory fields are marked with an asterisk.

Some metadata elements consist of multiple fields. For example, if you enter a person identifier, you should also specify the type of identifier.

Some fields can have multiple values. In order to add a value, press the “+” sign next to the field.

Reusing metadata

Structured metadata is reusable. The metadata form includes a button “Clone from parent folder”. One way to use this feature is to create a project-level folder with several subfolders for data. Common metadata elements for the project can be entered in the project-level folder. This metadata can then be copied to the data folders.

You can also copy the “yoda-metadata.json” file of a folder to another folder in order to copy its metadata.
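
Copying the file by hand works for a single folder. For several data folders, the same idea can be scripted. The sketch below assumes the folders are reachable as ordinary directories (for example through a mounted network drive) and that “yoda-metadata.json” contains a single JSON object. The function name `clone_metadata` and the keys used in the demonstration are hypothetical; the real keys depend on the metadata schema in use.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

METADATA_FILENAME = "yoda-metadata.json"  # file name as given on this page


def clone_metadata(parent: Path, child: Path, overrides: Optional[dict] = None) -> dict:
    """Copy the parent's metadata file into `child`, replacing selected top-level values."""
    metadata = json.loads((parent / METADATA_FILENAME).read_text(encoding="utf-8"))
    # Values that differ per data folder (e.g. the title) replace the project defaults.
    metadata.update(overrides or {})
    child.mkdir(parents=True, exist_ok=True)
    (child / METADATA_FILENAME).write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return metadata


if __name__ == "__main__":
    # Demonstration with temporary folders and hypothetical keys.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"
        project.mkdir()
        (project / METADATA_FILENAME).write_text(
            json.dumps({"Title": "Project", "Retention_Period": 10}), encoding="utf-8"
        )
        for name in ["data-1", "data-2"]:
            print(clone_metadata(project, project / name, {"Title": f"Project: {name}"}))
```

Metadata copied this way should still be checked in the metadata form of each folder, since the form marks missing mandatory fields.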

Properties and explanations

M Mandatory
R Recommended for optimal findability
O Optional

No	Property	Obligation	Explanation
1	Title	M	A descriptive title for your dataset, no longer than about 200 characters.
2	Description	M	Describe your dataset, e.g. the subject, the sample size, methodology, etc. It is best to keep this description concise. More elaborate documentation should be added in a text file called README.
3	Discipline	R	The (sub)discipline of the study
4	Version	O	Version number of your dataset. Useful if you need to publish an updated version of your dataset later.
5	Language of the data	O	The primary language of your dataset.
6a	Collection Process - Start Date	R	Indicate when you started collecting the data for this dataset.
6b	Collection Process - End Date	R	Indicate when you finished collecting the data for this dataset.
7	Location(s) covered	R	If your data is linked to particular locations, provide the place names.
8a	Period Covered - Start period	O	An indication of the start date of the period covered by your dataset
8b	Period Covered - End Period	O	An indication of the end date of the period covered by your dataset
9	Tags	R	Free text field for adding (searchable) keywords to your dataset
10a	Related Data package	R	The way in which the present data package is related to another data package.
10b	Related Data package - Title	R if 10a	Title of the data package related to the present data package.
10d	Related Data package - Type	R if 10e	The type of the persistent identifier of the related data package.
10e	Related Data package - Identifier	R if 10d	The persistent identifier of the related data package.
11	Retention Period	M	The minimal number of years the data will be kept in the archive. The default value is 10 years.
12	Retention Information	O	To be used for remarks about the retention period.
13	Embargo end date	O	If the dataset is under embargo, the date on which the embargo ends.
14	Data type	M	Please indicate the type of the data.
15	Data Classification	M	Please indicate the classification of the data. Translation to VU Classifications: Public=Low, Basic=Medium, Sensitive=High, Critical=Very High
16	Name of Collection	O	If this data package is part of a larger (conceptual) collection of data packages, you can enter the collection name here.
17a	Funder	O	The name(s) of the organization(s) funding the research. If using this property also add the Award Number.
17b	Award Number	R if 17a	The grant number issued by the funding organization
19	Remarks	O	Remarks from the data manager
20a	Creator of Data package - Name	M	The main researchers involved in producing the data, in priority order.
20b	Creator of Data package – Affiliation	M	The organizational or institutional affiliation of the creator
20c	Creator of Data package – Persistent Identifier: Type	M if 20d	Please indicate the type of persistent person identifier.
20d	Creator of Data package – Persistent Identifier: Identifier	R	The Persistent Identifier.
21	Contributor to Data Package - Name	R	The institution or person responsible for collecting, managing, distributing, or otherwise contributing to the development of the resource. For software, if there is an alternate entity that “holds, archives, publishes, prints, distributes, releases, issues, or produces” the code, use the contributor Type “hostingInstitution” for the code repository.
21a	Contributor to Data Package - Type	M if 21	Enter what type of contribution the registered person has had to this data package.
21b	Contributor to Data Package - Affiliation	M if 21	The organizational or institutional affiliation of the contributor.
21c	Contributor to Data Package - Persistent Identifier: Type	M if 21d	Please indicate the type of persistent person identifier.
21d	Contributor to Data Package - Persistent Identifier: Identifier	R	The unique person identifier
25	Licence	M	The licence under which you offer the data package for use by third parties. The preferred value for open data is CC BY 4.0.
26	Data Package Access	M	Once archived, should your dataset be accessible to third parties?
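
The obligations in the table can also be checked programmatically before metadata is entered or copied in bulk. The following is a minimal sketch, assuming the metadata is available as a flat Python dictionary. The key names (`title`, `creator_name`, and so on) are hypothetical stand-ins for the properties in the table and are not the keys Yoda uses internally. Multi-valued properties such as creators are simplified to a single value.

```python
from typing import Any, Dict, List

# Properties marked "M" in the table (hypothetical key names).
MANDATORY = [
    "title", "description", "retention_period", "data_type",
    "data_classification", "creator_name", "creator_affiliation",
    "licence", "data_package_access",
]

# Properties marked "M if ..." in the table: when the key on the left
# is filled in, the keys on the right become mandatory as well.
CONDITIONAL = {
    "creator_identifier": ["creator_identifier_type"],          # 20c: M if 20d
    "contributor_name": ["contributor_type", "contributor_affiliation"],  # 21a, 21b: M if 21
    "contributor_identifier": ["contributor_identifier_type"],  # 21c: M if 21d
}


def is_filled(value: Any) -> bool:
    """Treat None and empty strings, lists and dicts as 'not filled in'."""
    return value not in (None, "", [], {})


def missing_fields(metadata: Dict[str, Any]) -> List[str]:
    """Return the mandatory keys that are absent or empty in `metadata`."""
    missing = [key for key in MANDATORY if not is_filled(metadata.get(key))]
    for trigger, required in CONDITIONAL.items():
        if is_filled(metadata.get(trigger)):
            missing += [key for key in required if not is_filled(metadata.get(key))]
    return missing


if __name__ == "__main__":
    record = {key: "example" for key in MANDATORY}
    record["retention_period"] = 10          # default value from the table
    record["contributor_name"] = "A. Example"  # triggers the "M if 21" rules
    print(missing_fields(record))
```

In this demonstration the contributor name is filled in without a type or affiliation, so those two keys are reported as missing.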
