Proper documentation increases the accessibility and usability of your research data for you and your research team as well as future users. The following are some best practices to follow when documenting your research data.
Keep file names short and descriptive, and use consistent conventions. Here are some general guidelines and examples to help:
DO: NBCFH_GrantProposal_20170228_v01-04.docx
DON'T: finaldraft1 or finalfinaldraft3
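A naming convention is easier to keep consistent when it can be checked automatically. The following is a minimal Python sketch, assuming the pattern Project_Description_YYYYMMDD_vNN[-NN].ext, which is inferred from the DO example above and is not an official rule. The names FILENAME_PATTERN and follows_convention are hypothetical.

```python
import re
from datetime import datetime

# Assumed convention, inferred from the DO example:
# Project_Description_YYYYMMDD_vNN[-NN].ext
FILENAME_PATTERN = re.compile(
    r"^(?P<project>[A-Za-z0-9]+)_"
    r"(?P<description>[A-Za-z0-9]+)_"
    r"(?P<date>\d{8})_"
    r"v(?P<version>\d{2}(?:-\d{2})?)"
    r"\.(?P<ext>[A-Za-z0-9]+)$"
)


def follows_convention(name: str) -> bool:
    """Return True if the file name matches the assumed pattern
    and its date part is a real calendar date."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        return False
    try:
        # Rejects impossible dates such as 20170230.
        datetime.strptime(match.group("date"), "%Y%m%d")
    except ValueError:
        return False
    return True
```

With this sketch, follows_convention("NBCFH_GrantProposal_20170228_v01-04.docx") is accepted, while "finaldraft1" is rejected. A research team would adjust the pattern to its own convention.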
Any file format can be uploaded to the Scholars Portal Dataverse. However, to ensure the longevity, accessibility, and usability of your data, open and non-proprietary file formats are recommended.
File Type | Preferred Formats |
Databases | XML, CSV |
Container and Compressed files | ZIP*, TAR, GZIP |
Images | TIFF, PNG, JPG |
Sound | BWF, AIFF, FLAC, MP3 |
Text | TXT, CSV, PDF/A, ASCII, EPUB |
Video | AVI (uncompressed), MOV (uncompressed), MPEG-4 |
Spreadsheets | CSV |
Medical Images | DICOM |
Geospatial | ESRI, SHP, GeoTiff, DBF |
Statistical analysis | SPSS (.por), R, STATA |
For more file format guidance, please contact the Data Services Librarian.
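Before uploading, a folder of files can be screened against the preferred formats in the table above. The following is a rough Python sketch. The mapping from formats to file extensions is an assumption made for illustration and covers only part of the table. An extension alone cannot confirm that a file is, for example, a valid PDF/A or an uncompressed AVI, so such formats are left out. The names PREFERRED_EXTENSIONS, is_preferred_format, and flag_non_preferred are hypothetical.

```python
from pathlib import Path

# Illustrative, partial mapping of the table's preferred formats to
# common file extensions (an assumption, not part of the guidance).
PREFERRED_EXTENSIONS = {
    ".xml", ".csv",                      # databases, spreadsheets
    ".zip", ".tar", ".gz",               # container and compressed files
    ".tif", ".tiff", ".png", ".jpg",     # images
    ".flac", ".aiff", ".mp3",            # sound
    ".txt", ".epub",                     # text
    ".dcm",                              # medical images (DICOM)
    ".shp", ".dbf",                      # geospatial
}


def is_preferred_format(filename: str) -> bool:
    """Check the final extension, ignoring case."""
    return Path(filename).suffix.lower() in PREFERRED_EXTENSIONS


def flag_non_preferred(filenames):
    """Return the file names whose extension is not in the set."""
    return [name for name in filenames if not is_preferred_format(name)]
```

Files flagged this way are not necessarily unsuitable; they are candidates for conversion or for a question to the Data Services Librarian.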
Metadata describes data as a label describes the contents of a container. A label is not strictly necessary, but it makes the contents of the container identifiable and discoverable. Metadata does the same for your data and also makes it citable and reusable.
Basic required and recommended fields:
The above list is based on "A Brief Guide: Dataverse Metadata", produced by the Metadata Subgroup of the Portage Dataverse North Working Group.
Some disciplines have specific metadata standards and schemas. Browse the disciplinary metadata standards listed by the Digital Curation Centre (DCC) to find the metadata standard and controlled vocabularies that best suit your research.
In addition to metadata, ReadMe files allow you to further document and describe your dataset for future users. ReadMe files are usually plain-text files, which prolongs the life of the file and keeps it accessible. There is no standard for ReadMe files, but they should include the above metadata along with:
Find more information on ReadMe files in the Guide to writing "readme" style metadata by the Research Data Management Service Group at Cornell University.
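Because a ReadMe file is plain text, it can be generated from the same information entered as metadata. The following is a minimal Python sketch, assuming the caller supplies the section headings and their contents. It does not prescribe which sections a ReadMe should contain, and the names render_readme and write_readme are hypothetical.

```python
from pathlib import Path


def render_readme(title: str, sections: dict[str, str]) -> str:
    """Build plain-text ReadMe content from a title and an ordered
    mapping of section heading -> section text."""
    lines = [title, "=" * len(title), ""]
    for heading, body in sections.items():
        lines.append(f"{heading}:")
        lines.append(body.strip())
        lines.append("")
    return "\n".join(lines)


def write_readme(path: str, text: str) -> None:
    """Save the ReadMe as a UTF-8 text file."""
    Path(path).write_text(text, encoding="utf-8")
```

For example, render_readme("Example dataset", {"Contact": "name@example.org"}) returns text that can be saved with write_readme("README.txt", text). The title and contact shown here are placeholders.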
The FAIR Principles are concise and measurable guidelines to ensure that research (meta)data are findable, accessible, interoperable, and reusable. Since their introduction, the FAIR Principles have become a standard to evaluate research data management tools and services and have been widely adopted by funders, publishers, and service providers (Wilkinson et al., 2018).
F1. (meta)data are assigned a globally unique and persistent identifier (e.g. DOI)
F2. data are described with rich metadata (defined by R1 below)
F3. metadata clearly and explicitly include the identifier of the data it describes
F4. (meta)data are registered or indexed in a searchable resource
A1. (meta)data are retrievable by their identifier using a standardized communications protocol
A1.1 the protocol is open, free, and universally implementable
A1.2 the protocol allows for an authentication and authorization procedure, where necessary
A2. metadata are accessible, even when the data are no longer available
I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
I2. (meta)data use vocabularies that follow FAIR principles
I3. (meta)data include qualified references to other (meta)data
R1. (meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (meta)data are released with a clear and accessible data usage license
R1.2. (meta)data are associated with detailed provenance
R1.3. (meta)data meet domain-relevant community standards
Source: Wilkinson, M. D., Dumontier, M., Aalbersberg, Ij. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Finkers, R., … Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3(1), 160018. https://doi.org/10.1038/sdata.2016.18
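A few of the principles listed above (F1, A1, R1.1, R1.2) can be screened mechanically in a metadata record. The following is a rough Python sketch under stated assumptions: the record is a plain dictionary, the keys "identifier", "license", and "provenance" are hypothetical, and the DOI pattern is a common heuristic, not a full validation. It is not a complete FAIR assessment.

```python
import re

# Heuristic shape of a DOI: "10.", a registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_url(doi: str) -> str:
    """Build the HTTPS resolver URL for a DOI (relates to A1)."""
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI-shaped identifier: {doi!r}")
    return f"https://doi.org/{doi}"


def fair_gaps(record: dict) -> list[str]:
    """List the checked principles that the record does not satisfy.
    The record keys are hypothetical field names."""
    gaps = []
    if not DOI_PATTERN.match(record.get("identifier", "")):
        gaps.append("F1: no persistent identifier (DOI) found")
    if not record.get("license"):
        gaps.append("R1.1: no data usage license stated")
    if not record.get("provenance"):
        gaps.append("R1.2: no provenance information given")
    return gaps
```

For the article cited above, doi_url("10.1038/sdata.2016.18") gives https://doi.org/10.1038/sdata.2016.18. An empty list from fair_gaps means only that these three fields are present, not that the dataset is fully FAIR.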