The online Service Agreement (SA) is a form to capture the details of your deposit with the EIDC which sets out the curation services you can expect to receive from us (e.g. DOI, embargo, etc.) and the files we can expect to receive from you. Information included with it (title, authors, description etc.) will be used to populate a metadata record for a dataset (see an example).
This guide is in three parts:
How to access the service agreement
To access the SA, click the link in the email you have been sent. When you first open the service agreement, it will be incomplete and so will report a number of errors (see screenshot below).
You add more information by clicking the 'edit' button in the top right.
Completing the form
Please note: There is currently no autosave feature. However, if you try to exit the form with unsaved changes you will be prompted to save. Please remember to save any progress before closing the window.
The first four fields (deposit reference, depositor name, depositor's email and EIDC contact name) will already be completed - they should not be changed.
The Depositor's email address is used to grant access permission and should be the email address that is registered on your account with us.
You can navigate through the form by clicking on the tabs on the left hand side, or clicking the back and next arrows in the bottom left corner.
Fields marked with an asterisk (*) are mandatory and must be completed before submission.
Other fields are optional or conditional and may not be relevant for all deposits.
Next to some questions (e.g. 'authors'), there is an 'Add' button, which if clicked allows you to add multiple entries.
Service Agreement content
The following sections detail the content that should be included in each section of the SA (mandatory fields are indicated by *).
Title *
This will be the title that will be used to publish the data in the EIDC data catalogue.
The title should describe the data itself, not the project or activity from which the data were derived. It should be brief but descriptive and distinctive enough to distinguish it from other, possibly similar, titles. Additional guidance is available here.
Author(s) *
Data are a first-class research output. As such, the author list should include anyone who contributed substantially to the data collection, processing, and analysis. The data author list may not necessarily be the same as a related journal publication. A person who made a minimal contribution to the data, or who contributed only to a paper that used or analysed the data, should not be listed as an author. (See https://doi.org/10.1073/pnas.1715374115 for more information).
The person depositing the data resource should be an author of the dataset. This is because the depositor must have sufficient knowledge of the data that they are able to describe the data accurately in the supporting documentation and respond to any queries that are raised during the quality checks we perform on the data and supporting documentation (Resource Acceptance Checks). It is sometimes acceptable for resources to be deposited by an individual who is not an author, although this is not encouraged. In these instances, confirmation must be provided by the lead author that the depositor is sufficiently familiar with the data to deposit and to resolve any queries arising.
Please note:
- The authors should be listed in the order in which they will appear in the citation
- The affiliation provided for an author should be the affiliation the author was associated with at the time the resource was produced. If more than one affiliation is provided, we will only use the first affiliation listed. If the author was unaffiliated, please state this rather than leaving the affiliation blank.
- Authors' names must be in the format: Family name comma Initial(s). For example, Smith, K.P. not Kim P. Smith
- Because of privacy laws, living authors MUST have a valid email address so they can be informed of the publication
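The name format described above can be checked before the form is filled in. The following is a minimal sketch, not part of the EIDC system: the regular expression and the function name `is_valid_author_name` are assumptions, and the pattern is one reasonable reading of "Family name comma Initial(s)".

```python
import re

# Assumed pattern: a family name (which may include spaces, hyphens or
# apostrophes), a comma, then one or more single-letter initials that are
# each followed by a full stop.
AUTHOR_PATTERN = re.compile(r"[^\W\d_][^,]*,\s*(?:[^\W\d_]\.\s?)+")


def is_valid_author_name(name: str) -> bool:
    """Return True if the name looks like 'Family name, Initial(s)'."""
    return AUTHOR_PATTERN.fullmatch(name.strip()) is not None


print(is_valid_author_name("Smith, K.P."))   # True
print(is_valid_author_name("Kim P. Smith"))  # False
```

A check like this only catches formatting slips; it cannot tell whether the order of authors or the spelling of a name is correct.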
Data retention
Data issued with a DOI will be kept indefinitely. Otherwise, the minimum period of guaranteed curation is ten years, after which it will be periodically reviewed and may be disposed of. Ten years is the minimum period required by the NERC Data Policy for datasets supporting publications. Please record here if a DOI is NOT required and/or any other information you would like us to consider regarding retention of the data resource.
Number of files to be deposited *
Specify the total number of data files being deposited.
Although it may seem trivial to ask for this information when there are only a handful of files, if there are a large number of files to be deposited it is useful for us to be able to quickly confirm that the correct number of files have been transferred.
Files
If there are a handful of files, the names of the files being provided should be specified individually in this section. If there are a large number of files, a naming convention can be specified instead (see below).
File names must not include any spaces or non-standard characters to prevent any errors occurring with file-handling. It is strongly recommended that only the following characters are used in filenames:
- Lowercase roman alphabet (a-z)
- Numerals (0-9)
- Underscore (_)
- Hyphen (-)
File names should be concise and ideally should reflect the content. File names for multiple, related files should be consistent and use a relevant naming convention. See Preparing your datasets for deposit for further guidance.
We make no distinction between UPPERCASE and lowercase characters, so the names DataFile1, dataFILE1 and datafile1 should be considered to be identical. However, we recommend that only lowercase letters are used. Indeed, when uploaded, filenames may be automatically converted to lowercase.
The file extension and size should also be provided.
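The naming recommendations in this section can be checked with a short script before deposit. This is a sketch under stated assumptions: the function name `check_filenames` is hypothetical, and the pattern assumes each name is a stem made of the recommended characters followed by a single dot and an extension (the dot itself is not in the recommended list above, but is needed for the extension).

```python
import re
from collections import defaultdict

# Assumption: stem of a-z, 0-9, underscore or hyphen, then one dot and an extension.
RECOMMENDED = re.compile(r"[a-z0-9_-]+\.[a-z0-9]+")


def check_filenames(names):
    """Return (names outside the recommended pattern,
    groups of names that become identical when case is ignored)."""
    not_recommended = [n for n in names if not RECOMMENDED.fullmatch(n)]

    by_lower = defaultdict(list)
    for n in names:
        by_lower[n.lower()].append(n)
    clashes = [group for group in by_lower.values() if len(group) > 1]

    return not_recommended, clashes


bad, clashes = check_filenames(
    ["site_01.csv", "Site 01.csv", "DataFile1.csv", "datafile1.csv"]
)
print(bad)      # ['Site 01.csv', 'DataFile1.csv']
print(clashes)  # [['DataFile1.csv', 'datafile1.csv']]
```

The clash check matters because names that differ only in case are treated as the same file, so one could overwrite the other on upload.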
Naming convention
If there are a large number of files to be deposited the naming convention(s) used to name the files should be described here.
The convention(s) should follow the requirements for naming files specified in the section above. The size (or a size range) for files covered by the convention(s) should be provided. Also include the total size of the files covered by the naming convention.
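When many files are covered by a naming convention, the file count, size range and total size asked for above can be gathered automatically. The sketch below is illustrative only; the function name `summarise_files` and the idea that all files for deposit sit in one local folder are assumptions.

```python
from pathlib import Path


def summarise_files(folder):
    """Count the files under `folder` (including sub-folders) and report
    the smallest, largest and total size in bytes."""
    sizes = [p.stat().st_size for p in Path(folder).rglob("*") if p.is_file()]
    if not sizes:
        return {"count": 0, "smallest_bytes": 0, "largest_bytes": 0, "total_bytes": 0}
    return {
        "count": len(sizes),
        "smallest_bytes": min(sizes),
        "largest_bytes": max(sizes),
        "total_bytes": sum(sizes),
    }
```

Calling `summarise_files("path/to/deposit")` on the folder holding the files (the path here is a placeholder) gives the figures to enter in the "Number of files to be deposited" and "Naming convention" sections.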
Format of files
Datasets must be provided in a suitable format to ensure they may be retrieved and re-used over the long-term. Non-proprietary formats are required. For example, we recommend MS Excel files are transformed into Comma-Separated Values (.csv) files for ingestion. Wherever possible, please ensure the files for deposit will be provided in one of our preferred formats.
Transfer method *
In most cases, datasets and supporting documentation will be transferred using the file uploader on our data catalogue. Details on how to do this will be provided by a member of our team.
If the data volume is large (approximate total size is bigger than 5 GB or any individual file is over 2 GB), the data can be transferred using an alternative method:
- File sharing services such as OneDrive, Dropbox.com, Google Drive or ftp. If this option is chosen, the depositor is responsible for arranging the EIDC's access to the service (see guidance).
- UKCEH depositors can also place the data on a shared location on the network. If they use this method, then the depositor is responsible for arranging permissions with the IT team and must identify the exact location of the data and provide a file path in the service agreement.
- We can also accept data on CD-ROM, USB or external hard drive. However, the EIDC will accept no responsibility for loss or damage to devices or data in transit.
If the EIDC uploader is not used to transfer the data, the depositor must send an email to eidc@ceh.ac.uk to notify us that the dataset has been sent. The email should include the title of the data resource in the subject line and the deposit reference in the email body.
Data and supporting documentation transferred at this stage must be the final definitive versions - any checks on suitability of either the data itself or the supporting documentation should have been carried out prior to deposit.
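The size thresholds given earlier in this section (more than 5 GB in total, or any single file over 2 GB) can be tested against a list of file sizes. This is a minimal sketch: the function name `exceeds_upload_thresholds` is hypothetical, and it assumes 1 GB means 10^9 bytes, which the guidance does not specify.

```python
GB = 10**9  # assumption: decimal gigabytes


def exceeds_upload_thresholds(file_sizes_bytes, total_limit=5 * GB, file_limit=2 * GB):
    """Return True if the deposit is large enough that an alternative
    transfer method may be appropriate."""
    sizes = list(file_sizes_bytes)
    return sum(sizes) > total_limit or any(size > file_limit for size in sizes)


print(exceeds_upload_thresholds([1 * GB, 1 * GB, 1 * GB]))  # False
print(exceeds_upload_thresholds([3 * GB]))                  # True
```

A result of True only suggests that an alternative method is worth discussing; the transfer method itself is agreed with the EIDC team.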
Data category *
Where data are wholly or partly funded by NERC, the data are categorised under NERC Data Policy as either "Environmental Data" or as an "Information Product". These terms are defined in the NERC Data Policy.
Environmental data are defined as individual items or records (both digital and analogue) usually obtained by measurement, observation or modelling of the natural world and the impact of humans upon it, including all necessary calibration and quality control. This includes data generated through complex systems, such as information retrieval algorithms, data assimilation techniques and the application of numerical models. A few examples are:
- Model output from running a numerical climate model
- Time series logged by environmental instrumentation
- Conductivity-Temperature-Depth casts from oceanographic cruises
- Groundwater chemistry and stable isotope measurements
- Butterfly abundance observations
Information Products are created by adding a level of intellectual input that refines or adds value to data through interpretation and/or combination with other data. They result from analysis or repackaging of data in such a way that has provided significant added value (intellectual or commercial), e.g. tidal predictions or Land Cover maps.
For more information, see NERC Data Policy - Guidance Notes.
Supporting documents *
Supporting documentation is required to assist with re-usability of the data.
There are a number of mandatory elements for supporting documents - they could be included in one document, or may span several documents such as in the example pictured below:
In this example the first three mandatory elements, "Collection/generation methods", "Nature and Units of recorded values", and "Details of data structure", are indicated to be included in the file dataStructure.docx. This file also includes the conditional element "Fieldwork and laboratory instrumentation". The final mandatory element, "Quality control", is included in the file dataQuality.docx. This way all four mandatory elements have been ticked.
Include the file name (no spaces) and indicate which areas are covered by each document.
Please note - some elements of metadata are conditional but they are required if appropriate to the data resource being deposited. For example, a dataset containing experimental data must include details of the Experimental design.
Our preferred format for supporting documents is .docx, but we will accept .txt or .csv.
See Guidance - Supporting Documentation for more information on the individual elements.
End user licence *
An appropriate licence must be identified. By default data are licensed under the Open Government Licence (OGL) unless:
- It is an information product rather than environmental data (as defined in NERC data policy)
- There are 3rd party data and/or 3rd party contractual obligations which preclude OGL licensing
- The depositor requests to use CC BY 4.0. This licence is broadly the same as OGL and can be used where there are no Crown Copyright issues
(Note that some 3rd parties may also be publicly funded, and therefore operating OGL themselves, so the presence of 3rd party IPR does not always imply special licensing requirements.)
It is the responsibility of the depositor to make sure that an appropriate licence is identified.
UKCEH depositors should contact the UKCEH Data Licensing Team if any assistance is required.
Owner of IPR *
For data funded by a grant, Intellectual Property Rights (IPR) normally rests with the PI/Grant holder's employer. However, this depends on the terms under which the grant was awarded. Depositors should check this if they are unsure.
If no institution or individual is identified, we will assume IPR rests with the funder.
Availability of data resource
An embargo period can be specified here if required.
Provide a date by which the data should be made accessible to the public. If no date is specified, we will make the data available as soon as possible.
Note: For NERC-funded data resources, an embargo of up to two years is typically permitted. However, this is two years from the date of creation (e.g. the point at which the data are collected in the field, or the point at which the data have been generated by experiment or analysis), NOT from the end of the project or the point at which the data were submitted to the data centre.
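Because the two years run from the date of creation, the latest typical release date can be worked out directly from that date. The sketch below is illustrative; the function name `latest_embargo_end` is hypothetical, and the handling of 29 February (falling back to 28 February) is an assumption rather than EIDC or NERC policy.

```python
from datetime import date


def latest_embargo_end(created: date, years: int = 2) -> date:
    """Return the date `years` after the data were created."""
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year;
        # fall back to 28 February (an assumption).
        return created.replace(year=created.year + years, day=28)


print(latest_embargo_end(date(2023, 6, 15)))  # 2025-06-15
```

For data collected over a period of time, which creation date to use is a matter to confirm with the EIDC.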
Depositors should be aware that Freedom of Information (FOI) or Environmental Information Regulations (EIR) requests for a dataset can override any embargo periods and if so, the dataset must be made available within a specified timeframe, unless valid reasons can be given. Further information is available in sections 3.(a) and (b), and 4.(c) and (f) of the NERC Data Policy - Guidance Notes.
Additional use constraints
Add any additional constraints on use of the data if required. For example, a copyright statement.
Other policies/legislation
If there are any other specific policies (other than NERC Data Policy and UK Data Protection laws) that the EIDC should be made aware of, please include them here, e.g. INSPIRE, other funders' data policies, etc.
Please note, if you identify your dataset as falling under the EU INSPIRE directive for spatial data, then you will need to provide the data in a format that conforms to the appropriate INSPIRE standard.
Grants/awards used to generate this resource
Add the details of any grants or other awards that were used to fund generation of the data resource being deposited.
Superseding existing data
This section should only be completed if you wish the dataset being deposited to replace a dataset that the EIDC already holds. For example, if it is replacing a dataset found to contain errors, or is extending an existing dataset spatially and/or temporally.
If the dataset is replacing one in which errors have been found, you should provide a full description of the errors.
If additional data are being deposited to extend the coverage of an existing dataset, a description of the nature of the extension is required. For example, "This new dataset has an additional five years of data".
In both cases, please provide the identifier (e.g. 2e3bec6e-1e62-42d5-a221-016d0ad447d9) of the dataset you wish to replace - this can be found in the catalogue record for the dataset in the EIDC catalogue.
Related data holdings
If you'd like to link the dataset to other resources (datasets/model code) hosted by the EIDC, they can be listed here - include the identifier or web address for the related data resources.
The types of relationships we can accommodate are listed here.
If you are depositing several related datasets you may want to also request a new 'collection' metadata record to link them together (see this example). In this case we will ask for some additional descriptive metadata to create the collection record.
Other info
You can include any other information not captured elsewhere here. For example, associated publications, websites, etc.
ISO 19115 topic categories *
For compliance with the UK's data discovery metadata standard (GEMINI) you should include at least one topic category (although multiple are permitted). The topics are very broad, so please select the one(s) that most closely align with the data resource being deposited.
Science topic *
These are broad science topics used to enhance discovery by populating the data catalogue's search filters. Multiple topics can be added but you must add at least one.
Other keywords *
Any other keywords that usefully describe the data and help others to find it can be entered here. You can add free-text tags or, if you use terms from controlled vocabularies, you can add their URL/URI.
Discovery metadata
Every data resource deposited with the EIDC will have a record entry in the data centre's data catalogue. The following metadata elements must be provided for all data resources for population of the catalogue record:
Description
The description is an 'executive summary' that allows the reader to determine the relevance and usefulness of the data. The text should be concise but should contain sufficient detail to allow the reader to easily determine the dataset's scope and limitations. See guidance for more information.
Lineage
The lineage should describe how the data came into existence and the stages it has passed through before arriving at the data centre (e.g. quality control processes). See guidance for more information.
Area of Study
The area of study is defined by a simple bounding box: the smallest rectangular shape which totally encloses the locations of all of the referenced data.
To add a bounding box, click on the Add button. You are given the option to choose a pre-defined option such as, "UK" or "Scotland" or "Wales". If the extent of your data is not listed, choose "Custom" - this will allow you to define your own extent.
Please note: a map should appear when you add an area, but a bug currently prevents it from showing automatically. Clicking the globe button will display the map.
If you choose custom, you can either enter the lat-long coordinates manually, or draw a box by clicking on the square symbol (highlighted below) and dragging/resizing the box that appears on the map.
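For a custom extent entered manually, the bounding box coordinates can be derived from the sampling locations themselves. This is a minimal sketch: the function name `bounding_box` and the west/south/east/north keys are assumptions, and it assumes decimal-degree coordinates that do not cross the 180° meridian.

```python
def bounding_box(points):
    """Return the smallest lat-long rectangle enclosing all points.

    `points` is an iterable of (latitude, longitude) pairs in decimal degrees.
    """
    pts = list(points)
    if not pts:
        raise ValueError("at least one point is required")
    lats = [lat for lat, _ in pts]
    lons = [lon for _, lon in pts]
    return {
        "west": min(lons),
        "south": min(lats),
        "east": max(lons),
        "north": max(lats),
    }


sites = [(55.95, -3.19), (51.50, -0.12), (53.40, -2.98)]
print(bounding_box(sites))
# {'west': -3.19, 'south': 51.5, 'east': -0.12, 'north': 55.95}
```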
Finishing & submitting the form
Validation errors
When you save the form, if there is any missing or incorrect information you will see an error report. Errors must be corrected before submission.
The report will also contain warnings/points of information. These may indicate that information is missing and they should be checked. However, it is not always necessary to correct them.
In the example below there are three errors that need to be addressed and an information point to say that you should consider if an area of study is relevant to your data.
Once you are happy with all your answers, save the service agreement and hit the Submit button.
The member of the EIDC team handling your deposit will be notified and the status will change to 'Submitted' (see below).
Status
The status of the document cycles through the statuses 'Draft', 'Submitted', 'Under Review', 'Ready for Agreement' and 'Agreed'. Depending on the status different options will be available to you, although you should be able to view the summary page for the document at all times.
Status | Meaning |
Draft | The Service Agreement is being edited. |
Submitted | The SA has been submitted and the content is being reviewed by the EIDC. |
Under Review | The SA has been passed to a second member of the EIDC team for approval. |
Ready for Agreement | The SA has been approved by the EIDC and has been returned to the depositor for final agreement. The depositor can either agree the SA or click 'further edits required', which returns it to 'Draft' status. |
Agreed | The SA has been agreed by both the EIDC and the depositor. |
For example, when the SA is in the 'Draft' status, the depositor has the option to click the 'Submit' button to move the status to 'Submitted'. Once submitted, the depositor cannot make any changes.
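The status cycle described above can be pictured as a small table of allowed moves. The sketch below is purely illustrative and is not the EIDC's implementation: the names `TRANSITIONS` and `can_move` are hypothetical, and it assumes the only moves are those stated in the table, with 'Ready for Agreement' being the one status that can return to 'Draft'.

```python
# Assumed transitions, read from the status table in this guide.
TRANSITIONS = {
    "Draft": {"Submitted"},
    "Submitted": {"Under Review"},
    "Under Review": {"Ready for Agreement"},
    "Ready for Agreement": {"Agreed", "Draft"},
    "Agreed": set(),
}


def can_move(current: str, target: str) -> bool:
    """Return True if the SA can move from `current` to `target`."""
    return target in TRANSITIONS.get(current, set())


print(can_move("Draft", "Submitted"))            # True
print(can_move("Submitted", "Draft"))            # False
print(can_move("Ready for Agreement", "Draft"))  # True
```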
History
The History shows the page view each time the SA has been saved. It does not currently highlight which elements have changed.
Printing
Once the SA has been agreed a 'Print' button will appear in the top right, allowing you to save the Service Agreement summary page as a pdf for your own records.
Hints and tips
Only the person logged in with the email address given in the depositor email address field can edit and submit the SA; collaborators cannot. However, if you would like other members of your team to be able to view the document, just let us know and we can provide them with view access.