Research data can be collected in different ways and have different properties. For instance, they could be results from experiments or simulations, statistics and measurements, models and software, observations from field work, survey responses, interview data in multiple formats, images from cameras and other equipment, and text sources with annotations. Data collection is a preliminary procedure that often consists of a systematic search and assessment of information based on certain variables with the intention of providing a basis for research questions and assessments.
- The data must be described and all corrections and other changes must be documented and explained
- The collected data should be organised in a carefully considered folder and file structure, or equivalent, with systematically named files. Preferably, include version information when naming files.
- The original data should be kept intact, and copies should be used for processing
- The data is to be saved in a secure place and be regularly backed up. Certain types of data may also need to be encrypted.
- Once the project has ended, the data is to be archived in accordance with current rules and regulations
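The points above about keeping the original data intact and working on copies can be illustrated with a minimal Python sketch. It assumes the originals sit in one folder and the working copies in another. The function names and the choice of SHA-256 checksums are illustrative assumptions, not a prescribed procedure.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(raw_dir: Path, work_dir: Path) -> dict[str, str]:
    """Copy every file under raw_dir into work_dir, keeping the folder structure.

    Returns a manifest that maps each relative file path to the checksum
    of the original file at the time of copying.
    """
    manifest = {}
    for source in sorted(raw_dir.rglob("*")):
        if source.is_file():
            relative = source.relative_to(raw_dir)
            target = work_dir / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)  # copy2 also keeps file timestamps
            manifest[relative.as_posix()] = sha256_of(source)
    return manifest


def changed_originals(raw_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List originals that are missing or no longer match the manifest."""
    changed = []
    for relative, expected in manifest.items():
        original = raw_dir / relative
        if not original.is_file() or sha256_of(original) != expected:
            changed.append(relative)
    return changed
```

Saving the manifest next to the data description makes it possible to check later that the originals are still untouched. The sketch does not replace regular backups or encryption.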
Read more about:
- Collection of personal data
- Ethical review
- Processing of personal data
- Information security
- Finding more data
Collection of personal data
Special rules, primarily ethical guidelines, apply when collecting data that contains personal information. If there is a risk that the collection of data will affect the participants physically, or if it involves sensitive information, an ethical review board must review the research before the work can begin. The person in charge of the data collection is to inform the participants of the purpose of the study (either before or after the data collection). Regardless of whether the study involves personal data or not, the participants are to be informed as to whether the data will be anonymous, and whether the study intends to share the collected data with other researchers or publish results based on these data. The following sections describe this in further detail, based on information from SND.
Ethical review
The Swedish Act (2003:460) Concerning the Ethical Review of Research Involving Humans contains provisions on the ethical review of research involving humans and biological material from humans. Research that is subject to the Ethical Review Act may only be conducted if it has been approved through an ethical review.
Since 2004, it has been prohibited to conduct certain research unless it has been approved by an independent ethical review board. If you are planning to conduct research involving humans or the processing of personal data, you may therefore need to apply for ethical approval.
You can also get advice on things to consider in connection with completing an application for an ethical review:
- Research ethics at LU
- Information about rules concerning ethical review from the Swedish National Data Service, SND (in Swedish)
Processing of personal data
General information about the Personal Data Protection Act
The Swedish Personal Data Protection Act aims to protect people’s personal integrity when personal data are processed. As an employee, you may need to apply this law in situations that involve the processing of personal data.
Responsibility for and notification of personal data processing
The personal data controller is the party that, alone or together with others, determines the purposes and means of the processing of personal data. As an employee, you are never the personal data controller. That responsibility lies with the public authority or company, in this case Lund University through the relevant head of department. If you are working on contract research that involves a commercial stakeholder, the processing is regulated by special agreements. The University’s personal data representative is to be notified of all processing of personal data in any type of registry or database at Lund University.
Information security
Information security is about protecting information from various types of threats, such as unauthorised access or distortion of the data, by adjusting the technical, physical and administrative environments in which the information is processed. Even if the research information does not contain personal information or other sensitive data, it is crucial to think about information security.
Read about information security on SND (in Swedish)
Encoded and encrypted data
Encoded or encrypted information is considered personal data as long as the code key or encryption key is available somewhere, that is, as long as it is possible to identify the person behind the code. The Personal Data Protection Act therefore applies to all encoded or encrypted data.
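As an illustration of why encoded data remain personal data, the following minimal Python sketch replaces an identifier with a random code and returns the code key separately. The field name `personal_id`, the `P-` code format and the function names are hypothetical, chosen only for this example.

```python
import secrets


def encode_records(records, id_field="personal_id"):
    """Replace the identifier in each record with a random code.

    Returns (coded_records, code_key). The code key maps code -> identifier
    and is meant to be stored separately from the coded data.
    """
    code_for = {}  # identifier -> code, so that one person keeps one code
    used_codes = set()
    coded = []
    for record in records:
        identifier = record[id_field]
        if identifier not in code_for:
            code = "P-" + secrets.token_hex(4)
            while code in used_codes:  # regenerate in the unlikely case of a clash
                code = "P-" + secrets.token_hex(4)
            used_codes.add(code)
            code_for[identifier] = code
        new_record = {k: v for k, v in record.items() if k != id_field}
        new_record["code"] = code_for[identifier]
        coded.append(new_record)
    code_key = {code: identifier for identifier, code in code_for.items()}
    return coded, code_key


def reidentify(coded_record, code_key):
    """Look up the original identifier. This works only while the key exists."""
    return code_key.get(coded_record["code"])
```

As long as `code_key` is kept anywhere, `reidentify` can link a coded record back to a person, which is the situation described above.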
De-identification and/or anonymisation of data
In order to de-identify personal information, all possible identifiers must be removed so that the information in the data can no longer be linked to a specific living person. This means that the code key and any encryption keys must be destroyed. Data that has been de-identified is not subject to the Personal Data Protection Act. Sometimes the term anonymisation is used synonymously with de-identification, which, in Swedish, is incorrect. Here, anonymised data means that the research subjects are anonymous to the person processing the data, but that the data can still be linked to them, and is therefore subject to the Personal Data Protection Act.
Backdoor identification/risk of disclosure
It is important to check whether the data that will be made available contains information that involves a risk of disclosure for the individuals who participated in the study. Information that directly points to a person could be their personal identity number, phone number or address. Information that could indirectly identify a person consists of details that, taken together, could lead to a so-called backdoor identification, such as their profession, place of residence and age.
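One simple way to look for this kind of risk is to count how many records share each combination of indirect identifiers. Combinations that occur only once, or very rarely, deserve a closer look. The Python sketch below is a rough screening aid under that assumption, not a complete disclosure-risk assessment. The function name, field names and threshold are hypothetical.

```python
from collections import Counter


def risky_combinations(records, quasi_identifiers, min_group_size=2):
    """Return combinations of indirect-identifier values shared by fewer
    than min_group_size records, together with how often each occurs."""
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < min_group_size}


# Fictitious example data
people = [
    {"profession": "nurse", "residence": "Lund", "age": 34},
    {"profession": "nurse", "residence": "Lund", "age": 34},
    {"profession": "mayor", "residence": "Lund", "age": 58},
]
print(risky_combinations(people, ["profession", "residence", "age"]))
```

In this fictitious example, the only mayor in the dataset stands out, even though no direct identifier is present. Coarsening values, for instance by using age ranges instead of exact ages, is one common way to reduce such risks.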
Finding more data
There are several services that can be used to disseminate your own data and to find datasets for downloading directly or after contacting the researcher in charge. Some are subject-specific, others are multi-disciplinary. Here are some examples:
- Registerforskning.se is the Swedish Research Council’s website to support researchers working with registry data
- DataCite – a global non-commercial organisation that provides persistent identifiers (DOIs) for research data, with a goal of helping researchers locate, identify and cite research data with confidence
- Awesome Public Datasets – a list of publicly available datasets compiled on GitHub
- OpenAIRE – technical infrastructure to support research output from projects that are part of Horizon 2020
- Data Citation Index – a service, part of Web of Science, offering access to research data from repositories in various disciplines from all around the world
- Re3data – a registry of data repositories with content from a large number of online services in a wide range of disciplines
Is the data open?
If it is unclear whether certain data are open and available for use, processing and sharing, you must confirm this first, for instance by asking the relevant contact person. Examples of such inquiries can be found here:
Open Knowledge Foundation: Is It Open Data? (in Swedish)