Audience

This page is intended for use by students and researchers in the University of Cambridge Schools of Technology and Physical Sciences whose research involves releasing software that collects data about its usage. It is part of a larger set of research guidance pages on work with human participants.

Issues to note for ethical review

Definitions

It is reasonably common in technology research to release a piece of experimental software, and to collect research data based on the software usage.

This practice does not correspond directly to the well-established categories in ethical guidance for research. However, it raises the same issues, and they must be considered and addressed in the same way.

Users of instrumented software should be clearly informed that usage data will be collected, and should be asked for their informed consent to that collection.

When planning how this data is to be used, you should take account of guidance on data research.

Licensing issues

Distribution of software for research purposes raises a number of other issues (e.g. warranty, liability, duration of licence, intellectual property) that should be reviewed by the University's legal department.

In particular, the functional performance of the software could be considered both a legal and an ethical issue. Software licence agreements often claim that there is no 'implicit warranty' that it works as described, but those claims could be challenged in court. From the perspective of ethical research, any promises made to the software users, whether explicitly or implicitly, should be recognised and honoured.

Consent issues

For research purposes, users of software should be aware that it has been instrumented, and should have given their permission. Software that collects data on user activities without explicit permission is known as spyware; a clause buried within a licence agreement does not count as explicit permission. Making and distributing spyware is considered unethical both in the software industry and among academic researchers.

In some cases, data will be collected from people other than those who provided the original consent (for example, consent may have been provided by a research collaborator within the organisation that is hosting the software).
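One way to make the opt-in requirement hard to bypass is to enforce it in the instrumentation code itself. The following is a minimal sketch in Python, not a prescribed implementation: the class name `UsageRecorder` and its methods are hypothetical, and the choice to discard unsent events when consent is withdrawn is an assumption that you should check against your own approved protocol.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class UsageRecorder:
    """Hypothetical recorder that stores events only after explicit opt-in."""

    # Collection is off by default; nothing is stored until the user agrees.
    consent_given: bool = False
    events: List[Dict[str, Any]] = field(default_factory=list)

    def grant_consent(self) -> None:
        """Call this only in response to an explicit action by the user."""
        self.consent_given = True

    def withdraw_consent(self) -> None:
        """Stop collecting and discard events held locally (an assumed policy)."""
        self.consent_given = False
        self.events.clear()

    def record(self, name: str, **details: Any) -> bool:
        """Store an event if consent is in place; return whether it was stored."""
        if not self.consent_given:
            return False
        self.events.append({"event": name, **details})
        return True
```

Because every event passes through `record`, no code path in this sketch can collect data before `grant_consent` has been called.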

Practicalities

Recruitment and advertising

How will you advertise your software? If using word of mouth, or recruiting via academic or special interest groups, will this bias your sample?

Obtaining additional data

It is quite likely that some information about your participants will be important to your research. For example, have they had experience of this kind of software before? Collecting such information will require a questionnaire, probably at the time when they agree to participate, download, or install your software. If using a questionnaire, consider the additional issues described under survey methods.

You may also find that you want to ask questions about why participants have used the software in particular ways. This could be done by interviewing (perhaps sending questions by email, or even personal interviews). You may wish to collect data on a more regular basis about the context in which the software is being used, in which case the issues raised on the page describing diary and probe studies will be relevant.

Release and installation

It is advisable to make a 'beta release' in order to get feedback on bugs - especially any bugs in the data collection method!

How will you issue updates, if it is necessary to fix bugs?

Must your software be installed on the participants' own machines? If at all possible, it is far better to stick to online applications.

Data collection

If you are installing software, rather than building a web application, how will data be transferred to you for analysis? Will it be necessary for users to be online when using your software? Will you stream data, or send larger bundles? Will the data transfer protocols that you use be compatible with any firewalls? Might it be more efficient to request physical access to computers?
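The choice between streaming and sending larger bundles can be prototyped before committing to a transfer protocol. The sketch below, in Python, buffers events and hands them to a transfer function in bundles, keeping them if the transfer fails (for example because the user is offline). The class `BundleUploader`, the `send` callable, and the default bundle size are all hypothetical; the actual transfer mechanism is deliberately left out, since it depends on your server and on the participants' firewalls.

```python
import json
from typing import Any, Callable, Dict, List


class BundleUploader:
    """Hypothetical buffer that sends events in bundles and retries on failure."""

    def __init__(self, send: Callable[[str], bool], bundle_size: int = 50) -> None:
        # `send` takes a JSON string and returns True if the transfer succeeded.
        self.send = send
        self.bundle_size = bundle_size
        self.pending: List[Dict[str, Any]] = []

    def add(self, event: Dict[str, Any]) -> None:
        """Buffer one event and attempt a transfer once the bundle is full."""
        self.pending.append(event)
        if len(self.pending) >= self.bundle_size:
            self.flush()

    def flush(self) -> bool:
        """Try to send everything buffered; keep the events if sending fails."""
        if not self.pending:
            return True
        payload = json.dumps(self.pending)
        if self.send(payload):
            self.pending = []
            return True
        return False  # events stay buffered for the next attempt
```

Setting `bundle_size=1` approximates streaming, so the same sketch can be used to compare both approaches during a beta release. Note that this version holds unsent events in memory only; they would be lost if the software exits before a successful transfer.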

Collection of data at a fine level of granularity (e.g. individual keypress or mouse events) can quickly result in very large volumes of data for analysis. It is worth thinking in advance about how you expect to visualise results or carry out statistical analyses. Time series data can often be inconclusive, so it is worth trying out your intended analysis using pilot data that has been collected during a beta release period.
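One common way to keep event-level data manageable is to aggregate it into counts per time interval before analysis. The following Python sketch illustrates the idea under simple assumptions: each event is a `(timestamp_in_seconds, kind)` pair, and the function name `count_per_interval` and the 60-second default interval are hypothetical choices, not recommendations.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_per_interval(
    events: Iterable[Tuple[float, str]],
    interval_seconds: float = 60.0,
) -> Dict[Tuple[int, str], int]:
    """Count events of each kind within consecutive fixed-length intervals."""
    counts: Counter = Counter()
    for timestamp, kind in events:
        # Interval index: 0 for the first interval, 1 for the second, and so on.
        bucket = int(timestamp // interval_seconds)
        counts[(bucket, kind)] += 1
    return dict(counts)
```

Running this on pilot data gives an early indication of whether the aggregated counts are detailed enough for your intended analysis, or whether the raw events need to be kept.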

The initial version of this page was drafted by Alan Blackwell. 

All comments and feedback are welcome. Please send any feedback to ethics@tech.cam.ac.uk