Science as a Public Enterprise (SAPE) is an initiative of the Royal Society, which intends it to “ask how scientific information should be managed to support innovative and productive research that reflects public values”. The matter is being overseen by a “high-level working group”, which has issued a call for evidence. The preferred route of response is by way of a form of set questions.
I’m not sure I’ve got that much in the way of material evidence to submit, but I couldn’t resist framing a few replies to their questions. In the spirit of openness, I’m publishing my draft replies here before submitting them to the RS. I welcome comments, criticisms and questions.
SAPE: What ethical and legal principles should govern access to research results and data? How can ethics and law assist in simultaneously protecting and promoting both public and private interests?
Research results and data are essentially documented accounts of certain experiences. Experiences in themselves are necessarily private, but documents containing accounts of them can be sold, shared or published. Arguably, the exchange of information through the distribution of such documents is the essence of science (as opposed to anecdotal knowledge), but that doesn’t automatically mean there’s an obligation to exchange on terms dictated by others.
Conventionally (in law), such documents are ‘works’ and any copyright or other rights as may exist in them belong to the authors, or in the case of works made for hire, to the author’s employer. One may sometimes hear the idea that research data can be owned being described as “absurd” or “preposterous”, but it is simply a matter of acknowledging that the carrying out of research has a cost and that the party who bears that cost has some priority in deciding how the results should be used. Only rather rarely nowadays is that cost really borne by the scientists carrying out the research as they are usually compensated by being paid for their time.
In line with that, research conducted by scientists in industry is generally regarded, without problems, as work made for hire. With academic scientists, the situation is more complex since some research funding (possibly the majority nowadays) is awarded in support of a particular research project. As such, it could be argued that the research belongs to the party who paid for it and that it is a work made for hire. However, some academic research is still funded on the basis of supporting individual researchers, a department or an institute, and the ownership of this would need to be the subject of local agreement.
The choice of who has access to research results and on what terms should ultimately be that of the owner of the rights in them, although provision should be made to legally oblige the owner to allow access or prohibit the owner from allowing access in certain cases where the public interest would otherwise be damaged. Under such circumstances, the owner should be entitled to just compensation for any consequential losses.
SAPE: How should principles apply to publicly-funded research conducted in the public interest?
The rights in any research entirely funded by the public should be automatically assigned to the public. The decision on whether to publish or otherwise distribute then lies ultimately with the public and any decision made on the public’s behalf has to be made in the public interest by accountable representatives. It follows that any assignment back into private hands should itself be made in the public interest.
SAPE: How should principles apply to privately-funded research involving data collected about or from individuals and/or organisations (e.g. clinical trials)?
The key thing here is “data collected about or from individuals and/or organisations” regardless of whether the research was privately or publicly funded. Information about legal persons that is not already publicly available should be used for research only with the emphatic consent of each person concerned and with an express statement of the limits on its use. Ethical review should be made of the process used to gain such consent. Publication or other distribution of documents containing such information can then only occur within the limits set by the subjects’ consent.
SAPE: How should principles apply to research that is entirely privately-funded but with possible public implications?
If publication of the research would allow members of the public to avoid actual harm that would otherwise be likely, then obligatory publication should be possible, subject to the proprietors of the research receiving just compensation for any consequential losses. Simply asserting that publication would benefit the public interest should not be sufficient to impose an obligation to publish.
SAPE: How should principles apply to research or communication of data that involves the promotion of the public interest but which might have implications from the privacy interests of citizens?
I’m not sure this is the kind of question that can be answered in general terms. Clearly “the public interest” does not uniformly match up against everyone’s individual interests. Conflicts can only be decided on a case-by-case basis. Obviously, there has to be a right to petition for redress for anyone who feels “the public interest” is being interpreted in a way that infringes their private interests.
SAPE: What activities are currently under way that could improve the sharing and communication of scientific information?
“Scientific information” can mean two things: the information in the data that scientists collect or produce, and the information about what scientists say the data mean. With regard to the first, the so-called “open data” initiative stands to make the biggest difference by advocating the indiscriminate online publication of “all” data (as opposed to only data selected to illustrate points made in scientific papers). There would appear to be a strong case that this will allow better use of data, though there are problems around the costs of digitising certain types of information and the standardization of data formats. There are also issues around how a commitment to open data conflicts with the career interests of professionalized scientists, or could encourage overenthusiastic publication of data without the owner’s consent. Overall, developing working open data initiatives would appear to be an area deserving of funding support in its own right.
SAPE: How do/should new media, including the blogosphere, change how scientists conduct and communicate their research?
The ease with which information can be published online means that freely available, informal, non-peer-reviewed publications (such as blogs) containing scientific data and/or discussion as to what the data mean could proliferate and eventually displace traditional scientific journals as the preferred source of scientific information. This might make scientific information more easily available to those not working within industry or academia. The withering authority of the peer-reviewed journals might also result in less readiness to take the integrity of published data on trust and less readiness by individuals to fall in with ‘leading’ opinions with regard to data interpretation. Science could come to be seen less as a quest for truth (a set of limits within which all things must operate) and more as a search for pointers to what might be possible (a set of priorities for the next step in one’s own ongoing investigations).
SAPE: What additional challenges are there in making data usable by scientists in the same field, scientists in other fields, ‘citizen scientists’ and the general public?
The general problem associated with making data suitable for use by others than those who generated them (“cross-use”) is that data on their own are meaningless. One must also know how they were produced and what they represent. Information relating to this is sometimes called “metadata”.
Cross-use of data by scientists in the same field can be relatively unproblematic in this technical sense because there is usually an implicit shared understanding of the experimental and observational fashions in the field. Only brief summary details of what the data represent are necessary for understanding. Of course, cross-use by scientists in the same field presents the greatest political/economic problems in professionalized science because it is in this case that the scientists from whom the data originated face the greatest risk of losing priority over their interpretation.
The problem of the quality of available metadata becomes more acute if we wish to allow cross-use by scientists in other fields or by those working outside professionalized science. The quantity of necessary metadata can become substantial. It will be necessary to develop standardized formats not only for shared data but also for metadata, to allow commensurability between data sets.
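To make the point about metadata and commensurability a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the field names (`quantity`, `unit`, `method`), the `TO_METRES` table and both helper functions are hypothetical and are not taken from any existing metadata standard. It only shows that two data sets can be pooled safely when each carries enough metadata to say what was measured and in what unit.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical conversion factors to a common base unit (metres).
TO_METRES = {"m": 1.0, "cm": 0.01, "mm": 0.001}


@dataclass(frozen=True)
class Metadata:
    quantity: str  # what was measured, e.g. "length"
    unit: str      # unit in which the values are recorded
    method: str    # brief note on how the values were produced


@dataclass(frozen=True)
class DataSet:
    values: Tuple[float, ...]
    meta: Metadata


def commensurable(a: DataSet, b: DataSet) -> bool:
    """Two sets are treated as commensurable here if they record the
    same quantity and both units are known to the conversion table."""
    return (
        a.meta.quantity == b.meta.quantity
        and a.meta.unit in TO_METRES
        and b.meta.unit in TO_METRES
    )


def pooled_in_metres(a: DataSet, b: DataSet) -> List[float]:
    """Pool the values of both sets, converted to metres."""
    if not commensurable(a, b):
        raise ValueError("data sets cannot be pooled on the metadata given")
    return [v * TO_METRES[ds.meta.unit] for ds in (a, b) for v in ds.values]


# Illustrative, made-up records.
ruler = DataSet((1.0, 2.0), Metadata("length", "m", "ruler"))
caliper = DataSet((50.0,), Metadata("length", "cm", "caliper"))
scale = DataSet((3.0,), Metadata("mass", "kg", "balance"))

pooled = pooled_in_metres(ruler, caliper)
```

Without the `unit` field, the value 50.0 would be pooled as if it were metres. Without the `quantity` field, nothing would stop lengths being pooled with masses. A real scheme would need far more than this, particularly for readers outside the originating field.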
SAPE: What might be the benefits of more widespread sharing of data for the productivity and efficiency of scientific research?
In principle, making data available can contribute by reducing the number of occasions on which investigations are carried out into something that is already known and by increasing the number of occasions on which a given data set is used to answer different questions from those for which it was originally compiled. It does not, however, do this on its own. Before the availability of shared data can make a difference, one has to understand what range of possible data types might be helpful in advancing one’s own project, know how to look for them and know how to use the data once found.
SAPE: What might be the benefits of more widespread sharing of data for new sorts of science?
By encouraging the re-examination of existing data sets from new perspectives, data sharing could help the development of scientific investigation along lines not previously envisaged. Data sharing also encourages a “new sort of science” in another sense: it separates the collection of data from the theoretical interpretation of data. Traditionally, possibly as a legacy of the self-funded “gentleman scientist” era, many types of science have been carried out in a way that implicitly understands the collection of data and their interpretation to be the work of one person or a small group of people all working together (the “authors”). Of course, this vision of science has, since the mid-twentieth century, been in competition with “big science” where research projects are managed efforts carried out by organized division of labour between large numbers of research workers. This probably began with the Manhattan Project. More recent examples might be the Human Genome Project and the Large Hadron Collider project. More recently still, we have seen the rise of what might be called “stakeholder science” where scientific debates have taken place outside professionalized science, sometimes with direct participation of groups and individuals from outside professionalized science. Examples might be climate science and “global warming”, or the debate around the science of autism and use of the MMR vaccine. Data sharing may encourage and accelerate the growth of these “styles” of science. A long-term effect might be that professionalized scientists are seen less as “expert witnesses” and more as “knowledge workers” and that scientific problems are seen not as self-contained but rather as part of broader political or economic problems.
SAPE: What might be the benefits of more widespread sharing of data for public policy?
Data sharing might help crystallize ideas about what specific investigations need to be made to answer questions relating to policy formulation. It might be easier to discover if such investigations have already been made. However, without a warranty as to the authenticity of the data (problematic in a “free” sharing ethos), it would be unwise to take such data on trust. Specific audited studies would still have to be commissioned in most cases.
SAPE: What might be the benefits of more widespread sharing of data for other social benefits?
As above, it might help crystallize ideas about what specific investigations need to be made, but without some kind of trusted third party to act as guarantor of data authenticity, it might be unwise to take data on trust.
SAPE: What might be the benefits of more widespread sharing of data for innovation and economic growth?
SAPE: What might be the benefits of more widespread sharing of data for public trust in the processes of science?
Initially, the availability of data additional to that used to illustrate the points made in research papers and even so-called “negative” data could improve public trust. In the longer term, a proliferation of data from unverified sources could have the opposite effect.
SAPE: How should concerns about privacy, security and intellectual property be balanced against the proposed benefits of openness?
The desire for privacy, security and the ability to earn a return for making one’s original ideas available to others are natural human concerns. Irresponsible data sharing could easily lead to transgressions of those concerns. Everything depends on the data sharers behaving responsibly. This, in turn, depends on their understanding, first, that being a scientist does not in any way diminish one’s common ethical obligations as a citizen and, second, to whom (as well as to the public generally) they owe specific obligations in any particular case. This would include parties to whom the data relate directly and parties who funded the research either expressly or by way of funding the supporting infrastructure. Ethical frameworks of this kind already exist for research that involves studies of human subjects (such as clinical trials) and suitable components of these could be extended to cover other types of research.
SAPE: What should be expected and/or required of scientists (in companies, universities or elsewhere), research funders, regulators, scientific publishers, research institutions, international organisations and other bodies?
Following the above, subscription to an ethical framework that sets out guidelines for understanding who all the stakeholders are (individuals and collectives) in a given research project and what obligations one owes to each of them.
SAPE: Other comments?
Science as a public enterprise can achieve and should aim to achieve more than just opening up access to scientific information. Science is presently understood largely as a professionalized activity that produces data in accordance with self-referring criteria of evaluation with funding from well-financed organizations (business corporations, governments, charities). Whether the funding is public or private, the scope of scientific investigations is formulated in response to standardized or homogenised concepts of market demand or public interest. In this model, science is effectively closed to the disparate minority interests of individuals or small collectives. At present, such minorities have to pursue their interests without the benefits of scientific investigation or have to reformulate their interests to align them with the homogenized or standardized ‘administrative’ interests of the organizations. A task for open science is to lower the financial and cultural entry barriers to instigating scientific research projects in response to disparate minority interests. Some of the requirements would be:
- To help “lay” people articulate their interests in terms that reveal what types of scientific information could inform them. Informal learning initiatives such as Peer to Peer University might help with this.
- To find which parts of those investigations have already been performed or are in common with the interest requirements of other parties.
- To encourage instigation of short-lived, low overhead “collective experimentation” networks to carry out research projects with component parts distributed across multiple sites.
- To encourage more specialized scientists to become “citizen scientists” prepared to engage with disparate citizen interests rather than see them as funding opportunities for scientific interests.