the internet and the future of academic publishing
Last Friday, I attended a fascinating talk at the Berkeley Institute for Data Science by Lenny Teytelman, a computational biologist and cofounder of protocols.io, a platform for sharing experimental protocols (recipes, if you will) in the biological sciences. He produced a remarkably nuanced sociological analysis of how the internet is changing academic publishing. I want to outline his talk and consider its implications for sociology specifically and the social sciences in general.
Before we get to the future, let’s make sure we understand the past… The first scholarly journals were published in 1665. The Journal des Sçavans [Journal of Learned Men] was founded in Paris on January 5th by counselor and scholar to Parlement Denis de Sallo. It published reviews of scholarly books; announcements of scientific inventions and experiments; essays on chemistry, astronomy, anatomy, religion, history, and the arts; obituaries of famous men of letters and science; and news about the Sorbonne. Inspired by this innovation, the Philosophical Transactions was launched in London on March 6th by Henry Oldenburg, the Secretary of the Royal Society of London. It recorded investigations into the natural sciences and reported news about the activities of English scientists. Later, it became the official organ of the Royal Society. These pioneering journals left a deep imprint on science: 351 years later, their descendants dominate academic publishing. Contemporary academic journals operate very similarly to these pioneers: authors submit articles, editors decide on what merits inclusion (relying now on peer review), and publishers distribute journals to subscribers (now often in electronic rather than print form). Over this vast sweep of history, academic journals have created strong – if not always peaceful – communities of scientists that span the globe.
In March 1989, Tim Berners-Lee, a software engineer at CERN [Conseil Européen pour la Recherche Nucléaire] near Geneva, proposed a new communications medium for sharing information among academic scientists, the world wide web, that would bring together some existing technologies (hypertext, TCP/IP, FTP, and internet domain names) and create others (HTTP, HTML, URLs). Berners-Lee’s vision of the world wide web went live in December 1990, when the first web page appeared on the internet. The many, many developments in distributed information sharing that have occurred since then are threatening to erode the dominance of academic journals. In his talk, Teytelman noted seven ways that the internet has changed academic publishing specifically and academic information-sharing in general.
1) What we publish: Beyond printing summaries of our investigations, we can now “publish” our data and analytical procedures in repositories for data, data visualizations, and research protocols/methods at such sites as figshare, data dryad, protocols.io, and ResearchGate. This idea has filtered into the social sciences. Journals published by the American Economic Association require authors to deposit their data and the computer programs required to reproduce their analyses. And starting in 2014, the American Psychological Association instituted an optional “open practices” policy, offering open-data, open-materials, and preregistered-analysis-plan “badges” for authors of accepted manuscripts. This was part of the discipline’s effort to increase transparency and reduce “p-hacking,” the practice of collecting or selecting data or analyses until the results are statistically significant. This idea is not new to sociology: in the 1990s, people who published papers in ASA journals were urged to make their data publicly available, in an effort to facilitate reproduction, but that effort to create more open access to the behind-the-scenes work that goes into published papers soon faltered. If we are to get serious about honesty and reproducibility of sociological research – and in light of several recent controversies, we should – we need to develop standards, protocols, and repositories for our data and methods of analysis – not just for quantitative analysis, but also for qualitative analysis of observational, interview, and historical data.
2) How we publish: online. Several online-only, open-access “mega-journals” have appeared since the turn of the twenty-first century, most famously the Public Library of Science (PLoS), which was launched in December 2001, as well as mega-journals that are affiliated with established scientific imprints such as Science (Science Advances), Nature (Scientific Reports), Cell (Cell Reports), and SAGE (SAGE Open). Peter Binfield, cofounder of PLoS and PeerJ, has argued that these journals, which cover broad subject areas, review submissions on technical merit only, and require authors to pay for the costs of article preparation, “are not limited in their potential output and as such are able to grow commensurate with any growth in submissions.” He stated that as of mid-2013, mega-journals were publishing almost 4,000 articles per month. In sociology, we now have two online-only, open-access journals that may, someday, grow into mega-journals: the independent, editor-reviewed Sociological Science, and the ASA-supported, peer-reviewed Socius. These offer much more rapid turnarounds than standard journals – less than 30 days.
3) How we publish part 2: preprinting. Repositories for the optimistically named “preprints” (some will never be “printed” in either an online or paper journal) – more commonly known as working papers – are making it possible for scientists to share their work without waiting for the often-long and tortuous review process to reach its conclusion. The first such repository was arXiv.org, which began by covering mathematics and physics, and later expanded to computer science, computational biology, finance, and statistics. Its success recently spawned several other archives, including bioRxiv, engrXiv, and PsyArXiv. This summer, Philip Cohen and several colleagues, in partnership with the Center for Open Science, launched SocArXiv, a repository for sociology working papers. (You may have seen the buttons being passed out at this year’s ASA meeting.) And of course there are other, multi-disciplinary, repositories, notably the Social Science Research Network (SSRN) and ResearchGate.
4) How we publish part 3: open review. Some journals practice “open review,” a term that covers a variety of experiments by publishers – including some long-established journal publishers. In one form, every stage of the submission, review, and publication process is “open” to scrutiny because all materials are posted online immediately. After authors submit articles, they are posted online, and editors solicit peer reviews. After those reviews are received, they are posted online. After editors make decisions, their letters to authors are posted online. After authors revise their papers to respond to reviews, their papers and response letters are posted online… And so it goes up to the final – accepted for publication – version of the paper. This procedure is used at F1000Research, a platform for life scientists. Obviously, this process is not double-blind, but instead double-sighted: the identities of authors are known to reviewers, and vice versa. In another form of open review, the process is double-blind until after papers are accepted for publication. At that point, all versions of papers and all authorial and reviewer correspondence are posted online. A variant of this is an option at the online medical journal BMJ Open. I’m not sure that the discipline of sociology is ready to consider this, but it’s an intriguing idea.
5) After we publish: version control. It used to be that publishing was an absorbing state: the end of the road, after which nothing changed. Such a temporal structure implies that what is published is the truth. But methods and theories are always evolving, and when they do, they may invalidate prior publications. Researchers who are not aware of new methods and theories may waste time and effort using invalidated theories and outdated methods. The constant improvement of scientific theories and methods makes it useful to make public new and improved “versions” of methods and results. It is for that reason that Teytelman’s protocols.io was founded: to publicly track the evolution of life-science experimental protocols.
6) After we publish part 2: discovery. With the rapid growth of science – ever larger numbers of scientists seeking to publish their work in ever larger numbers of (sometimes very large-scale) outlets – it is becoming increasingly difficult to thoroughly review the extant literature, to find among the mass of studies the specific work that is relevant to your own project. There are several online tools that automatically notify you of relevant research, including ResearchGate, Google Scholar, RePEc for economics, and SciReader for biomedicine.
7) After we publish part 3: post-publication discussion. There is no central hub for internet-mediated discussions of published research. Only Sociological Science offers readers a place to comment, and authors to reply to such comments. Perusal of the first 25 articles published in that journal revealed only 23 comments, and the modal number of comments was zero. So this is clearly not a common activity for sociologists. But that might change in the future.
 Social-network proximity disclosure: I am on the editorial boards of both journals.