Government Information in Canada/Information gouvernementale au Canada, Volume 3, number/numéro 2 (fall/automne 1996)
Data Liberation and Academic Freedom [1], [2]

Wendy Watkins [3]
Carleton University

Ernie Boyko [4]
Statistics Canada


The Data Liberation Initiative (DLI) is a cooperative, five-year agreement between Canadian universities, Statistics Canada and six other federal government departments. Under the agreement, universities have unprecedented access to Statistics Canada data for purposes of scholarly research and teaching, in return for a predictable and affordable annual fee. The article discusses the implications of the DLI for scholarly research and, more broadly, for academic freedom.

L'Initiative de démocratisation des données (IDD) n'est autre qu'une entente de coopération conclue pour cinq ans entre des universités canadiennes, Statistique Canada et six autres départements ministériels. En vertu de cette entente, les universités ont un accès privilégié aux données de Statistique Canada dans le cadre de leurs enseignement et leurs recherches académiques, moyennant des droits annuels prévisibles et à un prix abordable. Dans ce document, nous discutons les répercussions que l'IDD aura sur ces recherches universitaires et, de façon plus générale, sur la liberté intellectuelle.


Introduction

Academic freedom has been defined and discussed in many ways at this conference. The linkage between data and academic freedom may not be obvious to those who have not been exposed to quantitative analysis and have no first-hand knowledge of the power of data. We hope to be able to provide that bridge.

If you define academic freedom as the ability to develop and express different points of view without fear of reprisals or hindrance, then it is possible to develop a line of reasoning showing the importance of full and affordable access to data and information for teaching and academic purposes.

For the purposes of this paper, we will focus on access to government information in general and Statistics Canada information in particular. This is justifiable for a number of reasons:

In general, governments are our largest information producers. It would be foolish to ignore this.

This body of information includes data that governments use to make policy and program decisions. Access to this information is certainly an important aspect of our overall freedom.

The information produced by government generally falls into the category of a public good, meaning it is not economically feasible for the private sector to produce it.

There has been considerable controversy regarding affordable access to government data and information in this country during the past decade.

A number of new developments that are affecting the dissemination of government and other information are enough, in themselves, to warrant this discussion. Electronic publishing and its impact on access come to mind.

And finally, on a positive note, there has been a major new development that should strengthen that part of the equation dealing with academic freedom. We are referring here to the recently established Data Liberation Initiative, or DLI, as it has become known.

While we cannot pretend to do justice to all information from all government sources, it is likely that the analysis of the issues surrounding access to the Statistics Canada data will illustrate some of the challenges that academics have faced in this regard.

Background

Traditionally, Statistics Canada data have been available in a condensed or aggregated form on paper. While these publications have always had a price attached to them, Statistics Canada did not start trying to recover the full cost of publishing until the early 1980s. Even under cost-recovery, these publications have remained broadly available through a network of libraries under the auspices of the Depository Services Program.

As the volume of government publishing grew and concerns about deficits increased, the governments of the early 1980s took steps to contain the costs associated with their publishing activities. Many publications were either discontinued or converted into electronic products. In the latter case, dissemination through the DSP was overtly discontinued. Thus, as more and more government information moved from paper to electronic format, less and less was available to the public through the broad network of libraries.

The situation with respect to data files and databases was not the same. These had always been excluded from the DSP. With the change in government in 1984, the policy on data became even more stringent. While data had been available for the marginal additional cost of dissemination, data producers were suddenly obliged to recover as many of the costs from collection through dissemination as feasible. This policy resulted in a huge increase in prices charged for data.

At the same time, this policy defined data users as special interests and no distinction was made between the academic researcher and, for example, the big banks. Thus the academic community was largely shut out of the picture.

This shift in policy by the government prompted Paul Bernard, Professor of Sociology at the Université de Montréal and member of the National Statistics Council, to say, in a paper aimed at Statistics Canada's pricing practices: ". . . the genuine exercise of democracy increasingly requires that citizens get access to complex information and have the skills required to understand it." While he realized there were and are pressures on Statistics Canada to reduce costs and increase income, he felt the outcome had been the restriction of ". . . access to information only to groups that have the solid ability to pay." Bernard felt that this may ". . . hamper the participation in public debates of groups whose contribution is not backed up by much money" as well as "those who have no prospect of turning a profit or reaping some tangible and relatively immediate benefit from using it." This, he stated, is ". . . likely to lead, in the long run, to suboptimal development and less than full-blown democracy." (1991)

The situation in general and Bernard's paper in particular provided the impetus for a second paper, "Liberating the Data: A Proposal for a Proposal." (1992)

History

In April 1993, after receipt of the "Liberation Paper," the Social Science Federation of Canada (SSFC) hosted a meeting with representatives from the Social Sciences and Humanities Research Council (SSHRC), the Association of Universities and Colleges of Canada (AUCC), the Canadian Association of Research Libraries (CARL), the Canadian Association of Public Data Users (CAPDU) and other interested parties to devise a strategy to make Canadian data more readily available to the education and research communities. The meeting resulted in the striking of a smaller working group, under the aegis of the SSFC, to devise a plan that would be acceptable to all parties. Although Statistics Canada and the DSP played advisory roles in this process, the initiative is unique in that it was conceived and developed by members of the Canadian research community.

The working group, consisting of researchers, representatives from CARL and CAPDU, as well as members of the SSFC, held a series of meetings over the following months. Advice from both Statistics Canada and the Depository Services Program was invited and found to be invaluable. When the group had formulated a working document to which both Statistics Canada and the DSP agreed, meetings were arranged with senior management in several government departments. The SSFC also met with Ministers and their executive assistants in order to move the proposal forward. By December 1995, the DLI had received a strong enough informal blessing that the project was deemed to be a go. Letters of agreement were distributed and data began to be released.

More officially, the DLI received approval by the Treasury Board Ministers in a February 1996 decision. It was subsequently included as part of the federal government's Science and Technology Strategy in March. Most recently, in October 1996, it was officially announced by Dr. Jon Gerrard, Minister of State for Science and Technology, at a press conference held in conjunction with National Science and Technology Week and the 30th anniversary of Carleton University's Data Centre.

The Project

The DLI is a cooperative five-year agreement between universities and federal government departments. It makes an unprecedented amount of Statistics Canada data available to universities for scholarly research and teaching, in return for a predictable and affordable annual fee.

Under the agreement, Statistics Canada provides participating institutions with access to all its standard data products. For their part, universities pay a set annual fee, undertake to make the data available to members of their communities and ensure that the data are used only for non-commercial teaching and research. To date, fifty-two Canadian universities have signed on as participants.

Implications for Academic Freedom

Data are unlike other tools of the research endeavour. They provide the raw material from which information and knowledge can be created. By their nature, data allow for exploration of topics of interest to the researcher. Unlike printed tables which, like a postcard, provide a picture of one view of a larger phenomenon, data can act as a camera, allowing the researcher to manipulate the background, change the foreground and more fully investigate the object under study.

For example, only nine questions were asked of every Canadian in the 1991 Census. One might think that only a very limited amount of information could be gleaned from so small a number of variables. On the contrary, the number of tables that theoretically can be produced is enormous: over 350,000. Statistics Canada published thirteen. Even if only a fraction of these tables make "sense," there is still a tremendous gap between what was produced and what might be of concern. Thus, without access to the data, the researcher is left with a product that answers the questions the information provider thinks are important, rather than addressing the problem under investigation.
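The paper does not say how the figure of "over 350,000" was derived. One count that is consistent with it, offered here purely as an assumption, treats every distinct ordering of the nine census variables as a separate table layout, which gives 9! = 362,880. The short Python sketch below compares that count with two other plausible ways of counting tables. The function names are illustrative and do not come from Statistics Canada.

```python
from math import factorial

N_QUESTIONS = 9        # questions asked of every Canadian in the 1991 Census (from the text)
PUBLISHED_TABLES = 13  # tables Statistics Canada published (from the text)


def full_orderings(n: int) -> int:
    """Orderings of all n variables: n! (assumed basis of 'over 350,000')."""
    return factorial(n)


def variable_subsets(n: int) -> int:
    """Non-empty sets of variables to cross-tabulate, ignoring order: 2**n - 1."""
    return 2 ** n - 1


def ordered_selections(n: int) -> int:
    """Ordered choices of 1 to n variables: sum over k of n! / (n - k)!."""
    return sum(factorial(n) // factorial(n - k) for k in range(1, n + 1))


if __name__ == "__main__":
    print(full_orderings(N_QUESTIONS))      # 362880
    print(variable_subsets(N_QUESTIONS))    # 511
    print(ordered_selections(N_QUESTIONS))  # 986409
```

Even under the most conservative of these counts (511 unordered combinations of variables), thirteen published tables would cover only about 2.5 percent of the possibilities.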

And it is not just the sheer number of possible tables that imposes constraints. Decisions regarding what to produce are not made in a vacuum. Governments and other information providers are unlikely to produce information that would be critical of their own programs. Yet an informed policy debate requires that critical investigation be undertaken. Without access to data, it is unlikely that such a debate would be possible.

In fact, partly because many academics were unable to afford Statistics Canada data for nearly a decade, there is now a dearth of people trained to perform policy analyses. This problem was explicitly recognized by the Policy-Capacity Task Force, chaired by the Chief Statistician of Canada, and partially accounts for the federal support given to the DLI.

Access to data implies the ability to use them. Without trained analysts, the data will sit gathering dust in tape libraries or on disks. Thus, training is an integral part of the DLI philosophy. In fact, there is a second phase of the Data Liberation Initiative with the overarching objective of creating a data culture in which the use of data as another piece of evidence in the argument becomes mainstream.

Finally, the importance of a critical voice from those without a vested economic interest in the outcome of policy changes cannot be overemphasized. Without it, we will be treated only to the arguments and analyses of the well-funded and will not be exposed to both sides of the issues.

Conclusion

The link between Data Liberation and academic freedom may not be as tenuous as it first appears. If one cannot access and analyze numeric information about Canada and Canadians, one cannot fully participate in the debate about things that may have profound impacts on the shape of society and on our future freedoms. This ability requires access to the data, training in their use, critical thinking, and an environment that supports all of the above. That is the major objective of Data Liberation and an essential element of academic freedom.

Post Script: Issues and Concerns

While it is generally agreed that DLI is a major step in the right direction, the authors will be among the first to admit that more needs to be done to ensure sufficient and appropriate access to public data. In the course of our work on DLI, as well as at this conference, we have been asked several questions about current and future directions. While we do not have all the answers at this time, we would like to respond to a number of these concerns and to provide our thoughts on the future of the initiative.

What happens after five years?

The temporary financing for the DLI pilot is for a five-year period. It is hoped that the project will be so successful during the pilot phase that there will be no thought of discontinuing it. At the same time, it must be recognised that a permanent source of budgetary funds must be found for the federal portion of the project. It is expected that university support will continue.

What about colleagues in non-DLI institutions, Canadian students abroad, users in the not-for-profit sector, small business and the general public?

While this is a valid question, it must be remembered that the DLI was initiated by the Canadian academic community and tailored to serve its needs. That said, some thought needs to be given to collaborative research with partners in other countries and to access by Canadian students studying abroad. This type of access can probably be accommodated within the current DLI structure. Both these issues are currently under investigation by the DLI External Advisory Group. [5]

The inclusion of other parties would lead to a great expansion of what is now a fairly small infrastructure dealing with a limited group of institutions. Such an expansion would increase costs, for which additional funds would have to be sought. In the current fiscal environment, Statistics Canada is required to fully recover these costs. We would suggest that those groups with interests in broader access to these data use DLI as a model and enter into negotiations with Statistics Canada.

Is this the thin edge of the wedge with respect to cost-sharing and the Depository Services Program (DSP)?

No. DLI products have always been sold to universities completely outside the auspices of the DSP. Even with the privatization of the public printer (i.e. the Canada Communications Group-Publishing), the government has reaffirmed its commitment to the DSP. While the DSP is becoming more electronic, its focus remains on documents while the DLI is concerned exclusively with statistical files and databases. Thus, we see the two programs as completely separate, although complementary.

What thought has been given to training for faculty, students and university service-providers?

The need for training of university service-providers has emerged as the most pressing issue for DLI institutions, since each university must identify a particular individual as responsible for DLI services. To that end, the DLI External Advisory Group has set up an ad hoc training group, which has developed a draft curriculum and outlined a series of training activities that should commence this spring.

Training of faculty and students will be subsumed under "DLI -- Phase II: Developing a Data Culture." [6] This project is again coordinated by the Humanities and Social Sciences Federation of Canada (HSSFC). [7] Its main focus is to acquaint members of the research community with the riches of the DLI and to introduce them to ways of handling data that range from the simple production of a table or graph to sophisticated statistical analyses. A workshop was held at the 1996 Learned Societies and another is planned for St. John's in June 1997. This will likely become an annual event at the Learneds. In addition, strategies are being considered for offering a series of regionally-based workshops devoted to specific statistical themes or individual data files.

Statistics Canada is not the only government data producer. Has anything been done to include other government data?

We are aware that there are other data producers, both federal and provincial. Overtures have been made to another federal government department which has significant holdings. While negotiations are still in the very preliminary stages, talks to date have been promising. Much of that promise rests on the fact that the DLI is seen as both a realistic and a successful model.

Provincial governments pose other problems. It is difficult, if not impossible, to negotiate a "national program" based in ten provinces and two territories. This might be best achieved through provincial or regional professional associations rather than at a broader level.

Much of the data included in DLI are not at a low-enough level of detail to enable the consideration of specific policy questions. Has any thought been given to including special or customized tabulations in the initiative?

One of the reasons that the data included in DLI are at a broad level of detail is to ensure the confidentiality of respondents. Customized information, while available upon request, requires that tables be manually checked on a case-by-case basis for disclosure risk. This is a costly endeavour that cannot be supported by the DLI given the current financial constraints.
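To make the idea of a disclosure check concrete, here is a minimal Python sketch of one common safeguard: flagging table cells whose counts fall below a minimum size, since very small cells can point to identifiable respondents. Everything in it is an assumption made for illustration. The threshold of 5, the variable names and the sample records are hypothetical and are not Statistics Canada's actual rules or data. Real disclosure review involves considerably more than this single test, which is part of why it is described above as manual and costly.

```python
from collections import Counter

MIN_CELL_COUNT = 5  # hypothetical threshold, not an official Statistics Canada rule

# Invented records standing in for a custom tabulation request.
SAMPLE_RECORDS = (
    [{"region": "East", "tenure": "own"}] * 6
    + [{"region": "East", "tenure": "rent"}] * 2
    + [{"region": "West", "tenure": "own"}] * 7
    + [{"region": "West", "tenure": "rent"}] * 5
)


def crosstab(records, row_var, col_var):
    """Count records in each (row value, column value) cell."""
    return Counter((r[row_var], r[col_var]) for r in records)


def flag_small_cells(table, min_count=MIN_CELL_COUNT):
    """Return the cells whose counts fall below min_count, in sorted order."""
    return sorted(cell for cell, n in table.items() if n < min_count)


if __name__ == "__main__":
    table = crosstab(SAMPLE_RECORDS, "region", "tenure")
    for cell in flag_small_cells(table):
        print("needs review:", cell, "count =", table[cell])
```

In this invented example only the East/rent cell, with two records, falls below the threshold. A reviewer would then have to decide whether to suppress it, merge categories or release the table as requested.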

It should be noted, however, that other government departments are among the largest requesters of special/custom tabulations from Statistics Canada. It is hoped that the results of these requests, and others that are funded with public money, will become part of the DLI in the future. This would greatly expand the availability of data at a more detailed level.

In closing, the authors would like to observe that the questions raised about DLI are generally positive in nature and are aimed at broadening and strengthening the concept. This illustrates the power of a cooperative approach to a problem that had posed a threat to academic freedom.

References

Bernard, Paul, "Discussion Paper on the Issue of the Pricing of Statistics Canada Products," February, 1991.

Watkins, Wendy, "Liberating the Data: A Proposal for a Proposal," January, 1992.


Notes

[1] May be cited as/On peut citer comme suit:

Watkins, Wendy, and Ernie Boyko, "Data Liberation and Academic Freedom," Government Information in Canada/Information gouvernementale au Canada 3, no. 2 (1996). [http://www.usask.ca/library/gic/v3n2/watkins2/watkins2.html]

[2] This article is based on a paper presented at "Academic Freedom: The History and Future of a Defining Idea," September 21, 1996, in Saskatoon, Saskatchewan.

[3] Wendy Watkins, Carleton University, wwatkins@ccs.carleton.ca

[4] Ernie Boyko, Statistics Canada, wcseb@ccs.carleton.ca

[5] The DLI External Advisory Group includes members of the research community as well as data-service providers from a variety of universities. Its mandate is to advise Statistics Canada on the day-to-day workings of the DLI and to provide broad policy guidance.

[6] See Bernard, Paul, "Phase 2 of the Data Liberation Initiative: Extending the Data Culture," Government Information in Canada, v.3, no.1. [http://www.usask.ca/library/gic/v3n1/bernard/bernard.html]

[7] Because of cutbacks in funding by SSHRC, the Canadian Federation for the Humanities and the SSFC amalgamated in April, 1996, forming the HSSFC.

