The primary role of JISC Collections is the licensing of content on behalf of its UK Higher Education (HE) and Further Education (FE) member organisations. Over the last 10 years, JISC Collections has invested over £20 million in centralised licensing of digital content archives and collections in perpetuity on behalf of all its members. The first agreement was signed in 2002 for ProQuest’s Early English Books Online (EEBO). Since then, national licences have been negotiated for historic books, journal archives and multimedia content such as documentaries and educational films. In 2010, JISC Collections invested a further £2.5 million in film and image content representing UK and world history of the last twenty-five years, specially selected for teaching and learning. The critical historical book, journal and multimedia archives are prohibitively expensive for the majority of JISC Collections’ member organisations (or not even electronically available) and through centralised licensing the content is opened up to all. Access to these archives has typically been via the publisher interface or, in the case of the multimedia content, via platforms managed by UK data centre EDINA, the costs of which were funded by JISC. Most of the journal archives can currently be accessed by JISC Collections members without further charge, but access charges are levied for the historical book collections, and the real costs of the multimedia platforms are hidden through JISC funding.
In this context, JISC Collections is seeking to protect and preserve existing and future investments by developing JISC eCollections (www.jiscecollections.ac.uk), an independent service that provides an affordable (to all its members) alternative to reliance on content providers for access to perpetually licensed content. In this way, the education community will take ownership of its acquisitions and be assured of future control.
In addition to the above, a key rationale in devising the JISC eCollections service was to offer simplicity and to be inclusive to all users at all academic levels. Consolidating the range of licensed content would simplify both the user experience and the administrative management of licensed content by librarians. It was envisaged that this approach would help expose the content to a wider range of institutions, especially FE, and help their users feel more confident in exploring and exploiting the content in teaching and learning.
Introducing JISC eCollections
The JISC eCollections service is managed by JISC Collections (JISC Collections, n.d.) in partnership with EDINA and Mimas, national datacentres of JISC. The service currently consists of three platforms:
- JISC Historic Books – the full text of over 350,000 books published in Great Britain from 1475-1900 (Early English Books Online (EEBO) and Eighteenth Century Collections Online (ECCO)) including over 65,000 19th Century books from the British Library (BL).
- JISC Journal Archives – more than 3.75 million articles from major publishers and societies including Brill, Cambridge University Press, Institution of Civil Engineers, Institute of Physics, ProQuest, Oxford University Press, Royal Society of Chemistry and Taylor & Francis.
- JISC MediaHub – over 500,000 multimedia items including film, images and sound collections from sources including ITN, Getty Images, the AP archive, the Wellcome Library and the Culverhouse Collection.
JISC Collections levies a single annual service fee, which is kept deliberately low to cover access to all three platforms. The service fee revenue is then re-invested into the service.
JISC eCollections uses a model licence for all its agreements. The model licence helps to standardise the many terms and conditions of use, acts as a stamp of approval and helps librarians communicate these terms to users. In keeping with the ‘simple’ philosophy presented to end users and librarians, JISC Collections negotiated variations with all forty-two publishers to ensure that librarians only had to sign two sub-licences – one for the JISC MediaHub platform and one to cover both JISC Historic Books and JISC Journal Archives. This was a major undertaking, with some hard negotiations taking place to include text and data mining, open metadata and the creation of new metadata to supplement that already provided by the publisher. In developing JISC MediaHub, metadata and thumbnails for all the content were made fully open and discoverable on the web. Agreeing these clauses meant giving detailed explanations to providers, extolling the benefits of openness to educational users – and to the content owners themselves.
A community-owned content service
The core vision of JISC eCollections is that of a ‘community-owned content service’ – that the education community take ownership of the content licensed on their behalf and drive forward developments to the service. Advisory boards for each platform, consisting of librarians, teaching staff and researchers, all experts in their fields, are in charge of the service. The remit of these boards is to discuss new opportunities and to make sure that future developments and content licensing support use in education and research, and contribute to the ongoing sustainability of the service. The advisory boards have control over how the service fee revenue is reinvested into the service. More information is provided about the JISC Historic Books (JHB) advisory board below.
The principles behind the development of the three platforms were informed by a range of studies, undertaken in the UK and US, on the user behaviours of students and academics, information seeking strategies, and digital and information literacy (Connaway and Dickey, 2010). Many of these studies were funded by JISC with a focus on providing institutions with practical recommendations on how to improve library services to support the needs of users. One of the studies, the JISC national e-books observatory project (JISC Collections, 2009), was managed by JISC Collections and provided a great deal of insight into the frustrations and issues faced by users in accessing and using e-book platforms. The most influential study, however, in terms of the development of the service and the platforms, was the User Behaviour in Resource Discovery (UBiRD) study undertaken by Wong et al at the Middlesex University Interaction Design Centre (Wong et al, 2009). The findings and recommendations of this study supported the user behaviour seen and the feedback gathered in the Observatory project, and also the frustrations expressed by librarian members of JISC Collections at various advisory board meetings. For JISC Collections, and for JISC as the funder of both the studies and JISC eCollections, it was important to take on board the recommendations of the studies in order to base design and development decisions on the evidence gathered.
Providing seamless access is not a simple task when dealing with a plethora of resources and different providers, but research by Head and Eisenberg suggests that students apply a ‘consistent and predictable information-seeking strategy’ and therefore a ‘less is more’ approach may be more suitable in guiding students to resources (Head and Eisenberg, 2009). The 2006 Research Information Network (RIN) report, Researchers and discovery services: Behaviour, perceptions and needs, highlighted that for researchers, access remains an issue (Research Information Network, 2006). Connaway and Dickey state that this is especially the case for journal backfiles, which are ‘particularly problematic in terms of access’ (Connaway and Dickey, 2010). In the same report, which summarised the findings of twelve studies, it is reported that a common finding is that ‘library systems must do better at providing seamless access to resources’. In addition, JISC Collections works closely with UK librarians; as a consequence, any issues or user frustrations over access to online resources are quickly relayed back.
By grouping the content formats together onto three platforms, the aim was to assist libraries in simplifying the user journey. Instead of linking to over fourteen platforms, librarians need only link to three in order to direct their users to the content. While linking to multiple platforms may not be a major issue for institutions that make use of link resolvers to support their users’ discovery of resources, for others, especially those that have not previously been able to afford access fees, particularly in Further Education (FE), the ‘less is more’ (Head and Eisenberg, 2009) approach simplifies and supports seamless access. In addition, users need to authenticate only once, using their institutional username and password on entry to each platform, and not on multiple occasions.
Keeping the interfaces of the platforms clean and simple was a requirement in line with Connaway and Dickey’s recommendations. Connaway and Dickey summarise this by saying that the ‘evidence provided by the results of the studies supports the centrality of Google and other search engines’ (Connaway and Dickey, 2010). The clear message is that users value familiarity and convenience (Prabha et al, 2006) and that ‘library systems and interfaces need to look familiar to people by resembling popular Web interfaces, and library services need to be easily accessible and require little or no training to use’ (Connaway et al, 2011). Interfaces need to be simple and clean to make users feel comfortable and aid familiarity.
These principles were taken forward in the development of the three interfaces by the JISC datacentres and design company partners (see below), maximising white space and presenting simple search boxes to help users feel familiar and confident.
Students and academics employ different search strategies across subject areas and at various stages of their academic career. Hampton-Reeves et al found that students predominantly use keyword searches on a ‘mixture of tools including internet search engines, library catalogues and specialist databases’ (Hampton-Reeves et al, 2009). The RIN study of researchers found that the most common search strategy was ‘refining down from a large list of results’ (Research Information Network, 2006). Across all the studies, the Google-type approach of entering keywords was a common strategy.
Each platform aimed to present its search functionality in light of the findings of the studies described above, whilst also taking into account how best to display the results for the content format and how to distinguish the provenance of the content, i.e. which collection it comes from. The JISC Historic Books platform landing page in its first iteration offered users a single ‘Google-like’ search box (Figure 1).
Figure 1. JISC Historic Books landing page and search box
Filtering of the search results is common functionality across all three platforms and, as they aggregate collections, the results also needed to be filterable by collection. For JISC Historic Books, which contains three collections – ECCO, EEBO and the BL 19th Century collections – filtering was applied using colours and tabs, as shown in Figure 2.
Figure 2. Presentation of search results and filters in JISC Historic Books
For JISC MediaHub and JISC Journal Archives, where many collections are aggregated together, the filtering techniques and the symbols used to inform users which collection each result relates to needed much more thought and testing so that they remained simple (Figures 3 and 4).
Figure 3. Search results and filtering by journal archive on JISC Journal Archives
Figure 4. Search results and filtering options on JISC MediaHub
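The collection-level filtering described above can be sketched in a few lines. This is an illustration only: the function names and sample records are invented for the purpose and are not the code or data behind any of the three platforms. It assumes simply that every search result carries a label naming the collection it comes from.

```python
from collections import Counter

# Invented sample results; each carries the collection it belongs to.
results = [
    {"title": "Sample title A", "collection": "EEBO"},
    {"title": "Sample title B", "collection": "ECCO"},
    {"title": "Sample title C", "collection": "BL 19th Century"},
    {"title": "Sample title D", "collection": "ECCO"},
]


def facet_counts(items):
    """Count results per collection, e.g. to label filter tabs."""
    return Counter(item["collection"] for item in items)


def filter_by_collection(items, collection):
    """Keep only the results that come from the chosen collection."""
    return [item for item in items if item["collection"] == collection]


counts = facet_counts(results)
ecco_only = filter_by_collection(results, "ECCO")
print(counts["ECCO"], [item["title"] for item in ecco_only])
```

Keeping the collection label on each result is what allows the provenance of an item to stay visible after the collections have been aggregated.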
The studies discussed above show that users are driven by familiarity. When users are unfamiliar with a platform, they have to spend much more time thinking about where things are and navigating within it. This was evidenced in the deep log analysis of user behaviour and in feedback from the JISC national e-books observatory project focus groups, where students struggled with interfaces, were frustrated at having to search to find the function buttons and often gave up (JISC Collections, 2009). This is best expressed in the UBiRD study: “navigating from one system to another – all of which have different functionalities and different bells and whistles with respect to searching, limiting / refining, indexing, saving and storage or exportation – is confusing to users” (Wong et al, 2009). In moving between different publishers’ platforms, Wong suggests that users are wasting valuable time as they have to ‘re-frame’ their minds each time to work out where the log-in is, where the print button is and so forth.
The need for consistency to support familiarity is another key principle behind the aggregation of the historic book, journal archive and multimedia collections on their respective platforms. For example, there are over 50 collections on JISC MediaHub, searchable and viewable in one central location. Users of JISC eCollections need only become familiar with three platforms, rather than a plethora of platforms.
The UBiRD study found that users did not understand how the e-resources being used operated and that platforms do not make it clear to users how they are structured, organised or what they contain (Wong et al, 2009). The result is that assumptions are often made by users about what the resource contains or how it functions.
For historic books this is critical, as the quality of the information, and what information is returned in a search, often depends on the quality of the machine-created Optical Character Recognition (OCR) text or the text developed as part of the Text Creation Partnership (TCP), and on the metadata associated with each digitised image (Text Creation Partnership, n.d.). Taking this principle on board, the development plan for JISC Historic Books included making the OCR / TCP text of each image available as well as the metadata and, where OCR / TCP text was not available, making this clear to users. The implications of this approach are discussed below.
JISC Historic Books – mistakes, challenges and solutions
In developing JISC eCollections and the platforms in accordance with the principles outlined above, issues have arisen, mistakes have been made, and solutions to challenges have been found. The remainder of this paper focuses on JISC Historic Books (JHB) and shares some of the complications that arose and the lessons learned in developing this platform.
User behaviour and change
Providing a simple interface with a single Google-type search box (as in Figure 1) was one of the elements taken forward in the design of JHB to support the common student search strategy (keywords) and help students feel familiar. It was thought that this approach would also align with the aims of JISC eCollections to be more inclusive of all users regardless of their academic level and to make the content more approachable to users in colleges. However, upon release of the beta version in August 2011, the simple Google-like search option and filtering were rejected by researchers as inadequate for their search strategies. Researchers who have been making use of the Cengage and ProQuest platforms have developed search strategies in line with these platforms or, if this is not the case, are using their comprehensive knowledge of the content to undertake very specific searches, for example using the English Short Title Catalogue (ESTC) number. A complete overhaul, implementing granular advanced search functionality to cater for these search strategies, was undertaken with the assistance of the JISC Historic Books advisory board. This feedback showed that the platform had to cater for a variety of people who use ‘different combinations of search components, which depends on their level of literacy and the domain knowledge’ (Wong et al, 2009). With hindsight, JISC Collections should have paid more attention to this element of the UBiRD study rather than focusing solely on the keyword search.
Compare and contrast
As noted above, the decision was made to show users the OCR / TCP text supporting each digital image. This transparency was intended to help users evaluate the content and understand the limitations of the search, which is dependent on the quality of the OCR / TCP text and metadata. Users are accustomed to the ProQuest platform, which also displays the EEBO TCP text where available; the Cengage platform, however, does not display the ECCO OCR. It is natural for users to compare and contrast JHB with the publishers’ platforms. This was expected and has led to the development of a great deal of new functionality post launch, but what has become apparent is that information literacy levels with regard to evaluating and assessing ECCO and EEBO are still relatively low.
Anecdotal evidence, in the form of comments from users whose institutions have transferred to JHB, suggests a perception that the ‘JISC OCR’ on JHB is of much poorer quality than the ECCO content on the Cengage platform. These comments are perhaps a little surprising, as JISC Collections licensed the content from Cengage and uses the same OCR. It has been suggested that this misconception stems from the fact that users can, for the first time, see the OCR on JHB. It is understandable, then, that publishers are sometimes reserved about showing the underlying OCR that supports the digital images. However, JISC Collections supports the UBiRD principles and is working with the JHB advisory board to further develop users’ information literacy skills with regard to the content available on the platform and its limitations.
The biggest challenges (and perhaps, with hindsight, errors in development) have been in the area of metadata. The agreements JISC Collections made with ProQuest and Cengage to license the content in perpetuity on behalf of all UK HE and FE institutions did not include the MARC records available from the publishers. This was unfortunate, but at the time JISC Collections was unable to fund the purchase of the metadata at a national level. Instead, the MARC records were sold directly to institutions by the publishers. The licensed material supplied to JISC Collections therefore only included basic citation metadata, e.g. author, title and, in the case of EEBO, image ID number. This meant that searching EEBO content was limited to data within the basic metadata fields, except where an EEBO TCP text was available, in which case the full text was searched. Of the 125,000 volumes available in EEBO, 25,000 have been transcribed and marked up in XML (Text Creation Partnership, n.d.). JISC Collections was therefore in a predicament whereby 100,000 EEBO texts would not be surfaced through search on JHB unless a user entered the title or author.
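The effect of this limitation on search coverage can be illustrated with a minimal sketch. The record structure, field names, sample titles and the `matches` function are hypothetical and are not taken from the JHB implementation. The sketch assumes only what is described above: a record with a TCP transcription is searched on its full text, while a record without one is searched on its basic citation metadata alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Hypothetical EEBO record: basic citation metadata plus optional TCP text."""
    author: str
    title: str
    image_id: str
    tcp_text: Optional[str] = None  # None when no TCP transcription exists


def matches(record: Record, keyword: str) -> bool:
    """Return True if the keyword occurs in the searchable text of a record."""
    needle = keyword.lower()
    searchable = [record.author, record.title]
    if record.tcp_text is not None:
        # Full text is only searchable where a transcription is available.
        searchable.append(record.tcp_text)
    return any(needle in text.lower() for text in searchable)


# Invented sample records: only the first has a transcription.
records = [
    Record("Anon.", "A treatise of husbandry", "img-001",
           tcp_text="Of the sowing of barley and the keeping of bees."),
    Record("Anon.", "A second treatise of husbandry", "img-002"),
]

hits = [r.image_id for r in records if matches(r, "barley")]
print(hits)

# Figures quoted in the text: volumes without a TCP transcription.
eebo_total = 125_000
eebo_with_tcp = 25_000
print(eebo_total - eebo_with_tcp)
```

In this sketch a subject keyword such as "barley" finds only the transcribed record, whereas a word from the title finds both, which mirrors the situation in which untranscribed texts surface only when a user enters the title or author.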
To rectify this situation, considerable time was spent exploring options to license high quality metadata to support the EEBO content. For example, the British Library and ESTC North America jointly own the English Short Title Catalogue (ESTC), which includes detailed and high quality metadata for the titles within EEBO. The viability of each option was assessed in light of the approaching launch date, and it was decided to purchase the MARC records directly from ProQuest for use within the platform only.
Once the records were purchased, the rationalisation process to integrate them within the existing schema applied to the ECCO and BL metadata commenced. It became clear to JISC Collections that relying on the developers, who are not experts in MARC records or historic books metadata, to devise the schema had meant that fields of potential value to researchers had not been included. For example, in the BL collection, the metadata includes tags for visual elements such as images and portraits. JISC Collections is now working to include such elements in order to surface them in searching. The lesson learned here is that an analysis of the metadata from all three providers should have been undertaken as the very first step in the development process for JHB.
As mentioned previously, developing and launching a new service and content delivery platforms is a new venture for JISC Collections and its partners. The lessons learned so far (and there are many more not mentioned in this paper) have been extremely valuable but there are still many more challenges to be overcome such as providing free MARC records to the community and of course, further recommendations from the information seeking reports to implement.
Taking ownership and control
One of the key visions of JISC eCollections is that the education community take ownership of the content licensed on their behalf and are put in control of the service and its development. The JHB advisory board has risen to this challenge and has set itself ambitious Terms of Reference (JISC Historic Books Advisory Board, n.d.) with a focus on pioneering new technologies, partnerships and ease of use. The board discusses, agrees and prioritises developments to the platform against the budget, provides guidance to JISC Collections on content acquisition and is the driving force behind new experiments. One particular project that the board is exploring is crowd-sourcing corrections to OCR text in partnership with Eighteenth Century Connect, ProQuest and Cengage. By crowd-sourcing and sharing corrections at an international level, the scholarly community would be working collaboratively to improve the quality and use of historic book collections. This is an extremely complex and ambitious project, but the advisory board members believe in harnessing the power of the digital humanities community and that this is an important development for the future of JISC Historic Books and its users.
In the last two years, JISC Collections has faced some serious issues, mistakes and challenges in developing an independent service that protects and preserves the community’s content investments, simplifies access and use in line with information seeking research, and lets the community take ownership. Yet, sitting firmly within the education community it serves, and with the advisory boards in charge, JISC eCollections as a service is well placed to change with user behaviours and the scholarly environment, to ensure it becomes a valued community-owned content service.
Connaway, L. S., and Dickey, T. J. (2010). The Digital Information Seeker: Report on the Findings from Selected OCLC, RIN and JISC User Behaviour Projects. Dublin, OH., OCLC Research. Retrieved 31 May 2012 from http://www.jisc.ac.uk/media/documents/publications/reports/2010/digitalinformationseekerreport.pdf
Connaway, L. S., Dickey, T. J., and Radford, M. L. (2011). “If it is too inconvenient I’m not going after it”: convenience as a critical factor in information-seeking behaviors. Library and Information Science Research, 33, 179-190. http://dx.doi.org/10.1016/j.lisr.2010.12.002
Hampton-Reeves, S., Mashiter, C., Westaway, J., Lumsden, P., Day, H., Hewertson, H., et al (2009). Students’ use of research content in teaching and learning: a report for the Joint Information Systems Council (JISC). Preston, Centre for Research-informed Teaching, University of Central Lancashire. Retrieved 31 May 2012 from http://www.jisc.ac.uk/media/documents/aboutus/workinggroups/studentsuseresearchcontent.pdf
Head, A. and Eisenberg, M. (2009). Lessons Learned: How College Students Seek Information in the Digital Age. Washington, The Information School, University of Washington. Retrieved 31 May 2012 from http://projectinfolit.org/pdfs/PIL_Fall2009_finalv_YR1_12_2009v2.pdf
JISC Collections (2009). JISC national e-books observatory project: Key findings and recommendations. Final Report, November 2009. London, JISC Collections. Retrieved 31 May 2012 from http://observatory.jiscebooks.org/reports/jisc-national-e-books-observatory-project-key-findings-and-recommendations/
JISC Collections. (n.d.). Retrieved 31 May 2012 from http://www.jiscecollections.ac.uk/
JISC Historic Books Advisory Board. (n.d.). Retrieved 31 May 2012 from http://www.jiscecollections.ac.uk/advisory-board/jhbadvisoryboard/
Prabha, C., Connaway, L. S. and Dickey, T. J. (2006). Sense-making the information confluence: The whys and hows of college and university user satisficing of information needs. Ohio, The Ohio State University. Retrieved 31 May 2012 from http://www.oclc.org/research/activities/past/orprojects/imls/default.htm
Research Information Network. (2006). Researchers and discovery services: Behaviour, perceptions and needs. London: Research Information Network. Retrieved 31 May 2012 from http://www.rin.ac.uk/our-work/using-and-accessinginformation-resources/researchers-and-discovery-services-behaviour-perc.
Text Creation Partnership. (n.d.). Retrieved 31 May 2012 from http://www.textcreationpartnership.org/
Wong, W., Stelmaszewska, H. and Barn, B. (2009). JISC user behaviour observational study: User behaviour in resource discovery. London, JISC Collections. Retrieved 31 May 2012 from http://www.jisc.ac.uk/publications/programmerelated/2010/ubirdfinalreport.aspx
This post is the write-up of the author’s presentation at the Liber General Annual Conference 2012 in Tartu. A full video of the presentation can be found at http://uttv.ee/naita?id=12538