Data and Information Management

Author: Caitlin Lyons
Cooperative Research Centre for Torres Strait

Torres Strait Research Program

Data and Information Management CRC-TS Project Task Number: 5.2

T. J. Taranto C. R. Pitcher

Final Report

National Library of Australia Cataloguing-in-Publication data:

Taranto, T. J. (Thomas John).
CRC Torres Strait task 5.2, data and information management : final report for CRC Torres Strait.
Bibliography. Includes index.
ISBN 9781921232619 (pbk.)
ISBN 9781921232626 (pdf)
1. Marine sciences - Research - Queensland - Torres Strait Islands. 2. Torres Strait Islands (Qld.). I. Pitcher, C. R. (Clifford Roland). II. Cooperative Research Centre for Torres Strait. III. CSIRO. Marine and Atmospheric Research. IV. Title.
551.4609943

Citation: Taranto, T. J. and C. R. Pitcher (2007). CRC Torres Strait Task 5.2: Data and Information Management. Final Report for CRC Torres Strait. CSIRO Marine and Atmospheric Research, Cleveland. pp.42.

Published: March 2007 by CSIRO Marine and Atmospheric Research

© CSIRO Marine and Atmospheric Research and CRC Torres Strait, 2007

This work is copyright. Except as permitted under the Copyright Act 1968 (Cwth), no part of this publication may be reproduced by any process, electronic or otherwise, without the specific written permission of the copyright owners. Neither may information be stored electronically in any form whatsoever without such permission.

DISCLAIMER
CSIRO has taken all reasonable steps to ensure that the information contents in this publication are accurate at the time of publication. Readers should ensure that they make appropriate inquiries to determine whether new information is available on the particular subject matter

CRC Torres Data and Information Repository

June 2007

Data and Information Management

CRC-TS Task Number: 5.2

Tom Taranto Roland Pitcher CSIRO Marine and Atmospheric Research 233 Middle St, Cleveland, Qld.

ISBN 9781921232619 (pbk) ISBN 9781921232626 (pdf) CRC Torres Strait Research Task 5.2 Final Report


ACKNOWLEDGEMENTS

The compilation of data and research information into the Torres Strait Marine Research repository was achieved through the collaboration of many research agencies. The contributions by AFMA, AIMS, JCU, QDPI and the TSRA along with the funding by the CRC Torres Strait and the CSIRO Division of Marine and Atmospheric Research are acknowledged. The continued support by the Reef and Rainforest Research Centre (RRRC) and the CSIRO Marine and Atmospheric Research (CMAR) Data Centre will ensure that marine research in the Torres Strait will have an extensive internet based library of searchable information and data to draw upon into the future.


TABLE OF CONTENTS

ACKNOWLEDGEMENTS
TABLE OF CONTENTS
FIGURES
TABLES
NON-TECHNICAL SUMMARY
    PROJECT: Task 5.2 Data and Information Management
    PRINCIPAL INVESTIGATOR: Tom Taranto
    CO-INVESTIGATOR: Roland Pitcher
    ADDRESS
    OBJECTIVES
    NON-TECHNICAL SUMMARY
        Achievements and Outcomes against the objectives (2006 - 2007)
        Utilisation and Application of the Research (2006 - 2007)
        Publications (2006 - 2007)
        Outcomes Achieved
1. INTRODUCTION
    1.1. BACKGROUND
    1.2. NEED
    1.3. OBJECTIVES
2. METHODS
    2.1. Facilitate the collation of CRC-TS Intellectual Property (IP)
    2.2. Develop a searchable repository website
    2.3. Coordinate the moderation and listing of sensitive data and publications
3. RESULTS
    3.1. Facilitate the collation of CRC-TS Intellectual Property (IP)
    3.2. Develop a searchable repository website
    3.3. Coordinate the moderation and listing of sensitive data and publications
4. DISCUSSION
    4.1. Facilitate the collation of CRC-TS Intellectual Property (IP)
    4.2. Develop a searchable repository website
    4.3. Coordinate the moderation and listing of sensitive data and publications
5. BENEFITS
6. FURTHER DEVELOPMENT
7. ACHIEVEMENT OF OUTCOMES
8. CONCLUSIONS
9. RECOMMENDATIONS
10. REFERENCES
11. ABBREVIATIONS & GLOSSARY
12. APPENDIX 1: INTELLECTUAL PROPERTY
13. APPENDIX 2: TASK MANAGEMENT LISTING
14. APPENDIX 3: STAFF

FIGURES

Figure 3-1. Index page showing custom repository search tool and Marlin Metadata Search tool
Figure 3-2. Repository custom Google search result page
Figure 3-3. Direct access to CMAR Data Centre
Figure 3-4. Additional page to search other websites
Figure 3-5. Direct access to other Australian Data Centres
Figure 3-6. Search libraries
Figure 3-7. Custom Google search on contributing web domains

TABLES

Table 3-1. Status of CRC Torres IP works lodged (as at 23 Mar 2007) by project
Table 3-2. Status of CRC Torres IP works lodged (as at 23 Mar 2007) by Task level


NON-TECHNICAL SUMMARY

PROJECT: Task 5.2 Data and Information Management
PRINCIPAL INVESTIGATOR: Tom Taranto
CO-INVESTIGATOR: Roland Pitcher
ADDRESS: CSIRO Marine and Atmospheric Research, 233 Middle St, Cleveland, 4163
Ph: 07 3826 7259
Fax: 07 3826 7222
Email: [email protected]

OBJECTIVES:

1. To facilitate the collation of CRC-TS Reports, metadata and available associated data from Principal Investigators in a standard format where possible.

2. To develop a searchable website on a secure repository containing linked Reports, metadata and available associated data (for limited access where appropriate).

3. To coordinate the moderation and listing of sensitive data and publications for the project.

NON-TECHNICAL SUMMARY:

The CRC Torres Strait Data and Information Management Task was commissioned in June 2006, with the involvement of a CRC TS Steering Committee, to capture publications and data produced by the CRC Torres Strait. Preliminary work on the project began in June 2006. As the CRC Torres Strait wound up in July 2006, the Contract and the IP collected by this project have since been transferred to the Reef and Rainforest Research Centre (RRRC).

A product of the project was to provide a website where all available CRC Torres Strait marine related research information can be accessed by all stakeholders. The website has been established and populated with publications and data as received from past CRC Torres Strait Task Leaders. The website and associated holdings are hosted by the CSIRO Marine and Atmospheric Research Data Centre, to be maintained in perpetuity.

During the collection phase of the Task, progress reports were provided by way of fortnightly emails (up to 18th Dec 2006) to all CRC Task Leaders detailing the status of the collection of works and requesting that outstanding works be lodged to the repository. Due to the slow response by Task Leaders in lodging their research works, this Project was granted an extension to provide additional opportunity for Task Leaders to lodge their CRC Torres Strait works. At the time of drafting of this Final Report (23 March 2007), only 91 of 187 identified works had been received, leaving 96 outstanding.

An important objective of this Task was to identify sensitive material. Effective procedures were developed in association with Torres Strait Regional Authority (TSRA) staff to ensure appropriate access constraints are enforced for all CRC works submitted to the repository.

All lodged literature and data are now available online at http://www.cmar.csiro.au/DataCentre/torres, accessible using customized search engines. Following final feedback, the website will be promoted to other Torres Strait agencies and libraries for inclusion onto their websites.

Achievements and Outcomes against the objectives (2006 - 2007)

The resultant Torres Strait Marine Research Repository has successfully published the CRC Torres Strait IP works that have been lodged with the website administrator. In addition, the customised search interfaces available for stakeholders significantly enhanced the repository’s planned outcome to provide a searchable repository of research from the CRC Torres Strait.


With a defined repository now permanently maintained by the CSIRO Marine and Atmospheric Research Data Centre, the maintenance and availability of any lodged IP (not just that of the CRC Torres Strait) is assured, thus providing an enduring service to the Torres Strait community. A customised search interface has been developed to simplify searching of the Repository, and can be incorporated onto other agency websites and/or linked to other independent repositories. The result is a new search interface that can seamlessly link independent research efforts.

Utilisation and Application of the Research (2006 - 2007)

The design of the repository and its search interfaces provides both individual stakeholders and organizations the opportunity to participate in providing knowledge to a common user interface. Individual stakeholders are invited to lodge their works to the repository permanently served by the CSIRO Marine and Atmospheric Research Data Centre, while those agencies with their own inventories are invited to supply internet links to their publicly accessible directories.

The application of this research requires that the repository be promoted to stakeholders of the Torres Strait. In addition there will always be a need to facilitate either the lodgment of IP works to the repository or hyperlinks to new and valued inventories of information managed by other research agencies. It is only by the continued maintenance, cooperation and participation of like-minded stakeholders that this search interface can maximize research developments within the Torres Strait.

Publications (2006 - 2007)

Taranto, T. J. and C. R. Pitcher (2007). CRC Torres Strait Task 5.2: Data and Information Management. Final Report for CRC Torres Strait. CSIRO Marine and Atmospheric Research, Cleveland.

Taranto, T. J. (2007). CRC Torres Strait Marine Research Repository. http://www.cmar.csiro.au/DataCentre/torres. CSIRO Marine and Atmospheric Research Data Centre.

Outcomes Achieved

The Torres Strait Marine Research Repository provides stakeholders of the Torres Strait both a secure repository of past research efforts and a utility that searches both this repository and other information repositories related to the Torres Strait marine environment.

The collation of CRC-TS reports, metadata (and available associated data) from Principal Investigators (PIs) is the foundation of the Torres Strait Marine Research Repository. All lodged literature is in standard pdf format with all submitted metadata adhering to the Australian ANZLIC standard. All CRC-TS literature and data has been vetted by the TSRA for sensitivity, and appropriate internet website security options implemented. Both metadata and associated data are maintained by the CSIRO Marine and Atmospheric Research Data Centre.

In addition to providing a customised searchable website that links to the Repository of CRC-TS outputs (literature and data), the website provides a searchable interface to other non-CRC research works and data sources related to Torres Strait marine resources. By promoting the Repository and its search capabilities this project benefits not only future research in the region but also the communities that depend on its resources.

KEYWORDS: Torres Strait data metadata reports literature archive repository

1. INTRODUCTION

1.1. BACKGROUND

This CRC Torres Strait Task was initiated in response to a request from the CRC TS Board during June 2005 regarding the issue of data archiving and management of information arising from CRC TS research tasks. Following that request, CSIRO Marine and Atmospheric Research conducted a preliminary project to scope the requirements to address the CRC TS Board's needs.

1.2. NEED

The scoping project identified that the CRC TS had contracted over 24 projects that were expected to produce data and final reports/theses. There were also several AFMA contracted projects conducted since the completion of the AFMA TS Reports and Data Archive (Taranto, 2004). A need was identified to capture these reports, data and metadata before some CRC TS Task Leaders dispersed and/or became difficult to contact. The CRC Torres Strait managed a set of Research Tasks under a co-ordinated Research Plan due to complete by mid 2006.

It was identified that the principal tasks of a Data Management Task would be to facilitate the entry of metadata and collection of each of the individual CRC TS Task datasets, facilitate the production of reports in a standard PDF format, and work with each PI to get the actual data and reports lodged into a central system.

It was also identified that a web site would need to be developed to maximise future distribution of collected information and data, preferably linked under the Torres Strait Regional Authority web site (www.tsra.gov.au). The site was to contain searchable metadata, the PDF reports as a searchable document library, and access to the actual data for direct downloading where possible and appropriate. All content was to be moderated for privacy and cultural sensitivities and published under the appropriate restrictions as defined by the TSRA. A companion DVD was identified as an additional or alternative product that could also be developed, but this was not progressed.

1.3. OBJECTIVES

The original objectives were:

1. To facilitate the collation of CRC-TS Reports, metadata and available associated data from Principal Investigators (PIs) in a standard format where possible.

2. To develop a searchable website on a secure repository containing linked Reports, metadata and available associated data (for limited access where appropriate).

3. To coordinate the moderation and listing of sensitive data and publications for the project.

2. METHODS

2.1. Facilitate the collation of CRC-TS Intellectual Property (IP)

To address the objective of collating CRC-TS Reports, metadata and available associated data from Principal Investigators in a standard format where possible, a number of facilitation services were provided to CRC TS Task Leaders.

Immediately following an email from the CRC TS to all Task Leaders advising of the initiation of this Data and Information Management Task, and requesting their response, this task initially produced and distributed a CRC Torres Strait Final Report template document (on 13 June 2006) to facilitate a common reporting interface. The template was designed in consultation with CMAR graphic designers and CRC TS staff. This Final Report adheres to that design template.

At the commencement of the Task, extensive research and interviews were conducted with CRC Task Leaders and other CRC TS project staff to establish the extent of IP works attributed to the CRC TS. This IP inventory became the basis of coordinating the lodgment of Task outputs to the Data and Information Repository. See APPENDIX 1: Intellectual Property.

To promote ongoing lodgment of CRC TS IP works, all Task Leaders were emailed status reports fortnightly from mid September 2006 to end December 2006. Each status report highlighted the need for Task Leaders to lodge IP works to the repository and included instructions on the agreed lodgment process. In addition, all Task Leaders were advised of the preference to lodge reports in PDF format based on the CRC report template that was specifically developed for the exercise.

Due to the lower than expected lodgment of CRC IP works all Task Leaders were personally contacted during December 2006 and an extension of this Task's Milestone Report (from end December 2006 to February 2007) was sought to provide further time for Task Leaders to lodge their IP works. In addition to the above status reports, and to further facilitate IP lodgment, each Task Leader was personally informed that they could simply lodge their works by email directly to the Repository Administrator. Though discussions with Task Leaders were positive, at the time of drafting this report (23 March 2007) there were still a significant number of identified works yet to be submitted to the repository.

Another addition to the publications and metadata being collected from the CRC Torres Strait Program is the likely inclusion of publications and metadata as published on the AFMA Torres Strait Research DVD (Taranto, 2004). During September 2006 the then AFMA Director (Richard McLoughlin) was approached to enquire if AFMA was agreeable to releasing copyright to selected works of the AFMA Torres Strait Research DVD. After receiving a positive response, CSIRO Legal services were requested to draft the appropriate copyright permissions to present to AFMA. It was envisioned that conditional on AFMA granting permission, the addition of the 96 publications dating back to 1980 and approximately 30 metadata statements, from this DVD, to the repository website would provide stakeholders ready access to an extensive library of information to facilitate planning, management, and research within the Torres Strait region.

2.2. Develop a searchable repository website

Research was carried out to address the objective of developing a searchable website on a secure repository containing linked reports, metadata and available associated data (for limited access where appropriate). Discussions were also conducted with e-Repository stakeholders at various Australian universities, CSIRO Corporate and the CMAR Data Centre regarding implementation of an enterprise level searchable information and data website repository in a secure environment as part of this Task’s output.

The open source Fedora and DSpace e-repository applications were evaluated (CatalystIT 2003), and consultations were held with Fedora hosting agencies at the University of Qld and Monash University, and with DSpace hosting agencies at Queensland University of Technology and the Australian National University Digital Resource Services Centre.

After consideration of available research and discussions, recommendations were made to the host data centre (CMAR Data Centre) for implementation of the DSpace digital repository system as a pilot project. The DSpace open source application was recommended because it is designed specifically to capture, store, index, search, preserve, and distribute digital research material. Research institutions worldwide use DSpace as an institutional repository. The DSpace open source platform is freely available and can be customized and extended as required. Its contents are exportable to other common formats if needed. The intent of selecting an electronic repository such as DSpace was to automate lodgment, network linking, storage and distribution of reports, metadata and available associated data, and to have the holdings ranked highly by the ‘Google Scholar’ search engine.

However, due to time constraints this proposed enterprise system was shelved and a simpler hybrid system was developed that addressed only the direct searchable web delivery needs of this Data and Report Management Task. To manage the anticipated small number of CRC works requiring processing (initial expectations were 60 to 80), it was decided that lodged material would be manually loaded to a web file directory structure hosted by the CMAR Data Centre and that any metadata would be manually submitted to the existing CMAR Marlin Marine Research Metadata Directory by the Repository Administrator.

Without the implementation of an enterprise level repository, other cost effective search engines were considered. The search engine used in the AFMA Torres Strait Research DVD publications and metadata (Taranto, 2004) was examined, but rejected as it did not cater for searching of Adobe pdf files, which was the prescribed information format to be lodged by Task Leaders.
In January 2007, Google released a new custom search engine which appeared to address the needs of the Task. It was easy for users to use, quick and searched all necessary file formats (pdf, MSWord and ascii). The Google search interface was also considered superior as its search protocol is familiar to most stakeholders who would use the Torres Strait repository — and being freeware, it was also cost effective. The Custom Google Search engine permits search queries to be constrained to selected websites, web directories and file types — permitting quick searching of specific material. Due to the need to utilize the current CMAR Data Centre Marlin Metadata Repository, a separate search interface was developed to link directly to the system from the website. This interface provides Torres Strait stakeholders direct access to searching all public holdings of the CSIRO database and provides direct access to editing of additional metadata records (and associated data) if desired. As an additional service to Torres Strait stakeholders, other search tools were investigated (and developed) to provide direct access to: the Australian Spatial Data Directory (ASDD), Google Scholar and various web domains belonging to Agencies conducting relevant Torres Strait activities.
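The site and file-type constraints described above can be illustrated with Google's public `site:` and `filetype:` query operators. The following is a minimal sketch only: the repository's custom search engine was configured through Google's own service, not through code like this, and the search terms and helper names (`build_constrained_query`, `search_url`) are hypothetical.

```python
from urllib.parse import urlencode


def build_constrained_query(terms, site=None, filetype=None):
    """Combine free-text terms with Google's site: and filetype: operators."""
    parts = [terms.strip()]
    if site:
        # Restrict results to one website or web directory.
        parts.append(f"site:{site}")
    if filetype:
        # Restrict results to one file type, e.g. pdf.
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def search_url(query):
    """Return a standard Google web search URL for the query."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical search terms; the directory is the repository address
# quoted in this report.
query = build_constrained_query(
    "seagrass dieback",
    site="www.cmar.csiro.au/DataCentre/torres",
    filetype="pdf",
)
print(query)
print(search_url(query))
```

Constraining by directory rather than by whole domain is what keeps results limited to repository holdings instead of the entire host website.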

2.3. Coordinate the moderation and listing of sensitive data and publications

Security for sensitive research publications and data has been paramount. All lodgments to the repository were initially placed in a password protected website. During January 2007, discussions with TSRA staff identified an effective protocol for TSRA to assess the sensitivity of each individual report and dataset. Under this process, all material deemed non-sensitive by the TSRA was identified and the Repository Administrator directed to relocate such works onto the public website area in time for the scheduled repository launch in April 2007. Identified sensitive works were to remain within the restricted area of the repository under three levels of security: password protection, ‘robot exclusion’ (no caching) and an internet index search control file.
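The ‘robot exclusion’ layer mentioned above is conventionally expressed in a robots.txt file. As a hedged sketch, the rule below is an assumed example and not the repository's actual file: the `restricted/` directory name and the `may_crawl` helper are hypothetical. Python's standard `urllib.robotparser` shows how a compliant crawler would interpret such a rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the restricted directory name is an
# assumption made for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /DataCentre/torres/restricted/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(path, agent="Googlebot"):
    """Return True if a compliant crawler may fetch the given path."""
    return parser.can_fetch(agent, path)


print(may_crawl("/DataCentre/torres/index.htm"))
print(may_crawl("/DataCentre/torres/restricted/report.pdf"))
```

Robot exclusion is advisory: it only asks well-behaved crawlers not to index a path, which is consistent with it being layered with password protection here rather than used on its own.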

3. RESULTS

Each of the Task’s three objectives was accomplished. The varying levels of completeness were related to the degree of responses from CRC TS Task Leaders.

3.1. Facilitate the collation of CRC-TS Intellectual Property (IP)

The outcome of this process has been the lodgment of fewer than 50% of identified works to the repository (91 of 187), and of those only a minimal number used the supplied template. See APPENDIX 1: Intellectual Property.

During the follow-up discussions with each of the Task Leaders during December 2006, some implied that the lower than expected response by Task Leaders to lodge their works with the repository could be attributed to the late commencement of the CRC Task 5.2 Data and Information Management Task, and to the need by many Task Leaders to ‘prepare for the Special Issue Publication incorporating much of the CRC research outputs’. In order to facilitate a timely start on this Task, CSIRO authorized staff to commence work in May 2006 prior to contract signing, which occurred on 8 December 2006. Most CRC TS Tasks completed at end June 2006.

Table 3-1 summarises the status of lodgments at Project level and Table 3-2 summarises the status of lodgments at individual Task level as at 23 March 2007. See APPENDIX 1: Intellectual Property for a more detailed Task Level Table that lists identified IP works and their current status. This listing is also available online at http://www.cmar.csiro.au/datacentre/torres/CRCTS2003_06/index.htm. This comprehensive IP listing has been the basis of information collation and at all times during the project has been available for Task Leaders to provide feedback on.

Table 3-1. Status of CRC Torres IP works lodged (as at 23 Mar 2007) by project. Note that the number of Reports also includes the Final (Web) Reports lodged at the CRC Torres website in July 2006.

Entries for each work type are shown as submitted/identified. A dash marks a cell with no entry.

Project     Reports  Papers  Data    Metadata  Abstracts  Posters  Presentations  Articles  Total submitted  Total identified  Outstanding
Project 1   22/23    1/13    7/15    8/15      0/0        1/1      4/5            2/2       45               74                29
Project 2   8/9      0/8     0/7     0/7       5/5        0/0      0/0            0/2       13               38                25
Project 3   10/15    0/1     0/8     0/8       0/0        0/0      0/0            7/7       17               39                22
Project 4   7/10     0/5     1/3     2/3       0/0        0/0      0/0            0/0       10               21                11
Project 5   4/4      0/0     2/2     0/2       0/0        0/0      0/0            0/0       6                8                 2
Unknown     -        0/7     -       -         -          -        -              -         0                7                 7
TOTAL       51/61    1/34    10/35   10/35     5/5        1/1      4/5            9/11      91               187               96
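As a worked check of the Table 3-1 totals, the short sketch below recomputes the overall lodgment rate from the per-project submitted and identified counts. The counts are transcribed from the table; the variable and function names are illustrative only.

```python
# (submitted, identified) per project, transcribed from Table 3-1.
STATUS = {
    "Project 1": (45, 74),
    "Project 2": (13, 38),
    "Project 3": (17, 39),
    "Project 4": (10, 21),
    "Project 5": (6, 8),
    "Unknown": (0, 7),
}


def summarise(status):
    """Return total submitted, total identified, outstanding and rate (%)."""
    submitted = sum(s for s, _ in status.values())
    identified = sum(i for _, i in status.values())
    outstanding = identified - submitted
    rate = 100 * submitted / identified
    return submitted, identified, outstanding, rate


submitted, identified, outstanding, rate = summarise(STATUS)
print(f"{submitted} of {identified} lodged ({rate:.1f}%), {outstanding} outstanding")
```

The result, 91 of 187 works lodged with 96 outstanding, matches the TOTAL row and the statement that fewer than 50% of identified works were lodged.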

The lower than expected use of the Report template may also be associated with the late commencement of the project. Although the template was provided to all Task Leaders (on 13/06/2006) in advance of the milestone date, this closely coincided with the Final Reporting deadline for CRC Torres Strait Tasks. Some Task Leaders had already completed their reports, or their reports were well underway, so there was insufficient time for many of them to adopt the template. The majority of Reports lodged were either based on existing agency templates or the MSWord normal template.

Table 3-2. Status of CRC Torres IP works lodged (as at 23 Mar 2007) by Task. Note that the number of Reports also includes the Final (Web) Reports lodged at the CRC Torres website in July 2006.

Reports and overall totals are listed for each Task. The breakdown by the other work types (Papers, Data, Metadata, Abstracts, Posters, Presentations and Articles) is given at Project level in Table 3-1 and item by item in APPENDIX 1: Intellectual Property. A dash marks a cell with no entry.

Task    Reports submitted  Reports identified  Total submitted  Total identified  Outstanding
1.1     3                  3                   5                13                8
1.2     2                  2                   2                3                 1
1.3     2                  2                   2                8                 6
1.4     2                  2                   5                6                 1
1.5     2                  2                   3                5                 2
1.6a    2                  2                   15               18                3
1.7     1                  1                   1                1                 0
1.8     2                  2                   3                5                 2
1.11    2                  2                   4                6                 2
1.13    2                  2                   3                3                 0
1.14    1                  1                   1                4                 3
1.15    0                  1                   0                1                 1
1.16    1                  1                   1                1                 0
Prj 1   22                 23                  45               74                29
2.1     1                  2                   1                3                 2
2.2     5                  5                   10               27                17
2.3     2                  2                   2                8                 6
Prj 2   8                  9                   13               38                25
3.1     2                  2                   2                6                 4
3.2     2                  2                   2                2                 0
3.3     1                  2                   1                7                 6
3.4     3                  5                   3                11                8
3.5     1                  1                   1                1                 0
3.6     1                  2                   8                9                 1
3.7     0                  1                   0                3                 3
Prj 3   10                 15                  17               39                22
4.1a    1                  1                   3                6                 3
4.2     2                  2                   2                2                 0
4.3     2                  2                   3                4                 1
4.4     1                  2                   1                3                 2
4.6     0                  2                   0                2                 2
4.7     1                  1                   1                4                 3
Prj 4   7                  10                  10               21                11
5.1     2                  2                   4                6                 2
5.2     2                  2                   2                2                 0
Prj 5   4                  4                   6                8                 2
Unk     -                  -                   0                7                 7
TOTAL   51                 61                  91               187               96

The additional inclusion of selected publications and metadata published on the AFMA Torres Strait Research DVD (Taranto and Pitcher, 2004) was still in progress at the time of drafting this report. CSIRO Legal has sought permission from AFMA to load the entire contents of the AFMA DVD onto the website, except for six publications whose copyright is held by agencies other than AFMA or CSIRO. Upon receipt of AFMA copyright permission, the selected works can be transferred to the readied website.

3.2. Develop a searchable repository website

The repository website was successfully developed in conjunction with the CMAR Data Centre. Because the Torres Strait Marine Research Repository now resides on servers maintained by CSIRO, its contents are automatically backed up and will be maintained into the future. The repository is linked from the CRC Torres Strait homepage and can be found at http://www.cmar.csiro.au/datacentre/torres.

The webpage has been kept deliberately simple to minimize any issues should fellow research agencies wish to incorporate the repository's search tools into their own webpages. The custom Google search engine constrains search queries to the repository website, searches all necessary file formats (PDF, MS Word, MS Excel, ASCII, etc.) and returns results for specific material in the familiar Google format.

A separate search interface links directly to the CMAR Data Centre. This interface gives Torres Strait stakeholders direct access to search all public holdings of the CSIRO database and, if desired, to edit additional metadata records (and associated data) directly from the website.

The website provides access to IP copyrighted by CRC TS (now RRRC). As owner of the IP, the RRRC is responsible for ensuring that all necessary licensing requirements for IP stored within the repository (and hosted by the CMAR Data Centre) are in place. To assist the RRRC, the Repository Administrator investigated suitable licensing for the RRRC to assign to the repository holdings when they are made available to the public at large. The diagram of the website below shows a link to an existing CSIRO Disclaimer and the use of a ‘Creative Commons’ license, a simplified licensing regime specifically designed for the transfer of Australian internet based information and recommended by the Queensland Government. These are provided as examples that the owner (the RRRC) can put in place before the IP is made available to the public at large.

As an additional service to Torres Strait stakeholders, other search tools have been developed on an associated webpage to provide direct access to:

1. The Australian Spatial Data Directory (ASDD), for which the CMAR Data Centre is a node. This interface facilitates searching across data warehouses, e.g. Geoscience Australia, ERIN.
2. Google Scholar. This search engine provides a simple way to search broadly for scholarly literature. The search term incorporates ‘Torres Strait +marine’ to refine the search to literature specific to this Task’s area of interest, namely Torres Strait marine research. Publicly available peer-reviewed papers, theses, books, abstracts and articles from academic publishers, professional societies, preprint repositories, universities and other scholarly organizations are available.
3. Various web domains belonging to Agencies with relevant Torres Strait activities, e.g. AFMA, CRC Torres, RRRC, CSIRO.

Visitors to the repository are invited to add other Torres Strait related web domains by contacting the Repository Administrator identified on the search engines' webpage.
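The two Google-based searches described above can be illustrated with a short sketch. It is not the repository's actual implementation, which is not documented in this report. It assumes only that a query can be limited to the repository with Google's `site:` operator and that the Google Scholar search is seeded with the term ‘Torres Strait +marine’. The function names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Repository root as given in this report.
REPOSITORY_SITE = "www.cmar.csiro.au/datacentre/torres"


def site_restricted_query(terms: str, site: str = REPOSITORY_SITE) -> str:
    """Build a Google web-search URL limited to one site via the `site:` operator."""
    return "https://www.google.com/search?" + urlencode({"q": f"site:{site} {terms}"})


def scholar_query(extra_terms: str = "") -> str:
    """Build a Google Scholar URL seeded with the term 'Torres Strait +marine'."""
    query = "Torres Strait +marine"
    if extra_terms:
        query = f"{query} {extra_terms}"
    return "https://scholar.google.com/scholar?" + urlencode({"q": query})


def query_terms(url: str) -> str:
    """Decode the `q` parameter of a search URL (useful for checking what was sent)."""
    return parse_qs(urlsplit(url).query)["q"][0]


if __name__ == "__main__":
    print(site_restricted_query("seagrass"))
    print(scholar_query("dugong"))
```

Because the restriction is carried in the query string itself, the same URL pattern can be linked from another agency's page without any server-side component.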

Figure 3-1. Index page showing custom repository search tool and Marlin Metadata Search tool

Figure 3-2. Repository custom Google search result page

Figure 3-3. Direct access to CMAR Data Centre

Figure 3-4. Additional page to search other websites

Figure 3-5. Direct access to other Australian Data Centres

Figure 3-6. Search libraries

Figure 3-7. Custom Google search on contributing web domains

3.3. Coordinate the moderation and listing of sensitive data and publications

During the acquisition phase it was noted that some agencies considered their works confidential or restricted. IP works identified as restricted by individual researchers and/or TSRA staff are listed in the IP listing (APPENDIX 1: Intellectual Property). Researchers whose confidential or restricted IP works were not already managed within a secure enterprise environment were asked to submit those CRC TS works to the Repository.

A small number of sensitive works (2) currently reside within a restricted area of the repository under three levels of security: password protection, ‘robot exclusion’ (no caching) and an internet index search control file.

Some researchers, however, were not prepared to lodge IP that was seen as confidential or restricted. In these cases it was ascertained whether the current custodian provided a secure backup environment. As the Repository Administrators were authorized only to collate lodged IP, no other action was taken other than to identify those works within both the Milestone Report and this Final Report. See APPENDIX 1: Intellectual Property for a more detailed Task Level Table that lists all identified IP works and their current status; those coloured black have been identified as remaining with the existing custodian.
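The ‘robot exclusion’ layer can be illustrated with a minimal sketch using Python's standard `urllib.robotparser`. The rule file and the `/restricted/` directory name below are assumptions made for illustration only; the repository's actual control files and directory layout are not reproduced in this report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: the restricted directory name is an
# assumption, not the repository's actual layout.
ROBOTS_TXT = """\
User-agent: *
Disallow: /datacentre/torres/restricted/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "Googlebot") -> bool:
    """Return True if a compliant crawler identifying as `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "http://www.cmar.csiro.au/datacentre/torres"
    for path in ("/index.htm", "/restricted/report.pdf"):
        print(path, is_crawlable(base + path))
```

Robot exclusion is advisory: it asks compliant crawlers not to fetch or index a path but does not prevent access, which is consistent with it being combined with password protection for the sensitive works.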

4. DISCUSSION

4.1. Facilitate the collation of CRC-TS Intellectual Property (IP)

The number of CRC TS IP works identified was higher than initially anticipated: 187 compared with the anticipated 60 to 80. However, the lodgment rate was markedly lower, at approximately 50% (91 of the 187 identified works). This low rate of lodgment by Task Leaders was qualitatively attributed to two factors: the late initiation of CRC Task 5.2 Data and Information Management, and the need by many Task Leaders to ‘prepare for the Special Issue Publication incorporating much of the CRC research outputs’.

CRC Task 5.2 Data and Information Management did not commence until June 2006. With most CRC TS Tasks scheduled to complete at the end of June 2006, this left little time to develop any protocols necessary to lodge IP works effectively and to familiarize Task Leaders with them. The late commencement also limited adoption of the report template, which was specifically drafted to give CRC TS IP works a consistent appearance.

The need to prepare for the Special Issue Publication also drew attention away from the coordinated lodgment of IP works. Many Task Leaders appeared to place higher priority on the publication than on securing the IP in the repository. This prioritization was not contested.
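The lodgment rate quoted above can be checked against the project subtotals reported in Table 3-2. The sketch below is only an arithmetic illustration; the variable and function names are hypothetical, and the counts are the submitted and identified totals for each project as at 23 Mar 2007.

```python
# (submitted, identified) IP works per project, from the Prj subtotal rows
# and the Unk row of Table 3-2 (status as at 23 Mar 2007).
COUNTS = {
    "Prj 1": (45, 74),
    "Prj 2": (13, 38),
    "Prj 3": (17, 39),
    "Prj 4": (10, 21),
    "Prj 5": (6, 8),
    "Unk": (0, 7),
}


def lodgment_rate(submitted: int, identified: int) -> float:
    """Percentage of identified works that were lodged."""
    return 100.0 * submitted / identified


def summarise(counts: dict) -> dict:
    """Return the lodgment rate (%) per project plus an overall TOTAL entry."""
    rates = {name: round(lodgment_rate(s, i), 1) for name, (s, i) in counts.items()}
    total_submitted = sum(s for s, _ in counts.values())
    total_identified = sum(i for _, i in counts.values())
    rates["TOTAL"] = round(lodgment_rate(total_submitted, total_identified), 1)
    return rates


if __name__ == "__main__":
    for name, rate in summarise(COUNTS).items():
        print(f"{name:<6} {rate:5.1f}%")
```

On these figures the overall rate is 48.7% (91 of 187), consistent with the approximately 50% quoted, and the project rates range from 34.2% (Prj 2) to 75.0% (Prj 5).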

4.2. Develop a searchable repository website

The resultant ‘hybrid’ repository is a web directory hosted by the CMAR Data Centre, with a custom Google search engine and a direct link to the CMAR Data Centre Metadata Directory. It successfully addresses the objective to develop a searchable website on a secure repository containing linked reports, metadata and available associated data (with limited access where appropriate).

The tools developed for the ‘hybrid’ repository also offer the opportunity to provide additional services for Torres Strait stakeholders. Additional search tools have been developed to provide direct access to the Australian Spatial Data Directory (ASDD), Google Scholar and various web domains belonging to Agencies conducting relevant Torres Strait activities. These services provide greater access to Torres Strait related research material than could be stored within the Data Centre Repository alone.

Research into enterprise level repositories identified the DSpace system as superior for storing, searching and disseminating institutional or enterprise level research material. However, the logistics of implementing such a system within the short timeframe made it impractical.

The future success of the Repository requires its promotion as an effective tool for locating relevant research information. It is hoped that the Repository will be openly adopted because its search engine is more participatory, allowing stakeholders to contribute links, and because its interface can be embedded within other agencies’ websites.

Suitable licensing for the RRRC to assign to the repository holdings has been considered. The existing Disclaimer and a ‘Creative Commons’ license, a simplified licensing regime specifically designed for the transfer of Australian internet based information, have been provided as examples for the RRRC to assign to the website before the IP is made available to the public at large.

4.3. Coordinate the moderation and listing of sensitive data and publications

The cooperation of TSRA staff contributed to the effective management of lodged research material. All works lodged were vetted by TSRA staff to ascertain their sensitivity, and appropriate website security measures were adopted by the Repository Administrator. Some Task Leaders were not prepared to lodge IP that was seen as confidential or restricted. As Repository Administrators were authorized only to collate lodged IP, no other action was taken other than to identify those works within both the Milestone Report and this Final Report. APPENDIX 1: Intellectual Property identifies (coloured black) the IP that remains with the existing custodian.

5. BENEFITS

It was envisioned that the CRC Torres Strait online marine research repository would be a foundation for other research within the region. Future research would be well served by ready access to extensive past research material, and by the ability to share research findings with minimal constraints yet with the expectation that sources are acknowledged.

The design of the repository and its search interfaces gives both individual researchers and organizations the opportunity to contribute materials to a common user interface. Individual stakeholders are invited to lodge their works in the repository permanently served by the CSIRO Marine and Atmospheric Research Data Centre, while agencies with their own inventories are invited to supply internet links to their publicly accessible directories. Fellow research Agencies serving publications and/or data are invited to provide links to the custom search interface. The presumption, however, is that any information linked to the search interface is publicly available. Restricted items are catered for only within the existing repository, which is served by the CMAR Data Centre.

The Repository search interface has been designed to facilitate inclusion on other websites and not to be seen as owned by any one agency. The search page can be simply incorporated into the web frames of other web domains. It contains minimal parentage information, as the purpose of the repository is simply to provide access to research works irrespective of the custodian of the research.

6. FURTHER DEVELOPMENT

There will always remain a need to promote the repository to stakeholders of the Torres Strait. In addition, there will always be a need to facilitate either the lodgment of IP works in the repository or the addition of hyperlinks to new and valued inventories of information managed by other research agencies. It is only through the continued cooperation and participation of stakeholders that this search interface can maximize research developments within the Torres Strait.

At present, the system depends on external applications and services delivered by Google, relying on the Google web crawler applications (Googlebots) to search and index individual web files. Though prescribed commands and an efficient web design have been followed to maximise the frequency of visits by Google’s web crawlers, the Googlebots are self managed and there is no guarantee of a quick search of the complete repository. It has been observed that the indexing of new information within the repository currently takes between one and three weeks. This could be improved by implementing an enterprise repository system such as DSpace.

7. ACHIEVEMENT OF OUTCOMES

The resultant Torres Strait Marine Research Repository has successfully published the CRC Torres Strait IP works that have been lodged with Administrators. In addition, the customised search interfaces available to future stakeholders significantly enhance the repository’s planned outcome of providing a searchable repository of research IP works from the CRC Torres Strait.

With a defined repository now permanently maintained by the CSIRO Marine and Atmospheric Research Data Centre, the maintenance and availability of any lodged IP, not just that of the CRC Torres Strait, is assured, providing an enduring service to the Torres Strait research community. The customised search interface has been developed so that it can be simply incorporated into and/or linked from other agencies’ websites, seamlessly linking independent research efforts between stakeholders and custodial agencies of the Torres Strait.

8. CONCLUSIONS

Overall, the outputs of the project have contributed to greater outcomes than initially anticipated. The number of CRC IP works successfully lodged into the enduring and secure repository was 10% higher than initially predicted, and the customised search interface provides a search not only of the repository holdings but also of other defined external repositories. The disappointing response from CRC Principal Investigators (50%) could have been improved had this project commenced before the completion date of the CRC. Linking the lodgement of works to contractual agreements and/or other pecuniary costs could also be considered.

9. RECOMMENDATIONS

A recommended priority is for the current owner of the Repository, the RRRC, to ensure all necessary licensing requirements are in place before the repository is made available to the public at large. Investigations have identified the ‘Creative Commons’ license, a simplified licensing regime specifically designed for the transfer of Australian internet based information, as a possible solution for the RRRC. The current draft website also contains a link to the CSIRO Disclaimer as an example of current practices.

The late commencement of the Task and its coincidence with the Special Issue publication significantly limited lodgement of intellectual property (IP) works into the repository: only 50% of identified IP works were eventually lodged. It is recommended that any future research programs consider this experience, provide timely services so that stakeholders can adhere to protocols such as template usage, and provide an effective protocol for researchers to lodge their works. It is also suggested that any future contractual agreements include a clause that withholds final payment until all works have been submitted.

The lodgement of outstanding IP to the repository remains the responsibility of current custodians. The IP listing (APPENDIX 1) identifies those works awaiting lodgement.

The now functional Torres Strait Marine Research Repository is a significant resource. It not only incorporates and searches the IP works of the CRC Torres Strait, but also includes search interfaces to many other Torres Strait related repositories. Though the repository’s security and ongoing service are guaranteed by its inclusion within the CSIRO Marine and Atmospheric Research Data Centre, it will require periodic updates and maintenance to continue to act as a valuable resource. In addition, the customised search interface, though currently searching many already known websites and external repositories, will require ongoing maintenance.

It is recommended that a repository administrator be assigned responsibility to promote the Torres Strait Marine Research Repository to stakeholders and other research agencies; to update links to new inventories containing relevant marine research; and to ensure that sustainable protocols are developed and followed by contributors to the Repository.

The implementation of an enterprise level repository also offers many benefits, such as an improved rating by academic search engines and a guaranteed timely and comprehensive search. It is recommended that, to enhance the outcomes of the existing ‘hybrid’ search repository, consideration be given to implementing the DSpace open source repository, a widely recognised research repository.

10. REFERENCES

CatalystIT (2003). Technical Evaluation of selected Open Source Repository Solutions. https://eduforge.org/docman/view.php/131/1062/Repository%20Evaluation%20Document.pdf. Cited January 2007.

Taranto, T.J. and C.R. Pitcher (2004). Torres Strait marine science: collected publications and data, 1980-2003. DVD. Cleveland, Qld., CSIRO Marine Research.

11. ABBREVIATIONS & GLOSSARY

AFMA - Australian Fisheries Management Authority
ASDD - Australian Spatial Data Directory
CRC TS - Cooperative Research Centre, Torres Strait
CMAR - CSIRO Marine and Atmospheric Research
DSpace - an open source software package which provides the tools for management of digital assets
Googlebot - a ‘spider’ or robot that collects documents from the web to build a searchable index
IP - Intellectual Property
RRRC - Reef and Rainforest Research Centre
TSMRR - Torres Strait Marine Research Repository
TSRA - Torres Strait Regional Authority

12. APPENDIX 1: INTELLECTUAL PROPERTY

Listing of all identified IP works attributed to the CRC Torres Strait, with hyperlinks (blue) to those works that had been lodged to the Torres Strait Marine Research Repository as of 23 March 2007. Hyperlinks coloured red are awaiting lodgment, while those coloured black have been identified as secured with the existing custodian. See the current listing at http://www.cmar.csiro.au/datacentre/torres/CRCTS2003_06/index.htm

13. APPENDIX 2: TASK MANAGEMENT LISTING

Webpage with lodgment details as provided to Task Leaders, listing each Task, Task Leader names and Repository Administrator actions to acquire identified task IP. See the non-cached webpage http://www.cmar.csiro.au/datacentre/torres/CRCTS2003_06/Admin/CRCTasks.htm

14. APPENDIX 3: STAFF

Principal Investigator - Tom Taranto
Co-Investigator - Roland Pitcher

Contacts
CMAR Data Centre Manager - Tony Rees
TSRA Contact - Vic McGrath
CRC Torres Strait Contact - David Williams
RRRC Contact - Russell Reichelt
