Workshop on Finding and Re-using Open Scientific Resources

Participants

  • Jonathan Gray, The Open Knowledge Foundation
  • Peter Murray-Rust, Cambridge University
  • Sabine McNeill, 3D Metrics and Forum for Stable Currencies

  • Non Scantlebury, The Open University (Head of Library Research and Innovation)
  • Jessie Hey, Electronics and Computer Science, University of Southampton
  • Rufus Pollock, The Open Knowledge Foundation
  • Terence Freedman, The National Archives
  • Hilary Smith, The University of Sussex
  • Frank Norman, MRC National Institute for Medical Research (Head of Library)
  • Rhian Cunliffe, BioMed Central

  • Vincent Rouilly, Open Wetware
  • Cameron Neylon, Open Wetware + STFC
  • Tim Hubbard, Sanger Institute
  • Giota Alevizou, LSE

Agenda

  • Introductions/opening discussion
  • Discussion of openness
    • Licensing clarity
  • Funders - what policies do they have?
  • How do we do data sharing (cost, the recipe, the standards)
  • Registries
  • Lunch
  • Examples: successes and failures
  • Finding open educational and scientific resources
    • labelling open/closed resources
    • Sciences, social sciences, arts, humanities etc
    • Finding and accessing: registries (again), (shiny) front-ends, software etc
    • By MACHINES ...
    • mapping the discovery landscape?
  • Research funding/training
  • Education (about openness) among students/researchers

Planned Actions

Focus on 3 main things:

  1. Providing a simple recipe for making things open
  2. Advocacy: benefits of openness, education, changing funder mandates
  3. http://www.ckan.net/ -- editors/curators, expanding coverage

1. Simple Recipe

  1. Unlocking/clarification service: a simple way for people to ask for data to be made open (or have its status clarified)
  2. 'How to make my data open': Basic 1-2-3 webpage
    1. Choose license (do I have the rights?)
    2. Apply license (insert url, say how you want to be cited)
    3. Make data available somewhere (archive.org)
    4. [Optional]: register it (e.g. CKAN)
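
The 1-2-3 recipe above can be sketched as a small checklist in code. This is only an illustration, not an agreed format: the `OpenDataRecord` class, its field names, the `missing_steps` helper, and the example.org download address are all hypothetical. The license URL is the PDDL address listed in the notes below.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class OpenDataRecord:
    """Hypothetical record of what the recipe asks a data owner to state."""
    title: str
    rights_holder: str        # step 1: do I have the rights?
    license_url: str          # steps 1-2: chosen license, referenced by url
    citation: str             # step 2: how you want to be cited
    download_url: str = ""    # step 3: where the data is available
    registry_url: str = ""    # step 4 (optional): registry entry, e.g. CKAN


def missing_steps(record):
    """Return the required recipe steps that are still incomplete."""
    steps = []
    if not record.license_url:
        steps.append("1. choose a license")
    if not record.citation:
        steps.append("2. apply the license and say how to cite")
    if not record.download_url:
        steps.append("3. make the data available")
    return steps


record = OpenDataRecord(
    title="Example solubility measurements",
    rights_holder="A. Researcher",
    license_url="http://www.opendatacommons.org/odc-public-domain-dedication-and-licence/",
    citation="A. Researcher, Example solubility measurements",
)
print(missing_steps(record))  # only step 3 is outstanding at this point

# Placeholder address, assumed for illustration only.
record.download_url = "https://example.org/data/solubility.csv"
print(json.dumps(asdict(record), indent=2))
```

Registration is left out of `missing_steps` because the recipe marks it as optional.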

2. Advocacy

  1. Prepare 1-page summaries of the benefits of openness (altruistic and 'selfish'). Some of this can be standard, but a good portion needs to be specific to the subject area:
    • more citations/usage
    • eligible for openness award
    • giving also means receiving
    • very easy to do
    • satisfy funder requirements simply and easily
  2. Include openness as part of best-practice (see recipes above)
  3. Talk to funders about mandating/considering data openness as part of their policies

3. http://www.ckan.net/

CKAN already has a good number of data 'packages' in scientific areas: http://www.ckan.net/tag/read/science. However, it would be useful to supplement current efforts with more permanent editorship/curatorship:

  1. Appoint named curators/editors in particular areas (chemistry, astronomy, bioinformatics etc)
  2. Provide clear guidance as to what packaging could involve
    • tagging
    • clarifying open/closed status
      • Connect this with unlocking service
      • ensuring download url
    • uploading data to a reliable repository
    • checking data etc
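
The packaging guidance above could be turned into a simple review check for curators. The sketch below is an assumption-laden illustration: it works on a plain dictionary with hypothetical keys (`tags`, `is_open`, `download_url`) and does not use or mirror the actual CKAN API.

```python
def review_package(package):
    """Return the packaging tasks still outstanding for a package.

    `package` is a plain dict with hypothetical keys; `is_open` is
    True, False, or None when the open/closed status is unknown.
    """
    issues = []
    if not package.get("tags"):
        issues.append("add tags")
    if package.get("is_open") is None:
        issues.append("clarify open/closed status (unlocking service)")
    if not package.get("download_url"):
        issues.append("ensure download url")
    return issues


# Hypothetical package entry, for illustration only.
draft = {"name": "example-dataset", "tags": ["science"], "is_open": None}
for issue in review_package(draft):
    print(issue)
```

Uploading to a reliable repository and checking the data itself are left to the curator, since neither can be judged from the metadata alone.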

Notes

  • What is 'open'? Value of openness. Sharing, integrating data into other projects/datasets.
  • Access + *re-use*!
  • Attribution + Sharealike
  • Privacy concerns? Reducing barriers while accounting for such concerns. Multiple tiers?
  • Anonymisation + pseudo-anonymisation?
  • Doesn't work for genetic data
  • Virtual machine which guarantees certain things can't happen
  • No simple one size fits all rule
  • Different communities want different things
  • Each community will look at how open their data is in their community
  • Science Commons pushes for public domain (no restrictions - inc. Attribution/Sharealike)
  • License incompatibility - problems with different licenses
  • No discrimination against particular users/uses
  • Open Knowledge Definition - http://www.opendefinition.org

  • Licenses and other legal tools (waivers, declarations,...) and non-legal tools (community norms + intentions + curses?)?
  • EC Database Directive (cf. Creative Commons Zero, CC0, + PDDL)
  • Very few people have litigated/taken action about attribution
  • NIH + Wellcome Trust are supportive of openness/open access...
  • Not all funders do this
  • Large commercial interests
  • Landscape of discovery
  • Metadata?
  • Searching for "solubility of boc glycine in thf" on http://www.oaister.org/ vs Google

  • Easier to use Google?
  • Best practices - a document?
  • Recipe
  • Template for enquiries
  • unlocking service for scientific datasets?
  • existing work - :
  • point people to CC0/PDDL etc. other appropriate open licenses (e.g. http://opendefinition.org/licenses )

  • questions about openness - checklist: http://shirleyfung.com/mbdb/filter.php?by=alltab

  • http://www.opendatacommons.org/odc-public-domain-dedication-and-licence/

  • http://www.oaister.org/ interesting to query datasets - includes 'rights' metadata

  • http://okfn.org/wiki/OpenEnvironmentalData

  • prizes, fellows, editors, ...
  • held to best practices for licensing/rights for your field
  • http://www.bbsrc.ac.uk/

  • case studies: 'I am a chemist...', 'I am an economist...' - benefits of openness, motivations, experiences in different fields. Stories!
  • instructions for being open
    • choose license
    • make it available (upload?)
    • ask questions
      • rights
      • who are you?
      • title
  • where in the digital lifecycle? should be done from the start, not at the end (when it may be much more complex/messy)
  • hosting data
  • instructions/how-to for making data open (carrot cake)
  • ethics: public accessibility
  • OAIster - free text in rights field
  • SKOS - http://www.w3.org/2004/02/skos/