2019-12-02: Development Meeting

Craig, Tommy, Tim, Mike H, Elias, Mike L

Agenda

  • Updates
  • Discussion
    • v0.9 status
      • User settings (merged)
      • Register data from Zenodo (merged)
      • Publish to Zenodo (merged)
      • Publish refactor
        • Backend (merged)
        • UI
          • https://github.com/whole-tale/dashboard/issues/570
          • https://github.com/whole-tale/dashboard/issues/571
      • Run/import from Zenodo (in progress)
      • Extra v2 fix (need to issue PR)
      • Tapis integration (needs merge)
      • Bdbag integration (PR issued)
    • C2Metadata plan
      • Stata and Matlab support
      • “WT Prov” (=strace) + YW + C2Metadata = Comprehensive provenance model (CPM)
        • For now, looking at strace + YW
      • Discussion of “queries”
      • Discussion of formats
        • Export to DataONE would require outputting everything in a way DataONE understands
        • Facts: http://try.yesworkflow.org/
        • Argument for JSON-LD format. If you want someone else to write an adapter – are they doing Prolog or parsing JSON?
        • This means that we’re defining a JSON schema for CPM?
      • Discussion of CPM and DataONE
        • DataONE Matlab project was able to capture some of this information
        • Tim: Matlab script + YW + D1 Matlab could output/query useful info
      • Tommy can look at DataONE model and its relation to outputs from above
      • How is CPM different than W3C Prov (abstract)?
        • W3C is more abstract and therefore ambiguous
        • “CPM” is intended to be concrete, supporting specific queries
        • Concrete models (have schemas, but tend to be practical):
          • YesWorkflow facts
          • Sciunit – sqlite DB
          • ReproZip
          • NoWorkflow – more of an implementation
          • DataONE/Metacat is just Prov-ONE implemented in the resource map.
      • How is CPM different than Prov-ONE?
        • YW already has things that aren’t supported in Prov-ONE
          • Directions, flow of data, nesting of workflows
      • Discussion of codebooks and how we could generate them
  • Actions:
    • Craig - set up VM for Mike H’s demo
    • Craig - re-test DataONE on master
    • Tommy - Test DataONE

Updates

  • Craig:
    • UTK tale
  • Tommy:
    • PR Reviews
    • Extra /v2/ fix
    • Had some downtime with fires (back home now)
  • Tim:
    • Working on comparing the provenance models of ReproZip and Sciunit. Will be making a series of increasingly complex examples, each started by a run.sh script, showing what events and relationships each captures.
    • Will be working on Linux running on bare metal, then in VMs and in Docker. Plan to share computing environments via Ansible and Docker images made using Ansible.
    • This will be the start of a comprehensive provenance model for WT.
    • The WT provenance model needs to incorporate, and span, both script-external (system) observables (what strace can see) and script-internal observables (e.g. what C2Metadata can see).
    • The current plan is to use YesWorkflow and associated concepts to drive our modeling and prototyping of script-internal provenance, with the expectation that C2Metadata will fit roughly into the same place as YesWorkflow, both conceptually and in the implementation of the software.
    • Repo: https://github.com/whole-tale/wt-prov-model
  • Mike L:
    • Dockerized new UI rewritten in Angular
      • Added Dockerfile to Angular repo
      • Created PR to deploy-dev to deploy new UI alongside existing EmberJS UI
      • :tada: Angular UI should be ready for preliminary testing :tada:
  • Elias:
    • PR for bdbag needs review