2021-01-11: Development Meeting

Kacper, Craig, Tommy, Mike L, Tim, Mike H

Agenda

  • Updates
  • Recorded run – [name=Craig]
    • Draft tracking issue
      • Feedback welcome
      • Need to create sub-issues
    • Current recorded run structure
    • Current implementation
      • API is partially complete (create, list, rename, delete, stream stderr/out). Actual run implementation is not done.
      • Assumes results folder and .stderr/stdout written to run top level directory
      • During run, workspace is a pointer to an immutable version
    • Current structure
      • tale (order conceptually, not alphabetically)
        • data (externally registered data)
        • workspace
        • versions
          • v1
            • data
            • workspace
        • runs
          • r1
            • data/ (link to ../versions/v1/data)
            • results/ (output from run)
            • workspace/ (link to ../versions/v1/workspace)
            • version/ (link to ../versions/v1/)
            • .stderr
            • .stdout
    • Discussion of “results” directory
      • Can the tale write into the workspace?
      • Assumes output written to ../results folder
      • Writing to workspace presents problem (what if they modify the src code as part of the run?)
    • Discussion of export/import
      • Two options:
        • Export run/version as current tale (adds results and stderr/out files)
        • Export run/version structure (adds notion of version and run along with results and stderr/out)
      • Example: Exported tale with current structure from the run/version – including results and stdout/err, which must be reimported without creating a version or run
        • tale (v1)
          • data/ (bagit…)
            • data/
              • dataone.csv
            • workspace/
              • run.sh
              • my.py
            • results/
              • my.out
            • .stderr
            • .stdout
      • Example: Exported tale with the run/version structure
        • tale
          • data
            • dataone.csv
          • workspace
            • run.sh
            • my.py
          • versions
            • v1
              • data
                • dataone.csv
              • workspace
                • run.sh
                • my.py
          • runs
            • run1
              • data (link to Versions/v1/data)
              • results
                • my.out
              • version (link to Versions/v1)
              • workspace (link to Versions/v1/workspace)
              • .stderr
              • .stdout
      • Discussion about “data”
        • There is ongoing confusion about the ../data directory and what it’s used for (external data v my data)
      • Discussion about “workspace” name
        • There is ongoing confusion about the workspace (tale directory or scratch space?)

Updates

  • Kacper
  • Craig
    • Versioning PR review + testing
    • MATLAB “Web Desktop” POC
      • They publicly released https://github.com/mathworks-ref-arch/matlab-integration-for-jupyter/
    • Submitted proposal to IBM for SPSS support
      • Appear to require site license for SPSS, which none of our partners support
    • Began drafting Recorded Run issues
    • Traefik for CSP
  • Tommy
    • Back from vacation
    • Two PRs for Tale Sharing & Git Integration
      • Addressed first round of feedback
  • Tim
    • Close to having a query that produces a variable-level dataflow diagram from SDTL RDF.
      • Had to add rule-like SPARQL macro capability to Geist.
      • Note how sdtl_select_variable_write_read_edges uses upstream_variable_writer (defined above the former) three times: sdtl.g
      • Will be useful to generating ports and channels for ProvONE export via CONSTRUCT query.
      • Will prettify the Geist “rule” syntax to look more like SPARQL.
    • Ready to customize CPR and Geist as needed for Whole Tale.
  • Mike L
  • Mike H