Date

Attendees

Friday, September 10th at 10am PT/1pm ET

Join Zoom Meeting
https://zoom.us/j/92440959002?pwd=d1dsRHRZZXpad3FMMUxodVJPejVwZz09


Meeting ID: 924 4095 9002
Passcode: 369672
Dial by your location
Find your local number: https://zoom.us/u/acIvYP0wGe

Agenda Items - TSC Meeting #2

  1. Call to order: 10am PT/1pm ET
  2. Anti-Trust, Meeting Protocols and Welcome
  3. (Approve Minutes from previous meeting)
  4. Agenda Review
    1. Open Floor: call for New Agenda Items to identify/consider
  5. Second Meeting Agenda:
    1. TSC member introductions (Members)
    2. TSC Charter and Ops overview (Chair)
    3. openIDL overview and TSC resources (Ken Sayers, AAIS – TDOCS and GitHub)
  6. Standing Agenda Items:
    1. TSC efforts and status
      1. Infrastructure Members
      2. Working Groups and POCs
        1. Flood Working Group
      3. Update on migration project (Ken Sayers, AAIS – TDOCS and GitHub)
      4. Architecture diagrams
        1. Tenets of Architecture
      5. TSC Member Engagement
      6. Member Nodes
      7. Other Steering or Board committee updates
      8. Open Floor: Community input/feedback on current efforts
    2. TSC Backlog Planning/Grooming
    3. Upcoming events and TSC schedule
      1. LF: Open Source Summit, September 27-30, Seattle, WA
        1. AAIS Attending
    4. TSC Items to Consider:
    5. Wrap-up: To-dos and community communication
    6. Discussion: time permitting
  7. Adjourn: 11am PT/2pm ET
  8. Extended Discussion/Socialization – Chair will keep meeting open as community requests, for up to 1 hour (12pm PT/3pm ET; minutes not recorded)

Notes:

Meeting started at 10:04am PT.

Motion to approve prior minutes made by Truman; James M seconded; minutes approved.

Truman gave updates on infrastructure members, working groups and POCs (including the Flood Working Group)

Ken updated on the migration project:

  • Will soon have a reference implementation of a carrier node running on AWS
  • To be done: turning off servers on IBM cloud, and finalizing documentation

Provided pointer to Tech Docs space: https://wiki.openidl.org/display/TDOCS/TechDocs
Invitation to review and discuss what TSC wishes to ratify
Specifically he walked through the "Architecture Tenets":

  • System must be manageable
  • System must be cloud agnostic
  • Infrastructure as a Service, over Self-Managed Infrastructure
  • System must be transparent
  • Privacy of distributed nodes is paramount

Q from Jeff: should we add comments to the wiki? Yes, just don't edit it directly.
Ken also pointed to the Architecture Diagram and DevOps documents

Truman reported on work he's doing to recruit additional organizational participation in the TSC, including state regulators.

Truman encouraged folks to attend the Open Source Summit at the end of the month in Seattle, where he and Ken will be.

James asked - I've got some technical questions, are they appropriate here?

Brian: yes, let's just figure out what the questions imply for decisions, or things to apply resources to (e.g. improved documentation)

James asks: when I introduce others in my organization to the specifics in the tech stack, it's so helpful to have the rationale for picking them, and perhaps the alternatives reviewed. Can we go through the rationales for them? (general agreement yes)

James asks: how do we know what to put into the containers, and how much do I move out into the framework around the containers?

Ken answers: the main principle there is encapsulation, and using Kubernetes to manage that

Brian: this is great conversation, how do we formalize this? Let's get this into docs, we can pay for better tech docs.

James: what was the rationale behind MongoDB?

Ken: Driven by a desire to share the entire extraction pattern. (Truman added some other details.)

James: why map reduce vs aggregation pipelines?

Ken: This approach allows you to see it happen on the blockchain. But that was an early decision, we could revisit in the long term.
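
For readers less familiar with the distinction James raised, here is a minimal sketch of the two query styles in general terms. It is not the openIDL extraction pattern: the records, field names, and helper functions are all hypothetical, and the pipeline runner is a toy interpreter written for this illustration. Only the `$group` and `$sum` names follow MongoDB's aggregation operators. In the map-reduce style the query is a pair of functions; in the pipeline style it is a declarative list of stages.

```python
from collections import defaultdict

# Hypothetical records; field names are illustrative, not the openIDL data model.
records = [
    {"state": "VA", "line": "auto", "premium": 1200},
    {"state": "VA", "line": "home", "premium": 800},
    {"state": "NC", "line": "auto", "premium": 950},
]


def map_fn(record):
    """Map step: emit one (key, value) pair per record."""
    return record["state"], record["premium"]


def reduce_fn(values):
    """Reduce step: fold all values emitted for one key into a single result."""
    return sum(values)


def run_map_reduce(rows):
    """Map-reduce style: the query is expressed as the two functions above."""
    grouped = defaultdict(list)
    for row in rows:
        key, value = map_fn(row)
        grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}


# Pipeline style: the same query expressed as data (a list of stages).
pipeline = [{"$group": {"_id": "$state", "total": {"$sum": "$premium"}}}]


def run_pipeline(rows, stages):
    """Toy interpreter that supports only $group stages with $sum accumulators."""
    for stage in stages:
        spec = stage["$group"]
        key_field = spec["_id"].lstrip("$")
        totals = defaultdict(lambda: defaultdict(int))
        for row in rows:
            for out_name, accumulator in spec.items():
                if out_name == "_id":
                    continue
                source_field = accumulator["$sum"].lstrip("$")
                totals[row[key_field]][out_name] += row[source_field]
        rows = [{"_id": key, **sums} for key, sums in totals.items()]
    return rows


print(run_map_reduce(records))
print(run_pipeline(records, pipeline))
```

Both calls compute the same per-state premium totals. The difference is where the logic lives: in caller-supplied functions for map-reduce, or in a stage description interpreted by the database for a pipeline.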

James: The generalized data model is one of my concerns. Is that something we at the TSC decide and formalize?

Truman: That will be at the RR Data Model committee. But it is intended to be collaborative. Dina and Ruturaj are the stewards of the model currently. (more discussions on the data model ensued)

Discussion ensued about future topics, including whether a review of prior decisions would be desirable and acceptable.

Truman said: Ken is cleaning up the architecture issues on the wiki and expects to let folks know they're ready for review on/about Wednesday next week. Comments and feedback are requested to prioritize open items to be addressed in the next meeting or fleshed out further.

Meeting adjourned 10:52am PT.


Action items

1 Comment

  1. Per the action item to review the decision log, my observations and possible discussion points are below.

    Thanks!

    ===


    From page: openIDL - Architecture - Tenets and Decisions - TechDocs - Wiki


    Tenets

    • Is it worth mentioning a bias toward open-source products? Or is that obvious at this point, and some of the old IBM hooks are just run-off items?


    Decisions

    • Per “DA – Harmonized Data Scope”:
      • It says “Is this one single model?”   Highly generalized models (which is what you would need if you go single-model) are notoriously hard to execute.  But if you can, they scale nicely.  Reasonably specialized models make more sense to more people who have to live with it, but can get cumbersome and redundant at scale.  What’s our general bias?
      • It says “Is the data at rest in the same model as the data in motion?”   Whatever it is, if it’s different in motion, the mapping must be very minimal—to the point where machine automation can handle it all, even if it may be difficult to initially create.


    • Per “DA – Harmonized Data Store”:
      • This brings up the whole question of mandating a physical database at all. Physical databases are more in alignment with the tenet “The system must be manageable”, as more people of a more junior level can do database work versus API work, particularly data-over-API work.  But API support makes for better engineering at scale.  At high level, what’s our preferred direction?


    • Per “DA – Harmonized Datastore DBMS Implementation”:
      • This aligns to the “revisit” of MongoDB we discussed, and basically eliminates it. Strongly agree with this direction.  However, this brings up a problem that is either brutal, or non-existent:
        • How big will this database get? Is the nature of the reporting work only current or recent history?  If so, then explosive history should not be an issue.  But is that the case?  Or could it get huge?
        • How does this trade off against the single-model question? A single model will likely have many tables with much normalization.  Could this fragmentation get out of control?
      • Would we make a rule to be strictly ANSI compliant? If we say that we want to allow carriers to choose from several databases, this would be powerful.  This connects to the “Can it just be an interface?” comment in the discussion cell.
      • Could we consider a column store DB, such as Cassandra? While I agree most NoSQL technologies should be out, is the column store family still worth considering?  It’s SQL-enough, but could do better at scale.  Or not, since such technologies tend to be “query first” by design, and we don’t know all the queries in advance.


    • Per “DA – Harmonized Data Model”:
      • The “Is this an ETL process?” question is interesting. How do we know that the extraction patterns won’t get huge?  Particularly against a highly normalized single model?
      • Even if we have a single logical model, should we have an entire discussion around physical design that optimizes for access? Such a physical model may look very different.


    • Per “MongoDB” item:
      • It says “As long as the db is mongo…”, which is challenged by another decision. Are all the DBMSs under consideration able to work in the Kubernetes environment?


    • Per “UI Deployment” item:
      • Very nicely spelled out rationale. Thank you.  Will be helpful given the cons in case those come up.