This page covers the backlog for openIDL.


# | Date | Item | Description
1 | 18-JAN-21 | Extraction pattern - tech

What technology should the extraction pattern use: map/reduce, something optimized for scale, GraphQL, or another option?

Context: 

  1. The extraction pattern model currently uses a map/reduce function in MongoDB. This locks us into MongoDB and runs in a closed environment with no access to the outside world, which we will need for correlating other data such as census data. The extraction capability must be reimagined.
  2. It is suitable only for a simple data source layout. It cannot do cross lookups, validation, or reference data checks.
  3. It currently works for MongoDB only. It will not work for other sources, for example AAIS's data lake (Hadoop/Cloudera).
  4. GraphQL seems like a strong candidate. A POC is required to validate the hypothesis.
  5. Discuss any other candidates that should be considered.
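
As a starting point for that discussion, here is a minimal sketch of a map/reduce extraction that is not tied to MongoDB. It assumes records can be read from any source as plain dictionaries; the function names and the `state` and `premium` fields are hypothetical and are not part of the current openIDL model.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, Any]
MapFn = Callable[[Record], Iterator[Tuple[str, float]]]
ReduceFn = Callable[[str, List[float]], float]


def run_extraction(records: Iterable[Record], map_fn: MapFn, reduce_fn: ReduceFn) -> Dict[str, float]:
    """Apply a map/reduce pair to records from any iterable source."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for record in records:
        # The map step emits zero or more (key, value) pairs per record.
        for key, value in map_fn(record):
            grouped[key].append(value)
    # The reduce step collapses each key's values into one result.
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


def premium_by_state(record: Record) -> Iterator[Tuple[str, float]]:
    # Hypothetical map function: emit the premium keyed by state.
    yield record["state"], record["premium"]


def total(key: str, values: List[float]) -> float:
    # Hypothetical reduce function: sum all values for a key.
    return sum(values)


# Hypothetical sample data standing in for a carrier's data store.
sample = [
    {"state": "IL", "premium": 100.0},
    {"state": "IL", "premium": 50.0},
    {"state": "OH", "premium": 75.0},
]
result = run_extraction(sample, premium_by_state, total)
```

Because `run_extraction` only needs an iterable of records, the same pattern could in principle be fed from MongoDB, a data lake, or a service call, which is the portability concern raised in items 1 and 3.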
2 | 18-JAN-21 | How to assert data integrity | How to assert data integrity? Options include computing a checksum after the record is locked and written to the chain, storing the acknowledgement from HLF in a control DB and mapping it to a record set, etc.
3 | 18-JAN-21 | How to assert data quality
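
One way the checksum idea in item 2 could look, as a sketch only: it assumes a record set is a JSON-serializable list, uses an in-memory dictionary as a stand-in for the control DB, and treats `tx_id` as a placeholder for whatever acknowledgement HLF returns. All names here are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

# Hypothetical in-memory stand-in for the control DB.
control_db: Dict[str, Dict[str, str]] = {}


def record_set_checksum(records: List[Dict[str, Any]]) -> str:
    """SHA-256 over a canonical JSON form of the record set.

    Keys are sorted so field order does not change the checksum;
    the order of records in the list still matters.
    """
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def store_acknowledgement(record_set_id: str, records: List[Dict[str, Any]], tx_id: str) -> None:
    # Map the ledger acknowledgement (tx_id, an assumed identifier) to the record set.
    control_db[record_set_id] = {
        "checksum": record_set_checksum(records),
        "tx_id": tx_id,
    }


def verify(record_set_id: str, records: List[Dict[str, Any]]) -> bool:
    # True only if the record set is known and its content is unchanged.
    entry = control_db.get(record_set_id)
    return entry is not None and entry["checksum"] == record_set_checksum(records)
```

Calling `store_acknowledgement` once the record set is locked and `verify` before extraction would detect any later change to the stored records.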

How to run technical and business validation on data and certify the data?

Context:

  1. Technical rules: may include JSON schema validation, format checks, and cardinality checks.
  2. Business rules: enum checks, validations at the field, transaction, record set, and dataset levels, and reference data checks.
  3. Error threshold: calculate the error threshold. NAIC allows an error rate of up to 5%.
  4. Timing: when should validation be applied? As the data arrives in the HDS (staging→core), or just before extraction? Validating just before extraction adds time pressure, loses time that could be used to rectify errors, and risks timeouts on the extraction API, etc.
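
The split between technical and business rules can be sketched as two passes over a record. This is an illustration only: it uses hand-written checks rather than a JSON schema library, and the field names and the `ALLOWED_LOB` enum are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical enum of allowed lines of business.
ALLOWED_LOB = {"AUTO", "HOME"}

# Hypothetical required fields and their expected types.
REQUIRED_FIELDS = (
    ("policy_id", str),
    ("premium", (int, float)),
    ("lob", str),
)


def technical_errors(record: Dict[str, Any]) -> List[str]:
    """Technical rules: required fields are present and have the expected type."""
    errors = []
    for field, expected in REQUIRED_FIELDS:
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"wrong type for {field}")
    return errors


def business_errors(record: Dict[str, Any]) -> List[str]:
    """Business rules: enum membership and a simple value check."""
    errors = []
    if record.get("lob") not in ALLOWED_LOB:
        errors.append("lob not in allowed values")
    if record.get("premium", 0) < 0:
        errors.append("premium must not be negative")
    return errors


def validate(record: Dict[str, Any]) -> List[str]:
    # Business rules run only on records that pass the technical rules.
    tech = technical_errors(record)
    return tech if tech else business_errors(record)
```

Running the technical pass first keeps the business rules simple, since they can assume well-formed input. The same functions could run on arrival in the HDS or just before extraction, whichever timing is chosen.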
4 | 18-JAN-21 | Common Rule Set

Is it possible to provide a common set of rules that can be used by all carriers against their data before making it available to the extraction?

Context:

  1. Assumption: rules are standardized across openIDL members for a given use case. For example, rules related to Auto stat reporting.
5 | 24-JAN-21 | Is the HDS persistent or temporary?
6 | 18-JAN-21 | Data quality error threshold

Current practice allows an error rate of up to 5%. Should openIDL allow this? If so, how should it be designed and implemented?
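
If the threshold is allowed, one possible implementation is a simple gate applied to a record set before extraction. The sketch below assumes the 5% limit is inclusive and is measured as failed records over total records; both are assumptions, not agreed rules.

```python
def error_rate(total: int, failed: int) -> float:
    """Fraction of records that failed validation; 0.0 for an empty set."""
    if total == 0:
        return 0.0
    return failed / total


def within_threshold(total: int, failed: int, threshold: float = 0.05) -> bool:
    # Assumption: a record set exactly at the threshold still passes.
    return error_rate(total, failed) <= threshold
```

For instance, `within_threshold(100, 5)` passes while `within_threshold(100, 6)` does not.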

7 | 18-JAN-21 | Reference data validation | Where should the reference data service be hosted: within the member's enterprise or within the node? It must be applied before extraction (tenet).
8 | 18-JAN-21 | Reference data lookup services/APIs | Which APIs should be used for lookups? For example, USPS state/zip validation, Carfax, etc.
9 | 18-JAN-21 | Reference data lookup services/APIs - pricing model | Who pays (assumption: whoever owns the data pays), and how are consumers charged (via assigned accounts, via a centralized billing account prorated to consumption, etc.)? Who signs the vendor contracts?
10 | 18-JAN-21 | Separating the Hyperledger Fabric Network from the data access | Can a carrier participate in the network from a hosted node without putting the data there? That is, can we give a carrier access to the network without the data access portion being hosted in the same node? The HLF runtimes would not be required to run at the carrier, and only a simple API would be made available for extraction.
11 | 18-JAN-21 | Simplify the technical footprint | Can we simplify the architecture so that fewer technologies are required?
12 | 18-JAN-21 | Hosted nodes | Should we consider hosted nodes for the HLF network instead of requiring all carriers who desire data privacy to host the network?
13 | 19-JAN-21 | HDS vs Interface Spec for nodes | Does the HDS need to be uniform, or do we only need a uniform interface spec and communicate via a service?