Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • PA: Earned premium is not really about one's data set, but about EP within a certain timebound. To get this, we take earned premium/months covered. This is very basic, and data set will never be one like this. 
  • PA: Working with SQL - hard to utilize parameters without an extremely advanced client. I have start dates and end dates, and strings, but they are parameterized when actually used.
  • PA: We begin by running a setup command and creating a temp table, called Temp Auto & Earned Premium. Begin calculating group 1, group 2, group 3, only 1 creates the table, next ones insert into that table.
  • Extraction for EP comes down to selecting sum from temp table. Teardown at end is drop of 1 table. 8 columns - EP is first column. Second part - not only earned premium but EP by coverage - column by column. So breakdown will expand significantly over time.
  • KS: current path is sufficiently robust to implement this. PA: this is actually a single select statement (KS pointed out not a single select to implement the whole thing).
  • KS: Line 14 - this is a create table statement not a select statement. We have to recognize HDS schema request as a select statement. 
  • PA: What we need to do is take these commands, break them down, and separate parse them into objects - e into queries with semicolons.
  • KS: This will not work - won't come up as a select statement. We therefore need to break it up into script, yes?
  • JM: No, will be fine -after tokenizing this each token represents a query, within that larger query. Show the first token within those larger tokens - if it's select, it will do something, but it doesn't matter, because it's the responsibility of the query writer to identify the two schemas in question.
  • JM: All scripts assume their default schema is going to be openIDL_ep - we only specify schema for occasions that involve reaching out to base schema.
  • PA: But when we're doing the create statement for EP... we should go openIDL_ep?
  • JM We leave it blank. Because it is so common to reference EP schema
  • PA: I'm talking more when I create the database.
  • JM: but in creating base schemas we want to specify schema name, absolutely. (KS: this happens outside of extraction execution. PA: but pivotal to do this in setting up HDS).
  • KS: The default schema then is the EP schema, therefore it doesn't have to be mentioned anywhere even though it can be but doesn't have to be. 
  • PA: It may be that we start to have some persistent more advanced tables in EP - recurring structures.
  • JM: This proves our point and this is exactly what this code tends to look like in terms of business logic.  From a pattern perspective, there is different logic for 1 2 3 4, correct?
  • PA: Correct. We process records differently depending on when they fall in the year vs. where they are extracted.
  • JM: This is a source of confusion - i.e., what is the difference between line 25 and line 34. (He got clarity on this).
  • JM: This demonstrates the case of being able to make intermediate tables. For people who have to read it comprehensibility is a must.
  • PA: For EP - could make more sense to utilize a function and build it differently. 
  • JM: For scalability & modularity, the desire for functions explains the need for EP schema separation from the HDS/base schema. It points to repeat question of should this be in the extraction patterns or DB definition?
  • PA: For auto, with earned premium, if we have this, and the corresponding function exists in the architecture, we can have a good test set and a high level of confidence that the function works correctly.
  • JM: Business rule management is critical here. One of the requirements from business people: they have to be able to manage these rules and comprehend them in a useable way. If we can start getting acute granularity - 1:1 on requirements and functions - there should be a single functional entry point for every designated business rule. This is incredibly powerful re: the scalability of human comprehension. Functions should be defined in either the base schema or EP schema but not buried into extraction patterns. They are too stable across time.
  • PA: We're going down an interesting path that will put us in the realm of metadata-driven architectures or table-driven architectures. 
  • JM: we can put business logic into code modules, aligned to business requirements in written form, we will see repetition even within functions that need to be tabularized. This will put us in a highly scalable architecture/part of a scalable solution. We can talk later about how much we will scale them.
  • JM: this is critical if we ask Mr. Antley and others to start coding - a key differentiation. 
  • JM: We also ask selves if functions should be built into base data? It's not uncommon to venture down a path where we ask if we can do this with ETL time.
  • PA: we can't do this w/ETL time - until we know what our bounds are, we can't calculate the results.
  • JM: We should assume we want to use functions at this point.
  • PA: a ltd # of functions should be part of bedrock infrastructure from outset, but not created at query time, otherwise SQL query will get so large that most people can't even read it. 
  • PA: With the power of functions, I can do one select statement and get results comparable to auto coverage report. Functions we're talking about right now will be created before extraction is run, + validated and tested. So we aren't creating functions at run time. They will be built into the database and utilized. 
  • KS: Disagreed strongly with this, because we have to govern those functions and install them. Doesn't believe this is the right approach - we want to keep it as simple as possible and just run the script. Doesn't see the value of creating these elements ahead of time if we can create them as part of the script.
  • PA: we have to do governance regardless, and not many functions made. e.g., 10 functions, a test bank to prove they are correct - govern these far more easily than, say, giving someone the ability to create the function at runtime. The amount of validate done w/extraction patterns goes up exponentially without the use functions.
  • JM: the idea that we're codifying a certain body of knowledge in a schema is fairly standard. The logic has to exist somewhere - question is not does it need to exist, but where do we put it. Schema definition - putting it in the schema build - is a much more stable location. The question: does the pace of change match the pace of the place where we put it. Pace of extraction patterns - extremely high and dynamic and different every time.
  • KS: not challenging logic but timing. Doesn't believe we need to go there for the MVP. More complicated parsing.
  • PA: the amount of renaming (reformatting semicolons, etc.) is overwhelming. But we would set a delimiter. 
  • PA: we are likely taking about 9 functions. First is DateDiv (takes 2 iso dates - givess us difference btwn dates/months). 1 function for each column in that table. 
  • JM: If we can develop a set of functions that map cleanly to business requirements - lucid enough that any businessperson can read it - this represents a huge strength.
  • KS: we've added a significant dependency on strong governance. Latest If the code is in the install base, the latest version at any time will be necessary for a user to run extraction patterns. Key consideration. (Potential drawback) We just put a huge burden on everyone having the latest version of these helper functions. Very concerned about the governance here - ensuring functions get into nodes, and making sure everyone's nodes are updated.
  • JM: but no worse than the fact that we'll need to govern it anyway, and there will be versioning regardless. We will have the same challenges regardless. Without functions, all logic has to fit inside extraction patterns. Then the database is humble, we have to add more and more to script w/additional calculations, we could end up with massive extraction patterns. Massive prework pre-work to get one basic result. Ignores our ability to take advantage of the inevitable repetition within our codes.
  • JB: We will have governance-related obligations anyway. Wants on the database anyway. Changes will be less frequent than we fear - may only be once or twice a year. Stability is key. JB wants to try to put a higher-level overview on our discussion -re:  impact of the adapter on our wrt its functionality and the operations it might need to perform.

Mr. Braswell presented a diagram w/various logical endpoints of an interface for API and the various types of resources that might be used. One of the central ideas behind this: this can and should be used work whether these are call-ins or call-outs.  The diagram illustrates that there are other types of interactions in the adapter to the hosted node in addition to just processing the data extract.: 

  • Details of how we perform the extract can be summarized by the two ellipses at the bottom and the middle <initiate data call processing> and <return data call results>.
  • The diagram illustrates something fundamental: Since we are positing the notion of an interface to a hosted node, some of the functions that exist in the node itself - re: looking at the queue of requests or the UI that interacts with one of the Kubernetes containers to talk to the chain code - this particular function (knowing what the requests are/what's in the queue) needs to be proxied out into some form of API call in this interface. JB is calling 
  • function of knowing what's in the queue will need to be proxied out to some form of API.
  • View Data Call request will ferret out specific request in the standing queue. 
  • Approving data call request - part of the validation process.

*****

JB: What I was describing is in idl adapter. Just looking at API.

JM: Who is the client of API? KS: as we're doing an extraction for a data call, we're interacting with this part to say "I consent.' Chain code says "I need to get data from carrier." Trigger in Kubernetes - an application called the extraction processor that lives in Kubernetes in the hosted node (a node on the network - has functions that interact with HL fabric). 

...

  • calling this "data call management" - i.e., the notion of refreshing data call queue, is the equivalent of getting the current list of data calls that might be seen in the UI on the actual node, but we aren't assuming that someone has this UI because they aren't running that node, and they lack the container in the Kubernetes cluster.  We could actually call out and say "please let me know what all the outstanding data calls are," which will yield a list - and if we want to look at a particular one, that view data call request may give us the ability to look into the detail of that or make a call out to say 'let me know what this request is.'
  • Like and consent are notions that precede actual notions of the extract. JB has put the extract processing under data call processing and we start this with 'get data call request.' Whether or not this is something that is called in or called out, there is then the notion of approving data call request - part of the validation process. Making sure we want to run it.  This doesn't have to be an API call, but we should have the option not to run it. We can also at least acknowledge it and say to the other side "I'm beginning to run it."
  • Returning the results completes this process. The final column, Data Call Disposition, has a notion of Data Call Results received - did you get it? (Confirmation of receipt)
  • Two-phased consent - after is all is said and done, is it okay to release the results to this carrier? Total aggregated results etc. Called 'release and final consent.' 
  • Results of disposition of data call - last step - what is to be done with resultant data? Is it deleted? Is it stored? Is it archived? (i.e., Closure)
  • Overall, this diagram shows, logically/conceptually, what some of the major interactions might be in the course of the life cycle/data calls, that would abstract this directly through a hosted node UI, without the use of the UI directly into the chain code. There does need to be some type of agent or endpoints that will service this interface regardless of whether it is pulled or pushed.


Discussion Ensued

KS: Raised question of what is in hosted node vs what is in carrier's world. Proposed that we may have great difficulty installing all this on a carrier. 

JM: The reason on the data call management side, that the refresh data call queue exists, as a message, is that x user doesn't have access to the Kubernetes running on the node that queries the chain code to get the outstanding request... So all of this has to do with messaging that is talking to the hosted node via the interface. The adapter says what are the outstanding requests, sends back a list of current requests - a roster - in message form. Kubernetes would be on the other side of the hosted node interface.  We can do this with a very simple script that looks at a table or a file? 

JM: Questions. We say "hosted node" but what is the official logical name of the other node?

JB: This is the side of the interface that would be the adapter, that would know what to do, to talk to the hosted node. 

JM:  There is a box we call the carrier node - and it has an entry point/API with various functions. But what is the name of the thing calling this?

JB: If one has one's own node, one doesn't necessarily need the hosted node interface. One can perform these tasks directly. But for the purpose of the carriers that don't want to host their own node, the concept of the hosted node - this is the fabric node that runs on the network, but the fabric talks to it, so it isn't necessary to maintain that in one's own carrier space. 

JB: all of these mechanisms sit in the adapter on the carrier side. 

KS: We need to refactor the naming of this - the blue box that makes it work, that's in the carrier - more than just the adapter, it may be necessary to deploy some of these other pieces. 

JM: What entity calls the API? Who is the client of this?

KS: As we're doing an extraction for a data call, we're interacting with this part x to say "I consent.' Upon consent, the chain code says "I need to get data from the carrier." There is a trigger - something's listening in a pod in Kubernetes (a trigger). This is an application called the extraction processor that lives in Kubernetes in the hosted node, right next to the chain code (JB: a node on the network - has functions that interact with HL fabric).  JB: consent has to come from carrier, can't come from hosted node.

KS: Yes it can, because the carrier's interacting with the hosted node... There is a UI, we have to use the UI running on the hosted node to interact with the data call. That's what this diagram depicts. 

JB: I'm assuming that it isn't necessarily going be that easier for a carrier who doesn't have the node to access the UI that is running in the node - this may present security issue for getting access to the node?

KS: This can definitely present security issues, it's just a question of getting permissions. Basically it's SAAS at this point, something like an Atlassian Confluence. Just interacting with the system, it's hosted elsewhere. That's what this picture says - something running over there has a UI that we interact with, via permissions from the host. At some point it reaches out, because it needs to run an extraction, sends the extraction across, and that is when this takes place. 

JB: We shouldn't assume implementations of full function node, in cases where you're running your own node with the UI and the different containers that make up that cluster, if there's a dependency on getting access to this, the interface btwn a carrier and the hosted node as a svc could be done with this interface, so that you don't have any dependency on the implementation of the hosted node but you do have an interface to talk to. Ergo, whether it's a call out or call in, it can be terms of not necessarily having access to components inside the hosted noded self. 

KS: It's not receiving from the UI, it's receiving from a module running in the hosted node - the extraction processor program, that grabs the data call, takes the extraction pattern out, sends to the adapter, and says 'Please run this, I will take the results and put the results on this channel so it goes through the analytics node." The results are then transformed into a report by analytics node. We're splitting out the execution of that extraction pattern into the carrier world, because that's a stack we think will run in the carrier.

PA: Hosted node will send a request for data to the member's enterprise, member's enterprise returns the data back to the hosted nodes, and the operator will utilize UI coming from the hosted node. No operator will be operating on the enterprise.

KS: Actually if the enterprise wants to run the whole stack, this should be an option. We've laid out the above because we want to get a tech stack that is approachable and one that includes HL fabric isn't approachable in the short term.

JM: OpenIDL as a service is not a hosted node - but an environment in which you can host any number of member nodes.  

JB: But there would be a hosted node for each carrier. 


 that

KS: Disagreed with this statement. SAAS at this point. Permissions will be required for the purpose of security. This picture suggests a UI, user interacts with via permissions from the host. 

JB: We shouldn't assume implementations of full function node - that if there's a dependency on getting access to this...

...