Design and Implementation
This section describes the design of the Jeopardy service and its implementation.
Plan Fragment Migration
Every plan fragment has details about the possible sections for its associated plan item, including Typical and Maximum duration for each section. Jeopardy relies on this information for its functioning. When a plan is created, Jeopardy uses these details to define all potential plan paths based on the provided sections in the plan fragment. By considering the Typical and Maximum duration for each section, Jeopardy figures out how much time each plan path might take. This helps Jeopardy identify the critical path.
To ensure smooth performance, Jeopardy needs to move plan fragments from the catalog service to its own database before handling order status changes. Jeopardy extracts the necessary details from each plan fragment and stores them in its database for quicker and more efficient processing.
Configurations
Property | Purpose |
---|---|
catalogServiceBaseUrl | The base URL of the catalog service to fetch the plan fragments. |
catalogRetryCount | Number of retries when a request fails because of a server-side exception from the catalog service. |
catalogRetryInterval | Interval in seconds between each retry. |
catalogServiceTrustStorePassword | In case the catalog service is exposed on HTTPS, this property holds the trust store password to establish the SSL handshake. |
catalogServiceTrustStoreType | In case the catalog service is exposed on HTTPS, this property holds the trust store type to establish the SSL handshake. |
catalogServiceTrustStoreFileName | In case the catalog service is exposed on HTTPS, this property holds the trust store file's name, available in classpath, to establish the SSL handshake. |
enableSecureAPI | Indicates whether security is enabled on the catalog service. |
apiKey | The key used to determine the HashKey for inter-service communication when enableSecureAPI is true. |
riskThreshold | Percentage of Typical Duration used to calculate the Hazard Duration. |
outOfScopeThreshold | Percentage of Maximum Duration used to calculate the Out of Scope Duration. |
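The two threshold properties can be read as simple percentage scalings. The following is a minimal sketch, assuming that the Hazard Duration is riskThreshold percent of the Typical Duration and that the Out of Scope Duration is outOfScopeThreshold percent of the Maximum Duration. The class and method names are hypothetical, and the exact formula used by Jeopardy may differ.

```java
final class DerivedDurations {
    private DerivedDurations() {}

    /** Assumed formula: hazard = typical * riskThreshold / 100. */
    static long hazardDuration(long typicalMillis, int riskThresholdPercent) {
        return typicalMillis * riskThresholdPercent / 100;
    }

    /** Assumed formula: outOfScope = maximum * outOfScopeThreshold / 100. */
    static long outOfScopeDuration(long maximumMillis, int outOfScopeThresholdPercent) {
        return maximumMillis * outOfScopeThresholdPercent / 100;
    }
}
```

Under these assumptions, a Typical Duration of 60000 ms with a riskThreshold of 80 gives a Hazard Duration of 48000 ms.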
Process
The following information is extracted or calculated from the plan fragment:
-
PlanFragmentID
-
PlanFragmentName
-
PlanFragmentVersion
-
Sections (list of all the sections mentioned in the plan fragment)
-
PerfValues (this is the map that contains the duration of all the sections available in the plan fragment)
-
Typical Duration: Extracted directly from the plan fragment.
-
Hazard Duration: Calculated using the riskThreshold and Typical Duration.
-
Maximum Duration: Extracted directly from the plan fragment.
-
OutOfScope Duration: Calculated using the outOfScopeThreshold and Maximum Duration.
-
-
InferredPerfValues
-
The plan fragment might not provide information for every traversable section. In this case, Jeopardy infers the PerfValues of the missing traversable sections.
-
Jeopardy prepares an undirected graph of milestones, acting as a map that shows connections between milestones.
-
Each milestone is a point on the map (vertex), and each connection between two milestones (edge) carries the distance between them.
-
Using the Breadth-First Search (BFS) path-finding algorithm, Jeopardy identifies paths and calculates distances between milestones based on performance values.
-
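The inference step can be pictured with a small undirected graph. The following is a minimal sketch, assuming each known section contributes one edge weighted by its duration and that the inferred value is the sum of durations along the path that BFS finds first (the path with the fewest sections, not necessarily the smallest total). `MilestoneGraph`, `addSection`, and `inferDuration` are hypothetical names, not Jeopardy classes.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

final class MilestoneGraph {
    // milestone -> (neighbouring milestone -> duration of the connecting section)
    private final Map<String, Map<String, Long>> edges = new HashMap<>();

    void addSection(String from, String to, long duration) {
        edges.computeIfAbsent(from, k -> new HashMap<>()).put(to, duration);
        edges.computeIfAbsent(to, k -> new HashMap<>()).put(from, duration); // undirected
    }

    /** Returns the summed duration along the fewest-hop path, or -1 if unreachable. */
    long inferDuration(String start, String end) {
        Map<String, Long> distance = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        distance.put(start, 0L);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(end)) {
                return distance.get(current);
            }
            Map<String, Long> neighbours = edges.getOrDefault(current, Collections.emptyMap());
            for (Map.Entry<String, Long> e : neighbours.entrySet()) {
                if (!distance.containsKey(e.getKey())) { // visit each milestone once
                    distance.put(e.getKey(), distance.get(current) + e.getValue());
                    queue.add(e.getKey());
                }
            }
        }
        return -1L;
    }
}
```

For example, if only the START to M1 and M1 to END sections are defined, the START to END value is inferred as the sum of the two.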
Approaches
Plan fragment migrations are performed using the following methods:
-
REST API: Jeopardy introduces a new REST API for plan fragment migration. When triggered, this API initializes the migration process by making a REST call to the v1/planfragmentmodel/all endpoint of the catalog service. Each call fetches 20 plan fragments. To handle multiple calls efficiently, Jeopardy uses Java's CompletableFuture to run them in parallel. The migration endpoint is /v1/plan-fragment/migrate.
-
Plan Fragment Refresh: Whenever a plan fragment is added to the catalog service through REST or JMS, the catalog service dispatches a notification to the tibco.fos.global.cache.clean.publish topic. Jeopardy subscribes to this topic and, on receiving a notification, initiates the migration process for that specific plan fragment by making a REST call to the /v1/planfragmentmodel/bulk endpoint of the catalog service.
-
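The paged, parallel fetch described for the REST API approach can be sketched with CompletableFuture. This is an illustration only: it assumes the total number of plan fragments is known up front and that a caller-supplied function (the hypothetical `fetchPage`, taking an offset and a limit) wraps the actual catalog call. The real paging parameters of the catalog service are not described here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.BiFunction;
import java.util.stream.Collectors;

final class PlanFragmentPager {
    static final int PAGE_SIZE = 20; // plan fragments fetched per call

    private PlanFragmentPager() {}

    /** Fetches all pages in parallel and returns the fragments in page order. */
    static <T> List<T> fetchAll(int totalCount,
                                BiFunction<Integer, Integer, List<T>> fetchPage) {
        int pages = (totalCount + PAGE_SIZE - 1) / PAGE_SIZE; // round up
        List<CompletableFuture<List<T>>> futures = new ArrayList<>();
        for (int page = 0; page < pages; page++) {
            final int offset = page * PAGE_SIZE;
            // Each page is requested asynchronously on the common pool.
            futures.add(CompletableFuture.supplyAsync(() -> fetchPage.apply(offset, PAGE_SIZE)));
        }
        return futures.stream()
                .map(CompletableFuture::join) // wait for each page, preserving order
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }
}
```

Joining the futures in submission order keeps the result deterministic even though the pages complete in any order.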
Start and End Time Computation of Section
Jeopardy calculates two duration maps for each plan item section, each serving a distinct purpose:
-
EarlyStartMap: This map captures the earliest possible start time of a section among all plan paths. It represents the initiation time of a section, considering dependencies and the critical path.
For example,
If the section is in Execution state,
earlyStartTime = actualStartTime of Section
Otherwise,
If the node is the Virtual Start Node, earlyStartTime is set to the plan start time.
For other nodes, earlyStartTime is the maximum, over all parent nodes, of the parent's start time (if the dependency is based on the parent's start milestone) or the parent's end time (if the dependency is based on the parent's end milestone).
-
EarlyFinishMap: This map denotes the earliest finish time of a section among all the plan paths. It signifies the earliest point at which a section could be completed, considering dependencies and the critical path.
For example,
If the section is in the Completed state,
EarlyFinishTime = ActualEndTime
Otherwise,
EarlyFinishTime is calculated as EarlyStartTime + Duration Value + Total Suspension Time (of the section).
The values in these maps depend on all the dependent sections as a plan item section can belong to multiple plan paths.
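The two rules can be sketched as small pure functions. This is a minimal sketch with hypothetical names. It assumes all times are epoch milliseconds and that the caller has already chosen, for each parent, the start or end time that matches the dependency's milestone.

```java
final class EarlyTimes {
    private EarlyTimes() {}

    /** Picks the parent time that matches the milestone the child depends on. */
    static long parentTime(long parentEarlyStart, long parentEarlyFinish,
                           boolean dependsOnEndMilestone) {
        return dependsOnEndMilestone ? parentEarlyFinish : parentEarlyStart;
    }

    /** Early start: actual start when executing, otherwise the latest parent time. */
    static long earlyStart(boolean inExecution, long actualStartTime,
                           long[] parentTimes, long planStartTime) {
        if (inExecution) {
            return actualStartTime;
        }
        long latest = planStartTime; // the Virtual Start Node has no parents
        for (long t : parentTimes) {
            latest = Math.max(latest, t);
        }
        return latest;
    }

    /** Early finish: actual end when completed, otherwise start + duration + suspension. */
    static long earlyFinish(boolean completed, long actualEndTime,
                            long earlyStart, long duration, long totalSuspension) {
        return completed ? actualEndTime : earlyStart + duration + totalSuspension;
    }
}
```

The same functions would be applied once per duration type, with the corresponding duration value passed to `earlyFinish`.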
For each duration type, the calculations are as follows:
Communication Flow for Plan Monitoring
Submit Order Execution Plan
Jeopardy maintains records of plan item, milestone, and plan completion timestamps. To process the status change notifications dispatched by the Orchestrator, Jeopardy needs the plan details in advance. AOPD ensures this by submitting the plan to Jeopardy via a REST API before sending it to the Orchestrator. As a result, Jeopardy processes the plan before it receives status change notifications from the Orchestrator.
Configurations
Property | Purpose |
---|---|
riskThreshold | Percentage of Typical Duration used to calculate the Hazard Duration. |
outOfScopeThreshold | Percentage of Maximum Duration used to calculate the Out of Scope Duration. |
Process
On receiving the plan, Jeopardy populates the Plan_Instance, Plan_Item_Instance, and Milestone tables, including the Virtual Start and Virtual End plan items.
Database tables
It populates the following tables:
-
Plan_instance
-
Plan_item_instance
-
Milestone
Status Change Notification Listener
To monitor the completion status of plans, plan items, and milestones, Jeopardy subscribes to the outbound status change notifications dispatched by the Orchestrator. It selectively processes the following types of notifications:
Order Status Change Notification
The purpose of monitoring the order status change notifications is to enable Jeopardy to respond to specific statuses, particularly the Withdrawn status. When the Orchestrator dispatches an order status change notification indicating that an order has been withdrawn, Jeopardy listens to this notification and takes appropriate actions to reflect the updated status.
Behavior
When Jeopardy receives an order status change notification with the newStatus set to "Withdrawn":
-
Jeopardy updates the status of the associated plan to "Withdrawn" without deleting the plan instance from the database. This approach ensures graceful handling of any other pending notifications.
-
Additionally, Jeopardy deletes the corresponding entry from the Time Window table if it exists. This action ensures that JeopardyDetectionCycle stops monitoring the plan for this order.
Plan Status Change Notification
This section describes how Jeopardy handles the notifications sent whenever the status of a plan changes.
Transition from Pending to Execution
When a plan transitions from "Pending" to "Execution", plan execution starts. Jeopardy processes this transition, updates the relevant tables, and prepares the data structures needed to start plan monitoring and management. The detailed process includes:
-
Move Plan to Execution
Jeopardy updates specific columns in the plan_instance table to reflect the transition:
-
planStartTime: Set to eventTimeInMillis.
-
actualStartTime: Set to eventTimeInMillis.
-
lastStatusChangeTime: Updated to eventTimeInMillis.
-
status: Changed to "Execution".
-
startNotificationReceived: Marked as "true".
-
currentRiskRegion: Set to "NORMAL".
-
-
Complete Virtual Start Plan Item
The Virtual Start Plan Item (identified by id = "__START_PLAN_ITEM") is marked as completed by updating the relevant columns in the Plan_Item_Instance table:
-
status: Set to "COMPLETE".
-
actualEndTime: Updated to eventTimeInMillis.
-
-
Complete All Milestones in Virtual Plan Item
All milestones within the virtual plan item are marked as completed by updating the status and actualRelease columns in the Milestone table:
-
status: Set to "COMPLETE".
-
actualRelease: Updated to eventTimeInMillis.
-
-
Dispatch Initial Plan Path Request
A request to prepare the Initial Plan Path is dispatched to the planPathRequestNotificationDeliveryQueue queue.
For more information, see the PlanPathRequestEventListener section.
Transition from Execution to Suspend
When a plan is suspended, Jeopardy records the change so that the plan suspension time can be incorporated into the monitoring process. The detailed process involves the following steps:
-
Move Plan to Suspension
Jeopardy updates the plan_instance table to accurately represent the latest change in the plan:
-
status: Set to "Suspended".
-
-
Suspend the Started Adjacency
All sections currently in "Execution" status are marked as "Suspended" to acknowledge the plan's suspension:
-
sectionStatus: Set to "Suspended".
-
previousSectionStatus: Set to "Execution".
-
lastStatusChangeTime: Updated to eventTimeInMillis.
-
-
Removing Entries from the Time Window table
As the plan enters a suspended state, Jeopardy stops monitoring it by removing all corresponding entries for the plan from the time_window table.
Transition from Suspend to Execution
When a plan transitions back to the Execution state from Suspension, Jeopardy considers the duration the plan spent in suspension. During the Jeopardy Detection Cycle, if the plan experienced a period of suspension, Jeopardy adds this duration to the predictedEndTime to assess if the plan is at risk. The detailed process is outlined as follows:
-
Move plan to Execution
Jeopardy updates the plan_instance table to accurately reflect the latest change.
status = Execution
-
Restart all suspended adjacency
-
For all sections that are suspended, Jeopardy calculates the duration the section spent in the suspended state.
suspensionTime = eventTimeInMillis - lastStatusChangeTime
-
It updates the earlyFinishMap with the additional suspensionTime.
-
Jeopardy then updates the plan_adjacency table with the following:
-
sectionStatus = Start
-
previousSectionStatus = SUSPENDED
-
suspensionTime = Computed suspensionTime
-
lastStatusChangeTime = eventTimeInMillis
-
-
-
Dispatch Rebuild Plan Path Request
The request to rebuild the plan path is dispatched to the planPathRequestNotificationDeliveryQueue.
Refer to PlanPathRequestEventListener section for more information.
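The per-section bookkeeping in the restart step can be sketched as follows. This is a minimal in-memory sketch with a hypothetical `SectionState` class standing in for a plan_adjacency row. It assumes that suspension time accumulates across repeated suspensions, which is an assumption rather than something stated above.

```java
final class SectionState {
    String sectionStatus = "Suspended";
    String previousSectionStatus = "Execution";
    long suspensionTime;        // total time spent suspended so far
    long lastStatusChangeTime;  // time of the most recent status change
    long earlyFinishTime;       // entry from the earlyFinishMap for this section

    /** Applies the restart steps for one suspended section. */
    void resume(long eventTimeInMillis) {
        long suspended = eventTimeInMillis - lastStatusChangeTime;
        earlyFinishTime += suspended; // push the expected finish out by the pause
        suspensionTime += suspended;  // assumption: accumulate across suspensions
        previousSectionStatus = "SUSPENDED";
        sectionStatus = "Start";
        lastStatusChangeTime = eventTimeInMillis;
    }
}
```

A section suspended at 1000 and resumed at 1600 therefore has its early finish time moved out by 600 ms.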
Transition to Complete or Canceled
When a plan reaches a final state, Jeopardy takes specific actions to account for this transition and appropriately updates its records. The process involves the following steps:
-
Move plan to Complete or Canceled
Jeopardy updates the plan_instance table with the following:
-
status = Complete or Canceled
-
actualEndTime = eventTimeInMillis
-
lastStatusChangeTime = eventTimeInMillis
-
-
Complete Virtual End Plan Item
The Virtual End Plan Item (id = "__END_PLAN_ITEM") is marked as completed, signifying that the plan has reached its final state.
-
status = Complete
-
actualEndTime = eventTimeInMillis
-
-
Complete all Milestones in Virtual End Plan Item
All milestones of the Virtual End Plan Item are marked as completed in the milestone table:
-
status = Complete
-
actualRelease = eventTimeInMillis
-
-
Purge All Data for Short-Lived Orders
Short-lived orders are not monitored by Jeopardy. Thus, once the plan reaches its final state for a short-lived order, all corresponding data are removed from the following tables:
-
plan_instance
-
plan_item_instance
-
milestones
-
-
Compute Plan Expected Finish Times and Determine Risk Region
-
Plan Expected Finish Times are computed based on the last real nodes of the Critical Paths.
-
Plan Expected Typical Finish Time (planExpectedTypicalFinishTime) is calculated using the TypicalEarlyFinishTime state of the last real node of the Typical Critical Path.
-
Plan Expected Maximum Finish Time (planExpectedMaximumFinishTime) is calculated using the MaximumEarlyFinishTime state of the last real node of the Maximum Critical Path.
-
Plan Expected Out of Scope Finish Time (planExpectedOosFinishTime) is calculated using the OosEarlyFinishTime state of the last real node of the Maximum Critical Path.
-
The Risk Region is determined based on the actualEndTime in comparison to the expected finish times:
-
If actualEndTime < planExpectedTypicalFinishTime, riskRegion = Normal
-
If planExpectedTypicalFinishTime < actualEndTime < planExpectedMaximumFinishTime, riskRegion = Hazard
-
If planExpectedMaximumFinishTime < actualEndTime < planExpectedOosFinishTime, riskRegion = Critical
-
If actualEndTime > planExpectedOosFinishTime, riskRegion = Out of Scope
-
-
The computed riskRegion is updated in plan_instance:
-
currentRiskRegion = riskRegion
-
-
-
Complete all Sections of Virtual End Plan Item
All sections for the Virtual End Plan Item are marked as completed in the plan_adjacency table.
-
Removing Entries from the Time Window table
As the plan enters a final state, Jeopardy stops monitoring by removing all corresponding entries for the plan from the time_window table.
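The risk region comparison in the Compute Plan Expected Finish Times step can be sketched as a chain of checks. The enum and class names are hypothetical. The rules above use strict inequalities and do not say what happens when actualEndTime equals a boundary, so this sketch assumes a boundary value falls into the more severe region.

```java
enum PlanRiskRegion { NORMAL, HAZARD, CRITICAL, OUT_OF_SCOPE }

final class PlanRiskClassifier {
    private PlanRiskClassifier() {}

    /** Compares the actual end time with the three expected finish times. */
    static PlanRiskRegion classify(long actualEndTime,
                                   long planExpectedTypicalFinishTime,
                                   long planExpectedMaximumFinishTime,
                                   long planExpectedOosFinishTime) {
        if (actualEndTime < planExpectedTypicalFinishTime) {
            return PlanRiskRegion.NORMAL;
        }
        if (actualEndTime < planExpectedMaximumFinishTime) {
            return PlanRiskRegion.HAZARD;
        }
        if (actualEndTime < planExpectedOosFinishTime) {
            return PlanRiskRegion.CRITICAL;
        }
        return PlanRiskRegion.OUT_OF_SCOPE;
    }
}
```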
Plan Item Status Change Notification
Whenever a plan item undergoes state changes, the orchestrator dispatches plan item status change notifications. These notifications fall into two categories based on the Action header in the JMS message: REQUEST and RESPONSE.
The Action REQUEST indicates that a PlanItemExecuteRequest was dispatched during this transition, typically occurring when a plan item shifts from Pending to Execution. Notifications with Action RESPONSE signify that this transition occurred based on the response from the Southbound System.
Processing Plan Item Status Change Notification with Action REQUEST
-
Complete the Start milestone
-
A Plan Item Execute request is dispatched during this transition, which indicates the completion of the start milestone for this plan item.
-
Jeopardy updates the Start milestone of this plan item by modifying the milestone table with the following information:
-
status = COMPLETE
-
actualRelease = eventTimeInMillis
-
-
-
Mark the plan item as under processing
isUnderProcessing is updated to true in the plan_item_instance table for this plan item.
-
Move the plan item to Execution
The plan_item_instance table is updated with the following information:
-
status = "EXECUTION"
-
riskRegion = "NORMAL"
-
actualStartTime = eventTimeInMillis
-
typicalEndTimestamp = eventTimeInMillis + planItemTypicalDuration (Typical Duration of the Start to End section available in PC)
-
maximumEndTimestamp = eventTimeInMillis + planItemMaximumDuration (Maximum Duration of the Start to End section available in PC)
-
-
Start all sections with the start milestone as Plan Item Start Milestone
The plan_adjacency table, where start_milestone = "START", is updated with the following information:
-
sectionStatus = "Start"
-
actualStartTime = eventTimeInMillis
-
-
Unmark the plan item from under processing
isUnderProcessing is updated to false in the plan_item_instance table for this plan item.
Processing Plan Item Status Change Notification with Action RESPONSE
For non-executing plan items, the transition occurs directly from PENDING to COMPLETE. In this case, the orchestrator dispatches the plan item status change notification with Action as RESPONSE. Therefore, handling such plan items involves some steps similar to those done when the action is REQUEST.
-
Steps Specific to Non-Executing Plan Items
-
Update Actual Start Time
-
Non-executing plan items transition directly from PENDING to COMPLETE. Hence, startTime and endTime are the same for such plan items.
-
Jeopardy updates the plan_item table with the following information:
-
actual_start_time = eventTimeInMillis
-
-
-
Complete the Start Milestone
Jeopardy updates the milestone table, where the milestone id is START, with the following information:
-
actualRelease = eventTimeInMillis
-
status = Complete
-
-
Start all sections with the start milestone as Plan Item Start Milestone
The plan_adjacency table, where start_milestone = "START", is updated with the following information:
-
sectionStatus = "Start"
-
actualStartTime = eventTimeInMillis
-
-
-
Mark the plan item as under processing
isUnderProcessing is updated to true in the plan_item_instance table for this plan item.
-
Complete the plan item
Jeopardy updates the plan_item_instance table with the following information:
-
status = COMPLETE
-
actualEndTime = eventTimeInMillis
-
-
Complete the END Milestone
Jeopardy updates the milestone table, where milestoneid = 'END', with the following information:
-
status = Complete
-
actualRelease = eventTimeInMillis
-
-
Update all sections for this plan item where endMilestone = 'END'
-
Compute the risk region
-
Compute the time taken for this section to complete:
timeTaken = eventTimeInMillis - sectionStartTime - sectionSuspensionTime
-
If timeTaken > section's Maximum Duration, riskRegion = CRITICAL
-
Else if timeTaken > section's Typical Duration, riskRegion = HAZARD
-
Else, riskRegion = NORMAL
-
-
Update the section in the plan_adjacency table with the following information:
-
actualEndTime = eventTimeInMillis
-
sectionStatus = COMPLETE
-
riskRegion = Computed Risk Region
-
-
Remove the section from the Time window table as this section is completed and no longer requires monitoring.
-
-
Unmark the plan item from under processing
isUnderProcessing is updated to false in the plan_item_instance table for this plan item
-
Dispatch rebuild plan path request
The request to rebuild the plan path is dispatched to the planPathRequestNotificationDeliveryQueue.
Refer to the PlanPathRequestEventListener section for more information.
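The section risk region rule used when a section completes can be written as a small function. This is a minimal sketch with hypothetical names. It takes the formula and the order of the checks from the steps above and assumes all values are in milliseconds.

```java
final class SectionRisk {
    private SectionRisk() {}

    /** Classifies a completed section by the time it took, excluding suspension. */
    static String riskRegion(long eventTimeInMillis, long sectionStartTime,
                             long sectionSuspensionTime,
                             long typicalDuration, long maximumDuration) {
        long timeTaken = eventTimeInMillis - sectionStartTime - sectionSuspensionTime;
        if (timeTaken > maximumDuration) {
            return "CRITICAL";
        }
        if (timeTaken > typicalDuration) {
            return "HAZARD";
        }
        return "NORMAL";
    }
}
```

The Maximum Duration check comes first. If the order were reversed, a section that exceeded its Maximum Duration would be reported as HAZARD.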
Milestone Status Change Notification
The orchestrator dispatches status notifications only for intermediate milestones. Jeopardy takes note of this status change and performs the following steps:
-
Complete the Milestone
Jeopardy updates the milestone table with the following information:
-
status = Complete
-
actualRelease = eventTimeInMillis
-
-
Mark the Plan item as under processing
isUnderProcessing is updated to true in the plan_item_instance table for this plan item.
-
Process Sections where startMilestoneId = given milestone
Jeopardy updates such sections in plan_adjacency with the following information:
-
sectionStatus = START
-
actualStartTime = eventTimeInMillis
-
-
Process Sections where endMilestoneId = given milestone
For every such section,
-
Compute the risk region
-
Compute the time taken for this section to complete
-
timeTaken = eventTimeInMillis - sectionStartTime - sectionSuspensionTime
-
If timeTaken > section's Maximum Duration, riskRegion = CRITICAL
-
Else if timeTaken > section's Typical Duration, riskRegion = HAZARD
-
Else, riskRegion = NORMAL
-
-
Update the section in the plan_adjacency table with the following information:
-
actualEndTime = eventTimeInMillis
-
sectionStatus = COMPLETE
-
riskRegion = Computed Risk Region
-
-
Remove the section from the Time window table as this section is completed and no longer requires monitoring.
-
-
Unmark the plan item from under processing
isUnderProcessing is updated to false in the plan_item_instance table for this plan item
-
Dispatch Rebuild Plan Path Request
The request to rebuild the plan path is dispatched to the planPathRequestNotificationDeliveryQueue.
Refer to PlanPathRequestEventListener section for more information.
Plan Path Computation
Jeopardy employs a depth-first approach to generate all plan paths and updates earlyStart and earlyFinish for each section. In this methodology, the Virtual Start Node is treated as the root node.
Process
-
For the given node, compute the EarlyStartMap and EarlyFinishMap, starting with the Virtual Start Node.
-
If the node is not a virtual node,
-
Populate the time_window table for MUST_START detection with a typical earlyStartTime as the expectedTime.
-
Populate the time_window table for TYPICAL_DURATION detection with typical earlyFinishTime and MAX_DURATION detection with max earlyFinishTime as expectedTime.
-
-
If the section has a dependency,
-
Repeat the entire process
-
If a section has multiple dependencies, the path branches, creating a separate path of execution for each dependency.
-
-
If the section does not have any dependency
-
Consider the path as ended.
-
Add this path to the list of generated paths.
-
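The traversal described in this process can be sketched as a recursive depth-first walk with backtracking. This is an illustration only: node identifiers are plain strings, the hypothetical `dependents` map lists the nodes that follow each node, the graph is assumed to be acyclic, and the time_window population and early start and finish updates are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class PlanPathGenerator {
    private PlanPathGenerator() {}

    /** Enumerates every path from the root (Virtual Start Node) to a node with no dependents. */
    static List<List<String>> allPaths(String root, Map<String, List<String>> dependents) {
        List<List<String>> paths = new ArrayList<>();
        walk(root, dependents, new ArrayList<>(), paths);
        return paths;
    }

    private static void walk(String node, Map<String, List<String>> dependents,
                             List<String> current, List<List<String>> paths) {
        current.add(node);
        List<String> next = dependents.getOrDefault(node, Collections.emptyList());
        if (next.isEmpty()) {
            paths.add(new ArrayList<>(current)); // path ended: record a copy
        } else {
            for (String child : next) {
                walk(child, dependents, current, paths); // each dependency branches the path
            }
        }
        current.remove(current.size() - 1); // backtrack before returning to the parent
    }
}
```

The number of generated paths grows with the number of branches, which is why a node shared by several branches appears in several paths.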
PlanPathRequestEventListener
This component is a dedicated listener that handles plan path requests at various stages of the plan's lifecycle. A request can be an initial plan path request, an amendment plan path request, or a rebuild plan path request.
Process
Initial Plan Path Request
-
Populate Plan Adjacency
The plan adjacency, a representation of plan item sections within the plan, is computed and stored in the plan_adjacency table.
-
Prepare and Populate Plan Paths
-
On obtaining section information, the system initiates the preparation of all possible plan paths.
-
Computed plan paths are subsequently saved in the plan_path table.
-
-
Determine Critical Plan Path
The critical plan path, denoting the longest sequence through the plan, is computed and stored in the plan_critical_path table.
-
Determine Plan Expected End Time
The predicted end time in the critical path is considered as the plan's expected end time.
-
Determine if the Plan is ShortLived
-
Jeopardy identifies short-lived plans, where the difference between predicted end time and plan start time is less than the specified threshold (shortLivedThresholdInMinutes).
-
Short-Lived plans are excluded from monitoring.
-
-
Populate the Time Window table for All Plan Sections
Sections not yet completed are stored in the time_window table, enabling Jeopardy to commence monitoring during the next Jeopardy Detection Cycle.
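Two of the steps above, selecting the critical path and testing whether a plan is short-lived, can be sketched as follows. The sketch assumes that the critical path is the path with the largest predicted end time, that times are epoch milliseconds, and that shortLivedThresholdInMinutes is converted to milliseconds before the comparison. The class and method names are hypothetical.

```java
import java.util.List;

final class CriticalPathSelector {
    private CriticalPathSelector() {}

    /** Returns the index of the path with the largest predicted end time, or -1 if there are none. */
    static int criticalPathIndex(List<Long> predictedEndTimes) {
        int best = -1;
        for (int i = 0; i < predictedEndTimes.size(); i++) {
            if (best < 0 || predictedEndTimes.get(i) > predictedEndTimes.get(best)) {
                best = i;
            }
        }
        return best;
    }

    /** A plan is short-lived when its expected length is below the threshold. */
    static boolean isShortLived(long planStartTime, long predictedEndTime,
                                long shortLivedThresholdInMinutes) {
        long thresholdMillis = shortLivedThresholdInMinutes * 60_000L;
        return predictedEndTime - planStartTime < thresholdMillis;
    }
}
```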
Amendment Plan Path Request
-
Populate Plan Adjacency
-
The plan adjacency, a representation of plan item sections within the plan, is computed and stored in the plan_adjacency table.
-
In the case of an amendment, the plan might contain sections that were previously present and some newly introduced sections.
-
Jeopardy deletes sections that are no longer present in the plan and adds new sections, while keeping existing ones intact.
-
-
Prepare and Populate Plan Paths
-
On obtaining section information, the system initiates the preparation of all possible plan paths.
-
Computed plan paths are subsequently saved in the plan_path table.
-
-
Determine Critical Plan Path
The critical plan path, denoting the longest sequence through the plan, is computed and stored in the plan_critical_path table.
-
Determine Plan Expected End Time
The predicted end time in the critical path is considered as the plan's expected end time.
-
Determine if the Plan is ShortLived
-
Jeopardy identifies short-lived plans, where the difference between predicted end time and plan start time is less than the specified threshold (shortLivedThresholdInMinutes).
-
Short-Lived plans are excluded from monitoring.
-
-
Populate the Time Window table for All Plan Sections
Sections not yet completed are stored in the time_window table, enabling Jeopardy to commence monitoring during the next Jeopardy Detection Cycle.
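The adjacency reconciliation for an amendment amounts to a set difference. The sketch below assumes each section can be identified by a string key. The `AdjacencyDiff` class is hypothetical and only computes which sections to delete, add, and keep. It does not touch the plan_adjacency table.

```java
import java.util.HashSet;
import java.util.Set;

final class AdjacencyDiff {
    final Set<String> toDelete; // present before the amendment, absent after
    final Set<String> toAdd;    // introduced by the amendment
    final Set<String> toKeep;   // present in both, left intact

    AdjacencyDiff(Set<String> existing, Set<String> amended) {
        toDelete = new HashSet<>(existing);
        toDelete.removeAll(amended);
        toAdd = new HashSet<>(amended);
        toAdd.removeAll(existing);
        toKeep = new HashSet<>(existing);
        toKeep.retainAll(amended);
    }
}
```

Keeping the unchanged sections intact preserves any actual start and end times already recorded for them.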
Rebuild Plan Path Request
-
Stop Plan Monitoring during Plan Path Rebuild
Delete all entries from the time_window table corresponding to the given planId and tenantId.
-
Check if any plan items are still being processed
-
Given the possibility of multiple plan items being processed simultaneously by Jeopardy, the system updates the isUnderProcessing flag to true before processing each plan item. This flag is then set to false once Jeopardy completes the processing of that specific plan item.
-
If any plan item for the plan is still being processed, skip the plan path rebuild request.
-
-
Prepare and Populate Plan Paths
-
By now some of the sections are updated with their actualStartTime and actualEndTime.
-
Jeopardy uses this information and prepares plan paths again.
-
The system then saves the updated plan paths in the Plan_Path table.
-
-
Determine Critical Plan Path
The critical plan path, denoting the longest sequence through the plan, is computed and stored in the plan_critical_path table.
-
Determine Plan Expected End Time
The predicted end time in the critical path is considered as the plan's expected end time.
-
Populate the Time Window table for All Plan Sections
Sections not yet completed are stored in the time_window table, enabling Jeopardy to commence monitoring during the next Jeopardy Detection Cycle.
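The guard in the second step of the rebuild can be expressed as a simple check over the isUnderProcessing flags of the plan's items. The following is a minimal sketch with a hypothetical helper. How the flags are read from the plan_item_instance table is not shown.

```java
import java.util.Collection;

final class RebuildGuard {
    private RebuildGuard() {}

    /** A rebuild proceeds only when no plan item of the plan is still being processed. */
    static boolean shouldRebuild(Collection<Boolean> isUnderProcessingFlags) {
        for (boolean flag : isUnderProcessingFlags) {
            if (flag) {
                return false; // skip this rebuild request
            }
        }
        return true;
    }
}
```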
Pending Jeopardy Events
The orchestrator dispatches status change notifications for all orders. In multi-instance scenarios, there is a possibility that certain order status change notifications are picked by one instance while another instance is still processing plan development notifications. To handle this, Jeopardy introduces the concept of pending jeopardy events.
Plan Availability
Jeopardy processes incoming notifications only if the plan is available for processing. If the plan is not available for processing, the incoming event is saved in the pending_jeopardy_events table. These events are processed after a plan path is created for the respective plan.
Conditions of Plan Availability
-
Is the plan available in the plan_instance table? If not, the plan is not available.
-
If the plan is available, is the plan under amendment? If yes, the plan is not available.
-
If the plan is available and not under amendment, do plan paths exist for the given plan? If not, the plan is not available.
-
If the plan is available, not under amendment, and the plan path request is processed, then the plan is available.
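The conditions above reduce to three checks evaluated in order. The following sketch uses a hypothetical helper with the three facts passed in as booleans. Looking them up in the plan_instance and plan_path tables is outside its scope.

```java
final class PlanAvailability {
    private PlanAvailability() {}

    /** Returns true only when the plan exists, is not under amendment, and has plan paths. */
    static boolean isAvailable(boolean existsInPlanInstance,
                               boolean underAmendment,
                               boolean planPathsExist) {
        if (!existsInPlanInstance) {
            return false;
        }
        if (underAmendment) {
            return false;
        }
        return planPathsExist;
    }
}
```

When the result is false, the incoming event would be saved as a pending jeopardy event instead of being processed immediately.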
Processing of Pending Jeopardy events
After a plan path request is processed for a newly created plan or for a plan that was amended, Jeopardy dispatches a notification to the jeopardy.pending.events.notification queue. Jeopardy uses this queue to process all pending jeopardy events for a given plan asynchronously. After a pending jeopardy event is processed, it is deleted from the pending_jeopardy_events table.