Design and Implementation

This section describes the design of the Jeopardy service and its implementation.

Plan Fragment Migration

Every plan fragment has details about the possible sections for its associated plan item, including Typical and Maximum duration for each section. Jeopardy relies on this information for its functioning. When a plan is created, Jeopardy uses these details to define all potential plan paths based on the provided sections in the plan fragment. By considering the Typical and Maximum duration for each section, Jeopardy figures out how much time each plan path might take. This helps Jeopardy identify the critical path.

To ensure smooth performance, Jeopardy needs to move plan fragments from the catalog service to its own database before handling order status changes. Jeopardy extracts the necessary details from each plan fragment and stores them in its database for quicker and more efficient processing.

Configurations

Property                          Purpose
catalogServiceBaseUrl             Base URL of the catalog service from which the plan fragments are fetched.
catalogRetryCount                 Number of retries when a request fails because of a server-side exception in the catalog service.
catalogRetryInterval              Interval, in seconds, between retries.
catalogServiceTrustStorePassword  If the catalog service is exposed over HTTPS, the trust store password used to establish the SSL handshake.
catalogServiceTrustStoreType      If the catalog service is exposed over HTTPS, the trust store type used to establish the SSL handshake.
catalogServiceTrustStoreFileName  If the catalog service is exposed over HTTPS, the name of the trust store file, available on the classpath, used to establish the SSL handshake.
enableSecureAPI                   Indicates whether security is enabled on the catalog service.
apiKey                            Key used to determine the HashKey for inter-service communication when enableSecureAPI is true.
riskThreshold                     Percentage of the Typical Duration used to calculate the Hazard Duration.
outOfScopeThreshold               Percentage of the Maximum Duration used to calculate the OutOfScope Duration.

Process

The following information is extracted or calculated from the plan fragment:

  • PlanFragmentID

  • PlanFragmentName

  • PlanFragmentVersion

  • Sections (list of all the sections mentioned in the plan fragment)

  • PerfValues (this is the map that contains the duration of all the sections available in the plan fragment)

    • Typical Duration: Extracted directly from the plan fragment.

    • Hazard Duration: Calculated using the riskThreshold and Typical Duration.

    • Maximum Duration: Extracted directly from the plan fragment.

    • OutOfScope Duration: Calculated using the outOfScopeThreshold and Maximum Duration.

  • InferredPerfValues

    • There could be a possibility that not all possible traversable section information is provided in the Plan Fragment. In this case, Jeopardy can infer the PerfValues of those traversable sections.

    • Jeopardy prepares an undirected graph of milestones, acting as a map that shows connections between milestones.

    • Each milestone is a point on the map (vertex), and each connection between two milestones (edge) carries the distance between them.

    • Using the Breadth-First Search (BFS) path-finding algorithm, Jeopardy identifies paths and calculates distances between milestones based on performance values.
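
The inference step described above could be sketched as follows. This is a minimal illustration, not Jeopardy's actual implementation: the class MilestoneGraph, its method names, and the assumption that an inferred duration is the sum of the known section durations along the path found by BFS are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical sketch: an undirected graph of milestones whose edges carry
// the known duration of the section between two milestones.
class MilestoneGraph {
    private final Map<String, Map<String, Long>> edges = new HashMap<>();

    // Register a known section (edge) between two milestones (vertices).
    void addSection(String from, String to, long duration) {
        edges.computeIfAbsent(from, k -> new HashMap<>()).put(to, duration);
        edges.computeIfAbsent(to, k -> new HashMap<>()).put(from, duration);
    }

    // BFS from 'from' to 'to'. Returns the summed duration along the path with
    // the fewest sections, or empty if the milestones are not connected.
    OptionalLong inferDuration(String from, String to) {
        Map<String, Long> distance = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        distance.put(from, 0L);
        queue.add(from);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (current.equals(to)) {
                return OptionalLong.of(distance.get(current));
            }
            for (Map.Entry<String, Long> e : edges.getOrDefault(current, Map.of()).entrySet()) {
                if (!distance.containsKey(e.getKey())) {
                    distance.put(e.getKey(), distance.get(current) + e.getValue());
                    queue.add(e.getKey());
                }
            }
        }
        return OptionalLong.empty();
    }
}
```

For example, if only the START-to-M1 and M1-to-END sections are provided, the sketch infers the START-to-END duration by walking the graph across both sections.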

Approaches

Plan fragment migrations are performed using the following methods:

  • REST API: Jeopardy introduces a REST API for plan fragment migration:

    /v1/plan-fragment/migrate

    When triggered, this API starts the migration process by making a REST call to the /v1/planfragmentmodel/all endpoint of the catalog service, fetching 20 plan fragments per call. To handle multiple calls efficiently, it uses Java's CompletableFuture to run them in parallel.

  • Plan Fragment Refresh: Whenever a plan fragment is added to the catalog service through REST or JMS, the catalog service dispatches a notification to the tibco.fos.global.cache.clean.publish topic. Jeopardy subscribes to this topic and, on receiving a notification, initiates the migration process for that specific plan fragment by making a REST call to the /v1/planfragmentmodel/bulk endpoint of the catalog service.
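
The parallel fetching described for the REST API approach might look like the following sketch. The class PlanFragmentMigrator, the fetchPage callback (a stand-in for the REST call to the catalog service), and the way pages are numbered are assumptions made for illustration; only the batch size of 20 and the use of CompletableFuture come from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical sketch of fetching plan fragments in parallel batches.
class PlanFragmentMigrator {
    static final int PAGE_SIZE = 20; // plan fragments fetched per call

    // 'fetchPage' stands in for the REST call to the catalog service; how the
    // real endpoint pages its results is an assumption of this sketch.
    static List<String> fetchAll(int totalFragments, IntFunction<List<String>> fetchPage) {
        int pages = (totalFragments + PAGE_SIZE - 1) / PAGE_SIZE; // ceiling division
        // Start one asynchronous fetch per page so the calls run in parallel.
        List<CompletableFuture<List<String>>> futures = IntStream.range(0, pages)
                .mapToObj(page -> CompletableFuture.supplyAsync(() -> fetchPage.apply(page)))
                .collect(Collectors.toList());
        // Wait for every page and merge the results in page order.
        List<String> all = new ArrayList<>();
        for (CompletableFuture<List<String>> future : futures) {
            all.addAll(future.join());
        }
        return all;
    }
}
```

Joining the futures in submission order keeps the merged list deterministic even though the pages themselves may complete in any order.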

Start and End Time Computation of Section

Jeopardy calculates two duration maps for each plan item section, each serving a distinct purpose:

  • EarlyStartMap: This map captures the earliest possible start time of a section among all plan paths. It represents the initiation time of a section, considering dependencies and the critical path.

    For example,

    If the section is in Execution state,

    earlyStartTime = actualStartTime of Section

    Otherwise,

    If the node is the Virtual Start Node, earlyStartTime is set to the plan start time.

    For other nodes, earlyStartTime is the maximum, across all parent nodes, of the parent's start time (if the dependency is based on the parent's start milestone) or the parent's end time (if the dependency is based on the parent's end milestone).

  • EarlyFinishMap: This map denotes the earliest finish time of a section among all the plan paths. It signifies the earliest point at which a section could be completed, considering dependencies and the critical path.

    For example,

    If the section is in the Completed state,

    EarlyFinishTime = ActualEndTime

    Otherwise,

    EarlyFinishTime is calculated as EarlyStartTime + Duration Value + Total Suspension Time (of the section).

The values in these maps depend on all the dependent sections as a plan item section can belong to multiple plan paths.
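
The rules above for a single duration type can be sketched as follows. The class SectionTiming and its parameter names are hypothetical, and the parent times are assumed to be already resolved to the relevant start or end time for each dependency.

```java
// Hypothetical sketch of the early start and early finish rules for one
// duration type. All times are epoch milliseconds.
class SectionTiming {
    // Early start: the actual start time if the section is already executing,
    // the plan start time for the Virtual Start Node, otherwise the latest
    // time contributed by any parent. 'parentTimes' is assumed to be non-empty
    // for non-virtual nodes, since they descend from the Virtual Start Node.
    static long earlyStart(boolean inExecution, long actualStartTime,
                           boolean virtualStartNode, long planStartTime,
                           long[] parentTimes) {
        if (inExecution) {
            return actualStartTime;
        }
        if (virtualStartNode) {
            return planStartTime;
        }
        long latest = Long.MIN_VALUE;
        for (long parentTime : parentTimes) {
            latest = Math.max(latest, parentTime);
        }
        return latest;
    }

    // Early finish: the actual end time if the section is completed, otherwise
    // early start plus the duration value plus the total suspension time.
    static long earlyFinish(boolean completed, long actualEndTime,
                            long earlyStart, long duration, long totalSuspensionTime) {
        return completed ? actualEndTime : earlyStart + duration + totalSuspensionTime;
    }
}
```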

For each duration type, the calculations are as follows:

Communication Flow for Plan Monitoring

Submit Order Execution Plan

Jeopardy maintains records of plan item, milestone, and plan completion timestamps. To effectively process status change notifications dispatched by the Orchestrator, Jeopardy requires prior knowledge of the plan details. AOPD ensures this by submitting the plan to Jeopardy via a REST API before sending it to the Orchestrator. This ensures that Jeopardy processes the plan before receiving status change notifications from the Orchestrator.

Configurations

Property             Purpose
riskThreshold        Percentage of the Typical Duration used to calculate the Hazard Duration.
outOfScopeThreshold  Percentage of the Maximum Duration used to calculate the OutOfScope Duration.

Process

On receiving the plan, Jeopardy populates the Plan_Instance, Plan_Item_Instance, and Milestone tables, including the Virtual Start and Virtual End plan items.

Database tables

It populates the following tables:

  • Plan_instance

  • Plan_item_instance

  • Milestone

Status Change Notification Listener

To monitor the completion status of plans, plan items, and milestones, Jeopardy subscribes to the outbound status change notifications dispatched by the Orchestrator. It selectively processes the following types of notifications:

Order Status Change Notification

The purpose of monitoring the order status change notifications is to enable Jeopardy to respond to specific statuses, particularly the Withdrawn status. When the Orchestrator dispatches an order status change notification indicating that an order has been withdrawn, Jeopardy listens to this notification and takes appropriate actions to reflect the updated status.

Behavior

When Jeopardy receives an order status change notification with the newStatus set to "Withdrawn":

  • Jeopardy updates the status of the associated plan to "Withdrawn" without deleting the plan instance from the database. This approach ensures graceful handling of any other pending notifications.

  • Additionally, Jeopardy deletes the corresponding entry from the Time Window table if it exists. This action ensures that JeopardyDetectionCycle stops monitoring the plan for this order.

Plan Status Change Notification

This section describes how Jeopardy handles the notifications dispatched whenever the status of a plan changes.

Transition from Pending to Execution

When a plan transitions from "Pending" to "Execution", plan execution starts. Jeopardy processes this transition, updates the relevant tables, and prepares the data structures needed to start plan monitoring and management. The detailed process includes:

  1. Move Plan to Execution

    Jeopardy updates specific columns in the plan_instance table to reflect the transition:

    • planStartTime: Set to eventTimeInMillis.

    • actualStartTime: Set to eventTimeInMillis.

    • lastStatusChangeTime: Updated to eventTimeInMillis.

    • status: Changed to "Execution".

    • startNotificationReceived: Marked as "true".

    • currentRiskRegion: Set to "NORMAL".

  2. Complete Virtual Start Plan Item

    The Virtual Start Plan Item (identified by id = "__START_PLAN_ITEM") is marked as completed by updating the relevant columns in the Plan_Item_Instance table:

    • status: Set to "COMPLETE".

    • actualEndTime: Updated to eventTimeInMillis.

  3. Complete All Milestones in Virtual Plan Item

    All milestones within the virtual plan item are marked as completed by updating the status and actualRelease columns in the Milestone table:

    • status: Set to "COMPLETE".

    • actualRelease: Updated to eventTimeInMillis.

  4. Dispatch Initial Plan Path Request

    Request to prepare the Initial Plan Path is dispatched to the planPathRequestNotificationDeliveryQueue queue.

    For more information, see the PlanPathRequestEventListener section.

Transition to Suspend

When a plan is suspended, Jeopardy records the change so that the plan suspension time can be incorporated into the monitoring process. The detailed process involves the following steps:

  1. Move Plan to Suspension

    Jeopardy updates the plan_instance table to accurately reflect the latest change in the plan:

    • status: Set to "Suspended".

  2. Suspend the Started Adjacency

    All sections currently in "Execution" status are marked as "Suspended" to acknowledge the plan's suspension:

    • sectionStatus: Set to "Suspended".

    • previousSectionStatus: Set to "Execution".

    • lastStatusChangeTime: Updated to eventTimeInMillis.

  3. Removing Entries from the Time Window table

    As the plan enters a suspended state, Jeopardy ceases monitoring by eliminating all corresponding entries for the plan from the time_window table.

Transition from Suspend to Execution

When a plan transitions back to the Execution state from Suspension, Jeopardy considers the duration the plan spent in suspension. During the Jeopardy Detection Cycle, if the plan experienced a period of suspension, Jeopardy adds this duration to the predictedEndTime to assess if the plan is at risk. The detailed process is outlined as follows:

  1. Move plan to Execution

    Jeopardy updates the plan_instance table to accurately reflect the latest change.

    status = Execution

  2. Restart all suspended adjacency

    1. For all sections that are suspended, Jeopardy calculates the duration the section spent in the suspended state.

      suspensionTime = eventTimeInMillis - lastStatusChangeTime

    2. It updates the earlyFinishMap with the additional suspensionTime.

    3. Jeopardy then updates the plan_adjacency table with the following:

      • sectionStatus = Start

      • previousSectionStatus = SUSPENDED

      • suspensionTime = Computed suspensionTime

      • lastStatusChangeTime = eventTimeInMillis

  3. Dispatch Rebuild Plan Path Request

    The request to rebuild the plan path is dispatched to the planPathRequestNotificationDeliveryQueue.

    For more information, see the PlanPathRequestEventListener section.
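
The suspension arithmetic in step 2 can be sketched as follows. The class SuspensionResume is hypothetical, and the sketch assumes that the earlyFinishMap is keyed by duration type and that every entry is shifted by the same suspension time.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of resuming a suspended section.
class SuspensionResume {
    // Time the section spent suspended, per the formula in step 2.
    static long suspensionTime(long eventTimeInMillis, long lastStatusChangeTime) {
        return eventTimeInMillis - lastStatusChangeTime;
    }

    // Returns a copy of the early finish map with every entry moved later by
    // the time spent in suspension. Keys are assumed to be duration types.
    static Map<String, Long> shiftEarlyFinish(Map<String, Long> earlyFinishMap,
                                              long eventTimeInMillis,
                                              long lastStatusChangeTime) {
        long suspended = suspensionTime(eventTimeInMillis, lastStatusChangeTime);
        Map<String, Long> shifted = new LinkedHashMap<>();
        earlyFinishMap.forEach((durationType, finish) -> shifted.put(durationType, finish + suspended));
        return shifted;
    }
}
```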

Transition to Complete or Canceled

When a plan reaches a final state, Jeopardy takes specific actions to account for this transition and appropriately updates its records. The process involves the following steps:

  1. Move plan to Complete or Canceled

    Jeopardy updates the plan_instance table with the following:

    • status = Complete or Canceled

    • actualEndTime = eventTimeInMillis

    • lastStatusChangeTime = eventTimeInMillis

  2. Complete Virtual End Plan Item

    Virtual End Plan item (id = "__END_PLAN_ITEM") is marked as completed, signifying that the plan has reached its final state.

    • status = Complete

    • actualEndTime = eventTimeInMillis

  3. Complete all Milestones in Virtual End Plan Item

    All milestones of the Virtual End Plan Item are marked as completed in the milestone table:

    • status = Complete

    • actualRelease = eventTimeInMillis

  4. Purge All Data for Short-Lived Orders

    Short-lived orders are not monitored by Jeopardy. Thus, once the plan reaches its final state for a short-lived order, all corresponding data are removed from the following tables:

    • plan_instance

    • plan_item_instance

    • milestone

  5. Compute Plan Expected Finish Times and Determine Risk Region

    1. Plan Expected Finish Times are computed based on the last real nodes of the Critical Paths.

    2. Plan Expected Typical Finish Time (planExpectedTypicalFinishTime) is calculated using the TypicalEarlyFinishTime state of the last real node of the Typical Critical Path.

    3. Plan Expected Maximum Finish Time (planExpectedMaximumFinishTime) is calculated using the MaximumEarlyFinishTime state of the last real node of the Maximum Critical Path.

    4. Plan Expected Out of Scope Finish Time (planExpectedOosFinishTime) is calculated using the OosEarlyFinishTime state of the last real node of the Maximum Critical Path.

    5. The Risk Region is determined based on the actualEndTime in comparison to the expected finish times:

      • If actualEndTime < planExpectedTypicalFinishTime, riskRegion = Normal

      • If planExpectedTypicalFinishTime < actualEndTime < planExpectedMaximumFinishTime, riskRegion = Hazard

      • If planExpectedMaximumFinishTime < actualEndTime < planExpectedOosFinishTime, riskRegion = Critical

      • If actualEndTime > planExpectedOosFinishTime, riskRegion = Out of Scope

    6. The computed riskRegion is updated in plan_instance:

      • currentRiskRegion = riskRegion

  6. Complete all Sections of Virtual End Plan Item

    All sections for the Virtual End Plan Item are marked as completed in the plan_adjacency table.

  7. Removing Entries from the Time Window table

    As the plan enters a final state, Jeopardy stops monitoring by removing all corresponding entries for the plan from the time_window table.
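
The risk region decision in step 5 can be sketched as follows. The class and enum names are hypothetical. The conditions above use strict inequalities and do not say which region applies when actualEndTime equals a boundary; this sketch assumes that a boundary value falls into the higher-risk region.

```java
// Hypothetical sketch of determining the plan's risk region on completion.
class PlanRiskRegion {
    enum RiskRegion { NORMAL, HAZARD, CRITICAL, OUT_OF_SCOPE }

    // Compares the actual end time against the three expected finish times.
    // Assumption: a value exactly on a boundary belongs to the higher region.
    static RiskRegion of(long actualEndTime,
                         long planExpectedTypicalFinishTime,
                         long planExpectedMaximumFinishTime,
                         long planExpectedOosFinishTime) {
        if (actualEndTime < planExpectedTypicalFinishTime) {
            return RiskRegion.NORMAL;
        }
        if (actualEndTime < planExpectedMaximumFinishTime) {
            return RiskRegion.HAZARD;
        }
        if (actualEndTime < planExpectedOosFinishTime) {
            return RiskRegion.CRITICAL;
        }
        return RiskRegion.OUT_OF_SCOPE;
    }
}
```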

Plan Item Status Change Notification

Whenever a plan item undergoes state changes, the orchestrator dispatches plan item status change notifications. These notifications fall into two categories based on the Action header in the JMS message: REQUEST and RESPONSE.

The Action REQUEST indicates that a PlanItemExecuteRequest was dispatched during this transition, typically occurring when a plan item shifts from Pending to Execution. Notifications with Action RESPONSE signify that this transition occurred based on the response from the Southbound System.

Processing Plan Item Status Change Notification with Action REQUEST

  1. Complete the Start milestone

      1. The PlanItemExecuteRequest is dispatched during this transition, which indicates the completion of the start milestone for this plan item.

    2. Jeopardy updates the Start milestone of this plan item by modifying the milestone table with the following information:

      • status = COMPLETE

      • actualRelease = eventTimeInMillis

  2. Mark the plan item as under processing

    isUnderProcessing is updated to true in the plan_item_instance table for this plan item.

  3. Move the plan item to Execution

    The plan_item_instance table is updated with the following information:

    • status = "EXECUTION"

    • riskRegion = "NORMAL"

    • actualStartTime = eventTimeInMillis

    • typicalEndTimestamp = eventTimeInMillis + planItemTypicalDuration (Typical Duration of Start to End section available in PC)

    • maximumEndTimestamp = eventTimeInMillis + planItemMaximumDuration (Maximum Duration of Start to End section available in PC)

  4. Start all sections with the start milestone as Plan Item Start Milestone

    The plan_adjacency table, where start_milestone = "START", is updated with the following information:

    • sectionStatus = "Start"

    • actualStartTime = eventTimeInMillis

  5. Unmark the plan item from under processing

    isUnderProcessing is updated to false in the plan_item_instance table for this plan item.

Processing Plan Item Status Change Notification with Action RESPONSE

For non-executing plan items, the transition occurs directly from PENDING to COMPLETE. In this case, the orchestrator dispatches the plan item status change notification with Action as RESPONSE. Therefore, handling such plan items involves some steps similar to those done when the action is REQUEST.

  1. Steps Specific to Non-Executing Plan Items

    1. Update Actual Start Time

      • Non-executing plan items transition directly from PENDING to COMPLETE. Hence, startTime and endTime are the same for such plan items.

      • Jeopardy updates the plan_item_instance table with the following information:

        • actual_start_time = eventTimeInMillis

    2. Complete the Start Milestone

      Jeopardy updates the milestone table, where the milestone id is START, with the following information:

      • actualRelease = eventTimeInMillis

      • status = Complete

    3. Start all sections with the start milestone as Plan Item Start Milestone

      The plan_adjacency table, where start_milestone = "START", is updated with the following information:

      • sectionStatus = "Start"

      • actualStartTime = eventTimeInMillis

  2. Mark the plan item as under processing

    isUnderProcessing is updated to true in the plan_item_instance table for this plan item.

  3. Complete the plan item

    Jeopardy updates the plan_item_instance table with the following information:

    • status = COMPLETE

    • actualEndTime = eventTimeInMillis

  4. Complete the END Milestone

    Jeopardy updates the milestone table, where milestoneid = 'END', with the following information:

    • status = Complete

    • actualRelease = eventTimeInMillis

  5. Update all sections for this plan item where endMilestone = 'END'

    1. Compute the risk region

      1. Compute the time taken for this section to complete:

        timeTaken = eventTimeInMillis - sectionStartTime - sectionSuspensionTime

      2. If timeTaken > section's Maximum Duration, riskRegion = CRITICAL

      3. Otherwise, if timeTaken > section's Typical Duration, riskRegion = HAZARD

      4. Otherwise, riskRegion = NORMAL

    2. Update the section in the plan_adjacency table with the following information:

      • actualEndTime = eventTimeInMillis

      • sectionStatus = COMPLETE

      • riskRegion = Computed Risk Region

    3. Remove the section from the Time window table as this section is completed and no longer requires monitoring.

  6. Unmark the plan item from under processing

    isUnderProcessing is updated to false in the plan_item_instance table for this plan item.

  7. Dispatch rebuild plan path request

    The request to rebuild the plan path is dispatched to the planPathRequestNotificationDeliveryQueue.

    For more information, see the PlanPathRequestEventListener section.
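
The section risk region computation in step 5 can be sketched as follows. The class and enum names are hypothetical; the comparisons follow the order given above, with the Maximum Duration checked first.

```java
// Hypothetical sketch of computing the risk region of a completed section.
class SectionRiskRegion {
    enum RiskRegion { NORMAL, HAZARD, CRITICAL }

    static RiskRegion of(long eventTimeInMillis, long sectionStartTime,
                         long sectionSuspensionTime,
                         long typicalDuration, long maximumDuration) {
        // Time spent executing, excluding any time spent suspended.
        long timeTaken = eventTimeInMillis - sectionStartTime - sectionSuspensionTime;
        if (timeTaken > maximumDuration) {
            return RiskRegion.CRITICAL;
        }
        if (timeTaken > typicalDuration) {
            return RiskRegion.HAZARD;
        }
        return RiskRegion.NORMAL;
    }
}
```

The same computation applies to sections completed by a milestone status change notification.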

Milestone Status Change Notification

The orchestrator dispatches status notifications only for intermediate milestones. Jeopardy takes note of this status change and performs the following steps:

  1. Complete the Milestone

    Jeopardy updates the milestone table with the following information:

    • status = Complete

    • actualRelease = eventTimeInMillis

  2. Mark the Plan item as under processing

    isUnderProcessing is updated to true in the plan_item_instance table for this plan item.

  3. Process Sections where startMilestoneId = given milestone

    Jeopardy updates such sections in plan_adjacency with the following information:

    • sectionStatus = START

    • actualStartTime = eventTimeInMillis

  4. Process Sections where endMilestoneId = given milestone

    For every such section,

    1. Compute the risk region

      • Compute the time taken for this section to complete

      • timeTaken = eventTimeInMillis - sectionStartTime - sectionSuspensionTime

      • If timeTaken > section's Maximum Duration, riskRegion = CRITICAL

      • Otherwise, if timeTaken > section's Typical Duration, riskRegion = HAZARD

      • Otherwise, riskRegion = NORMAL

    2. Update the section in the plan_adjacency table with the following information:

      • actualEndTime = eventTimeInMillis

      • sectionStatus = COMPLETE

      • riskRegion = Computed Risk Region

    3. Remove the section from the Time window table as this section is completed and no longer requires monitoring.

  5. Unmark the plan item from under processing

    isUnderProcessing is updated to false in the plan_item_instance table for this plan item.

  6. Dispatch Rebuild Plan Path Request

    The request to rebuild the plan path is dispatched to the planPathRequestNotificationDeliveryQueue.

    For more information, see the PlanPathRequestEventListener section.

Plan Path Computation

Jeopardy employs a depth-first approach to generate all plan paths and updates earlyStart and earlyFinish for each section. In this methodology, the Virtual Start Node is treated as the root node.

Process

  • Starting from the Virtual Start Node, compute the EarlyStartMap and EarlyFinishMap for the given node.

  • If the node is not a virtual node,

    • Populate the time_window table for MUST_START detection with a typical earlyStartTime as the expectedTime.

    • Populate the time_window table for TYPICAL_DURATION detection with typical earlyFinishTime and MAX_DURATION detection with max earlyFinishTime as expectedTime.

  • If the section has a dependency,

    • Repeat the entire process for each dependent section.

    • If a section has multiple dependencies, the path branches, creating a separate path of execution for each branch.

  • If the section does not have any dependency,

    • Consider the path as ended.

    • Add this path to the list of generated paths.
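
The depth-first traversal described above can be sketched as follows. The class PlanPathBuilder and the adjacency map of dependents are hypothetical, the plan is assumed to be acyclic, and the early start, early finish, and time_window updates are omitted to keep the sketch focused on path enumeration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of generating all plan paths depth-first.
class PlanPathBuilder {
    // Enumerates every path from 'root' to a node without dependents.
    static List<List<String>> allPaths(String root, Map<String, List<String>> dependents) {
        List<List<String>> paths = new ArrayList<>();
        walk(root, dependents, new ArrayList<>(), paths);
        return paths;
    }

    private static void walk(String node, Map<String, List<String>> dependents,
                             List<String> current, List<List<String>> paths) {
        current.add(node);
        List<String> next = dependents.getOrDefault(node, List.of());
        if (next.isEmpty()) {
            // The path has ended: record a copy of it.
            paths.add(new ArrayList<>(current));
        } else {
            // Each dependent node branches into its own path of execution.
            for (String child : next) {
                walk(child, dependents, current, paths);
            }
        }
        current.remove(current.size() - 1); // backtrack before returning
    }
}
```

For a plan where the root has two dependents that both lead to the same end node, the sketch produces two paths, one through each dependent.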

PlanPathRequestEventListener

This component serves as a dedicated listener for handling plan path requests throughout various stages of the plan's lifecycle. These requests might be initiated as either an initial plan path request or a rebuild plan path request.

Process

Initial Plan Path Request

  • Populate Plan Adjacency

    The plan adjacency, a representation of plan item sections within the plan, is computed and stored in the plan_adjacency table.

  • Prepare and Populate Plan Paths

    • On obtaining section information, the system initiates the preparation of all possible plan paths.

    • Computed plan paths are subsequently saved in the plan_path table.

  • Determine Critical Plan Path

    The critical plan path, denoting the longest sequence through the plan, is computed and stored in the plan_critical_path table.

  • Determine Plan Expected End Time

    The predicted end time in the critical path is considered as the plan's expected end time.

  • Determine if the Plan is ShortLived

    • Jeopardy identifies short-lived plans, where the difference between predicted end time and plan start time is less than the specified threshold (shortLivedThresholdInMinutes).

    • Short-Lived plans are excluded from monitoring.

  • Populate the Time Window table for All Plan Sections

    Sections not yet completed are stored in the time_window table, enabling Jeopardy to commence monitoring during the next Jeopardy Detection Cycle.
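
The critical path selection and the short-lived check can be sketched as follows. The class CriticalPathSelector and the map of path identifiers to predicted end times are hypothetical. The sketch assumes that the critical path is the one with the latest predicted end time and that timestamps are in epoch milliseconds.

```java
import java.util.Map;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of picking the critical path and flagging short-lived plans.
class CriticalPathSelector {
    // Returns the identifier of the path with the latest predicted end time.
    // Throws NoSuchElementException if no paths are given.
    static String criticalPath(Map<String, Long> predictedEndTimeByPath) {
        return predictedEndTimeByPath.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow();
    }

    // A plan is short-lived when its predicted run time is below the threshold.
    static boolean isShortLived(long planStartTime, long predictedEndTime,
                                long shortLivedThresholdInMinutes) {
        long thresholdInMillis = TimeUnit.MINUTES.toMillis(shortLivedThresholdInMinutes);
        return predictedEndTime - planStartTime < thresholdInMillis;
    }
}
```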

Amendment Plan Path Request

  • Populate Plan Adjacency

    • The plan adjacency, a representation of plan item sections within the plan, is computed and stored in the plan_adjacency table.

    • In the case of an amendment, the plan might contain sections that were previously present and some newly introduced sections.

    • Jeopardy would delete sections no longer present in the plan and add new sections while keeping existing ones intact.

  • Prepare and Populate Plan Paths

    • On obtaining section information, the system initiates the preparation of all possible plan paths.

    • Computed plan paths are subsequently saved in the plan_path table.

  • Determine Critical Plan Path

    The critical plan path, denoting the longest sequence through the plan, is computed and stored in the plan_critical_path table.

  • Determine Plan Expected End Time

    The predicted end time in the critical path is considered as the plan's expected end time.

  • Determine if the Plan is ShortLived

    • Jeopardy identifies short-lived plans, where the difference between predicted end time and plan start time is less than the specified threshold (shortLivedThresholdInMinutes).

    • Short-Lived plans are excluded from monitoring.

  • Populate the Time Window table for All Plan Sections

    Sections not yet completed are stored in the time_window table, enabling Jeopardy to commence monitoring during the next Jeopardy Detection Cycle.

Rebuild Plan Path Request

  • Stop Plan Monitoring during Plan Path Rebuild

    Delete all entries from the time_window table corresponding to the given planId and tenantId.

  • Check if any plan items are still being processed

    • Given the possibility of multiple plan items being processed simultaneously by Jeopardy, the system updates the isUnderProcessing flag to true before processing each plan item. This flag is then set to false once Jeopardy completes the processing of that specific plan item.

    • If any plan item for the plan is still being processed, skip the plan path rebuild request.

  • Prepare and Populate Plan Paths

    • By now some of the sections are updated with their actualStartTime and actualEndTime.

    • Jeopardy uses this information and prepares plan paths again.

    • The system then saves the updated plan paths in the Plan_Path table.

  • Determine Critical Plan Path

    The critical plan path, denoting the longest sequence through the plan, is computed and stored in the plan_critical_path table.

  • Determine Plan Expected End Time

    The predicted end time in the critical path is considered as the plan's expected end time.

  • Populate the Time Window table for All Plan Sections

    Sections not yet completed are stored in the time_window table, enabling Jeopardy to commence monitoring during the next Jeopardy Detection Cycle.

Pending Jeopardy Events

The orchestrator dispatches status change notifications for all orders. In multi-instance scenarios, certain order status change notifications might be picked up by one instance while another instance is still processing plan development notifications. To handle this, Jeopardy introduces the concept of pending jeopardy events.

Plan Availability

Jeopardy processes incoming notifications only if the plan is available for processing. If the plan is not available for processing, the incoming event is saved in the pending_jeopardy_events table. These events are processed after a plan path is created for the respective plan.

Conditions of Plan Availability

  • Is the plan available in the plan_instance table? If not, the plan is not available.

  • If the plan is available, is the plan under amendment? If yes, the plan is not available.

  • If the plan is available and not under amendment, do plan paths exist for the given plan? If not, the plan is not available.

  • If the plan is available, not under amendment, and the plan path request is processed, then the plan is available.

Processing of Pending Jeopardy events

After a plan path request is processed for a newly created plan or for a plan that was amended, Jeopardy dispatches a notification to the jeopardy.pending.events.notification queue. Jeopardy uses this queue to process all pending jeopardy events for a given plan asynchronously. After a pending jeopardy event is processed, it is deleted from the pending_jeopardy_events table.
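
The availability check and the parking of events can be sketched as follows. The class PendingEventGate is hypothetical, and its in-memory lists stand in for the pending_jeopardy_events table and for the normal processing path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of gating events on plan availability.
class PendingEventGate {
    final List<String> pendingEvents = new ArrayList<>(); // stand-in for pending_jeopardy_events
    final List<String> processed = new ArrayList<>();     // stand-in for normal processing

    // A plan is available only if it exists, is not under amendment,
    // and its plan paths have been created.
    static boolean isAvailable(boolean planExists, boolean underAmendment, boolean planPathsExist) {
        return planExists && !underAmendment && planPathsExist;
    }

    // Process the event immediately if the plan is available, else park it.
    void onEvent(String event, boolean planExists, boolean underAmendment, boolean planPathsExist) {
        if (isAvailable(planExists, underAmendment, planPathsExist)) {
            processed.add(event);
        } else {
            pendingEvents.add(event);
        }
    }

    // Called after the plan path request is processed: handle each parked
    // event in arrival order and remove it from the pending store.
    void drainPending() {
        processed.addAll(pendingEvents);
        pendingEvents.clear();
    }
}
```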