COMPANY |
Crisis Event Management and Operations Playbook |
About this documentation:
This document outlines all system recovery steps to be taken in the event of any crisis event.
This document was last reviewed and updated on: May 20, 2019, by: Drae J. Namaste-Rose
Table of Contents
Crisis Event Definition and Operational Response Procedure
Generalized Recovery Scenarios
Cannot Send MESSAGES 1 To SITE 2(s)
Cannot Send MESSAGES 2 To SITE 2(s)
Cannot Process MESSAGES 1 From SITE 2(s)
Cannot Process MESSAGES 2 From SITE 2(s)
Experiencing Issues in PROCESSING
Network Issues Compromise Dual Data Center Connectivity
Needs to Claim Self Help; Another SITE’s Problems Compromising Their Own Published MESSAGING
Server Specific Recovery Scenarios
APPGROUP08 or APPGROUP09 Server
APPGROUP11 Server (APP31 – SITE 2 to Multicast)
APPGROUP13 Server (APP36 – to SITE 2 MESSAGES 1 and MESSAGES 2, MESSAGING, MESSAGING and TRF)
Application Specific Recoveries
APP05 (MESSAGING Reader Application)
APP11 (SITE 2 MESSAGES 1 MESSAGING Processor)
APP13 (SITE 2 MESSAGING Processor)
APP19 (Non-Binary / XML Loaders)
APP19 (Binary – SUBGROUP01, SUBGROUP02, SUBGROUP03, SUBGROUP04, SUBGROUP05)
APP26 (MESSAGING APP20 PROCESSOR)
APP30 (APP02/SUBGROUP02/SUBGROUP03)
APP31 (APP11/SUBGROUP02/APP13/SUBGROUP04)
APP36 (SUBGROUP01, SUBGROUP02, APP06)
APP38 (MESSAGING Service Bridge Process)
APP41 Data Stores (MESSAGING and Outbound MESSAGING)
A Crisis Event is defined as any event outside of normal operational procedures that would require management assistance to resolve.
For every Crisis Event:
1) Event Management (Specific Staff List) must be informed via text and email, and/or phone as necessary.
2) Event Management must determine whether Executive (Specific Staff List) should be advised, and whether executive contacts are necessary.
3) Control Room/Help Desk staff will communicate the event (via THIRD PARTY) to Event Management Teams and initiate the EXECUTIVE conference call.
4) Event Managers will begin addressing the event, deferring to EXECUTIVE (Specific Staff List) for approval of proposed solutions.
5) Control Room/Help Desk staff will provide status updates every 15 minutes (via THIRD PARTY) based on EXECUTIVE conference call communications.
For all Crisis Management situations, the following roles must be accounted for:
Role: | Responsibility: |
Control Room Help Desk | Initial Response to Event; Primarily responsible for THIRD PARTY text and email communications, and initiating the EXECUTIVE conference call. During event, continues trader support and reports issues seen to Incident/Communication Managers. |
Control Room Operations | Initial Response to Event; Secondarily responsible for THIRD PARTY text and email communications, and initiating the EXECUTIVE conference call. During event, continues monitoring and reports issues seen to Incident/Communication Managers. |
EXECUTIVE Conference Call Manager (Specific Staff List) | Manages communications between conference call and control room, as well as SMS/email updates. |
Incident Manager (Specific Staff List) | Assigns and manages all event research, corrective action, and communication staff, as well as their activity. |
Communication Manager (Specific Staff List) | Acts as liaison between staff outside of the control room and in control room as necessary. |
Compliance Regulatory Manager (Specific Staff List) | Confirms Compliance and Regulatory issues are accounted for, in coordination with EXECUTIVE conference call. |
The following checks must be done after any major interruption in Trading, and before any resumption in Trading can be considered:
Situation |
Impact |
Response |
cannot send MESSAGES 1 to the SITE 2 |
During Trading Hours: cannot fulfill quoting obligations to National Market System. If problems involve SUBGROUP01 not able to connect to the SITE 2, will also:
- not be fulfilling trade reporting obligations,
- not be fulfilling MESSAGING obligations, and
- not be updating MESSAGING. |
1) Confirm scope of impact. Are problems specific to:
   - SITE 2 and/or NASDAQ
   - DC1 and/or DC2
   - Servers? Processes? Channels?
2) Notify management.
3) If problems are not immediately resolved, zero affected MESSAGES 1 if not already marked “manual” by APP10.
   - Use NTM Control SUBGROUP01 by APP32 commands, or
   - Ask SITE 2 to zero MESSAGES 1 as they are able.
4) Consider suspending trading (if issues not resolved in 5 minutes).
   - Use NTM Control APP32 commands.
5) Consider SUBGROUP01 APP06 Bypass.
   - Use NTM Control SUBGROUP01 commands.
6) At time of resolution, confirm the most recent affected MESSAGES 1 are sent to the SITE 2.
   - Either SUBGROUP01 will auto-regenerate MESSAGES 1, or
   - Use NTM Control APP32 commands, if necessary.
7) Confirm any residual SUBGROUP02, APP06 and MESSAGING concerns are addressed. |
Scope of Impact Response Considerations:
1) If considering zeroing of MESSAGES 1, consider doing so only for the affected APP32 instances.
2) If considering suspending trading, consider doing so only for the affected APP32 instances.
3) Only consider SUBGROUP01 APP06 Bypass if SITE 2 connectivity is lost, and trade reporting and/or MESSAGING becomes a concern during outage.
4) If SUBGROUP01-to-SITE 2 connectivity was lost as part of the issue, upon reconnection to the SITE 2, the SUBGROUP01 process will request any queued MESSAGES 1 from the APP41 data store and send these to the APP19 only (not the SITE 2), and then request the most recent stock MESSAGES 1 from all connected APP32 instances and send these to the SITE 2. All MESSAGES 1, MESSAGES 2, MESSAGING and MESSAGING should be up to date at that time.
Situation |
Impact |
Response |
cannot send MESSAGES 2 to the SITE 2 |
During Trading Hours: cannot fulfill last sale reporting obligations to National Market System. If problems involve SUBGROUP02 not able to connect to the SITE 2, will also:
- not be fulfilling MESSAGING obligations, and
- not be updating MESSAGING. |
1) Confirm scope of impact. Are problems specific to:
   - SITE 2 and/or NASDAQ
   - DC1 and/or DC2
   - Servers? Processes? Channels?
2) Notify management.
3) Consider suspending trading (if issues not resolved in 15 minutes).
   - Use NTM Control APP32 commands.
4) Consider SUBGROUP02 APP06 Bypass.
   - Use NTM Control SUBGROUP02 commands.
5) At time of resolution, confirm all MESSAGES 2 not reported during outage are resent to SITE 2 as “sold” MESSAGES 2.
   - Either SUBGROUP02 will auto-generate “sold” MESSAGES 2, or
   - Use APP12 to manually resend “sold” MESSAGES 2.
6) Confirm any residual APP06 and MESSAGING concerns are addressed. |
Scope of Impact Response Considerations:
1) If considering suspending trading, consider doing so only for the affected APP32 instances.
2) Only consider SUBGROUP02 APP06 Bypass if SITE 2 connectivity is lost, and MESSAGING becomes a concern during outage.
3) If SUBGROUP02-to-SITE 2 connectivity was lost as part of the issue, upon reconnection to the SITE 2, the SUBGROUP02 process will request any queued MESSAGES 2 from the APP41 data store and send these to the SITE 2 as sold. APP12 trade reporting queries can help uncover any MESSAGES 2 not reported. APP12 may also allow any MESSAGES 2 to be manually resent as “sold”.
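The reconciliation described above, finding MESSAGES 2 that were executed but never reported, amounts to an ordered set difference. The sketch below is illustrative only: the function name and the identifier lists are hypothetical, and it does not represent the actual APP12 trade reporting queries.

```python
def unreported_messages(executed_ids, reported_ids):
    """Return executed identifiers that have no matching report, in original order.

    Both arguments are hypothetical lists of identifiers gathered for the outage window.
    """
    reported = set(reported_ids)
    return [msg_id for msg_id in executed_ids if msg_id not in reported]
```

Anything returned would be a candidate for resending as “sold”, subject to the checks in the Response steps.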
Situation |
Impact |
Response |
cannot process MESSAGES 1 from the SITE 2 |
During Trading Hours:
At a PROCESSING level, cannot adequately validate against locked markets, trade-throughs, or suspending trading. May be trading against stale BBO information.
At a MESSAGING database loading level, post trading integrity checking may be compromised. |
1) Confirm scope of impact. Are problems specific to:
   - SITE 2 and/or NASDAQ
   - DC1 and/or DC2
   - Servers? Processes? Channels?
   - PROCESSING processing?
   - MESSAGING Database Loading?
2) Notify management.
3) Consider suspending trading (if issues not resolved in 5 minutes).
   - Use NTM Control APP32 commands.
4) At time of resolution, there will be no retransmission of lost MESSAGING to PROCESSINGs; confirm MESSAGING is known by MEs.
   - Use MESSAGING MESSAGING queries.
5) After time of resolution, consider whether or not a reload of any missing data can be, and should be, copied or replayed into the database for post trading integrity processing. |
Scope of Impact Response Considerations:
1) Inbound MESSAGING processing is configured to support dual redundancy, reading from the SITE 2s. If redundancy is not lost and MESSAGING is being delivered via at least one path, then there should be no operational impact, other than a potential loss of MESSAGING in the MESSAGING database loading processing. (Lost data in MESSAGING database loading may not be fully realized until integrity reports are run.)
2) If complete MESSAGING delivery is lost, then if considering suspending trading, consider doing so only for the affected APP32 instances.
3) If MESSAGING was received by MESSAGING Processors, but not processed correctly by MESSAGING APP19, then it is possible for this data to be replayed into the database; however, this is a complicated and time-restrictive procedure.
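The redundancy considerations above reduce to a small decision over the two inbound paths. The sketch below is one reading of them, not an operational tool: the function name, the path flags and the status strings are all hypothetical.

```python
NO_IMPACT = "no operational impact"
REDUNDANCY_LOST = "redundancy lost: no operational impact expected, verify database loading"
DELIVERY_LOST = "delivery lost: consider suspending trading for affected instances only"


def inbound_delivery_status(path_a_delivering: bool, path_b_delivering: bool) -> str:
    """Classify inbound delivery from the state of the two redundant paths (hypothetical flags)."""
    if path_a_delivering and path_b_delivering:
        return NO_IMPACT
    if path_a_delivering or path_b_delivering:
        # One path still delivers: trading is unaffected, but loader gaps may
        # only show up once integrity reports are run.
        return REDUNDANCY_LOST
    return DELIVERY_LOST
```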
Situation |
Impact |
Response |
cannot process MESSAGES 2 from the SITE 2 |
During Trading Hours:
At a PROCESSING level, may not be able to process orders for IPO issues or Market IOC orders (if first sale not yet received).
At a MESSAGING database loading level, post trading integrity checking may be compromised. |
1) Confirm scope of impact. Are problems specific to:
   - SITE 2 and/or NASDAQ
   - DC1 and/or DC2
   - Servers? Processes? Channels?
   - PROCESSING processing?
   - MESSAGING Database Loading?
2) Notify management.
3) Manually open IPO issues as dictated by other Financial MESSAGING Vendor information.
   - Use NTM Control APP32 commands.
4) Consider enabling Market IOC processing in affected issues.
   - Use NTM Control APP32 commands.
5) At time of resolution, there will be no retransmission of lost MESSAGING to PROCESSINGs; confirm MESSAGING is known by MEs.
6) After time of resolution, consider whether or not a reload of any missing data can be, and should be, copied or replayed into the database for post trading integrity processing. |
Scope of Impact Response Considerations:
1) MESSAGING Inbound processing is configured to support dual redundancy, reading from the SITE 2s. If redundancy is not lost and MESSAGING is being delivered via at least one path, then there should be no operational impact, other than a potential loss of MESSAGING in the MESSAGING database loading processing. (Lost data in MESSAGING database loading may not be fully realized until integrity reports are run.)
2) If complete MESSAGING delivery is lost, then if considering enabling Market IOC processing, consider doing so only for the affected APP32 instances.
3) If MESSAGING was received by MESSAGING Processors, but not processed correctly by MESSAGING APP19, then it is possible for this data to be replayed into the database; however, this is a complicated and time-restrictive procedure.
Situation |
Impact |
Response |
Either is not:
1) Accepting some or all inbound orders,
2) Executing some or all inbound orders,
3) Canceling some or all inbound orders,
4) Or sending Execution Reports to firms |
During Trading Hours: cannot fulfill order processing and/or reporting obligations to National Market System. |
1) Confirm scope of impact. Are problems specific to:
   - SITE 2 and/or NASDAQ
   - DC1 and/or DC2
   - Servers? Processes? Channels?
2) Notify management.
3) If cause of problem is not APP32 related, then work with affected firms to
4) Consider marking MESSAGES 1 as manual.
   - Use NTM Control SUBGROUP01 by APP32 commands.
5) Consider suspending trading (if issues not resolved in 5 minutes).
   - Use NTM Control APP32 commands.
6) At time of resolution, confirm that all messages sent to the Exchange or sent back to the affected firms were sent and processed as expected. |
Scope of Impact Response Considerations:
1) If problems are isolated to firm connectivity issues only and not APP32 processing issues, then only consider suspending trading if an entire data center’s processing is affected.
2) If problems do involve APP32 processing issues, and if considering marking MESSAGES 1 as manual or suspending trading, consider doing so only for the affected APP32 instances.
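The scenarios above quote different time limits before a trading suspension should be considered (5 minutes in most cases, 15 minutes when MESSAGES 2 cannot be sent). The sketch below collects those limits in one lookup. The scenario labels and function name are hypothetical, and the result is only a prompt to consider suspension, since the decision itself rests with management.

```python
from datetime import timedelta

# Limits quoted in the Response steps above; the dictionary keys are hypothetical labels.
SUSPENSION_REVIEW_LIMIT = {
    "cannot_send_messages_1": timedelta(minutes=5),
    "cannot_send_messages_2": timedelta(minutes=15),
    "cannot_process_messages_1": timedelta(minutes=5),
    "processing_issues": timedelta(minutes=5),
}


def should_consider_suspension(scenario: str, unresolved_for: timedelta) -> bool:
    """Return True once an issue has stayed unresolved for the scenario's quoted limit."""
    return unresolved_for >= SUSPENSION_REVIEW_LIMIT[scenario]
```

For example, `should_consider_suspension("cannot_send_messages_2", timedelta(minutes=10))` is still False, because that scenario allows 15 minutes.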
Situation |
Impact |
Response |
trading applications lose their ability to communicate across data centers. |
During Trading Hours: Depending on the scope of the problem:
- Monitoring of applications may be lost or compromised.
- May not be able to fulfill order processing and/or reporting obligations to National Market System.
- Database loading in affected data center may no longer be up to date, and clerical support of order research and/or trade corrections in the affected data center would be impossible.
- Post Trade Processing using the affected data center may be invalid.
- All clerical and administrative operations would need to rely on the working data center’s database, including all Post Trade Processing. |
1) Confirm scope of impact. Are problems specific to:
   - SITE 2 and/or NASDAQ
   - DC1 and/or DC2
   - System Monitoring?
   - PROCESSINGs?
   - Inbound MESSAGING Processing?
   - Outbound MESSAGING Processing?
   - Firm Connectivity?
2) Notify management.
3) For each impact identified, see the appropriate Generalized Recovery Scenario in this documentation for appropriate responses, with a focus on whether or not should be zeroing MESSAGES 1 and/or suspending trading. System monitoring is a priority.
4) Consider moving processing from the affected data center to the healthy data center. See Data Center Move Procedures for more details.
5) At time of resolution, see the appropriate Generalized Recovery Scenario in this documentation for appropriate integrity checks, and post trading impact. |
Scope of Impact Response Considerations:
1) The network design is such that auto-failovers to redundant services should help the auto-recovery of any situation within 5 minutes of the event.
If the problems are such that we don’t believe auto-failovers are working as expected, or that the problems may reoccur to the extent that our trading integrity is compromised, we should consider moving processing from the most adversely affected data center to the most healthy data center.
See each application’s specific recovery procedures in this document for appropriate responses and considerations.
Move DC1 to DC2 Data Center |
1) Suspend Trading, Notify Industry
2) Move Opcon (APP35, SUBGROUP04) for monitoring
   See Application Specific Recoveries
3) Move Instrument/System Activity loading (APP27, DLAC) for ME
   See Application Specific Recoveries
4) Move PROCESSINGs (ME, APP40, DLME) for trading
   See ME_Recovery_Considerations
5) Move MESSAGING (SUBGROUP01, SUBGROUP02, APP06) for SITE 2 and BFD reporting
   See APP36_Combined_SUBGROUP01_SUBGROUP02_APP06_Move_Procedure
6) Move MESSAGING (APP09, APP23) for MESSAGING
   See Application Specific Recoveries
7) Move Firm Comm (APP37, APP38, APP22, APP24)
   See Application Specific Recoveries
8) Move MESSAGING and Administrative (APP07, APP12, MNT, APP33)
   See Application Specific Recoveries
9) Move APP10 (APP10, APP20, APP26)
   See APP10_Recovery_Considerations
10) Move MESSAGING Retrans and RISK (APP04, APP28, APP08)
   See Application Specific Recoveries
11) Move APP19 (DLBF, DLBP, DLCM, DLCS, DLKS, DLMP, DLOM, DLQR, DLRT, DLTR)
   See DBL_Recovery_Considerations
12) Perform System Integrity Checks
   See System Integrity Checklist
13) Resume Trading, Notify Industry
14) Notify PTT for post trading impact and resolutions.
NOTE: If DC1 moves to DC2:
- MESSAGING will lose access to APP01, APP03 and APP39 processes and related functionality. APP26 access will be limited to those firms connecting in DC2 data center.
- Order sending firms without redundant connectivity in DC2 will not have access to PROCESSINGs. They will also lose APP21 drop copies. |
Move DC2 to DC1 Data Center |
1) Suspend Trading, Notify Industry
2) Move PROCESSINGs (ME, APP40, DLME) for trading
   See ME_Recovery_Considerations
3) Move MESSAGING (SUBGROUP01, SUBGROUP02, APP06) for SITE 2 reporting
   See APP36_Combined_SUBGROUP01_SUBGROUP02_APP06_Move_Procedure
4) Move Firm Comm (APP37, APP22)
   See Application Specific Recoveries
5) Move APP10 (APP10, APP20, APP26)
   See APP10_Recovery_Considerations
6) Move APP19 (SUBGROUP02, DLBF, DLCM, DLMP, DLOM, DLQR, DLTR, APP25)
   See DBL_Recovery_Considerations and Binary_DBL_Recovery_Considerations
7) Perform System Integrity Checks
   See System Integrity Checklist
8) Resume Trading, Notify Industry
NOTE: If DC2 moves to DC1:
- Order sending firms without redundant connectivity in DC2 will not have access to PROCESSINGs. They will also lose APP21 drop copies. |
The following applications do not move between Data Centers; either the data is replicated in the other data center or external sources are responsible for providing redundancy:
- MESSAGING PROCESSORS / INBOUND MESSAGING READERS (APP11, APP13, SUBGROUP02, SUBGROUP04)
o Data is redundantly read in both data centers.
- MESSAGING LOADERS / INBOUND MESSAGING LOADERS (APP02, SUBGROUP02, QMTL, SUBGROUP01, SUBGROUP03, SUBGROUP05)
o Data is redundantly loaded in both data centers.
o PTT will need to account for using data out of one data center or the other.
- MESSAGINGS (APP01, APP03, APP21, APP20, APP26, APP39)
o Firms are responsible for providing connectivity in both data centers.
o APP01, APP03, APP21 and APP39 are only running in DC1.
- APPGROUP02 (APP14, APP15, APP16, APP17, APP18)
o APPGROUP02 Functionality is provided redundantly in both data centers.
o IBs will need to use client links facilitating use out of one data center or the other.
- MESSAGING READER (APP05)
o MESSAGING data is multicast redundantly in both data centers.
o Operations are the only users of MESSAGING Readers.
- APP41 DATA STORES (FIRM COMM AND APP36)
o APP41 data stores are separated between data centers.
o APP41 data senders and receivers are limited to data center specific store storage and retrieval.
- NESPR NAME SERVICE (APP34)
o NESPR facilities exist in both data centers redundantly.
o Applications only require one instance of NESPR.
Situation |
Impact |
Response |
Database Access is lost in either DC1 or DC2, but network connectivity between the two data centers is intact. Production FileShare access is also working as expected between data centers. |
During Trading Hours: Trading could continue, but database loading in the affected data center would no longer be up to date, and clerical support of order research and/or trade corrections in the affected data center would be impossible. If processes need to start in the affected data center, they would not be able to. Post Trade Processing using the affected data center would likely be invalid. All clerical and administrative operations would need to rely on the working data center’s database, including all Post Trade Processing.
Non-Java application startups would need to rely on the TNS_NAMES.ORA file pointing to the working data center’s database. They could start in their primary data center during this recovery if desired.
Java application startups would need to rely on the TNS_NAMES.ORA file pointing to the working data center’s database, as well as database specific configurations located on local servers in each data center. They may need to start in the alternate data center depending on which database is affected. |
1) Notify management.
2) If decision is made to rely on the working data center’s database exclusively for the rest of the day, copy the working data center’s TNS_NAMES.ORA file over the affected data center’s TNS_NAMES.ORA file in chxappcfg folders.
3) Stop any affected APP19 trying to write to affected database. These should remain down.
4) Confirm all database loading to working database is up to date.
5) Start/Restart any required Java application in the same data center as the working database. These applications include: APP07, APP12, MNT, APP33 and APP37.
6) Start/Restart any required non-Java application as necessary in its primary data center.
7) Consider suspending trading only if MEs need to restart.
   - Use NTM Control APP32 commands before stopping/restarting MEs.
   - If MEs crash, rely on restart to halt trading automatically.
8) Note that OSF Simulators have hard-coded database references in their XML configurations. If these will be used for testing, they may need changes.
9) Production Support should also be cognizant of which databases are being used in ER queries.
10) Change post-trading processing to work from single healthy data center. |
Scope of Impact Response Considerations:
1) This scenario documents a very specific problem with a well defined scope of impact and response.
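Step 2 of the response above copies the working data center’s TNS_NAMES.ORA over the affected data center’s copy. The sketch below shows one cautious way to script that. The two folder arguments are hypothetical placeholders for the chxappcfg folders, and keeping a `.bak` copy of the overwritten file is an added assumption rather than part of the documented procedure.

```python
import shutil
from pathlib import Path


def overwrite_tns_names(working_cfg_dir: Path, affected_cfg_dir: Path) -> Path:
    """Copy the working TNS_NAMES.ORA over the affected one and return the target path."""
    source = working_cfg_dir / "TNS_NAMES.ORA"
    target = affected_cfg_dir / "TNS_NAMES.ORA"
    if target.exists():
        # Assumption: keep the overwritten file so it can be restored after recovery.
        shutil.copy2(target, affected_cfg_dir / "TNS_NAMES.ORA.bak")
    shutil.copy2(source, target)
    return target
```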
Situation |
Impact |
Response |
Database Access is lost in either DC1 or DC2, but network connectivity between the two data centers is intact. Production FileShare access in affected data center is also lost. |
During Trading Hours: The exact same impact as when a single data center’s database access is lost with a working production fileshare, with the exception that the following local environment variables will need to be modified in order to support application restarts:
- CHX_APP_CONFIG
- IPC_CONFIG
- IPC_CONFIG_LOC
- TNS_ADMIN
All of these will need to change from the generic \\chx.com\prod\chxappcfg value to either a DC1 or DC2 specific production fileshare value. |
1) Redirect the affected data center’s servers to the working data center’s production fileshare. These are the steps required to do this:
   - If Altiris is available, Tech Services has jobs to redefine local environment variables as necessary.
   - If Altiris is not available, there are registry key files in \\keymaster that must be manually imported on every server. To import registry key files on a server:
     A) Log in to the server.
     B) Via Windows Explorer, find the registry key file in \\keymaster\it_operations\controlroom\Test Reg Keys
     C) Double click the registry key file and follow prompts to import the keys.
2) Once local environment variables have been changed, Tech Services will need to reboot all affected servers.
   - If Altiris is available, Tech Services can use it.
   - If Altiris is not available, each server must be logged onto and restarted manually, individually.
3) Follow the exact same procedure used when a single data center’s database access is lost with a working production fileshare. |
Scope of Impact Response Considerations:
1) This scenario documents a very specific problem with a well defined scope of impact and response.
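The environment variable change described above is a prefix substitution on four variables. The sketch below only computes the new values so they can be reviewed before any Altiris job or registry import is run; it does not modify a server. The function name is hypothetical, and the data center specific share path passed in is a placeholder, since the real DC1 and DC2 values are not given in this document.

```python
GENERIC_SHARE = r"\\chx.com\prod\chxappcfg"
REDIRECTED_VARIABLES = ("CHX_APP_CONFIG", "IPC_CONFIG", "IPC_CONFIG_LOC", "TNS_ADMIN")


def redirect_to_share(env: dict, dc_share: str) -> dict:
    """Return a copy of env with the generic share prefix replaced by dc_share.

    Only the four documented variables are touched, and only when their value
    starts with the generic share path (compared case-insensitively).
    """
    updated = dict(env)
    for name in REDIRECTED_VARIABLES:
        value = updated.get(name)
        if value is not None and value.lower().startswith(GENERIC_SHARE.lower()):
            updated[name] = dc_share + value[len(GENERIC_SHARE):]
    return updated
```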
Situation |
Impact |
Response |
Either SITE is:
1) Marking their MESSAGES 1 as “manual”,
2) Cannot mark their MESSAGES 1 as “manual”,
3) Or cannot update existing MESSAGES 1. |
During Trading Hours: Unless excludes affected SITE’s MESSAGES 1 from BBO calculations (implementing Self Help Procedures), may not be adequately validating against locked markets, trade-throughs, or suspending trading, and may be trading against stale BBO information. |
1) Notify management.
2) Consider implementing Self Help Procedures.
   - Confirm SITE Issues
   - Use NTM Control Utilities APP31 APP11/SUBGROUP02 Control APP11 Markets commands.
   - Send Self Help Email Notices |
Scope of Impact Response Considerations:
1) Any problems or their resolution are not in CHX’s control. Respond appropriately to whichever SITE is having issues.
Services Involved |
Impact |
Response |
Services include: MESSAGING Reader Client
Service Types include: APP05
- Service requires FireDaemon setup. |
During Off-Trading and Trading Hours:
- No trading suspension considerations necessary. The functionality provided is not required for Trading.
- IT Operations are the only users.
- IT Operations will not be able to confirm MESSAGING multicast going out to subscribers.
- MESSAGING reader subscribers may still be getting MESSAGING multicast. |
- Notify Tech Services for support.
- Do not notify EXECUTIVE.
- No alternate nodes.
- Live without functionality until server is reactivated and confirmed healthy. |
Services Involved |
Impact |
Response |
Services MAY include any of the following: MESSAGING Retrans, Instrument/System Activity Reader and related database loader.
Service Types MAY include any of the following: APP04, APP25, APP27, SUBGROUP02 or DLAC |
During Off-Trading and Trading Hours:
- No trading suspension considerations are necessary. The functionality provided is not required for Trading.
- MESSAGING Retrans users will not be able to request MESSAGINGs.
- If APP27 and related database loader were on node, trading applications will not be up to date with Instrument/System Activities. These are imperative at the time of PROCESSING restarts. |
- Notify Help Desk to notify MESSAGING Retrans users affected.
- Do not notify EXECUTIVE.
- No configuration changes necessary.
- Database loader files can be concatenated after end of day shutdown. |
Services Involved |
Impact |
Response |
Services include: Apache Tomcat Services |
During Off-Trading and Trading Hours:
- No trading suspension considerations are necessary. The functionality provided is not required for Trading.
- IB users will not be able to use utility for administrative functions specific to MESSAGING MESSAGES 2.
- Operations staff will not be able to generate BBO Reports. |
- Notify Help Desk to notify all APPGROUP02 users affected.
- Do not notify EXECUTIVE.
- Notify Tech Services to identify cause and resolution. |
Services Involved |
Impact |
Response |
Services include: MESSAGING and related APP19
Service Types include: APP07, DLBP |
During Off-Trading and Trading Hours:
- No trading suspension considerations are necessary. IBs should have capability of trading via TRF terminals.
- IBs will not be able to perform trading or administrative functions via MESSAGING terminals.
- Inbound orders to IBs should reject back to firm with “communication problems”.
- PROCESSING responses to IB orders will be seen upon MESSAGING restart.
- Regulatory drop copies from vendors to IBs should queue on their end. |
- Notify IBs affected.
- Notify EXECUTIVE.
- Move all applications to alternate nodes.
- If alternate nodes not available, move to DR nodes.
- All hosts and host specific configurations are pre-defined, outside of JNLP requirements for APP07 client connections. These files must be changed to allow APP07 client connectivity.
Restart order (as applicable by server): APP07, DLBP
- Database loader files can be concatenated after end of day shutdown.
- Confirm all data accounted for around move. |
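Several responses note that database loader files can be concatenated after end of day shutdown. A minimal sketch of that concatenation follows. The function name and file arguments are hypothetical, and it assumes the files are simply appended in the order given (for example, the original node’s file followed by the alternate node’s file), which should be confirmed against the loader’s actual file format before use.

```python
from pathlib import Path


def concatenate_loader_files(parts: list, combined: Path) -> None:
    """Append each file in parts, in the order given, into a new combined file.

    Files are copied as raw bytes so that line endings and encodings are untouched.
    """
    with combined.open("wb") as out:
        for part in parts:
            out.write(Path(part).read_bytes())
```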
Services Involved |
Impact |
Response |
Services MAY include any of the following: APP10, APP20, APP26, Nespr and related APP19.
Service Types MAY include any of the following: APP10, APP20, APP34 and DLCM |
During Off-Trading and Trading Hours:
- No trading suspension considerations are necessary.
- APP10 testing will not be occurring per rules. |
- Notify Management that MESSAGES 1 may be manual.
- Notify EXECUTIVE.
- Move all applications to alternate nodes.
- If alternate nodes not available, move to DR nodes.
- All hosts and host specific configurations are pre-defined, outside of APP10 service configuration files that must change to accommodate when APP10 related APP20 or APP26 processes change hosts.
- Nespr is never failed over to another node. One instance of Nespr runs in each data center and the name services work across the network, so we live without redundancy in these situations.
Restart order (as applicable by server): APP20, APP26, APP10, DLCM
- Database loader files can be concatenated after end of day shutdown.
- Confirm all data accounted for around move. |
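Each server scenario gives a restart order “as applicable by server”, meaning only the services actually hosted on the failed server are restarted, in the documented sequence. A minimal sketch of that filtering, with a hypothetical function name and inputs:

```python
def restart_sequence(documented_order, services_on_server):
    """Return the services hosted on the server, kept in the documented restart order."""
    hosted = set(services_on_server)
    return [service for service in documented_order if service in hosted]
```

For a server hosting only DLCM and APP20, `restart_sequence(["APP20", "APP26", "APP10", "DLCM"], {"DLCM", "APP20"})` gives `["APP20", "DLCM"]`.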
Services Involved |
Impact |
Response |
Services MAY include any of the following: SITE 2, MESSAGING, APP20 (order sending firm), Drop Copy (for order sending firm), APP26 (order sending firm), MESSAGING (for IBs).
Service Types MAY include any of the following: APP01, APP03, APP20, APP21, APP26, APP39 |
During Off-Trading and Trading Hours:
- No trading suspension considerations necessary. OSFs and vendors are responsible for supporting alternative connectivity if primary connectivity options are not available.
- Any messaging supported by the applications involved will not be processed as expected. There is always a potential for lost messages being in transit at the point of failure.
- IB messages routed to APP01s should reject back to MESSAGING.
- IB messages routed to APP03s should queue.
- OSF orders sent from APP20 firms to PROCESSINGs should queue on their side. Open orders already sent will either remain open or be canceled per APP20 configurations.
- Drop copies from PROCESSINGs to APP21 firms should be queued and/or resent via THIRD PARTY data stores.
- OSF orders sent from APP26 firms to PROCESSINGs or IBs should queue on their side. Open orders already sent will remain open.
- Regulatory drop copies from vendors to IBs should queue on their side. |
- Notify Tech Services for server and/or NAT change support.
- Notify affected firms.
- Notify EXECUTIVE.
- Move all applications to DR nodes with cooperation of Technical Services and necessary NAT changes.
- All hosts are pre-defined.
- No configuration changes necessary.
- All PROCESSOR files need to be moved ahead of any service restarts.
Restart order (as applicable by server): APP20, APP26, APP21, APP39, APP01, APP03
- Confirm all data accounted for around move. |
Services Involved |
Impact |
Response |
Services include the following: RISK, COMMUNICATION and related APP19.
Service Types include the following: APP28, APP08 and DLKS |
During Off-Trading and Trading Hours:
- No trading suspension considerations are necessary.
- There are no RISK users, but if there were, they would lose risk management functionality. |
- Notify Management.
- No need to notify EXECUTIVE.
- Move all applications to alternate nodes.
- If alternate nodes not available, move to DR nodes.
- APP08 requires Linux server WAR file deployment.
Restart order (as applicable by server): APP28, GTWY, DLKS
- Database loader files can be concatenated after end of day shutdown.
- Confirm all data accounted for around move. |
Services Involved |
Impact |
Response |
Services MAY include any of the following: APP29 and/or APP33, APP41 Data Store
Service Types MAY include any of the following: MNT, APP33 |
During Off-Trading and Trading Hours:
- No trading suspension considerations are necessary. Neither APP29 nor APP33 functionality is required for Trading. APP41 Data Stores functionality should fail over to alternate APP41 Data Store. |
- Notify s/Compliance that APP33 is affected.
- Notify IBs that APP29 is affected.
- Notify EXECUTIVE that APP41 Data Store involved is no longer redundantly supported until server is put back on line.
- Move APP29 or APP33 applications to alternate nodes.
- If alternate nodes not available, move to DR nodes.
- All hosts and host specific configurations are pre-defined, outside of JNLP requirements for MNT and APP33 client connections. These files must be changed to allow MNT and APP33 client connectivity.
Restart order (as applicable by server): MNT, APP33
- APP41 Data Stores are not to be moved between servers. These services may only be reactivated when the server is placed back in service. |
Services Involved |
ImpaAPP13 |
Response |
Services include: PROCESSING, Transaction Readers, and related APP19 Service Types include: ECHX, APP40 and DLME, DLMP |
During Trading Hours: - will be exposed until PROCESSING restarts if trading was not already halted at the tiAPP32 of the server shutdown. - Trading suspension will occur automatically upon PROCESSING
restarts; However, production stocks do not open for trading until 6am. -All orders open in the PROCESSING will be canceled upon restart regardless of origin. -All orders sent to PROCESSING while PROCESSING is down will be rejected, regardless of source of origin. -All affected APP32 stocks should have MESSAGES 1 zeroed by SUBGROUP01 processes upon APP32 disconnect. During Off-Trading Hours: - No trading suspension considerations unless PROCESSING will not be up before 6am on a Trading Day. |
- Restart PROCESSINGs as soon as possible.
- Notify EXECUTIVE and participants immediately if trading suspension occurred.
- If there is any question as to whether affected MESSAGES 1 have been zeroed, zero MESSAGES 1 via NTM Control Utility, SUBGROUP01 options by ME.
- Move all applications to alternate nodes.
- If alternate nodes are not available, move to DR nodes.
- All hosts are pre-defined.
- No configuration changes are necessary.
- All DLME database loading must be up to date in the data center that the APP32 is going to be restarted in before restarts occur. The only time a database loader file should move between nodes is if the database loading has not been completed and can only be completed by doing so.
NOTE: APP40 and DLME processes for any given APP32 are typically configured to run on a separate node from the APP32 to try to avoid data loss in the case of an APP32 crash.
- Restart order: APP40, ECHX, DLME, DLMP.
- Confirm all data accounted for around move.
- Database loader files (not moved by necessity) can be concatenated after end of day shutdown.
- Confirm systems integrity and defer to EXECUTIVE instructions before resuming trading, if trading was suspended. |
Services Involved |
Impacts |
Response |
Services include: SITE 2 Quote Processor, SITE 2 Trade Processor, NASD Quote Processor, NASD Trade Processor
Service Types include: APP11, APP13, SUBGROUP02, SUBGROUP04 |
During Off-Trading and Trading Hours:
- No trading suspension considerations are necessary unless all MESSAGING is lost between two different servers.
- Inbound MESSAGING will not be processed as expected. Redundancy of inbound MESSAGING is supported between “A” series processors and “B” series processors, so unless both servers go down together, there should be no loss of data, except possibly when redundancy is returned (as a result of sequence gap related processing). |
- Do not notify EXECUTIVE unless trading suspension will be considered.
- Move all applications to alternate nodes.
- If alternate nodes are not available, move to DR nodes.
- All hosts and host-specific configurations are pre-defined.
- No configuration changes are necessary.
- Restart order: APP11, SUBGROUP02, APP13, SUBGROUP04 |
Services Involved |
Impacts |
Response |
Services include: Readers and Loaders for BBO Duration, Quote Montage, Lastsale Montage
Service Types include: BBOD, SUBGROUP02, SUBGROUP03, and SUBGROUP01, SUBGROUP03, SUBGROUP05 |
During Off-Trading and Trading Hours:
- No trading suspension considerations are necessary.
- Quote Montage, Lastsale Montage and BBO Duration will all be compromised. Post Trading Processing will be compromised as a result.
- IT Operations will need to work with Post Trading Technology and Database Technologies to determine whether or not missing data needs to be “replayed” and if so, what method of “replay” should be used. |
- Notify EXECUTIVE and Post Trading Technology that MESSAGING Loading is compromised.
- Move all applications to alternate nodes.
- If alternate nodes are not available, move to DR nodes.
- All hosts and host-specific configurations are pre-defined.
- No configuration changes are necessary.
- Restart order: SUBGROUP03, BBOD, SUBGROUP02, SUBGROUP05, SUBGROUP01, SUBGROUP03
- Confirm all data received was loaded as much as possible.
- Database loader files can be concatenated after end of day shutdown. |
Services Involved |
Impacts |
Response |
Services MAY include any of the following: MESSAGING, SITE 2 Regional Inputs, MESSAGING/APP23 and APP24 and all related APP19.
Service Types MAY include any of the following: APP06, SUBGROUP01, SUBGROUP02, APP09, APP23, APP24 and DLBF, DLAR, DLTR, DLRT |
During Trading Hours:
- Trading suspension should only be considered if recovery takes longer than 5 minutes.
- Outbound MESSAGES 1 will be queued in THIRD PARTY Data Store to be sent to database upon SUBGROUP01 restart. All PROCESSING top of book orders will be re-quoted when the SUBGROUP01 process reconnects after restart.
- Outbound MESSAGES 2 will be queued in THIRD PARTY Data Store to be sent to SITE 2 “sold” upon SUBGROUP02 restart.
- Outbound MESSAGING will be queued and resent upon RTC and APP23 restarts/reconnects.
- Outbound messages from MESSAGING to TRF will queue to be sent upon APP24 restart/reconnect.
- MESSAGING quote and trade messaging will be queued in THIRD PARTY Data Store to be sent in chronological order upon APP06 restart (as if all MESSAGING Subscribers requested a retransmission of this data).
During Off-Trading Hours:
- Trading suspension should only be considered if recovery goes past 6am. |
- Notify EXECUTIVE.
- Notify SITE 2s, APP23 and/or APP24s affected.
- Move applications to alternate nodes.
- If alternate nodes are not available, move to DR nodes.
- All hosts and NAT addresses are pre-defined.
- No configuration changes are necessary.
- All UTDRI (NASDAQ SUBGROUP02) database loading must be up to date in the data center that the APP32 is going to be restarted in before restarts occur. The only time a database loader file should move between nodes is if the database loading has not been completed and can only be completed by doing so.
- APP23 and APP24 PROCESSOR files need to be moved ahead of service restart.
- APP06 log and index files need to be moved ahead of service restart to accommodate later MESSAGING Requests.
- Restart order (as applicable by server): SUBGROUP01, SUBGROUP02, APP23, APP09, APP24, APP06, then APP19.
- Confirm all data accounted for around move.
- Database loader files (not moved by necessity) can be concatenated after end of day shutdown. |
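The restart order in the response above applies "as applicable by server": only the service types hosted on the affected server are restarted, in the listed sequence. The following is a minimal Python sketch of that filtering. It assumes a hypothetical inventory of the service types hosted on the server; the function name and its input are illustrative and are not part of NTM Control Utility.

```python
# Restart order as documented for this server scenario.
RESTART_ORDER = ["SUBGROUP01", "SUBGROUP02", "APP23", "APP09", "APP24", "APP06", "APP19"]


def restart_sequence(hosted_service_types):
    """Return the hosted service types in the documented restart order.

    hosted_service_types is a hypothetical inventory of what runs on the
    affected server. Service types not in RESTART_ORDER are ignored.
    """
    hosted = set(hosted_service_types)
    return [svc for svc in RESTART_ORDER if svc in hosted]


if __name__ == "__main__":
    # Example: a server that hosts only three of the service types.
    print(restart_sequence(["APP06", "SUBGROUP01", "APP24"]))
    # prints ['SUBGROUP01', 'APP24', 'APP06']
```

The sequence is kept in one place so the same list can be reviewed against this playbook whenever the documented order changes.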
Services Involved |
Impacts |
Response |
Services MAY include any of the following: APP12, Opcon, APP37, APP38ridge, Drop Copy Reader
Service Types MAY include any of the following: APP12, APP35, APP37, APP38, DSCR, and DLCS, SUBGROUP04, DLOM |
During Off-Trading and Trading Hours:
- No trading suspension considerations are necessary.
- System monitoring will be compromised. EMT will not receive messages from Opcon until problems are resolved.
- Administrative functions and queries from APP12 will not be possible.
- MESSAGING, APP26 and APP12 messaging to PROCESSINGs and/or Order Sending Firms will be queued until APP37 and APP38ridge processes are restarted.
- PROCESSING drop copies to Order Sending Firms will be queued in THIRD PARTY Data Store until APP22 process is restarted. |
- Notify EXECUTIVE and IBs that APP38ridge connectivity between APP07 and APP32 is compromised.
- Move applications to alternate nodes.
- If alternate nodes are not available, move to DR nodes.
- All hosts and host-specific configurations are pre-defined, outside of JNLP requirements for APP12 client connections. These files must be changed to allow APP12 client connectivity.
- Restart order (as applicable by server): APP35 (followed by confirmation of monitoring integrity), APP38, APP37, APP22, APP12, then APP19.
- Confirm all data accounted for around move.
- Database loader files can be concatenated after end of day shutdown. |
Services Involved |
Impacts |
Response |
Services include: APP41 Data Store (FireDaemon)
NOTE: There are two pairings of APP41 Data Stores in each data center:
1) APP36 Data Store, storing data from ME, to SUBGROUP01, to SUBGROUP02, to APP06, and
2) FS Data Store, storing data from APP38 to ME, to APP40, to APP22, to APP09 |
During Off-Trading and Trading Hours: - No trading suspension considerations necessary. |
- Notify EXECUTIVE.
- No alternate nodes.
- Live without redundancy until server is reactivated and confirmed healthy.
- Service requires FireDaemon setup.
- Confirm APP41 Data Store failover occurred as expected and systems integrity of affected data. |
All application specific documentation contains the following sections:
- Purpose, describing what the application does and what data is processed in general terms.
- Troubleshooting Table, describing known events related to the application, their impacts and expected responses, in general terms.
- Recovery Considerations, outlining detailed steps required to move, and/or recover the application.
- NTM Control Commands, outlining available NTM Control Commands specific to the application to facilitate various operations.
APP01 Purpose:
APP01 processes receive orders from IBs and send them to External Vendors or MESSAGING Services.
They then receive related responses from these services and send these back to IBs.
Use the following hyperlinks to jump to the desired section of APP01 documentation:
APP01_Monitoring_Considerations
APP01 Recovery Considerations:
Stopping/Restarting/Moving Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- When moving between nodes:
o ALTERNATE NODES are not defined for APP01 services.
o DR NODES must be allocated un-natted nodes. No node can support more than one natted address at the same time.
- When stopping/restarting APP01 processes:
1) Notify the associated vendor or MESSAGING service firm and work in cooperation with them, as appropriate to situations.
2) Notify Tech Services if moving APP01 processes to DR nodes and NAT addresses need to change to accommodate move.
3) Stop the APP01 process.
4) If NOT moving APP01 to a new node, skip to step 5. If moving APP01 to a new node, copy the day’s APP01 PROCESSOR files to the alternate node:
a) Copy the \chx\data\APP01File\*.log, *.body, *.header, *.seqnums and *.session files created for the day to the alternate node.
b) Copy the \chx\data\APP01File\Global.* file created for the day to the alternate node.
c) If the target folder does not exist on the new node, first either create the folder or copy the entire folder.
d) If these files are not moved before the process restarts on the new node, there is a chance of sequence number miscommunication between the order sending firm involved and CHX.
5) Start the APP01 process.
6) Open the channel for the affected process and confirm that the order sending firm connects as expected.
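The file copy in step 4 can be scripted so that no file type is missed. The following is a minimal Python sketch. It assumes that the source and target folders are reachable as ordinary file system paths from where the script runs, and that "created for the day" can be approximated by the file's modification date. The function name copy_processor_files and both assumptions are illustrative and are not part of NTM Control Utility.

```python
import shutil
from datetime import date, datetime
from pathlib import Path

# File patterns named in step 4, items a) and b).
PATTERNS = ["*.log", "*.body", "*.header", "*.seqnums", "*.session", "Global.*"]


def copy_processor_files(source_dir, target_dir, day=None):
    """Copy the day's PROCESSOR files from source_dir to target_dir.

    Assumption: a file "created for the day" is one whose modification
    date equals `day` (defaults to today). Returns the copied file names.
    """
    day = day or date.today()
    source, target = Path(source_dir), Path(target_dir)
    # Step 4 c): create the target folder if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in PATTERNS:
        for path in sorted(source.glob(pattern)):
            if not path.is_file():
                continue
            modified = datetime.fromtimestamp(path.stat().st_mtime).date()
            # A file can match two patterns (for example Global.log), so
            # skip names that were already copied.
            if modified == day and path.name not in copied:
                shutil.copy2(path, target / path.name)  # keeps timestamps
                copied.append(path.name)
    return copied
```

The returned list can be compared with the source folder listing before the process is started in step 5, since files missed here are what lead to the sequence number problems described in item d).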
Open OSF Channel:
- Use NTM Control Utility – Service Control - APP01 – Open OSF Channel to make OSF connection possible.
Close OSF Channel:
- Use NTM Control Utility – Service Control - APP01 – Close OSF Channel to make OSF connection impossible.
Set Inbound Sequence Number:
- Use NTM Control Utility – Service Control - APP01 – Set Inbound Sequence Number to set Inbound Sequence Number.
Set Outbound Sequence Number:
- Use NTM Control Utility – Service Control - APP01 – Set Outbound Sequence Number to set Outbound Sequence Number.
Enable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP01 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.
Disable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP01 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.
APP01 Symptom |
Impacts |
Response |
Firm disconnects or logs out of session.
Evidenced by:
- EMT message saying {firm} is disconnected and/or {firm} is logged out.
- Stats monitor shows disconnected in status column.
{Firm} will be further identified in EMT message by including “LocalFixId” and “RemoteFixID” as configured in APP01Services.xml file within the disconnect message. |
IB is no longer able to send or receive order or order related messages with Vendor or MESSAGING Service. |
1) Contact firm.
2) Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them.
3) A stop/restart of the affected application service may help resolve the issue. |
APP01 Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate communications from to SITE 2s. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. 1) Check status of process 2) If process is up, call Technical Services. |
Status is Disconnected or Open |
Firm is not connected. |
|||
Write Queue is non-zero values and not reducing as
expected. |
Firm may not be processing as expected. |
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate communications from to SITE 2s. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
No data is displayed. |
No data has been generated to any SITE 2. |
|||
InCount does not match OutCount. |
Firm may not be processing as expected. |
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate communications from to SITE 2s. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Not all APP01 processes are displayed as expected. |
APP01 Service has not been started. |
|||
Msgs In and/or Msgs Out are zero. |
No messages have been sent/received since that monitor has
been started. |
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate communications from to SITE 2s. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Not all APP01 or APP37 processes are displayed as
expected. |
APP01 or APP37 Service has not been started or hasn't
processed any messages since monitor has been started. |
|||
Status is not CONNECTED. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
|||
Queue size is non-zero value and not decreasing as
expected. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate communications from to SITE 2s. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Not all APP01 or APP25 processes are displayed as
expected. |
APP01 or APP25 Service has not been started or hasn't
processed any messages since monitor has been started. |
|||
Status is not CONNECTED. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
|||
Queue size is non-zero value and not decreasing as
expected. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
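The "Queue size is non-zero value and not decreasing as expected" symptom in the tables above can be judged from a short series of readings rather than a single glance. The following is a minimal Python sketch. It assumes queue sizes are noted by hand or gathered by a hypothetical collector at regular intervals; the function name and the sampling approach are illustrative, and the Stats Monitors themselves are not scripted here.

```python
def queue_not_draining(samples):
    """Return True when a queue stays non-zero and is not decreasing.

    samples is a list of queue sizes, oldest first. The queue is treated
    as not draining when every reading is non-zero and the newest
    reading is no smaller than the oldest.
    """
    if len(samples) < 2:
        return False  # not enough readings to judge a trend
    return all(size > 0 for size in samples) and samples[-1] >= samples[0]


if __name__ == "__main__":
    print(queue_not_draining([5, 5, 6]))  # prints True
    print(queue_not_draining([9, 4, 1]))  # prints False
```

Comparing the oldest and newest readings, rather than each consecutive pair, tolerates brief fluctuations while still flagging a queue that is not shrinking overall.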
All APP02 Application
Specific Recovery documentation is referenced in the APP30 (APP02/SUBGROUP02/SUBGROUP03) section of this documentation.
APP03 Purpose:
APP03 processes receive order and execution drop copies from IBs and send them to External Vendors or MESSAGING Services.
They then receive related responses from these services and send these back to IBs.
Use the following hyperlinks to jump to the desired section of APP03 documentation:
APP03_Monitoring_Considerations
APP03 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- When moving between nodes:
o ALTERNATE NODES are not defined for APP03 services.
o DR NODES must be allocated un-natted nodes. No node can support more than one natted address at the same time.
- When stopping/restarting APP03 processes:
1) Notify the associated vendor or MESSAGING service firm and work in cooperation with them, as appropriate to situations.
2) Notify Tech Services if moving APP03 processes to DR nodes and NAT addresses need to change to accommodate move.
3) Stop the APP03 process.
4) If NOT moving APP03 to a new node, skip to step 5. If moving APP03 to a new node, copy the day’s APP03 PROCESSOR files to the alternate node:
a) Copy the \chx\data\{APP03}\*.log, *.body, *.header, *.seqnums and *.session files created for the day to the alternate node.
b) Copy the \chx\data\{APP03}\Global.* file created for the day to the alternate node.
c) If the target folder does not exist on the new node, first either create the folder or copy the entire folder.
d) If these files are not moved before the process restarts on the new node, there is a chance of sequence number miscommunication between the order sending firm involved and CHX.
5) Start the APP03 process.
6) Open the channel for the affected process and confirm that the order sending firm connects as expected.
Open OSF Channel:
- Use NTM Control Utility – Service Control - APP03 – Open OSF Channel to make OSF connection possible.
Close OSF Channel:
- Use NTM Control Utility – Service Control - APP03 – Close OSF Channel to make OSF connection impossible.
Set Inbound Sequence Number:
- Use NTM Control Utility – Service Control - APP03 – Set Inbound Sequence Number to set Inbound Sequence Number.
Set Outbound Sequence Number:
- Use NTM Control Utility – Service Control - APP03 – Set Outbound Sequence Number to set Outbound Sequence Number.
Enable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP03 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.
Disable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP03 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.
APP03 Troubleshooting Table:
APP03 Symptom |
Impacts |
Response |
Firm disconnects or logs out of session.
Evidenced by:
- EMT message saying {firm} is disconnected and/or {firm} is logged out.
- Stats monitor shows disconnected in status column.
{Firm} will be further identified in EMT message by including “LocalFixId” and “RemoteFixID” as configured in APP01Services.xml file within the disconnect message. |
IB is no longer able to send or receive order or drop copy related messages with Vendor or MESSAGING Service. |
1) Contact firm.
2) Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them.
3) A stop/restart of the affected application service may help resolve the issue. |
APP03 Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate communications from to MESSAGING
Destinations. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Status is Disconnected or Open |
Firm is not connected. |
|||
Write Queue is non-zero values and not reducing as
expected. |
Firm may not be processing as expected. |
APP04 Purpose:
APP04 processes allow MESSAGING Subscribers to request MESSAGING multicast retransmissions of data they believe they might have missed, on demand. APP04 processes receive MESSAGING Requests from MESSAGING Subscribers and send requested data back in return.
Use the following hyperlinks to jump to the desired section of APP04 documentation:
APP04_Monitoring_Considerations
APP04 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use APP04 nodes only when moving between nodes. There are NAT dependencies on this functionality.
- When stopping/restarting APP04 processes:
1) Notify the MESSAGING Subscribers and work in cooperation with them, as appropriate to situations.
2) Notify Tech Services if moving APP04 processes to alternate nodes and NAT addresses need to change to accommodate move.
3) Stop/Restart the APP04 process.
- APP04 recovery must also be considered if MESSAGING (APP06) Processes are moved between nodes.
- See APP04_Move_Procedure in the APP36_Recovery_Considerations section for more details.
There are no APP04 related NTM Control Commands.
APP04 Symptom |
Impacts |
Response |
Node Crashes Evidenced by: - In Solarwinds (and outlook), node and processes will be reported down. |
MESSAGING Subscribers will not be able to request MESSAGINGs. Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Notify Management. |
APP04 Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate MESSAGING requests from MESSAGING
subscribers utilizing MESSAGING log files. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
|
|
|
Connection status is not LoggedIn. |
Subscriber is not logged in and is unable to request or
receive MESSAGINGs. |
|
|
|
Retrans Requests Accepted and Sent values are unexpectedly
high. |
May indicate that there are MESSAGING delivery issues, especially if seen for several different subscribers. |
APP05 Purpose:
APP05 processes emulate MESSAGING Subscriber firms by running on servers that are outside of access switches.
APP05 processes receive MESSAGING multicast from MESSAGING (APP06) processes.
APP05 Client applications allow users to view MESSAGING data statistics, status and error messaging as well as MESSAGING data.
Operations are the only users of MESSAGING Readers.
Any issue with MESSAGING Readers may, or may not, mean MESSAGING Subscribers are having similar issues.
See APP06 Application Specific Scenarios for information related to MESSAGING multicast.
Use the following hyperlinks to jump to the desired section of APP05 documentation:
APP05_Monitoring_Considerations
APP05 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- APP05 processes do not move between data centers.
There are no APP05 related NTM Control Commands.
APP05 Troubleshooting Table:
APP05 Symptom |
Impacts |
Response |
Node Crashes Evidenced by: - In Solarwinds (and outlook), node and processes will be reported down. |
MESSAGING Readers will not be able to receive MESSAGING multicast. Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Notify Management. |
APP05 Monitoring Considerations:
MESSAGING
Reader Client |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes emulate MESSAGING Subscribers. MESSAGING Reader Client allows user monitoring of MESSAGING
Statistics, Messaging and MESSAGING data. |
PROD MENU: |
- Status Bar (bottom of client) - Backup State - Primary State - Messages Window |
Status Bar shows client in disconnected state. APP06 Process(es) are either down or Client was started before APP06 Processes. |
1) If APP06 processes are down, restart them. 3) If reconnection attempts fail, call Technical Services; there may be a networking issue. |
|
|
|
Backup State and/or Primary State is not “active”. APP06 multicast for alternate channel is not subscribed
to. |
1) Work with Tech Services to investigate potential
network issue. 2) If only one of the channels is not active, MESSAGING
Subscribers may not have an issue with the working channel. A Stop/Restart of the APP06 process
involved may correct the issue, understanding that this will interrupt MESSAGING
Subscribers reading the working channel, if there is one. |
|
|
|
Message Window shows APP06 related issues. |
See APP06 Application Recoveries. |
All APP06 Application Specific Recovery documentation is referenced in the APP36 (SUBGROUP01, SUBGROUP02, APP06) section of this documentation.
APP07 Purpose:
APPGROUP04 Servers receive orders, order cancels and order changes from Order Sending Firms and send responses to these to Order Sending Firms.
APPGROUP04 Servers receive order responses from Market Engines, SITE 2s, TRFs, as well as “regulatory” drop copies.
APPGROUP04 Servers send orders to PROCESSINGs, SITE 2s, and TRFs, as well as drop copies to MESSAGING processes.
APPGROUP04 Servers also send trade reports directly to SUBGROUP02 services when they correct MESSAGES 2.
Use the following hyperlinks to jump to the desired section of APP07 documentation:
APP07_Monitoring_Considerations
APP07 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use APP07 nodes only when moving between nodes. (Java code, FireDaemon and Host Specific references in JNLPs required.)
- When stopping/restarting APP07 processes:
1) Notify the IBs and work in cooperation with them, as appropriate to situations.
2) Stop/Restart the APP07 process.
Refresh Threshold Data:
- Use NTM Control Utility – Service Control - APP07 – Refresh Threshold Data.
Refresh Trade Ack Data:
- Use NTM Control Utility – Service Control - APP07 – Refresh Trade Ack Data.
Refresh Brokers Clerks Data:
- Use NTM Control Utility – Service Control - APP07 – Refresh Brokers Clerk Data.
Refresh Sub Accounts Data:
- Use NTM Control Utility – Service Control - APP07 – Refresh Sub Accounts Data.
Turn On MESSAGING Control:
- Use NTM Control Utility – Service Control - APP07 – Turn on MESSAGING Control.
Turn Off MESSAGING Control:
- Use NTM Control Utility – Service Control - APP07 – Turn off MESSAGING Control.
End Of Day:
- Use NTM Control Utility – Service Control - APP07 – End Of Day to start End Of Day processing.
Client Shutdown:
- Use NTM Control Utility – Service Control – APP07 – Client Shutdown to remotely shutdown all clients for the instance.
APP07 Troubleshooting Table:
APP07 Symptom |
Impacts |
Response |
Node Crashes Evidenced by: - In Solarwinds (and outlook), node and processes will be reported down. |
IBs will not be able to receive or process orders via MESSAGING. Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Notify Management. |
APP07 Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
|
Processes facilitate communications from APPGROUP04 Servers
to Order Sending Firms, Drop Copy Vendors, SITE 2s, PROCESSINGs and Trade
Reporting Systems. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
|
Not all APP07 processes are displayed as expected. |
APP07 Service has not been started. |
||||
Msgs In and/or Msgs Out are zero when messages are processed. |
No messages have been sent/received since that monitor has
been started. |
||||
Stats Monitors: |
To Start: |
Key Indicators to Monitor: |
Symptom: |
Response: |
Processes facilitate communications from APPGROUP04 Servers
to Order Sending Firms, Drop Copy Vendors, SITE 2s, PROCESSINGs and Trade
Reporting Systems. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Not all APP07 or APP03 processes are displayed as
expected. |
APP07 or APP03 Service has not been started or hasn't
processed any messages since monitor has been started. |
|||
Status is not CONNECTED. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
|||
Queue size is non-zero value and not decreasing as
expected. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
Stats Monitors: |
To Start: |
Key Indicators to Monitor: |
Symptom: |
Response: |
Processes facilitate communications from APPGROUP04 Servers
to Order Sending Firms, Drop Copy Vendors, SITE 2s, PROCESSINGs and Trade
Reporting Systems. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Not all APP07 or APP24 processes are displayed as
expected. |
APP07 or APP24 Service has not been started or hasn't
processed any messages since monitor has been started. |
|||
Status is not CONNECTED. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
|||
Queue size is non-zero value and not decreasing as
expected. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
Stats Monitors: |
To Start: |
Key Indicators to Monitor: |
Symptom: |
Response: |
Processes facilitate communications from APPGROUP04 Servers
to Order Sending Firms, Drop Copy Vendors, SITE 2s, PROCESSINGs and Trade
Reporting Systems. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Not all APP07 or PROCESSING processes are displayed as
expected. |
APP07 or PROCESSING Service has not been started or hasn't
processed any messages since monitor has been started. |
|||
Status is not CONNECTED. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
|||
Queue size is non-zero value and not decreasing as
expected. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
Stats Monitors: |
To Start: |
Key Indicators to Monitor: |
Symptom: |
Response: |
Processes facilitate communications from APPGROUP04 Servers
to Order Sending Firms, Drop Copy Vendors, SITE 2s, PROCESSINGs and Trade
Reporting Systems. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Not all APP07 or APP37 processes are displayed as
expected. |
APP07 or APP37 Service has not been started or hasn't
processed any messages since monitor has been started. |
|||
Status is not CONNECTED. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
|||
Queue size is non-zero value and not decreasing as
expected. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
Stats Monitors: |
To Start: |
Key Indicators to Monitor: |
Symptom: |
Response: |
Processes facilitate communications from APPGROUP04 Servers
to Order Sending Firms, Drop Copy Vendors, SITE 2s, PROCESSINGs and Trade
Reporting Systems. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Not all APP07 or SUBGROUP02 processes are displayed as
expected. |
APP07 or SUBGROUP02 Service has not been started or hasn't
processed any messages since monitor has been started. |
|||
Status is not CONNECTED. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
|||
Queue size is non-zero value and not decreasing as
expected. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
All APP08 Application Specific Recovery documentation
is referenced in the APP28 (RISK System) section of this documentation.
APP09 Purpose:
MESSAGING receives MESSAGING records from SUBGROUP02.
MESSAGING sends MESSAGING records to APP23 and sends MESSAGING drop copy messages to APP22 for order sending firms that request them.
Use the following hyperlinks to jump to the desired section of APP09 documentation:
APP09_Monitoring_Considerations
APP09 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- APP09 process is closely tied to the APP36 system. If moving APP09 to another node, see APP36_Combined_SUBGROUP01_SUBGROUP02_APP06_Move_Procedure.
Resend Zero Cusip Messages:
- Use NTM Control Utility – Service Control – Realtime MESSAGING Options – Resend Zero Cusip Messages.
Send EOD Message:
- Use NTM Control Utility – Service Control - Realtime MESSAGING Options – Send EOD Message.
Set Outbound Sequence Number:
- Use NTM Control Utility – Service Control - Realtime MESSAGING Options – Set Outbound Sequence Number.
APP09 Symptom |
Impacts |
Response |
Node Crashes Evidenced by: - In Solarwinds (and outlook), node and processes will be reported down. |
APP23 will no longer be receiving MESSAGING records. Order Sending firms requesting MESSAGING drop copies will no longer be receiving them. Refer to: APPGROUP13 Server (APP36 – to SITE 2 MESSAGES 1 and MESSAGES 2, MESSAGING, MESSAGING and TRF) Server Specific Recoveries. |
1) Refer to: APPGROUP13 Server (APP36 – to SITE 2 MESSAGES 1 and MESSAGES 2, MESSAGING, MESSAGING and TRF) Server Specific Recoveries. 2) Notify Management. |
APP09 Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
|||||
Processes
facilitate MESSAGING trade delivery from SUBGROUP02 processes to APP23 PROCESSOR. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
|||||
Not all RTC processes are displayed as expected. |
Not all RTC services have been started, or they have not processed any messages since the monitor was started. |
||||||||
Status is not CONNECTED. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
||||||||
IPC Connected Queue size is non-zero value and not
decreasing as expected. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
||||||||
|
Stats Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
||||
|
Processes facilitate MESSAGING
trade delivery from SUBGROUP02 processes to APP23 PROCESSOR. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
||||
|
Status is Disconnected or Open |
Firm is not connected. |
|||||||
|
InMsgs values are not equal to or greater than OutMsgs
values. |
We may not be processing as expected. |
|||||||
Stats Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate MESSAGING
trade delivery from SUBGROUP02 processes to APP23 PROCESSOR. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being received by monitor. |
1) Check status of process |
Not all RTC processes are displayed as expected. |
Service has not been started. |
|||
Msgs In and/or Msgs Out are zero. |
No messages have been sent/received since that monitor has
been started. |
APP10 Purpose:
APP10 processes generate test orders (emulating order sending firms) and send these to either APP20 or APP26 processes to route to PROCESSINGs.
APP10 processes receive the order and quote responses for each order generated and report “test success/fail” results as they are received.
Based on testing failures (or successes after failures), APP10 processes may automatically set all MESSAGES 1 for stocks traded by a given PROCESSING to manual or auto.
By default configuration settings:
APP10DC2 process runs in DC2 and tests all stocks assigned to DC2 MEs, via APP20_APP10DC2 process, running in DC1.
Failures will control quoting conditions in these stocks.
APP10DC1 process runs in DC1 and tests all stocks assigned to DC1 MEs, via APP20_APP10DC1 process, running in DC2.
Failures will control quoting conditions in these stocks.
APP10APP37 process runs in DC1 and tests all stocks assigned to DC2 and DC1 MEs, via APP26_APP10APP37 process, running in DC2.
Failures will NOT control quoting conditions in these stocks.
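The default assignments above can be captured as data, which makes it easier to check which APP10 instance governs quoting for the MEs in a given data center. This is only a minimal sketch: the App10Instance class, its field names and the quoting_controllers helper are illustrative assumptions and are not taken from APP10Services.xml or any other configuration file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App10Instance:
    """Hypothetical record describing one default APP10 instance."""
    name: str               # APP10 process name
    runs_in: str            # data center hosting the APP10 process
    tests_mes_in: tuple     # data centers whose MEs' stocks are tested
    gateway: str            # APP20/APP26 process the test orders route through
    gateway_runs_in: str    # data center hosting that gateway process
    controls_quoting: bool  # whether failures change quoting conditions


# Values transcribed from the default configuration described above.
DEFAULT_INSTANCES = (
    App10Instance("APP10DC2", "DC2", ("DC2",), "APP20_APP10DC2", "DC1", True),
    App10Instance("APP10DC1", "DC1", ("DC1",), "APP20_APP10DC1", "DC2", True),
    App10Instance("APP10APP37", "DC1", ("DC2", "DC1"), "APP26_APP10APP37", "DC2", False),
)


def quoting_controllers(data_center):
    """Return names of instances whose failures control quoting for MEs in data_center."""
    return [
        inst.name
        for inst in DEFAULT_INSTANCES
        if data_center in inst.tests_mes_in and inst.controls_quoting
    ]
```

For example, quoting_controllers("DC1") lists only APP10DC1, because APP10APP37 tests DC1 stocks but its failures do not control quoting.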
Use the
following hyperlinks to jump to the desired section of APP10 documentation:
APP10_Monitoring_Considerations
APP10 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use APP10 nodes only when moving between nodes. (Configuration files utilize specific IP addresses to connect to APP20 and APP26 instances)
- When stopping/restarting APP10 processes:
1) Determine if associated APP20 or APP26 instance used by the affected APP10 instance is also moving hosts.
a. If APP10 associated APP20 or APP26 instances are moving hosts, a different APP10Services.xml configuration file will be required.
The APP10Services.xml files make use of APP10 service level parameters in the APP29.
The following APP10 service level parameters are defined for APP10DC1 and APP10DC2 services:
APP20_PRI_PRI_HOST
APP20_PRI_ALT_HOST
APP20_DR_PRI_HOST
APP20_DR_ALT_HOST
The following APP10 service level parameters are defined for the APP10APP37 service:
APP26_PRI_PRI_HOST
APP26_PRI_ALT_HOST
APP26_DR_PRI_HOST
APP26_DR_ALT_HOST
b. The APP10Services.xml file can easily be modified to redirect APP10 service being moved to use the correct parameter.
c. There are already some pre-modified versions of APP10Services.xml to aid in these recoveries.
d. If these configuration files are not moved in before the APP10 process restarts, no connections will occur.
2) If the associated APP10 APP20 or APP26 instance is NOT moving to an alternate node, skip to step 3.
a) Move in, or modify the required version of APP10Services.xml file
b) Stop the associated APP20 or APP26 instance on its current node.
c) Copy the day’s APP20 or APP26 PROCESSOR files from d$\chx\data\ subfolder from current node to same folder on alternate node.
d) Start the associated APP20 or APP26 instance on its alternate node.
e) Open channel for associated APP20 or APP26 instance and confirm channel opens without issue.
3) Stop the APP10 instance.
4) If APP10 process is NOT moving to alternate node, skip to step 5.
a) Copy the day’s APP10 PROCESSOR files from d$\chx\data\ subfolder from current node to same folder on alternate node.
5) Start APP10 instance.
a) Confirm APP10 testing works without errors.
APP10 testing will automatically start between 5am-3:30pm upon APP10 startup.
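The branching in the steps above ("skip to step 3", "skip to step 5") can be sketched as a small helper that returns the ordered actions for a given situation. This is an illustrative checklist generator only. The function name and the two flags are assumptions, and it does not call NTM Control or touch any process.

```python
def app10_restart_steps(gateway_moving, app10_moving):
    """Return the ordered manual steps for an APP10 stop/restart.

    gateway_moving: the associated APP20 or APP26 instance is moving hosts.
    app10_moving:   the APP10 process itself is moving to an alternate node.
    """
    steps = []
    if gateway_moving:
        # Step 2 only applies when the associated gateway instance moves.
        steps += [
            "Move in, or modify, the required version of APP10Services.xml",
            "Stop the associated APP20 or APP26 instance on its current node",
            "Copy the day's APP20 or APP26 PROCESSOR files to the same folder on the alternate node",
            "Start the associated APP20 or APP26 instance on its alternate node",
            "Open the channel for the APP20 or APP26 instance and confirm it opens without issue",
        ]
    steps.append("Stop the APP10 instance")
    if app10_moving:
        # Step 4 only applies when APP10 itself moves.
        steps.append("Copy the day's APP10 PROCESSOR files to the same folder on the alternate node")
    steps.append("Start the APP10 instance")
    steps.append("Confirm APP10 testing works without errors")
    return steps
```

A plain in-place restart (both flags False) reduces to stop, start and confirm.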
Auto Quote All MEs:
- Use NTM Control Utility – Service Control - APP10 – Auto Quote All MEs to send all MEs a message to set all stocks’ quote modes to auto.
Auto Quote One ME:
- Use NTM Control Utility – Service Control - APP10 – Auto Quote One APP32 to send one APP32 a message to set all stocks quote modes to auto.
Manual Quote All MEs:
- Use NTM Control Utility – Service Control - APP10 – Manual Quote All MEs to send all MEs a message to set all stocks’ quote modes to manual.
Manual Quote One ME:
- Use NTM Control Utility – Service Control - APP10 – Manual Quote One APP32 to send one APP32 a message to set all stocks quote modes to manual.
Start All APP32 Test:
- Use NTM Control Utility – Service Control - APP10 – Start All APP32 Test to start APP10 testing for all MEs, as configured by stock.
Start One APP32 Test:
- Use NTM Control Utility – Service Control - APP10 – Start One APP32 Test to start APP10 testing for a single ME, as configured by stock.
Stop All APP32 Test:
- Use NTM Control Utility – Service Control - APP10 – Stop All APP32 Test to stop APP10 testing for all MEs, as configured by stock.
Stop One APP32 Test:
- Use NTM Control Utility – Service Control - APP10 – Stop One APP32 Test to stop APP10 testing for a single ME, as configured by stock.
APP10 Troubleshooting Table:
APP10 Symptom |
Impact |
Response |
APP10 RESTING ORDER QUOTE FAILURES caused by SITE 2 connectivity issues. Evidenced by: - APP10 EMT messages indicating RESTING_ORDER_QUOTE_FAILURE. - APP10 Processing Stats monitor indicating RESTING_ORDER_QUOTE_FAILURE. - APP36 Stats monitor showing connectivity issues to SITE 2 (SITE 2 and/or NASD) |
APP10 is not receiving MESSAGES 1 from SUBGROUP01 process as expected. During Trading Hours: cannot fulfill quoting obligations to National Market System. MESSAGING will no longer be updating if SUBGROUP01 process is not connected to SITE 2. If 3+ consecutive APP10 failures of any kind, APP10 will send SUBGROUP01 process “set quote to manual” command, causing all affected stocks to be marked as “manual” to SITE 2. SITE 2 connectivity issues will prevent industry from seeing these “manual” MESSAGES 1. |
1) Refer to: CHX_Cannot_Send_MESSAGES 1_To_SITE 2s Generalized Recovery Scenario. 2) Refer to: APP36 (SUBGROUP01, SUBGROUP02, APP06) Application Specific Recoveries. |
APP10 RESTING ORDER QUOTE FAILURES NOT caused by SITE 2 connectivity issues. Evidenced by: - APP10 EMT messages indicating RESTING_ORDER_QUOTE_FAILURE. - APP10 Processing Stats monitor indicating RESTING_ORDER_QUOTE_FAILURE. - APP36 Stats monitor showing NO CONNECTIVITY ISSUES to SITE 2 (SITE 2 and/or NASD) - APP10_App_Connect_Stats monitor showing possible connectivity problem between APP10 and APP11RI/UQDRI process. - NTM Control APP31 APP11/SUBGROUP02 Display Montage shows MESSAGES 1 marked as “manual”. |
APP10 is not receiving MESSAGES 1 from SUBGROUP01 process as expected. During Trading Hours: If 3+ consecutive APP10 failures of any kind, APP10 will send SUBGROUP01 process “set quote to manual” command, causing all affected stocks to be marked as “manual” to SITE 2. |
1) Confirm scope of impact. In Stats Monitor: - Are problems specific to: - SITE 2 and/or NASDAQ - DC1 and/or DC2 - Servers? Processes? Channels? 2) Notify management. 3) Work with Tech Services to determine corrective actions. Try to avoid APP32 stop/restart. - Stop/Restart of APP10 may resolve issue. - Stop/Restart of SUBGROUP01 may resolve issue. - NTM APP32 option to “Resend APP32 MESSAGES 1” may resolve issue. 4) If connectivity cannot be immediately resolved, use NTM Control SUBGROUP01 options by APP32 to zero MESSAGES 1 or mark all MESSAGES 1 as manual. 5) Consider suspending trading. 6) If SUBGROUP01 restart is tried, once SITE 2 connectivity is re-established, SUBGROUP01 will automatically request download of updated MESSAGES 1 from all MEs and send these to SITE 2. 7) Once problems are resolved, use NTM APP32 “Resend APP32 MESSAGES 1” option to generate updated MESSAGES 1 from all MEs and send these to SITE 2. |
APP10 RESTING ORDER EXEC RPT FAILURE WITHOUT APP10 ORDER_EXEC_RPT_FAILURE. Evidenced by: - APP10 EMT messages indicating RESTING_ORDER_EXEC_RPT_FAILURE. - APP10 Processing Stats monitor indicating RESTING_ORDER_EXEC_RPT_FAILURE. - APP10_App_Connect_Stats monitor showing possible connectivity problem between APP10 and FIX.4.1:APP10T process. - NTM Control APP31 APP11/SUBGROUP02 Display Montage shows MESSAGES 1 marked as “manual”. If APP10 is ONLY reporting RESTING_ORDER_EXEC_RPT_FAILURE, without APP10 ORDER_EXEC_RPT_FAILURE, then the implication is that the APP32 is up and processing, and connections to APP10 are likely intact, but the APP32 APP10 stock may be closed and need to be re-opened. |
APP10 is not receiving APP32 trade executions from ME, APP20 or APP26 process as expected, BUT is receiving IOC order cancels as expected. During Trading Hours: If 3+ consecutive APP10 failures of any kind, APP10 will send SUBGROUP01 process “set quote to manual” command, causing all affected stocks to be marked as “manual” to SITE 2. |
1) Confirm scope of impact. In Stats Monitor: - Are problems specific to: - SITE 2 and/or NASDAQ - DC1 and/or DC2 - Servers? Processes? Channels? 2) Notify management. 3) Work with APP12 APP10 order queries and other departments as necessary to determine corrective actions. Try to avoid APP32 stop/restart. - NTM APP32 option to “ResAPP41 APP10 Issues” may resolve issue. - Stop/Restart of APP20/APP26 may resolve issue; Requires opening of channels. - Stop/Restart of APP10 may resolve issue. 4) If connectivity cannot be immediately resolved, use NTM Control SUBGROUP01 options by APP32 to zero MESSAGES 1 or mark all MESSAGES 1 as manual. 5) Consider suspending trading. 6) Once problems are resolved, use NTM APP32 “Resend APP32 MESSAGES 1” option to generate updated MESSAGES 1 from all MEs and send these to SITE 2. |
APP10 RESTING ORDER EXEC RPT FAILURE WITH APP10 ORDER EXEC RPT FAILURE. Evidenced by: - APP10 EMT messages indicating ORDER_EXEC_RPT_FAILURE and RESTING_ORDER_EXEC_RPT_FAILURE. - APP10 Processing Stats monitor indicating ORDER_EXEC_RPT_FAILURE and RESTING_ORDER_EXEC_RPT_FAILURE. - APP10_App_Connect_Stats monitor showing possible connectivity problem between APP10 and FIX.4.1:APP10T process. - NTM Control APP31 APP11/SUBGROUP02 Display Montage shows MESSAGES 1 marked as “manual”. If APP10 is reporting both RESTING_ORDER_EXEC_RPT_FAILURE and ORDER_EXEC_RPT_FAILURE simultaneously, then the likely cause is that the APP32 is down, or the APP32 connection to APP10 APP26/APP20 is broken, or the APP10 APP26/APP20 connection to APP10 is broken. |
APP10 is not receiving APP32 trade executions from ME, APP20 or APP26 process as expected, NOR receiving IOC order cancels as expected. During Trading Hours: If 3+ consecutive APP10 failures of any kind, APP10 will send SUBGROUP01 process “set quote to manual” command, causing all affected stocks to be marked as “manual” to SITE 2. |
1) Confirm scope of impact. In Stats Monitor: - Are problems specific to: - SITE 2 and/or NASDAQ - DC1 and/or DC2 - Servers? Processes? Channels? 2) Notify management. 3) Work with APP12 APP10 order queries and other departments as necessary to determine corrective actions. Try to avoid APP32 stop/restart. - Stop/Restart of APP20/APP26 may resolve issue; Requires opening of channels. - Stop/Restart of APP10 may resolve issue. - NTM APP32 option to “ResAPP41 APP10 Issues” may resolve issue. 4) If connectivity cannot be immediately resolved, use NTM Control SUBGROUP01 options by APP32 to zero MESSAGES 1 or mark all MESSAGES 1 as manual. 5) Consider suspending trading. 6) Once problems are resolved, use NTM APP32 “Resend APP32 MESSAGES 1” option to generate updated MESSAGES 1 from all MEs and send these to SITE 2. |
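The four rows of the troubleshooting table amount to a small decision tree keyed on which failure codes APP10 reports and whether the APP36 Stats monitor shows SITE 2 connectivity issues. The sketch below restates that tree. The function name, the returned labels, and the choice to evaluate quote failures before execution report failures are assumptions made for illustration, not behavior of APP10 itself.

```python
def classify_app10_failure(failure_codes, site2_connectivity_issue=False):
    """Map reported APP10 failure codes to a row of the troubleshooting table.

    failure_codes: iterable of codes seen in EMT or the Processing Stats monitor.
    site2_connectivity_issue: True if the APP36 Stats monitor shows SITE 2 issues.
    """
    codes = set(failure_codes)
    if "RESTING_ORDER_QUOTE_FAILURE" in codes:
        if site2_connectivity_issue:
            return "QUOTE_FAILURE_SITE2_CONNECTIVITY"
        return "QUOTE_FAILURE_NOT_SITE2_CONNECTIVITY"
    if "RESTING_ORDER_EXEC_RPT_FAILURE" in codes:
        if "ORDER_EXEC_RPT_FAILURE" in codes:
            # Both codes together: ME down, or a connection in the
            # ME / APP20-APP26 / APP10 chain is likely broken.
            return "EXEC_RPT_FAILURE_WITH_ORDER_EXEC_RPT_FAILURE"
        # Resting failure alone: ME is likely up, but the test stock may be closed.
        return "EXEC_RPT_FAILURE_WITHOUT_ORDER_EXEC_RPT_FAILURE"
    return "UNCLASSIFIED"
```

An operator would still confirm the scope of impact and notify management as the Response column directs; the helper only narrows down which row applies.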
APP10 Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes
generate test orders and cancels to PROCESSINGs via APP20 or APP26 processes
and report communication failures from PROCESSINGs and SUBGROUP01 processes. |
PROD MENU: |
- Color of data in columns - Status, |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Status is Created, Disconnected or Open |
APP20, APP26 or SUBGROUP01 is not connected. |
|||
Write Queue is a non-zero value and is not reducing as expected. |
Messages cannot be sent from source to destination unless
IPC channel is connected. |
Stats
Monitors: |
To Start: |
Key Indicators to Monitor: |
Symptom: |
Response: |
Processes
generate test orders and cancels to PROCESSINGs via APP20 or APP26 processes
and report communication failures from PROCESSINGs and SUBGROUP01 processes. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being received by monitor. |
1) Check status of process |
Status is not Testing. |
APP10 testing has not been enabled. |
1) Use NTM Control Utility APP10 Service Controls to
control testing. |
|||
Quote Mode is Manual. |
APP10 test cycles have failed 3 consecutive times and not succeeded in 3 subsequent consecutive times. |
1) If failure reasons are quote related, check SUBGROUP01
processes and processing. |
|||
Test Results are not PASS and/or TestDesp are not successful. |
APP10 test cycles have failed 3 consecutive times and not succeeded in 3 subsequent consecutive times. |
1) If failure reasons are quote related, check SUBGROUP01
processes and processing. |
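The Quote Mode behavior described in the table (manual after 3 consecutive failures, back to auto only after 3 consecutive successes) is a simple hysteresis rule. The class below is a minimal sketch of that rule for reasoning about what the monitor should show. The class name, the record method and the configurable threshold are assumptions, and the real APP10 logic may differ in detail.

```python
class QuoteModeTracker:
    """Track quote mode from a stream of APP10 test cycle results."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # consecutive results needed to switch mode
        self.mode = "auto"
        self._fail_streak = 0
        self._pass_streak = 0

    def record(self, passed):
        """Record one test cycle result and return the resulting quote mode."""
        if passed:
            self._pass_streak += 1
            self._fail_streak = 0
            if self.mode == "manual" and self._pass_streak >= self.threshold:
                self.mode = "auto"
        else:
            self._fail_streak += 1
            self._pass_streak = 0
            if self.mode == "auto" and self._fail_streak >= self.threshold:
                self.mode = "manual"
        return self.mode
```

Under this rule a single success after a run of failures does not restore auto quoting, which matches a monitor that stays on Manual until three clean cycles are seen.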
All APP11 Application Specific Recovery documentation is referenced in the APP31 (APP11/SUBGROUP02/APP13/SUBGROUP04) section of this documentation.
APP12 Purpose:
APP12 allows users to query order, trade and MESSAGING records. Users can also cancel orders, modify or resend MESSAGES 2, or enter trade or MESSAGING records.
APP12 Servers send corrected trade reports to PROCESSINGs, or directly to SUBGROUP02 services when they correct MESSAGES 2.
Use the
following hyperlinks to jump to the desired section of APP12 documentation:
APP12_Monitoring_Considerations
APP12 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use APP12 nodes only when moving between nodes. (Java code, FireDaemon and Host Specific references in JNLPs required.)
- When stopping/restarting APP12 processes:
1) Notify Operations and work in cooperation with them, as appropriate to situations.
2) Stop/Restart the APP12 process.
Refresh Sub Accounts Data:
- Use NTM Control Utility – Service Control - APP12 – Refresh Sub Accounts Data.
APP12 Symptom |
Impact |
Response |
Node Crashes. Evidenced by: - In SolarWinds (and Outlook), node and processes will be reported down. |
Operations will not be able to query or administratively manage orders, MESSAGES 2 or MESSAGING reports via APP12. Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Notify Management. |
APP12 Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes
facilitate order and trade research and modification capabilities with PROCESSINGs. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being received by monitor. |
1) Check status of process |
Not all APP12 processes are displayed as expected. |
APP12 Service has not been started. |
|||
Msgs In and/or Msgs Out are zero when messages are
processed. |
No messages have been sent/received since that monitor has
been started. |
All APP13 Application Specific Recovery documentation is referenced in the APP31 (APP11/SUBGROUP02/APP13/SUBGROUP04) section of this documentation.
All APP14 Application Specific Recovery documentation is referenced in the APPGROUP03 Server section of this documentation.
All APP15 Application Specific Recovery documentation is referenced in the APPGROUP03 Server section of this documentation.
All APP16 Application Specific Recovery documentation is referenced in the APPGROUP03 Server section of this documentation.
All APP17 Application Specific Recovery documentation is referenced in the APPGROUP03 Server section of this documentation.
All APP18 Application Specific Recovery documentation is referenced in the APPGROUP03 Server section of this documentation.
DBL Purpose:
APP19 loaders read files created by associated applications and load all messages into databases for historical reference.
Use the
following hyperlinks to jump to the desired section of DBL documentation:
DBL Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use appropriate nodes for associated application only when moving between nodes. (See associated application sections for reference.)
- When stopping/restarting Database Loader processes:
1) Stop the associated application that is responsible for writing to the affected database loader file.
2) Confirm the database loader has completed loading all data.
3) Stop the database loader.
4) If NOT moving the Database Loader files to a new node, skip to step 5.
a) Database Loader files should only be moved if “pre-move” database loading cannot be completed without doing so.
b) If moving Database Loader files:
Copy :\chx\data\DL*.log, DL*.pos, DL*rejeAPP13.log files (created on day) for each Database Loader process moving to alternate node.
5) Start the associated application that writes the database loader file.
6) Start the Database Loader.
7) If APP19 were moved to alternate nodes WITHOUT moving Database Loader files, concatenate Database Loader files:
a) These steps are only critical for End-Of-Day Post Trade Technology procedures that depend on the affected Database Loader files existing on a given node, and including the entire day’s records. These steps can be left for the End-Of-Day, as long as they are completed prior to the End-Of-Day Post Trade Technology procedures that require them.
b) In Windows Explorer:
i) Go to server and folder where first set of Database Loader files exist.
ii) Rename all *.log files involved (created on the day) from *.log to *A.log
iii) Copy the second server’s versions of all *.log files involved (created on the day) to original server and folder.
iv) Rename all newly copied files involved (created on the day) from *.log to *B.log
c) On original server, using DOS command prompt:
i) Go to folder where Database Loader files exist.
ii) COPY *A.log + *B.log *.log
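Where the concatenation has to be repeated for many loader files, the same A-then-B join can be scripted. The following is a minimal sketch, assuming the files have already been renamed to *A.log and *B.log as described above. The function name is hypothetical, and the script refuses to overwrite an existing *.log file rather than guess which copy is correct.

```python
import shutil
from pathlib import Path


def concatenate_day_logs(folder):
    """Join each <name>A.log with its <name>B.log into <name>.log.

    Returns the list of files written. Pairs with a missing B file are skipped.
    """
    folder = Path(folder)
    written = []
    for part_a in sorted(folder.glob("*A.log")):
        base = part_a.name[: -len("A.log")]
        part_b = folder / f"{base}B.log"
        if not part_b.exists():
            continue  # nothing to append; leave the A file for manual review
        target = folder / f"{base}.log"
        # Mode "xb" fails if the target already exists and copies bytes unchanged,
        # so the same code works for text and binary loader files.
        with open(target, "xb") as out:
            for part in (part_a, part_b):
                with open(part, "rb") as src:
                    shutil.copyfileobj(src, out)
        written.append(target)
    return written
```

The A file (first server) is always written before the B file (second server), preserving the order used by the manual COPY step.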
See Database Loader Reject Processing Documentation for NTM Control Commands utilized for this purpose.
Database Reject Reformat:
- Use NTM Control Utility – Utilities – Database Reject Reformat.
- User will see list of reject files to be processed, if there are any.
- For any reject files the user wishes to process, they highlight the file and then right-click and select Create New Reload File.
DBL Troubleshooting Table:
DBL Symptom |
Impact |
Response |
There are Database Loading Rejects. Evidenced by: - Stats monitor shows non-zero values in reject column. - Oracle reject errors are seen in EMT. |
Historical data will not be retained as expected, nor be available to applications that may need to act further against the data. |
1) Use Database Loader Reject Replay Procedures to examine reasons for rejects and replay data as appropriate. |
DBL Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes read data files and load the records into the
databases. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Reject File Size column shows positive value (non-zero). |
Rejects have occurred. |
|||
Percent Complete and Records Remaining columns showing non-zero
values and not reducing as expected. Insert Rate is zero or lower than
expected rate. |
Process is either down or hung, or database is not responding. |
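The last symptom row (backlog columns non-zero and not reducing, Insert Rate zero or low) can be expressed as a check over two or more readings taken from the Stats monitor. This is a sketch only: the function name, the sample format and the min_insert_rate default are assumptions, since this document does not state what the expected insert rate is.

```python
def loader_needs_attention(samples, min_insert_rate=1.0):
    """Flag a database loader that appears down, hung, or starved.

    samples: list of (records_remaining, insert_rate) tuples, oldest first.
    min_insert_rate: assumed lowest acceptable rate while a backlog exists.
    """
    if len(samples) < 2:
        return False  # a single reading cannot show whether the backlog is shrinking
    first_remaining, _ = samples[0]
    last_remaining, last_rate = samples[-1]
    backlog_stuck = last_remaining > 0 and last_remaining >= first_remaining
    rate_low = last_remaining > 0 and last_rate < min_insert_rate
    return backlog_stuck or rate_low
```

A loader with nothing left to load is never flagged, because a zero Insert Rate is expected once Records Remaining reaches zero.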
Binary DBL Purpose:
APP19 loaders read files created by associated applications and load all messages into databases for historical reference.
Use the
following hyperlinks to jump to the desired section of Binary DBL
documentation:
Binary_DBL_Recovery_Considerations
Binary_DBL_NTM_Control_Commands
Binary_DBL_Troubleshooting_Table
Binary_DBL_Monitoring_Considerations
Binary DBL Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use appropriate nodes for associated application only when moving between nodes. (See associated application sections for reference.)
- When stopping/restarting Database Loader processes:
1) Stop the associated application that is responsible for writing to the affected database loader file.
2) Confirm the database loader has completed loading all data.
3) Stop the database loader.
4) If NOT moving the Database Loader to a new node, skip to step 5.
a) Database Loader files should only be moved if “pre-move” database loading cannot be completed without doing so.
b) If moving Database Loader files:
Copy :\chx\data\*bin*.log and *bin*.pos files (created on day) for each Database Loader process moving to alternate node.
5) Start the associated application that writes the database loader file.
6) Start the Database Loader.
7) If APP19 moved to alternate nodes WITHOUT moving Database Loader files, concatenate Database Loader files:
a) These steps are only critical for End-Of-Day Post Trade Technology procedures that depend on the affected Database Loader files existing on a given node, and including the entire day’s records. These steps can be left for the End-Of-Day, as long as they are completed prior to the End-Of-Day Post Trade Technology procedures that require them.
b) In Windows Explorer:
i) Go to server and folder where first set of Database Loader files exist.
ii) Rename all *.log files involved (created on the day) from *.log to *A.log
iii) Copy the second server’s versions of all *.log files involved (created on the day) to original server and folder.
iv) Rename all newly copied files involved (created on the day) from *.log to *B.log
c) On original server, using DOS command prompt:
i) Go to folder where Database Loader files exist.
ii) COPY *A.log + *B.log *.log
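When gathering the binary loader files for a move, only files "created on day" are wanted. The sketch below lists candidate *bin*.log and *bin*.pos files in a folder. It uses the last-modified date as a stand-in for the creation day, which is an assumption, and the function name and parameters are illustrative. The folder path is passed in by the caller because the drive is not specified here.

```python
from datetime import date, datetime
from pathlib import Path


def todays_loader_files(data_dir, patterns=("*bin*.log", "*bin*.pos"), day=None):
    """Return loader files in data_dir matching patterns and last modified on `day`."""
    day = day or date.today()
    data_dir = Path(data_dir)
    matches = set()
    for pattern in patterns:
        for path in data_dir.glob(pattern):
            if not path.is_file():
                continue
            modified = datetime.fromtimestamp(path.stat().st_mtime).date()
            if modified == day:
                matches.add(path)
    return sorted(matches)
```

The returned list can be reviewed before copying, so files from prior days are not moved by mistake.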
Binary DBL NTM Control Commands:
See Database Loader Reject Processing Documentation for NTM Control Commands utilized for this purpose.
Database Loader Options:
- Use NTM Control Utility – Utilities – Database Loader Options.
- User will see list of reject files to be processed, if there are any.
- For any reject files the user wishes to process, they highlight the file and then right-click and select Create New Reload File.
- Once a reject file has been processed, a reload file will be shown.
- For any reload file the user wishes to process, they highlight the file and then right-click and select Replay Reload File.
Binary DBL Troubleshooting Table:
Binary DBL Symptom |
Impact |
Response |
There are Database Loading Rejects. Evidenced by: - Stats monitor shows non-zero values in reject column. - Oracle reject errors are seen in EMT. |
Historical data will not be retained as expected, nor be available to applications that may need to act further against the data. |
1) Use Database Loader Reject Replay Procedures to examine reasons for rejects and replay data as appropriate. |
Binary DBL Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes read data files and load the records into the
databases. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Reject File Size column shows positive value (non-zero). |
Rejects have occurred. |
|||
Percent Complete and Records Remaining columns showing non-zero
values and not reducing as expected. Insert Rate is zero or lower than
expected rate. |
Process is either down or hung, or database is not responding. |
APP20 Purpose:
APP20 processes receive orders and order related information from order sending firms and send to PROCESSINGs.
They then receive related responses from the PROCESSINGs and send these back to order senders.
Use the
following hyperlinks to jump to the desired section of APP20 documentation:
APP20_Monitoring_Considerations
APP20 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- When moving between nodes:
o ALTERNATE NODES are only to be used when a given firm is having issues and cannot connect to their PRIMARY NODE.
o DR NODES are to be used when the issues are on our side and the affected firms must be moved to another node.
DR NODES must be allocated un-natted nodes.
No node can support more than one natted address at the same time.
- When stopping/restarting APP20 processes:
1) Notify the associated firm and work in cooperation with them, as appropriate to situations.
2) Notify Tech Services if moving APP20 processes to DR nodes and NAT addresses need to change to accommodate move.
3) Stop the APP20 process.
4) If NOT moving APP20 to new node, skip to step 5.
a) If moving the APP20 to a new node, copy the day’s APP20 PROCESSOR files to alternate node:
i) Copy :\chx\data\{APP20}\*.conf, *.in, *.ndx.in, *.out, *.ndx.out files for each APP20 process moving to the alternate node.
ii) If the target folder does not exist on the new node, you must first either create the folder, or copy the entire folder.
iii) If these files are not moved before the process restarts on the new node, there will be a chance of sequence number miscommunications between the order sending firm involved and CHX.
iv) If moving an APP10 associated APP20 (or APP26) process, also see APP10 recovery considerations documentation.
5) Start the APP20 process.
6) Open the channel for the process affected and confirm order sending firm connects as expected.
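The file copy in step 4 can be sketched as a helper that creates the target folder when it is missing and copies only the listed file types. This is an illustration under stated assumptions: the function name is hypothetical, both folder paths are supplied by the caller (the drive letter is not given in this document), and files are de-duplicated because *.in also matches *.ndx.in.

```python
import shutil
from pathlib import Path

# File types listed in step 4 for an APP20 process.
APP20_PATTERNS = ("*.conf", "*.in", "*.ndx.in", "*.out", "*.ndx.out")


def copy_processor_files(source_dir, target_dir, patterns=APP20_PATTERNS):
    """Copy matching files from source_dir to target_dir; return the names copied."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    # Create the target folder first if it does not exist on the new node.
    target_dir.mkdir(parents=True, exist_ok=True)
    names = set()
    for pattern in patterns:
        names.update(p.name for p in source_dir.glob(pattern) if p.is_file())
    for name in sorted(names):
        shutil.copy2(source_dir / name, target_dir / name)  # copy2 keeps timestamps
    return sorted(names)
```

Run it only after the APP20 process has been stopped (step 3), so the sequence number files are not still being written while they are copied.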
Open OSF Channel:
- Use NTM Control Utility – Service Control - APP20 – Open OSF Channel to make OSF connection to APP20 possible.
Close OSF Channel:
- Use NTM Control Utility – Service Control - APP20 – Close OSF Channel to make OSF connection to APP20 impossible.
Set Inbound Sequence Number:
- Use NTM Control Utility – Service Control - APP20 – Set Inbound Sequence Number to set Inbound Sequence Number.
Set Outbound Sequence Number:
- Use NTM Control Utility – Service Control - APP20 – Set Outbound Sequence Number to set Outbound Sequence Number.
Enable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP20 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.
Disable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP20 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.
APP20 Symptom |
Impact |
Response |
Firm disconnects or logs out of session. Evidenced by: - EMT message saying {firm} is disconnected and/or {firm} is logged out. - Stats monitor shows disconnected in status column. {Firm} will be further identified in EMT message by including “LocalFixId” and “RemoteFixID” as configured in APP20Services.xml file within the disconnect message. |
Firm is no longer able to send or receive order or order related messages with PROCESSING. |
1) Contact firm. 2) Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them. 3) Stop/Restarts of affected application service may help resolve the issue. |
Firm wants to force all orders from service to be canceled. Evidenced by: - Firm calling and requesting all orders be canceled. |
The affected firm wants their risk mitigated by not leaving any open orders in the PROCESSING. |
1) Stopping the firm’s APP20 service(s) will also force a “Send AllOAPP39xlReq” message to all PROCESSINGs, canceling all of the affected APP20’s open orders. 2) Use the NTM Control Utility APP32 option to “Forcibly Cancel Orders by Firm” if the firm wants all orders canceled, regardless of which APP20 service it may have been sent through. |
APP20 Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes
facilitate communications from order sending firms directly to PROCESSINGs. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being received by monitor. |
1) Check status of process |
Status is Disconnected or Open. |
Firm is not connected. |
1) Use NTM Control Utility APP20 Service Controls to Open
Channels. |
|||
Status is Inactive. |
29 West communications have been disabled between the PROCESSOR and the PROCESSING. |
1) Check status of process |
|||
Write Queue is a non-zero value and is not reducing as expected. |
Firm may not be processing as expected. |
Stats Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate communications from order sending
firms directly to PROCESSINGs. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
No data is displayed. |
No data has been generated by order sending firm. |
|||
InCount is not less than or equal to OutCount. |
Firm may not be receiving all PROCESSING responses
expected. |
Stats Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate
communications from order sending firms directly to PROCESSINGs. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Statistics are not shown for APP20 process as expected. |
Process may not be up, or 29 West Stats have not yet been
enabled for process. |
|||
Rate and/or MsgCount values are not incrementing as
expected. |
Firms may not be receiving messages as expected. |
Stats Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate
communications from order sending firms directly to PROCESSINGs. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Statistics are not shown for APP20 process as expected. |
Process may not be up, or 29 West Stats have not yet been
enabled for process. |
|||
Msgs_rcved values are not incrementing as expected. |
Firms may not be receiving messages as expected. |
|||
Lost-unrecovered values are non-zero. |
Firms may not be receiving messages as expected. |
APP21 Purpose:
APP21 processes receive order and execution drop copies from APP22 processes and send them to External Vendors or MESSAGING Services.
They then receive related responses from these services for appropriate error handling.
Use the
following hyperlinks to jump to the desired section of APP21 documentation:
APP21_Monitoring_Considerations
APP21 Recovery Considerations:
Stopping/Restart
Processes:
-
Use NTM Control Utility - Service Control -
Process Controller to stop/restart processes.
-
When moving between nodes:
o
ALTERNATE NODES are not defined for DSCF
services.
o
DR NODES must be allocated un-natted nodes.
No node can support more than one natted address at the same time.
- When stopping/restarting APP03 processes:
1) Notify the associated vendor or MESSAGING service firm and work in cooperation with them, as appropriate to situations.
2) Notify Tech Services if moving APP21 processes to DR nodes and NAT addresses need to change to accommodate move.
3) Stop the DSCF process.
4) If NOT moving APP21 to a new node, skip to step 5. If moving APP21 to a new node, copy the day’s APP03 PROCESSOR files to the alternate node:
a) Copy the \chx\data\{APP21}\*.log, *.body, *.header, *.seqnums and *.session files created for the day to the alternate node.
b) Copy the \chx\data\{APP21}\Global.* files created for the day to the alternate node.
c) If the target folder does not exist on the new node, first either create the folder or copy the entire folder.
d) If these files are not moved before the process restarts on the new node, sequence numbers may become mismatched between the order sending firm involved and CHX.
5) Start the DSCF process.
6) Open the channel for the affected process and confirm that the order sending firm connects as expected.
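The file copy in the procedure above could be scripted along the following lines. This is a sketch under stated assumptions, not an existing tool: the function name copy_days_files is hypothetical, the source and target folders are passed in by the operator, and "created for the day" is interpreted here as "last modified on the given date".

```python
import shutil
from datetime import date, datetime
from pathlib import Path
from typing import List

# File patterns named in the copy steps of the procedure.
DAY_FILE_PATTERNS = ("*.log", "*.body", "*.header", "*.seqnums", "*.session", "Global.*")


def copy_days_files(source_dir: Path, target_dir: Path, day: date) -> List[Path]:
    """Copy the day's session files to the alternate node's folder."""
    # Create the target folder first if it does not exist.
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    seen = set()
    for pattern in DAY_FILE_PATTERNS:
        for src in sorted(source_dir.glob(pattern)):
            # A file such as Global.log matches two patterns; copy it once.
            if src in seen or not src.is_file():
                continue
            # Assumption: "created for the day" means modified on that date.
            modified = datetime.fromtimestamp(src.stat().st_mtime).date()
            if modified != day:
                continue
            destination = target_dir / src.name
            shutil.copy2(src, destination)  # copy2 preserves timestamps
            seen.add(src)
            copied.append(destination)
    return copied
```

Run the copy only after the process has been stopped, and compare the returned list against the source folder before starting the process on the new node.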
Open OSF Channel:
- Use NTM Control Utility – Service Control - APP21 – Open OSF Channel to make OSF connection possible.
Close OSF Channel:
- Use NTM Control Utility – Service Control - APP21 – Close OSF Channel to make OSF connection impossible.
Set Inbound Sequence Number:
- Use NTM Control Utility – Service Control - APP21 – Set Inbound Sequence Number to set Inbound Sequence Number.
Set Outbound Sequence Number:
- Use NTM Control Utility – Service Control - APP21 – Set Outbound Sequence Number to set Outbound Sequence Number.
Enable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP21 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.
Disable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP21 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.
APP21 Symptom |
Impacts |
Response |
Firm disconnects or logs out of session. Evidenced by: - EMT message saying {firm} is disconnected and/or {firm} is logged out. - Stats monitor shows disconnected in status column. The disconnect message further identifies {Firm} by including the “LocalFixId” and “RemoteFixID” configured in the APP01Services.xml file. |
Drop copy firm/vendor is no longer able to receive drop copy related messages. |
1) Contact firm. 2) Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them. 3) Stopping/restarting the affected application service may help resolve the issue. |
APP21 Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate communications from (PROCESSINGs
via APP40 and APP22 processes) to Order Sending Firm's Drop Copy
Destinations. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
|
|
|
FixConnect is Disconnected or Open |
Firm is not connected. |
|
|
|
MsgIn values are not equal to or greater than MsgOut
values. |
MESSAGINGs may not be as expected. |
APP22 Purpose:
MESSAGING Routers receive drop copy related orders and MESSAGES 2 from the PROCESSING (APP40) or from MESSAGING.
MESSAGING Routers send drop copy related orders and MESSAGES 2 to appropriate order sending firms/vendors.
MESSAGING Routes are configured such that:
- the DC1 instance sends drop copies for DC1 PROCESSINGs,
- the DC2 instance sends drop copies for DC2 PROCESSINGs.
Use the
following hyperlinks to jump to the desired section of APP22 documentation:
APP22_Monitoring_Considerations
APP22 Recovery Considerations:
Stopping/Restart
Processes:
-
Use NTM Control Utility - Service Control -
Process Controller to stop/restart processes.
- Use APP22 nodes only when moving between nodes. (No real dependencies other than expected processing sites.)
- When stopping/restarting APP22 processes:
1) Notify the Drop Copy firms and vendors that drop copies will be interrupted.
2) Stop/Restart the APP22 process.
- There are no APP22 specific NTM Control Commands.
APP22 Symptom |
Impacts |
Response |
Node Crashes Evidenced by: - In Solarwinds (and outlook), node and processes will be reported down. |
Drop Copy firms/vendors will not be able to receive drop copies. Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Notify Management. |
APP22 Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
||||
Processes facilitate communications from PROCESSINGs
(via APP40 and APP22 processes) to Order Sending Firm's Drop Copy
Destinations. Monitor shows 29 West statistics (by topic) for APP21
processes, receiving from APP22 processes. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
||||
|
|
Rate and/or MsgCount values are not incrementing as
expected. |
Firms may not be receiving messages as expected. |
|||||
|
|
|
When matched by topic_name, Rcv MsgCount in this monitor does
not match the total Src MsgCount values in the APP40_LBM Stats monitor. |
Firms may not be receiving messages as expected. |
||||
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
||||
Processes facilitate communications from PROCESSINGs
(via APP40 and APP22 processes) to Order Sending Firm's Drop Copy
Destinations. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
||||
|
|
|
Msgs_rcved values are not incrementing as expected. |
Firms may not be receiving messages as expected. |
||||
|
|
|
Lost-unrecovered values are non-zero. |
Firms may not be receiving messages as expected. |
||||
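The topic_name comparison described in the APP22 monitor tables above (Rcv MsgCount against the total Src MsgCount for the same topic) can be sketched as follows. The function name and the (topic_name, count) row layout are assumptions for illustration; the playbook does not define an export format for either monitor. Counts read while messages are in flight can differ briefly, so a mismatch is worth re-checking before acting on it.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

Row = Tuple[str, int]  # (topic_name, MsgCount), a hypothetical export of one monitor row


def find_topic_mismatches(src_rows: Iterable[Row], rcv_rows: Iterable[Row]) -> List[Tuple[str, int, int]]:
    """Return (topic_name, total Src MsgCount, Rcv MsgCount) for topics that disagree."""
    # Total the source counts per topic, since several sources may publish one topic.
    src_totals = defaultdict(int)
    for topic_name, msg_count in src_rows:
        src_totals[topic_name] += msg_count
    mismatches = []
    for topic_name, rcv_count in rcv_rows:
        expected = src_totals.get(topic_name, 0)
        if rcv_count != expected:
            mismatches.append((topic_name, expected, rcv_count))
    return mismatches
```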
APP23 Purpose:
APP23 PROCESSOR receives MESSAGING messages from APP09 and sends them to APP23 for actual MESSAGING.
They then receive related responses from these
services for error handling.
Use the
following hyperlinks to jump to the desired section of APP23 documentation:
APP23_Monitoring_Considerations
APP23 Recovery
Considerations:
Stopping/Restart
Processes:
-
Use NTM Control Utility - Service Control -
Process Controller to stop/restart processes.
- Use APP36 nodes only when moving between nodes. (APP36 NAT addresses must be configured/used by APP23.)
- When stopping/restarting APP23 processes:
1) Notify APP23 and work in cooperation with them, as appropriate to situations.
2) Stop the APP23 process.
3) If NOT moving APP23 to a new node, skip to step 4. If moving APP23 to a new node, copy the day’s APP23 PROCESSOR files to the alternate node:
a) Copy the \chx\data\APP23\*.log, *.body, *.header, *.seqnums and *.session files created for the day to the alternate node.
b) Copy the \chx\data\APP23\Global.* files created for the day to the alternate node.
c) If the target folder does not exist on the new node, first either create the folder or copy the entire folder.
d) If these files are not moved before the process restarts on the new node, sequence numbers may become mismatched between the order sending firm involved and CHX.
4) Start the APP23 process.
5) Open the channel for the affected process and confirm that the order sending firm connects as expected.
Open OSF Channel:
- Use NTM Control Utility – Service Control - APP23 – Open OSF Channel to make OSF connection possible.
Close OSF Channel:
- Use NTM Control Utility – Service Control - APP23 – Close OSF Channel to make OSF connection impossible.
Set Inbound Sequence Number:
- Use NTM Control Utility – Service Control - APP23 – Set Inbound Sequence Number to set Inbound Sequence Number.
Set Outbound Sequence Number:
- Use NTM Control Utility – Service Control - APP23 – Set Outbound Sequence Number to set Outbound Sequence Number.
Enable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP23 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.
Disable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP23 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.
APP23 Troubleshooting
Table:
APP23 Symptom |
Impacts |
Response |
Firm disconnects or logs out of session. Evidenced by: - EMT message saying {firm} is disconnected and/or {firm} is logged out. - Stats monitor shows disconnected in status column. The disconnect message further identifies {Firm} by including the “LocalFixId” and “RemoteFixID” configured in the APP23Services.xml file. |
APP23 is no longer able to receive MESSAGING messages. |
1) Contact firm. 2) Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them. 3) Stopping/restarting the affected application service may help resolve the issue. |
APP23
Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate MESSAGING trade delivery from
processes to APP23. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
Status is Disconnected or Open |
Firm is not connected. |
|||
OutTAPP39apRpts
value does not match OutCnt value in RTC01 to APP23 App Queues monitor. |
We may not be processing as expected. |
APP24 Purpose:
APP24 PROCESSOR receives order messages from MESSAGING and sends them to APP24 for actual trading.
They then receive related responses from these
services for processing and/or error handling.
Use the
following hyperlinks to jump to the desired section of APP24 documentation:
APP24_Monitoring_Considerations
APP24 Recovery Considerations:
Stopping/Restart
Processes:
-
Use NTM Control Utility - Service Control -
Process Controller to stop/restart processes.
- Use APP36 nodes only when moving between nodes. (APP36 NAT addresses must be configured/used by APP24.)
- When stopping/restarting APP24 processes:
1) Notify APP24 and work in cooperation with them, as appropriate to situations.
2) Stop the APP24 process.
3) If NOT moving APP24 to a new node, skip to step 4. If moving APP24 to a new node, copy the day’s APP24 PROCESSOR files to the alternate node:
a) Copy the \chx\data\{APP24}\*.log, *.body, *.header, *.seqnums and *.session files created for the day to the alternate node.
b) Copy the \chx\data\{APP24}\Global.* files created for the day to the alternate node.
c) If the target folder does not exist on the new node, first either create the folder or copy the entire folder.
d) If these files are not moved before the process restarts on the new node, sequence numbers may become mismatched between the order sending firm involved and CHX.
4) Start the APP24 process.
5) Open the channel for the affected process and confirm that the order sending firm connects as expected.
Open OSF Channel:
- Use NTM Control Utility – Service Control - TRF – Open OSF Channel to make OSF connection possible.
Close OSF Channel:
- Use NTM Control Utility – Service Control - TRF – Close OSF Channel to make OSF connection impossible.
Set Inbound Sequence Number:
- Use NTM Control Utility – Service Control - TRF – Set Inbound Sequence Number to set Inbound Sequence Number.
Set Outbound Sequence Number:
- Use NTM Control Utility – Service Control - TRF – Set Outbound Sequence Number to set Outbound Sequence Number.
Enable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - TRF – Enable THIRD PARTY Stats to start collection and display of LBM related stats.
Disable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - TRF – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.
APP24 Troubleshooting Table:
APP24 Symptom |
Impacts |
Response |
Firm disconnects or logs out of session. Evidenced by: - EMT message saying {firm} is disconnected and/or {firm} is logged out. - Stats monitor shows disconnected in status column. The disconnect message further identifies {Firm} by including the “LocalFixId” and “RemoteFixID” configured in the APP24Services.xml file. |
APP24 is no longer able to receive order messages or return responses to MESSAGING. |
1) Contact firm. 2) Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them. 3) Stopping/restarting the affected application service may help resolve the issue. |
APP24 Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate communications from to
FINRA/NASDAQ Trade Reporting Facility Destinations. |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
|
|
|
Status is Disconnected or Open |
Firm is not connected. |
|
|
|
InCount does not match OutCount. |
We may not be processing as expected. |
APP25 Purpose:
The APP25 process reads order messages received from APP01 and APP26 processes and loads the data into the databases via loaders.
APP25 Recovery
Considerations:
Stopping/Restart
Processes:
-
Use NTM Control Utility - Service Control -
Process Controller to stop/restart processes.
- Use APP25 nodes only when moving between nodes. (No real dependencies outside of expected processor.)
APP25 NTM Control
Commands:
See Binary_DBL_Monitoring_Considerations
There are no APP25 specific NTM Control Commands outside of Binary Database Loader commands.
APP25 Troubleshooting Table:
APP25 Symptom |
ImpaAPP13 |
Response |
Node Crashes Evidenced by: - In EMT, applications report lost communications to APP25 service. - In Solarwinds (and outlook), node and processes will be reported down. |
-Post trade processing will not include APP01 and APP26 order data. Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Restart affected APP25 processes on alternate nodes. 3) Notify Management and PTT. |
APP25 Monitoring
Considerations:
See APP01_Monitoring_Considerations, APP26_Monitoring_Considerations, Binary_DBL_Monitoring_Considerations
There are no APP25 specific monitors outside of EMT and the related APP01, APP26 and Binary Database Loader monitors.
APP26 Purpose:
APP26 PROCESSOR receives order messages from order sending firms and sends them to MESSAGING via APP37 for actual trading.
APP26 PROCESSORs can also send to APP01, TRF or PROCESSINGs if firms route them there using fix tags, but typically they are used for MESSAGING.
They then receive related responses from these
services for processing and/or error handling.
Use the
following hyperlinks to jump to the desired section of APP26 documentation:
APP26_Monitoring_Considerations
APP26 Recovery Considerations:
Stopping/Restart
Processes:
-
Use NTM Control Utility - Service Control -
Process Controller to stop/restart processes.
- When moving between nodes:
1) ALTERNATE NODES are not defined for APP26 services.
2) DR NODES must be allocated un-natted nodes.
No node can support more than one natted address at the same time.
- When stopping/restarting APP26 processes:
1) Notify order sending firms involved and work in cooperation with them, as appropriate to situations.
2) Stop the APP26 process.
3) If NOT moving APP26 to a new node, skip to step 4. If moving APP26 to a new node, copy the day’s APP26 PROCESSOR files to the alternate node:
a) Copy the \chx\data\{APP26}\*.log, *.body, *.header, *.seqnums and *.session files created for the day to the alternate node.
b) Copy the \chx\data\{APP26}\Global.* files created for the day to the alternate node.
c) If the target folder does not exist on the new node, first either create the folder or copy the entire folder.
d) If these files are not moved before the process restarts on the new node, sequence numbers may become mismatched between the order sending firm involved and CHX.
4) Start the APP26 process.
5) Open the channel for the affected process and confirm that the order sending firm connects as expected.
Open OSF Channel:
- Use NTM Control Utility – Service Control - APP26 – Open OSF Channel to make OSF connection possible.
Close OSF Channel:
- Use NTM Control Utility – Service Control - APP26 – Close OSF Channel to make OSF connection impossible.
Set Inbound Sequence Number:
- Use NTM Control Utility – Service Control - APP26 – Set Inbound Sequence Number to set Inbound Sequence Number.
Set Outbound Sequence Number:
- Use NTM Control Utility – Service Control - APP26 – Set Outbound Sequence Number to set Outbound Sequence Number.
Enable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP26 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.
Disable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP26 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.
APP26 Symptom |
Impacts |
Response |
Firm disconnects or logs out of session. Evidenced by: - EMT message saying {firm} is disconnected and/or {firm} is logged out. - Stats monitor shows disconnected in status column. The disconnect message further identifies {Firm} by including the “LocalFixId” and “RemoteFixID” configured in the APP26Services.xml file. |
APP26 is no longer able to receive order messages or return responses to order sending firms. |
1) Contact firm. 2) Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them. 3) Stopping/restarting the affected application service may help resolve the issue. |
APP26 Monitoring Considerations:
Stats
Monitors: |
To Start: |
Key Indicators to
Monitor: |
Symptom: |
Response: |
Processes facilitate communications from Order
Sending Firms to PROCESSOR for special routing (via APP37 to MESSAGING, SITE
2s or PROCESSINGs). |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast data is not being
received by monitor. |
|
|
|
Status is Disconnected or Open |
Firm is not connected. |
|
|
|
Write Queue value is non-zero and not decreasing as
expected. |
Firm may not be processing as expected. |
Stats Monitors: |
To Start: |
Key Indicators to Monitor: |
Symptom: |
Response: |
Processes facilitate
communications from Order Sending Firms to PROCESSOR for special routing (via
APP37 to MESSAGING, SITE 2s or PROCESSINGs). |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or
multicast data is not being received by monitor. |
No data is displayed. |
No data has been generated between
Order Sending Firm and PROCESSOR. |
|||
Aggregate InCount (by service
name) is not less than or equal to aggregate OutCount (by service name). |
Firm may not be processing as
expected. |
Stats Monitors: |
To Start: |
Key Indicators to Monitor: |
Symptom: |
Response: |
Processes facilitate
communications from Order Sending Firms to PROCESSOR for special routing (via
APP37 to MESSAGING, SITE 2s or PROCESSINGs). |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or
multicast data is not being received by monitor. |
Not all APP01 processes are
displayed as expected. |
APP01 Service has not been
started. |
|||
Msgs In and/or Msgs Out are zero. |
No messages have been
sent/received since that monitor has been started. |
Stats Monitors: |
To Start: |
Key Indicators to Monitor: |
Symptom: |
Response: |
Processes facilitate
communications from Order Sending Firms to PROCESSOR for special routing (via
APP37 to MESSAGING, SITE 2s or PROCESSINGs). |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast
data is not being received by monitor. |
Not all APP26 or APP37 processes
are displayed as expected. |
APP26 or APP37 Service has not
been started or hasn't processed any messages since monitor has been started. |
|||
Status is not CONNECTED. |
Messages cannot be sent from
source to destination unless IPC channel is connected. |
|||
Queue size is non-zero value and
not decreasing as expected. |
Messages cannot be sent from
source to destination unless IPC channel is connected. |
Stats Monitors: |
To Start: |
Key Indicators to Monitor: |
Symptom: |
Response: |
Processes facilitate
communications from Order Sending Firms to PROCESSOR for special routing (via
APP37 to MESSAGING, SITE 2s or PROCESSINGs). |
PROD MENU: |
- Color of data in columns |
Data is RED. |
Process is either down or multicast
data is not being received by monitor. |
Not all APP26 or APP25 processes
are displayed as expected. |
APP26 or APP25 Service has not
been started or hasn't processed any messages since monitor has been started. |
|||
Status is not CONNECTED. |
Messages cannot be sent from
source to destination unless IPC channel is connected. |
|||
Queue size is non-zero value and
not decreasing as expected. |
Messages cannot be sent from
source to destination unless IPC channel is connected. |
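The queue and status symptoms in the monitor tables above (status is not CONNECTED; queue size is non-zero and not decreasing) can be sketched as a check over a short series of readings taken from one channel. The function names are hypothetical, and how many readings to take, and how far apart, is left to the operator.

```python
from typing import Sequence


def queue_is_stuck(samples: Sequence[int]) -> bool:
    """True when every reading is non-zero and the queue never shrinks between readings."""
    if len(samples) < 2:
        return False  # a single reading cannot show a trend
    if any(size == 0 for size in samples):
        return False  # the queue drained at some point
    return all(later >= earlier for earlier, later in zip(samples, samples[1:]))


def channel_needs_attention(status: str, samples: Sequence[int]) -> bool:
    """Flag a channel whose status is not CONNECTED or whose queue is stuck."""
    return status != "CONNECTED" or queue_is_stuck(samples)
```

For example, readings of 5, 5, 6 on a connected channel are flagged, while 5, 3, 1 are not because the queue is draining.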
APP27 Purpose:
The APP27 process reads ACTIVITY messages received from APP31 processes and loads the data into the databases via loaders.
APP27 Recovery
Considerations:
Stopping/Restart
Processes:
-
Use NTM Control Utility - Service Control -
Process Controller to stop/restart processes.
- Use APP27 nodes only when moving between nodes. (No real dependencies outside of expected processor.)
APP27 NTM Control
Commands:
See Binary_DBL_Monitoring_Considerations
There are no APP27 specific NTM Control Commands outside of Binary Database Loader commands.
APP27 Troubleshooting Table:
APP27 Symptom |
Impacts |
Response |
Node Crashes Evidenced by: - In EMT, applications report lost communications to APP27 service. - In Solarwinds (and outlook), node and processes will be reported down. |
-PROCESSING will not have all activities necessary to define proper sequence numbering if the PROCESSING needs to be restarted. Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Restart affected APP27 processes on alternate nodes. 3) Notify Management and PTT. |
APP27 Monitoring
Considerations:
See Binary_DBL_Monitoring_Considerations
There are no APP27 specific monitors outside of EMT and the related Binary Database Loader monitors.
APP28 Purpose:
The RISK process group is comprised of two application service types: APP28 and APP08.
Together, these service types process risk management instructions as defined by order sending firms and/or MESSAGING firms.
APP08 (COMMUNICATION) manages communication between the RISK GUI application (on a Linux server) and the RISK process.
APP28 (RISK) manages risk management communication between the APP08, the PROCESSINGs, MESSAGING and the database.
There are no APP28 users as of the last update to this documentation.
There are no APP28 or APP08 monitoring tools outside of EMT messaging.
There are no APP28 or APP08 NTM Commands.
APP28 Recovery
Considerations:
Stopping/Restart
Processes:
-
Use NTM Control Utility - Service Control -
Process Controller to stop/restart processes.
- Use APP28 nodes only when moving between nodes. (RISK GUI communications require host specific WebService.War files.)
- Use the following hyperlink to access recovery procedures for RISK applications:
APP28 Troubleshooting Table:
APP28 Symptom |
ImpaAPP13 |
Response |
Node Crashes Evidenced by: - In Solarwinds (and outlook), node and processes will be reported down. |
Since there are no actual RISK users, there should be no operational impaAPP13 outside of the loss of a server. And in default configurations, the RISK applications are the only applications running on this server. |
4) Notify Management. |
APP29 Purpose:
APP29 allows users to query, enter, modify and inactivate trading system support records. All writes are directly to the database.
MNT Recovery
Considerations:
Stopping/Restart
Processes:
-
Use NTM Control Utility - Service Control -
Process Controller to stop/restart processes.
- Use MNT nodes only when moving between nodes. (Java code, FireDaemon and Host Specific references in JNLPs required.)
- When stopping/restarting MNT processes:
1) Notify Operations and work in cooperation with them, as appropriate to situations.
2) Stop/Restart the MNT process.
There are no APP29 specific NTM Control commands.
MNT Troubleshooting Table:
MNT Symptom |
Impacts |
Response |
Node Crashes Evidenced by: - In Solarwinds (and outlook), node and processes will be reported down. |
Operations will not be able to query or administratively manage trading system support records via APP29. Refer to: APPGROUP08 or APPGROUP09 Server Server Specific Recoveries. |
1) Refer to: APPGROUP08 or APPGROUP09 Server Server Specific Recoveries. 2) Notify Management. |
MNT Monitoring
Considerations:
There are no APP29 specific monitors outside of EMT.
APP30 Purpose:
The APP30 process group is comprised of three separate application service types: APP02, SUBGROUP02 and SUBGROUP03.
Together, these service types read inbound MESSAGING from SITE 2 and NASDAQ via APP31 processes and load the data into the databases via loaders.
APP02 Processes read SITE 2 and NASDAQ BBO Duration data
from APP31 (APP11/SUBGROUP02) processes and load it into databases via APP19.
SUBGROUP02 Processes read SITE 2 and NASDAQ Lastsale data
from APP31 (APP13/SUBGROUP04) processes and load it into databases via APP19.
SUBGROUP03 Processes read SITE 2 and NASDAQ Quote Montage data from APP31 (APP11/SUBGROUP02) processes and load it into databases via APP19.
Use the
following hyperlinks to jump to the desired section of APP30 documentation:
APP30_Monitoring_Considerations
APP30 Recovery Considerations:
Stopping/Restart
Processes:
-
Use NTM Control Utility - Service Control -
Process Controller to stop/restart processes.
- Use APP30 nodes only when moving between nodes. (APP30 processes use far more disk space than other applications.)
- APP30 processes do not have DR nodes defined; these processes do not move between data centers.
- Restart these processes as quickly as possible on alternate nodes. The longer the processes are down, the worse the impacts on Post Trade Processing.
See Binary_DBL_Monitoring_Considerations
There are no APP02, SUBGROUP02 or SUBGROUP03 specific NTM Control Commands outside of Binary Database Loader commands.
APP30 Symptom |
Impacts |
Response |
Node Crashes Evidenced by: - In EMT, applications report lost communications to APP30s - In Solarwinds (and outlook), node and processes will be reported down. |
For APP02, SUBGROUP02 and/or SUBGROUP03: -Post trade processing will not include all quote related MESSAGING. Refer to: APP30 (APP02/SUBGROUP02/SUBGROUP03) Server Specific Recoveries. |
1) Refer to: APP30 (APP02/SUBGROUP02/SUBGROUP03) Server Specific Recoveries. 2) Restart affected APP30 processes on alternate nodes. 3) Notify Management and PTT. |
APP30 Monitoring Considerations:
See Binary_DBL_Monitoring_Considerations
There are no APP02, SUBGROUP02 or SUBGROUP03 specific monitors outside of EMT and the Binary Database Loader monitors for each.
APP31 Purpose:
The APP31 process group is comprised of four separate application service types: APP11, APP13, SUBGROUP02, SUBGROUP04.
Together, these service types read inbound MESSAGING distributed by SITE 2 and NASDAQ and forward it to PROCESSINGs and APP19.
APP11 Processes read SITE 2 quote data over multicast
and send it to PROCESSINGs and quote montage readers.
SUBGROUP02 Processes read NASDAQ quote data over
multicast and send it to PROCESSINGs and quote montage readers.
APP11 and SUBGROUP02 processes also calculate Best Bid Offer Duration values to be sent to MESSAGINGs.
APP13 Processes read SITE 2 lastsale data over multicast and send it to PROCESSINGs and lastsale montage readers.
SUBGROUP04 Processes read NASDAQ lastsale data over multicast and send it to PROCESSINGs and lastsale montage readers.
Use the
following hyperlinks to jump to the desired section of APP31 documentation:
APP31_Monitoring_Considerations
APP31 Recovery Considerations:
Stopping/Restart
Processes:
-
Use NTM Control Utility - Service Control -
Process Controller to stop/restart processes.
- Use APP31 nodes only when moving between nodes. (APP31 interfaces must be configured/enabled.)
- APP31 processes do not move between data centers. (Only “alternate” nodes are defined in service tables.)
- When stopping/restarting APP31 instances:
1) Stop/Start “A” series and “B” series separately to avoid suspending trading in PROCESSINGs.
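The "one series at a time" rule above can be sketched as a small loop that restarts a series, confirms it is healthy, and only then moves on. This is an illustration only: restart and is_healthy are hypothetical callables supplied by the operator (for example, wrappers around manual NTM Control Utility steps and a Stats monitor check), not functions of any existing tool.

```python
from typing import Callable, Iterable, List


def restart_series_separately(
    series_names: Iterable[str],
    restart: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Restart each series in turn, stopping if one does not come back healthy."""
    completed = []
    for name in series_names:
        restart(name)
        # Do not touch the next series until this one is confirmed healthy,
        # so that at least one series keeps feeding the PROCESSINGs.
        if not is_healthy(name):
            raise RuntimeError(
                f"Series {name} did not recover; remaining series left running"
            )
        completed.append(name)
    return completed
```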
APP31 NTM Control Commands:
Enabling/Disabling Channel Readers:
- Use NTM Control Utility – Service Control - APP31 - Control Multicast Readers.
- User may have to use REFRESH button multiple times to see all APP31 processes/channels.
- Select (and highlight) desired channels and right click to see and select desired options, including:
1) Start Primary Reader
2) Start Alternate Reader
3) Start Both Readers
4) Stop Primary Reader
5) Stop Alternate Reader
6) Stop Both Readers
Flush Message Queue:
- Use NTM Control Utility – Service Control - APP31 – Flush Message Queue.
- User may have to use REFRESH button multiple times to see all APP31 processes/channels.
- Select (and highlight) desired channels and right click to see and select Flush Internal Msg Queue.
Reset Multicast Sequence Number:
- Use NTM Control Utility – Service Control - APP31 – Reset Multicast Sequence Number.
- User may have to use REFRESH button multiple times to see all APP31 processes/channels.
-
Select (and highlight) desired channels and right
click to see and select Reset Multicast Sequence Number.
APP31 Troubleshooting Table:
APP31 Symptom |
Impacts |
Response |
Nasdaq moves to DR site (CRITICAL) Evidenced by: - Stats monitor shows zero NASDAQ primary data received in both data centers and zero NASDAQ alternate data received in DC2, but NASDAQ alternate data processed in DC1. |
For SUBGROUP02: -PROCESSING trading without quote related MESSAGING for NASDAQ issues in DC2 only. -Post trade processing will not include all quote related MESSAGING. For SUBGROUP04: -PROCESSING trading without lastsale related MESSAGING for NASDAQ issues in DC2 only. -Post trade processing will not include all lastsale related MESSAGING. See Generalized Recovery Scenarios for more: Trading must be halted in affected issues if problem goes on too long. |
1) Refer to: CHX_Cannot_Process_MESSAGES 1_From_SITE 2s and CHX_Cannot_Process_MESSAGES 2_From_SITE 2s Generalized Recovery Scenarios. For SUBGROUP02: 1) Production Support must replace CHXAPPCFG APP31 SUBGROUP02_Services file with NASDAQ DR version. 2) SUBGROUP02 Services in DC2 must be stopped/restarted. - See APP31_Recovery_Considerations. For SUBGROUP04: 1) Production Support must replace CHXAPPCFG APP31 SUBGROUP04_Services file with NASDAQ DR version. 2) SUBGROUP04 Services in DC2 must be stopped/restarted. - See APP31_Recovery_Considerations. For all: 1) Notify management. 2) If trading was halted, resume trading when stable and notify industry. |
No Multicast Data Received (CRITICAL): - BOTH Primary and Alternate Channels. - BOTH “A” and “B” series of Processes. Evidenced by: - In APP31 Stats, zero values seen in rates columns. |
For APP11/SUBGROUP02: -PROCESSING WILL be trading without quote related MESSAGING. -Post trade processing WILL not include all quote related MESSAGING. For APP13/SUBGROUP04: -PROCESSING WILL be trading without lastsale related MESSAGING. -Post trade processing WILL not include all lastsale related MESSAGING. See Generalized Recovery Scenarios for more: Trading must be halted in affected issues if problem goes on too long. |
1) Refer to: CHX_Cannot_Process_MESSAGES 1_From_SITE 2s and CHX_Cannot_Process_MESSAGES 2_From_SITE 2s Generalized Recovery Scenarios. 2) Confirm scope of impacts. In Stats Monitor: - Are problems specific to: - SITE 2 and/or NASDAQ - DC1 and/or DC2 - Servers? Processes? Channels? 3) Notify management. 4) Determine corrective actions. 5) If trading was halted, resume trading when stable and notify industry. |
No Multicast Data Received (NON-CRITICAL): - EITHER Primary or Alternate Channel. - ONLY “A” or “B” series of Processes. Evidenced by: - In APP31 Stats, zero values seen in rates columns. |
For APP11/SUBGROUP02: -PROCESSING MAY be trading without quote related MESSAGING. -Post trade processing MAY not include all quote related MESSAGING. For APP13/SUBGROUP04: -PROCESSING MAY be trading without lastsale related MESSAGING. -Post trade processing MAY not include all lastsale related MESSAGING. |
1) Confirm scope of impact. In Stats Monitor: - Are problems specific to: - SITE 2 and/or NASDAQ - DC1 and/or DC2 - Servers? Processes? Channels? 2) Notify management. 3) Determine corrective actions. |
Sequence gaps reported (NON-CRITICAL) Evidenced by: - In EMT, sequence gaps reported |
Same as No Multicast Data Received (NON-CRITICAL) symptom. |
1) Same as No Multicast Data Received (NON-CRITICAL) symptom. 2) Notify management only in critical situations. |
Dupes reported (NON-CRITICAL) Evidenced by: - In EMT, dupes reported |
Same as No Multicast Data Received (NON-CRITICAL) symptom. |
1) Same as No Multicast Data Received (NON-CRITICAL) symptom. 2) Notify management only in critical situations. |
Market Wide Circuit Breaker (NON-CRITICAL) Evidenced by: - In EMT, Process reports MWCB messages |
Listing Exchanges will Halt Trading in their issues for 15 minutes, and resume when appropriate. NOTE: currently has no exclusively listed issues. |
1) Halt trading in all exclusive issues. 2) Confirm MEs halt trading accordingly and resume trading accordingly in all stocks. - EMT/ER may be useful. - NTM Control Utility APP32 Service Controls “Get Issues Open” and “Get Issues Not Open” may also help. 3) Notify management. |
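The CRITICAL and NON-CRITICAL "No Multicast Data Received" rows above differ only in how many channels and process series show zero rates. A minimal sketch of that rule follows, assuming the rates have already been read from the APP31 Stats monitor by hand or by some other means; the `ChannelRates` structure and the function name are hypothetical and are not part of any APP31 tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelRates:
    """Rates for one process series ("A" or "B"); hypothetical structure."""
    primary_rate: float
    alt_rate: float


def classify_no_multicast(series_a: ChannelRates, series_b: ChannelRates) -> str:
    """Apply the troubleshooting-table rule to the four observed rates.

    CRITICAL: BOTH channels are zero on BOTH series.
    NON-CRITICAL: at least one, but not all, of the rates is zero.
    """
    rates = [
        series_a.primary_rate, series_a.alt_rate,
        series_b.primary_rate, series_b.alt_rate,
    ]
    if all(rate == 0 for rate in rates):
        return "CRITICAL"
    if any(rate == 0 for rate in rates):
        return "NON-CRITICAL"
    return "OK"
```

A single zero reading is only a snapshot; the playbook's own response steps (confirm scope, notify management) still apply before any action is taken.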
APP31 Monitoring Considerations:
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate communications from NASDAQ and SITE 2 SITE 2s to PROCESSINGs and MESSAGING APP19 via MESSAGING Processors. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| Primary Channel and/or Alt Channel are not ENABLED. | Processes are not reading multicast SITE 2 data. |
||| Primary Rate and/or Alt Rate show sustained rate of zero during trading hours. | Multicast feed is not being processed as expected. |
||| Primary Total Msgs and Alt Total Msgs are not showing (relatively) the same numbers of messages processed per process. | Multicast feed is not being processed as expected. |
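The last two symptoms in the table above ("sustained rate of zero" and totals that are not "relatively" the same) can be expressed as simple checks. The sketch below is illustrative only: the 5% tolerance and the three-sample window are assumptions chosen for the example, not values taken from this playbook, and both function names are hypothetical.

```python
from typing import Sequence


def totals_diverge(primary_total: int, alt_total: int, tolerance: float = 0.05) -> bool:
    """True when Primary Total Msgs and Alt Total Msgs differ by more than
    `tolerance` (an assumed 5%) relative to the larger of the two."""
    largest = max(primary_total, alt_total)
    if largest == 0:
        return False  # nothing processed on either channel yet
    return abs(primary_total - alt_total) / largest > tolerance


def sustained_zero(rate_samples: Sequence[float], min_samples: int = 3) -> bool:
    """True when the most recent `min_samples` readings (assumed window) are all zero."""
    recent = rate_samples[-min_samples:]
    return len(recent) == min_samples and all(rate == 0 for rate in recent)
```

For example, `totals_diverge(1000, 990)` is within tolerance, while `totals_diverge(1000, 500)` would be flagged for follow-up.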
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate communications from NASDAQ and SITE 2 SITE 2s to PROCESSINGs and MESSAGING APP19 via MESSAGING Processors. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| nMsgQueSize is non-zero value and not reducing as expected, or InMsgRate is not increasing as expected. | Receiving process may not be up and/or there are THIRD PARTY delivery issues. |
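Several monitors in this document use the same indicator: a queue size that is non-zero and not reducing. A minimal sketch of that test over a series of successive readings is shown below; the function name and the idea of collecting readings into a list are assumptions made for illustration.

```python
from typing import Sequence


def queue_is_stuck(queue_sizes: Sequence[int]) -> bool:
    """True when every reading is non-zero and no reading is lower than
    the one before it, i.e. the queue is not draining."""
    if len(queue_sizes) < 2:
        return False  # a single reading cannot show a trend
    if any(size == 0 for size in queue_sizes):
        return False  # the queue emptied at some point
    return all(later >= earlier
               for earlier, later in zip(queue_sizes, queue_sizes[1:]))
```

A queue that reads `[5, 5, 6]` would be flagged, while `[9, 7, 4]` is draining and would not.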
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate communications from NASDAQ and SITE 2 SITE 2s to PROCESSINGs and MESSAGING APP19 via MESSAGING Processors. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| Not all APP31 or ACTR processes are displayed as expected. | APP31 or ACTR Service has not been started or hasn't processed any messages since monitor has been started. |
||| Status is not CONNECTED. | Messages cannot be sent from source to destination unless IPC channel is connected. |
||| Queue size is non-zero value and not decreasing as expected. | Messages cannot be sent from source to destination unless IPC channel is connected. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate inbound quote processing and database loading. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| Rate and/or MsgCount values are not incrementing as expected. | MESSAGING may not be being processed or loaded into database as expected. |
||| By matching topic_name, Rcv MsgCount in this monitor does not match total Src MsgCount values. | MESSAGING may not be being processed or loaded into database as expected. |
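The last indicator above compares, per topic_name, the Rcv MsgCount against the total of the Src MsgCount values. The comparison can be sketched as follows, assuming the counts have been copied out of the monitors into plain Python values; the function name, the input shapes and the topic names in the example are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def find_count_mismatches(
    src_counts: Iterable[Tuple[str, int]],
    rcv_counts: Dict[str, int],
) -> Dict[str, Tuple[int, int]]:
    """Return {topic_name: (total_src, rcv)} for every topic whose totals differ.

    `src_counts` holds one (topic_name, MsgCount) pair per source process;
    `rcv_counts` holds the receiver's MsgCount per topic_name.
    """
    totals: Dict[str, int] = defaultdict(int)
    for topic, count in src_counts:
        totals[topic] += count
    mismatches = {}
    for topic in sorted(set(totals) | set(rcv_counts)):
        src_total = totals.get(topic, 0)
        rcv_total = rcv_counts.get(topic, 0)
        if src_total != rcv_total:
            mismatches[topic] = (src_total, rcv_total)
    return mismatches
```

With two sources publishing 10 and 5 messages on a topic and a receiver showing 15, the topic is considered matched; any other receiver count is reported.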
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate inbound quote processing and database loading. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| Msgs_rcved values are not incrementing as expected. | MESSAGING may not be being processed or loaded into database as expected. |
||| Lost-unrecovered values are non-zero. | MESSAGING may not be being processed or loaded into database as expected. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate inbound lastsale processing and database loading. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| Rate and/or MsgCount values are not incrementing as expected. | MESSAGING may not be being processed or loaded into database as expected. |
||| By matching topic_name, Rcv MsgCount in this monitor does not match total Src MsgCount values. | MESSAGING may not be being processed or loaded into database as expected. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate inbound lastsale processing and database loading. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| Msgs_rcved values are not incrementing as expected. | MESSAGING may not be being processed or loaded into database as expected. |
||| Lost-unrecovered values are non-zero. | MESSAGING may not be being processed or loaded into database as expected. |
APP32 Purpose:
APP32 Processes read order messages from APP20s and APP38ridges, order and trade modification messages from APP12 and MESSAGING (via APP38ridges), MESSAGING from APP31 processes and MESSAGING Inquiry messages from MESSAGING.
APP32 Processes send order responses to APP20s and APP38ridges, drop copies to APP22 processes, and MESSAGES 1, MESSAGES 2 and MESSAGING messages to SUBGROUP01 processes. They also send database loader messages to APP40 for database loading.
APP32 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use APP32 nodes only when moving between nodes. MESSAGING interfaces must be configured.
NOTE: A known bug exists when moving APP40 processes from one node to another in an orderly fashion: APP32 messages may be lost as a result. If this occurs, records will need to be extracted from ME_LBM logs and provided to PTT developers to try to recreate the data in the database. This is a very cumbersome operation and should be avoided if at all possible.
- When stopping/restarting APP32 instances:
1) Stop APP10 testing for the APP32 instance involved before stopping the ME
2) Stop the APP32 instance
3) Determine if associated APP40 and DLME instances used by the affected APP32 instance also require recovery.
a) APP40 and APP19 are configured to run on different nodes from the APP32 (unless they are test instances of ME).
b) If the APP32 is moving to a host within the same data center, it is possible that the APP40 (and DLMEs) do not need to be moved.
c) If the APP32 is moving to a host in a different data center, then it would be prudent to move the APP40 (and DLMEs) as well.
4) Confirm that the associated Database loader is up to date IN THE DATA CENTER that the APP32 will be restarted in.
a) If the APP32 is restarted without the database being up to date with all of the most recent transactions, some order processing may not be handled as expected.
b) If there are database loader rejects outstanding, they must be replayed before APP32 restarts. Use database loader reject procedures to replay this data.
c) It may be necessary to move the Database Loader to another node to complete loading this data. See step 5.
5) If the associated APP40 and DLME instances are NOT moving to an alternate node, skip to step 6.
a) If moving APP40 and DLME process:
i) Stop APP40 process on current node.
ii) Stop DLME process on current node.
iii) If NOT moving the Database Loader files to a new node, skip to step 5-iv.
(a) Database Loader files should only be moved if “pre-move” database loading cannot be completed without doing so.
(b) If moving Database Loader files:
Copy :\chx\data\DL*.log, DL*.pos, DL*rejects.log files (created that day) for each Database Loader process moving to the alternate node.
iv) Start APP40 process on new node.
v) Start DLME process on new node.
6) Start the APP32 instance and confirm it starts without errors.
7) Start APP10 testing for the APP32 instance involved and confirm testing is conducted without errors.
8) Confirm APP10 orders and MESSAGES 2 before and after restart can be queried in APP12 to ensure that APP40 communications are as expected.
9) Resume trading in all stocks for affected ME.
a) Trading will be halted by default if APP32 startup occurs after any stock’s primary session begins.
b) Do not OPEN stocks. RESUME.
10) Determine if associated DLMP processes need to be recovered.
a) DLMP (Performance Loaders) are coded such that they must run on the same nodes as the ME.
b) If the APP32 has moved to another node, the DLMP must move as well.
11) If DLMP processes will move:
a) Stop associated DLMP processes
b) Start associated DLMP processes.
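The move decisions embedded in steps 3 and 10 of the procedure above can be summarized in a few lines. This is only a sketch of the stated rules: the `Host` structure, the function name and the result keys are hypothetical, and the output is a planning aid, not a substitute for confirming database loader state as described in step 4.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Host:
    """A node name and the data center it lives in (e.g. "DC1" or "DC2")."""
    node: str
    data_center: str


def plan_me_move(current: Host, target: Host) -> Dict[str, bool]:
    """Summarize which associated processes should move with the ME."""
    node_changed = current.node != target.node
    dc_changed = current.data_center != target.data_center
    return {
        # Step 3c: prudent to move APP40 (and DLMEs) when the data center changes.
        # Step 3b: within the same data center they may not need to move.
        "loaders_move_recommended": dc_changed,
        # Step 10: DLMP must run on the same node as the ME.
        "dlmp_must_move": node_changed,
    }
```

A move between two nodes in the same data center therefore yields a DLMP move only, while a cross-data-center move flags both.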
Halting Issues by ME/Stock:
- Use NTM Control Utility – Service Control – APP32 - options to Halt Issues.
- Select (and highlight) desired PROCESSINGs and right click to see and select desired options.
1) The APP29 instrument queries can inform a user which MEs are assigned to which stocks, if needed.
2) The APP29 instrument queries can also be used to easily store this information in an Excel spreadsheet if needed.
3) Halting issues results in Halts, regardless of what the listing SITEs are doing.
- From the APP32 options, the following can be used to halt issues:
1) Halt All Issues
2) Halt APP10 Issues
3) Halt Issue (if wanting to halt an individual stock versus a group of stocks)
4) Halt Issue by SITE (need to know SITE code)
5) Halt Exclusive Issue
Resuming/Opening Issues by ME/Stock:
- Use NTM Control Utility – Service Control – APP32 options to Resume or Open Issues.
- Select (and highlight) desired PROCESSINGs and right click to see and select desired options.
1) The APP29 instrument queries can inform a user which MEs are assigned to which stocks, if needed.
2) The APP29 instrument queries can also be used to easily store this information in an Excel spreadsheet if needed.
3) Resuming issues results in setting the issue’s trading status to match the last received status by APP31 processes.
4) Opening issues results in Openings, regardless of what the listing SITEs are doing or what is last known by APP31.
- From the APP32 options, the following can be used to resume issues:
1) Resume All Issues
2) Resume APP10 Issues
3) Resume Issue (if wanting to resume an individual stock versus a group of stocks)
4) Resume Exclusive Issue
- From the APP32 options, the following can be used to open issues:
1) Open All Issues
2) Open Issue (if wanting to open an individual stock versus a group of stocks)
3) Open Issue by SITE (need to know SITE code)
LULD Pausing/Resuming Issues by ME:
- Use NTM Control Utility – Service Control – APP32 - options to LULD Trading Pause/Resume.
- Select (and highlight) desired PROCESSINGs and right click to see and select desired options.
- Under LULD options, user will be required to enter Issue Symbol to Pause/Resume.
Getting Issues Open/Issues Not Open by ME:
- Use NTM Control Utility – Service Control – APP32 - options to Get Issues Open/Not Open.
- Select (and highlight) desired PROCESSINGs and right click to see and select desired options.
Resend MESSAGES 1 by ME:
- Use NTM Control Utility – Service Control – APP32 – Resend APP32 MESSAGES 1.
- Select (and highlight) desired PROCESSINGs and right click to see and select desired options.
Set Quote Conditions or Zero MESSAGES 1 by ME:
- Use NTM Control Utility – Service Control – SUBGROUP01 (Options by ME) – to:
1) Zero Quote by ME, Set Quote Condition Auto, Set Quote Condition Manual.
- Select (and highlight) desired PROCESSINGs and right click to see and select desired options.
Enable/Disable MKT IOC by ME:
- Use NTM Control Utility – Service Control – APP32 – Enable/Disable Market IOC.
- Select (and highlight) desired PROCESSINGs and right click to see and select desired options.
Forcibly cancel orders (by either ME, order sending firm, or stock) by ME:
- Use NTM Control Utility – Service Control – APP32 - options to Forcibly Cancel All Orders, Order for Firm or Orders for Issue.
- Select (and highlight) desired PROCESSINGs and right click to see and select desired options.
Enable/Disable THIRD PARTY LBM Stats by ME:
- Use NTM Control Utility – Service Control – APP32 - options to Enable/Disable THIRD PARTY Stats.
- Select (and highlight) desired PROCESSINGs and right click to see and select desired options.
APP32 Troubleshooting Table:
APP32 Symptom | Impact | Response |
Node Crashes Evidenced by: - In EMT, applications report lost communications to MEs - In Solarwinds (and outlook), node and processes will be reported down. |
is not trading issues as expected. Order sending firms will not be receiving responses/updates and industry will not be getting MESSAGES 1 or . Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Confirm SUBGROUP01 zeroed MESSAGES 1 for all affected MEs. 3) Notify Management. 4) Work with Tech Services to confirm server status. 5) If node is not to be used and processes need to be moved, determine if database loading has been completed, and complete per recovery procedures. 6) Restart affected MEs. See ME_Recovery_Considerations. 7) Trading will be halted by default upon restart. 8) Notify industry after trading is resumed. |
APP32 hangs (suspends processing) Evidenced by: - In APP32 Thread stats, non-zero values are seen in queues and not reducing. |
is not trading issues as expected. Order sending firms will not be receiving responses/updates and industry will not be getting MESSAGES 1 or . Refer to: Generalized Recovery Scenarios for more. |
1) Stop/Restart affected MEs. See ME_Recovery_Considerations 2) Notify management. 3) Trading will be halted by default upon restart. 4) Notify industry after trading is resumed. |
APP32 Monitoring Considerations:
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate order and trade processing. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| In_rate and/or msg_in value is zero. | PROCESSING may not be processing as expected. |
||| Routing_enable and/or Ors_dreicted_connected flags are N. | PROCESSING is not enabled for Outbound Routing. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate order and trade processing. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| Data_queue and/or Ctrl_que column is non-zero value and not decreasing as expected. | PROCESSING may not be processing as expected. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate order and trade processing. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| Not all APP32 processes are displayed as expected. | APP32 Service has not been started. |
||| Msgs In and/or Msgs Out are zero when messages are processed. | No messages have been sent/received since that monitor has been started. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate order and trade processing. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| Not all APP38ridge or APP37 processes are displayed as expected. | APP37 Bridge or APP37 Service has not been started or hasn't processed any messages since monitor has been started. |
||| Status is not CONNECTED. | Messages cannot be sent from source to destination unless IPC channel is connected. |
||| Queue size is non-zero value and not decreasing as expected. | Messages cannot be sent from source to destination unless IPC channel is connected. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate order and trade processing. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| Statistics are not shown for APP32 process as expected. | Process may not be up, or 29 West Stats have not yet been enabled for process. |
||| Rate and/or MsgCount values are not incrementing as expected. | Order and/or trade related processing may not be working as expected. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate order and trade processing. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| Statistics are not shown for APP32 process as expected. | Process may not be up, or 29 West Stats have not yet been enabled for process. |
||| Msgs_rcved values are not incrementing as expected. | Order and/or trade related processing may not be working as expected. |
||| Lost-unrecovered values are non-zero. | Order and/or trade related processing may not be working as expected. |
APP33 Purpose:
APP33 Client allows users to query, enter, and modify support records. All writes are directly to the database.
APP33 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use APP33 nodes only when moving between nodes. (Java code, FireDaemon and Host Specific references in JNLPs required.)
- When stopping/restarting APP33 processes:
1) Notify Operations and work in cooperation with them, as appropriate to situations.
2) Stop/Restart the APP33 process.
There are no APP33 specific NTM Control commands.
APP33 Troubleshooting Table:
APP33 Symptom | Impact | Response |
Node Crashes Evidenced by: - In Solarwinds (and outlook), node and processes will be reported down. |
s Staff and IB staff will not be able to query or administratively manage support records via APP33 Client. Refer to: APPGROUP08 or APPGROUP09 Server Server Specific Recoveries. |
1) Refer to: APPGROUP08 or APPGROUP09 Server Server Specific Recoveries. 2) Notify Management. |
APP33 Monitoring Considerations:
There are no APP33 specific monitors outside of EMT.
APP34 Purpose:
APP34 allows applications to advertise their services and connection points to a local repository so other applications can find and connect to them.
In production configurations, there are two redundant APP34 services running in tandem; 1 in DC1 data center and 1 in DC2 data center.
The APP34 process is required by application startup only. Only one APP34 instance is required at any time to support application startups.
APP34 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use APP34 nodes only when moving between nodes. (FireDaemon and DNS dependencies exist for APP34.)
- If moving APP34 to alternate node, Tech Services must redefine APP34 DNS IP addresses.
APP34 NTM Control Commands:
Reload Static Services:
- Use NTM Control Utility – Service Control – Nespr – Reload Static Services.
- Select (and highlight) desired processes and right click to see and select desired options.
APP34 Troubleshooting Table:
APP34 Symptom | Impact | Response |
Node Crashes Evidenced by: - In EMT, applications report lost communications to APP34 service. - In Solarwinds (and outlook), node and processes will be reported down. |
-Unless both APP34 services are down, or inaccessible for any reason, there will be no impact. If both are down, applications will not be able to start cleanly. Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Restart affected APP25 processes on alternate nodes. 3) Notify Management and PTT. |
APP34 Monitoring Considerations:
There are no APP34 specific monitors outside of EMT.
APP35 Purpose:
APP35 reads Application system messages via logserver processes running on every application node.
APP35 then broadcasts these messages to the network via multicast to be picked up and displayed by the EMT (Event Messaging Terminal).
Operations management staff cannot monitor system applications without APP35 working as expected.
Use the following hyperlinks to jump to the desired section of APP35 documentation:
APP35_Monitoring_Considerations
APP35 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use APP35 nodes only when moving between nodes. (There are no dependencies outside of expected server allocations.)
APP35 NTM Control Commands:
There are no APP35 specific NTM Control Commands.
APP35 Troubleshooting Table:
APP35 Symptom | Impact | Response |
Node Crashes Evidenced by: - In EMT, applications report lost communications to APP34 service. - In Solarwinds (and outlook), node and processes will be reported down. |
-Unless both APP34 services are down, or inaccessible for any reason, there will be no impact. If both are down, applications will not be able to start cleanly. Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Restart affected APP25 processes on alternate nodes. 3) Notify Management and PTT. |
APP35 Monitoring Considerations:
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate Application messages sent to Logserver to be reported to EMT monitors. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
||| Not all APP35 or Logserver processes are displayed as expected. | APP35 or Logserver Service has not been started. |
||| InCnt values are zero when messages are processed. | No messages have been sent/received since that monitor has been started. |
APP36 Purpose:
The APP36 system is comprised of three separate application service types: SUBGROUP01, SUBGROUP02 and APP06.
Together, these three service types receive and process quote, trade and MESSAGING messages from the PROCESSINGs.
The PROCESSING produces one message for all three of these service types to process, each taking their part from this message.
The data path for these messages is as follows:
1) APP32 to SUBGROUP01
2) SUBGROUP01 to SUBGROUP02
3) SUBGROUP02 to APP06
Because these separate service types work together to process all outbound MESSAGING, their operations and recoveries must be considered together.
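Because each stage feeds the next, a problem at one stage affects everything after it in the data path. The short sketch below models the path listed above as an ordered list and returns the services downstream of a given stage; the constant and function names are hypothetical and exist only for this illustration.

```python
from typing import List

# Data path as listed above: APP32 -> SUBGROUP01 -> SUBGROUP02 -> APP06.
DATA_PATH: List[str] = ["APP32", "SUBGROUP01", "SUBGROUP02", "APP06"]


def downstream_of(service: str) -> List[str]:
    """Return the services that receive data after `service` in the path.

    Raises ValueError if `service` is not part of the data path.
    """
    index = DATA_PATH.index(service)
    return DATA_PATH[index + 1:]
```

For instance, an issue in SUBGROUP01 puts both SUBGROUP02 and APP06 in scope for any recovery review.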
SUBGROUP01 Services include the following services:
APP11RI Processes read combined data from PROCESSINGs and send quote data to SITE 2 and APP10, as well as trade/MESSAGING data to SUBGROUP02.
UQDRI Processes read combined data from PROCESSINGs and send quote data to NASDAQ and APP10, as well as trade/MESSAGING data to SUBGROUP02.
SUBGROUP02 Services include the following services:
APP13RI Processes read combined data from SUBGROUP01 and trade data from APP12 and MESSAGING, and send trade data to SITE 2 and RTC, and MESSAGING data to APP06.
UTDRI Processes read combined data from SUBGROUP01 and trade data from APP12 and MESSAGING, and send trade data to NASDAQ and RTC, and MESSAGING to APP06.
APP06 Services:
APP06 Services read MESSAGING data from SUBGROUP02 and send MESSAGING data to MESSAGING Subscribers via Multicast.
Use the following hyperlinks to jump to the desired section of APP36 documentation:
APP36_SUBGROUP01_Troubleshooting_Table
APP36_SUBGROUP02_Troubleshooting_Table
APP36_APP06_Troubleshooting_Table
APP36_SUBGROUP01_SUBGROUP02_Monitoring_Considerations
APP36_APP06_Monitoring_Considerations
APP36 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use APP36 nodes only when moving between nodes. (APP36 NAT addresses must be configured/used by SITE 2s.)
Because SUBGROUP01, SUBGROUP02 and APP06 work together to process all outbound MESSAGING, their operations and recoveries must be considered together.
Go to APP36_Combined_SUBGROUP01_SUBGROUP02_APP06_Move_Procedure for the procedure if moving these systems to other nodes.
- When stopping/restarting SUBGROUP01 instances:
1) There should be no special dependencies or considerations. A simple restart should suffice.
2) Upon reconnection to SITE 2 or NASDAQ, SUBGROUP01 will request any queued combined APP36 data messages from the APP41 data store and send delayed MESSAGES 1 to the APP19 only (not the SITE 2), and then request the most recent stock MESSAGES 1 from all connected APP32 instances and send these to the SITE 2s.
- When stopping/restarting SUBGROUP02 instances:
1) For UTDRI (NASDAQ) processes, it is imperative that the database is up to date before restart.
a) If the UTDRI involved is restarted without the database being up to date with all of the most recent transactions, stock specific sequence numbers being sent to NASDAQ may be off and trade messages may not be processed as expected.
2) For APP13RI (SITE 2), there should be no special dependencies or considerations. A simple restart should suffice.
3) Upon reconnection to SITE 2 or NASDAQ, SUBGROUP02 will request any queued combined APP36 data messages from the APP41 data store and send queued MESSAGES 2 to the SITE 2 marked “sold” as well as to MESSAGING, and then forward the MESSAGING portion of the messages to APP06 services.
- When stopping/restarting APP06 instances:
1) APP06 files must be moved before the APP06 restarts on the new node.
a) If these files are not moved, two general impacts will be seen:
· MESSAGING retrans requests made after recovery may not be able to find messages requested.
· MESSAGING Sequence Number Resets will likely be seen by the MESSAGING subscribers and may compromise their resulting functionality.
2) Upon startup, APP06 services will request any queued MESSAGING messages from the APP41 data store and resend these (emulating a MESSAGING) before sending any subsequent messages from the APP36 data path.
APP36 Combined SUBGROUP01, SUBGROUP02, APP06 Move Procedure
- By design, there are four APP36 servers between both data centers:
· One DC1 server supporting DC1 SITE 2 traded stocks, sending the SUBGROUP01 and SUBGROUP02 to SITE 2 and broadcasting the same APP06 data.
· One DC1 server supporting DC1 NASDAQ traded stocks, sending the SUBGROUP01 and SUBGROUP02 to NASDAQ and broadcasting the same APP06 data.
· One DC2 server supporting DC2 SITE 2 traded stocks, sending the SUBGROUP01 and SUBGROUP02 to SITE 2 and broadcasting the same APP06 data.
· One DC2 server supporting DC2 NASDAQ traded stocks, sending the SUBGROUP01 and SUBGROUP02 to NASDAQ and broadcasting the same APP06 data.
- If these services should move between servers at any time, all services should move together using the following procedure:
1) Stop SUBGROUP01 on the current node
2) Stop SUBGROUP02 on the current node
3) Stop APP06 on the current node
4) If APP09 is also moving, then stop APP09 at this point as well (which would be the case if we moved between data centers).
5) Confirm that the associated UTDRI Database loader is up to date IN THE DATA CENTER that the UTDRI will be restarted in.
a) Use database loader reject procedures to replay this data.
· If there are database loader rejects outstanding, they must be replayed before UTDRI restarts.
b) It may be necessary to move the Database Loader to another node to complete loading this data.
c) If NOT moving the Database Loader files to a new node, skip to step 6.
· Database Loader files should only be moved if “pre-move” database loading cannot be completed without doing so.
· If moving Database Loader files:
o Stop the affected UTDRI DLTR process to unlock the database loader files.
o Copy :\chx\data\DL*.log, DL*.pos, DL*rejects.log files (created that day) for each Database Loader process moving to the alternate node.
o Complete replaying the data, using Replay Procedures as necessary.
6) If APP09 was moved, Restart APP09 on the new node.
7) Restart SUBGROUP02 on the new node.
8) Restart SUBGROUP01 on the new node.
a) If SUBGROUP01 connects to the SITE 2s on restart, it will resend the most recent MESSAGES 1 to the SITE 2.
b) If SUBGROUP02 connects to the SITE 2s on restart, and SUBGROUP01 is also connected, it will send the MESSAGES 2 queued in the APP41 data store, marked “sold”.
c) If there are any questions regarding MESSAGES 1 not being up to date, resend MESSAGES 1 using NTM APP32 commands.
d) If there are any questions regarding MESSAGES 2 not being sent, resend MESSAGES 2 using APP12 trade queries.
e) If there are any questions regarding MESSAGING records not being sent, resend MESSAGING using APP12.
Continue recovery of APP06 portion of APP36 system (lower priority than SUBGROUP01 and SUBGROUP02):
9) Copy APP06 log and inx files to the alternate node
a) Copy D:\chx\data\APP06*.log and APP06*.inx files (created that day) for each APP06 service moving to its alternate node.
10) Restart APP06 on new node.
11) Confirm MESSAGING Reader Clients reflect reconnect to moved APP06 services.
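Step 9 above calls for copying only the APP06 log and inx files created that day. A minimal sketch of the selection step is shown below. It uses each file's modification date as a stand-in for "created that day", which is an assumption, and the function name is hypothetical; the file patterns come from step 9a. It only lists candidate files and does not copy anything.

```python
from datetime import date, datetime
from pathlib import Path
from typing import List, Sequence


def files_for_day(
    data_dir: Path,
    day: date,
    patterns: Sequence[str] = ("APP06*.log", "APP06*.inx"),
) -> List[Path]:
    """List files in `data_dir` matching `patterns` whose modification
    date (assumed proxy for creation date) equals `day`."""
    selected = []
    for pattern in patterns:
        for path in data_dir.glob(pattern):
            if not path.is_file():
                continue
            modified = datetime.fromtimestamp(path.stat().st_mtime).date()
            if modified == day:
                selected.append(path)
    return sorted(selected)
```

Reviewing the returned list before copying gives a chance to confirm that every APP06 service moving to the alternate node is represented.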
APP04 (MESSAGING Reader) Move Procedure
If APP06 processes have moved nodes, we must inform MESSAGING processes of moved MESSAGING Services using the following procedure:
1) Stop all APP04 processes.
2) Modify \\chxappcfg\APP041\APP04config.xml to reflect new nodes for APP06 log files.
3) Restart all APP04 processes.
APP36 Database Loader Move Procedure
If SUBGROUP01, SUBGROUP02, APP06 processes have moved nodes, we must also move Database Loader processes afterward using the following procedure:
1) Stop/Restart DLQR (after confirming all APP19 are up to date)
2) Stop/Restart DLTR (after confirming all APP19 are up to date)
3) Stop/Restart DLBF (after confirming all APP19 are up to date)
Use the following hyperlinks to get to SUBGROUP01, SUBGROUP02 or APP06 NTM Commands:
APP36_SUBGROUP01_NTM_Control_Commands
APP36_SUBGROUP02_NTM_Control_Commands
APP36_APP06_NTM_Control_Commands
SUBGROUP01 NTM Control Commands:
Control connections to SITE 2s:
- Use NTM Control Utility – Service Control – SUBGROUP01 (Options by SUBGROUP01) - Connect/Disconnect To/From SITE 2/SITE 2 or Switch Connection.
- Select (and highlight) desired processes and right click to see and select desired options.
- If user desires to connect/disconnect with the SITE 2s production primary site, select either Connect To/Disconnect From SITE 2/SITE 2 options.
- If user desires to connect to any SITE 2 site other than the production primary site, select Switch Connection and choose the desired site:
1) PRI_SITE_PRI_ADDR (primary site, primary server)
2) PRI_SITE_ALT_ADDR (primary site, alternate server)
3) DR_SITE_PRI_ADDR (DR site/remote data center, primary server)
4) DR_SITE_ALT_ADDR (DR site/remote data center, alternate server)
Send Sequence Inquiry to SITE 2s:
- Use NTM Control Utility – Service Control – SUBGROUP01 (Options by SUBGROUP01) - Send Sequence Inquiry.
- Select (and highlight) desired processes and right click to see and select desired options.
SUBGROUP01 Bypass:
- Use NTM Control Utility – Service Control – SUBGROUP01 (Options by SUBGROUP01) – SUBGROUP01 Bypass.
- Select (and highlight) desired processes and right click to see and select desired options.
Abort Waiting Download Reply:
- Use NTM Control Utility – Service Control – SUBGROUP01 (Options by SUBGROUP01) – Abort Waiting Download Reply.
- Select (and highlight) desired processes and right click to see and select desired options.
Enable/Disable Processing Quote Stat:
- Use NTM Control Utility – Service Control – SUBGROUP01 (Options by SUBGROUP01) – options to Enable/Disable Processing Quote Stat.
- Select (and highlight) desired processes and right click to see and select desired options.
Set Quote Conditions or Zero MESSAGES 1 by ME:
- Use NTM Control Utility – Service Control – SUBGROUP01 (Options by ME) – to:
1) Zero Quote by ME,
2) Set Quote Condition Auto, or
3) Set Quote Condition Manual.
- Select (and highlight) desired PROCESSINGs and right click to see and select desired options.
SUBGROUP02 NTM Control Commands:
Control connections to SITE 2s:
- Use NTM Control Utility – Service Control – SUBGROUP02 - Connect/Disconnect To/From SITE 2/SITE 2 or Switch Connection.
- Select (and highlight) desired processes and right click to see and select desired options.
- If user desires to connect/disconnect to/from the SITE 2s production primary site, select either Connect/Disconnect To/From SITE 2/SITE 2 options.
- If user desires to connect to any SITE 2 site other than the production primary site, select Switch Connection and choose the desired site:
1) PRI_SITE_PRI_ADDR (primary site, primary server)
2) PRI_SITE_ALT_ADDR (primary site, alternate server)
3) DR_SITE_PRI_ADDR (DR site/remote data center, primary server)
4) DR_SITE_ALT_ADDR (DR site/remote data center, alternate server)
Send Sequence Inquiry to SITE 2s:
- Use NTM Control Utility – Service Control – SUBGROUP02 - Send Sequence Inquiry.
- Select (and highlight) desired processes and right click to see and select desired options.
Send TradeId Inquiry to SITE 2s:
- Use NTM Control Utility – Service Control – SUBGROUP02 - Send TradeId Inquiry.
- Select (and highlight) desired processes and right click to see and select desired options.
Set Outbound Sequence Number to SITE 2s:
- Use NTM Control Utility – Service Control – SUBGROUP02 – Set Outbound Sequence Number.
- Select (and highlight) desired processes and right click to see and select desired options.
- User will have to enter desired sequence number.
Set Outbound TradeId Per Instrument to SITE 2s:
- Use NTM Control Utility – Service Control – SUBGROUP02 – Set Outbound TradeId Per Instrument.
- Select (and highlight) desired processes and right click to see and select desired options.
- User will have to enter Instrument and TradeId desired.
BFD Start Of Day:
- Use NTM Control Utility – Service Control – Book Feed Options – BFD Start Of Day.
- Select (and highlight) desired processes and right click to see and select desired options.
BFD End Of Day:
- Use NTM Control Utility – Service Control – Book Feed Options – BFD End Of Day.
- Select (and highlight) desired processes and right click to see and select desired options.
BFD Set Outbound Sequence Number:
- Use NTM Control Utility – Service Control – Book Feed Options – BFD Set Outbound Sequence Number.
- Select (and highlight) desired processes and right click to see and select desired options.
- User will have to enter desired sequence number.
BFD Send System Problem Message:
- Use NTM Control Utility – Service Control – Book Feed Options – BFD Send System Problem Message.
- Select (and highlight) desired processes and right click to see and select desired options.
BFD Send System Problem Clear Message:
- Use NTM Control Utility – Service Control – Book Feed Options – BFD Send System Problem Clear Message.
- Select (and highlight) desired processes and right click to see and select desired options.
SUBGROUP01 Troubleshooting Table:
APP36 Symptom |
Impacts |
Response |
SUBGROUP01 SITE 2 connectivity issues. Evidenced by: - In SUBGROUP01_SUBGROUP02_to_SITE 2 stats, APP11RI processes are not connected. - If SITE 2 moves to DR site, APP11 processes will report messages with text “disaster” in them. NOTE: APP11RI processes will try to auto-reconnect continuously until connections can be made. |
Since SUBGROUP01 is first process in APP36 system path: -will not be reporting quote related MESSAGING to industry. -will not be reporting trade related MESSAGING to industry; This includes MESSAGING. -will not be reporting MESSAGING related MESSAGING to subscribers. -Trading must be halted in affected issues if problem goes on too long. |
1) Refer to: CHX_Cannot_Send_MESSAGES 1_To_SITE 2s Generalized Recovery Scenario. 2) Work with SITE 2 and Tech Services to identify and resolve issues. - SITE 2 may ask to move APP11RI connections to Primary Alternate Servers or their DR site. Use NTM Control SUBGROUP01 options to switch SUBGROUP01 connections. 3) APP11RI Services may need to be stopped/restarted. See APP36_Recovery_Considerations |
SUBGROUP01 NASDAQ connectivity issues. Evidenced by: - In SUBGROUP01_SUBGROUP02_to_NASDAQ stats, UQDRI processes are not connected. - If SITE 2 moves to DR site, SUBGROUP02 processes will report messages with text “disaster” in them. NOTE: UQDRI processes will try to auto-reconnect continuously until connections can be made. |
Since SUBGROUP01 is first process in APP36 system path: -will not be reporting quote related MESSAGING to industry. -will not be reporting trade related MESSAGING to industry; This includes MESSAGING. -will not be reporting MESSAGING related MESSAGING to subscribers. -Trading must be halted in affected issues if problem goes on too long. |
1) Refer to: CHX_Cannot_Send_MESSAGES 1_To_SITE 2s Generalized Recovery Scenario. 2) Work with NASDAQ and Tech Services to identify and resolve issues. 3) NASDAQ may ask to move UQDRI connections to Primary Alternate Servers or their DR site. Use NTM Control SUBGROUP01 options to switch SUBGROUP01 connections. 4) NOTE: If NASDAQ moves to DR site, APP31 recovery procedures will also need to be used. See APP31 (APP11/SUBGROUP02/APP13/SUBGROUP04) Application Specific Recoveries. 5) UQDRI Services may need to be stopped/restarted. See APP36_Recovery_Considerations |
SUBGROUP02 Troubleshooting Table:
SUBGROUP02 SITE 2 connectivity issues. Evidenced by: - In SUBGROUP01_SUBGROUP02_to_SITE 2 stats, APP13RI processes are not connected. - If SITE 2 moves to DR site, APP13 processes will report messages with text “disaster” in them. NOTE: APP13RI processes will try to auto-reconnect continuously until connections can be made. |
Since SUBGROUP02 is second process in APP36 system path: -will not be reporting trade related MESSAGING to industry; This includes MESSAGING. -will not be reporting MESSAGING related MESSAGING to subscribers. -Trading must be halted in affected issues if problem goes on too long. |
1) Refer to: CHX_Cannot_Send_MESSAGES 2_To_SITE 2s Generalized Recovery Scenario. 2) Work with SITE 2 and Tech Services to identify and resolve issues. 3) SITE 2 may ask to move APP13RI connections to Primary Alternate Servers or their DR site. Use NTM Control SUBGROUP02 options to switch SUBGROUP02 connections. 4) APP13RI Services may need to be stopped/restarted. See APP36_Recovery_Considerations |
SUBGROUP02 NASDAQ connectivity issues. Evidenced by: - In SUBGROUP01_SUBGROUP02_to_NASDAQ stats, UTDRI processes are not connected. - If NASDAQ moves to DR site, SUBGROUP04 processes will report messages with text “disaster” in them. NOTE: UTDRI processes will try to auto-reconnect continuously until connections can be made. |
Since SUBGROUP02 is second process in APP36 system path: -will not be reporting trade related MESSAGING to industry; This includes MESSAGING. -will not be reporting MESSAGING related MESSAGING to subscribers. -Trading must be halted in affected issues if problem goes on too long. |
1) Refer to: CHX_Cannot_Send_MESSAGES 2_To_SITE 2s Generalized Recovery Scenario. 2) Work with NASDAQ and Tech Services to identify and resolve issues. 3) NASDAQ may ask to move UTDRI connections to Primary Alternate Servers or their DR site. Use NTM Control SUBGROUP01 options to switch SUBGROUP01 connections. - NOTE: If NASDAQ moves to DR site, APP31 recovery procedures will also need to be used. See APP31 (APP11/SUBGROUP02/APP13/SUBGROUP04) Application Specific Recoveries. 4) UTDRI Services may need to be stopped/restarted. See APP36_Recovery_Considerations |
APP06 Symptom |
Impacts |
Response |
Users report missing data in MESSAGING data. May or may not be evidenced by: - Sequence gaps reported by APP04 processes in both EMT and MESSAGING Reader Client. |
-MESSAGING Subscribers are missing data that they may or may not use in trading decisions. |
1) Use MESSAGING Reader to confirm whether or not the same data lost by user was reported by APP04 Client. See APP05_Monitoring_Considerations 2) Report sequence gap information to Tech Services and work with Tech Services to determine cause/resolution. 3) Users may utilize MESSAGING Retrans processes to try to gap fill lost messages. See |
SUBGROUP01_SUBGROUP02 Monitoring Considerations:
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate quote and last sale delivery from to SITE 2 SITE 2s. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
| | | Connstat is not Connected. | is not connected to SITE 2. |
| | | QuoteQue is non-zero value and not decreasing as expected, or OutMsgRate or OutMsgs values are not reflecting changes as expected. | We may not be sending MESSAGES 1 and/or as expected. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate quote and last sale delivery from to SITE 2 and NASDAQ SITE 2s. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
| | | Soup status is not Connected, Status is not ready, or Is ready to send value is not Y. | is not connected to SITE 2. |
| | | Out Rate or Total Sent values are not reflecting changes as expected. | We may not be sending MESSAGES 1 and/or as expected. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate quote and last sale delivery from to SITE 2 and NASDAQ SITE 2s. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
| | | Not all SUBGROUP01, SUBGROUP02 or IPC connected processes are displayed as expected. | Not all IPC connected services have been started or haven’t processed any messages since monitor has been started. |
| | | IPC Connected status is not CONNECTED. | Messages cannot be sent from source to destination if IPC channel disconnected. |
| | | THIRD PARTY connected status is Inactive | 29 West communications has been disabled between the services involved. |
| | | IPC Connected Queue size is non-zero value and not decreasing as expected. | Messages cannot be sent from source to destination unless IPC channel is connected. |
| | | THIRD PARTY connected Write Queue is non-zero values and not reducing as expected. | 29 West communications has been disabled between the services involved. |
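The "queue size is non-zero and not decreasing" symptom in the monitor above can be expressed as a simple check over successive readings. The sketch below is illustrative only: the function name, the idea of collecting readings into a list, and the choice of how many readings to take (and how far apart) are assumptions, not features of the stats monitors.

```python
def queue_is_stuck(samples):
    """Return True when the queue is non-zero in every reading and never shrinks.

    `samples` is a list of queue sizes read from the monitor, oldest first.
    """
    if len(samples) < 2:
        return False  # not enough readings to judge a trend
    if any(size == 0 for size in samples):
        return False  # the queue drained at some point, so it is not stuck
    # Stuck means each reading is at least as large as the one before it.
    return all(later >= earlier for earlier, later in zip(samples, samples[1:]))
```

For example, readings of 40, 40, 55 would be flagged, while 40, 25, 10 would not, because the queue is draining.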
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate quote and last sale delivery from to SITE 2 and NASDAQ SITE 2s. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
| | | Not all SUBGROUP01 / SUBGROUP02 processes are displayed as expected. | Service has not been started. |
| | | Msgs In and/or Msgs Out are zero. | No messages have been sent/received since that monitor has been started. |
APP06 Monitoring Considerations:
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes receive MESSAGING data via ME->SUBGROUP01->SUBGROUP02->APP06 path, and send multicast to MESSAGING Subscribers. | PROD MENU: | - Color of data in columns - SUBGROUP02_APP06_over_limit | Data is RED. | Process is either down or multicast data is not being received by monitor. |
| | | Rule 603a violation count is > 0 | Rule 603a violation has been reported 1) Notify Production Support and management. |
| | | Any one of, or any combination of the process “over limit” columns are greater than 0. | We may not be processing as expected. |
Also see MESSAGING Reader. It will report any issues specific to MESSAGING multicast delivery, at least to our APPGROUP01 Servers.
Use the following hyperlink to see this documentation: APP05_Monitoring_Considerations.
APP37 Purpose:
APP37 reads order related messages from APP01, APP07, and APP26 and forwards messages to destinations specified in routing agreements or on messages.
APP37 sends order related messages to APP07, APP01 and/or PROCESSINGs depending on applied routing instructions.
Use the following hyperlinks to jump to the desired section of APP37 documentation:
APP37_Monitoring_Considerations
APP37 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use APP37 nodes only when moving between nodes. (Java code and FireDaemon references are required.)
- When stopping/restarting APP37 processes:
1) Notify Operations and work in cooperation with them, as appropriate to situations.
2) Stop/Restart the APP37 process.
Reload Rules:
- Use NTM Control Utility – Service Control - APP37 – Reload Rules.
End Of Day:
- Use NTM Control Utility – Service Control - APP37 – End of Day.
APP37 Symptom |
Impacts |
Response |
Node Crashes Evidenced by: - In Solarwinds (and outlook), node and processes will be reported down. |
Operations will not be able to query or administratively manage orders, MESSAGES 2 or MESSAGING reports via APP12. Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Notify Management. |
APP37 Monitoring Considerations:
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate communications from Order Sending Firms that use APP26 services for special routing (to MESSAGING, SITE 2s or PROCESSINGs). | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
| | | Not all APP37 processes are displayed as expected. | APP37 Service has not been started. |
| | | Msgs In and/or Msgs Out are zero when messages are processed. | No messages have been sent/received since that monitor has been started. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate communications from Order Sending Firms that use APP26 services for special routing (to MESSAGING, SITE 2s or PROCESSINGs). | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
| | | Not all APP37 or APP01 processes are displayed as expected. | APP37 or APP01 Service has not been started or hasn't processed any messages since monitor has been started. |
| | | Status is not CONNECTED. Messages cannot be sent from source to destination unless IPC channel is connected. | 1) Stop/Restart destination process if other processes connecting to the same are showing similar issues; Otherwise, stop/restart source process. |
| | | Queue size is non-zero value and not decreasing as expected. Messages cannot be sent unless IPC channel is connected. | 1) Stop/Restart destination process so as not to accidentally delete queued messages; Do not stop/restart source process. |
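The two IPC responses in the monitor above amount to a small decision rule. The sketch below restates it in Python for clarity. The function and parameter names are hypothetical, and checking the queue condition first (so that queued messages are protected even when the status is also not CONNECTED) is an assumption about priority that the table does not state.

```python
def choose_restart_target(status, queue_size, queue_decreasing, peers_affected):
    """Return "destination", "source", or None (no restart indicated).

    status           -- IPC status text shown by the monitor
    queue_size       -- current IPC queue size
    queue_decreasing -- True if the queue is draining as expected
    peers_affected   -- True if other processes connecting to the same
                        destination show similar issues
    """
    # A stuck queue holds messages: restart only the destination so the
    # source does not lose what it has queued.
    if queue_size > 0 and not queue_decreasing:
        return "destination"
    # Disconnected with an empty (or draining) queue: blame the destination
    # only when its other peers are affected too; otherwise restart the source.
    if status != "CONNECTED":
        return "destination" if peers_affected else "source"
    return None
```

Operators would still confirm the choice against EMT before acting; the function only mirrors the table.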
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate communications from Order Sending Firms that use APP26 services for special routing (to MESSAGING, SITE 2s or PROCESSINGs). | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
| | | Not all APP37 or APP07 processes are displayed as expected. | APP37 or APP07 Service has not been started or hasn't processed any messages since monitor has been started. |
| | | Status is not CONNECTED. Messages cannot be sent from source to destination unless IPC channel is connected. | 1) Stop/Restart destination process if other processes connecting to the same are showing similar issues; Otherwise, stop/restart source process. |
| | | Queue size is non-zero value and not decreasing as expected. Messages cannot be sent unless IPC channel is connected. | 1) Stop/Restart destination process so as not to accidentally delete queued messages; Do not stop/restart source process. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate communications from Order Sending Firms that use APP26 services for special routing (to MESSAGING, SITE 2s or PROCESSINGs). | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
| | | Not all APP37 or APP26 processes are displayed as expected. | APP37 or APP26 Service has not been started or hasn't processed any messages since monitor has been started. |
| | | Status is not CONNECTED. Messages cannot be sent from source to destination unless IPC channel is connected. | 1) Stop/Restart destination process if other processes connecting to the same are showing similar issues; Otherwise, stop/restart source process. |
| | | Queue size is non-zero value and not decreasing as expected. Messages cannot be sent unless IPC channel is connected. | 1) Stop/Restart destination process so as not to accidentally delete queued messages; Do not stop/restart source process. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate communications from Order Sending Firms that use APP26 services for special routing (to MESSAGING, Away Destinations, PROCESSINGs and Trade Reporting Systems). Monitor shows IPC channel connectivity status to APP37 Bridge / PROCESSING processes. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
| | | Not all APP37, APP38ridge or PROCESSING processes are displayed as expected. | APP37, APP38ridge or PROCESSING Service has not been started or hasn't processed any messages since monitor has been started. |
| | | Status is not CONNECTED. Messages cannot be sent from source to destination unless IPC channel is connected. | 1) Stop/Restart destination process if other processes connecting to the same are showing similar issues; Otherwise, stop/restart source process. |
| | | Queue size is non-zero value and not decreasing as expected. | 1) Stop/Restart destination process so as not to accidentally delete queued messages; Do not stop/restart source process. |
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate communications from Order Sending Firms that use APP26 services for special routing (to MESSAGING, SITE 2s or PROCESSINGs). Monitor shows IPC channel connectivity status to SUBGROUP02 processes. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
| | | Not all APP37 or SUBGROUP02 processes are displayed as expected. | APP37 or SUBGROUP02 Service has not been started or hasn't processed any messages since monitor has been started. |
| | | Status is not CONNECTED. Messages cannot be sent from source to destination unless IPC channel is connected. | 1) Stop/Restart destination process if other processes connecting to the same are showing similar issues; Otherwise, stop/restart source process. |
| | | Queue size is non-zero value and not decreasing as expected. | Messages cannot be sent from source to destination unless IPC channel is connected. |
APP38 Purpose:
The APP38 process reads order and trade related messages from APP37 and writes them to PROCESSINGs.
The APP38 process reads order and trade related messages from PROCESSINGs and writes them back to originator.
APP38 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use APP38 nodes only when moving between nodes. (No real dependencies outside of expected processor.)
APP38 NTM Control Commands:
There are no APP38 specific NTM Control Commands.
APP38 Troubleshooting Table:
APP38 Symptom |
Impacts |
Response |
Node Crashes Evidenced by: - In EMT, applications report lost communications to APP27 service. - In Solarwinds (and outlook), node and processes will be reported down. |
-IBs will not be able to send orders to PROCESSINGs, and PROCESSING responses will be stopped. -APP12 trade corrections will not be able to be sent to PROCESSINGs. Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Restart affected APP27 processes on alternate nodes. 3) Notify Management and PTT. |
APP38 Monitoring Considerations:
See ME_Monitoring_Considerations and APP37_Monitoring_Considerations
There are no APP38 specific monitors outside of EMT and related APP32 and APP37 monitors.
APP39 Purpose:
APP39 processes receive order and execution drop copies from SITE 2 Firms and Vendors and send them to APP07.
Use the following hyperlinks to jump to the desired section of APP39 documentation:
APP39_Monitoring_Considerations
APP39 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- When moving between nodes:
o ALTERNATE NODES are not defined for APP39 services.
o DR NODES must be allocated un-natted nodes. No node can support more than one natted address at the same time.
- When stopping/restarting APP39 processes:
1) Notify the associated vendor or MESSAGING service firm and work in cooperation with them, as appropriate to situations.
2) Notify Tech Services if moving APP39 processes to DR nodes and NAT addresses need to change to accommodate move.
3) Stop the APP39 process.
4) If NOT moving APP39 to a new node, skip to step 5. If moving APP39 to a new node, copy the day’s APP39 PROCESSOR files to the alternate node:
a) Copy: \chx\data\{APP39}\*.log, *.body, *.header, *.seqnums, and *.session files created for the day to the alternate node.
b) Copy: \chx\data\{APP39}\Global.* files created for the day to the alternate node.
c) If the target folder does not exist on the new node, first either create the folder or copy the entire folder.
d) If these files are not moved before the process restarts on the new node, there is a chance of sequence number miscommunication between the order sending firm involved and CHX.
5) Start the APP39 process.
6) Open the channel for the affected process and confirm that the order sending firm connects as expected.
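Because missing session files can cause sequence number problems, it may help to check the target folder before starting the process in step 5. The sketch below is a hypothetical pre-flight check in Python, not part of NTM or APP39. The file patterns come from step 4, but the assumption that every pattern has at least one file on a given day has not been confirmed, so a reported gap should be treated as a prompt to investigate rather than a hard failure.

```python
from pathlib import Path

# Patterns listed in step 4; the check itself is an illustrative assumption.
REQUIRED_PATTERNS = ("*.log", "*.body", "*.header", "*.seqnums", "*.session", "Global.*")


def missing_patterns(target_dir, patterns=REQUIRED_PATTERNS):
    """Return the patterns that match no file in `target_dir`.

    If the folder itself does not exist, every pattern is reported missing.
    """
    target = Path(target_dir)
    if not target.is_dir():
        return list(patterns)
    return [pattern for pattern in patterns if not any(target.glob(pattern))]
```

An empty list means each file type from step 4 is present on the new node. It does not prove the files are the current day's copies.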
Open OSF Channel:
- Use NTM Control Utility – Service Control - APP39 – Open OSF Channel to make OSF connection possible.
Close OSF Channel:
- Use NTM Control Utility – Service Control - APP39 – Close OSF Channel to make OSF connection impossible.
Set Inbound Sequence Number:
- Use NTM Control Utility – Service Control - APP39 – Set Inbound Sequence Number to set Inbound Sequence Number.
Set Outbound Sequence Number:
- Use NTM Control Utility – Service Control - APP39 – Set Outbound Sequence Number to set Outbound Sequence Number.
Enable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP39 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.
Disable THIRD PARTY Stats:
- Use NTM Control Utility – Service Control - APP39 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.
APP39 Symptom |
Impacts |
Response |
Firm disconnects or logs out of session. Evidenced by: - EMT message saying {firm} is disconnected and/or {firm} is logged out. - Stats monitor shows disconnected in status column. {Firm} will be further identified in EMT message by including “LocalFixId” and “RemoteFixID” as configured in APP39Services.xml file within the disconnect message. |
IB is no longer able to receive drop copy related messages from SITE 2 Vendor or MESSAGING Service. |
1) Contact firm. 2) Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them. 3) Stop/Restarts of affected application service may help resolve the issue. |
APP39 Monitoring Considerations:
Stats Monitors: | To Start: | Key Indicators to Monitor: | Symptom: | Response: |
Processes facilitate drop copy communications from SITE 2s and MESSAGING via APP39 PROCESSOR. | PROD MENU: | - Color of data in columns | Data is RED. | Process is either down or multicast data is not being received by monitor. |
| | | Status is Disconnected or Open | Firm is not connected. |
| | | InMsgs value is not relatively close to, or far greater than OutMsgs value. | We may not be processing as expected. |
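The InMsgs/OutMsgs comparison in the monitor above is a judgment call, since this document does not define "relatively close". The sketch below shows one way to make the comparison explicit. The function name and the 10% default tolerance are assumptions chosen for illustration only. They are not documented thresholds.

```python
def msg_counts_look_wrong(in_msgs, out_msgs, tolerance=0.10):
    """Return True when InMsgs and OutMsgs differ by more than `tolerance`.

    The difference is measured as a fraction of the larger of the two counts.
    """
    largest = max(in_msgs, out_msgs)
    if largest == 0:
        return False  # nothing processed yet, so nothing to compare
    return abs(in_msgs - out_msgs) / largest > tolerance
```

With the default tolerance, counts of 1000 in and 990 out pass, while 1000 in and 500 out are flagged for follow-up with the firm or Technical Services.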
APP40 Purpose:
The APP40 process reads order and trade related messages from PROCESSINGs and creates a database loader file to be used in loading the data into the databases via loaders.
APP40 Recovery Considerations:
Stopping/Restart Processes:
- Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.
- Use APP40 nodes only when moving between nodes. (No real dependencies outside of expected processor.)
- If moving APP40 processes to alternative servers, see ME_Recovery_Considerations.
APP40 NTM Control Commands:
There are no APP40 specific NTM Control Commands.
APP40 Troubleshooting Table:
APP40 Symptom |
Impacts |
Response |
Node Crashes Evidenced by: - In EMT, applications report lost communications to APP40 service. - In Solarwinds (and outlook), node and processes will be reported down. |
-PROCESSING related activities will not be loaded into the database. Refer to: Server Specific Recoveries. |
1) Refer to: Server Specific Recoveries. 2) Restart affected APP40 processes on alternate nodes. 3) Notify Management and PTT. |
APP40 Monitoring Considerations:
See APP22_Monitoring_Considerations and DBL_Monitoring_Considerations.
There are no APP40 specific monitors outside of EMT and the related APP22 29 West and XML Database Loader monitors.
APP41 Purpose:
APP41 = Ultra Messaging for the Enterprise. A product of Informatica (29 West) that tries to provide the benefit of fast multicast message delivery along with a guarantee of message delivery persistence. It consists of a APP41 Daemon “Listener” that runs as a service on a server (or a number of servers) and subscribes to the same 29 West topics as all APP41 message senders and receivers. The “listeners” are not an additional hop in message delivery, but are instead an eavesdropping “store” for all messages delivered. The “store” then acts as the place where any receiver that thinks it lost a message would go to retrieve/reprocess it. In CHX’s implementation, there are primary APP41 stores and backup stores to which processes auto-connect if the primary store goes down; they are NOT redundant stores.
has implemented the two separate APP41 Data Stores, for two separate paths of data:
1) FC (MESSAGING)
2) APP36 (Outbound MESSAGING)
- By design, there are eight APPGROUP15 Servers between both data centers:
· One DC1 server acting as the primary APP41 FC Data Store, generally storing APP20, DCS, APP40 and APP28 related messages.
· One DC1 server acting as the backup APP41 FC Data Store, generally storing APP20, DCS, APP40, and APP28 related messages.
· One DC2 server acting as the primary APP41 FC Data Store, generally storing APP20, DCS, APP40 and APP28 related messages.
· One DC2 server acting as the backup APP41 FC Data Store, generally storing APP20, DCS, APP40, and APP28 related messages.
· One DC1 server acting as the primary NYSE APP41 APP36 Data Store, generally storing ME, SUBGROUP01, SUBGROUP02 and APP06 related messages.
· One DC1 server acting as the primary NASD APP41 APP36 Data Store, generally storing ME, SUBGROUP01, SUBGROUP02 and APP06 related messages.
· One DC2 server acting as the primary NYSE APP36 Data Store, generally storing ME, SUBGROUP01, SUBGROUP02 and APP06 related messages.
· One DC2 server acting as the primary NASD APP41 APP36 Data Store, generally storing ME, SUBGROUP01, SUBGROUP02 and APP06 related messages.
Use the following hyperlinks to jump to the desired section of APP41 documentation:
APP41 Recovery Considerations:
NOTE: APP41 Services are not configured to "move" between nodes. Instead, we move the "APP41 registrations" of sending/receiving applications by shutting down APP41 stores.
What happens when sending application goes down?:
When the sending application goes down, there is no impact to the “store”. The “stores” simply don’t have new messages to listen for until the senders come back up. When the sender comes back up, it resumes its connection to the “store” and the “store” continues listening.
What happens when the receiving application goes down (or thinks it lost messages)?:
When the receiving application goes down, it loses its connection to the sending applications and the “store”. When the receiving application comes back up, it resumes its connections to the store and initiates a request for any missed messages. The “store” delivers the messages and the receiving application reprocesses the lost messages. The same “message request/retransmission” would occur if the receiving application suffers a 29 West unrecoverable loss of messages as well.
What happens when the primary APP41 store goes down?:
When the primary APP41 store goes down, both the sending and receiving applications recognize the connections lost and automatically re-connect to the backup APP41 store. The messages “stored” in the primary are no longer available to the receiving application, and only new messages sent from the sending applications to the backup “store” will be able to be resent (if the receiving process thinks they lost them).
In the implementation, the stores are not redundant; Only one store is “listening” at a time. And since APP41 stores are not redundant, messages sent/stored by a given APP41 Service BEFORE "re-registration" periods are essentially unable to be resent AFTER the "re-registration" period. Or in other words, ONLY messages sent/stored AFTER a APP41 registration is made are available to be resent to a receiver.
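The consequence of non-redundant stores can be seen in a toy model. The Python sketch below is purely illustrative and does not use or represent the APP41 API: the class and method names are hypothetical, and it models only the behavior described above (one store listening at a time, re-registration on primary loss, and no re-registration when the primary returns).

```python
class Store:
    """A message store that records whatever is sent while it is registered."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.messages = []


class StorePair:
    """Primary/backup pair in which only the registered store listens."""

    def __init__(self):
        self.primary = Store("primary")
        self.backup = Store("backup")
        self.registered = self.primary

    def send(self, message):
        # Only the currently registered store hears the message.
        self.registered.messages.append(message)

    def fail_primary(self):
        self.primary.up = False
        self.registered = self.backup  # senders and receivers re-register

    def restart_primary(self):
        # Bringing a store back up does not trigger re-registration.
        self.primary.up = True

    def retransmittable(self):
        """Messages a receiver could request from the registered store."""
        return list(self.registered.messages)
```

Sending two messages, failing the primary, and then sending a third leaves only the third message available for retransmission, which is why messages stored before a re-registration cannot be resent afterward.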
APP41 Troubleshooting Table:
APP41 Symptom |
Impacts |
Response |
If ONLY ONE node running APP41 FC Data Store Service is having problems: Evidenced by: - In EMT, application specific APP41 registration errors are seen that indicate problems with one APP41 instance only involving APP20, ME, APP40, DCS and APP28 related messages. |
Impacts will be specific to applications involved. The following applications are SENDERS and RECEIVERS of THIRD PARTY MESSAGING related data: - APP20 to ME - APP38RIDGE to APP32 and APP37 - SUBGROUP02 to APP22 - APP40 to APP22 |
1) Notify Operations Management.
2) Stop the APP41 FC service running on the node (via NTM Control Utility).
3) Confirm through EMT and ER that all APP41 FC SENDING services previously registered re-register for the backup APP41 FC service.
4) Confirm through EMT and ER that all APP41 FC RECEIVING services previously registered re-register for the backup APP41 FC service.
5) Wait for all "re-registrations" to complete and give all applications some time to function with all remaining stores.
6) Wait for a fair amount of time to make sure all applications, including APP41 FC Stores, are healthy. Arbitrarily, 2-5 minutes.
7) Restart the original APP41 FC service on the original node (APP41 FC services are not set up to move between nodes).
8) Confirm that no applications re-register for it, since "re-registrations" only occur when APP41 Stores are lost and not when they are brought up. |
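The confirm-and-wait steps in the response above amount to polling until every previously registered service shows up against the backup store. The sketch below illustrates that loop only. In this playbook the confirmation is made by reading EMT and ER, so `get_registered` is a hypothetical callback standing in for that check, and the service names, timeout and poll interval are illustrative values, not documented settings.

```python
import time


def wait_for_reregistration(expected, get_registered, timeout_s=300.0, poll_s=5.0,
                            sleep=time.sleep, clock=time.monotonic):
    """Poll until every expected service is registered or the timeout passes.

    Returns the set of services still missing (an empty set means all
    services re-registered).
    """
    deadline = clock() + timeout_s
    missing = set(expected) - set(get_registered())
    while missing and clock() < deadline:
        sleep(poll_s)
        missing = set(expected) - set(get_registered())
    return missing


# Hypothetical example: two successive views of what has re-registered.
snapshots = iter([{"sender-a"}, {"sender-a", "receiver-b"}])
missing = wait_for_reregistration(
    expected={"sender-a", "receiver-b"},
    get_registered=lambda: next(snapshots),
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(missing)
```

A non-empty result is the signal to stop and escalate rather than restart the original service, since the steps above depend on all re-registrations completing first.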
APP41 Symptom | Impact | Response |
If ONLY ONE node running APP41 APP36 Data Store Service is having problems. Evidenced by:
- In EMT, application-specific APP41 registration errors are seen that indicate problems with one APP41 instance only, involving ME, SUBGROUP01, SUBGROUP02 and APP06 related messages. |
Impact will be specific to the applications involved. The following applications are SENDERS and RECEIVERS of THIRD PARTY APP36 related data:
- ME to SITE 2 SUBGROUP01
- ME to NASDAQ SUBGROUP01
- SITE 2 SUBGROUP01 to SITE 2 SUBGROUP02
- NASDAQ SUBGROUP01 to NASDAQ SUBGROUP02
- SITE 2 SUBGROUP02 to SITE 2 APP06
- NASDAQ to NASDAQ APP06 |
1) Notify Operations Management.
2) Stop the APP41 APP36 service running on the node (via NTM Control Utility).
3) Confirm through EMT and ER that all APP41 APP36 SENDING services previously registered re-register for the backup APP41 APP36 service.
4) Confirm through EMT and ER that all APP41 APP36 RECEIVING services previously registered re-register for the backup APP41 APP36 service.
5) Wait for all "re-registrations" to complete and give all applications some time to function with all remaining stores.
6) Wait for a fair amount of time to make sure all applications, including APP41 APP36 Stores, are healthy. Arbitrarily, 2-5 minutes.
7) Restart the original APP41 APP36 service on the original node (APP41 APP36 services are not set up to move between nodes).
8) Confirm that no applications re-register for it, since "re-registrations" only occur when APP41 Stores are lost and not when they are brought up. |
APP41 Symptom | Impact | Response |
If BOTH APP41 FC Data stores need to be restarted with stores cleared (as they do at “start of day”). Evidenced by:
- Unresolvable THIRD PARTY problems involving APP20, ME, APP40, DCS and APP28 related messages. |
Impact will be specific to the applications involved. The following applications are SENDERS and RECEIVERS of THIRD PARTY MESSAGING related data:
- APP20 to ME
- APP38RIDGE to ME
- ME to SUBGROUP02, APP40, APP20, APP38
- RTC to APP22, APP28
- APP40 to APP22 |
1) Notify Operations Management.
2) Halt Trading in all stocks. Refer to ME_NTM_Control_Commands.
3) Stop APP10 testing via NTM Control Utility to avoid APP10 testing MEs during recoveries.
4) Stop all applications that utilize APP41 stores (via NTM Control Utility). Use the Production Opsmenu Shutdown Menu and APP41 developer input to confirm the list of processes involved and the following order of shutdown:
5) Stop APP20
6) Stop APP37 Bridge, MEs, APP40s
7) Stop SUBGROUP02
8) Stop APP09 (RTC)
9) Stop APP22
10) Stop APP28
11) Stop both APP41 services
12) Rename APP41 FC Store files, from Production Opsmenu Startup Menu.
13) Purge APP41 FC cache and state files, from Production Opsmenu Startup Menu.
14) Restart both APP41 FC services and check APP41 FC Store log files to confirm startup is “clean”.
15) Restart all applications stopped, in the order that they appear in Production Opsmenu Startup Menu.
16) Confirm through EMT and ER that all startups occur without error.
17) Perform System Integrity Checklist with focus on paths utilized by THIRD PARTY. |
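The ordered shutdown in the response above can be expressed as a simple sequence that halts at the first failure, so that later groups are never stopped while an earlier one is still running. This is an illustrative sketch only: the `stop` callback is a hypothetical stand-in for the operator action taken through the NTM Control Utility, and the authoritative process list and order remain the Production Opsmenu Shutdown Menu and APP41 developer input. Restart order is deliberately not derived here, because the procedure takes it from the Production Opsmenu Startup Menu.

```python
# Shutdown groups in the order listed in the FC full-restart response above.
SHUTDOWN_ORDER = [
    ["APP20"],
    ["APP37 Bridge", "MEs", "APP40s"],
    ["SUBGROUP02"],
    ["APP09 (RTC)"],
    ["APP22"],
    ["APP28"],
    ["APP41 services"],
]


def run_shutdown(order, stop):
    """Stop each group in order, halting at the first failure.

    `stop` is a hypothetical callback returning True when the named
    process has been stopped. Returns (stopped_names, failed_name),
    where failed_name is None if every stop succeeded.
    """
    stopped = []
    for group in order:
        for name in group:
            if not stop(name):
                return stopped, name
            stopped.append(name)
    return stopped, None


# Hypothetical run in which every stop succeeds.
stopped, failed = run_shutdown(SHUTDOWN_ORDER, stop=lambda name: True)
print(stopped, failed)
```

Returning the name of the process that failed to stop gives Operations Management a precise point at which to pause before the store files are renamed or purged.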
APP41 Symptom | Impact | Response |
If BOTH APP41 APP36 Data stores need to be restarted with stores cleared (as they do at “start of day”). Evidenced by:
- Unresolvable THIRD PARTY problems involving ME, SUBGROUP01, SUBGROUP02 and APP06 related messages. |
Impact will be specific to the applications involved. The following applications are SENDERS and RECEIVERS of THIRD PARTY APP36 related data:
- ME to SITE 2 SUBGROUP01
- ME to NASDAQ SUBGROUP01
- SITE 2 SUBGROUP01 to SITE 2 SUBGROUP02
- NASDAQ SUBGROUP01 to NASDAQ SUBGROUP02
- SITE 2 SUBGROUP02 to SITE 2 APP06
- NASDAQ to NASDAQ APP06 |
1) Notify Operations Management.
2) Halt Trading in all stocks. Refer to ME_NTM_Control_Commands.
3) Stop APP10 testing via NTM Control Utility to avoid APP10 testing MEs during recoveries.
4) Stop all applications that utilize APP41 stores (via NTM Control Utility). Use the Production Opsmenu Shutdown Menu and APP41 developer input to confirm the list of processes involved and the following order of shutdown:
5) Stop ME
6) Stop SUBGROUP01
7) Stop SUBGROUP02
8) Stop APP06
9) Stop both APP41 services
10) Rename APP41 APP36 Store files, from Production Opsmenu Startup Menu.
11) Purge APP41 APP36 cache and state files, from Production Opsmenu Startup Menu.
12) Restart both APP41 APP36 services and check APP41 APP36 Store log files to confirm startup is “clean”.
13) Restart all applications stopped, in the order that they appear in Production Opsmenu Startup Menu.
14) Confirm through EMT and ER that all startups occur without error.
15) Perform System Integrity Checklist with focus on paths utilized by THIRD PARTY. |