COMPANY

Crisis Event Management and Operations Playbook

 

 

About this documentation:

This document outlines all system recovery steps to be taken in the event of any crisis event.

This document was last reviewed and updated on: May 20, 2019, by: Drae J. Namaste-Rose

 

 

Table of Contents

 

Crisis Event Definition and Operational Response Procedure

System Integrity Checklist

Generalized Recovery Scenarios

Cannot Send MESSAGES 1 To SITE 2(s)

Cannot Send MESSAGES 2 To SITE 2(s)

Cannot Process MESSAGES 1 From SITE 2(s)

Cannot Process MESSAGES 2 From SITE 2(s)

Experiencing Issues in PROCESSING

Network Issues Compromise Dual Data Center Connectivity

Data Center Move Procedures

Loses A Single Data Center’s Database Access, But Network Between Data Centers and Production FileShare are OK

Loses A Single Data Center’s Database Access AND Production FileShare, But Network Between Data Centers is OK

Needs to Claim Self Help; Another SITE’s Problems Compromising Their Own Published MESSAGING

Server Specific Recovery Scenarios

APPGROUP01 Server

APPGROUP02 Server

APPGROUP03 Server

APPGROUP04 Server

APPGROUP05 Server

APPGROUP06 Server

APPGROUP07 Server

APPGROUP08 or APPGROUP09 Server

APPGROUP10 Server

APPGROUP11 Server (APP31 – SITE 2 to Multicast)

APPGROUP12 Server (APP30)

APPGROUP13 Server (APP36 – to SITE 2 MESSAGES 1 and MESSAGES 2, MESSAGING, MESSAGING and TRF)

APPGROUP14 Server

APPGROUP15 Server

Application Specific Recoveries

APP01 (SITE 2 Processes)

APP02 (MESSAGING Processes)

APP03 (MESSAGING Processes)

APP04 (MESSAGING Application)

APP05 (MESSAGING Reader Application)

APP06 (MESSAGING Services)

APP07 (MESSAGING)

APP08 (COMMUNICATION)

APP09 (MESSAGING)

APP10 (TESTING Application)

APP11 (SITE 2 MESSAGES 1 MESSAGING Processor)

APP12 (CLIENT Interface)

APP13 (SITE 2 MESSAGING Processor)

APP14 (APPGROUP02 – )

APP15 (APPGROUP02 – )

APP16 (APPGROUP02 – )

APP17 (APPGROUP02 – )

APP18 (APPGROUP02 – )

APP19 (Non-Binary / XML Loaders)

APP19 (Binary – SUBGROUP01, SUBGROUP02, SUBGROUP03, SUBGROUP04, SUBGROUP05)

APP20 (APP20 processes)

APP21 (MESSAGING Engine)

APP22 (MESSAGING Router)

APP23 (APP23 PROCESSOR)

APP24 (APP24 PROCESSOR)

APP25 (MESSAGE Reader)

APP26 (MESSAGING APP20 PROCESSOR)

APP27 (ACTIVITY Reader)

APP28 (RISK System)

APP29

APP30 (APP02/SUBGROUP02/SUBGROUP03)

APP31 (APP11/SUBGROUP02/APP13/SUBGROUP04)

APP32 (PROCESSING)

APP33 ( Client)

APP34 (RESOLVER)

APP35 (OPS System)

APP36 (SUBGROUP01, SUBGROUP02, APP06)

APP37 (MESSAGING Service)

APP38 (MESSAGING Service Bridge Process)

APP39 (MESSAGING Processes)

APP40 (Transaction Reader)

APP41 Data Stores (MESSAGING and Outbound MESSAGING)

 

 


 

Crisis Event Definition and Operational Response Procedure

 

A Crisis Event is defined as any event outside of normal operational procedures that would require management assistance to resolve.

 

For every Crisis Event:

 

1)      Event Management (Specific Staff List) must be informed via text and email, and/or phone as necessary.

2)      Event Management must determine if Executive (Specific Staff List) should be advised, and if executive contacts are necessary.

3)      Control room/Help Desk staff will communicate the event (via THIRD PARTY) to Event Management Teams and initialize EXECUTIVE conference call.

4)      Event Managers will start addressing event, with deference given to EXECUTIVE (Specific Staff List) on proposed solution approvals.

5)      Control room/Help Desk staff will provide status updates every 15 minutes (via THIRD PARTY) based on EXECUTIVE Conference Call communications.

 

For all Crisis Management situations, the following roles must be accounted for:

Role:

Responsibility:

Control Room Help Desk

Initial Response to Event; Primarily responsible for THIRD PARTY text and email communications, and initializing EXECUTIVE Conference call.  During event, continues trader support and reporting issues seen to Incident/Communication Managers.

Control Room Operations

Initial Response to Event; Secondarily responsible for THIRD PARTY text and email communications, and initializing EXECUTIVE Conference call.  During event, continues monitoring and reporting issues seen to Incident/Communication Managers.

EXECUTIVE Conference Call Manager

(Specific Staff List)

 

Manages communications between conference call and control room, as well as SMS/email updates.

 

Incident Manager

(Specific Staff List)

 

Assigns and manages all event research, corrective action, and communication staff, as well as their activity.

Communication Manager

(Specific Staff List)

 

Acts as liaison between staff outside of the control room and in control room as necessary.

Compliance Regulatory Manager

(Specific Staff List)

 

Confirms Compliance and Regulatory issues are accounted for, in coordination with EXECUTIVE conference call.


 

System Integrity Checklist

The following checks must be done after any major interruption in Trading, and before any resumption in Trading can be considered:

Confirm all monitoring is working as expected:

DC1 PROD-VDI is receiving stats and EMT updates from both DC1 and DC2 processes.

DC2 PROD-VDI is receiving stats and EMT updates from both DC1 and DC2 processes.

Solarwinds, Oracle Grid and Perfmons are updating as expected (available from DC1 PROD-VDI only).

Confirm APP10 testing is working without issue, confirming all OSF-to-ME-to-SUBGROUP01 functionality.

APP10DC1 testing all MEB instances via APP20_APP10DC1 process.

APP10DC2 testing all MEL and MEO instances via APP20_APP10DC2 process.

APP10APP37 testing all APP32 instances via APP26_APP10APP37 process.

Confirm all Outbound MESSAGING is connected and processing as expected.

Use all four MESSAGING test issues to confirm all outbound MESSAGING.

Confirm SUBGROUP01 stats are updated as orders and MESSAGES 2 are entered.

Confirm SUBGROUP02 stats are updated as MESSAGES 2 are entered.

Confirm APP06 stats are updated as orders and MESSAGES 2 are entered.

Use MESSAGING test issue to confirm MESSAGING records are rejected by APP23 as expected.

Confirm all database loading is up to date for application startups.

                All DC1 and DC2 databases are up to date from all processes running in both DC1 and DC2 data centers.

Confirm all MESSAGES 1 and MESSAGES 2 from SITE 2 are being received and processed by MEs.

                DC1 SITE 2-to-stats monitors are updating from both SITE 2 and NASD, Primary and Alt channels.

                DC2 SITE 2-to-stats monitors are updating from both SITE 2 and NASD, Primary and Alt channels.

Confirm DC1 MESSAGING MESSAGING queries return data for at least one test connectivity issue from each ME.

Confirm DC2 MESSAGING MESSAGING queries return data for at least one test connectivity issue from each ME.

Confirm all MESSAGING to APP32 connectivity is working as expected, confirming APP07 to APP32 connectivity.

From DC1 MESSAGING, send at least 1 MESSAGING cross to every APP32 using test connectivity issues.

From DC2 MESSAGING, send at least 1 MESSAGING cross to every APP32 using test connectivity issues.

Confirm all cross-datacenter test firm connections are established and have simulators connected.

From DC2 PROD-VDI OSF Simulator, send order to APP20_APP20T (running in DC1).

From DC2 PROD-VDI OSF Simulator, send order to APP26_FIXTB (running in DC2).

From DC1 PROD-VDI OSF Simulator, send order to APP20_APP20T2 (running in DC2).

From DC1 PROD-VDI OSF Simulator, send order to APP26_FIXT (running in DC1).

Confirm APP12, APP29 and APP33 queries working as expected.

APP12 queries of orders and MESSAGES 2 occurring after each application’s time of recovery are the most imperative.

 

If Binary Database Loader for OPCON is up to date, check for the following in the ER:

 

abort

alert

data is

disconnect

dupe

error

fail

(space) gap

invalid

loss

lost

MESSAGING download complete

ora-

processed in

rej

trib

unable

warn
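The keyword check above can be applied mechanically once ER entries are available as text. The following is a minimal sketch, assuming the ER can be exported as plain text with one entry per line (the playbook does not specify the ER format or query interface). The function name `scan_er_lines` and the sample entries are hypothetical, and "(space) gap" is read here as the word "gap" preceded by a space.

```python
# Keywords copied from the checklist above; " gap" keeps its leading space.
ER_KEYWORDS = [
    "abort", "alert", "data is", "disconnect", "dupe", "error", "fail",
    " gap", "invalid", "loss", "lost", "MESSAGING download complete",
    "ora-", "processed in", "rej", "trib", "unable", "warn",
]


def scan_er_lines(lines):
    """Return (line_number, keyword, text) for every keyword hit.

    Assumption: ER entries can be exported as plain text, one per line.
    Matching is a case-insensitive substring search.
    """
    hits = []
    for number, text in enumerate(lines, start=1):
        lowered = text.lower()
        for keyword in ER_KEYWORDS:
            if keyword.lower() in lowered:
                hits.append((number, keyword, text.rstrip()))
    return hits


if __name__ == "__main__":
    # Hypothetical sample entries, for illustration only.
    sample = [
        "09:30:01 loader started",
        "09:30:05 ORA-12541: no listener",
        "09:31:10 sequence gap detected",
    ]
    for number, keyword, text in scan_er_lines(sample):
        print(f"line {number}: matched {keyword!r}: {text}")
```

Substring matching is deliberately broad ("rej" catches "reject" and "rejected", "fail" catches "failure"), so hits still need to be reviewed by an operator.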


 

Generalized Recovery Scenarios

 

Cannot Send MESSAGES 1 To SITE 2(s)

Situation

Impacts

Response

cannot send MESSAGES 1 to the SITE 2

 

During Trading Hours:

 

cannot fulfill quoting obligations to National Market System.

 

If problems involve SUBGROUP01 not able to connect to the SITE 2, will also:

 

- not be fulfilling trade reporting obligations,

- not be fulfilling MESSAGING obligations, and

- not be updating MESSAGING.

 

1)      Confirm scope of impacts.

Are problems specific to:

-          SITE 2 and/or NASDAQ

-          DC1 and/or DC2

-          Servers? Processes?  Channels?

2)      Notify management.

3)      If problems are not immediately resolved, zero affected MESSAGES 1 if not already marked “manual” by APP10.

-          use NTM Control SUBGROUP01 by APP32 commands, or

-          Ask SITE 2 to zero MESSAGES 1 as they are able.

4)      Consider suspending trading (if issues not resolved in 5 minutes).

-          Use NTM Control APP32 commands.

5)      Consider SUBGROUP01 APP06 Bypass.

-          Use NTM Control SUBGROUP01 commands.

6)      At time of resolution, confirm the most recent affected MESSAGES 1 are sent to the SITE 2.

-          Either SUBGROUP01 will auto-regenerate MESSAGES 1, or

-          Use NTM Control APP32 commands, if necessary.

7)      Confirm any residual SUBGROUP02, APP06 and MESSAGING concerns are addressed.

 

Scope of Impact Response Considerations:

1)      If considering zeroing of MESSAGES 1, consider doing so only for the affected APP32 instances.

2)      If considering suspending trading, consider doing so only for the affected APP32 instances.

3)      Only consider SUBGROUP01 APP06 Bypass if SITE 2 connectivity is lost, and trade reporting and/or MESSAGING becomes a concern during outage.

4)      If SUBGROUP01-to-SITE 2 connectivity was lost as part of the issue, upon reconnection to the SITE 2, the SUBGROUP01 process will request any queued MESSAGES 1 from the APP41 data store and send these to the APP19 only (not the SITE 2), and then request the most recent stock MESSAGES 1 from all connected APP32 instances and send these to the SITE 2.  All MESSAGES 1, MESSAGES 2, MESSAGING and MESSAGING should be up to date at that time.

Cannot Send MESSAGES 2 To SITE 2(s)

Situation

Impacts

Response

cannot send MESSAGES 2 to the SITE 2

 

During Trading Hours:

 

cannot fulfill last sale reporting obligations to National Market System.

 

If problems involve SUBGROUP02 not able to connect to the SITE 2, will also:

 

- not be fulfilling MESSAGING obligations, and

- not be updating MESSAGING.

 

1)      Confirm scope of impacts.

Are problems specific to:

-          SITE 2 and/or NASDAQ

-          DC1 and/or DC2

-          Servers? Processes?  Channels?

2)      Notify management.

3)      Consider suspending trading (if issues not resolved in 15 minutes).

-          Use NTM Control APP32 commands.

4)      Consider SUBGROUP02 APP06 Bypass.

-          Use NTM Control SUBGROUP02 commands.

5)      At time of resolution, confirm all MESSAGES 2 not reported during outage are resent to SITE 2 as “sold” MESSAGES 2.

-          Either SUBGROUP02 will auto-generate “sold” MESSAGES 2, or

-          Use APP12 to manually resend “sold” MESSAGES 2.

6)      Confirm any residual APP06 and MESSAGING concerns are addressed.

 

Scope of Impact Response Considerations:

1)      If considering suspending trading, consider doing so only for the affected APP32 instances.

2)      Only consider SUBGROUP02 APP06 Bypass if SITE 2 connectivity is lost, and MESSAGING becomes a concern during outage.

3)      If SUBGROUP02-to-SITE 2 connectivity was lost as part of the issue, upon reconnection to the SITE 2, the SUBGROUP02 process will request any queued MESSAGES 2 from the APP41 data store and send these to the SITE 2 as sold.   APP12 trade reporting queries can help uncover any MESSAGES 2 not reported.  APP12 may also allow for the manual re-sending of any MESSAGES 2 “sold”. 

 

Cannot Process MESSAGES 1 From SITE 2(s)

Situation

Impacts

Response

cannot process MESSAGES 1 from the SITE 2

 

During Trading Hours:

 

At a PROCESSING level, cannot adequately validate against locked markets, trade-throughs, or suspending trading.  May be trading against stale BBO information.

 

At a MESSAGING database loading level, post trading integrity checking may be compromised.

1)      Confirm scope of impacts.

Are problems specific to:

-          SITE 2 and/or NASDAQ

-          DC1 and/or DC2

-          Servers? Processes?  Channels?

-          PROCESSING processing?

-          MESSAGING Database Loading?

2)      Notify management.

3)      Consider suspending trading (if issues not resolved in 5 minutes).

-          Use NTM Control APP32 commands.

4)      At time of resolution, there will be no retransmission of lost MESSAGING to PROCESSINGs; confirm MESSAGING is known by MEs.

-          Use MESSAGING MESSAGING queries.

5)      After time of resolution, consider whether any missing data can, and should, be copied or replayed into the database for post trading integrity processing.

 

Scope of Impact Response Considerations:

1)      Inbound MESSAGING processing is configured such that there is dual redundancy supported, reading from the SITE 2s. If redundancy is not lost and MESSAGING is being delivered via at least one path, then there should be no operational impacts, other than a potential loss of MESSAGING in the MESSAGING database loading processing.  (Lost data in MESSAGING database loading may not be fully realized until integrity reports are run.)

2)      If complete MESSAGING delivery is lost, then if considering suspending trading, consider doing so only for the affected APP32 instances.

3)      If MESSAGING was received by MESSAGING Processors, but not processed correctly by MESSAGING APP19, then it is possible for this data to be replayed into the database; however, this is a complicated and time-restrictive procedure.
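The first two considerations above reduce to a decision on whether any redundant inbound path is still delivering. The sketch below illustrates that decision only; the path labels ("primary", "alt") and the returned strings are hypothetical and are not system identifiers, and the result is meant as a prompt for the Incident Manager, not an automated action.

```python
def assess_inbound_feed(path_delivering):
    """Classify impact from the delivery status of redundant inbound paths.

    path_delivering maps a path label to True if data is arriving on it.
    Labels and return strings are illustrative, not system identifiers.
    """
    if not path_delivering:
        raise ValueError("at least one path must be described")
    if any(path_delivering.values()):
        # Consideration 1: at least one path delivers, so no operational
        # impact is expected, but database loading may still have gaps.
        return "no operational impact expected; verify database loading"
    # Consideration 2: complete loss of delivery.
    return "delivery lost; consider suspending trading for affected instances only"
```

For example, `assess_inbound_feed({"primary": False, "alt": True})` falls into the first case, while a mapping with every path set to `False` falls into the second.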

 


 

Cannot Process MESSAGES 2 From SITE 2(s)

Situation

Impacts

Response

cannot process MESSAGES 2 from the SITE 2

 

During Trading Hours:

 

At a PROCESSING level, may not be able to process orders for IPO issues or Market IOC orders (if first sale not yet received). 

 

At a MESSAGING database loading level, post trading integrity checking may be compromised.

 

1)      Confirm scope of impacts.

Are problems specific to:

-          SITE 2 and/or NASDAQ

-          DC1 and/or DC2

-          Servers? Processes?  Channels?

-          PROCESSING processing?

-          MESSAGING Database Loading?

2)      Notify management.

3)      Manually open IPO issues as dictated by other Financial MESSAGING Vendor information.

-          Use NTM Control APP32 commands.

4)      Consider enabling Market IOC processing in affected issues

-          Use NTM Control APP32 commands.

5)      At time of resolution, there will be no retransmission of lost MESSAGING to PROCESSINGs; confirm MESSAGING is known by MEs.

6)      After time of resolution, consider whether any missing data can, and should, be copied or replayed into the database for post trading integrity processing.

 

Scope of Impact Response Considerations:

1)      MESSAGING Inbound processing is configured such that there is dual redundancy supported, reading from the SITE 2s. If redundancy is not lost and MESSAGING is being delivered via at least one path, then there should be no operational impacts, other than a potential loss of MESSAGING in the MESSAGING database loading processing.  (Lost data in MESSAGING database loading may not be fully realized until integrity reports are run.)

2)      If complete MESSAGING delivery is lost, then if considering enabling Market IOC processing, consider doing so only for the affected APP32 instances.

3)      If MESSAGING was received by MESSAGING Processors, but not processed correctly by MESSAGING APP19, then it is possible for this data to be replayed into the database; however, this is a complicated and time-restrictive procedure.

 

 


 

Experiencing Issues in PROCESSING

Situation

Impacts

Response

Either is not:

1)      Accepting some or all inbound orders,

2)      Executing some or all inbound orders,

3)      Canceling some or all inbound orders,

4)      Or sending Execution Reports to firms

 

During Trading Hours:

 

cannot fulfill order processing and/or reporting obligations to National Market System.

 

 

1)      Confirm scope of impacts.

Are problems specific to:

-          SITE 2 and/or NASDAQ

-          DC1 and/or DC2

-          Servers? Processes?  Channels?

2)      Notify management.

3)      If cause of problem is not APP32 related, then work with affected firms to

4)      Consider marking MESSAGES 1 as manual.

-          Use NTM Control SUBGROUP01 by APP32 commands

5)      Consider suspending trading (if issues not resolved in 5 minutes).

-          Use NTM Control APP32 commands.

6)      At time of resolution, confirm that all messages sent to the Exchange or sent back to the affected firms were sent and processed as expected.

 

Scope of Impact Response Considerations:

1)      If problems are isolated to firm connectivity issues only and not APP32 processing issues, then only consider suspending trading if an entire data center’s processing is affected.

2)      If problems do involve APP32 processing issues, and if considering marking MESSAGES 1 as manual or suspending trading, consider doing so only for the affected APP32 instances.
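The scenarios so far attach a time limit to the "consider suspending trading" step: 5 minutes when MESSAGES 1 cannot be sent or processed or when PROCESSING has issues, and 15 minutes when MESSAGES 2 cannot be sent. A minimal sketch of tracking those limits follows. The scenario keys and the function name are hypothetical labels, and the result is only a reminder that the decision point has been reached.

```python
from datetime import datetime, timedelta

# Minutes after which the scenarios above say to consider suspending trading.
SUSPEND_REVIEW_MINUTES = {
    "cannot_send_messages_1": 5,
    "cannot_send_messages_2": 15,
    "cannot_process_messages_1": 5,
    "processing_issues": 5,
}


def suspension_review_due(scenario, started_at, now):
    """Return True once the scenario's review threshold has elapsed.

    Scenario keys are hypothetical labels for the scenarios in this
    document; the result prompts a human decision, it takes no action.
    """
    threshold = timedelta(minutes=SUSPEND_REVIEW_MINUTES[scenario])
    return now - started_at >= threshold


if __name__ == "__main__":
    start = datetime(2019, 5, 20, 9, 30)
    six_minutes_in = start + timedelta(minutes=6)
    for scenario in SUSPEND_REVIEW_MINUTES:
        due = suspension_review_due(scenario, start, six_minutes_in)
        print(f"{scenario}: review due = {due}")
```

The scenario for processing MESSAGES 2 from the SITE 2 is left out of the table because the playbook gives no time limit for it.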

Network Issues Compromise Dual Data Center Connectivity

Situation

Impacts

Response

trading applications lose their ability to communicate across data centers.

 

During Trading Hours:

 

Depending on the scope of the problem:

 

Monitoring of applications may be lost or compromised.

 

may not be able to fulfill order processing and/or reporting obligations to National Market System.

 

Database loading in affected data center may no longer be up to date, and clerical support of order research and/or trade corrections in the affected data center would be impossible.

 

Post Trade Processing using the affected data center may be invalid.

 

All clerical and administrative operations would need to rely on the working data center’s database, including all Post Trade Processing.

 

1)      Confirm scope of impacts.

Are problems specific to:

-          SITE 2 and/or NASDAQ

-          DC1 and/or DC2

-          System Monitoring?

-          PROCESSINGs?

-          Inbound MESSAGING Processing?

-          Outbound MESSAGING Processing?

-          Firm Connectivity?

2)      Notify management.

3)      For each impact identified, see the appropriate Generalized Recovery Scenario in this documentation for appropriate responses, with a focus on whether or not should be zeroing MESSAGES 1 and/or suspending trading.

System monitoring is a priority.

4)      Consider moving processing from the affected data center to the healthy data center.

See Data Center Move Procedures for more details.

5)      At time of resolution, see the appropriate Generalized Recovery Scenario in this documentation for appropriate integrity checks, and post trading impacts.

Scope of Impact Response Considerations:

1)      The network design is such that auto-failovers to redundant services should help the auto-recovery of any situation within 5 minutes of the event. 

If the problems are such that we don’t believe auto-failovers are working as expected, or that the problems may reoccur to the extent that our trading integrity is compromised, we should consider moving processing from the most adversely affected data center to the most healthy data center. 

See each application’s specific recovery procedures in this document for appropriate responses and considerations.


Data Center Move Procedures

Move DC1 to DC2 Data Center

1)      Suspend Trading, Notify Industry

2)      Move Opcon (APP35, SUBGROUP04) for monitoring

See Application Specific Recoveries

3)      Move Instrument/System Activity loading (APP27, DLAC) for ME

See Application Specific Recoveries

4)      Move PROCESSINGs (ME, APP40, DLME) for trading

See ME_Recovery_Considerations

5)      Move MESSAGING (SUBGROUP01, SUBGROUP02, APP06) for SITE 2 and BFD reporting

See APP36_Combined_SUBGROUP01_SUBGROUP02_APP06_Move_Procedure

6)      Move MESSAGING (APP09, APP23) for MESSAGING

See Application Specific Recoveries

7)      Move Firm Comm (APP37, APP38, APP22, APP24)

See Application Specific Recoveries

8)      Move MESSAGING and Administrative (APP07, APP12, MNT, APP33)

See Application Specific Recoveries

9)      Move APP10 (APP10, APP20, APP26)

See APP10_Recovery_Considerations

10)   Move MESSAGING Retrans and RISK (APP04, APP28, APP08)

See Application Specific Recoveries

11)   Move APP19 (DLBF, DLBP, DLCM, DLCS, DLKS, DLMP, DLOM, DLQR, DLRT, DLTR)

See DBL_Recovery_Considerations

12)   Perform System Integrity Checks

See System Integrity Checklist

13)   Resume Trading, Notify Industry

14)   Notify PTT for post trading impacts and resolutions.

NOTE: If DC1 moves to DC2:

- MESSAGING will lose access to APP01, APP03 and APP39 processes and related functionality.  APP26 access will be limited to those firms connecting in DC2 data center.

- Order sending firms without redundant connectivity in DC2 will not have access to PROCESSINGs.  They will also lose APP21 drop copies.

Move DC2 to DC1 Data Center

1)      Suspend Trading, Notify Industry

2)      Move PROCESSINGs (ME, APP40, DLME) for trading

See ME_Recovery_Considerations

3)      Move MESSAGING (SUBGROUP01, SUBGROUP02, APP06) for SITE 2 reporting

See APP36_Combined_SUBGROUP01_SUBGROUP02_APP06_Move_Procedure

4)      Move Firm Comm (APP37, APP22)

See Application Specific Recoveries

5)      Move APP10 (APP10, APP20, APP26)

See APP10_Recovery_Considerations

6)      Move APP19 (SUBGROUP02, DLBF, DLCM, DLMP, DLOM, DLQR, DLTR, APP25)

See DBL_Recovery_Considerations and Binary_DBL_Recovery_Considerations

7)      Perform System Integrity Checks

See System Integrity Checklist

8)      Resume Trading, Notify Industry

NOTE: If DC2 moves to DC1:

- Order sending firms without redundant connectivity in DC2 will not have access to PROCESSINGs.  They will also lose APP21 drop copies.

 

The following applications do not move between Data Centers;

Either the data is replicated in the other data center or external sources are responsible for providing redundancy:

 

-          MESSAGING PROCESSORS / INBOUND MESSAGING READERS (APP11, APP13, SUBGROUP02, SUBGROUP04)

o   Data is redundantly read in both data centers.

 

-          MESSAGING LOADERS / INBOUND MESSAGING LOADERS (APP02, SUBGROUP02, QMTL, SUBGROUP01, SUBGROUP03, SUBGROUP05)

o   Data is redundantly loaded in both data centers.

o   PTT will need to account for using data out of one data center or the other.

 

-          MESSAGINGS (APP01, APP03, APP21, APP20, APP26, APP39)

o   Firms are responsible for providing connectivity in both data centers.

o   APP01, APP03, APP21 and APP39 are only running in DC1.

 

-          APPGROUP02 (APP14, APP15, APP16, APP17, APP18)

o   APPGROUP02 Functionality is provided redundantly in both data centers.

o   IBs will need to use client links facilitating use out of one data center or the other.

 

-          MESSAGING READER (APP05)

o   MESSAGING data is multicast redundantly in both data centers.

o   Operations are the only users of MESSAGING Readers.

 

-          APP41 DATA STORES (FIRM COMM AND APP36)

o   APP41 data stores are separated between data centers.

o   APP41 data senders and receivers are limited to data center specific store storage and retrieval.

 

-          NESPR NAME SERVICE (APP34)

o   NESPR facilities exist in both data centers redundantly.

o   applications only require one instance of NESPR.

Loses A Single Data Center’s Database Access, But Network Between Data Centers and Production FileShare are OK.

Situation

Impacts

Response

Database Access is lost in either DC1 or DC2, but network connectivity between the two data centers is intact.

 

Production FileShare access is also working as expected between data centers.

 

During Trading Hours:

Trading could continue but database loading in affected data center would no longer be up to date, and clerical support of order research and/or trade corrections in the affected data center would be impossible.

 

If processes need to start in the affected data center, they would not be able to.

 

Post Trade Processing using the affected data center would likely be invalid.

 

All clerical and administrative operations would need to rely on the working data center’s database, including all Post Trade Processing.

 

Non-Java application startups would need to rely on TNS_NAMES.ORA file pointing to the working data center’s database.  They could start in their primary data center during this recovery if desired.

 

Java application startups would need to rely on TNS_NAMES.ORA file pointing to the working data center’s database as well as database specific configurations located on local servers in each data center.  They may need to start in alternate data center depending on which database is affected.

1)      Notify management.

2)      If decision is made to rely on the working data center’s database exclusively for the rest of the day, copy the working data center’s TNS_NAMES.ORA file over the affected data center’s TNS_NAMES.ORA file in chxappcfg folders. 

3)      Stop any affected APP19 trying to write to affected database.  These should remain down.

4)      Confirm all database loading to working database is up to date.

5)      Start/Restart any required Java application in the same data center as the working data center.  These applications include: APP07, APP12, MNT, APP33 and APP37.

6)      Start/Restart any required non-Java application as necessary in its primary data center. 

7)      Consider suspending trading only if MEs need to restart.

-          Use NTM Control APP32 commands before stopping/restarting MEs.

-          If MEs crash, rely on restart to halt trading automatically.

8)      Note that OSF Simulators have hard-coded database references in their xml configurations.  If these will be used for testing, they may need changes.

9)      Production Support should also be cognizant of the databases being used in ER queries.

10)   Change post-trading processing to work from single healthy data center.

 

Scope of Impact Response Considerations:

1)      This scenario documents a very specific problem with a well defined scope of impact and response.
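Step 2 of the response above (copying the working data center's TNS_NAMES.ORA over the affected data center's copy) can be sketched as follows. This is an illustration under assumptions: the directory arguments stand in for the two data centers' chxappcfg folders, whose actual paths are not given here, the function name is hypothetical, and the backup of the overwritten file is a precaution added in this sketch and is not a step in the playbook.

```python
import shutil
from pathlib import Path


def replace_tns_names(working_cfg_dir, affected_cfg_dir,
                      filename="TNS_NAMES.ORA"):
    """Copy the working data center's file over the affected one.

    A backup of the affected file is kept alongside it (a precaution
    added here, not a step from the playbook). Returns the backup path,
    or None if the affected directory had no file to back up.
    """
    source = Path(working_cfg_dir) / filename
    target = Path(affected_cfg_dir) / filename
    if not source.is_file():
        raise FileNotFoundError(f"working copy not found: {source}")
    backup = None
    if target.exists():
        # Keep the overwritten file so the change can be reverted later.
        backup = target.with_name(target.name + ".bak")
        shutil.copy2(target, backup)
    shutil.copy2(source, target)
    return backup
```

Keeping the backup makes it straightforward to restore the affected data center's original file once its database access returns.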

Loses A Single Data Center’s Database Access AND Production FileShare, But Network Between Data Centers is OK.

Situation

Impacts

Response

Database Access is lost in either DC1 or DC2, but network connectivity between the two data centers is intact.

 

Production FileShare access in affected data center is also lost.

 

During Trading Hours:

 

The same exact impacts as when a single data center’s database access is lost with a working production fileshare, with the exception that the following local environment variables will need to be modified in order to support application restarts:

 

CHX_APP_CONFIG

IPC_CONFIG

IPC_CONFIG_LOC

TNS_ADMIN

 

All of these will need to change from the generic \\chx.com\prod\chxappcfg value to either a DC1 or DC2 specific production fileshare value.
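Before and after the redirection, it can help to confirm which of the four variables still point at the generic share. The sketch below checks a process environment for that. It assumes a variable counts as "generic" when its value begins with the generic path (compared case-insensitively); the function name is hypothetical, and no data center specific share path is assumed because none is given here.

```python
import os

# Generic production fileshare value named in this scenario.
GENERIC_SHARE = r"\\chx.com\prod\chxappcfg"
FILESHARE_VARIABLES = (
    "CHX_APP_CONFIG", "IPC_CONFIG", "IPC_CONFIG_LOC", "TNS_ADMIN",
)


def variables_on_generic_share(environ=None):
    """Return the listed variables whose value still uses the generic share.

    Assumption: a variable counts as "generic" when its value starts with
    the generic share path, compared case-insensitively.
    """
    env = os.environ if environ is None else environ
    prefix = GENERIC_SHARE.lower()
    return [
        name for name in FILESHARE_VARIABLES
        if env.get(name, "").lower().startswith(prefix)
    ]


if __name__ == "__main__":
    remaining = variables_on_generic_share()
    print("Still on generic share:", remaining or "none")
```

An empty result on a server after the change suggests the redirection took effect for newly started processes; processes already running keep the environment they started with, which is consistent with the reboot called for in the response steps.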

 

 

1)      Redirect the affected data center’s servers to the working data center’s production fileshare.  These are the steps required to do this:

-          If Altiris is available, Tech Services has jobs to redefine local environment variables as necessary. 

-          If Altiris is not available, there are registry key files in \\keymaster that must be manually imported on every server.  To import registry key files on a server:

A)      Login to Server

B)      Via Windows Explorer, find registry key file in \\keymaster\it_operations\controlroom\Test Reg Keys

C)      Double click the registry key file and follow prompts to import the keys.

2)      Once local environment variables have been changed, Tech Services will need to reboot all affected servers. 

-          If Altiris is available, Tech Services can use it.

-          If Altiris is not available, each server must be logged onto and restarted manually, individually.

3)      Follow the same exact procedure used when a single data center’s database access is lost with a working production fileshare.

 

Scope of Impact Response Considerations:

1)      This scenario documents a very specific problem with a well defined scope of impact and response.

 

Needs to Claim Self Help; Another SITE’s Problems Compromising Their Own Published MESSAGING

Situation

Impacts

Response

Either SITE is:

1)      Marking their MESSAGES 1 as “manual”,

2)      Cannot mark their MESSAGES 1 as “manual”,

3)      Or cannot update existing MESSAGES 1.

 

During Trading Hours:

 

Unless excludes affected SITE’s MESSAGES 1 from BBO calculations (implementing Self Help Procedures), may not be adequately validating against locked markets, trade-throughs, or suspending trading.  And may be trading against stale BBO information.

 

1)      Notify management.

2)      Consider implementing Self Help Procedures.

-          Confirm SITE Issues

-          Use NTM Control Utilities APP31 APP11/SUBGROUP02 Control APP11 Markets commands.

-          Send Self Help Email Notices

 

Scope of Impact Response Considerations:

1)      Any problems or their resolution are not in CHX’s control.  Respond appropriately to whichever SITE is having issues.

 

 


 

Server Specific Recovery Scenarios

 

APPGROUP01 Server

Services Involved

Impacts

Response

Services include:

MESSAGING Reader Client

 

Service Types include:

APP05

 

- Service requires FireDaemon setup.

 

During Off-Trading and Trading Hours:

- No trading suspension considerations necessary.   The functionality provided is not required for Trading.

 

- IT Operations are only users. 

- IT Operations will not be able to confirm MESSAGING multicast going out to subscribers.  

 

-MESSAGING reader subscribers may still be getting MESSAGING multicast.

 

- Notify Tech Services for support.

- Do not notify EXECUTIVE.

 

- No alternate nodes.

- Live without functionality until server is reactivated and confirmed healthy.

 

 

APPGROUP02 Server

Services Involved

Impacts

Response

Services MAY include any of the following:

MESSAGING Retrans, Instrument/System Activity Reader and related database loader.

 

Service Types MAY include any of the following:

APP04, APP25, APP27, SUBGROUP02 or DLAC

During Off-Trading and Trading Hours:

- No trading suspension considerations are necessary.  The functionality provided is not required for Trading.

- MESSAGING Retrans users will not be able to request MESSAGINGs.

 

- If APP27 and related database loader were on node, trading applications will not be up to date with Instrument/System Activities.  These are imperative at the time of PROCESSING restarts.

- Notify Help Desk to notify MESSAGING Retrans users affected. 

- Do not notify EXECUTIVE.

- Move all applications to alternate nodes.
- If alternate nodes not available, move to DR nodes.
- All hosts are pre-defined. 

- No configuration changes necessary.

- Restart order (as applicable by server):
APP27, DLAC, APP04, APP25, SUBGROUP02

 

- Database loader files can be concatenated after end of day shutdown.
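The "as applicable by server" qualifier on the restart order can be read as: keep the documented order, but skip any service type that is not hosted on the affected server. The following is a minimal sketch of that reading in Python. The function name and data structure are hypothetical illustrations and are not part of the NTM Control Utility.

```python
# Documented restart order for an APPGROUP02 server (taken from the list above).
RESTART_ORDER = ["APP27", "DLAC", "APP04", "APP25", "SUBGROUP02"]


def restart_sequence(order, present):
    """Return the service types present on the server, in documented restart order.

    `order` is the full documented restart order; `present` is whatever was
    actually running on the failed server. Both names are illustrative only.
    """
    present = set(present)
    return [service for service in order if service in present]


if __name__ == "__main__":
    # Example: the failed server only hosted APP04 and APP27.
    print(restart_sequence(RESTART_ORDER, {"APP04", "APP27"}))
```

The same helper would apply to the other server groups by swapping in their documented restart order.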

 

APPGROUP03 Server

Services Involved

Impacts

Response

Services include:

Apache Tomcat Services

 

During Off-Trading and Trading Hours:

- No trading suspension considerations are necessary.  The functionality provided is not required for Trading.

- IB users will not be able to use utility for administrative functions specific to MESSAGING MESSAGES 2.

 

- Operations staff will not be able to generate BBO Reports.

- Notify Help Desk to notify all APPGROUP02 users affected. 

- Do not notify EXECUTIVE.

- Ask users to access the hot backup node, in either the DC1 or DC2 data center, until the affected server is reactivated and confirmed healthy.

- Notify Tech Services to identify cause and resolution.



 

APPGROUP04 Server

Services Involved

Impacts

Response

Services include:

MESSAGING and related APP19

 

Service Types include:

APP07, DLBP

During Off-Trading and Trading Hours:

- No trading suspension considerations are necessary.    IBs should have capability of trading via TRF terminals.

 

- IBs will not be able to perform trading or administrative functions via MESSAGING terminals.

- Inbound orders to IBs should reject back to firm with “communication problems”.

- PROCESSING responses to IB orders will be seen upon MESSAGING restart.

- Regulatory drop copies from vendors to IBs should queue on their end.

 

- Notify IBs affected. 

- Notify EXECUTIVE.

 

- Move all applications to alternate nodes.

- If alternate nodes not available, move to DR nodes.

- All hosts and host specific configurations are pre-defined, outside of JNLP requirements for APP07 client connections.  These files must be changed to allow APP07 client connectivity.

 

Restart order (as applicable by server):

APP07, DLBP

 

- Database loader files can be concatenated after end of day shutdown.

 

- Confirm all data accounted for around move.

 

 

APPGROUP05 Server

Services Involved

Impacts

Response

Services MAY include any of the following:

APP10, APP20, APP26, Nespr and related APP19.

 

Service Types MAY include any of the following:

APP10, APP20, APP34 and DLCM

 

During Off-Trading and Trading Hours:

- No trading suspension considerations are necessary.    

 

- APP10 testing will not be occurring per rules.

 

- Notify Management that MESSAGES 1 may be manual. 

- Notify EXECUTIVE.

 

- Move all applications to alternate nodes.

- If alternate nodes not available, move to DR nodes.

- All hosts and host specific configurations are pre-defined, outside of APP10 service configuration files that must change to accommodate when APP10 related APP20 or APP26 processes change hosts.

- Nespr is never failed over to another node.  One instance of Nespr runs in each data center and the name services work across the network, so we live without redundancy in these situations.

 

Restart order (as applicable by server):

APP20, APP26, APP10, DLCM

 

- Database loader files can be concatenated after end of day shutdown.

 

- Confirm all data accounted for around move.
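The move rules for this server group (alternate nodes first, DR nodes only when no alternate is available, and Nespr never moved) can be sketched as a small selection function. This is an illustrative Python sketch only: the node names, the `is_available` callback and the function itself are hypothetical, and in practice the candidate hosts are the pre-defined ones mentioned above.

```python
# Services that are never failed over, per the Nespr note above.
NEVER_MOVED = {"Nespr"}


def choose_target(service, alternate_nodes, dr_nodes, is_available):
    """Pick the node a service should move to, or None if it should stay put.

    Alternate nodes are tried in order before any DR node. `is_available` is a
    caller-supplied check (hypothetical) that reports whether a node is usable.
    """
    if service in NEVER_MOVED:
        return None
    for node in [*alternate_nodes, *dr_nodes]:
        if is_available(node):
            return node
    return None


if __name__ == "__main__":
    healthy = {"alt-2", "dr-1"}  # made-up node names for the example
    print(choose_target("APP10", ["alt-1", "alt-2"], ["dr-1"], lambda n: n in healthy))
```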

 

 

APPGROUP06 Server

Services Involved

Impacts

Response

Services MAY include any of the following:

SITE 2, MESSAGING, APP20 (order sending firm), Drop Copy (for order sending firm), APP26 (order sending firm), MESSAGING (for IBs).

 

Service Types MAY include any of the following:

APP01, APP03, APP20, APP21, APP26, APP39

 

During Off-Trading and Trading Hours:

- No trading suspension considerations necessary.   OSFs and vendors are responsible for supporting alternative connectivity if primary connectivity options are not available.

 

- Any messaging supported by the applications involved will not be processed as expected.  Messages in transit at the point of failure may be lost.

- IB messages routed to APP01s should reject back to MESSAGING.

- IB messages routed to APP03s should queue.

- OSF orders sent from APP20 firms to PROCESSINGs should queue on their side.  Open orders already sent will either remain open or be canceled per APP20 configurations.

- Drop copies from PROCESSINGs to APP21 firms should be queued and/or resent via THIRD PARTY data stores.

- OSF orders sent from APP26 firms to PROCESSINGs or IBs should queue on their side.  Open orders already sent will remain open.

- Regulatory drop copies from vendors to IBs should queue on their side.

 

- Notify Tech Services for server and/or NAT change support.

- Notify affected firms. 

- Notify EXECUTIVE.

 

- Move all applications to DR nodes with cooperation of Technical Services and necessary NAT changes.

- All hosts are pre-defined. 

- No configuration changes necessary.

 

- All PROCESSOR files need to be moved ahead of any service restarts.

 

Restart order (as applicable by server):

APP20, APP26, APP21, APP39, APP01, APP03

 

- Confirm all data accounted for around move.

 

 

APPGROUP07 Server

Services Involved

Impacts

Response

Services include the following:

RISK, COMMUNICATION and related APP19.

 

Service Types include the following:

APP28, APP08 and DLKS

 

During Off-Trading and Trading Hours:

- No trading suspension considerations are necessary.    

 

- There are no RISK users, but if there were, they would lose risk management functionality.

 

- Notify Management. 

- No need to notify EXECUTIVE.

 

- Move all applications to alternate nodes.

- If alternate nodes not available, move to DR nodes.

- APP08 requires Linux server WAR file deployment.

 

Restart order (as applicable by server):

APP28, GTWY, DLKS

 

- Database loader files can be concatenated after end of day shutdown.

 

- Confirm all data accounted for around move.

 

APPGROUP08 or APPGROUP09 Server

Services Involved

Impacts

Response

Services MAY include any of the following:

APP29 and/or APP33, APP41 Data Store

 

Service Types MAY include any of the following:

MNT, APP33

 

During Off-Trading and Trading Hours:

- No trading suspension considerations are necessary.  Neither APP29 nor APP33 functionality is required for Trading.

- APP41 Data Store functionality should fail over to the alternate APP41 Data Store.

 

- Notify s/Compliance that APP33 is affected.

- Notify IBs that APP29 is affected. 

- Notify EXECUTIVE that APP41 Data Store involved is no longer redundantly supported until server is put back on line.

 

- Move APP29 or APP33 applications to alternate nodes.

- If alternate nodes not available, move to DR nodes.

- All hosts and host specific configurations are pre-defined, outside of JNLP requirements for MNT and APP33 client connections.  These files must be changed to allow MNT and APP33 client connectivity.

 

Restart order (as applicable by server):

MNT, APP33

 

- APP41 Data Stores are not to be moved between servers.  These services may only be reactivated when the server is placed back in service.

 

 

APPGROUP10 Server

Services Involved

Impacts

Response

Services include:

PROCESSING, Transaction Readers, and related APP19

 

Service Types include:

ECHX, APP40 and

DLME, DLMP

 

During Trading Hours:

- will be exposed until PROCESSING restarts if trading was not already halted at the time of the server shutdown.

- Trading suspension will occur automatically upon PROCESSING restarts; however, production stocks do not open for trading until 6am.

 

- All orders open in the PROCESSING will be canceled upon restart regardless of origin.

- All orders sent to PROCESSING while PROCESSING is down will be rejected, regardless of source of origin.

- All affected APP32 stocks should have MESSAGES 1 zeroed by SUBGROUP01 processes upon APP32 disconnect.

 

During Off-Trading Hours:

- No trading suspension considerations unless PROCESSING will not be up before 6am on a Trading Day. 

 

- Restart PROCESSINGs as soon as possible. 

- Notify EXECUTIVE and participants immediately if trading suspension occurred.

- If there is any question about whether affected MESSAGES 1 have been zeroed, zero MESSAGES 1 via NTM Control Utility, SUBGROUP01 options, by ME.

 

- Move all applications to alternate nodes.

- If alternate nodes not available, move to DR nodes.

- All hosts are pre-defined. 

- No configuration changes necessary.

 

- All DLAPP32 database loading must be up to date in the data center that the APP32 is going to be restarted in before restarts occur.  The only time a database loader file should move between nodes is if the database loading has not been completed and can only be completed by doing so.

 

NOTE: APP40 and DLAPP32 processes for any given APP32 are typically configured to run on a separate node from the APP32 to try and avoid data loss in the case of an APP32 crash.

 

Restart order:

APP40, ECHX, DLME, DLMP

 

- Confirm all data accounted for around move.

 

- Database loader files (not moved by necessity) can be concatenated after end of day shutdown.

 

- Confirm systems integrity and defer to EXECUTIVE instructions before resuming trading, if trading was suspended.

 

APPGROUP11 Server (APP31 – SITE 2 to Multicast)

Services Involved

Impacts

Response

Services include:

SITE 2 Quote Processor, SITE 2 Trade Processor, NASD Quote Processor, NASD Trade Processor

 

Service Types include:

APP11, APP13, SUBGROUP02, SUBGROUP04

 

During Off-Trading and Trading Hours:

- No trading suspension considerations are necessary unless all MESSAGING is lost between two different servers.

 

- Inbound MESSAGING will not be processed as expected.  Redundancy of inbound MESSAGING is supported between “A” series processors and “B” series processors, so unless both servers go down together, there should be no loss of data, except possibly when redundancy is returned (as a result of sequence gap related processing).

 

- Do not notify EXECUTIVE unless trading suspension will be considered.

 

- Move all applications to alternate nodes.

- If alternate nodes not available, move to DR nodes.

- All hosts and host specific configurations are pre-defined. 

- No configuration changes are necessary.

 

Restart order:

APP11, SUBGROUP02, APP13, SUBGROUP04
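The "A" series / "B" series redundancy described for this server can be pictured as two copies of the same sequenced stream, where data is lost only if neither copy delivered a given message. The Python sketch below illustrates that idea under stated assumptions: messages are modeled as (sequence number, payload) pairs and the two feeds are merged in a batch. It is not a description of how the production processors actually arbitrate between the two series.

```python
def arbitrate(feed_a, feed_b):
    """Merge two redundant feeds of (sequence, payload) pairs.

    Returns (messages, gaps): each sequence number appears once in `messages`,
    and `gaps` lists sequence numbers that neither feed delivered.
    """
    merged = {}
    for seq, payload in [*feed_a, *feed_b]:
        # Keep the first copy seen; the duplicate from the other series is dropped.
        merged.setdefault(seq, payload)
    if not merged:
        return [], []
    seqs = sorted(merged)
    gaps = [s for s in range(seqs[0], seqs[-1] + 1) if s not in merged]
    return [(s, merged[s]) for s in seqs], gaps


if __name__ == "__main__":
    a_series = [(1, "quote-1"), (2, "quote-2"), (4, "quote-4")]  # missed 3
    b_series = [(2, "quote-2"), (3, "quote-3"), (4, "quote-4")]  # missed 1
    messages, gaps = arbitrate(a_series, b_series)
    print(messages, gaps)
```

A gap is reported only when both series miss the same sequence number, which mirrors the statement above that data loss requires both servers to be down together.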

 

 

APPGROUP12 Server (APP30)

Services Involved

Impacts

Response

Services include:

Readers and Loaders for BBO Duration, Quote Montage, Lastsale Montage

 

Service Types include:

BBOD, SUBGROUP02, SUBGROUP03, and

SUBGROUP01, SUBGROUP03, SUBGROUP05

 

During Off-Trading and Trading Hours:

- No trading suspension considerations are necessary.

 

- Quote Montage, Lastsale Montage and BBO Duration will all be compromised.  Post Trading Processing will be compromised as a result.

 

- IT Operations will need to work with Post Trading Technology and Database Technologies to determine whether or not missing data needs to be “replayed” and if so, what method of “replay” should be used.

 

- Notify EXECUTIVE and Post Trading Technology that MESSAGING Loading is compromised.

 

- Move all applications to alternate nodes.

- If alternate nodes not available, move to DR nodes.

- All hosts and host specific configurations are pre-defined. 

- No configuration changes are necessary.

 

Restart order:

SUBGROUP03, BBOD, SUBGROUP02, SUBGROUP05, SUBGROUP01, SUBGROUP03

 

- Confirm that as much of the received data as possible was loaded.

 

- Database loader files can be concatenated after end of day shutdown.  

 

APPGROUP13 Server (APP36 – to SITE 2 MESSAGES 1 and MESSAGES 2, MESSAGING, MESSAGING and TRF)

Services Involved

Impacts

Response

Services MAY include any of the following:

MESSAGING, SITE 2 Regional Inputs, MESSAGING/APP23 and APP24 and all related APP19.

 

Service Types MAY include any of the following:

APP06, SUBGROUP01, SUBGROUP02, APP09, APP23, APP24 and

DLBF, DLAR, DLTR, DLRT

 

During Trading Hours:

- Trading suspension should only be considered if recovery takes longer than 5 minutes. 

 

- Outbound MESSAGES 1 will be queued in THIRD PARTY Data Store to be sent to database upon SUBGROUP01 restart.  All PROCESSING top of book orders will be re-quoted when the SUBGROUP01 process reconnects after restart.

- Outbound MESSAGES 2 will be queued in THIRD PARTY Data Store to be sent to SITE 2 “sold” upon SUBGROUP02 restart.

- Outbound MESSAGING will be queued and resent upon RTC and APP23 restarts/reconnects.

- Outbound messages from MESSAGING to TRF will queue to be sent upon APP24 restart/reconnect.

- MESSAGING quote and trade messaging will be queued in THIRD PARTY Data Store to be sent in chronological order upon APP06 restart (as if all MESSAGING Subscribers requested a retransmission of this data).

 

During Off-Trading Hours:

- Trading suspension should only be considered if recovery goes past 6am. 

 

- Notify EXECUTIVE

- Notify SITE 2s, APP23 and/or APP24s affected.

 

- Move applications to alternate nodes.

- If alternate nodes not available, move to DR nodes.

- All hosts and NAT addresses are pre-defined. 

- No configuration changes necessary.

 

- All UTDRI (NASDAQ SUBGROUP02) database loading must be up to date in the data center that the APP32 is going to be restarted in before restarts occur.  The only time a database loader file should move between nodes is if the database loading has not been completed and can only be completed by doing so.

 

- APP23 and APP24 PROCESSOR files need to be moved ahead of service restart.

 

- APP06 log and index files need to be moved ahead of service restart to accommodate later MESSAGING Requests.

 

- Restart order (as applicable by server):

SUBGROUP01, SUBGROUP02, APP23, APP09, APP24, APP06 then APP19.

 

- Confirm all data accounted for around move.

 

- Database loader files (not moved by necessity) can be concatenated after end of day shutdown.
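The two suspension thresholds for this server (more than 5 minutes of recovery during Trading Hours, or recovery running past 6am during Off-Trading Hours) can be expressed as a small check. The following Python sketch is illustrative only: the function and parameter names are hypothetical, the caller decides whether the outage falls in Trading Hours, and weekends and holidays are deliberately ignored.

```python
from datetime import datetime, time, timedelta

MARKET_OPEN = time(6, 0)                      # "6am" from the rules above
MAX_TRADING_OUTAGE = timedelta(minutes=5)     # "longer than 5 minutes"


def next_open(after):
    """Return the first 6am at or after... strictly after the given moment."""
    candidate = datetime.combine(after.date(), MARKET_OPEN)
    if after >= candidate:
        candidate += timedelta(days=1)
    return candidate


def consider_suspension(during_trading_hours, outage_start, expected_recovery):
    """Return True when trading suspension should be considered.

    Assumption: the result is only a prompt to consult EXECUTIVE, not a decision.
    """
    if during_trading_hours:
        return expected_recovery - outage_start > MAX_TRADING_OUTAGE
    return expected_recovery > next_open(outage_start)


if __name__ == "__main__":
    start = datetime(2019, 5, 20, 22, 0)
    print(consider_suspension(False, start, datetime(2019, 5, 21, 6, 30)))
```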

 

 

APPGROUP14 Server

Services Involved

Impacts

Response

Services MAY include any of the following:

APP12, Opcon, APP37, APP38ridge, Drop Copy Reader

 

Service Types MAY include any of the following:

APP12, APP35, APP37, APP38, DSCR, and

DLCS, SUBGROUP04, DLOM

 

During Off-Trading and Trading Hours:

- No trading suspension considerations are necessary. 

 

- System monitoring will be compromised.  EMT will not receive messages from Opcon until problems are resolved.

- Administrative functions and queries from APP12 will not be possible.

- MESSAGING, APP26 and APP12 messaging to PROCESSINGs and/or Order Sending Firms will be queued until APP37 and APP38ridge processes are restarted.

 

- PROCESSING drop copies to Order Sending Firms will be queued in THIRD PARTY Data Store until APP22 process is restarted.

 

- Notify EXECUTIVE and IBs that APP38ridge connectivity between APP07 and APP32 is compromised.

 

- Move applications to alternate nodes.

- If alternate nodes not available, move to DR nodes.

- All hosts and host specific configurations are pre-defined, outside of JNLP requirements for APP12 client connections.  These files must be changed to allow APP12 client connectivity.

 

Restart order (as applicable by server):

APP35 (followed by confirmation of monitoring integrity), APP38, APP37, APP22, APP12, then APP19.

 

- Confirm all data accounted for around move.

 

- Database loader files can be concatenated after end of day shutdown.

 

 

APPGROUP15 Server

Services Involved

Impacts

Response

Services include:

APP41 Data Store (FireDaemon)

 

NOTE: There are two pairings of APP41 Data Stores in each data center.

 

1) APP36 Data Store, storing data from ME, to SUBGROUP01, to SUBGROUP02, to APP06, and

 

2) FS Data Store, storing data from APP38 to ME, to APP40, to APP22, to APP09

 

During Off-Trading and Trading Hours:

- No trading suspension considerations necessary.

 

- Notify EXECUTIVE.

 

- No alternate nodes.

- Live without redundancy until server is reactivated and confirmed healthy.

- Service requires FireDaemon setup.

 

- Confirm APP41 Data Store failover occurred as expected and systems integrity of affected data. 

 

 

 

 


 

Application Specific Recoveries

 

All application specific documentation contains the following sections:

-          Purpose, describing what the application does and what data is processed in general terms. 

-          Troubleshooting Table, describing known events related to the application, their impacts and expected responses, in general terms.

-          Recovery Considerations, outlining detailed steps required to move, and/or recover the application.

-          NTM Control Commands, outlining available NTM Control Commands specific to the application to facilitate various operations.

 


 

APP01 (SITE 2 Processes)

 

                APP01 Purpose:

APP01 processes receive orders from IBs and send them to External Vendors or MESSAGING Services.

They then receive related responses from these services and send these back to IBs.

 

Use the following hyperlinks to jump to the desired section of APP01 documentation:

APP01_Recovery_Considerations

APP01_NTM_Control_Commands

APP01_Troubleshooting_Table

APP01_Monitoring_Considerations

 

APP01 Recovery Considerations:

                                Stopping/Restarting/Moving Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

 

-          When moving between nodes:

o   ALTERNATE NODES are not defined for APP01 services.

o   DR NODES must be allocated un-natted nodes. 

No node can support more than one natted address at the same time.

 

-          When stopping/restarting APP01 processes:

1)      Notify the associated vendor or MESSAGING service firm and work in cooperation with them, as appropriate to situations.

2)      Notify Tech Services if moving APP01 processes to DR nodes and NAT addresses need to change to accommodate move.

3)      Stop the APP01 process.

4)      If NOT moving APP01 to a new node, skip to step 5.  If moving APP01 to a new node, copy the day’s APP01 PROCESSOR files to the alternate node:

a)       Copy the \chx\data\APP01File\*.log, *.body, *.header, *.seqnums and *.session files created for the day to the alternate node.

b)      Copy the \chx\data\APP01File\Global.* file created for the day to the alternate node.

c)       If the target folder does not exist on the new node, first either create the folder or copy the entire folder.

d)      If these files are not moved before the process restarts on the new node, sequence numbers may be miscommunicated between the order sending firm involved and CHX.

5)      Start the APP01 process.

6)      Open the channel for the affected process and confirm that the order sending firm connects as expected.
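The file move in step 4 can be sketched as a script. The Python sketch below is a hedged illustration, not an approved procedure: the function name is hypothetical, "created for the day" is approximated by the file's last-modified date, and the source and target directories are supplied by the caller (for example the APP01File folder on each node).

```python
import shutil
from datetime import date
from pathlib import Path

# File patterns named in steps 4a and 4b above.
PATTERNS = ["*.log", "*.body", "*.header", "*.seqnums", "*.session", "Global.*"]


def copy_days_files(source_dir, target_dir, day):
    """Copy the given day's PROCESSOR files to the target folder.

    Creates the target folder if it does not exist (step 4c) and returns the
    sorted names of the files copied. Assumption: a file belongs to `day` when
    its last-modified date equals `day`.
    """
    source, target = Path(source_dir), Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = set()
    for pattern in PATTERNS:
        for path in source.glob(pattern):
            if path.is_file() and date.fromtimestamp(path.stat().st_mtime) == day:
                shutil.copy2(path, target / path.name)  # copy2 keeps timestamps
                copied.add(path.name)
    return sorted(copied)
```

Comparing the returned list against the source folder before restarting the process is one way to confirm nothing was missed, since a missing sequence number file is the failure step 4d warns about.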

 

APP01 NTM Control Commands:

Open OSF Channel:

-          Use NTM Control Utility – Service Control - APP01 – Open OSF Channel to make OSF connection possible.

 

Close OSF Channel:

-          Use NTM Control Utility – Service Control - APP01 – Close OSF Channel to make OSF connection impossible.

 

Set Inbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP01 – Set Inbound Sequence Number to set Inbound Sequence Number.

 

Set Outbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP01 – Set Outbound Sequence Number to set Outbound Sequence Number.

 

Enable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP01 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.

 

Disable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP01 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.

 

APP01 Troubleshooting Table:

APP01 Symptom

Impacts

Response

Firm disconnects or logs out of session

 

Evidenced by:

-          EMT message saying {firm} is disconnected and/or {firm} is logged out.

 

-          Stats monitor shows disconnected in status column.

 

{Firm} will be further identified in EMT message by including “LocalFixId” and “RemoteFixID” as configured in APP01Services.xml file within the disconnect message.

 

IB is no longer able to send or receive order or order related messages with Vendor or MESSAGING Service.

1)      Contact firm.

2)      Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them.

3)      Stopping and restarting the affected application service may help resolve the issue.

 

APP01 Monitoring Considerations:

Stats Monitors:
APP01 App Connect Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:



Processes facilitate communications from to SITE 2s.

Monitor shows connection status between PROCESSOR and firm, or of PROCESSOR and application.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Status,
- Write Queue

Data is RED.

Process is either down or multicast data is not being received by monitor.


1) Check status of process

2) If process is up, call Technical Services.

Status is Disconnected or Open

Firm is not connected.
1) Use NTM Control Utility APP01 Service Controls to Open Channels.
2) Call Production Control if needed.

Write Queue is non-zero and not decreasing as expected.

Firm may not be processing as expected.
1) Call firm and work with Technical Services if necessary.
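The Write Queue symptom above depends on watching the value over time rather than reading it once. As a rough illustration, the Python sketch below flags a queue as stalled when a series of readings never reaches zero and never decreases. The function name and the idea of collecting readings into a list are assumptions for the example; the stats monitor itself is read by eye.

```python
def queue_stalled(samples):
    """Return True if the queue stays non-zero and never shrinks across samples.

    `samples` is a list of Write Queue readings taken in time order. At least
    two readings are needed before anything can be concluded.
    """
    if len(samples) < 2 or any(value == 0 for value in samples):
        return False
    return all(later >= earlier for earlier, later in zip(samples, samples[1:]))


if __name__ == "__main__":
    print(queue_stalled([5, 5, 7]))   # steady or growing backlog
    print(queue_stalled([9, 6, 2]))   # draining normally
```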

 


 

Stats Monitors:
APP01 App Processing Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from to SITE 2s.

Monitor shows processing statistics.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- InCount,
- OutCount,
- LastInTime,
- LastOutTime

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

No data is displayed.

No data has been generated to any SITE 2. 
1) Check APP01 log files. If all sizes are zero, no traffic has been generated.
2) Test sending order to SITE 2.

InCount does not match OutCount.

Firm may not be processing as expected.
1) Check APP01 FIX message files to confirm inbound messages match outbound messages.
2) Call firm and work with Technical Services if necessary.
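The "No data is displayed" check above (if all log file sizes are zero, no traffic has been generated) can be automated with a short script. This is a minimal Python sketch under stated assumptions: the log directory is passed in by the caller, the `*.log` pattern is a guess at how the APP01 log files are named, and an empty folder is treated as inconclusive rather than as proof of no traffic.

```python
from pathlib import Path


def no_traffic_generated(log_dir, pattern="*.log"):
    """Return True only when log files exist and every one of them is empty."""
    logs = list(Path(log_dir).glob(pattern))
    # With no log files at all there is nothing to conclude, so report False.
    return bool(logs) and all(path.stat().st_size == 0 for path in logs)
```

If the function returns True, the next documented step still applies: test sending an order to confirm the path works.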


 

Stats Monitors:
APP01 IPC Instance Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from to SITE 2s.

Monitor shows IPC channel processing statistics.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Service,
- Hostname,
- Msgs In,
- Msgs Out

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Not all APP01 processes are displayed as expected.

APP01 Service has not been started.
1) Check status of service.

Msgs In and/or Msgs Out are zero.

No messages have been sent/received since that monitor has been started.
1) Check APP01 log files.
2) Call firm and work with Technical Services if necessary.

 


 

Stats Monitors:
APP01 to APP37 IPC Connect Queues

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from to SITE 2s.

Monitor shows IPC channel connectivity status to APP37 processes.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Source,
- Dest,
- Status,
- Queue Size,
- Msgs Out/Sec

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Not all APP01 or APP37 processes are displayed as expected.

APP01 or APP37 Service has not been started or hasn't processed any messages since monitor has been started.
1) Check status of service.
2) Check APP01 or APP37 log files.

Status is not CONNECTED.

Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process if other processes connecting to the same are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if issues.

Queue size is non-zero value and not decreasing as expected.

Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; do not stop/restart source process.
2) Notify Production Support if issues.
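The two IPC Connect Queue responses above lead to different restart targets, and picking the wrong one can delete queued messages. The Python sketch below encodes the rule as read from the table. The function name and argument names are hypothetical, and giving the stuck-queue rule precedence over the status rule is an assumption made here because it is the more conservative choice.

```python
def process_to_restart(status, queue_stuck, peers_show_same_issue):
    """Suggest which side of an IPC channel to stop/restart, or None.

    status: value of the Status column, e.g. "CONNECTED".
    queue_stuck: True when Queue Size is non-zero and not decreasing.
    peers_show_same_issue: True when other processes connecting to the same
        destination show similar problems.
    """
    if queue_stuck:
        # Restarting the source could discard its queued messages.
        return "destination"
    if status != "CONNECTED":
        return "destination" if peers_show_same_issue else "source"
    return None


if __name__ == "__main__":
    print(process_to_restart("DISCONNECTED", False, False))
```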

 


 

Stats Monitors:
APP01 to APP25 IPC Connect Queues

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from to SITE 2s.

Monitor shows IPC channel connectivity status to APP25 processes.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Source,
- Dest,
- Status,
- Queue Size,
- Msgs Out/Sec

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Not all APP01 or APP25 processes are displayed as expected.

APP01 or APP25 Service has not been started or hasn't processed any messages since monitor has been started.
1) Check status of service.
2) Check APP01 or APP25 log files.

Status is not CONNECTED.

Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process if other processes connecting to the same are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if issues.

Queue size is non-zero value and not decreasing as expected.

Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; do not stop/restart source process.
2) Notify Production Support if issues.

 

APP02 (MESSAGING Processes)

 

All APP02 Application Specific Recovery documentation is referenced in the APP30 (APP02/SUBGROUP02/SUBGROUP03) section of this documentation.


 

APP03 (MESSAGING Processes)

 

                APP03 Purpose:

APP03 processes receive order and execution drop copies from IBs and send them to External Vendors or MESSAGING Services.

They then receive related responses from these services and send these back to IBs.

 

Use the following hyperlinks to jump to the desired section of APP03 documentation:

APP03_Recovery_Considerations

APP03_NTM_Control_Commands

APP03_Troubleshooting_Table

APP03_Monitoring_Considerations

 

APP03 Recovery Considerations:

                                Stopping/Restarting/Moving Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

 

-          When moving between nodes:

o   ALTERNATE NODES are not defined for APP03 services.

o   DR NODES must be allocated un-natted nodes. 

No node can support more than one natted address at the same time.

 

-          When stopping/restarting APP03 processes:

1)      Notify the associated vendor or MESSAGING service firm and work in cooperation with them, as appropriate to situations.

2)      Notify Tech Services if moving APP03 processes to DR nodes and NAT addresses need to change to accommodate move.

3)      Stop the APP03 process.

4)      If NOT moving APP03 to a new node, skip to step 5.  If moving APP03 to a new node, copy the day’s APP03 PROCESSOR files to the alternate node:

a)       Copy the \chx\data\{APP03}\*.log, *.body, *.header, *.seqnums and *.session files created for the day to the alternate node.

b)      Copy the \chx\data\{APP03}\Global.* file created for the day to the alternate node.

c)       If the target folder does not exist on the new node, first either create the folder or copy the entire folder.

d)      If these files are not moved before the process restarts on the new node, sequence numbers may be miscommunicated between the order sending firm involved and CHX.

5)      Start the APP03 process.

6)      Open the channel for the affected process and confirm that the order sending firm connects as expected.

 

APP03 NTM Control Commands:

Open OSF Channel:

-          Use NTM Control Utility – Service Control - APP03 – Open OSF Channel to make OSF connection possible.

 

Close OSF Channel:

-          Use NTM Control Utility – Service Control - APP03 – Close OSF Channel to make OSF connection impossible.

 

Set Inbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP03 – Set Inbound Sequence Number to set Inbound Sequence Number.

 

Set Outbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP03 – Set Outbound Sequence Number to set Outbound Sequence Number.

 

Enable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP03 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.

 

Disable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP03 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.

 

 


 

APP03 Troubleshooting Table:

APP03 Symptom

Impacts

Response

Firm disconnects or logs out of session

 

Evidenced by:

-          EMT message saying {firm} is disconnected and/or {firm} is logged out.

 

-          Stats monitor shows disconnected in status column.

 

{Firm} will be further identified in EMT message by including “LocalFixId” and “RemoteFixID” as configured in APP01Services.xml file within the disconnect message.

IB is no longer able to send or receive order or drop copy related messages with Vendor or MESSAGING Service.

1)      Contact firm.

2)      Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them.

3)      Stopping and restarting the affected application service may help resolve the issue.

 

                           

APP03 Monitoring Considerations:

Stats Monitors:
APP03 App Connect Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from to MESSAGING Destinations.

Monitor shows connection status between PROCESSOR and firm.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Status,
- Write Queue

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Status is Disconnected or Open

Firm is not connected.
1) Use NTM Control Utility APP03 Service Controls to Open Channels.
2) Call Production Control if needed.

Write Queue is non-zero and not decreasing as expected.

Firm may not be processing as expected.
1) Call firm and work with Technical Services if necessary.

 

APP04 (MESSAGING Application)

               

APP04 Purpose:

APP04 processes allow MESSAGING Subscribers to request MESSAGING multicast retransmissions of data they believe they might have missed, on demand. APP04 processes receive MESSAGING Requests from MESSAGING Subscribers and send requested data back in return.

 

Use the following hyperlinks to jump to the desired section of APP04 documentation:

 

APP04_Recovery_Considerations

APP04_NTM_Control_Commands

APP04_Troubleshooting_Table

APP04_Monitoring_Considerations

 

APP04 Recovery Considerations:

 

                                Stopping/Restarting Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP04 nodes only when moving between nodes.  There are NAT dependencies on this functionality.

 

-          When stopping/restarting APP04 processes:

1)      Notify the MESSAGING Subscribers and work in cooperation with them, as appropriate to situations.

2)      Notify Tech Services if moving APP04 processes to alternate nodes and NAT addresses need to change to accommodate move.

3)      Stop/Restart the APP04 process.

 

-          APP04 recovery must also be considered if MESSAGING (APP06) Processes are moved between nodes.

-          See APP04_Move_Procedure in the APP36_Recovery_Considerations section for more details.

 

 

APP04 NTM Control Commands:

 

There are no APP04 related NTM Control Commands.

 

 

APP04 Troubleshooting Table:

APP04 Symptom

Impact

Response

Node Crashes

 

Evidenced by:

-          In SolarWinds (and Outlook), node and processes will be reported down.

 

MESSAGING Subscribers will not be able to request MESSAGINGs.

 

Refer to:

APPGROUP02 Server

Server Specific Recoveries.

 

1)      Refer to:

APPGROUP02 Server

Server Specific Recoveries.

2)      Notify Management.

 

                           


 

APP04 Monitoring Considerations:

Stats Monitors:
BFR Stats

Processes facilitate MESSAGING requests from MESSAGING subscribers utilizing MESSAGING log files.

Monitor shows connection status between and MESSAGING Subscribers as well as processing statistics.

To Start:
PROD MENU: Trading Applications Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Connection Name
- Connection Status
- Retrans Requests
- Retrans Messages Sent
- Connection IP
- Connection Port

Symptom: Data is RED.
Response: Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom: Connection status is not LoggedIn.
Response: Subscriber is not logged in and is unable to request or receive MESSAGINGs.
1) Work with firm to reconnect. Note that a stop/restart of the MESSAGING Retrans process on our side will affect every subscriber logged in at the time, and is not recommended unless absolutely necessary.

Symptom: Retrans Requests Accepted and Sent values are unexpectedly high.
Response: May indicate MESSAGING delivery issues, especially if seen for several different subscribers.
1) Call Production Support if necessary.
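
The last symptom above distinguishes one busy subscriber from a pattern across several subscribers. A minimal Python sketch of that check follows. The function names, the `threshold` value and the `min_subscribers` default are hypothetical; the playbook does not define what "unexpectedly high" means, so the threshold must come from operator experience.

```python
from typing import Dict, List


def high_retrans_subscribers(requests: Dict[str, int], threshold: int) -> List[str]:
    """Connection names whose Retrans Requests count exceeds `threshold`."""
    return sorted(name for name, count in requests.items() if count > threshold)


def suggests_delivery_issue(requests: Dict[str, int], threshold: int,
                            min_subscribers: int = 2) -> bool:
    """True when several different subscribers show high retrans counts.

    `min_subscribers` is an assumed reading of "several different
    subscribers"; adjust it to local practice.
    """
    return len(high_retrans_subscribers(requests, threshold)) >= min_subscribers
```

A single subscriber above the threshold is more likely a problem on the firm's side; two or more point toward a delivery issue worth raising with Production Support.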

 

 


 

APP05 (MESSAGING Reader Application)

               

APP05 Purpose:

APP05 processes emulate MESSAGING Subscriber firms by running on servers that are outside of access switches.

APP05 processes receive MESSAGING multicast from MESSAGING (APP06) processes.

APP05 Client applications allow users to view MESSAGING data statistics, status and error messaging as well as MESSAGING data.

 

Operations are the only users of MESSAGING Readers. 

Any issue with MESSAGING Readers may, or may not, mean MESSAGING Subscribers are having similar issues.  

See APP06 Application Specific Scenarios for information related to MESSAGING multicast.

 

Use the following hyperlinks to jump to the desired section of APP05 documentation:

APP05_Recovery_Considerations

APP05_NTM_Control_Commands

APP05_Troubleshooting_Table

APP05_Monitoring_Considerations

 

 

APP05 Recovery Considerations:

 

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          APP05 processes do not move between data centers.

 

APP05 NTM Control Commands:  

 

There are no APP05 related NTM Control Commands.

 


 

APP05 Troubleshooting Table:

APP05 Symptom

Impact

Response

Node Crashes

 

Evidenced by:

-          In SolarWinds (and Outlook), node and processes will be reported down.

MESSAGING Readers will not be able to receive MESSAGING multicast.

 

Refer to:

APPGROUP01 Server

Server Specific Recoveries.

1)      Refer to:

APPGROUP01 Server

Server Specific Recoveries.

2)      Notify Management.

 

                           

APP05 Monitoring Considerations:

MESSAGING Reader Client

Processes emulate MESSAGING Subscribers.

MESSAGING Reader Client allows user monitoring of MESSAGING Statistics, Messaging and MESSAGING data.

To Start:
PROD MENU: Client Apps Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Status Bar (bottom of client)
- Backup State
- Primary State
- Messages Window

Symptom: Status Bar shows client in disconnected state.
Response: APP06 Process(es) are either down or Client started before APP06 Processes.
1) If APP06 processes are down, restart them.
2) If APP06 processes are up, select “Connection” option from client to attempt reconnect.
3) If reconnection attempts fail, call Technical Services; there may be a networking issue.

Symptom: Backup State and/or Primary State is not “active”.
Response: APP06 multicast for alternate channel is not subscribed to.
1) Work with Tech Services to investigate potential network issue.
2) If only one of the channels is not active, MESSAGING Subscribers may not have an issue with the working channel. A stop/restart of the APP06 process involved may correct the issue, but it will interrupt MESSAGING Subscribers reading the working channel, if there is one.

Symptom: Messages Window shows APP06 related issues.
Response: See APP06 Application Recoveries.
1) Call Production Support if needed.

 

APP06 (MESSAGING Services)

 

 

All APP06 Application Specific Recovery documentation is referenced in the APP36 (SUBGROUP01, SUBGROUP02, APP06) section of this documentation.

 

 


 

APP07 (MESSAGING)

               

APP07 Purpose:

APPGROUP04 Servers receive orders, order cancels and order changes from Order Sending Firms and send responses to these to Order Sending Firms.

APPGROUP04 Servers receive order responses from Market Engines, SITE 2s, TRFs, as well as “regulatory” drop copies.

APPGROUP04 Servers send orders to PROCESSINGs, SITE 2s, and TRFs, as well as drop copies to MESSAGING processes.

APPGROUP04 Servers also send trade reports directly to SUBGROUP02 services when they correct MESSAGES 2.

 

Use the following hyperlinks to jump to the desired section of APP07 documentation:

APP07_Recovery_Considerations

APP07_NTM_Control_Commands

APP07_Troubleshooting_Table

APP07_Monitoring_Considerations

 

 

 

APP07 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP07 nodes only when moving between nodes.  (Java code, FireDaemon and Host Specific references in JNLPs required.)

 

-          When stopping/restarting APP07 processes:

1)      Notify the IBs and work in cooperation with them, as appropriate to situations.

2)      Stop/Restart the APP07 process.

 


 

APP07 NTM Control Commands:

Refresh Threshold Data:

-          Use NTM Control Utility – Service Control - APP07 – Refresh Threshold Data.

 

Refresh Trade Ack Data:

-          Use NTM Control Utility – Service Control - APP07 – Refresh Trade Ack Data.

 

Refresh Brokers Clerks Data:

-          Use NTM Control Utility – Service Control - APP07 – Refresh Brokers Clerk Data.

 

Refresh Sub Accounts Data:

-          Use NTM Control Utility – Service Control - APP07 – Refresh Sub Accounts Data.

 

Turn On MESSAGING Control:

-          Use NTM Control Utility – Service Control - APP07 – Turn on MESSAGING Control.

 

Turn Off MESSAGING Control:

-          Use NTM Control Utility – Service Control - APP07 – Turn off MESSAGING Control.

 

End Of Day:

-          Use NTM Control Utility – Service Control - APP07 – End Of Day to start End Of Day processing.

 

Client Shutdown:

-          Use NTM Control Utility – Service Control – APP07 – Client Shutdown to remotely shutdown all clients for the instance.

 


 

APP07 Troubleshooting Table:

APP07 Symptom

Impact

Response

Node Crashes

 

Evidenced by:

-          In SolarWinds (and Outlook), node and processes will be reported down.

 

IBs will not be able to receive or process orders via MESSAGING.

 

Refer to:

APPGROUP04 Server

Server Specific Recoveries.

 

1)      Refer to:

APPGROUP04 Server

Server Specific Recoveries.

2)      Notify Management.

 

                           


 

APP07 Monitoring Considerations:

Stats Monitors:
APP07 IPC Instance Stats

Processes facilitate communications from APPGROUP04 Servers to Order Sending Firms, Drop Copy Vendors, SITE 2s, PROCESSINGs and Trade Reporting Systems.

Monitor shows IPC channel processing statistics.

To Start:
PROD MENU: Trading Applications Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Service
- Hostname
- Msgs In
- Msgs Out

Symptom: Data is RED.
Response: Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom: Not all APP07 processes are displayed as expected.
Response: APP07 Service has not been started.
1) Check status of service.

Symptom: Msgs In and/or Msgs Out are zero when messages are processed.
Response: No messages have been sent/received since the monitor was started.
1) Check MESSAGING log files.
2) Call Production Support if necessary.

 

 


 

Stats Monitors:
MESSAGING to APP03 IPC Connect Queues

Processes facilitate communications from APPGROUP04 Servers to Order Sending Firms, Drop Copy Vendors, SITE 2s, PROCESSINGs and Trade Reporting Systems.

Monitor shows IPC channel connectivity status to APP03 processes.

To Start:
PROD MENU: Trading Applications Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Source
- Dest
- Status
- Queue Size
- Msgs Out/Sec

Symptom: Data is RED.
Response: Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom: Not all APP07 or APP03 processes are displayed as expected.
Response: APP07 or APP03 Service has not been started or hasn't processed any messages since the monitor was started.
1) Check status of service.
2) Check APP07 or APP03 log files.

Symptom: Status is not CONNECTED.
Response: Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if there are issues.

Symptom: Queue size is non-zero and not decreasing as expected.
Response: Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; do not stop/restart source process.
2) Notify Production Support if there are issues.
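
The restart guidance in the IPC Connect Queues table above reduces to a small decision rule, sketched below in Python. The function name and its string return values are hypothetical. Checking the queue condition first is also an assumption: the table does not say which rule wins when a channel is both disconnected and holding queued messages, so the sketch takes the option that avoids deleting queued messages.

```python
def ipc_restart_target(status: str, queue_stalled: bool,
                       other_sources_affected: bool) -> str:
    """Return which process the table says to stop/restart.

    Returns "destination", "source", or "none".
    """
    if queue_stalled:
        # Restarting the source could delete queued messages,
        # so only the destination is restarted.
        return "destination"
    if status != "CONNECTED":
        # If other processes connecting to the same destination show
        # similar issues, the destination is the likely culprit.
        return "destination" if other_sources_affected else "source"
    return "none"
```

The same rule applies to each of the IPC Connect Queues monitors, since their Status and Queue Size rows carry the same responses.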

 


 

Stats Monitors:
MESSAGING to APP24 IPC Connect Queues

Processes facilitate communications from APPGROUP04 Servers to Order Sending Firms, Drop Copy Vendors, SITE 2s, PROCESSINGs and Trade Reporting Systems.

Monitor shows IPC channel connectivity status to APP24 processes.

To Start:
PROD MENU: Trading Applications Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Source
- Dest
- Status
- Queue Size
- Msgs Out/Sec

Symptom: Data is RED.
Response: Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom: Not all APP07 or APP24 processes are displayed as expected.
Response: APP07 or APP24 Service has not been started or hasn't processed any messages since the monitor was started.
1) Check status of service.
2) Check APP07 or APP24 log files.

Symptom: Status is not CONNECTED.
Response: Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if there are issues.

Symptom: Queue size is non-zero and not decreasing as expected.
Response: Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; do not stop/restart source process.
2) Notify Production Support if there are issues.

 


 

Stats Monitors:
MESSAGING to APP32 IPC Connect Queues

Processes facilitate communications from APPGROUP04 Servers to Order Sending Firms, Drop Copy Vendors, SITE 2s, PROCESSINGs and Trade Reporting Systems.

Monitor shows IPC channel connectivity status to APP37 Bridge / PROCESSING processes. The only messages going through these channels are MESSAGING Query messages.

To Start:
PROD MENU: Trading Applications Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Source
- Dest
- Status
- Queue Size
- Msgs Out/Sec

Symptom: Data is RED.
Response: Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom: Not all APP07 or PROCESSING processes are displayed as expected.
Response: APP07 or PROCESSING Service has not been started or hasn't processed any messages since the monitor was started.
1) Check status of service.
2) Check APP07 or PROCESSING log files.

Symptom: Status is not CONNECTED.
Response: Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if there are issues.

Symptom: Queue size is non-zero and not decreasing as expected.
Response: Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; do not stop/restart source process.
2) Notify Production Support if there are issues.

 

Stats Monitors:
MESSAGING to APP37 IPC Connect Queues

Processes facilitate communications from APPGROUP04 Servers to Order Sending Firms, Drop Copy Vendors, SITE 2s, PROCESSINGs and Trade Reporting Systems.

Monitor shows IPC channel connectivity status to APP37 processes.

To Start:
PROD MENU: Trading Applications Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Source
- Dest
- Status
- Queue Size
- Msgs Out/Sec

Symptom: Data is RED.
Response: Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom: Not all APP07 or APP37 processes are displayed as expected.
Response: APP07 or APP37 Service has not been started or hasn't processed any messages since the monitor was started.
1) Check status of service.
2) Check APP07 or APP37 log files.

Symptom: Status is not CONNECTED.
Response: Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if there are issues.

Symptom: Queue size is non-zero and not decreasing as expected.
Response: Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; do not stop/restart source process.
2) Notify Production Support if there are issues.

 


 

Stats Monitors:
MESSAGING to SUBGROUP02 IPC Connect Queues

Processes facilitate communications from APPGROUP04 Servers to Order Sending Firms, Drop Copy Vendors, SITE 2s, PROCESSINGs and Trade Reporting Systems.

Monitor shows IPC channel connectivity status to SUBGROUP02 processes.

To Start:
PROD MENU: Trading Applications Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Source
- Dest
- Status
- Queue Size
- Msgs Out/Sec

Symptom: Data is RED.
Response: Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom: Not all APP07 or SUBGROUP02 processes are displayed as expected.
Response: APP07 or SUBGROUP02 Service has not been started or hasn't processed any messages since the monitor was started.
1) Check status of service.
2) Check APP07 or SUBGROUP02 log files.

Symptom: Status is not CONNECTED.
Response: Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if there are issues.

Symptom: Queue size is non-zero and not decreasing as expected.
Response: Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; do not stop/restart source process.
2) Notify Production Support if there are issues.

 


 

APP08 (COMMUNICATION)

               

All APP08 Application Specific Recovery documentation is referenced in the APP28 (RISK System) section of this documentation.

 

 

APP09 (MESSAGING)

               

APP09 Purpose:

MESSAGING receives MESSAGING records from SUBGROUP02.

MESSAGING sends MESSAGING records to APP23 and sends MESSAGING drop copy messages to APP22 for order sending firms that request them.

 

Use the following hyperlinks to jump to the desired section of APP09 documentation:

APP09_Recovery_Considerations

APP09_NTM_Control_Commands

APP09_Troubleshooting_Table

APP09_Monitoring_Considerations

 

APP09 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          APP09 process is closely tied to the APP36 system.  If moving APP09 to another node, see APP36_Combined_SUBGROUP01_SUBGROUP02_APP06_Move_Procedure.

 

 

APP09 NTM Control Commands:

Resend Zero CuSITE 2 Messages:

-          Use NTM Control Utility – Service Control – Realtime MESSAGING Options – Resend Zero CuSITE 2 Messages.

 

Send EOD Message:

-          Use NTM Control Utility – Service Control – Realtime MESSAGING Options – Send EOD Message.

 

Set Outbound Sequence Number:

-          Use NTM Control Utility – Service Control – Realtime MESSAGING Options – Set Outbound Sequence Number.

 

APP09 Troubleshooting Table:

APP09 Symptom

Impact

Response

Node Crashes

 

Evidenced by:

-          In SolarWinds (and Outlook), node and processes will be reported down.

 

APP23 will no longer be receiving MESSAGING records.

 

Order Sending firms requesting MESSAGING drop copies will no longer be receiving them.

 

Refer to:

APPGROUP13 Server (APP36 – to SITE 2 MESSAGES 1 and MESSAGES 2, MESSAGING, MESSAGING and TRF)

Server Specific Recoveries.

 

1)      Refer to:

APPGROUP13 Server (APP36 – to SITE 2 MESSAGES 1 and MESSAGES 2, MESSAGING, MESSAGING and TRF)

Server Specific Recoveries.

2)      Notify Management.

 

                           


 

APP09 Monitoring Considerations:

Stats Monitors:
RTC to APP23 App Queues Stats

Processes facilitate MESSAGING trade delivery from SUBGROUP02 processes to APP23 PROCESSOR.

Monitor shows connection status between RTC process and APP23 PROCESSOR, and processing statistics.

To Start:
PROD MENU: MESSAGING Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Status
- Write Queue

Symptom: Data is RED.
Response: Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom: Not all RTC processes are displayed as expected.
Response: Not all RTC services have been started, or some have not processed any messages since the monitor was started.
1) Check status of services.
2) Check relevant service log files.

Symptom: Status is not CONNECTED.
Response: Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if there are issues.

Symptom: IPC Connected Queue size is non-zero and not decreasing as expected.
Response: Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; do not stop/restart source process.
2) Notify Production Support if there are issues.

 

Stats Monitors:
RTC App Connect Stats

Processes facilitate MESSAGING trade delivery from SUBGROUP02 processes to APP23 PROCESSOR.

Monitor shows processing statistics between SUBGROUP02, RTC and APP23 processes.

To Start:
PROD MENU: MESSAGING Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- InMsgs
- OutMsgs

Symptom: Data is RED.
Response: Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom: Status is Disconnected or Open.
Response: Firm is not connected.
1) Use NTM Control Utility APP39 Service Controls to Open Channels.
2) Call Production Control if needed.

Symptom: InMsgs values are not equal to or greater than OutMsgs values.
Response: By design, SUBGROUP02 sends all MESSAGES 2 to RTC process but RTC only forwards non-test messages to APP23 for actual MESSAGING. If OutMsgs exceeds InMsgs, we may not be processing as expected.
1) Check RTC log files to confirm inbound and outbound messages.
2) Work with Production Support if necessary.
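
The InMsgs/OutMsgs rule above follows from the design note: because RTC forwards only non-test messages, its outbound count should never exceed its inbound count. A minimal Python sketch of that comparison follows. The function name and the row layout (service name, InMsgs, OutMsgs) are assumptions for illustration and do not reflect how the monitor exposes its data.

```python
from typing import Iterable, List, Tuple


def rows_needing_review(rows: Iterable[Tuple[str, int, int]]) -> List[str]:
    """Return names of rows where OutMsgs exceeds InMsgs.

    Each row is (name, in_msgs, out_msgs). InMsgs >= OutMsgs is the
    expected state, since test messages are received but not forwarded.
    """
    return [name for name, in_msgs, out_msgs in rows if out_msgs > in_msgs]
```

Any name returned is a candidate for the log file check in step 1 of the response.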

 


 

Stats Monitors:
RTC IPC Instance Stats

Processes facilitate MESSAGING trade delivery from SUBGROUP02 processes to APP23 PROCESSOR.

Monitor shows IPC channel processing statistics.

To Start:
PROD MENU: MESSAGING Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Service
- Hostname
- Msgs In
- Msgs Out

Symptom: Data is RED.
Response: Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom: Not all RTC processes are displayed as expected.
Response: Service has not been started.
1) Check status of service.

Symptom: Msgs In and/or Msgs Out are zero.
Response: No messages have been sent/received since the monitor was started.
1) Check RTC log files.
2) Call Production Support if necessary.

 

 

 


 

APP10 (TESTING Application)

               

APP10 Purpose:

APP10 processes generate test orders (emulating order sending firms) and send these to either APP20 or APP26 processes to route to PROCESSINGs.

APP10 processes receive the order and quote responses for each order generated and report “test success/fail” results as they are received.

Based on testing failures (or successes after failures), APP10 processes may automatically set all MESSAGES 1 for stocks traded by a given PROCESSING to manual or auto.  

 

By default configuration settings:

 

APP10DC2 process runs in DC2 and tests all stocks assigned to DC2 MEs, via APP20_APP10DC2 process, running in DC1. 

Failures will control quoting conditions in these stocks.

 

APP10DC1 process runs in DC1 and tests all stocks assigned to DC1 MEs, via APP20_APP10DC1 process, running in DC2. 

Failures will control quoting conditions in these stocks.

 

APP10APP37 process runs in DC1 and tests all stocks assigned to DC2 and DC1 MEs, via APP26_APP10APP37 process, running in DC2. 

Failures will NOT control quoting conditions in these stocks.
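
The default configuration above, together with the rule stated in the APP10 Troubleshooting Table that 3+ consecutive APP10 failures of any kind trigger a “set quote to manual” command, can be summarized as data plus one check. The Python sketch below is illustrative only: the class, field and function names are hypothetical, and it models the documented behavior rather than the actual APP10 implementation.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class App10Instance:
    name: str
    runs_in: str                      # data center hosting the APP10 process
    tests_mes_in: Tuple[str, ...]     # data centers whose ME stocks are tested
    via_process: str                  # APP20/APP26 process used for routing
    via_runs_in: str                  # data center hosting that process
    controls_quoting: bool            # do failures change quoting conditions?


DEFAULT_INSTANCES = (
    App10Instance("APP10DC2", "DC2", ("DC2",), "APP20_APP10DC2", "DC1", True),
    App10Instance("APP10DC1", "DC1", ("DC1",), "APP20_APP10DC1", "DC2", True),
    App10Instance("APP10APP37", "DC1", ("DC2", "DC1"), "APP26_APP10APP37", "DC2", False),
)

MANUAL_THRESHOLD = 3  # "3+ consecutive APP10 failures of any kind"


def should_set_manual(instance: App10Instance, results: Sequence[bool]) -> bool:
    """True when the newest results end in 3+ consecutive failures.

    `results` lists test outcomes oldest first, True meaning success.
    Instances whose failures do not control quoting never trigger.
    """
    if not instance.controls_quoting:
        return False
    streak = 0
    for ok in reversed(results):
        if ok:
            break
        streak += 1
    return streak >= MANUAL_THRESHOLD
```

Note that each controlling instance routes through a process in the opposite data center, which is why a single data center problem can affect testing for stocks assigned to the other.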

 

Use the following hyperlinks to jump to the desired section of APP10 documentation:

APP10_Recovery_Considerations

APP10_NTM_Control_Commands

APP10_Troubleshooting_Table

APP10_Monitoring_Considerations

 

 

 


 

APP10 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP10 nodes only when moving between nodes.  (Configuration files utilize specific IP addresses to connect to APP20 and APP26 instances)

 

-          When stopping/restarting APP10 processes:

1)      Determine if associated APP20 or APP26 instance used by the affected APP10 instance is also moving hosts.

a.       If APP10 associated APP20 or APP26 instances are moving hosts, a different APP10Services.xml configuration file will be required.

The APP10Services.xml files make use of APP10 service level parameters in the APP29.

The following APP10 service level parameters are defined for APP10DC1 and APP10DC2 services:

APP20_PRI_PRI_HOST  

APP20_PRI_ALT_HOST

APP20_DR_PRI_HOST

APP20_DR_ALT_HOST

The following APP10 service level parameters are defined for the APP10APP37 service:

APP26_PRI_PRI_HOST  

APP26_PRI_ALT_HOST

APP26_DR_PRI_HOST

APP26_DR_ALT_HOST

b.       The APP10Services.xml file can easily be modified to redirect APP10 service being moved to use the correct parameter.

c.       There are already some pre-modified versions of APP10Services.xml to aid in these recoveries.

d.       If these configuration files are not moved in before the APP10 process restarts, no connections will occur.

2)      If the associated APP10 APP20 or APP26 instance is NOT moving to an alternate node, skip to step 3.

a)       Move in, or modify the required version of APP10Services.xml file

b)      Stop the associated APP20 or APP26 instance on its current node.

c)       Copy the day’s APP20 or APP26 PROCESSOR files from the d$\chx\data\ subfolder on the current node to the same folder on the alternate node.

d)      Start the associated APP20 or APP26 instance on its alternate node.

e)      Open channel for associated APP20 or APP26 instance and confirm channel opens without issue.

3)      Stop the APP10 instance.

4)      If APP10 process is NOT moving to alternate node, skip to step 5.

a)       Copy the day’s APP10 PROCESSOR files from the d$\chx\data\ subfolder on the current node to the same folder on the alternate node.

5)      Start APP10 instance.

a)       Confirm APP10 testing works without errors. 

APP10 testing starts automatically upon APP10 startup if the startup occurs between 5am and 3:30pm.
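
The skip logic in steps 1 through 5 above can be easier to follow as an ordered plan. The Python sketch below builds that plan from two yes/no answers. The function and parameter names are hypothetical, the step wording is abbreviated from the procedure, and the sketch performs no actions itself.

```python
from typing import List


def app10_restart_plan(upstream_moving: bool, app10_moving: bool) -> List[str]:
    """Ordered steps for stopping/restarting an APP10 instance.

    upstream_moving: the associated APP20 or APP26 instance is moving
                     to an alternate node (steps 1 and 2).
    app10_moving:    the APP10 process itself is moving (step 4).
    """
    steps: List[str] = []
    if upstream_moving:
        steps += [
            "Move in, or modify, the required version of APP10Services.xml",
            "Stop the associated APP20 or APP26 instance on its current node",
            "Copy the day's APP20 or APP26 files to the same folder on the alternate node",
            "Start the associated APP20 or APP26 instance on its alternate node",
            "Open the channel and confirm it opens without issue",
        ]
    steps.append("Stop the APP10 instance")
    if app10_moving:
        steps.append("Copy the day's APP10 files to the same folder on the alternate node")
    steps.append("Start the APP10 instance")
    steps.append("Confirm APP10 testing works without errors")
    return steps
```

With neither component moving, the plan collapses to stop, start and confirm. The configuration file change always comes first when the APP20 or APP26 instance moves, because no connections will occur if APP10 restarts before the file is in place.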


 

APP10 NTM Control Commands:

Auto Quote All MEs:

-          Use NTM Control Utility – Service Control - APP10 – Auto Quote All MEs to send all MEs a message to set all stocks' quote modes to auto.

 

Auto Quote One ME:

-          Use NTM Control Utility – Service Control - APP10 – Auto Quote One APP32 to send one APP32 a message to set all stocks quote modes to auto.

 

Manual Quote All MEs:

-          Use NTM Control Utility – Service Control - APP10 – Manual Quote All MEs to send all MEs a message to set all stocks' quote modes to manual.

 

Manual Quote One ME:

-          Use NTM Control Utility – Service Control - APP10 – Manual Quote One APP32 to send one APP32 a message to set all stocks quote modes to manual.

 

Start All APP32 Test:

-          Use NTM Control Utility – Service Control - APP10 – Start All APP32 Test to start APP10 testing for all MEs, as configured by stock.

 

Start One APP32 Test:

-          Use NTM Control Utility – Service Control - APP10 – Start One APP32 Test to start APP10 testing for a single ME, as configured by stock.

 

Stop All APP32 Test:

-          Use NTM Control Utility – Service Control - APP10 – Stop All APP32 Test to stop APP10 testing for all MEs, as configured by stock.

 

Stop One APP32 Test:

-          Use NTM Control Utility – Service Control - APP10 – Stop One APP32 Test to stop APP10 testing for a single ME, as configured by stock.

 

 

 


 

APP10 Troubleshooting Table:

APP10 Symptom

Impact

Response

APP10 RESTING ORDER QUOTE FAILURES

Caused by SITE 2 connectivity issues

 

Evidenced by:

-          APP10 EMT messages indicating RESTING_ORDER_QUOTE_FAILURE.

 

-          APP10 Processing Stats monitor indicating RESTING_ORDER_QUOTE_FAILURE.

 

-          APP36 Stats monitor showing connectivity issues to SITE 2 (SITE 2 and/or NASD)

 

APP10 is not receiving MESSAGES 1 from SUBGROUP01 process as expected.  

 

During Trading Hours:

 

cannot fulfill quoting obligations to National Market System.

 

MESSAGING will no longer be updating if SUBGROUP01 process is not connected to SITE 2.

 

After 3+ consecutive APP10 failures of any kind, APP10 will send the SUBGROUP01 process a “set quote to manual” command, causing all affected stocks to be marked as “manual” to SITE 2.

 

SITE 2 connectivity issues will prevent industry from seeing these “manual” MESSAGES 1.

 

1)      Refer to: CHX_Cannot_Send_MESSAGES 1_To_SITE 2s

Generalized Recovery Scenario.

 

2)      Refer to: APP36 (SUBGROUP01, SUBGROUP02, APP06)

Application Specific Recoveries.

 

APP10 RESTING ORDER QUOTE FAILURES

NOT caused by SITE 2 connectivity issues

 

Evidenced by:

-          APP10 EMT messages indicating RESTING_ORDER_QUOTE_FAILURE.

 

-          APP10 Processing Stats monitor indicating RESTING_ORDER_QUOTE_FAILURE.

 

-          APP36 Stats monitor showing NO CONNECTIVITY ISSUES to SITE 2 (SITE 2 and/or NASD)

 

-          APP10_App_Connect_Stats monitor showing possible connectivity problem between APP10 and APP11RI/UQDRI process.

 

-          NTM Control APP31 APP11/SUBGROUP02 Display Montage shows MESSAGES 1 marked as “manual”.

 

APP10 is not receiving MESSAGES 1 from SUBGROUP01 process as expected.

 

During Trading Hours:

 

After 3+ consecutive APP10 failures of any kind, APP10 will send the SUBGROUP01 process a “set quote to manual” command, causing all affected stocks to be marked as “manual” to SITE 2.

 

1)      Confirm scope of impact.

In Stats Monitor:

-          Are problems specific to:

-          SITE 2 and/or NASDAQ

-          DC1 and/or DC2

-          Servers? Processes?  Channels?

 

2)      Notify management.

3)      Work with Tech Services to determine corrective actions.  Try to avoid APP32 stop/restart.

-          Stop/Restart of APP10 may resolve issue.

-          Stop/Restart of SUBGROUP01 may resolve issue.

-          NTM APP32 option to “Resend APP32 MESSAGES 1” may resolve issue.

4)      If connectivity cannot be immediately resolved, use NTM Control SUBGROUP01 options by APP32 to zero MESSAGES 1 or mark all MESSAGES 1 as manual.

5)      Consider suspending trading.

6)      If SUBGROUP01 restart is tried, once SITE 2 connectivity is re-established, SUBGROUP01 will automatically request download of updated MESSAGES 1 from all MEs and send these to SITE 2.

7)      Once problems are resolved, use NTM APP32 “Resend APP32 MESSAGES 1” option to generate updated MESSAGES 1 from all MEs and send these to SITE 2.

 

APP10 RESTING ORDER EXEC RPT FAILURE

WITHOUT APP10 ORDER_EXEC_RPT_FAILURE.

 

Evidenced by:

-          APP10 EMT messages indicating RESTING_ORDER_EXEC_RPT_FAILURE.

 

-          APP10 Processing Stats monitor indicating RESTING_ORDER_EXEC_RPT_FAILURE.

 

-          APP10_App_Connect_Stats monitor showing possible connectivity problem between APP10 and FIX.4.1:APP10T process.

 

-          NTM Control APP31 APP11/SUBGROUP02 Display Montage shows MESSAGES 1 marked as “manual”.

 

If APP10 is ONLY reporting RESTING_ORDER_EXEC_RPT_FAILURE, without APP10 ORDER_EXEC_RPT_FAILURE, the implication is that the APP32 is up and processing and connections to APP10 are likely intact, but the APP32 APP10 stock may be closed and need to be re-opened.

 

APP10 is not receiving APP32 trade executions from ME, APP20 or APP26 process as expected, BUT is receiving IOC order cancels as expected.

 

During Trading Hours:

 

After 3+ consecutive APP10 failures of any kind, APP10 will send the SUBGROUP01 process a “set quote to manual” command, causing all affected stocks to be marked as “manual” to SITE 2.

 

1)      Confirm scope of impact.

In Stats Monitor:

-          Are problems specific to:

-          SITE 2 and/or NASDAQ

-          DC1 and/or DC2

-          Servers? Processes?  Channels?

 

2)      Notify management.

3)      Work with APP12 APP10 order queries and other departments as necessary to determine corrective actions.  Try to avoid APP32 stop/restart.

-          NTM APP32 option to “ResAPP41 APP10 Issues” may resolve issue.

-          Stop/Restart of APP20/APP26 may resolve issue; Requires opening of channels.

-          Stop/Restart of APP10 may resolve issue.

4)      If connectivity cannot be immediately resolved, use NTM Control SUBGROUP01 options by APP32 to zero MESSAGES 1 or mark all MESSAGES 1 as manual.

5)      Consider suspending trading.

6)      Once problems are resolved, use NTM APP32 “Resend APP32 MESSAGES 1” option to generate updated MESSAGES 1 from all MEs and send these to SITE 2.

 

APP10 RESTING ORDER EXEC RPT FAILURE

WITH APP10 ORDER EXEC RPT FAILURE.

 

Evidenced by:

-          APP10 EMT messages indicating ORDER_EXEC_RPT_FAILURE and RESTING_ORDER_EXEC_RPT_FAILURE.

 

-          APP10 Processing Stats monitor indicating ORDER_EXEC_RPT_FAILURE and RESTING_ORDER_EXEC_RPT_FAILURE.

 

-          APP10_App_Connect_Stats monitor showing possible connectivity problem between APP10 and FIX.4.1:APP10T process.

 

-          NTM Control APP31 APP11/SUBGROUP02 Display Montage shows MESSAGES 1 marked as “manual”.

 

If APP10 is reporting both RESTING_ORDER_EXEC_RPT_FAILURE and ORDER_EXEC_RPT_FAILURE simultaneously, then the issue is likely caused by APP32 being down, by a broken connection between APP32 and APP10 APP26/APP20, or by a broken connection between APP10 APP26/APP20 and APP10.
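The distinction between this scenario and the RESTING-only scenario can be sketched as a simple lookup. This is a hedged illustration of the reasoning in the text, not a diagnostic tool: the function name `likely_cause` and the returned wording are hypothetical.

```python
from typing import Iterable

RESTING = "RESTING_ORDER_EXEC_RPT_FAILURE"
ORDER = "ORDER_EXEC_RPT_FAILURE"


def likely_cause(reported_failures: Iterable[str]) -> str:
    """Map the failure types currently reported by APP10 to the likely cause."""
    reported = set(reported_failures)
    if RESTING in reported and ORDER in reported:
        # Both failures together: nothing is coming back from APP32.
        return "APP32 down, or a connection between APP32, APP26/APP20 and APP10 is broken"
    if RESTING in reported:
        # Cancels still arrive, so the path is up; only resting orders fail.
        return "APP32 up and connections likely intact; APP32 APP10 stock may be closed"
    return "not covered by these two scenarios"
```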

 

APP10 is not receiving APP32 trade executions from ME, APP20 or APP26 process as expected, NOR receiving IOC order cancels as expected.

 

During Trading Hours:

 

If there are 3 or more consecutive APP10 failures of any kind, APP10 will send the SUBGROUP01 process a “set quote to manual” command, causing all affected stocks to be marked as “manual” to SITE 2.

 

1)      Confirm scope of impacts.

In Stats Monitor:

-          Are problems specific to:

-          SITE 2 and/or NASDAQ

-          DC1 and/or DC2

-          Servers? Processes?  Channels?

 

2)      Notify management.

3)      Work with APP12 APP10 order queries and other departments as necessary to determine corrective actions.  Try to avoid APP32 stop/restart.

-          Stop/Restart of APP20/APP26 may resolve issue; Requires opening of channels.

-          Stop/Restart of APP10 may resolve issue.

-          NTM APP32 option to “ResAPP41 APP10 Issues” may resolve issue.

4)      If connectivity cannot be immediately resolved, use NTM Control SUBGROUP01 options by APP32 to zero MESSAGES 1 or mark all MESSAGES 1 as manual.

5)      Consider suspending trading.

6)      Once problems are resolved, use NTM APP32 “Resend APP32 MESSAGES 1” option to generate updated MESSAGES 1 from all MEs and send these to SITE 2.

 

                           

 

 

               

 

APP10 Monitoring Considerations:

Stats Monitors:
APP10 App Connect Stats

Processes generate test orders and cancels to PROCESSINGs via APP20 or APP26 processes and report communication failures from PROCESSINGs and SUBGROUP01 processes.

Monitor shows connection status between APP10 PROCESSOR and APP20, APP26 and SUBGROUP01 processes.

To Start:
PROD MENU:
Trading Applications Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Name,
- Status,
- Write Queue

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Symptom:
Status is Created, Disconnected or Open

Response:
APP20, APP26 or SUBGROUP01 is not connected.
1) Use NTM Control Utility APP20 or APP26 Service Controls to Open Channels.
2) Stop/Restart APP10 or SUBGROUP01 process if commonality between disconnected statuses can be detected.
3) Call Production Control if needed.

Symptom:
Write Queue shows non-zero values that are not reducing as expected.

Response:
Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; do not stop/restart source process.
2) Notify Production Support if issues persist.
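The Write Queue symptom above (non-zero and not reducing) can be expressed as a check over successive readings. This is a minimal sketch under stated assumptions: the function name `queue_not_draining` and the choice of three readings as the window are hypothetical, and the readings would have to be noted by hand from the monitor.

```python
from typing import Sequence


def queue_not_draining(samples: Sequence[int], min_samples: int = 3) -> bool:
    """True when the last `min_samples` readings are all non-zero and never decrease."""
    if len(samples) < min_samples:
        return False  # not enough readings to judge
    recent = samples[-min_samples:]
    if any(value == 0 for value in recent):
        return False  # queue reached zero, so it is draining
    return all(later >= earlier for earlier, later in zip(recent, recent[1:]))
```

For example, readings of 5, 5, 6 would be flagged, while 9, 7, 4 would not, because the queue is still shrinking.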

 

 

 

 


 

Stats Monitors:
APP10 App Processing Stats

Processes generate test orders and cancels to PROCESSINGs via APP20 or APP26 processes and report communication failures from PROCESSINGs and SUBGROUP01 processes.

Monitor shows testing status and processing statistics of APP10 processes.

To Start:
PROD MENU:
Trading Applications Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Service,
- APP32 Names,
- Status,
- Quote Mode,
- Test Result,
- TestDesp

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Symptom:
Status is not Testing

Response:
APP10 testing has not been enabled.
1) Use NTM Control Utility APP10 Service Controls to control testing.
2) Stop/Restart APP10 process if process isn't responding as expected.
3) Call Production Control if needed.

Symptom:
Quote Mode is Manual

Response:
APP10 test cycles have failed 3 consecutive times and have not since succeeded 3 consecutive times.
1) If failure reasons are quote related, check SUBGROUP01 processes and processing.
2) If failures are order or execution report related, check APP32 processing via statistics monitors and APP12 APP10 order queries if necessary.
3) Call Production Control if needed.

Symptom:
Test Results are not PASS and/or TestDesp are not successful.

Response:
APP10 test cycles have failed 3 consecutive times and have not since succeeded 3 consecutive times.
1) If failure reasons are quote related, check SUBGROUP01 processes and processing.
2) If failures are order or execution report related, check APP32 processing via statistics monitors and APP12 APP10 order queries if necessary.
3) Call Production Control if needed.


 

APP11 (SITE 2 MESSAGES 1 MESSAGING Processor)

 

All APP11 Application Specific Recovery documentation is referenced in the APP31 (APP11/SUBGROUP02/APP13/SUBGROUP04) section of this documentation.

 

APP12 (CLIENT Interface)

               

APP12 Purpose:

APP12 allows users to query order, trade and MESSAGING records.  Users can also cancel orders, modify or resend MESSAGES 2, or enter trade or MESSAGING records.

APP12 Servers send corrected trade reports to PROCESSINGs, or directly to SUBGROUP02 services when they correct MESSAGES 2.

 

Use the following hyperlinks to jump to the desired section of APP12 documentation:

 

APP12_Recovery_Considerations

APP12_NTM_Control_Commands

APP12_Troubleshooting_Table

APP12_Monitoring_Considerations

 

 

APP12 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP12 nodes only when moving between nodes.  (Java code, FireDaemon and Host Specific references in JNLPs required.)

 

-          When stopping/restarting APP12 processes:

1)      Notify Operations and work in cooperation with them, as appropriate to situations.

2)      Stop/Restart the APP12 process.

 

APP12 NTM Control Commands:

Refresh Sub Accounts Data:

-          Use NTM Control Utility – Service Control - APP12 – Refresh Sub Accounts Data.

 

APP12 Troubleshooting Table:

APP12 Symptom:
Node Crashes

Evidenced by:

-          In SolarWinds (and Outlook), node and processes will be reported down.

Impacts:
Operations will not be able to query or administratively manage orders, MESSAGES 2 or MESSAGING reports via APP12.

Response:

1)      Refer to: APPGROUP14 Server, Server Specific Recoveries.

2)      Notify Management.

 

                           

APP12 Monitoring Considerations:

Stats Monitors:
APP12 IPC Instance Stats

Processes facilitate order and trade research and modification capabilities with PROCESSINGs.

Monitor shows IPC channel processing statistics.

To Start:
PROD MENU:
Trading Applications Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Service,
- Hostname,
- Msgs In,
- Msgs Out

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Symptom:
Not all APP12 processes are displayed as expected.

Response:
APP12 Service has not been started.
1) Check status of service.

Symptom:
Msgs In and/or Msgs Out are zero even though messages are being processed.

Response:
No messages have been sent/received since the monitor was started.
1) Check APP12 log files.
2) Call Production Support if necessary.

 


 

APP13 (SITE 2  MESSAGING Processor)

 

All APP13 Application Specific Recovery documentation is referenced in the APP31 (APP11/SUBGROUP02/APP13/SUBGROUP04) section of this documentation.

 

APP14 (APPGROUP02 – )

 

All APP14 Application Specific Recovery documentation is referenced in the APPGROUP03 Server section of this documentation.

 

 

APP15 (APPGROUP02 – )

 

All APP15 Application Specific Recovery documentation is referenced in the APPGROUP03 Server section of this documentation.

 

 

APP16 (APPGROUP02 – )

 

All APP16 Application Specific Recovery documentation is referenced in the APPGROUP03 Server section of this documentation.

 

 

APP17 (APPGROUP02 – )

 

All APP17 Application Specific Recovery documentation is referenced in the APPGROUP03 Server section of this documentation.

 

 

APP18 (APPGROUP02 – )

 

All APP18 Application Specific Recovery documentation is referenced in the APPGROUP03 Server section of this documentation.

 

 


 

APP19 (Non-Binary / XML Loaders)

               

DBL Purpose:

APP19 read files created by associated applications, and load all messages into databases for historical reference.

 

Use the following hyperlinks to jump to the desired section of DBL documentation:

DBL_Recovery_Considerations

DBL_NTM_Control_Commands

DBL_Troubleshooting_Table

DBL_Monitoring_Considerations

 


 

DBL Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use appropriate nodes for associated application only when moving between nodes.  (See associated application sections for reference.)

 

-          When stopping/restarting Database Loader processes:

1)      Stop the associated application that is responsible for writing to the affected database loader file.

2)      Confirm the database loader has completed loading all data.

3)      Stop the database loader.

4)      If NOT moving the Database Loader files to a new node, skip to step 5.

a)       Database Loader files should only be moved if “pre-move” database loading cannot be completed without doing so.

b)      If moving Database Loader files:

Copy :\chx\data\DL*.log, DL*.pos, DL*rejects.log files (created on day) for each Database Loader process moving to alternate node.

5)      Start the associated application that writes the database loader file.

6)      Start the Database Loader.

7)      If APP19 were moved to alternate nodes WITHOUT moving Database Loader files, concatenate Database Loader files:

a)       These steps are only critical for End-Of-Day Post Trade Technology procedures that depend on the affected Database Loader files existing on a given node, and including the entire day’s records. These steps can be left for the End-Of-Day, as long as they are completed prior to the End-Of-Day Post Trade Technology procedures that require them.

b)      In Windows Explorer:

i)        Go to server and folder where first set of Database Loader files exist.

ii)       Rename all *.log files involved (created on the day) from *.log to *A.log

iii)     Copy the second server's versions of all *.log files involved (created on the day) to the original server and folder.

iv)     Rename all newly copied files involved (created on the day) from *.log to *B.log

c)       On original server, using DOS command prompt:

i)        Go to folder where Database Loader files exist.

ii)       COPY *A.log + *B.log *.log
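The intent of the rename and concatenate steps above can be sketched in a script. This is a hedged illustration, not a replacement for the documented procedure: it assumes the intended result is that each `<name>A.log` is followed by its matching `<name>B.log` in a combined `<name>.log`, and the function names `plan_merges` and `merge_pair` and the example file names are hypothetical. Files are read and written in binary mode so that bytes are copied unchanged.

```python
from pathlib import Path
from typing import List, Tuple


def plan_merges(names: List[str]) -> List[Tuple[str, str, str]]:
    """Pair each '<stem>A.log' with '<stem>B.log' and name the merged '<stem>.log'."""
    present = set(names)
    plan = []
    for name in sorted(names):
        if name.endswith("A.log"):
            stem = name[: -len("A.log")]
            second = stem + "B.log"
            if second in present:  # only merge when both halves exist
                plan.append((name, second, stem + ".log"))
    return plan


def merge_pair(folder: Path, first: str, second: str, target: str) -> None:
    """Write the first file followed by the second into target, byte for byte."""
    with open(folder / target, "wb") as out:
        for part in (first, second):
            out.write((folder / part).read_bytes())
```

Planning the pairs separately from writing them makes it possible to review which files will be combined before any file is created.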

 

 


 

DBL NTM Control Commands:

See Database Loader Reject Processing Documentation for NTM Control Commands utilized for this purpose.

 

Database Reject Reformat:

-          Use NTM Control Utility – Utilities – Database Reject Reformat.

-          User will see list of reject files to be processed, if there are any.

-          For any reject files the user wishes to process, they highlight the file and then right-click and select Create New Reload File.

 

 

 

DBL Troubleshooting Table:

DBL Symptom:
There are Database Loading Rejects

Evidenced by:

-          Stats monitor shows non-zero values in reject column.

-          Oracle reject errors are seen in EMT.

Impacts:
Historical data will not be retained as expected, nor be available to applications that may need to act further against the data.

Response:

1)      Use Database Loader Reject Replay Procedures to examine reasons for rejects and replay data as appropriate.

 


 

DBL Monitoring Considerations:

Stats Monitors:
APP19

Processes read data files and load the records into the databases.

Monitor shows processing stats of each loader.

DC = tradeon
DE = etradeon
HC = toss
HE = etoss

To Start:
PROD MENU:
DB Loader Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Rejects,
- Percent Complete,
- Records remaining,
- Insert Rate,
- Minutes to Finish

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Symptom:
Reject File Size column shows positive value (non-zero).

Response:
Rejects have occurred.
Process rejects per procedures.

Symptom:
Percent Complete and Records Remaining columns showing non-zero values and not reducing as expected. Insert Rate is zero or lower than expected rate.

Response:
Process is either down or hung, or database is not responding.
1) Check status of process
2) If process is up, check CPU and memory utilization to see if process is hung.
3) If process is up and not hung, call Database Technologies.
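The relationship between Records Remaining, Insert Rate and Minutes to Finish can be illustrated with a small calculation. This is a sketch under assumptions: it treats Insert Rate as records per second, which this document does not state, and the function name `minutes_to_finish` is hypothetical. The monitor's own formula may differ.

```python
from typing import Optional


def minutes_to_finish(records_remaining: int, insert_rate_per_sec: float) -> Optional[float]:
    """Estimate minutes left; None means the loader appears stalled."""
    if records_remaining <= 0:
        return 0.0  # nothing left to load
    if insert_rate_per_sec <= 0:
        return None  # records remain but nothing is being inserted
    return records_remaining / insert_rate_per_sec / 60.0
```

A `None` result corresponds to the last symptom in the table above: records remain while the insert rate is zero, so the process or the database should be checked.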

 

               


 

 

APP19 (Binary – SUBGROUP01, SUBGROUP02, SUBGROUP03, SUBGROUP04, SUBGROUP05)

 

                Binary DBL Purpose:

APP19 read files created by associated applications, and load all messages into databases for historical reference.

 

Use the following hyperlinks to jump to the desired section of Binary DBL documentation:

Binary_DBL_Recovery_Considerations

Binary_DBL_NTM_Control_Commands

Binary_DBL_Troubleshooting_Table

Binary_DBL_Monitoring_Considerations

 


 

Binary DBL Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use appropriate nodes for associated application only when moving between nodes.  (See associated application sections for reference.)

 

-          When stopping/restarting Database Loader processes:

1)      Stop the associated application that is responsible for writing to the affected database loader file.

2)      Confirm the database loader has completed loading all data.

3)      Stop the database loader.

4)      If NOT moving the Database Loader to a new node, skip to step 5.

a)       Database Loader files should only be moved if “pre-move” database loading cannot be completed without doing so.

b)      If moving Database Loader files:

Copy :\chx\data\*bin*.log and *bin*.pos files (created on day) for each Database Loader process moving to alternate node.

5)      Start the associated application that writes the database loader file.

6)      Start the Database Loader.

7)      If APP19 moved to alternate nodes WITHOUT moving Database Loader files, concatenate Database Loader files:

a)       These steps are only critical for End-Of-Day Post Trade Technology procedures that depend on the affected Database Loader files existing on a given node, and including the entire day’s records. These steps can be left for the End-Of-Day, as long as they are completed prior to the End-Of-Day Post Trade Technology procedures that require them.

b)      In Windows Explorer:

i)        Go to server and folder where first set of Database Loader files exist.

ii)       Rename all *.log files involved (created on the day) from *.log to *A.log

iii)     Copy the second server's versions of all *.log files involved (created on the day) to the original server and folder.

iv)     Rename all newly copied files involved (created on the day) from *.log to *B.log

c)       On original server, using DOS command prompt:

i)        Go to folder where Database Loader files exist.

ii)       COPY *A.log + *B.log *.log
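Step 4b above names the file patterns to copy and limits them to files created on the day. The selection can be sketched as below. This is an illustration only: the function name `files_to_copy` and the example file names are hypothetical, and it assumes the caller supplies each file's date (for example, its last-modified date as a stand-in for "created on day").

```python
from datetime import date
from fnmatch import fnmatchcase
from typing import Iterable, List, Tuple

BINARY_PATTERNS = ("*bin*.log", "*bin*.pos")  # patterns quoted in step 4b


def files_to_copy(
    entries: Iterable[Tuple[str, date]],
    day: date,
    patterns: Tuple[str, ...] = BINARY_PATTERNS,
) -> List[str]:
    """Return names dated `day` that match any pattern, ignoring letter case."""
    return sorted(
        name
        for name, file_day in entries
        if file_day == day and any(fnmatchcase(name.lower(), p) for p in patterns)
    )
```

Listing the selection first gives the operator a chance to confirm that both the .log and .pos file for each loader are present before copying.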

 


 

Binary DBL NTM Control Commands:

 

See Database Loader Reject Processing Documentation for NTM Control Commands utilized for this purpose.

 

Database Loader Options:

-          Use NTM Control Utility – Utilities – Database Loader Options.

-          User will see list of reject files to be processed, if there are any.

-          For any reject files the user wishes to process, they highlight the file and then right-click and select Create New Reject File.

-          Once a reject file has been processed, a reload file will be shown.

-          For any reload file the user wishes to process, they highlight the file and then right-click and select Replay Reload File.

 

 

 

Binary DBL Troubleshooting Table:

Binary DBL Symptom:
There are Database Loading Rejects

Evidenced by:

-          Stats monitor shows non-zero values in reject column.

-          Oracle reject errors are seen in EMT.

Impacts:
Historical data will not be retained as expected, nor be available to applications that may need to act further against the data.

Response:

1)      Use Database Loader Reject Replay Procedures to examine reasons for rejects and replay data as appropriate.

 


 

Binary DBL Monitoring Considerations:

Stats Monitors:
APP19

Processes read data files and load the records into the databases.

Monitor shows processing stats of each loader.

DC = tradeon
DE = etradeon
HC = toss
HE = etoss

To Start:
PROD MENU:
DB Loader Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Rejects,
- Percent Complete,
- Records remaining,
- Insert Rate,
- Minutes to Finish

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Symptom:
Reject File Size column shows positive value (non-zero).

Response:
Rejects have occurred.
Process rejects per procedures.

Symptom:
Percent Complete and Records Remaining columns showing non-zero values and not reducing as expected. Insert Rate is zero or lower than expected rate.

Response:
Process is either down or hung, or database is not responding.
1) Check status of process
2) If process is up, check CPU and memory utilization to see if process is hung.
3) If process is up and not hung, call Database Technologies.

                           

 

               


 

APP20 (APP20 processes)

 

                APP20 (APP20) Purpose:

APP20 processes receive orders and order related information from order sending firms and send to PROCESSINGs.

They then receive related responses from the PROCESSINGs and send these back to order senders.

 

Use the following hyperlinks to jump to the desired section of APP20 documentation:

APP20_Recovery_Considerations

APP20_NTM_Control_Commands

APP20_Troubleshooting_Table

APP20_Monitoring_Considerations

 

APP20 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

 

-          When moving between nodes:

o   ALTERNATE NODES are only to be used when a given firm is having issues and cannot connect to their PRIMARY NODE.

o   DR NODES are to be used when is having issues and must move the affected firms to another node.

DR NODES must be allocated un-natted nodes. 

No node can support more than one natted address at the same time.
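The one-natted-address-per-node rule above can be checked before a move is planned. This is a minimal sketch, assuming the planned moves are written down as (node, NAT address) pairs; the function name `nodes_over_nat_limit`, the node names and the addresses are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def nodes_over_nat_limit(assignments: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return nodes that would carry more than one distinct NAT address."""
    by_node = defaultdict(set)
    for node, nat_address in assignments:
        by_node[node].add(nat_address)
    return {
        node: sorted(addresses)
        for node, addresses in by_node.items()
        if len(addresses) > 1
    }
```

An empty result means the plan respects the rule; any node listed would need one of its firms placed on a different un-natted DR node.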

 

-          When stopping/restarting APP20 processes:

1)      Notify the associated firm and work in cooperation with them, as appropriate to situations.

2)      Notify Tech Services if moving APP20 processes to DR nodes and NAT addresses need to change to accommodate move.

3)      Stop the APP20 process.

4)      If NOT moving APP20 to new node, skip to step 5.

a)       If moving the APP20 to a new node, copy the day’s APP20 PROCESSOR files to alternate node:

i)        Copy :\chx\data\{APP20}\*.conf, *.in, *.ndx.in, *.out, *.ndx.out files for each APP20 process moving to the alternate node.

ii)       If the target folder does not exist on the new node, you must first either create the folder, or copy the entire folder.

iii)     If these files are not moved before the process restarts on the new node, there will be a chance of sequence number miscommunications between the order sending firm involved and CHX.

iv)     If moving an APP10 associated APP20 (or APP26) process, also see APP10 recovery considerations documentation.

5)      Start the APP20 process.

6)      Open the channel for the process affected and confirm order sending firm connects as expected.

APP20 NTM Control Commands:

Open OSF Channel:

-          Use NTM Control Utility – Service Control - APP20 – Open OSF Channel to make OSF connection to APP20 possible.

 

Close OSF Channel:

-          Use NTM Control Utility – Service Control - APP20 – Close OSF Channel to make OSF connection to APP20 impossible.

 

Set Inbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP20 – Set Inbound Sequence Number to set Inbound Sequence Number.

 

Set Outbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP20 – Set Outbound Sequence Number to set Outbound Sequence Number.

 

Enable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP20 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.

 

Disable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP20 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.

 

APP20 Troubleshooting Table:

APP20 Symptom:
Firm disconnects or logs out of session

Evidenced by:

-          EMT message saying {firm} is disconnected and/or {firm} is logged out.

-          Stats monitor shows disconnected in status column.

{Firm} will be further identified in the EMT disconnect message by the “LocalFixId” and “RemoteFixID” configured in the APP20Services.xml file.

Impacts:
Firm is no longer able to send or receive order or order related messages with PROCESSING.

Response:

1)      Contact firm.

2)      Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them.

3)      Stop/Restarts of affected application service may help resolve the issue.

APP20 Symptom:
Firm wants to force all orders from service to be canceled.

Evidenced by:

-          Firm calling and requesting all orders be canceled.

Impacts:
The affected firm wants their risk mitigated by not leaving any open orders in the PROCESSING.

Response:

1)      Stopping the firm’s APP20 service(s) will also force a “Send AllOAPP39xlReq” message to all PROCESSINGs, canceling all of the affected APP20’s open orders.

2)      Use the NTM Control Utility APP32 option to “Forcibly Cancel Orders by Firm” if the firm wants all orders canceled, regardless of which APP20 service they may have been sent through.

 

APP20 Monitoring Considerations:

Stats Monitors:
APP20 App Connect Stats

Processes facilitate communications from order sending firms directly to PROCESSINGs.
(Direct access Protocol Buffer)

Monitor shows connection status between PROCESSOR and firm, or 29 West connection status and PROCESSING, by topic.

To Start:
PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Status,
- Write Queue

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Symptom:
Status is Disconnected or Open

Response:
Firm is not connected.
1) Use NTM Control Utility APP20 Service Controls to Open Channels.
2) Call Production Control if needed.

Symptom:
Status is Inactive

Response:
29 West communications have been disabled between the PROCESSOR and the PROCESSING.
1) Check status of process
2) If process is up, call Production Support.

Symptom:
Write Queue shows non-zero values that are not reducing as expected.

Response:
Firm may not be processing as expected.
1) Call firm and work with Technical Services if necessary.

 

Stats Monitors:
APP20 App Processing Stats

Processes facilitate communications from order sending firms directly to PROCESSINGs.
(Direct access Protocol Buffer)

Monitor shows processing statistics of application.

To Start:
PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- InCount,
- OutCount,
- LastInTime,
- LastOutTime

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Symptom:
No data is displayed.

Response:
No data has been generated by order sending firm.
1) Check APP20 FIX message files. If all sizes are zero, no traffic has been generated.
2) Call firm if necessary.

Symptom:
InCount is not less than or equal to OutCount.

Response:
Firm may not be receiving all PROCESSING responses expected.
1) Check APP20 FIX message files to confirm inbound messages and outbound messages.
2) Call firm if necessary.
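The InCount and OutCount comparison above can be applied to several services at once. This is a minimal sketch, assuming the monitor values have been noted as (service, InCount, OutCount) rows; the function name `services_to_check` and the service names are hypothetical.

```python
from typing import Iterable, List, Tuple


def services_to_check(rows: Iterable[Tuple[str, int, int]]) -> List[str]:
    """Return the services whose InCount exceeds OutCount."""
    return [service for service, in_count, out_count in rows if in_count > out_count]
```

Any service returned here matches the symptom in the table and would be followed up by checking the APP20 FIX message files for that service.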

 


 

Stats Monitors:
THIRD PARTY APP20 LBM Stats

Processes facilitate communications from order sending firms directly to PROCESSINGs.

Monitor shows 29 West statistics (by topic) for APP20 processes, receiving from PROCESSING processes, and sending to PROCESSING processes.

To Start:
PROD MENU:
29 West LBM Monitor Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- ContextName,
- Service,
- Topic_name,
- Type,
- Rate,
- Persistence,
- MsgCount

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Symptom:
Statistics are not shown for APP20 process as expected.

Response:
Process may not be up, or 29 West Stats have not yet been enabled for process.
1) Check status of process
2) Via NTM Control APP20 Service Controls, enable 29 West stats for process.

Symptom:
Rate and/or MsgCount values are not incrementing as expected.

Response:
Firms may not be receiving messages as expected.
1) Check LBM APP20 files to confirm 29 West related processing.
2) Call Production Support if necessary.

 

 

 


 

Stats Monitors:
THIRD PARTY APP20 LBM RCVRM Stats

Processes facilitate communications from order sending firms directly to PROCESSINGs.

Monitor shows 29 West "receiving" statistics (by sender, and topic) for APP20 processes, receiving from PROCESSING processes.

To Start:
PROD MENU:
29 West LBM Monitor Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- ContextName,
- Service,
- Topic_name,
- Lost-Recovered,
- Lost-Unrecovered-Txm,
- Lost-unrecovered-tmo,
- Msgs_rcved

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Symptom:
Statistics are not shown for APP20 process as expected.

Response:
Process may not be up, or 29 West Stats have not yet been enabled for process.
1) Check status of process
2) Via NTM Control APP20 Service Controls, enable 29 West stats for process.

Symptom:
Msgs_rcved values are not incrementing as expected.

Response:
Firms may not be receiving messages as expected.
1) Check LBM APP20 files to confirm 29 West related processing.
2) Call Production Support if necessary.

Symptom:
Lost-unrecovered values are non-zero.

NOTE: Non-zero Lost-Recovered values may also indicate problems, but mean that THIRD PARTY auto-recovered the lost messages.

Response:
Firms may not be receiving messages as expected.
1) Check LBM APP20 files to confirm 29 West related processing.
2) Call Production Support if necessary.
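The difference between recovered and unrecovered loss in the receiver statistics can be sketched as a row check. This is an illustration only: it assumes one monitor row has been transcribed into a dictionary keyed by the column names listed above, and the function name `assess_receiver_row` and the finding text are hypothetical.

```python
from typing import Dict, List


def assess_receiver_row(row: Dict[str, int]) -> List[str]:
    """Summarize loss indicators for one receiver statistics row."""
    findings = []
    unrecovered = row.get("Lost-Unrecovered-Txm", 0) + row.get("Lost-unrecovered-tmo", 0)
    if unrecovered > 0:
        # Unrecovered loss: messages never reached the receiver.
        findings.append(f"{unrecovered} message(s) lost and not recovered")
    if row.get("Lost-Recovered", 0) > 0:
        # Recovered loss: delivery succeeded, but only after recovery.
        findings.append("loss occurred but was auto-recovered")
    return findings
```

An empty list means neither loss counter is raised for that row; unrecovered loss is listed first because it is the condition the table treats as a symptom.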

 

               


 

APP21 (MESSAGING Engine)

 

                APP21 Purpose:

APP21 processes receive order and execution drop copies from APP22 processes and send them to External Vendors or MESSAGING Services.

They then receive related responses from these services for appropriate error handling.

 

Use the following hyperlinks to jump to the desired section of APP21 documentation:

APP21_Recovery_Considerations

APP21_NTM_Control_Commands

APP21_Troubleshooting_Table

APP21_Monitoring_Considerations

 

 

APP21 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

 

-          When moving between nodes:

o   ALTERNATE NODES are not defined for DSCF services.

o   DR NODES must be allocated un-natted nodes. 

No node can support more than one natted address at the same time.

 

-          When stopping/restarting APP03 processes:

1)      Notify the associated vendor or MESSAGING service firm and work in cooperation with them, as appropriate to situations.

2)      Notify Tech Services if moving APP21 processes to DR nodes and NAT addresses need to change to accommodate move.

3)      Stop the DSCF process.

4)      If NOT moving APP21 to new node, skip to step 5.

a)       If moving the APP21 to a new node, copy the day’s APP03 PROCESSOR files to alternate node:

i)        Copy: \chx\data\{APP21}\*.log, *.body, *.header, *.seqnums, *.session files created for the day to the alternate node.

ii)       Copy: \chx\data\{APP21}\Global.* file created for the day to the alternate node.

iii)     If the target folder does not exist on the new node, you must first either create the folder, or copy the entire folder.

iv)     If these files are not moved before the process restarts on the new node, there will be a chance of sequence number miscommunications between the order sending firm involved and CHX.

5)      Start the DSCF process.

6)      Open the channel for the process affected and confirm order sending firm connects as expected.

APP21 NTM Control Commands:

Open OSF Channel:

-          Use NTM Control Utility – Service Control - APP21 – Open OSF Channel to make OSF connection possible.

 

Close OSF Channel:

-          Use NTM Control Utility – Service Control - APP21 – Close OSF Channel to make OSF connection impossible.

 

Set Inbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP21 – Set Inbound Sequence Number to set Inbound Sequence Number.

 

Set Outbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP21 – Set Outbound Sequence Number to set Outbound Sequence Number.

 

Enable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP21 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.

 

Disable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP21 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.

 

DSCF Troubleshooting Table:

APP21 Symptom

ImpaAPP13

Response

Firm disconneAPP13 or Logs out of session

 

Evidenced by:

-          EMT message saying {firm} is disconnected and/or {firm} is logged out.

 

-          Stats monitor shows disconnected in status column.

 

{Firm} will be further identified in EMT message by including “LocalFixId” and “RemoteFixID” as configured in APP01Services.xml file within the disconnect message.

 

Drop copy firm/vendor is no longer able to receive drop copy related messages.

 

1)      Contact firm.

2)      Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them.

3)      Stopping and restarting the affected application service may help resolve the issue.

 

APP21 Monitoring Considerations:

Stats Monitors:
DCS App Connect Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from PROCESSINGs (via APP40 and APP22 processes) to Order Sending Firm's Drop Copy Destinations.

Monitor shows connection status between APP21 and firm as well as processing stats between APP22 (msgin) and APP21 as well as APP21 and firm (msgout).

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- fixConnect,
- MsgIn,
- MsgOut

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

FixConnect is Disconnected or Open

Firm is not connected.
1) Use NTM Control Utility APP21 Service Controls to Open Channels.
2) Call Production Control if needed.

 

 

 

MsgIn values are not equal to or greater than MsgOut values.

Depending on APP21 configuration, outbound messaging may be filtered such that not all inbound messages received by APP22 will be forwarded to firm.

MESSAGINGs may not be as expected.
1) Check DCS FIX message files to confirm inbound messages and outbound messages.
2) Call firm if necessary.
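The three symptom rows above can be read as a small decision table. The following is a minimal sketch in Python, assuming a monitor row has been transcribed into a dictionary. The keys fixConnect, MsgIn and MsgOut mirror the column names above; the `is_red` key, the "Connected" status value used in the usage note, and the function name are hypothetical and are not part of any CHX tool.

```python
def triage_connect_row(row):
    """Return the playbook responses that apply to one monitor row.

    `row` is a hypothetical dict; this playbook does not define an export
    format for the DCS App Connect Stats monitor.
    """
    responses = []
    if row.get("is_red"):
        # Data is RED: process is down or multicast data is not reaching the monitor.
        responses.append(
            "Check status of process; if process is up, call Technical Services."
        )
    if row.get("fixConnect") in ("Disconnected", "Open"):
        # Firm is not connected.
        responses.append(
            "Use NTM Control Utility APP21 Service Controls to Open Channels; "
            "call Production Control if needed."
        )
    if row.get("MsgIn", 0) < row.get("MsgOut", 0):
        # MsgIn is expected to be equal to or greater than MsgOut,
        # because outbound messaging may be filtered.
        responses.append("Check DCS FIX message files; call firm if necessary.")
    return responses
```

A row such as `{"is_red": False, "fixConnect": "Connected", "MsgIn": 10, "MsgOut": 8}` yields an empty list, meaning none of the three symptoms applies.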

 

 


 

APP22 (MESSAGING Router)

               

APP22 Purpose:

MESSAGING Routers receive drop copy related orders and MESSAGES 2 from the PROCESSING (APP40) or from MESSAGING.

MESSAGING Routers send drop copy related orders and MESSAGES 2 to appropriate order sending firms/vendors. 

 

MESSAGING Routes are configured such that:

-          the DC1 instance sends drop copies for DC1 PROCESSINGs,

-          the DC2 instance sends drop copies for DC2 PROCESSINGs.

 

Use the following hyperlinks to jump to the desired section of APP22 documentation:

APP22_Recovery_Considerations

APP22_NTM_Control_Commands

APP22_Troubleshooting_Table

APP22_Monitoring_Considerations

 

 

APP22 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP22 nodes only when moving between nodes.  (No real dependencies other than expected processing sites.)

 

-          When stopping/restarting APP22 processes:

1)      Notify the Drop Copy firms and vendors that drop copies will be interrupted.

2)      Stop/Restart the APP22 process.

 

APP22 NTM Control Commands :

-          There are no APP22 specific NTM Control Commands.

 

 

 

APP22 Troubleshooting Table:

APP22 Symptom

Impact

Response

Node Crashes

 

Evidenced by:

-          In Solarwinds (and outlook), node and processes will be reported down.

 

Drop Copy firms/vendors will not be able to receive drop copies.

 

Refer to:

APPGROUP14 Server

Server Specific Recoveries.

 

1)      Refer to:

APPGROUP14 Server

Server Specific Recoveries.

2)      Notify Management.

 

                           

APP22 Monitoring Considerations:

See

Stats Monitors:
THIRD PARTY DCS LBM Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from PROCESSINGs (via APP40 and APP22 processes) to Order Sending Firm's Drop Copy Destinations.

Monitor shows 29 West statistics (by topic) for APP21 processes, receiving from APP22 processes.

PROD MENU:
29 West LBM Monitor Menu

To Exit:
Close Window

- Color of data in columns
- ContextName,
- Service,
- Topic_name,
- Type,
- Rate,
- Persistence,
- MsgCount

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

Rate and/or MsgCount values are not incrementing as expected.

Firms may not be receiving messages as expected.
1) Check DCS FIX message files to confirm inbound messages and outbound messages.
2) Call Production Support if necessary.

 

 

 

When matched by topic_name, Rcv MsgCount in this monitor does not match the total Src MsgCount values in the APP40_LBM Stats monitor.

Firms may not be receiving messages as expected.
1) Check DCS FIX message files to confirm inbound messages and outbound messages.
2) Call Production Support if necessary.
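The topic-by-topic comparison in the last symptom row can be sketched as follows. This is an illustration only: it assumes both monitors have been transcribed by hand into simple Python structures, and the function name and input shapes are hypothetical. Readings taken at different moments will differ under live traffic, so compare readings captured as close together as possible.

```python
from collections import defaultdict


def find_topic_mismatches(rcv_counts, src_rows):
    """Compare Rcv MsgCount per topic with the total Src MsgCount per topic.

    rcv_counts: dict of topic_name -> Rcv MsgCount (this monitor)
    src_rows:   iterable of (topic_name, Src MsgCount) pairs
                (APP40_LBM Stats monitor, one pair per source)
    Both inputs are hypothetical, hand-captured snapshots.
    """
    src_totals = defaultdict(int)
    for topic, count in src_rows:
        src_totals[topic] += count

    mismatches = {}
    for topic in sorted(set(rcv_counts) | set(src_totals)):
        rcv = rcv_counts.get(topic, 0)
        src = src_totals.get(topic, 0)
        if rcv != src:
            # Record (received, sent) so the operator can see the gap.
            mismatches[topic] = (rcv, src)
    return mismatches
```

An empty result means every topic's received count equals the sum of its source counts; any entry returned is a topic to check against the DCS FIX message files.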

Stats Monitors:
THIRD PARTY DCS LBM RCVRM Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from PROCESSINGs (via APP40 and APP22 processes) to Order Sending Firm's Drop Copy Destinations.

Monitor shows 29 West "receiving" statistics (by sender, and topic) for APP21 processes, receiving from APP22 processes.

SENDER=CLEAR=Messages sent in MESSAGING format.
SENDER=APP2201/02=Message sent in OSF format.

PROD MENU:
29 West LBM Monitor Menu

To Exit:
Close Window

- Color of data in columns
- ContextName,
- Service,
- Topic_name,
- Lost-Recovered,
- Lost-Unrecovered-Txm,
- Lost-unrecovered-tmo,
- Msgs_rcved

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

Msgs_rcved values are not incrementing as expected.

Firms may not be receiving messages as expected.
1) Check DCS FIX message files to confirm inbound messages and outbound messages.
2) Call Production Support if necessary.

 

 

 

Lost-unrecovered values are non-zero.

NOTE: Non-zero Lost-recovered values may also indicate problems, although in that case THIRD PARTY auto-recovered the lost messages.

Firms may not be receiving messages as expected.
1) Check DCS FIX message files to confirm inbound messages and outbound messages.
2) Call Production Support if necessary.
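The Msgs_rcved and Lost-unrecovered symptoms above depend on comparing two successive readings of the same row. A minimal sketch, assuming the readings are transcribed into dictionaries keyed by the column names in the table; the function name and the dictionary format are hypothetical:

```python
def check_receive_stats(previous, current):
    """Flag the RCVRM symptoms for one (sender, topic) row.

    `previous` and `current` are hypothetical dicts holding two successive
    readings of the same row, keyed by the column names in the table.
    """
    findings = []
    if current["Msgs_rcved"] <= previous["Msgs_rcved"]:
        findings.append("Msgs_rcved not incrementing")
    for column in ("Lost-Unrecovered-Txm", "Lost-unrecovered-tmo"):
        if current[column] != 0:
            findings.append(column + " is non-zero")
    if current["Lost-Recovered"] != 0:
        # Recovered loss: worth noting, but the messages were auto-recovered.
        findings.append("Lost-Recovered is non-zero (auto-recovered)")
    return findings
```

A flat Msgs_rcved value can be legitimate during quiet periods, so treat that finding as a prompt to check the DCS FIX message files rather than as proof of a fault.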


 

APP23 (APP23 PROCESSOR)

 

                APP23 Purpose:

APP23 PROCESSOR receives MESSAGING messages from APP09 and sends them to APP23 for actual MESSAGING.

They then receive related responses from these services for error handling.

 

Use the following hyperlinks to jump to the desired section of APP23 documentation:

APP23_Recovery_Considerations

APP23_NTM_Control_Commands

APP23_Troubleshooting_Table

APP23_Monitoring_Considerations

 

 

APP23 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP36 nodes only when moving between nodes.  (APP36 NAT addresses must be configured/used by APP23.)

 

-          When stopping/restarting APP23 processes:

1)      Notify APP23 and work in cooperation with them, as appropriate to situations.

2)      Stop the APP23 process.

3)      If NOT moving APP23 to new node, skip to step 4.

a)       If moving the APP23 to a new node, copy the day’s APP23 PROCESSOR files to alternate node:

a)       Copy: \chx\data\APP23\*.log, *.body, *.header, *.seqnums, *.session file created for the day to the alternate node.

b)      Copy: \chx\data\APP23\Global.* file created for the day to the alternate node.

c)       If the target folder does not exist on the new node, you must first either create the folder, or copy the entire folder.

d)      If these files are not moved before the process restarts on the new node, there will be a chance of sequence number miscommunications between the order sending firm involved and CHX.

4)      Start the APP23 process.

5)      Open the channel for the process affected and confirm order sending firm connects as expected.
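The file copy in sub-steps a) through c) can be sketched in Python as shown below. This is an illustration under stated assumptions, not an approved CHX tool: the source and target directories are placeholders supplied by the operator, the function name is hypothetical, and "created for the day" is approximated by the file's modification date.

```python
import fnmatch
import shutil
from datetime import date, datetime
from pathlib import Path

# File patterns named in sub-steps a) and b) above.
SESSION_PATTERNS = ("*.log", "*.body", "*.header", "*.seqnums", "*.session", "Global.*")


def copy_todays_session_files(source_dir, target_dir, day=None):
    """Copy the day's session files from source_dir to target_dir.

    Both directory arguments are placeholders; "created for the day" is
    approximated here by the file's last-modified date (an assumption).
    Returns the list of file names that were copied.
    """
    day = day or date.today()
    source = Path(source_dir)
    target = Path(target_dir)
    # Sub-step c): create the target folder if it does not exist.
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(source.iterdir()):
        if not path.is_file():
            continue
        if not any(fnmatch.fnmatch(path.name, pat) for pat in SESSION_PATTERNS):
            continue
        if datetime.fromtimestamp(path.stat().st_mtime).date() != day:
            continue
        shutil.copy2(path, target / path.name)  # copy2 also preserves timestamps
        copied.append(path.name)
    return copied
```

Run a copy like this only after the process has been stopped (step 2) and before it is started on the new node, so that the sequence number files are not changing while they are copied.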

 


 

APP23 NTM Control Commands:

Open OSF Channel:

-          Use NTM Control Utility – Service Control - APP23 – Open OSF Channel to make OSF connection possible.

 

Close OSF Channel:

-          Use NTM Control Utility – Service Control - APP23 – Close OSF Channel to make OSF connection impossible.

 

Set Inbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP23 – Set Inbound Sequence Number to set Inbound Sequence Number.

 

Set Outbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP23 – Set Outbound Sequence Number to set Outbound Sequence Number.

 

Enable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP23 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.

 

Disable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP23 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.

 

APP23 Troubleshooting Table:

APP23 Symptom

Impact

Response

Firm disconnects or logs out of session

 

Evidenced by:

-          EMT message saying {firm} is disconnected and/or {firm} is logged out.

 

-          Stats monitor shows disconnected in status column.

 

{Firm} will be further identified in EMT message by including “LocalFixId” and “RemoteFixID” as configured in APP23Services.xml file within the disconnect message.

APP23 is no longer able to receive MESSAGING messages.

1)      Contact firm.

2)      Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them.

3)      Stopping and restarting the affected application service may help resolve the issue.

 

APP23 Monitoring Considerations:

Stats Monitors:
APP23 App Connect Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate MESSAGING trade delivery from processes to APP23.

Monitor shows connection status between PROCESSORs and APP23 Destination as well as processing statistics.

PROD MENU:
MESSAGING Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Status,
- OutMsgs,
- InMsgs,
- OutTAPP39apRpts

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Status is Disconnected or Open

Firm is not connected.
1) Use NTM Control Utility APP23 Service Controls to Open Channels.
2) Call Production Control if needed.

OutTAPP39apRpts value does not match OutCnt value in RTC01 to APP23 App Queues monitor.

We may not be processing as expected.
1) Check RTC log files and APP23 FIX message files to confirm inbound messages match outbound messages.
2) Work with Production Support if necessary.

 

 


 

APP24 (APP24 PROCESSOR)

 

                APP24 Purpose:

APP24 PROCESSOR receives order messages from MESSAGING and sends them to APP24 for actual trading.

They then receive related responses from these services for processing and/or error handling.

 

Use the following hyperlinks to jump to the desired section of APP24 documentation:

APP24_Recovery_Considerations

APP24_NTM_Control_Commands

APP24_Troubleshooting_Table

APP24_Monitoring_Considerations

 

 

APP24 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP36 nodes only when moving between nodes.  (APP36 NAT addresses must be configured/used by APP24.)

 

-          When stopping/restarting APP24 processes:

1)      Notify APP24 and work in cooperation with them, as appropriate to situations.

2)      Stop the APP24 process.

3)      If NOT moving APP24 to new node, skip to step 4.

a)       If moving the APP24 to a new node, copy the day’s APP24 PROCESSOR files to alternate node:

a)       Copy: \chx\data\{APP24}\*.log, *.body, *.header, *.seqnums, *.session file created for the day to the alternate node.

b)      Copy: \chx\data\{APP24}\Global.* file created for the day to the alternate node.

c)       If the target folder does not exist on the new node, you must first either create the folder, or copy the entire folder.

d)      If these files are not moved before the process restarts on the new node, there will be a chance of sequence number miscommunications between the order sending firm involved and CHX.

4)      Start the APP24 process.

5)      Open the channel for the process affected and confirm order sending firm connects as expected.

 


 

APP24 NTM Control Commands:

Open OSF Channel:

-          Use NTM Control Utility – Service Control - TRF – Open OSF Channel to make OSF connection possible.

 

Close OSF Channel:

-          Use NTM Control Utility – Service Control - TRF – Close OSF Channel to make OSF connection impossible.

 

Set Inbound Sequence Number:

-          Use NTM Control Utility – Service Control - TRF – Set Inbound Sequence Number to set Inbound Sequence Number.

 

Set Outbound Sequence Number:

-          Use NTM Control Utility – Service Control - TRF – Set Outbound Sequence Number to set Outbound Sequence Number.

 

Enable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - TRF – Enable THIRD PARTY Stats to start collection and display of LBM related stats.

 

Disable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - TRF – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.

 

APP24 Troubleshooting Table:

APP24 Symptom

Impact

Response

Firm disconnects or logs out of session

 

Evidenced by:

-          EMT message saying {firm} is disconnected and/or {firm} is logged out.

 

-          Stats monitor shows disconnected in status column.

 

{Firm} will be further identified in EMT message by including “LocalFixId” and “RemoteFixID” as configured in APP24Services.xml file within the disconnect message.

APP24 is no longer able to receive order messages or return responses to MESSAGING.

1)      Contact firm.

2)      Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them.

3)      Stopping and restarting the affected application service may help resolve the issue.

 

APP24 Monitoring Considerations:

Stats Monitors:
APP24 App Connect Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications to FINRA/NASDAQ Trade Reporting Facility Destinations.

Monitor shows connection status between APP24 and FINRA/NASDAQ TRF as well as processing statistics.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Status,
- nInNum,
- nOutNum

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

Status is Disconnected or Open

Firm is not connected.
1) Use NTM Control Utility APP24 Service Controls to Open Channels.
2) Call Production Control if needed.

 

 

 

InCount does not match OutCount.

We may not be processing as expected.
1) Check APP24 FIX message files to confirm inbound messages match outbound messages.
2) Work with Brokers and FINRA/NASDAQ if necessary.


 

APP25 (MESSAGE Reader)

 

                APP25 Purpose:

The APP25 process reads order messages received from APP01 and APP26 processes and loads the data into the databases via loaders.

 

 

APP25 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP25 nodes only when moving between nodes.  (No real dependencies outside of expected processor.)

 

APP25 NTM Control Commands:

See Binary_DBL_Monitoring_Considerations

There are no APP25 specific NTM Control Commands outside of Binary Database Loader commands.

 

APP25 Troubleshooting Table:

APP25 Symptom

Impact

Response

Node Crashes

 

Evidenced by:

-          In EMT, applications report lost communications to APP25 service.

-          In Solarwinds (and outlook), node and processes will be reported down.

 

-Post trade processing will not include APP01 and APP26 order data.

 

Refer to:

APPGROUP02 Server

Server Specific Recoveries.

 

1)      Refer to:

APPGROUP02 Server

Server Specific Recoveries.

2)      Restart affected APP25 processes on alternate nodes.

3)      Notify Management and PTT.

 

 

APP25 Monitoring Considerations:

See APP01_Monitoring_Considerations, APP26_Monitoring_Considerations,  Binary_DBL_Monitoring_Considerations

There are no APP25 specific monitors outside of EMT and the related APP01, APP26 and Binary Database Loader monitors.

APP26 (MESSAGING APP20 PROCESSOR)

 

                APP26 Purpose:

APP26 PROCESSOR receives order messages from order sending firms and sends them to MESSAGING via APP37 for actual trading.

APP26 PROCESSORs can also send to APP01, TRF or PROCESSINGs if firms route them there using fix tags, but typically they are used for MESSAGING.

They then receive related responses from these services for processing and/or error handling.

 

Use the following hyperlinks to jump to the desired section of APP26 documentation:

APP26_Recovery_Considerations

APP26_NTM_Control_Commands

APP26_Troubleshooting_Table

APP26_Monitoring_Considerations

 

 

APP26 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

 

-          When moving between nodes:

1)      ALTERNATE NODES are not defined for APP26 services.

2)      DR NODES must be allocated un-natted nodes. 

No node can support more than one natted address at the same time.

 

-          When stopping/restarting APP26 processes:

1)      Notify order sending firms involved and work in cooperation with them, as appropriate to situations.

2)      Stop the APP26 process.

3)      If NOT moving APP26 to new node, skip to step 4.

a)       If moving the APP26 to a new node, copy the day’s APP26 PROCESSOR files to alternate node:

a)       Copy: \chx\data\{APP26}\*.log, *.body, *.header, *.seqnums, *.session file created for the day to the alternate node.

b)      Copy: \chx\data\{APP26}\Global.* file created for the day to the alternate node.

c)       If the target folder does not exist on the new node, you must first either create the folder, or copy the entire folder.

d)      If these files are not moved before the process restarts on the new node, there will be a chance of sequence number miscommunications between the order sending firm involved and CHX.

4)      Start the APP26 process.

5)      Open the channel for the process affected and confirm order sending firm connects as expected.

 

APP26 NTM Control Commands:

Open OSF Channel:

-          Use NTM Control Utility – Service Control - APP26 – Open OSF Channel to make OSF connection possible.

 

Close OSF Channel:

-          Use NTM Control Utility – Service Control - APP26 – Close OSF Channel to make OSF connection impossible.

 

Set Inbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP26 – Set Inbound Sequence Number to set Inbound Sequence Number.

 

Set Outbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP26 – Set Outbound Sequence Number to set Outbound Sequence Number.

 

Enable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP26 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.

 

Disable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP26 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.

 

APP26 Troubleshooting Table:

APP26 Symptom

Impact

Response

Firm disconnects or logs out of session

 

Evidenced by:

-          EMT message saying {firm} is disconnected and/or {firm} is logged out.

 

-          Stats monitor shows disconnected in status column.

 

{Firm} will be further identified in EMT message by including “LocalFixId” and “RemoteFixID” as configured in APP26Services.xml file within the disconnect message.

APP26 is no longer able to receive order messages or return responses to order sending firms.

1)      Contact firm.

2)      Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them.

3)      Stopping and restarting the affected application service may help resolve the issue.

 

APP26 Monitoring Considerations:

Stats Monitors:
APP26 App Connect Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from Order Sending Firms to PROCESSOR for special routing (via APP37 to MESSAGING, SITE 2s or PROCESSINGs).

Monitor shows connection status between PROCESSOR and firm, or of PROCESSOR and application.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Status,
- Write Queue

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

Status is Disconnected or Open

Firm is not connected.
1) Use NTM Control Utility APP26 Service Controls to Open Channels.
2) Call Production Control if needed.

 

 

 

Write Queue is non-zero and not decreasing as expected.

Firm may not be processing as expected.
1) Call firm and work with Technical Services if necessary.

 


 

Stats Monitors:
APP26 App Processing Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from Order Sending Firms to PROCESSOR for special routing (via APP37 to MESSAGING, SITE 2s or PROCESSINGs).

Monitor shows processing statistics between PROCESSOR and Order Sending Firm.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- InCount,
- OutCount,
- LastInTime,
- LastOutTime

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

No data is displayed.

No data has been generated between Order Sending Firm and PROCESSOR.
 
1) Check APP26 Fix message files. If all sizes are zero, no traffic has been generated.
2) Test sending order to test PROCESSOR.

Aggregate InCount (by service name) is not less than or equal to aggregate OutCount (by service name).

NOTE: LastOutTime (by service name) should also be later than LastInTime (by service name).

Firm may not be processing as expected.
1) Check APP26 FIX message files to confirm inbound messages match outbound messages.
2) Call firm and work with Technical Services if necessary.
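The aggregate comparison in the last symptom row can be sketched as follows. This is an illustration only: it assumes monitor lines have been transcribed into dictionaries, with a hypothetical `Service` key for the service name and times written as zero-padded "HH:MM:SS" strings. The function name is also hypothetical.

```python
from collections import defaultdict


def services_out_of_balance(rows):
    """Return service names that show the symptom described above.

    A service is flagged when its aggregate InCount exceeds its aggregate
    OutCount, or when its latest LastOutTime is earlier than its latest
    LastInTime. `rows` is a hypothetical list of dicts, one per monitor
    line, with keys Service, InCount, OutCount, LastInTime, LastOutTime.
    """
    totals = defaultdict(lambda: {"in": 0, "out": 0, "last_in": "", "last_out": ""})
    for row in rows:
        t = totals[row["Service"]]
        t["in"] += row["InCount"]
        t["out"] += row["OutCount"]
        # Zero-padded HH:MM:SS strings sort in chronological order.
        t["last_in"] = max(t["last_in"], row["LastInTime"])
        t["last_out"] = max(t["last_out"], row["LastOutTime"])

    flagged = []
    for service, t in sorted(totals.items()):
        if t["in"] > t["out"] or t["last_out"] < t["last_in"]:
            flagged.append(service)
    return flagged
```

Equal LastInTime and LastOutTime values are tolerated here because the two events can fall within the same second; tighten that comparison if the monitor shows finer time resolution.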

 


 

Stats Monitors:
APP26 IPC Instance Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from Order Sending Firms to PROCESSOR for special routing (via APP37 to MESSAGING, SITE 2s or PROCESSINGs).

Monitor shows IPC channel processing statistics.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Service,
- Hostname,
- Msgs In,
- Msgs Out

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Not all APP01 processes are displayed as expected.

APP01 Service has not been started.
1) Check status of service.

Msgs In and/or Msgs Out are zero.

No messages have been sent/received since that monitor has been started.
1) Check APP26 log files.
2) Call firm and work with Technical Services if necessary.

 


 

Stats Monitors:
APP26 to APP37 IPC Connect Queues

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from Order Sending Firms to PROCESSOR for special routing (via APP37 to MESSAGING, SITE 2s or PROCESSINGs).

Monitor shows IPC channel connectivity status to APP37 processes.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Source,
- Dest,
- Status,
- Queue Size,
- Msgs Out/Sec

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Not all APP26 or APP37 processes are displayed as expected.

APP26 or APP37 Service has not been started or hasn't processed any messages since monitor has been started.
1) Check status of service.
2) Check APP26 or APP37 log files.

Status is not CONNECTED.

Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if issues.

Queue size is non-zero and not decreasing as expected.

Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; Do not stop/restart source process.
2) Notify Production Support if issues.
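The last two symptom rows above differ in which process to stop/restart. A minimal sketch of that choice, assuming each monitor line is transcribed into a dictionary keyed by the column names. The `queue_stuck` flag (the operator's judgment that Queue Size is non-zero and not decreasing) and the function name are hypothetical. Giving the queue symptom priority when both symptoms appear is an assumption made here to protect queued messages; the table does not state a precedence.

```python
def choose_restart_target(row, all_rows):
    """Suggest which process to stop/restart for one IPC queue row.

    `row` and the entries of `all_rows` are hypothetical dicts with keys
    Source, Dest, Status and queue_stuck. Returns "source", "destination",
    or None when neither symptom applies.
    """
    if row["queue_stuck"]:
        # Restarting the source could accidentally delete queued messages.
        return "destination"
    if row["Status"] != "CONNECTED":
        # Are other sources connecting to the same destination also affected?
        others_affected = any(
            other["Dest"] == row["Dest"]
            and other["Source"] != row["Source"]
            and other["Status"] != "CONNECTED"
            for other in all_rows
        )
        return "destination" if others_affected else "source"
    return None
```

In either case, notify Production Support if the issue remains after the restart, as the Response column directs.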

 


 

Stats Monitors:
APP26 to APP25 IPC Connect Queues

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from Order Sending Firms to PROCESSOR for special routing (via APP37 to MESSAGING, SITE 2s or PROCESSINGs).

Monitor shows IPC channel connectivity status to APP25 processes.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Source,
- Dest,
- Status,
- Queue Size,
- Msgs Out/Sec

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Not all APP26 or APP25 processes are displayed as expected.

APP26 or APP25 Service has not been started or hasn't processed any messages since monitor has been started.
1) Check status of service.
2) Check APP26 or APP25 log files.

Status is not CONNECTED.

Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if issues.

Queue size is non-zero and not decreasing as expected.

Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; Do not stop/restart source process.
2) Notify Production Support if issues.


 

APP27 (ACTIVITY Reader)

 

                APP27 Purpose:

The APP27 process reads ACTIVITY messages received from APP31 processes and loads the data into the databases via loaders.

 

 

APP27 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP27 nodes only when moving between nodes.  (No real dependencies outside of expected processor.)

 

APP27 NTM Control Commands:

See Binary_DBL_Monitoring_Considerations

There are no APP27 specific NTM Control Commands outside of Binary Database Loader commands.

 

APP27 Troubleshooting Table:

APP27 Symptom

Impact

Response

Node Crashes

 

Evidenced by:

-          In EMT, applications report lost communications to APP27 service.

-          In Solarwinds (and outlook), node and processes will be reported down.

-PROCESSING will not have all activities necessary to define proper sequence numbering if the PROCESSING needs to be restarted.

 

Refer to:

APPGROUP02 Server

Server Specific Recoveries.

1)      Refer to:

APPGROUP02 Server

Server Specific Recoveries.

2)      Restart affected APP27 processes on alternate nodes.

3)      Notify Management and PTT.

 

 

APP27 Monitoring Considerations:

See Binary_DBL_Monitoring_Considerations

There are no APP27 specific monitors outside of EMT and the related Binary Database Loader monitors.

APP28 (RISK System)

               

APP28 Purpose:

 

The RISK process group is comprised of two application service types:  APP28 and APP08.

Together, these service types process risk management instructions as defined by order sending firms and/or MESSAGING firms.

 

APP08 (COMMUNICATION) manages communication between the RISK GUI application (on a Linux server) and the RISK process.

APP28 (RISK) manages risk management communication between the APP08, the PROCESSINGs, MESSAGING and the database.

 

There are no APP28 users as of the last update to this documentation.

There are no APP28 or APP08 monitoring tools outside of EMT messaging.

There are no APP28 or APP08 NTM Commands.

 

 

APP28 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP28 nodes only when moving between nodes.  (RISK GUI communications require host specific WebService.War files.)

 

-          Use the following hyperlink to access recovery procedures for RISK applications:

https://chx.sharepoint.com/:w:/r/sites/BusinessUnits/Governance/_layouts/15/Doc.aspx?sourcedoc={8A26ADE3-7824-460D-AFF4-15BEF9030958}&file=RISK COMMUNICATION Server Move Procedure.docx&action=default&mobileredirect=true&DefaultItemOpen=1

 

 

APP28 Troubleshooting Table:

APP28 Symptom

Impact

Response

Node Crashes

 

Evidenced by:

-          In Solarwinds (and outlook), node and processes will be reported down.

 

Since there are no actual RISK users, there should be no operational impact outside of the loss of a server.  In default configurations, the RISK applications are the only applications running on this server.

 

4)      Notify Management.

 

 

APP29

               

APP29 Purpose:

APP29 allows users to query, enter, modify and inactivate trading system support records.  All writes are directly to the database. 

 

MNT Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use MNT nodes only when moving between nodes.  (Java code, FireDaemon and Host Specific references in JNLPs required.)

 

-          When stopping/restarting MNT processes:

1)      Notify Operations and work in cooperation with them, as appropriate to situations.

2)      Stop/Restart the MNT process.

 

MNT NTM Control Commands:

There are no APP29 specific NTM Control commands.

 

MNT Troubleshooting Table:

MNT Symptom

Impact

Response

Node Crashes

 

Evidenced by:

-          In Solarwinds (and outlook), node and processes will be reported down.

 

Operations will not be able to query or administratively manage trading system support records via APP29.

 

Refer to:

APPGROUP08 or APPGROUP09 Server

Server Specific Recoveries.

 

1)      Refer to:

APPGROUP08 or APPGROUP09 Server

Server Specific Recoveries.

2)      Notify Management.

 

 

MNT Monitoring Considerations:

 

There are no APP29 specific monitors outside of EMT.

APP30 (APP02/SUBGROUP02/SUBGROUP03)

 

                APP30 Purpose:

The APP30 process group is comprised of three separate application service types: APP02, SUBGROUP02 and SUBGROUP03.

Together, these service types read inbound MESSAGING from SITE 2 and NASDAQ via APP31 processes and load the data into the databases via loaders.

 

APP02 Processes read SITE 2 and NASDAQ BBO Duration data from APP31 (APP11/SUBGROUP02) processes and load it into databases via APP19.

SUBGROUP02 Processes read SITE 2 and NASDAQ Lastsale data from APP31 (APP13/SUBGROUP04) processes and load it into databases via APP19.

SUBGROUP03 Processes read SITE 2 and NASDAQ Quote Montage data from APP31 (APP11/SUBGROUP02) processes and load it into databases via APP19.

 

 

Use the following hyperlinks to jump to the desired section of APP30 documentation:

APP30_Recovery_Considerations

APP30_NTM_Control_Commands

APP30_Troubleshooting_Table

APP30_Monitoring_Considerations

 

 

APP30 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP30 nodes only when moving between nodes.  (APP30 processes utilize a lot more disk space than other applications.)

-          APP30 processes do not have DR nodes defined; these processes do not move between data centers.

-          Restart these processes as quickly as possible on alternate nodes.  The longer the processes are down, the worse the impacts on Post Trade Processing.

 

 

APP30 NTM Control Commands:

See Binary_DBL_Monitoring_Considerations

There are no APP02, SUBGROUP02 or SUBGROUP03 specific NTM Control Commands outside of Binary Database Loader commands.

 

APP30 Troubleshooting Table:

APP30 Symptom

Impacts

Response

Node Crashes

 

Evidenced by:

-          In EMT, applications report lost communications to APP30s

-          In SolarWinds (and Outlook), node and processes will be reported down.

 

For APP02, SUBGROUP02 and/or SUBGROUP03:

-Post trade processing will not include all quote related MESSAGING.

 

Refer to:

APP30 (APP02/SUBGROUP02/SUBGROUP03)

Server Specific Recoveries.

 

1)      Refer to:

APP30 (APP02/SUBGROUP02/SUBGROUP03)

Server Specific Recoveries.

2)      Restart affected APP30 processes on alternate nodes.

3)      Notify Management and PTT.

 

                           

APP30 Monitoring Considerations:

See Binary_DBL_Monitoring_Considerations

There are no APP02, SUBGROUP02 or SUBGROUP03 specific monitors outside of EMT and the Binary Database Loader monitors for each.

 


 

APP31 (APP11/SUBGROUP02/APP13/SUBGROUP04)

 

                APP31 Purpose:

The APP31 process group is comprised of four separate application service types: APP11, APP13, SUBGROUP02, SUBGROUP04.

Together, these service types read inbound MESSAGING distributed by SITE 2 and NASDAQ and forward it to PROCESSINGs and APP19.

 

APP11 Processes read SITE 2 quote data over multicast and send it to PROCESSINGs and quote montage readers.

SUBGROUP02 Processes read NASDAQ quote data over multicast and send it to PROCESSINGs and quote montage readers.

APP11 and SUBGROUP02 processes also calculate Best Bid Offer Duration values to be sent to MESSAGINGs.

 

APP13 Processes read SITE 2 lastsale data over multicast and send it to PROCESSINGs and lastsale montage readers.

SUBGROUP04 Processes read NASDAQ lastsale data over multicast and send it to PROCESSINGs and lastsale montage readers.

 

Use the following hyperlinks to jump to the desired section of APP31 documentation:

APP31_Recovery_Considerations

APP31_NTM_Control_Commands

APP31_Troubleshooting_Table

APP31_Monitoring_Considerations

 

 

APP31 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP31 nodes only when moving between nodes.  (APP31 interfaces must be configured/enabled.)

-          APP31 processes do not move between data centers.  (Only “alternate” nodes are defined in service tables.)

 

-          When stopping/restarting APP31 instances:

1)      Stop/Start “A” series and “B” series separately so as to avoid suspending trading in PROCESSINGs.
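
The A/B rule above can be sketched as a small planning helper. This is an illustration only, not part of the NTM Control Utility: the function name and the assumption that an instance name ends in its series letter (for example "FEED_1A") are hypothetical.

```python
from typing import Dict, List


def plan_series_restart(instances: List[str]) -> List[List[str]]:
    """Group instance names into restart batches, one batch per series.

    Assumes (hypothetically) that each instance name ends in its series
    letter, "A" or "B". Each batch should be restarted and confirmed
    healthy before the next batch is touched, so that one series keeps
    feeding PROCESSINGs at all times.
    """
    batches: Dict[str, List[str]] = {"A": [], "B": []}
    for name in instances:
        series = name[-1:].upper()
        if series not in batches:
            raise ValueError(f"cannot determine series for {name!r}")
        batches[series].append(name)
    # A and B are never merged into a single batch.
    return [batch for batch in (batches["A"], batches["B"]) if batch]
```

Work through the first batch completely, confirm data is flowing in the stats monitors, and only then start on the second batch.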

 

 

 


 

APP31 NTM Control Commands:

Enabling/Disabling Channel Readers:

-          Use NTM Control Utility – Service Control - APP31 - Control Multicast Readers.

-          User may have to use REFRESH button multiple times to see all APP31 processes/channels.

-          Select (and highlight) desired channels and right click to see and select desired options, including:

1)      Start Primary Reader

2)      Start Alternate Reader

3)      Start Both Readers

4)      Stop Primary Reader

5)      Stop Alternate Reader

6)      Stop Both Readers

 

Flush Message Queue:

-          Use NTM Control Utility – Service Control - APP31 – Flush Message Queue.

-          User may have to use REFRESH button multiple times to see all APP31 processes/channels.

-          Select (and highlight) desired channels and right click to see and select Flush Internal Msg Queue.

 

Reset Multicast Sequence Number:

-          Use NTM Control Utility – Service Control - APP31 – Reset Multicast Sequence Number.

-          User may have to use REFRESH button multiple times to see all APP31 processes/channels.

-         Select (and highlight) desired channels and right click to see and select Reset Multicast Sequence Number.

 


 

APP31 Troubleshooting Table:

APP31 Symptom

Impacts

Response

NASDAQ moves to DR site (CRITICAL)

                       

Evidenced by:

-          Stats monitor shows zero NASDAQ primary data received in both data centers and zero NASDAQ alternate data received in DC2 – but NASDAQ alternate data processed in DC1.

 

For SUBGROUP02:

-PROCESSING trading without quote related MESSAGING for NASDAQ issues in DC2 only.

-Post trade processing will not include all quote related MESSAGING.

 

For SUBGROUP04:

-PROCESSING trading without lastsale related MESSAGING for NASDAQ issues in DC2 only.

-Post trade processing will not include all lastsale related MESSAGING.

 

See Generalized Recovery Scenarios for more:

Trading must be halted in affected issues if problem goes on too long.

 

1)      Refer to:

CHX_Cannot_Process_MESSAGES 1_From_SITE 2s

And

CHX_Cannot_Process_MESSAGES 2_From_SITE 2s

Generalized Recovery Scenarios.

 

For SUBGROUP02:

1)      Production Support must replace CHXAPPCFG APP31 SUBGROUP02_Services file with NASDAQ DR version.

2)      SUBGROUP02 Services in DC2 must be stopped/restarted.

-          See  APP31_Recovery_Considerations

 

For SUBGROUP04:

1)      Production Support must replace CHXAPPCFG APP31 SUBGROUP04_Services file with NASDAQ DR version.

2)      SUBGROUP04 Services in DC2 must be stopped/restarted.

-          See  APP31_Recovery_Considerations

 

For all:

1)      Notify management.

2)      If trading was halted, resume trading when stable and notify industry.

 

No Multicast Data Received (CRITICAL) :

 

-          BOTH Primary and Alternate Channels.

-          BOTH “A” and “B” series of Processes.

 

Evidenced by:

-          In APP31 Stats, zero values seen in rates columns.

For APP11/SUBGROUP02:

-PROCESSING WILL be trading without quote related MESSAGING.

-Post trade processing WILL not include all quote related MESSAGING.

 

For APP13/SUBGROUP04:

-PROCESSING WILL be trading without lastsale related MESSAGING.

-Post trade processing WILL not include all lastsale related MESSAGING.

 

See Generalized Recovery Scenarios for more:

Trading must be halted in affected issues if problem goes on too long.

 

1)      Refer to:

CHX_Cannot_Process_MESSAGES 1_From_SITE 2s

And

CHX_Cannot_Process_MESSAGES 2_From_SITE 2s

Generalized Recovery Scenarios.

 

1)      Confirm scope of impacts.

In Stats Monitor:

-          Are problems specific to:

-          SITE 2 and/or NASDAQ

-          DC1 and/or DC2

-          Servers? Processes?  Channels?

2)      Notify management.

3)      Determine corrective actions.

4)      If trading was halted, resume trading when stable and notify industry.

 

No Multicast Data Received (NON-CRITICAL):

 

-           EITHER Primary or Alternate Channel.

-          ONLY “A” or “B” series of Processes.

 

Evidenced by:

-          In APP31 Stats, zero values seen in rates columns.

 

For APP11/SUBGROUP02:

-PROCESSING MAY be trading without quote related MESSAGING.

-Post trade processing MAY not include all quote related MESSAGING.

 

For APP13/SUBGROUP04:

-PROCESSING MAY be trading without lastsale related MESSAGING.

-Post trade processing MAY not include all lastsale related MESSAGING.

 

1)      Confirm scope of impacts.

In Stats Monitor:

-          Are problems specific to:

-          SITE 2 and/or NASDAQ

-          DC1 and/or DC2

-          Servers? Processes?  Channels?

2)      Notify management.

3)      Determine corrective actions.

 

Sequence gaps reported (NON-CRITICAL)

 

Evidenced by:

-          In EMT, sequence gaps reported

 

Same as No Multicast Data Received (NON-CRITICAL) symptom.

1)      Same as No Multicast Data Received (NON-CRITICAL) symptom.

2)      Notify management only in critical situations.

Dupes reported (NON-CRITICAL)

 

Evidenced by:

-          In EMT, dupes reported

 

Same as No Multicast Data Received (NON-CRITICAL) symptom.

 

1)      Same as No Multicast Data Received (NON-CRITICAL) symptom.

2)      Notify management only in critical situations.

Market Wide Circuit Breaker (NON-CRITICAL)

 

Evidenced by:

-          In EMT, Process reports MWCB messages

 

Listing Exchanges will halt trading in their issues for 15 minutes and resume when appropriate.

 

NOTE: currently has no exclusively listed issues. 

1)      Halt trading in all exclusive issues.

2)      Confirm MEs halt trading accordingly and resume trading accordingly in all stocks.

-          EMT/ER may be useful.

-          NTM Control Utility APP32 Service Controls “Get Issues Open” and “Get Issues Not Open” may also help.

3)      Notify management.
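
The CRITICAL and NON-CRITICAL "No Multicast Data Received" symptoms in the table above differ only in how much of the feed has gone quiet. A minimal sketch of that distinction follows. The function name, the input shape and the returned labels are hypothetical. It also assumes that any partial loss (some, but not all, series/channel combinations at zero) counts as NON-CRITICAL, which the table does not state explicitly.

```python
from typing import Dict, Tuple

# Key: (series, channel), e.g. ("A", "primary"); value: observed message rate.
Rates = Dict[Tuple[str, str], float]


def classify_multicast_loss(rates: Rates) -> str:
    """Return "OK", "NON-CRITICAL" or "CRITICAL" for one feed.

    CRITICAL: zero rate on BOTH channels of BOTH series.
    NON-CRITICAL: zero rate on some, but not all, combinations.
    """
    expected = {(s, c) for s in ("A", "B") for c in ("primary", "alternate")}
    missing = expected - set(rates)
    if missing:
        raise ValueError(f"no sample for {sorted(missing)}")
    dead = [key for key in expected if rates[key] == 0]
    if not dead:
        return "OK"
    if len(dead) == len(expected):
        return "CRITICAL"
    return "NON-CRITICAL"
```

Run the check once per feed (SITE 2, NASDAQ) and per data center, since the scope questions in the Response column ask for exactly that breakdown.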

 

                           


 

APP31 Monitoring Considerations:

Stats Monitors:
APP31 from SITE 2 Queues (DC1 and DC2)

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from NASDAQ and SITE 2 SITE 2s to PROCESSINGs and MESSAGING APP19 via MESSAGING Processors.

Monitor shows connection status between APP31 process and SITE 2 primary/alternate channels as well as processing statistics of each process.

PROD MENU:
MESSAGING Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Primary Connection,
- Alt Connection,
- Primary Rate,
- Alt Rate,
- Prim Total Msgs,
- Alt total msgs

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Primary Channel and/or Alt Channel are not ENABLED.

Processes are not reading multicast SITE 2 data.
1) Use NTM Control Utility APP31 Service Controls to control multicast readers.
2) Call Production Control if needed.

Primary Rate and/or Alt Rate show sustained rate of zero during trading hours.

Multicast feed is not being processed as expected.
1) Work with SITE 2 and Technical Services if necessary.

Primary Total Msgs and Alt Total Msgs are not showing (relatively) the same numbers of messages processed per process.

Multicast feed is not being processed as expected.
1) Work with SITE 2 and Technical Services if necessary.
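
The "(relatively) the same" comparison of Primary Total Msgs and Alt Total Msgs can be made concrete with a tolerance check. This is a sketch only: the function name and the 1% default tolerance are assumptions, and the playbook does not define a threshold.

```python
def totals_roughly_match(primary_total: int, alt_total: int,
                         tolerance: float = 0.01) -> bool:
    """True when the totals differ by no more than `tolerance`,
    expressed as a fraction of the larger total.

    The 1% default is an assumed value, not a documented limit.
    """
    larger = max(primary_total, alt_total)
    if larger == 0:
        return True  # nothing received on either channel yet
    return abs(primary_total - alt_total) / larger <= tolerance
```

A result of False for a process is the cue to work with SITE 2 and Technical Services, as in the Response column above.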

 


 

Stats Monitors:
APP31 to App Queues

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from NASDAQ and SITE 2 SITE 2s to PROCESSINGs and MESSAGING APP19 via MESSAGING Processors.

Monitor shows 29 West to PROCESSING processing statistics for each process, including queues.

PROD MENU:
MESSAGING Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- nMsgQueSize,
- InMsgRate

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

nMsgQueSize is non-zero value and not reducing as expected, or InMsgRate is not increasing as expected.

Receiving process may not be up and/or there are THIRD PARTY delivery issues.
1) Check status of receiving processes
2) Involve Production Support if necessary.
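
"Non-zero and not reducing" is a trend, so it takes more than one reading to confirm. A minimal sketch, assuming the operator notes a few successive queue-size values from the monitor; the function name is hypothetical, and how many readings to take and how far apart is left to judgment.

```python
from typing import Sequence


def queue_is_stuck(samples: Sequence[int]) -> bool:
    """True when the queue is non-zero in every sample and never shrinks.

    `samples` are successive queue-size readings, oldest first.
    """
    if len(samples) < 2:
        return False  # not enough history to judge a trend
    if any(size == 0 for size in samples):
        return False  # queue emptied at least once
    return all(later >= earlier
               for earlier, later in zip(samples, samples[1:]))
```

For example, readings of 5, 5, 7 indicate a stuck queue, while 9, 6, 3 show a queue that is draining.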


 


 

Stats Monitors:
APP31 to ACTR IPC Connect Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from NASDAQ and SITE 2 SITE 2s to PROCESSINGs and MESSAGING APP19 via MESSAGING Processors.

Monitor shows connection status between (APP31) applications and ACTIVITY Reader.

PROD MENU:
MESSAGING Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Status,
- Write Queue

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Not all APP31 or ACTR processes are displayed as expected.

APP31 or ACTR Service has not been started or hasn't processed any messages since monitor has been started.
1) Check status of service.
2) Check ACTR xml data files.

Status is not CONNECTED.

Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if issues.

Queue size is non-zero value and not decreasing as expected.

Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; do not stop/restart source process.
2) Notify Production Support if issues.
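
The two IPC rows above amount to a choice of which end of the link to restart. The sketch below encodes that choice. The `IpcLink` type, field names and function name are hypothetical. It also assumes that the queued-messages rule takes precedence when a link is both disconnected and holding a stuck queue, which the table does not state.

```python
from typing import List, NamedTuple


class IpcLink(NamedTuple):
    source: str
    dest: str
    status: str        # as shown in the Status column, e.g. "CONNECTED"
    queue_stuck: bool  # Write Queue non-zero and not decreasing


def process_to_restart(link: IpcLink, all_links: List[IpcLink]) -> str:
    """Return the process to stop/restart for `link`, or "" if healthy."""
    if link.queue_stuck:
        # Restarting the source could delete its queued messages.
        return link.dest
    if link.status == "CONNECTED":
        return ""
    # Is another source failing to reach the same destination?
    peers_also_down = any(
        other.dest == link.dest
        and other.source != link.source
        and other.status != "CONNECTED"
        for other in all_links
    )
    return link.dest if peers_also_down else link.source
```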

 


 

Stats Monitors:
THIRD PARTY APP31 MESSAGES 1 LBM Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate inbound quote processing and database loading.

Monitor shows 29 West statistics (by topic) for APP31s (receiving download messages from PROCESSINGs and sending to MESSAGING Loaders) and for BBO Duration and Quote Montage database loading processes, which receive from APP31 processes.

PROD MENU:
29 West LBM Monitor Menu

To Exit:
Close Window

- Color of data in columns
- ContextName,
- Service,
- Topic_name,
- Type,
- Rate,
- Persistence,
- MsgCount

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

Rate and/or MsgCount values are not incrementing as expected.

MESSAGING may not be processed or loaded into the database as expected.
1) Check APP11/SUBGROUP02 log files and BBOD/SUBGROUP03 Binary data files to confirm inbound messages received and loaded into database.
2) Call Production Support if necessary.

 

 

 

By matching topic_name, Rcv MsgCount in this monitor does not match total Src MsgCount values.

MESSAGING may not be processed or loaded into the database as expected.
1) Check APP11/SUBGROUP02 log files and BBOD/SUBGROUP03 Binary data files to confirm inbound messages received and loaded into database.
2) Call Production Support if necessary.
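
The last row above compares each receiver's MsgCount with the total MsgCount of all sources on the same topic_name. A minimal sketch of that comparison follows. The row layout, the "Src"/"Rcv" values for the Type column and the function name are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (context_name, topic_name, type, msg_count); type is "Src" or "Rcv".
Row = Tuple[str, str, str, int]


def receivers_behind(rows: List[Row]) -> List[Tuple[str, str, int, int]]:
    """List receivers whose MsgCount differs from the source total.

    Each result is (context_name, topic_name, rcv_count, src_total).
    """
    src_total: Dict[str, int] = defaultdict(int)
    for _context, topic, kind, count in rows:
        if kind == "Src":
            src_total[topic] += count
    return [
        (context, topic, count, src_total[topic])
        for context, topic, kind, count in rows
        if kind == "Rcv" and count != src_total[topic]
    ]
```

An empty result means every receiver has seen as many messages as the sources sent on its topic. Counts taken while traffic is flowing can differ briefly, so repeat the check before escalating.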

 


 

Stats Monitors:
THIRD PARTY APP31 MESSAGES 1 LBM RCVRM Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate inbound quote processing and database loading.

Monitor shows 29 West "receiving" statistics (by sender, and topic) for APP31s (receiving download messages from PROCESSINGs) and BBO Duration and Quote Montage Database Loading  processes, receiving from APP31 processes.

PROD MENU:
29 West LBM Monitor Menu

To Exit:
Close Window

- Color of data in columns
- ContextName,
- Service,
- Topic_name,
- Lost-Recovered,
- Lost-Unrecovered-Txm,
- Lost-unrecovered-tmo,
- Msgs_rcved

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

Msgs_rcved values are not incrementing as expected.

MESSAGING may not be processed or loaded into the database as expected.
1) Check APP11/SUBGROUP02 log files and BBOD/SUBGROUP03 Binary data files to confirm inbound messages received and loaded into database.
2) Call Production Support if necessary.

 

 

 

Lost-unrecovered values are non-zero.

NOTE: Non-zero Lost-recovered values may also indicate problems, but they mean that THIRD PARTY auto-recovered the lost messages.

MESSAGING may not be processed or loaded into the database as expected.
1) Check APP11/SUBGROUP02 log files and BBOD/SUBGROUP03 Binary data files to confirm inbound messages received and loaded into database.
2) Call Production Support if necessary.
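
The loss counters in this monitor carry different weight: unrecovered loss means data is missing, while recovered loss is only a warning sign. A sketch of that triage follows. The function name and the returned labels are hypothetical; the parameters mirror the Lost-Recovered, Lost-Unrecovered-Txm and Lost-unrecovered-tmo columns.

```python
def assess_lbm_loss(lost_recovered: int,
                    lost_unrecovered_txm: int,
                    lost_unrecovered_tmo: int) -> str:
    """Triage one receiver row from the RCVRM stats monitor."""
    if lost_unrecovered_txm > 0 or lost_unrecovered_tmo > 0:
        # Messages were lost for good: check log and binary data files.
        return "INVESTIGATE"
    if lost_recovered > 0:
        # Loss occurred but was auto-recovered: keep an eye on it.
        return "WATCH"
    return "OK"
```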

 


 

Stats Monitors:
THIRD PARTY APP31  LBM Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate inbound lastsale processing and database loading.

Monitor shows 29 West statistics (by topic) for MESSAGING Processors (receiving download messages from PROCESSINGs and sending to MESSAGING Loaders) and for Last Sale Montage database loading processes, which receive from MESSAGING Processor processes.

PROD MENU:
29 West LBM Monitor Menu

To Exit:
Close Window

- Color of data in columns
- ContextName,
- Service,
- Topic_name,
- Type,
- Rate,
- Persistence,
- MsgCount

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

Rate and/or MsgCount values are not incrementing as expected.

MESSAGING may not be processed or loaded into the database as expected.
1) Check APP13/SUBGROUP04 log files and SUBGROUP02 Binary data files to confirm inbound messages received and loaded into database.
2) Call Production Support if necessary.

 

 

 

By matching topic_name, Rcv MsgCount in this monitor does not match total Src MsgCount values.

MESSAGING may not be processed or loaded into the database as expected.
1) Check APP13/SUBGROUP04 log files and SUBGROUP02 Binary data files to confirm inbound messages received and loaded into database.
2) Call Production Support if necessary.


 

Stats Monitors:
THIRD PARTY APP31  LBM RCVRM Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate inbound lastsale processing and database loading.

Monitor shows 29 West "receiving" statistics (by sender, and topic) for MESSAGING Processors (receiving download messages from PROCESSINGs) and Last Sale Montage Database Loading  processes, receiving from MESSAGING Processor processes.

PROD MENU:
29 West LBM Monitor Menu

To Exit:
Close Window

- Color of data in columns
- ContextName,
- Service,
- Topic_name,
- Lost-Recovered,
- Lost-Unrecovered-Txm,
- Lost-unrecovered-tmo,
- Msgs_rcved

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

Msgs_rcved values are not incrementing as expected.

MESSAGING may not be processed or loaded into the database as expected.
1) Check APP13/SUBGROUP04 log files and SUBGROUP02 Binary data files to confirm inbound messages received and loaded into database.
2) Call Production Support if necessary.

 

 

 

Lost-unrecovered values are non-zero.

NOTE: Non-zero Lost-recovered values may also indicate problems, but they mean that THIRD PARTY auto-recovered the lost messages.

MESSAGING may not be processed or loaded into the database as expected.
1) Check APP13/SUBGROUP04 log files and SUBGROUP02 Binary data files to confirm inbound messages received and loaded into database.
2) Call Production Support if necessary.

 

               


 

APP32 (PROCESSING)

               

APP32 Purpose:

APP32 Processes read order messages from APP20s and APP38ridges, order and trade modification messages from APP12 and MESSAGING (via APP38ridges), MESSAGING from APP31 processes and MESSAGING Inquiry messages from MESSAGING.

 

APP32 Processes send order responses to APP20s and APP38ridges, drop copies to APP22 processes, and MESSAGES 1, MESSAGES 2 and MESSAGING messages to SUBGROUP01 processes.  They also send database loader messages to APP40 for database loading.

 

Use the following hyperlinks to jump to the desired section of APP32 documentation:

ME_Recovery_Considerations

ME_NTM_Control_Commands

ME_Troubleshooting_Table

ME_Monitoring_Considerations

 


 

APP32 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP32 nodes only when moving between nodes.  MESSAGING interfaces must be configured.

 

NOTE: A known bug exists when moving APP40 processes from one node to another in an orderly fashion: APP32 messages may get lost as a result.  If this occurs, records will need to be extracted from ME_LBM logs and provided to PTT developers to try to recreate the data in the database.  This is a very cumbersome operation and should be avoided if at all possible.

 

-          When stopping/restarting APP32 instances:

1)      Stop APP10 testing for the APP32 instance involved before stopping the ME

2)      Stop the APP32 instance

3)      Determine if associated APP40 and DLME instances used by the affected APP32 instance also require recovery.

a)       APP40 and APP19 are configured to run on different nodes from the APP32 (unless they are test instances of ME).

b)      If the APP32 is moving to a host within the same data center, it is possible that the APP40 (and DLMEs) do not need to be moved.

c)       If the APP32 is moving to a host in a different data center, then it would be prudent to move the APP40 (and DLMEs) as well.

4)      Confirm that the associated Database loader is up to date IN THE DATA CENTER that the APP32 will be restarted in.

a)       If the APP32 is restarted without the database being up to date with all of the most recent transactions, some order processing may not be handled as expected.

b)      If there are database loader rejects outstanding, they must be replayed before the APP32 restarts.  Use database loader reject procedures to replay this data.

c)       It may be necessary to move the Database Loader to another node to complete loading this data.  See step 5. 

5)      If the associated APP40 and DLME instances are NOT moving to an alternate node, skip to step 6.

a)       If moving APP40 and DLME processes:

i)        Stop APP40 process on current node.

ii)       Stop DLME process on current node.

iii)     If NOT moving the Database Loader files to a new node, skip to step 5-iv.

(a)    Database Loader files should only be moved if “pre-move” database loading cannot be completed without doing so.

(b)    If moving Database Loader files:

Copy :\chx\data\DL*.log, DL*.pos, DL*rejects.log files (created on day) for each Database Loader process moving to the alternate node.

iv)     Start APP40 process on new node.

v)       Start DLME process on new node.

6)      Start the APP32 instance and confirm it starts without errors.

 


7)      Start APP10 testing for the APP32 instance involved and confirm testing is conducted without errors.

8)      Confirm APP10 orders and MESSAGES 2 before and after restart can be queried in APP12 to ensure that APP40 communications are as expected.

9)      Resume trading in all stocks for affected ME.

a)       Trading will be halted by default if APP32 startup occurs after any stock’s primary session begins.

b)      Do not OPEN stocks.  RESUME.

10)   Determine if associated DLMP processes need to be recovered.

a)       DLMP (Performance Loaders) are coded such that they must run on the same nodes as the ME.

b)      If the APP32 has moved to another node, the DLMP must move as well.

11)   If DLMP processes will move:

a)       Stop associated DLMP processes

b)      Start associated DLMP processes.
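
The stop/restart procedure above branches on a few yes/no decisions. The sketch below turns those decisions into an ordered checklist. It is an illustration under stated assumptions, not an automation of the NTM Control Utility: the function name, parameter names and step wording are hypothetical, and the caller decides `move_loaders` (prudent when the APP32 moves to a different data center, per step 3).

```python
from typing import List


def me_restart_plan(me_moves_node: bool,
                    move_loaders: bool,
                    move_loader_files: bool = False,
                    rejects_outstanding: bool = False) -> List[str]:
    """Return the ordered checklist for one APP32 stop/restart."""
    steps = [
        "Stop APP10 testing for the APP32 instance",
        "Stop the APP32 instance",
    ]
    if rejects_outstanding:
        # Rejects must be replayed before the APP32 restarts.
        steps.append("Replay outstanding database loader rejects")
    steps.append(
        "Confirm database loader is up to date in the target data center")
    if move_loaders:
        steps += ["Stop APP40 on current node", "Stop DLME on current node"]
        if move_loader_files:
            steps.append("Copy database loader files to the alternate node")
        steps += ["Start APP40 on new node", "Start DLME on new node"]
    steps += [
        "Start the APP32 instance and confirm no errors",
        "Start APP10 testing and confirm no errors",
        "Confirm APP10 orders can be queried in APP12",
        "Resume (do not Open) trading for the affected ME",
    ]
    if me_moves_node:
        # DLMP must run on the same node as the ME.
        steps += ["Stop associated DLMP processes",
                  "Start associated DLMP processes"]
    return steps
```

With every flag False the checklist has seven steps; a move to another data center with loader files and outstanding rejects produces the full fifteen.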

 

 

 

APP32 NTM Control Commands:

Halting Issues by ME/Stock:

-          Use NTM Control Utility – Service Control – APP32 - options to Halt Issues.

-          Select (and highlight) desired PROCESSINGs and right click to see and select desired options.

1)      The APP29 instrument queries can inform a user which MEs are assigned to which stocks, if needed.

2)      The APP29 instrument queries can also be used to easily store this information in an excel spreadsheet if needed.

3)      Halting issues results in Halts, regardless of what the listing SITEs are doing.

 

-          From the APP32 options, the following can be used to halt issues:

1)      Halt All Issues

2)      Halt APP10 Issues

3)      Halt Issue (if wanting to halt an individual stock versus a group of stocks)

4)      Halt Issue by SITE (need to know SITE code)

5)      Halt Exclusive Issue

 


 

Resuming/Opening Issues by ME/Stock:

-          Use NTM Control Utility – Service Control – APP32 options to Resume or Open Issues.

-          Select (and highlight) desired PROCESSINGs and right click to see and select desired options.

1)      The APP29 instrument queries can inform a user which MEs are assigned to which stocks, if needed.

2)      The APP29 instrument queries can also be used to easily store this information in an excel spreadsheet if needed.

3)      Resuming issues results in setting the issue’s trading status to match the last received status by APP31 processes.

4)      Opening issues results in Openings, regardless of what the listing SITEs are doing or what is last known by APP31.

 

-          From the APP32 options, the following can be used to resume issues:

1)      Resume All Issues

2)      Resume APP10 Issues

3)      Resume Issue (if wanting to resume an individual stock versus a group of stocks)

4)      Resume Exclusive Issue

-          From the APP32 options, the following can be used to open issues:

1)      Open All Issues

2)      Open Issue (if wanting to open an individual stock versus a group of stocks)

3)      Open Issue by SITE (need to know SITE code)
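
The difference between Resume and Open described above is easy to get wrong under pressure, so a minimal model may help. The function name and the status strings ("OPEN", "HALTED") are hypothetical; only the behavior comes from the notes in this section.

```python
def status_after(command: str, last_feed_status: str) -> str:
    """Trading status of an issue after a Resume or Open command.

    `last_feed_status` is the last status received by APP31 processes.
    """
    if command == "RESUME":
        # Resume follows the last status received from the listing SITE.
        return last_feed_status
    if command == "OPEN":
        # Open forces an opening regardless of the listing SITE.
        return "OPEN"
    raise ValueError(f"unknown command {command!r}")
```

If the listing SITE still has the issue halted, `status_after("RESUME", "HALTED")` leaves it halted, while `status_after("OPEN", "HALTED")` opens it anyway. This is why the restart procedure calls for Resume rather than Open.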

 

LULD Pausing/Resuming Issues by ME:

-          Use NTM Control Utility – Service Control – APP32 - options to LULD Trading Pause/Resume.

-          Select (and highlight) desired PROCESSINGs and right click to see and select desired options.

-          Under LULD options, user will be required to enter Issue Symbol to Pause/Resume.

 

Getting Issues Open/Issues Not Open by ME:

-          Use NTM Control Utility – Service Control – APP32 - options to Get Issues Open/Not Open.

-          Select (and highlight) desired PROCESSINGs and right click to see and select desired options.

 

Resend MESSAGES 1 by ME:

-          Use NTM Control Utility – Service Control – APP32 – Resend APP32 MESSAGES 1.

-          Select (and highlight) desired PROCESSINGs and right click to see and select desired options.

 

Set Quote Conditions or Zero MESSAGES 1 by ME:

-          Use NTM Control Utility – Service Control – SUBGROUP01 (Options by ME) – to:

1)      Zero Quote by ME, Set Quote Condition Auto, Set Quote Condition Manual.

-          Select (and highlight) desired PROCESSINGs and right click to see and select desired options.

 


 

Enable/Disable MKT IOC by ME:

-          Use NTM Control Utility – Service Control – APP32 – Enable/Disable Market IOC.

-          Select (and highlight) desired PROCESSINGs and right click to see and select desired options.

 

Forcibly cancel orders (by either ME, order sending firm, or stock) by ME:

-          Use NTM Control Utility – Service Control – APP32 - options to Forcibly Cancel All Orders, Order for Firm or Orders for Issue.

-          Select (and highlight) desired PROCESSINGs and right click to see and select desired options.

 

Enable/Disable THIRD PARTY LBM Stats by ME:

-          Use NTM Control Utility – Service Control – APP32 - options to Enable/Disable THIRD PARTY Stats.

-          Select (and highlight) desired PROCESSINGs and right click to see and select desired options.

 

 

 

 

 


 

APP32 Troubleshooting Table:

APP32 Symptom

ImpaAPP13

Response

Node Crashes

 

Evidenced by:

-          In EMT, applications report lost communications to MEs

-          In SolarWinds (and Outlook), node and processes will be reported down.

is not trading issues as expected. 

 

Order sending firms will not be receiving responses/updates and industry will not be getting MESSAGES 1 or .

 

Refer to:

APPGROUP10 Server

Server Specific Recoveries.

1)      Refer to:

APPGROUP10 Server

Server Specific Recoveries.

 

2)      Confirm SUBGROUP01 zeroed MESSAGES 1 for all affected MEs.

3)      Notify Management.

4)      Work with Tech Services to confirm server status.

5)      If node is not to be used and processes need to be moved, determine if database loading has been completed, and complete per recovery procedures.

6)      Restart affected MEs.

See ME_Recovery_Considerations

7)      Trading will be halted by default upon restart.

8)      Notify industry after trading is resumed.

 

APP32 hangs (suspends processing)

 

Evidenced by:

-          In APP32 Thread stats, non-zero values are seen in queues and not reducing.

is not trading issues as expected.  Order sending firms will not be receiving responses/updates and industry will not be getting MESSAGES 1 or .

 

Refer to:

CHX_Issues_In_Order_Handling

Generalized Recovery Scenarios for more.

 

1)      Stop/Restart affected MEs.

See ME_Recovery_Considerations

2)      Notify management.

3)      Trading will be halted by default upon restart.

4)      Notify industry after trading is resumed.

 

 


 

APP32 Monitoring Considerations:

Stats Monitors:
MT APP32 Statistics

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate order and trade processing.

Monitor shows PROCESSING status of order routing and processing statistics.

PROD MENU:
Trading Applications Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- in rate,
- routing enable,
- ors directed connected

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

In_rate and/or msg_in value is zero.

PROCESSING may not be processing as expected.
1) Check APP32 status.
2) Call Production Support if needed.

Routing_enable and/or Ors_directed_connected flags are N.

PROCESSING is not enabled for Outbound Routing.
1) Check status of ORS process.  Routing will be disabled if process is not connected.
2) Use NTM Control Utility PROCESSING Service Controls to Enable Reg NMS Routing.
3)  Call Production Support if needed.


 

Stats Monitors:
MT APP32 Thread Statistics

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate order and trade processing.

Monitor shows PROCESSING THREAD status and processing statistics.

PROD MENU:
Trading Applications Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- data queue,
- ctrl que

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Data_queue and/or Ctrl_que column is non-zero value and not decreasing as expected.

PROCESSING may not be processing as expected.
1) Check APP32 status.
2) APP32 may need to be stopped/restarted.  Call Production Support if needed.

 

Stats Monitors:
APP32 IPC Instance Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate order and trade processing.

Monitor shows IPC channel processing statistics.

PROD MENU:
Trading Applications Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Service,
- Hostname,
- Msgs In,
- Msgs Out

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Not all APP32 processes are displayed as expected.

APP32 Service has not been started.
1) Check status of service.

Msgs In and/or Msgs Out are zero when messages are processed.

No messages have been sent/received since that monitor has been started.
1) Check APP12 log files.
2) Call Production Support if necessary.

 

Stats Monitors:
APP32 to APP37 IPC Connect Queues
Processes facilitate order and trade processing. Monitor shows IPC channel connectivity status between APP37 Bridge and APP37 processes.

To Start:
PROD MENU: Trading Applications Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Source
- Dest
- Status
- Queue Size
- Msgs Out/Sec

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom:
Not all APP37 Bridge or APP37 processes are displayed as expected.

Response:
APP37 Bridge or APP37 Service has not been started or hasn't processed any messages since the monitor was started.
1) Check status of service.
2) Check APP37 Bridge or APP37 log files.

Symptom:
Status is not CONNECTED.

Response:
Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if issues persist.

Symptom:
Queue size is a non-zero value and not decreasing as expected.

Response:
Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; do not stop/restart source process.
2) Notify Production Support if issues persist.
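
The two IPC responses above amount to a small decision rule about which side of a channel to stop/restart. The following Python sketch restates that rule under stated assumptions: the `IpcRow` fields mirror the monitor columns (Source, Dest, Status, Queue Size), the names `IpcRow` and `restart_candidate` are hypothetical, and giving the stuck-queue rule priority over the not-connected rule is an assumption made here to protect queued messages.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class IpcRow:
    """One row of the IPC Connect Queues monitor (illustrative only)."""
    source: str
    dest: str
    status: str
    queue_size: int


def restart_candidate(
    row: IpcRow, all_rows: Iterable[IpcRow], queue_stuck: bool = False
) -> Optional[str]:
    """Suggest which process to stop/restart for a problem row, or None."""
    # A stuck, non-empty queue: restart the destination only, so that
    # messages queued at the source are not accidentally deleted.
    if queue_stuck and row.queue_size > 0:
        return row.dest
    if row.status.upper() != "CONNECTED":
        # If other sources also cannot reach this destination, the
        # destination is the likely culprit; otherwise suspect the source.
        shared_trouble = any(
            other.dest == row.dest
            and other.source != row.source
            and other.status.upper() != "CONNECTED"
            for other in all_rows
        )
        return row.dest if shared_trouble else row.source
    return None  # row is healthy; nothing to restart
```

The function only suggests a candidate; Production Support should still be notified as described in the Response steps.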

 


 

Stats Monitors:
THIRD PARTY APP32 LBM Stats
Processes facilitate order and trade processing. Monitor shows 29 West statistics (by topic) for PROCESSING processes, receiving from APP20, APP37 Bridge, ORS and MESSAGING Processor processes, and sending to APP20, APP31 (download messages only), APP37 Bridge, ORS, SUBGROUP01, SUBGROUP02, and APP40 (for database loader and APP21) processes.

To Start:
PROD MENU: 29 West LBM Monitor Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- ContextName
- Service
- Topic_name
- Type
- Rate
- Persistence
- MsgCount

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom:
Statistics are not shown for APP32 process as expected.

Response:
Process may not be up, or 29 West Stats have not yet been enabled for process.
1) Check status of process.
2) Via NTM Control APP32 Service Controls, enable 29 West stats for process.

Symptom:
Rate and/or MsgCount values are not incrementing as expected.

Response:
Order and/or trade related processing may not be working as expected.
1) Check LBM APP32 log files to confirm 29 West related processing.
2) Call Production Support if necessary.

 

 

 


 

Stats Monitors:
THIRD PARTY APP32 LBM RCVRM Stats
Processes facilitate order and trade processing. Monitor shows 29 West "receiving" statistics (by sender and topic) for PROCESSING processes, receiving from APP20, APP37 Bridge, ORS and MESSAGING Processor processes.

To Start:
PROD MENU: 29 West LBM Monitor Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- ContextName
- Service
- Topic_name
- Lost-Recovered
- Lost-Unrecovered-Txm
- Lost-unrecovered-tmo
- Msgs_rcved

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom:
Statistics are not shown for APP32 process as expected.

Response:
Process may not be up, or 29 West Stats have not yet been enabled for process.
1) Check status of process.
2) Via NTM Control APP32 Service Controls, enable 29 West stats for process.

Symptom:
Msgs_rcved values are not incrementing as expected.

Response:
Order and/or trade related processing may not be working as expected.
1) Check LBM APP32 log files to confirm 29 West related processing.
2) Call Production Support if necessary.

Symptom:
Lost-unrecovered values are non-zero.
NOTE: Non-zero Lost-recovered values may also indicate problems, but they mean that THIRD PARTY auto-recovered the lost messages.

Response:
Order and/or trade related processing may not be working as expected.
1) Check LBM APP32 log files to confirm 29 West related processing.
2) Call Production Support if necessary.
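
The distinction between recovered and unrecovered loss can be expressed as a simple triage of one monitor row. This is a minimal Python sketch, assuming the three loss columns are read as integers. The function name and the "ACTION" / "WATCH" / "OK" labels are hypothetical and are not values shown by the monitor.

```python
def classify_receiver_row(
    lost_recovered: int,
    lost_unrecovered_txm: int,
    lost_unrecovered_tmo: int,
) -> str:
    """Triage one row of the receiver statistics monitor.

    Any unrecovered loss calls for the documented response; recovered
    loss alone is worth watching because messages were lost in transit
    even though they were recovered automatically.
    """
    if lost_unrecovered_txm > 0 or lost_unrecovered_tmo > 0:
        return "ACTION"
    if lost_recovered > 0:
        return "WATCH"
    return "OK"
```

For example, a row with Lost-Recovered of 4 and both unrecovered counters at 0 would be labeled "WATCH" by this sketch.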

 


 

APP33 ( Client)

               

APP33 Purpose:

APP33 Client allows users to query, enter, and modify  support records.  All writes are directly to the database. 

 

APP33 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP33 nodes only when moving between nodes.  (Java code, FireDaemon and Host Specific references in JNLPs required.)

 

-          When stopping/restarting APP33 processes:

1)      Notify Operations and work in cooperation with them, as appropriate to situations.

2)      Stop/Restart the APP33 process.

 

APP33 NTM Control Commands:

There are no APP33 specific NTM Control commands.

 

APP33 Troubleshooting Table:

APP33 Symptom

Impacts

Response

Node Crashes

 

Evidenced by:

-          In SolarWinds (and Outlook), node and processes will be reported down.

 

s Staff and IB staff will not be able to query or administratively manage  support records via APP33 Client.

 

1)      Refer to: APPGROUP08 or APPGROUP09 Server, Server Specific Recoveries.

2)      Notify Management.

 

 

APP33 Monitoring Considerations:

 

There are no APP33 specific monitors outside of EMT.

APP34 (RESOLVER)

 

                APP34 Purpose:

APP34 allows applications to advertise their services and connection points to a local repository so other applications can find and connect to them.

In production configurations, there are two redundant APP34 services running in tandem: one in the DC1 data center and one in the DC2 data center.

The APP34 process is required at application startup only.  Only one APP34 instance is required at any time to support application startups.

 

APP34 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP34 nodes only when moving between nodes.  (FireDaemon and DNS dependencies exist for APP34.)

-          If moving APP34 to alternate node, Tech Services must redefine APP34 DNS IP addresses.

 

APP34 NTM Control Commands:

Reload Static Services:

-          Use NTM Control Utility – Service Control – Nespr – Reload Static Services.

-          Select (and highlight) desired processes and right click to see and select desired options.

 

APP34 Troubleshooting Table:

APP34 Symptom

Impacts

Response

Node Crashes

 

Evidenced by:

-          In EMT, applications report lost communications to APP34 service.

-          In SolarWinds (and Outlook), node and processes will be reported down.

 

-Unless both APP34 services are down, or inaccessible for any reason, there will be no impact.  If both are down, applications will not be able to start cleanly.

 

1)      Refer to: APPGROUP05 Server, Server Specific Recoveries.

2)      Restart affected APP25 processes on alternate nodes.

3)      Notify Management and PTT.

 

 

APP34 Monitoring Considerations:

There are no APP34 specific monitors outside of EMT.

APP35 (OPS System)

 

                APP35 Purpose:

APP35 reads Application system messages via logserver processes running on every application node.

APP35 then broadcasts these messages to the network via multicast to be picked up and displayed by the EMT (Event Messaging Terminal).

Operations management staff cannot monitor system applications without APP35 working as expected.

 

Use the following hyperlinks to jump to the desired section of APP35 documentation:

APP35_Recovery_Considerations

APP35_NTM_Control_Commands

APP35_Troubleshooting_Table

APP35_Monitoring_Considerations

 

 

APP35 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP35 nodes only when moving between nodes.  (There are no dependencies outside of expected server allocations.)

 

 

APP35 NTM Control Commands:

There are no APP35 specific NTM Control Commands.

 

 


 

APP35 Troubleshooting Table:

APP35 Symptom

Impacts

Response

Node Crashes

 

Evidenced by:

-          In EMT, applications report lost communications to APP34 service.

-          In SolarWinds (and Outlook), node and processes will be reported down.

 

-Unless both APP34 services are down, or inaccessible for any reason, there will be no impact.  If both are down, applications will not be able to start cleanly.

 

1)      Refer to: APPGROUP14 Server, Server Specific Recoveries.

2)      Restart affected APP25 processes on alternate nodes.

3)      Notify Management and PTT.

 

 

APP35 Monitoring Considerations:

Stats Monitors:
APP35 Stats
Processes facilitate Application messages sent to Logserver to be reported to EMT monitors. Monitor shows IPC channel processing statistics between APP35 service and all LogServers.

To Start:
PROD MENU: Trading Applications Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- Service Name
- Conn Name
- Status
- InCnt

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom:
Not all APP35 or Logserver processes are displayed as expected.

Response:
APP35 or Logserver Service has not been started.
1) Check status of APP35 and Logserver services.

Symptom:
InCnt values are zero while messages are being processed.

Response:
No messages have been sent/received since the monitor was started.
1) Check Logserver and Opcon data files.
2) Call Production Support if necessary.

 

APP36 (SUBGROUP01, SUBGROUP02, APP06)

               

Purpose:

 

The APP36 system is comprised of three separate application service types: SUBGROUP01, SUBGROUP02 and APP06.

Together, these three service types receive and process quote, trade and MESSAGING messages from the PROCESSINGs. 

The PROCESSING produces one message for all three of these service types to process, each taking their part from this message.

The data path for these messages is as follows:

1) APP32 to SUBGROUP01

2) SUBGROUP01 to SUBGROUP02

3) SUBGROUP02 to APP06

 

                Because these separate service types work together to process all outbound MESSAGING, their operations and recoveries must be considered together.
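
The data path above implies that a problem at one stage affects that stage and every stage after it, which is why the three service types are recovered together. The following is a minimal Python sketch of that reasoning, assuming the path is strictly linear as listed. `DATA_PATH` and `affected_stages` are illustrative names only and do not correspond to any NTM Control command.

```python
from typing import List

# Stage order taken from the data path listed above.
DATA_PATH: List[str] = ["APP32", "SUBGROUP01", "SUBGROUP02", "APP06"]


def affected_stages(failed_stage: str) -> List[str]:
    """Return the failed stage plus every stage downstream of it."""
    if failed_stage not in DATA_PATH:
        raise ValueError(f"unknown stage: {failed_stage}")
    return DATA_PATH[DATA_PATH.index(failed_stage):]


# affected_stages("SUBGROUP01") -> ["SUBGROUP01", "SUBGROUP02", "APP06"]
# affected_stages("APP06")      -> ["APP06"]
```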

 

SUBGROUP01 Services include the following services:

APP11RI Processes read combined data from PROCESSINGs and send quote data to SITE 2 and APP10, as well as trade/MESSAGING data to SUBGROUP02.

UQDRI Processes read combined data from PROCESSINGs and send quote data to NASDAQ and APP10, as well as trade/MESSAGING data to SUBGROUP02.

 

SUBGROUP02 Services include the following services:

APP13RI Processes read combined data from SUBGROUP01 and trade data from APP12 and MESSAGING, and send trade data to SITE 2 and RTC, and MESSAGING data to APP06.

UTDRI Processes read combined data from SUBGROUP01 and trade data from APP12 and MESSAGING, and send trade data to NASDAQ and RTC, and MESSAGING to APP06.

 

APP06 Services:

APP06 Services read MESSAGING data from SUBGROUP02 and send MESSAGING data to MESSAGING Subscribers via Multicast.

 

Use the following hyperlinks to jump to the desired section of APP36 documentation:

APP36_Recovery_Considerations

APP36_NTM_Control_Commands

 

APP36_SUBGROUP01_Troubleshooting_Table

APP36_SUBGROUP02_Troubleshooting_Table

APP36_APP06_Troubleshooting_Table

 

APP36_SUBGROUP01_SUBGROUP02_Monitoring_Considerations

APP36_APP06_Monitoring_Considerations

 

APP36 Recovery Considerations:

Stopping/Restart Processes:

 

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP36 nodes only when moving between nodes.  (APP36 NAT addresses must be configured/used by SITE 2s.)

 

                                Because SUBGROUP01, SUBGROUP02 and APP06 work together to process all outbound MESSAGING, their operations and recoveries must be considered together.

                                Go to APP36_Combined_SUBGROUP01_SUBGROUP02_APP06_Move_Procedure for procedure if moving these systems to other nodes.

 

 

-          When stopping/restarting SUBGROUP01 instances:

1)      There should be no special dependencies or considerations.  A simple restart should suffice.

2)      Upon reconnection to SITE 2 or NASDAQ, SUBGROUP01 will request any queued combined APP36 data messages from the APP41 data store and send delayed MESSAGES 1 to the APP19 only (not the SITE 2), and then request the most recent stock MESSAGES 1 from all connected APP32 instances and send these to the SITE 2s.

 

-          When stopping/restarting SUBGROUP02 instances:

1)      For UTDRI (NASDAQ) processes, it is imperative that the database is up to date before restart.

a)       If the UTDRI involved is restarted without the database being up to date with all of the most recent transactions, stock-specific sequence numbers sent to NASDAQ may be off and trade messages may not be processed as expected.

2)      For APP13RI (SITE 2), there should be no special dependencies or considerations.  A simple restart should suffice.

3)      Upon reconnection to SITE 2 or NASDAQ, SUBGROUP02 will request any queued combined APP36 data messages from the APP41 data store and send queued MESSAGES 2 to the SITE 2 marked “sold” as well as to MESSAGING, and then forward the MESSAGING portion of the messages to APP06 services.

 

-          When stopping/restarting APP06 instances:

1)      APP06 files must be moved before the APP06 restarts on the new node.

a)       If these files are not moved, two general impacts will be seen:

·         MESSAGING retrans requests made after recovery may not be able to find messages requested.

·         MESSAGING Sequence Number Resets will likely be seen by the MESSAGING subscribers and may compromise their resulting functionality.

2)      Upon startup, APP06 services will request any queued MESSAGING messages from the APP41 data store and resend these (emulating a MESSAGING) before sending any subsequent messages from the APP36 data path.

 

 


 

APP36 Combined SUBGROUP01, SUBGROUP02, APP06 Move Procedure

 

-          By design, there are four APP36 servers between both data centers:

·         One DC1 server supporting DC1 SITE 2 traded stocks, sending the SUBGROUP01 and SUBGROUP02 to SITE 2 and broadcasting the same APP06 data.

·         One DC1 server supporting DC1 NASDAQ traded stocks, sending the SUBGROUP01 and SUBGROUP02 to NASDAQ and broadcasting the same APP06 data.

·         One DC2 server supporting DC2 SITE 2 traded stocks, sending the SUBGROUP01 and SUBGROUP02 to SITE 2 and broadcasting the same APP06 data.

·         One DC2 server supporting DC2 NASDAQ traded stocks, sending the SUBGROUP01 and SUBGROUP02 to NASDAQ and broadcasting the same APP06 data.

 

 

-          If these services should move between servers at any time, all services should move together using the following procedure:

 

1)      Stop SUBGROUP01 on the current node

2)      Stop SUBGROUP02 on the current node

3)      Stop APP06 on the current node

4)      If APP09 is also moving, then stop APP09 at this point as well (which would be the case if we moved between data centers).

5)      Confirm that the associated UTDRI Database loader is up to date IN THE DATA CENTER that the UTDRI will be restarted in. 

a)       Use database loader reject procedures to replay this data.

·         If there are database loader rejects outstanding, they must be replayed before UTDRI restarts.

b)      It may be necessary to move the Database Loader to another node to complete loading this data. 

c)       If NOT moving the Database Loader files to a new node, skip to step 6.

·         Database Loader files should only be moved if “pre-move” database loading cannot be completed without doing so.

·         If moving Database Loader files:

o   Stop the affected UTDRI DLTR process to unlock the database loader files.

o   Copy :\chx\data\DL*.log, DL*.pos, DL*rejects.log files (created that day) for each Database Loader process moving to the alternate node.

o   Complete replaying the data, using Replay Procedures as necessary.

6)      If APP09 was moved, Restart APP09 on the new node.

7)      Restart SUBGROUP02 on the new node.

8)      Restart SUBGROUP01 on the new node. 

a)       If SUBGROUP01 connects to the SITE 2s on restart, it will resend the most recent MESSAGES 1 to the SITE 2.

b)      If SUBGROUP02 connects to the SITE 2s on restart, and SUBGROUP01 is also connected, it will send the MESSAGES 2 in the APP41 data store queue marked “sold”.

c)       If there are any questions regarding MESSAGES 1 not being up to date, resend MESSAGES 1 using NTM APP32 commands.

d)      If there are any questions regarding MESSAGES 2 not being sent, resend MESSAGES 2 using APP12 trade queries.

e)      If there are any questions regarding MESSAGING records not being sent, resend MESSAGING using APP12.

Continue recovery of APP06 portion of APP36 system (lower priority than SUBGROUP01 and SUBGROUP02):

9)      Copy APP06 log and inx files to the alternate node

a)       Copy D:\chx\data\APP06*.log and APP06*.inx files (created that day) for each APP06 service moving to its alternate node.

10)   Restart APP06 on new node.

11)   Confirm MESSAGING Reader Clients reflect reconnect to moved APP06 services.
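
Step 9 above copies only the APP06 log and inx files created that day. The following Python sketch shows one way to list such files before copying them by hand. It assumes that a file's last-modified date is an acceptable stand-in for "created that day"; the function name is hypothetical and the sketch only lists files, it does not copy or move anything.

```python
from datetime import date, datetime
from pathlib import Path
from typing import Iterable, List, Optional


def files_from_today(
    data_dir: Path, patterns: Iterable[str], today: Optional[date] = None
) -> List[Path]:
    """List files in `data_dir` matching `patterns` that were modified today.

    Results are grouped by pattern, in the order the patterns are given,
    and sorted by path within each pattern.
    """
    today = today or date.today()
    selected: List[Path] = []
    for pattern in patterns:
        for path in sorted(data_dir.glob(pattern)):
            if not path.is_file():
                continue
            modified = datetime.fromtimestamp(path.stat().st_mtime).date()
            if modified == today:
                selected.append(path)
    return selected


# Example call, using the directory and patterns named in step 9a:
# files_from_today(Path(r"D:\chx\data"), ["APP06*.log", "APP06*.inx"])
```

Reviewing the resulting list before copying helps confirm that no file from a previous day is carried to the alternate node.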

 

 

APP04 (MESSAGING Reader) Move Procedure

 

If APP06 processes have moved nodes, we must inform MESSAGING processes of moved MESSAGING Services using the following procedure:

1)      Stop all APP04 processes.

2)      Modify \\chxappcfg\APP041\APP04config.xml  to reflect new nodes for APP06 log files.

3)      Restart all APP04 processes.

 

 

APP36 Database Loader Move Procedure

 

If SUBGROUP01, SUBGROUP02, APP06 processes have moved nodes, we must also move Database Loader processes afterward using the following procedure:

1)      Stop/Restart DLQR (after confirming all APP19 are up to date)

2)      Stop/Restart DLTR (after confirming all APP19 are up to date)

3)      Stop/Restart DLBF (after confirming all APP19 are up to date)

 

 

 

APP36 NTM Control Commands:

Use the following hyperlinks to get to SUBGROUP01, SUBGROUP02 or APP06 NTM Commands:

APP36_SUBGROUP01_NTM_Control_Commands

APP36_SUBGROUP02_NTM_Control_Commands

APP36_APP06_NTM_Control_Commands

 

 

SUBGROUP01 NTM Control Commands:

 

Control connections to SITE 2s:

-          Use NTM Control Utility – Service Control – SUBGROUP01 (Options by SUBGROUP01) - Connect/Disconnect To/From SITE 2/SITE 2 or Switch Connection.

-          Select (and highlight) desired processes and right click to see and select desired options.

-          If user desires to connect/disconnect with the SITE 2s production primary site, select either Connect To/Disconnect From SITE 2/SITE 2 options.

-          If user desires to connect to any SITE 2 site other than the production primary site, select Switch Connection and choose the desired site:

1)      PRI_SITE_PRI_ADDR  (primary site, primary server)

2)      PRI_SITE_ALT_ADDR (primary site, alternate server)

3)      DR_SITE_PRI_ADDR  (DR site/remote data center, primary server)

4)      DR_SITE_ALT_ADDR (DR site/remote data center, alternate server)

 

Send Sequence Inquiry to SITE 2s:

-          Use NTM Control Utility – Service Control – SUBGROUP01 (Options by SUBGROUP01) - Send Sequence Inquiry.

-          Select (and highlight) desired processes and right click to see and select desired options.

 

SUBGROUP01 Bypass:

-          Use NTM Control Utility – Service Control – SUBGROUP01 (Options by SUBGROUP01) – SUBGROUP01 Bypass.

-          Select (and highlight) desired processes and right click to see and select desired options.

 

Abort Waiting Download Reply:

-          Use NTM Control Utility – Service Control – SUBGROUP01 (Options by SUBGROUP01) – Abort Waiting Download Reply.

-          Select (and highlight) desired processes and right click to see and select desired options.

 

Enable/Disable Processing Quote Stat:

-          Use NTM Control Utility – Service Control – SUBGROUP01 (Options by SUBGROUP01) – options to Enable/Disable Processing Quote Stat.

-          Select (and highlight) desired processes and right click to see and select desired options.

 

Set Quote Conditions or Zero MESSAGES 1 by ME:

-          Use NTM Control Utility – Service Control – SUBGROUP01 (Options by ME) – to:

1)      Zero Quote by ME,

2)      Set Quote Condition Auto, or

3)      Set Quote Condition Manual.

-          Select (and highlight) desired PROCESSINGs and right click to see and select desired options.

 

 


 

SUBGROUP02 NTM Control Commands:

 

Control connections to SITE 2s:

-          Use NTM Control Utility – Service Control – SUBGROUP02 - Connect/Disconnect To/From SITE 2/SITE 2 or Switch Connection.

-          Select (and highlight) desired processes and right click to see and select desired options.

-          If user desires to connect/disconnect to/from the SITE 2s production primary site, select either Connect/Disconnect To/From SITE 2/SITE 2 options.

-          If user desires to connect to any SITE 2 site other than the production primary site, select Switch Connection and choose the desired site:

1)      PRI_SITE_PRI_ADDR  (primary site, primary server)

2)      PRI_SITE_ALT_ADDR (primary site, alternate server)

3)      DR_SITE_PRI_ADDR  (DR site/remote data center, primary server)

4)      DR_SITE_ALT_ADDR (DR site/remote data center, alternate server)

 

 

Send Sequence Inquiry to SITE 2s:

-          Use NTM Control Utility – Service Control – SUBGROUP02 - Send Sequence Inquiry.

-          Select (and highlight) desired processes and right click to see and select desired options.

 

Send TradeId Inquiry to SITE 2s:

-          Use NTM Control Utility – Service Control – SUBGROUP02 - Send TradeId Inquiry.

-          Select (and highlight) desired processes and right click to see and select desired options.

 

Set Outbound Sequence Number to SITE 2s:

-          Use NTM Control Utility – Service Control – SUBGROUP02 – Set Outbound Sequence Number.

-          Select (and highlight) desired processes and right click to see and select desired options.

-          User will have to enter desired sequence number.

 

Set Outbound TradeId Per Instrument to SITE 2s:

-          Use NTM Control Utility – Service Control – SUBGROUP02 – Set Outbound TradeId Per Instrument.

-          Select (and highlight) desired processes and right click to see and select desired options.

-          User will have to enter Instrument and TradeId desired.

 


 

APP06 NTM Control Commands:

 

BFD Start Of Day:

-          Use NTM Control Utility – Service Control – Book Feed Options – BFD Start Of Day.

-          Select (and highlight) desired processes and right click to see and select desired options.

 

BFD End Of Day:

-          Use NTM Control Utility – Service Control – Book Feed Options – BFD End Of Day.

-          Select (and highlight) desired processes and right click to see and select desired options.

 

BFD Set Outbound Sequence Number:

-          Use NTM Control Utility – Service Control – Book Feed Options – BFD Set Outbound Sequence Number.

-          Select (and highlight) desired processes and right click to see and select desired options.

-          User will have to enter desired sequence number.

 

BFD Send System Problem Message:

-          Use NTM Control Utility – Service Control – Book Feed Options – BFD Send System Problem Message.

-          Select (and highlight) desired processes and right click to see and select desired options.

 

BFD Send System Problem Clear Message:

-          Use NTM Control Utility – Service Control – Book Feed Options – BFD Send System Problem Clear Message.

-          Select (and highlight) desired processes and right click to see and select desired options.

 

 

 

 


 

SUBGROUP01 Troubleshooting Table:

APP36 Symptom

Impacts

Response

SUBGROUP01 SITE 2 connectivity issues

 

Evidenced by:

-          In SUBGROUP01_SUBGROUP02_to_SITE 2 stats, APP11RI processes are not connected.

-          If SITE 2 moves to DR site, APP11 processes will report messages with text “disaster” in them.

 

NOTE: APP11RI processes will try to auto-reconnect continuously until connections can be made.

 

Since SUBGROUP01 is first process in APP36 system path:

-will not be reporting quote related MESSAGING to industry.

-will not be reporting trade related MESSAGING to industry; This includes MESSAGING.

-will not be reporting MESSAGING related MESSAGING to subscribers.

 

-Trading must be halted in affected issues if problem goes on too long.

 

1)      Refer to: CHX_Cannot_Send_MESSAGES 1_To_SITE 2s Generalized Recovery Scenario.

2)      Work with SITE 2 and Tech Services to identify and resolve issues.

-          SITE 2 may ask to move APP11RI connections to Primary Alternate Servers or their DR site.  Use NTM Control SUBGROUP01 options to switch SUBGROUP01 connections. 

3)      APP11RI Services may need to be stopped/restarted.  See APP36_Recovery_Considerations

 

SUBGROUP01 NASDAQ connectivity issues

 

Evidenced by:

-          In SUBGROUP01_SUBGROUP02_to_NASDAQ stats, UQDRI processes are not connected.

-          If SITE 2 moves to DR site, SUBGROUP02 processes will report messages with text “disaster” in them.

 

NOTE: UQDRI processes will try to auto-reconnect continuously until connections can be made.

 

Since SUBGROUP01 is first process in APP36 system path:

-will not be reporting quote related MESSAGING to industry.

-will not be reporting trade related MESSAGING to industry; This includes MESSAGING.

-will not be reporting MESSAGING related MESSAGING to subscribers.

 

-Trading must be halted in affected issues if problem goes on too long.

 

1)      Refer to: CHX_Cannot_Send_MESSAGES 1_To_SITE 2s Generalized Recovery Scenario.

2)      Work with NASDAQ and Tech Services to identify and resolve issues.

3)      NASDAQ may ask to move UQDRI connections to Primary Alternate Servers or their DR site.  Use NTM Control SUBGROUP01 options to switch SUBGROUP01 connections. 

4)      NOTE: If NASDAQ moves to DR site, APP31 recovery procedures will also need to be used.   See APP31 (APP11/SUBGROUP02/APP13/SUBGROUP04) Application Specific Recoveries.

5)      UQDRI Services may need to be stopped/restarted.  See APP36_Recovery_Considerations

 


 

SUBGROUP02 Troubleshooting Table:

 

 

SUBGROUP02 SITE 2 connectivity issues

 

Evidenced by:

-          In SUBGROUP01_SUBGROUP02_to_SITE 2 stats, APP13RI processes are not connected.

-          If SITE 2 moves to DR site, APP13 processes will report messages with text “disaster” in them.

 

NOTE: APP13RI processes will try to auto-reconnect continuously until connections can be made.

 

Since SUBGROUP02 is second process in APP36 system path:

-will not be reporting trade related MESSAGING to industry; This includes MESSAGING.

-will not be reporting MESSAGING related MESSAGING to subscribers.

 

-Trading must be halted in affected issues if problem goes on too long.

 

1)      Refer to: CHX_Cannot_Send_MESSAGES 2_To_SITE 2s Generalized Recovery Scenario.

2)      Work with SITE 2 and Tech Services to identify and resolve issues.

3)      SITE 2 may ask to move APP13RI connections to Primary Alternate Servers or their DR site.  Use NTM Control SUBGROUP02 options to switch SUBGROUP02 connections. 

4)      APP13RI Services may need to be stopped/restarted.  See APP36_Recovery_Considerations

 

SUBGROUP02 NASDAQ connectivity issues

 

Evidenced by:

-          In SUBGROUP01_SUBGROUP02_to_NASDAQ stats, UTDRI processes are not connected.

-          If NASDAQ moves to DR site, SUBGROUP04 processes will report messages with text “disaster” in them.

 

NOTE: UTDRI processes will try to auto-reconnect continuously until connections can be made.

 

Since SUBGROUP02 is second process in APP36 system path:

-will not be reporting trade related MESSAGING to industry; This includes MESSAGING.

-will not be reporting MESSAGING related MESSAGING to subscribers.

 

-Trading must be halted in affected issues if problem goes on too long.

 

1)      Refer to: CHX_Cannot_Send_MESSAGES 2_To_SITE 2s Generalized Recovery Scenario.

2)      Work with NASDAQ and Tech Services to identify and resolve issues.

3)      NASDAQ may ask to move UTDRI connections to Primary Alternate Servers or their DR site.  Use NTM Control SUBGROUP02 options to switch SUBGROUP02 connections.

-          NOTE: If NASDAQ moves to DR site, APP31 recovery procedures will also need to be used.   See APP31 (APP11/SUBGROUP02/APP13/SUBGROUP04) Application Specific Recoveries.

4)      UTDRI Services may need to be stopped/restarted.  See APP36_Recovery_Considerations

 

 

               


APP06 Troubleshooting Table:

APP06 Symptom

Impacts

Response

Users report missing data in MESSAGING data.

 

May or may not be evidenced by:

-          Sequence gaps reported by APP04 processes in both EMT and MESSAGING Reader Client.

 

-MESSAGING Subscribers are missing data that they may or may not use in trading decisions.

 

1)      Use MESSAGING Reader to confirm whether or not the same data lost by user was reported by APP04 Client.

See APP05_Monitoring_Considerations

2)      Report sequence gap information to Tech Services and work with Tech Services to determine cause/resolution.

3)      Users may utilize MESSAGING Retrans processes to try and gap fill messages lost. See

 

               


 

SUBGROUP01_SUBGROUP02 Monitoring Considerations:

Stats Monitors:
SUBGROUP01 / SUBGROUP02 to SITE 2 App Queues Stats
Processes facilitate quote and last sale delivery from to SITE 2 SITE 2s. Monitor shows connection status between and SITE 2 SITE 2 as well as processing statistics.

To Start:
PROD MENU: MESSAGING Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- ConnStat
- QuoteQue
- OutMsgRate
- OutMsgs

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom:
ConnStat is not Connected.

Response:
is not connected to SITE 2.
1) Use NTM Control Utility SUBGROUP01 / SUBGROUP02 Service Controls to control connections.
2) Call Production Control if needed.

Symptom:
QuoteQue is a non-zero value and not decreasing as expected, or OutMsgRate or OutMsgs values are not reflecting changes as expected.

Response:
We may not be sending MESSAGES 1 and/or  as expected.
1) Check process log and/or data files to confirm inbound messages match outbound messages.
2) Work with SITE 2 and Technical Services if necessary.

 


 

Stats Monitors:
SUBGROUP01 / SUBGROUP02 to NASDAQ App Queues Stats
Processes facilitate quote and last sale delivery from to SITE 2 and NASDAQ SITE 2s. Monitor shows connection status between and SITE 2 / NASDAQ SITE 2s as well as processing statistics.

To Start:
PROD MENU: MESSAGING Monitoring Menu

To Exit:
Close Window

Key Indicators to Monitor:
- Color of data in columns
- soup status
- status
- is ready to send
- out rate
- total sent

Symptom:
Data is RED.

Response:
Process is either down or multicast data is not being received by monitor.
1) Check status of process.
2) If process is up, call Technical Services.

Symptom:
Soup status is not Connected, Status is not ready, or Is ready to send value is not Y.

Response:
is not connected to SITE 2.
1) Use NTM Control Utility SUBGROUP01 / SUBGROUP02 Service Controls to control connections.
2) Call Production Control if needed.

Symptom:
Out Rate or Total Sent values are not reflecting changes as expected.

Response:
We may not be sending MESSAGES 1 and/or  as expected.
1) Check process log and/or data files to confirm inbound messages match outbound messages.
2) Work with SITE 2 and Technical Services if necessary.

 

 


 

Stats Monitors:
SUBGROUP01 / SUBGROUP02 To App Queues Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate quote and last sale delivery from to SITE 2 and NASDAQ SITE 2s.

Monitor shows IPC connection status between SUBGROUP01 / SUBGROUP02 process and other applications, or 29 west connection status by topic.

PROD MENU:
MESSAGING Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Status,
- Write Queue

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Tech Services.

Not all SUBGROUP01, SUBGROUP02 or IPC connected processes are displayed as expected.

Some IPC connected services have not been started, or have not processed any messages since the monitor was started.
1) Check status of services.
2) Check relevant service log files.

IPC Connected status is not CONNECTED.

Messages cannot be sent from source to destination if IPC channel disconnected.
1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if issues.

THIRD PARTY connected status is Inactive

29 West communications has been disabled between the services involved.
1) Check status of process
2) If process is up, call Prod Support.

IPC Connected Queue size is non-zero value and not decreasing as expected.

Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; Do not stop/restart source process.
2) Notify Production Support if issues.

THIRD PARTY connected Write Queue is a non-zero value and not decreasing as expected.

29 West communications has been disabled between the services involved.
1) Check status of process
2) If process is up, call Production Support.

 

Stats Monitors:
SUBGROUP01 / SUBGROUP02 IPC Instance Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate quote and last sale delivery from to SITE 2 and NASDAQ SITE 2s.

Monitor shows IPC channel processing statistics.

PROD MENU:
MESSAGING Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Service,
- Hostname,
- Msgs In,
- Msgs Out

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Not all SUBGROUP01 / SUBGROUP02 processes are displayed as expected.

Service has not been started.
1) Check status of service.

Msgs In and/or Msgs Out are zero.

No messages have been sent/received since that monitor has been started.
1) Check SUBGROUP01 / SUBGROUP02 log files.
2) Call Production Support if necessary.

 


 

APP06 Monitoring Considerations:

Stats Monitors:
APP06 Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes receive MESSAGING data via ME->SUBGROUP01->SUBGROUP02->APP06 path, and send multicast to MESSAGING Subscribers.

Monitor shows statistics of data received, as well as instances of slow message delivery times between processes in the path and rule_603a violations.

PROD MENU:
MESSAGING Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- rule_603a_violation_cnt
- me_SUBGROUP01_over_limit
- SUBGROUP01_SUBGROUP02_over_limit

- SUBGROUP02_APP06_over_limit

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

Rule 603a violation count is > 0

Rule 603a violation has been reported

1) Notify Production Support and management.

 

 

 

Any one of, or any combination of the process “over limit” columns are greater than 0.

We may not be processing as expected.
1) Production Control receives hourly reports of this data.  They are aware of the reporting thresholds and will investigate if the data points to problems beyond those already identified.

 

Also see MESSAGING Reader.  It will report any issues specific to MESSAGING multicast delivery, at least to our APPGROUP01 Servers.

Use the following hyperlink to see this documentation: APP05_Monitoring_Considerations. 
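
The APP06 Stats symptoms above reduce to two checks on a monitor row: a non-zero rule_603a violation count, and any non-zero "over limit" column. The sketch below illustrates that mapping under the assumption that a row is available as a dictionary keyed by the column names listed under Key Indicators; the function name and the dictionary form are hypothetical.

```python
# Column names are taken from the Key Indicators list; the dict-based row is an assumption.
OVER_LIMIT_COLUMNS = (
    "me_SUBGROUP01_over_limit",
    "SUBGROUP01_SUBGROUP02_over_limit",
    "SUBGROUP02_APP06_over_limit",
)


def evaluate_app06_row(row: dict) -> list:
    """Return the playbook responses that apply to one APP06 Stats row."""
    actions = []
    if row.get("rule_603a_violation_cnt", 0) > 0:
        actions.append("Notify Production Support and management.")
    slow = [col for col in OVER_LIMIT_COLUMNS if row.get(col, 0) > 0]
    if slow:
        actions.append(
            "Over-limit counts in " + ", ".join(slow)
            + "; covered by Production Control hourly reports."
        )
    return actions
```

An empty list means neither symptom is present in that row.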


 

APP37 (MESSAGING Service)

               

APP37 Purpose:

APP37 reads order related messages from APP01, APP07, and APP26 and forwards messages to destinations specified in routing agreements or on messages.

APP37 sends order related messages to APP07, APP01 and/or PROCESSINGs depending on applied routing instructions.

 

 

Use the following hyperlinks to jump to the desired section of APP37 documentation:

APP37_Recovery_Considerations

APP37_NTM_Control_Commands

APP37_Troubleshooting_Table

APP37_Monitoring_Considerations

 

 

APP37 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP37 nodes only when moving between nodes.  (Java code and FireDaemon references are required.)

 

-          When stopping/restarting APP37 processes:

1)      Notify Operations and work in cooperation with them, as appropriate to situations.

2)      Stop/Restart the APP37 process.

 

APP37 NTM Control Commands:

Reload Rules:

-          Use NTM Control Utility – Service Control - APP37 – Reload Rules.

 

End Of Day:

-          Use NTM Control Utility – Service Control - APP37 – End of Day.

 

 

 

APP37 Troubleshooting Table:

APP37 Symptom

Impact

Response

Node Crashes

 

Evidenced by:

-          In SolarWinds (and Outlook), node and processes will be reported down.

 

Operations will not be able to query or administratively manage orders, MESSAGES 2 or MESSAGING reports via APP12.

 

Refer to:

APPGROUP14 Server

Server Specific Recoveries.

1)      Refer to:

APPGROUP14 Server

Server Specific Recoveries.

2)      Notify Management.

 

 

APP37 Monitoring Considerations:

Stats Monitors:
APP37 IPC Instance Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from Order Sending Firms that use APP26 services for special routing (to MESSAGING, SITE 2s or PROCESSINGs).

Monitor shows IPC channel processing statistics.

PROD MENU:
Trading Applications Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Service,
- Hostname,
- Msgs In,
- Msgs Out

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

Not all APP37 processes are displayed as expected.

APP37 Service has not been started.
1) Check status of service.

 

 

 

Msgs In and/or Msgs Out are zero when messages are processed.

No messages have been sent/received since that monitor has been started.
1) Check APP37 log files.
2) Call Production Support if necessary.

 

 

Stats Monitors:
APP37 to APP01 IPC Connect Queues

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from Order Sending Firms that use APP26 services for special routing (to MESSAGING, SITE 2s or PROCESSINGs).

Monitor shows IPC channel connectivity status to APP01 processes. 

PROD MENU:
Trading Applications Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Source,
- Dest,
- Status,
- Queue Size,
- Msgs Out/Sec

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

Not all APP37 or APP01 processes are displayed as expected.

APP37 or APP01 Service has not been started or hasn't processed any messages since monitor has been started.
1) Check status of service.
2) Check APP37 or APP01 log files.

 

 

 

Status is not CONNECTED.

 

Messages cannot be sent from source to destination unless IPC channel is connected.

1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if issues.

 

 

 

Queue size is non-zero value and not decreasing as expected.

 

Messages cannot be sent unless IPC channel is connected.

1) Stop/Restart destination process so as not to accidentally delete queued messages; Do not stop/restart source process.
2) Notify Production Support if issues.
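
The two IPC responses above differ in which process is restarted. The sketch below encodes that rule as a small decision function. It is illustrative only; the enum, the function name, and the boolean input are assumptions, and the actual stop/restart is still performed through the NTM Control Utility.

```python
from enum import Enum


class Symptom(Enum):
    NOT_CONNECTED = "Status is not CONNECTED"
    QUEUE_NOT_DRAINING = "Queue size is non-zero and not decreasing"


def restart_target(symptom: Symptom, peers_also_affected: bool) -> str:
    """Return which side of the IPC channel to stop/restart (hypothetical helper).

    peers_also_affected: True when other processes connecting to the same
    destination show similar issues.
    """
    if symptom is Symptom.QUEUE_NOT_DRAINING:
        # Restarting the source could delete queued messages, so never pick it here.
        return "destination"
    if symptom is Symptom.NOT_CONNECTED:
        return "destination" if peers_also_affected else "source"
    raise ValueError(f"unhandled symptom: {symptom}")
```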

Stats Monitors:
APP37 to APP07 IPC Connect Queues

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from Order Sending Firms that use APP26 services for special routing (to MESSAGING, SITE 2s or PROCESSINGs).

Monitor shows IPC channel connectivity status to APP07 processes. 

PROD MENU:
Trading Applications Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Source,
- Dest,
- Status,
- Queue Size,
- Msgs Out/Sec

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

Not all APP37 or APP07 processes are displayed as expected.

APP37 or APP07 Service has not been started or hasn't processed any messages since monitor has been started.
1) Check status of service.
2) Check APP37 or APP07 log files.

 

 

 

Status is not CONNECTED.

 

Messages cannot be sent from source to destination unless IPC channel is connected.

1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if issues.

 

 

 

Queue size is non-zero value and not decreasing as expected.

 

Messages cannot be sent unless IPC channel is connected.

1) Stop/Restart destination process so as not to accidentally delete queued messages; Do not stop/restart source process.
2) Notify Production Support if issues.

Stats Monitors:
APP37 to APP26 IPC Connect Queues

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from Order Sending Firms that use APP26 services for special routing (to MESSAGING, SITE 2s or PROCESSINGs).

Monitor shows IPC channel connectivity status to APP26 processes. 

PROD MENU:
Trading Applications Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Source,
- Dest,
- Status,
- Queue Size,
- Msgs Out/Sec

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

 

Not all APP37 or APP26 processes are displayed as expected.

APP37 or APP26 Service has not been started or hasn't processed any messages since monitor has been started.
1) Check status of service.
2) Check APP37 or APP26 log files.

 

 

 

Status is not CONNECTED.

 

Messages cannot be sent from source to destination unless IPC channel is connected.

1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if issues.

 

 

 

Queue size is non-zero value and not decreasing as expected.

 

Messages cannot be sent unless IPC channel is connected.

1) Stop/Restart destination process so as not to accidentally delete queued messages; Do not stop/restart source process.
2) Notify Production Support if issues.

Stats Monitors:
APP37 to APP32 IPC Connect Queues

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from Order Sending Firms that use APP26 services for special routing (to MESSAGING, Away Destinations, PROCESSINGs and Trade Reporting Systems).

Monitor shows IPC channel connectivity status to APP37 Bridge / PROCESSING processes. 

PROD MENU:
Trading Applications Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Source,
- Dest,
- Status,
- Queue Size,
- Msgs Out/Sec

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

Not all APP37, APP37 Bridge or PROCESSING processes are displayed as expected.

APP37, APP37 Bridge or PROCESSING Service has not been started or hasn't processed any messages since monitor has been started.
1) Check status of service.
2) Check APP37, APP37 Bridge or PROCESSING log files.

 

 

 

Status is not CONNECTED.

 

Messages cannot be sent from source to destination unless IPC channel is connected.

1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if issues.

 

 

 

Queue size is non-zero value and not decreasing as expected.

NOTE: APP37 Bridge values will always be zero.  Only the "bridge-to-ME" channels will show non-zero values.

1) Stop/Restart destination process so as not to accidentally delete queued messages; Do not stop/restart source process.
2) Notify Production Support if issues.

 

Stats Monitors:
APP37 to SUBGROUP02 IPC Connect Queues

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate communications from Order Sending Firms that use APP26 services for special routing (to MESSAGING, SITE 2s or PROCESSINGs).

Monitor shows IPC channel connectivity status to SUBGROUP02 processes. 

PROD MENU:
Trading Applications Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Source,
- Dest,
- Status,
- Queue Size,
- Msgs Out/Sec

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

 

 

Not all APP37 or SUBGROUP02 processes are displayed as expected.

APP37 or SUBGROUP02 Service has not been started or hasn't processed any messages since monitor has been started.
1) Check status of service.
2) Check APP37 or SUBGROUP02 log files.

 

 

 

Status is not CONNECTED.

 

Messages cannot be sent from source to destination unless IPC channel is connected.

1) Stop/Restart destination process if other processes connecting to the same destination are showing similar issues; otherwise, stop/restart source process.
2) Notify Production Support if issues.

 

 

 

Queue size is non-zero value and not decreasing as expected.

NOTE: Queue size will be one until more than one message is generated.  SUBGROUP02 messaging from APP37 is mostly defunct as of the last update of this documentation.

Messages cannot be sent from source to destination unless IPC channel is connected.
1) Stop/Restart destination process so as not to accidentally delete queued messages; Do not stop/restart source process.
2) Notify Production Support if issues.

 


 

APP38 (MESSAGING Service Bridge Process)

 

                APP38 Purpose:

The APP38 process reads order and trade related messages from APP37 and writes them to PROCESSINGs.

The APP38 process reads order and trade related messages from PROCESSINGs and writes them back to originator.

 

 

APP38 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP38 nodes only when moving between nodes.  (No real dependencies outside of expected processor.)

 

 

APP38 NTM Control Commands:

There are no APP38 specific NTM Control Commands.

 

APP38 Troubleshooting Table:

APP38 Symptom

Impact

Response

Node Crashes

 

Evidenced by:

-          In EMT, applications report lost communications to APP27 service.

-          In SolarWinds (and Outlook), node and processes will be reported down.

 

-IBs will not be able to send orders to PROCESSINGs, and PROCESSING responses will be stopped.

 

-APP12 trade corrections will not be able to be sent to PROCESSINGs.

 

Refer to:

APPGROUP14 Server

Server Specific Recoveries.

1)      Refer to:

APPGROUP14 Server

Server Specific Recoveries.

2)      Restart affected APP27 processes on alternate nodes.

3)      Notify Management and PTT.

 

 

APP38 Monitoring Considerations:

See ME_Monitoring_Considerations and APP37_Monitoring_Considerations

There are no APP38 specific monitors outside of EMT and related APP32 and APP37 monitors.

APP39 (MESSAGING Processes)

 

                APP39 Purpose:

APP39 processes receive order and execution drop copies from SITE 2 Firms and Vendors and send them to APP07.

 

Use the following hyperlinks to jump to the desired section of APP39 documentation:

APP39_Recovery_Considerations

APP39_NTM_Control_Commands

APP39_Troubleshooting_Table

APP39_Monitoring_Considerations

 

 

APP39 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

 

-          When moving between nodes:

o   ALTERNATE NODES are not defined for APP39 services.

o   DR NODES must be allocated un-natted nodes. 

No node can support more than one natted address at the same time.

 

-          When stopping/restarting APP39 processes:

1)      Notify the associated vendor or MESSAGING service firm and work in cooperation with them, as appropriate to situations.

2)      Notify Tech Services if moving APP39 processes to DR nodes and NAT addresses need to change to accommodate move.

3)      Stop the APP39 process.

4)      If NOT moving APP39 to a new node, skip to step 5.

If moving APP39 to a new node, copy the day’s APP39 PROCESSOR files to the alternate node:

a)       Copy the \chx\data\{APP39}\*.log, *.body, *.header, *.seqnums and *.session files created for the day to the alternate node.

b)      Copy the \chx\data\{APP39}\Global.* file created for the day to the alternate node.

c)       If the target folder does not exist on the new node, you must first either create the folder, or copy the entire folder.

d)      If these files are not moved before the process restarts on the new node, sequence numbers may fall out of step between the order sending firm involved and CHX.

5)      Start the APP39 process.

6)      Open the channel for the process affected and confirm the order sending firm connects as expected.
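
The file copy in step 4 can be sketched as a small script. This is a minimal illustration and not an approved tool: the function name is hypothetical, "created for the day" is approximated by the file's modification date (an assumption), and the source and target directories must be supplied by the operator, since the real node paths are not spelled out here.

```python
import shutil
from datetime import date, datetime
from pathlib import Path

# File patterns named in step 4 of the procedure.
PATTERNS = ("*.log", "*.body", "*.header", "*.seqnums", "*.session", "Global.*")


def copy_days_files(source_dir: Path, target_dir: Path, day: date) -> list:
    """Copy the given day's APP39 files to the alternate node's folder.

    Returns the sorted names of the files copied (hypothetical helper).
    """
    # Create the target folder first if it does not exist.
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = set()
    for pattern in PATTERNS:
        for path in source_dir.glob(pattern):
            if not path.is_file():
                continue
            # Assumption: the modification date identifies "the day's" files.
            modified = datetime.fromtimestamp(path.stat().st_mtime).date()
            if modified == day:
                shutil.copy2(path, target_dir / path.name)
                copied.add(path.name)
    return sorted(copied)
```

The copy must complete while the APP39 process is stopped and before it is started on the new node, for the sequence number reason given in step 4.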

APP39 NTM Control Commands:

Open OSF Channel:

-          Use NTM Control Utility – Service Control - APP39 – Open OSF Channel to make OSF connection possible.

 

Close OSF Channel:

-          Use NTM Control Utility – Service Control - APP39 – Close OSF Channel to make OSF connection impossible.

 

Set Inbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP39 – Set Inbound Sequence Number to set Inbound Sequence Number.

 

Set Outbound Sequence Number:

-          Use NTM Control Utility – Service Control - APP39 – Set Outbound Sequence Number to set Outbound Sequence Number.

 

Enable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP39 – Enable THIRD PARTY Stats to start collection and display of LBM related stats.

 

Disable THIRD PARTY Stats:

-          Use NTM Control Utility – Service Control - APP39 – Disable THIRD PARTY Stats to stop collection and display of LBM related stats.

 

 

APP39 Troubleshooting Table:

APP39 Symptom

Impact

Response

Firm disconnects or logs out of session

 

Evidenced by:

-          EMT message saying {firm} is disconnected and/or {firm} is logged out.

 

-          Stats monitor shows disconnected in status column.

 

{Firm} will be further identified in EMT message by including “LocalFixId” and “RemoteFixID” as configured in APP39Services.xml file within the disconnect message.

IB is no longer able to receive drop copy related messages from SITE 2 Vendor or MESSAGING Service.

1)      Contact firm.

2)      Work with firm and/or Technical Services as necessary to isolate cause of issues and resolve them.

3)      Stop/Restarts of affected application service may help resolve the issue.

 

APP39 Monitoring Considerations:

Stats Monitors:
APP39 App Connect Stats

To Start:

Key Indicators to Monitor:

Symptom:

Response:

Processes facilitate drop copy communications from SITE 2s and MESSAGING via APP39 PROCESSOR.

Monitor shows connection status between PROCESSORs and SITE 2s as well as processing statistics.

PROD MENU:
Firmswitch Monitoring Menu

To Exit:
Close Window

- Color of data in columns
- Status,
- OutMsgs,
- InMsgs

Data is RED.

Process is either down or multicast data is not being received by monitor.
1) Check status of process
2) If process is up, call Technical Services.

Status is Disconnected or Open

Firm is not connected.
1) Use NTM Control Utility APP39 Service Controls to Open Channels.
2) Call Production Control if needed.

InMsgs value is not relatively close to the OutMsgs value, or is far greater than it.

Generally, all messages (in and out) are heartbeats traded between sides with the exception of drop copies received from the firms.

We may not be processing as expected.
1) Check APP39 FIX message files to confirm inbound messages match outbound messages.
2) Work with Brokers and APP39 Firms if necessary.
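
The InMsgs/OutMsgs symptom above depends on what counts as "relatively close". The sketch below shows one way to express that comparison. The 10 percent tolerance and the function name are assumptions chosen for illustration; no threshold is defined in this playbook, and drop copies received from firms will legitimately raise InMsgs above OutMsgs.

```python
def counts_diverge(in_msgs: int, out_msgs: int, tolerance: float = 0.10) -> bool:
    """Return True when InMsgs and OutMsgs differ by more than `tolerance`
    as a fraction of the larger count (hypothetical helper)."""
    larger = max(in_msgs, out_msgs)
    if larger == 0:
        return False  # nothing sent or received yet
    return abs(in_msgs - out_msgs) / larger > tolerance
```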

 

 

 

               


 

APP40 (Transaction Reader)

 

                APP40 Purpose:

The APP40 process reads order and trade related messages from PROCESSINGs and creates a database loader file to be used in loading the data into the databases via loaders.

 

APP40 Recovery Considerations:

                                Stopping/Restart Processes:

-          Use NTM Control Utility - Service Control - Process Controller to stop/restart processes.

-          Use APP40 nodes only when moving between nodes.  (No real dependencies outside of expected processor.)

-          If moving APP40 processes to alternative servers, see ME_Recovery_Considerations.

 

APP40 NTM Control Commands:

There are no APP40 specific NTM Control Commands.

 

APP40 Troubleshooting Table:

APP40 Symptom

Impact

Response

Node Crashes

 

Evidenced by:

-          In EMT, applications report lost communications to APP40 service.

-          In SolarWinds (and Outlook), node and processes will be reported down.

 

-PROCESSING related activities will not be loaded into the database.

 

Refer to:

APPGROUP10 Server

Server Specific Recoveries.

 

1)      Refer to:

APPGROUP10 Server

Server Specific Recoveries.

2)      Restart affected APP40 processes on alternate nodes.

3)      Notify Management and PTT.

 

 

APP40 Monitoring Considerations:

See APP22_Monitoring_Considerations and DBL_Monitoring_Considerations.

There are no APP40 specific monitors outside of EMT and the related APP22 29 West and XML Database Loader monitors.

APP41 Data Stores (MESSAGING and Outbound MESSAGING)

               

Purpose:

 

APP41 = Ultra Messaging for the Enterprise.  A product of Informatica (29 West) that aims to combine fast multicast message delivery with a guarantee of message delivery persistence.  It consists of an APP41 Daemon “Listener” that runs as a service on a server (or a number of servers) and subscribes to the same 29 West topics as all APP41 message senders and receivers.  The “listeners” are not an additional hop in message delivery, but are instead an eavesdropping “store” for all messages delivered.  The “store” then acts as the place where any receiver that thinks it lost a message would go to retrieve and reprocess it.  In CHX’s implementation, there are primary APP41 stores and backup stores, to which processes auto-connect if the primary store goes down; they are NOT redundant stores.

 

has implemented the two separate APP41 Data Stores, for two separate paths of data: 

 

1)      FC (MESSAGING)

2)      APP36 (Outbound MESSAGING)

 

-          By design, there are eight APPGROUP15 Servers between both data centers:

 

·         One DC1 server acting as the primary APP41 FC Data Store, generally storing APP20, DCS, APP40 and APP28 related messages.

·         One DC1 server acting as the backup APP41 FC Data Store, generally storing APP20, DCS, APP40, and APP28 related messages.

·         One DC2 server acting as the primary APP41 FC Data Store, generally storing APP20, DCS, APP40 and APP28 related messages.

·         One DC2 server acting as the backup APP41 FC Data Store, generally storing APP20, DCS, APP40, and APP28 related messages.

 

·         One DC1 server acting as the primary NYSE APP41 APP36 Data Store, generally storing ME, SUBGROUP01, SUBGROUP02 and APP06 related messages.

·         One DC1 server acting as the primary NASD APP41 APP36 Data Store, generally storing ME, SUBGROUP01, SUBGROUP02 and APP06 related messages.

·         One DC2 server acting as the primary NYSE APP41 APP36 Data Store, generally storing ME, SUBGROUP01, SUBGROUP02 and APP06 related messages.

·         One DC2 server acting as the primary NASD APP41 APP36 Data Store, generally storing ME, SUBGROUP01, SUBGROUP02 and APP06 related messages.

 

Use the following hyperlinks to jump to the desired section of APP41 documentation:

 

UME_Recovery_Considerations

UME_Troubleshooting_Table

 

 


 

APP41 Recovery Considerations:

NOTE: APP41 Services are not configured to "move" between nodes. 

Instead, we move the "APP41 registrations" of sending/receiving applications by shutting down APP41 stores.

 

What happens when sending application goes down?:

When the sending application goes down, there is no impact to the “store”.   The “stores” simply don’t have new messages to listen for until the senders come back up.  When the sender comes back up, it resumes its connection to the “store” and the “store” continues listening.

 

 

What happens when the receiving application goes down (or thinks it lost messages)?:

When the receiving application goes down, it loses its connection to the sending applications and the “store”.  When the receiving application comes back up, it resumes its connections to the store and initiates a request for any missed messages.  The “store” delivers the messages and the receiving application reprocesses the lost messages.   The same “message request/retransmission” would also occur if the receiving application suffers a 29 West unrecoverable loss of messages.

 

 

What happens when the primary APP41 store goes down?:

When the primary APP41 store goes down, both the sending and receiving applications recognize the connections lost and automatically re-connect to the backup APP41 store.  The messages “stored” in the primary are no longer available to the receiving application, and only new messages sent from the sending applications to the backup “store” will be able to be resent (if the receiving process thinks they lost them). 

 

In the implementation, the stores are not redundant; only one store is “listening” at a time.  Because APP41 stores are not redundant, messages sent/stored by a given APP41 Service BEFORE a "re-registration" period cannot be resent AFTER it.  In other words, ONLY messages sent/stored AFTER an APP41 registration is made are available to be resent to a receiver.
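
The non-redundant store behaviour described above can be illustrated with a toy model. The classes below are hypothetical and do not represent the Informatica API or any production code; they only show why a message stored before a re-registration cannot be resent afterwards, and why restarting the primary does not pull registrations back.

```python
class Store:
    """Toy stand-in for one APP41 store (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.messages = {}  # sequence number -> payload
        self.up = True


class Registration:
    """Tracks which store a sender/receiver pair is currently registered with."""

    def __init__(self, primary, backup):
        self.backup = backup
        self.active = primary  # only the active store is "listening"

    def send(self, seq, payload):
        if not self.active.up:
            # Re-registration happens only when the current store is lost.
            self.active = self.backup
        self.active.messages[seq] = payload

    def retransmit(self, seq):
        # A receiver can only recover what the active store has recorded.
        return self.active.messages.get(seq)
```

In this model, a message sent before the primary fails is recorded only in the primary, so a retransmission request made after failover returns nothing for it.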

 

 

 

 

 


 

APP41 Troubleshooting Table:

APP41 Symptom

Impact

Response

If ONLY ONE node running APP41 FC Data Store Service is having problems:

 

Evidenced by:

-          In EMT, application specific APP41 registration errors are seen that indicate problems with one APP41 instance only involving APP20, ME, APP40, DCS and APP28 related messages.

 

Impact will be specific to applications involved.

 

The following applications are SENDERS and RECEIVERS of THIRD PARTY MESSAGING related data:

-          APP20                   to ME

-          APP37 BRIDGE      to APP32 and APP37

-          SUBGROUP02                    to APP22

-          APP40                   to APP22

 

1)      Notify Operations Management.

2)      Stop the APP41 FC service running on the node (via NTM Control Utility)

3)      Confirm through EMT and ER that all APP41 FC SENDING services previously registered re-register for backup APP41 FC service.

4)      Confirm through EMT and ER that all APP41 FC RECEIVING services previously registered re-register for backup APP41 FC service

5)      Wait for all "re-registrations" to complete and give all applications some time to function with all remaining stores.

6)      Wait for a fair amount of time to make sure all applications, including APP41 FC Stores are healthy.  Arbitrarily, 2-5 minutes.

7)      Restart the original APP41 FC service on the original node (APP41 FC services are not set up to move between nodes)

8)      Confirm that no applications re-register for it since "re-registrations" only occur when APP41 Stores are lost and not when they are brought up.
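
Steps 3 and 4 above amount to comparing the services that were registered before the store was stopped with the services seen re-registering in EMT and ER. A minimal sketch of that comparison follows; the function name is hypothetical, and how the two lists are gathered (by hand from EMT, for example) is left as an assumption.

```python
def outstanding_reregistrations(previously_registered, reregistered):
    """Return the services that have not yet re-registered with the backup
    store, sorted by name (hypothetical helper)."""
    return sorted(set(previously_registered) - set(reregistered))
```

An empty result indicates that every previously registered service has been seen re-registering, which is the condition to wait for in step 5.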

 


 

APP41 Symptom

Impact

Response

If ONLY ONE node running APP41 APP36 Data Store Service is having problems:

 

Evidenced by:

-          In EMT, application specific APP41 registration errors are seen that indicate problems with one APP41 instance only involving ME, SUBGROUP01, SUBGROUP02 and APP06 related messages.

 

 

Impact will be specific to applications involved.

 

The following applications are SENDERS and RECEIVERS of THIRD PARTY APP36 related data:

-          ME                         to SITE 2 SUBGROUP01

-          ME                         to NASDAQ SUBGROUP01

-          SITE 2 SUBGROUP01       to SITE 2 SUBGROUP02

-          NASDAQ SUBGROUP01 to NASDAQ SUBGROUP02

-          SITE 2 SUBGROUP02       to SITE 2 APP06

-          NASDAQ              to NASDAQ APP06

 

1)      Notify Operations Management.

2)      Stop the APP41 APP36 service running on the node (via NTM Control Utility)

3)      Confirm through EMT and ER that all APP41 APP36 SENDING services previously registered re-register for backup APP41 APP36 service.

4)      Confirm through EMT and ER that all APP41 APP36 RECEIVING services previously registered re-register for backup APP41 APP36 service

5)      Wait for all "re-registrations" to complete and give all applications some time to function with all remaining stores.

6)      Wait for a fair amount of time to make sure all applications, including APP41 APP36 Stores are healthy.  Arbitrarily, 2-5 minutes.

7)      Restart the original APP41 APP36 service on the original node (APP41 APP36 services are not set up to move between nodes)

8)      Confirm that no applications re-register for it since "re-registrations" only occur when APP41 Stores are lost and not when they are brought up.

 


 

APP41 Symptom

Impact

Response

If BOTH APP41 FC Data stores need to be restarted with stores cleared (as they do at “start of day”)

 

-          Evidenced by unresolvable THIRD PARTY problems involving APP20, ME, APP40, DCS and APP28 related messages.

 

Impact will be specific to applications involved.

 

The following applications are SENDERS and RECEIVERS of THIRD PARTY MESSAGING related data:

-          APP20                   to ME

-          APP37 BRIDGE      to ME

-          ME                         to SUBGROUP02, APP40, APP20, APP38

-          RTC                        to APP22, APP28

-          APP40                   to APP22

 

1)      Notify Operations Management.

2)      Halt Trading in all stocks.  

Refer to ME_NTM_Control_Commands

3)      Stop APP10 testing via NTM Control Utility to avoid APP10 testing MEs during recoveries.

4)      Stop all applications that utilize APP41 stores (via NTM Control Utility)

-          Use Production Opsmenu Shutdown Menu and APP41 developer input to confirm the list of processes involved and the following order of shutdown:

5)      Stop APP20

6)      Stop APP37 Bridge, MEs, APP40s

7)      Stop SUBGROUP02

8)      Stop APP09 (RTC)

9)      Stop APP22

10)   Stop APP28

11)   Stop both APP41 services

12)   Rename APP41 FC Store files, from Production Opsmenu Startup Menu.

13)   Purge APP41 FC cache and state files, from Production Opsmenu Startup Menu.

14)   Restart both APP41 FC services and check APP41 FC Store log files to confirm startup is “clean”.

15)   Restart all applications stopped in the order that they appear in Production Opsmenu Startup Menu.

16)   Confirm through EMT and ER that all startups occur without error.

17)   Perform System Integrity Checklist with focus on paths utilized by THIRD PARTY.
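
The ordered shutdown in steps 5 through 11 above can be sketched as a loop that stops each group in sequence and halts at the first failure, so that later groups are never stopped while an earlier one is still running. This is only an illustration in Python: the `stop` callable is a hypothetical stand-in for the operator action taken through the NTM Control Utility and does not represent a real interface to it.

```python
from typing import Callable, List

# Shutdown order as listed in steps 5-11 of the procedure above.
SHUTDOWN_ORDER: List[str] = [
    "APP20",
    "APP37 Bridge, MEs, APP40s",
    "SUBGROUP02",
    "APP09 (RTC)",
    "APP22",
    "APP28",
    "both APP41 services",
]


def stop_in_order(order: List[str], stop: Callable[[str], bool]) -> List[str]:
    """Stop each group in order; raise at the first failure.

    `stop` is a hypothetical callable that returns True once the
    named group is confirmed down.
    """
    stopped: List[str] = []
    for group in order:
        if not stop(group):
            raise RuntimeError(
                f"could not stop {group!r}; already stopped: {stopped}"
            )
        stopped.append(group)
    return stopped
```

Halting rather than continuing keeps the state of the system easy to describe when the problem is escalated: everything in the `stopped` list is down and everything after the failed group is untouched.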

 


 

 

APP41 Symptom

Impact

Response

If BOTH APP41 APP36 Data stores need to be restarted with stores cleared (as they do at “start of day”)

 

-          Evidenced by unresolvable THIRD PARTY problems involving ME, SUBGROUP01, SUBGROUP02 and APP06 related messages.

Impact will be specific to the applications involved.

 

The following applications are SENDERS and RECEIVERS of THIRD PARTY APP36 related data:

-          ME                         to SITE 2 SUBGROUP01

-          ME                         to NASDAQ SUBGROUP01

-          SITE 2 SUBGROUP01       to SITE 2 SUBGROUP02

-          NASDAQ SUBGROUP01 to NASDAQ SUBGROUP02

-          SITE 2 SUBGROUP02       to SITE 2 APP06

-          NASDAQ              to NASDAQ APP06

 

1)      Notify Operations Management.

2)      Halt Trading in all stocks.  

Refer to ME_NTM_Control_Commands

3)      Stop APP10 testing via NTM Control Utility to avoid APP10 testing MEs during recoveries.

4)      Stop all applications that utilize APP41 stores (via NTM Control Utility)

-          Use Production Opsmenu Shutdown Menu and APP41 developer input to confirm the list of processes involved and the following order of shutdown:

5)      Stop ME

6)      Stop SUBGROUP01

7)      Stop SUBGROUP02

8)      Stop APP06

9)      Stop both APP41 services

10)   Rename APP41 APP36 Store files, from Production Opsmenu Startup Menu.

11)   Purge APP41 APP36 cache and state files, from Production Opsmenu Startup Menu.

12)   Restart both APP41 APP36 services and check APP41 APP36 Store log files to confirm startup is “clean”.

13)   Restart all applications stopped in the order that they appear in Production Opsmenu Startup Menu.

14)   Confirm through EMT and ER that all startups occur without error.

15)   Perform System Integrity Checklist with focus on paths utilized by THIRD PARTY.
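
The shutdown order in steps 5 through 8 above (ME, SUBGROUP01, SUBGROUP02, APP06) happens to match the sender-to-receiver flow listed earlier in this response: each sender is stopped before the applications it feeds. The playbook does not state that the order was derived this way, so the Python sketch below is only a cross-check under that assumption. It collapses the SITE 2 and NASDAQ variants into a single application name each, and it uses the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import List, Tuple

# Sender -> receiver pairs from the list above, with the SITE 2 and
# NASDAQ variants collapsed into one name per application (assumption).
FLOWS: List[Tuple[str, str]] = [
    ("ME", "SUBGROUP01"),
    ("SUBGROUP01", "SUBGROUP02"),
    ("SUBGROUP02", "APP06"),
]


def senders_first(flows: List[Tuple[str, str]]) -> List[str]:
    """Return the applications ordered so every sender precedes its receivers."""
    sorter = TopologicalSorter()
    for sender, receiver in flows:
        # Declare the sender as a predecessor of the receiver.
        sorter.add(receiver, sender)
    return list(sorter.static_order())


print(senders_first(FLOWS))
```

If the flow list ever contains a cycle, `static_order()` raises `graphlib.CycleError`, in which case the order has to be taken from the Production Opsmenu Shutdown Menu and developer input as step 4 already directs.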