So you may be asking: why is Kevin posting a blog for every session now? Truth is, I took notes in every session all along, but now that I am taking them in Live Writer, publishing is a single button and the formatting happens close to real time (while the speakers are bullshitting). Very nice, Microsoft. Now, back to the show.
CS 14 Health Monitoring Goals (Jared Zhang):
- Accurate Alerts
- Filter out transient conditions to reduce noise
- Distinguish alerts based on the impact to the system
- Track the current state of alerts (active or resolved)
- Actionable alerts
- Cause and recommended actions
- Relevant information to identify and isolate problems
- Guidance for troubleshooting
CS 14 Health Monitoring
- Health monitoring for CS 14
- Service Monitoring
- End-to-end verification of availability of CS services
- Component monitoring
- Monitoring components running on individual CS servers
- Voice Quality Monitoring
- Monitoring end-user-call reliability and media quality experience
- CS 14 MP for SCOM 2007 R2
- Monitoring and alerting on services, components, and voice quality
- Central discovery of monitored objects from CS 14 Central Management Store (CMS)
Service monitoring with Synthetic Transactions
- Synthetic Transactions (ST’s)
- End-to-end scenario view
- PowerShell cmdlets starting with the Test verb
- Examples:
- Test-CsIM
- Test-CsPresence
- Test-CsPstnOutboundCall
- Run with configured test accounts or real credentials
- Provide a success/failure response
- SCOM Alerting
- Core set of ST’s are run periodically to verify service availability
- ST failures result in high priority alerts
- Alerts are auto-resolved if ST’s succeed in the next run
For example, running the IM synthetic transaction from PowerShell:
C:\> Test-CsIM -TargetFqdn myocs.domain.com
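The alerting cycle described above (a failed ST raises a high priority alert, and a later success auto-resolves it) can be sketched in a few lines of Python. This is only a rough model under my own assumptions, not how the SCOM management pack is implemented; the class, method, and state names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SyntheticTransactionMonitor:
    """Tracks one alert per synthetic transaction (hypothetical model, not SCOM's API)."""
    active_alerts: dict = field(default_factory=dict)

    def record_run(self, st_name: str, succeeded: bool) -> str:
        """Apply the result of one periodic run and return the resulting state."""
        if not succeeded:
            # A failed run raises (or keeps) a high priority alert.
            self.active_alerts[st_name] = "high"
            return "active"
        # A successful run auto-resolves any alert left by an earlier failure.
        had_alert = self.active_alerts.pop(st_name, None) is not None
        return "resolved" if had_alert else "healthy"


monitor = SyntheticTransactionMonitor()
states = [monitor.record_run("Test-CsIM", ok) for ok in (True, False, False, True)]
print(states)  # ['healthy', 'active', 'active', 'resolved']
```

Keeping the alert keyed by ST name means repeated failures update one alert rather than piling up duplicates, which matches the "track the current state of alerts" goal from earlier in the session.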
Component Monitoring
- Health modeling for CS14 components
- Key health indicator (KHI) and non-KHI’s
- Events and performance counters are categorized as service impacting aspects (KHI’s) and non-service impacting aspects (non-KHI’s)
- KHI indicates a service impacting condition
- SCOM Alerting
- KHI’s result in medium priority alerts
- KHI alerts are auto-resolved if the component returns to healthy
- Non-KHI’s result in informational alerts that need manual resolution.
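The KHI versus non-KHI split boils down to a small lookup. Here is a minimal Python sketch of it, assuming nothing beyond what the slide said; the function and type names are made up for illustration and do not come from the management pack.

```python
from typing import NamedTuple


class AlertPolicy(NamedTuple):
    priority: str
    auto_resolve: bool


def alert_policy(is_khi: bool) -> AlertPolicy:
    """Map a health indicator to the alert handling described in the session."""
    if is_khi:
        # KHI: service impacting, medium priority, clears once the component is healthy.
        return AlertPolicy(priority="medium", auto_resolve=True)
    # Non-KHI: informational, stays open until someone resolves it manually.
    return AlertPolicy(priority="informational", auto_resolve=False)


print(alert_policy(True))
print(alert_policy(False))
```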
Call Reliability Monitoring
- Call reliability data are stored as Call Detail Records (CDR) data
- Failures are classified as Expected and Unexpected, based on the ms-diagnostic ID.
- Example: 52031 indicates media connectivity failure
- SCOM Alerting
- Categories for call reliability alerting:
- Peer-to-peer audio/video calls
- Audio/video conference calls
- Alerts are raised for higher than expected failure rates
- Each alert contains a CDR report link for troubleshooting
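As a rough illustration of "higher than expected failure rates", the Python sketch below counts unexpected failures per call category and flags the categories over a threshold. The threshold value, the choice to treat 52031 as an unexpected failure, and all names are my assumptions; the session did not give the real classification table or defaults.

```python
from collections import Counter

# Assumption: the notes only say 52031 means media connectivity failure;
# classifying it as "unexpected" here is purely for illustration.
UNEXPECTED_DIAGNOSTIC_IDS = {52031}
EXPECTED_FAILURE_RATE = 0.05  # assumed threshold, not a documented default


def unexpected_failure_rates(calls):
    """calls: iterable of (category, ms_diagnostic_id), with None meaning success."""
    totals, failures = Counter(), Counter()
    for category, diag_id in calls:
        totals[category] += 1
        if diag_id in UNEXPECTED_DIAGNOSTIC_IDS:
            failures[category] += 1
    return {cat: failures[cat] / totals[cat] for cat in totals}


def categories_to_alert(calls, threshold=EXPECTED_FAILURE_RATE):
    """Return the call categories whose unexpected failure rate exceeds the threshold."""
    rates = unexpected_failure_rates(calls)
    return sorted(cat for cat, rate in rates.items() if rate > threshold)


sample = [("p2p", None)] * 9 + [("p2p", 52031)] + [("conference", None)] * 10
print(categories_to_alert(sample))  # ['p2p']
```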
Media Quality Monitoring
- Media Quality data are stored as Quality of Experience (QoE) data
- Calls are classified as good or poor quality; categories for media quality alerting:
- A/V Conferencing Servers, Mediation Servers, Gateways
- Network locations (subnets, sites, regions)
- Alerts are raised for higher than expected poor quality call rates
- Each alert contains a QoE report link for troubleshooting
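Since poor quality call rates are tracked per network location, the same calls roll up at the subnet, site, and region level. The Python sketch below shows one way that roll-up could work. The topology table, the location names, and the function are all hypothetical, and real QoE data carries far more than a good/poor flag.

```python
from collections import defaultdict

# Hypothetical topology: subnet -> (site, region). All names are made up.
TOPOLOGY = {
    "10.1.0.0": ("Nashville", "NA"),
    "10.2.0.0": ("Nashville", "NA"),
    "10.9.0.0": ("London", "EMEA"),
}


def poor_call_rates(calls):
    """calls: iterable of (subnet, is_poor). Returns the rate per (level, name)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [poor calls, total calls]
    for subnet, is_poor in calls:
        site, region = TOPOLOGY[subnet]
        # Count each call once at every level of the location hierarchy.
        for key in (("subnet", subnet), ("site", site), ("region", region)):
            counts[key][0] += int(is_poor)
            counts[key][1] += 1
    return {key: poor / total for key, (poor, total) in counts.items()}


calls = [
    ("10.1.0.0", True),
    ("10.1.0.0", False),
    ("10.2.0.0", False),
    ("10.9.0.0", False),
]
rates = poor_call_rates(calls)
for key in sorted(rates):
    print(key, round(rates[key], 2))
```

Rolling up at several levels helps tell a single bad subnet apart from a site-wide or region-wide problem.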
The bottom line for this section is that there are really thorough monitoring and synthetic transaction cmdlets built into PowerShell (Test-Cs*), and you can tie these into SCOM.
Health Monitoring for CS14 is a must for success – Antwan, build good health monitoring into our CS14 deployment from the ground up.
Reporting CS14 with the Monitoring Server Role - Arish Alreja
Improvements for CS14 Monitoring Server Role
- Call Detail Record (CDR) data collection
- Improved diagnostics information for all modalities in CS14
- Registration diagnostics data
- IP Phone Device data
- Quality of Experience (QoE) data collection
- Richer Endpoint Data (OS, MAC address, CPU)
- Richer Audio Metrics (User facing diagnostics, audio healer metrics)
- Coverage on Media Bypass, Mediation Server – Multiple Gateways
- Reporting Improvements
- For ROI Analysis and Asset Management
- Usage reports for visibility into deployment activity
- IP Phone HW and SW versions
- For Operational monitoring and diagnostics
- Dashboard delivers a view into any call reliability/media quality issues
- Call Reliability reports for monitoring and troubleshooting
- For Helpdesk admins helping end users
- User Activity Report
- Reports can be configured for periodic email delivery
- Reports are accessible from the CS Control Panel (CSCP)
Arish then moved directly into a demonstration of the reporting server and the CS Control Panel. It was very impressive – this picture does not do it justice:
I look forward to seeing this in Beta back at Vanderbilt!
