KMac's Blog: Microsoft Communications Server “14”: Monitoring and Reporting

So you may be asking: why is Kevin posting blogs for every session now? Truth is, I took notes in every session, but now that I am actually taking the notes in Live Writer, it is a one-button publish and the formatting happens close to real time (when speakers are bullshitting). Very nice, Microsoft. Now, back to the show.

CS 14 Health Monitoring Goals (Jared Zhang):

Accurate Alerts

Filter out transient conditions to reduce noise
Distinguish alerts based on the impact to the system
Track the current state of alerts (active or resolved)

Actionable alerts

Cause and recommended actions
Relevant information to identify and isolate problems
Guidance for troubleshooting

CS 14 Health Monitoring

Health monitoring for CS 14

Service Monitoring

End-to-end verification of availability of CS services

Component monitoring

Monitoring components running on individual CS servers

Voice Quality Monitoring

Monitoring end-user-call reliability and media quality experience

CS 14 Management Pack (MP) for SCOM 2007 R2

Monitoring and alerting on services, components, and voice quality
Central discovery of monitored objects from CS 14 Central Management Store (CMS)

Service monitoring with Synthetic Transactions

Synthetic Transactions (ST’s)

End-to-end scenario view
PowerShell cmdlets starting with the Test verb

Examples:

Test-CsIM
Test-CsPresence
Test-CsPstnOutboundCall

Run with configured test accounts or real credentials
Provide a success/failure response

SCOM Alerting

Core set of ST’s are run periodically to verify service availability
ST failures result in high priority alerts
Alerts are auto-resolved if ST’s succeed in the next run

For example, running the IM synthetic transaction from PowerShell:

C:\> Test-CsIM -TargetFqdn myocs.domain.com
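The ST alerting behavior in these notes (a failed run raises a high priority alert, and a successful later run auto-resolves it) can be sketched in a few lines. This is only a model of the described behavior, not how the SCOM management pack is implemented; the class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StAlertTracker:
    """Hypothetical tracker holding at most one alert per synthetic transaction."""
    active: dict = field(default_factory=dict)  # ST name -> alert priority

    def record_run(self, st_name: str, succeeded: bool) -> str:
        """Apply the result of one periodic ST run and return the action taken."""
        if not succeeded:
            if st_name in self.active:
                return "still-active"        # alert already open, nothing new to raise
            self.active[st_name] = "high"    # ST failures raise high priority alerts
            return "raised"
        if st_name in self.active:
            del self.active[st_name]         # a later success auto-resolves the alert
            return "auto-resolved"
        return "healthy"


tracker = StAlertTracker()
for ok in (True, False, False, True):
    print(tracker.record_run("Test-CsIM", ok))
```

Keeping the state keyed by ST name means each transaction resolves independently, so a failing Test-CsPstnOutboundCall would not mask a recovered Test-CsIM.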

Component Monitoring

Health modeling for CS14 components

Key health indicator (KHI) and non-KHI’s

Events and performance counters are categorized as service impacting aspects (KHI’s) and non-service impacting aspects (non-KHI’s)
KHI indicates a service impacting condition

SCOM Alerting

KHI’s result in medium priority alerts
KHI alerts are auto-resolved if the component returns to healthy
Non-KHI’s result in informational alerts that need manual resolution.
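The KHI and non-KHI split above boils down to a small lookup: the KHI flag decides both the alert priority and whether the alert clears itself. A minimal sketch of that mapping follows; the type and function names and the sample indicators are hypothetical, and only the priorities and resolution rules come from the session.

```python
from typing import NamedTuple


class AlertPolicy(NamedTuple):
    priority: str        # alert priority as named in the notes
    auto_resolve: bool   # True if the alert clears when the component is healthy again


def alert_policy(is_khi: bool) -> AlertPolicy:
    """Map an indicator's KHI flag to the alert handling described in the notes."""
    if is_khi:
        return AlertPolicy(priority="medium", auto_resolve=True)
    return AlertPolicy(priority="informational", auto_resolve=False)


# Hypothetical indicators, named only for illustration.
indicators = {"front-end service stopped": True, "slow configuration refresh": False}
for name, is_khi in indicators.items():
    policy = alert_policy(is_khi)
    print(f"{name}: {policy.priority}, auto-resolve={policy.auto_resolve}")
```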

Call Reliability Monitoring

Call reliability data are stored as Call Detail Records (CDR) data
Failures are classified as Expected and Unexpected, based on the ms-diagnostic ID.

Example: 52031 indicates media connectivity failure

SCOM Alerting

Categories for call reliability alerting:

Peer-to-peer audio/video calls
Audio/video conference calls

Alerts are raised for higher than expected failure rates
Each alert contains a CDR report link for troubleshooting
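To make the call reliability logic concrete, here is a rough sketch of computing an unexpected-failure rate per category and flagging the categories above a threshold. The session did not give the list of expected IDs or the thresholds, so everything here is an assumption: the function names, the ID 90001, the 20% threshold, and the treatment of 52031 as unexpected are for illustration only.

```python
from collections import defaultdict


def unexpected_failure_rates(calls, expected_ids):
    """calls: iterable of (category, ms_diagnostic_id or None for a successful call)."""
    totals = defaultdict(int)
    unexpected = defaultdict(int)
    for category, diag_id in calls:
        totals[category] += 1
        # Only failures whose ID is NOT in the expected set count against the rate.
        if diag_id is not None and diag_id not in expected_ids:
            unexpected[category] += 1
    return {category: unexpected[category] / totals[category] for category in totals}


def categories_to_alert(rates, threshold):
    """Return the categories whose unexpected-failure rate exceeds the threshold."""
    return sorted(c for c, rate in rates.items() if rate > threshold)


# Illustrative data: 90001 is a made-up "expected" ID; 52031 is assumed unexpected.
calls = [
    ("peer-to-peer", None), ("peer-to-peer", 52031),
    ("peer-to-peer", 90001), ("peer-to-peer", None),
    ("conference", None), ("conference", None),
]
rates = unexpected_failure_rates(calls, expected_ids={90001})
print(rates)
print(categories_to_alert(rates, threshold=0.2))
```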

Media Quality Monitoring

Media Quality data are stored as Quality of Experience (QoE) data
Calls are classified as good or poor quality. Categories for media quality alerting:

A/V Conferencing Servers, Mediation Servers, Gateways
Network locations (subnets, sites, regions)

Alerts are raised for higher than expected poor quality call rates
Each alert contains a QoE report link for troubleshooting
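The per-server and per-location grouping above can be sketched as a poor-call rate keyed by any one dimension of a QoE record. This is a guess at the shape of the calculation, not the product's actual query: the record fields (`subnet`, `poor`), the function name, and the `min_calls` cutoff (added so a single bad call on a quiet subnet does not dominate) are all assumptions.

```python
from collections import Counter


def poor_call_rates(calls, key, min_calls=1):
    """Group QoE-style records by `key` and return the poor-call rate per group."""
    total = Counter()
    poor = Counter()
    for call in calls:
        total[call[key]] += 1
        if call["poor"]:
            poor[call[key]] += 1
    # Skip groups with too few calls to give a meaningful rate.
    return {group: poor[group] / n for group, n in total.items() if n >= min_calls}


# Illustrative records only; real QoE data carries far more fields.
calls = [
    {"subnet": "10.0.1.0/24", "poor": True},
    {"subnet": "10.0.1.0/24", "poor": False},
    {"subnet": "10.0.2.0/24", "poor": False},
]
print(poor_call_rates(calls, key="subnet", min_calls=2))
```

The same function works for servers or gateways by passing a different `key`, as long as the records carry that field.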

The bottom line for this section is that there are really thorough monitoring and synthetic transaction cmdlets built into PowerShell (Test-Cs*), and you can tie these into SCOM.

Health Monitoring for CS14 is a must for success – Antwan, build good health monitoring into our CS14 deployment from the ground up.

Reporting CS14 with the Monitoring Server Role - Arish Alreja

Improvements for CS14 Monitoring Server Role

Call Detail Record (CDR) data collection

Improved diagnostics information for all modalities in CS14
Registration diagnostics data
IP Phone Device data

Quality of Experience (QoE) data collection

Richer Endpoint Data (OS, MAC address, CPU)
Richer Audio Metrics (User facing diagnostics, audio healer metrics)
Coverage on Media Bypass, Mediation Server – Multiple Gateways

Reporting Improvements

For ROI Analysis and Asset Management

Usage reports for visibility into deployment activity
IP Phone HW and SW versions

For Operational monitoring and diagnostics

Dashboard delivers a view into any call reliability/media quality issues
Call Reliability reports for monitoring and troubleshooting

For Helpdesk admins helping end users

User Activity Report

Reports can be configured for periodic email delivery
Reports are accessible from the CS Control Panel (CSCP)

Arish then moved directly into a demonstration of the reporting server and the CS Control Panel. It was very impressive – this picture does not do it justice:

I look forward to seeing this in Beta back at Vanderbilt!

Wednesday, June 9, 2010