- Act of collecting & analyzing data to determine the performance, health, and availability of applications and their depended resources.
- 💡📝 Understanding the operation of applications and alerts helps fixing problems before they occur.
- Application Monitoring: • Application Insights
- Deep Infrastructure Monitoring: • Log Analytics • Management Solutions • Network Monitoring • Service Map
- Core Monitoring: • Azure Monitor • Advisor • Service Health • Activity Log
- Shared Capabilities: • Alerts • Dashboards • Metrics Explorer
- Use Pricing Calculator for a specific resource.
- Review estimated costs when creating a resource
- See spent costs through Subscription blade
- You can task resources Cost analysis with e.g. costCenter: marketing to filter in cost analysis view.
- 💡 Recommended to tag & export.
- You can see your bill in "Invoices", opt-in for PDF invoices & download (more options available in Azure Account Center)
- External services billed separately
- You can task resources Cost analysis with e.g. costCenter: marketing to filter in cost analysis view.
- Get notifications through Azure Advisor or Azure Cost Management for anomalies and overspending risks.
- PaaS (Platform as a Service)
- Pipeline for metric log data coming from any Azure resource provider.
- Fastest telemetry pipeline: Faster than Log Analytics.
- Collected data is saved in Log Analytics to analyze, monitor, visualize metrics and query and analyze logs.
- Visualize: • Dashboard • Views • Power BI • Workbooks
- Analyze: • Metrics Explorer • Log Analytics
- Respond: • Alerts • Autoscale
- Integrate: • Event Hubs • Logic Apps • Ingest & Export APIs
- You can use alerts to proactively notify you based on rules with Azure Alerts.
- You can access through
- Azure Insight & Analytics (OMS portal that's now legacy)
- On Azure portal there's an existing resource called Azure Monitor.
- Rest API, PowerShell & CLI
- Two levels of logs: resource (metrics and platform (activity log, resource log).
- All data types can be queried in Portal, PowerShell, Rest API or CLI.
- On resource level
- All numerical values including
- All Application Insights data / telemetry
- System health from VMs (uses HyperV metrics without any configuration)
- Can be visualized & queried
- 30 days free to store & query
- E.g. requests and errors, average response time, infrastructure metrics
- On platform level.
- "Outside operations" that are subscription-level events.
- Previously known as "Audit Logs" or "Operational Logs".
- Determine "what, who and when" for any write operation taken on the resources.
- 90 days free to store, fore more send to Log Analytics ("connect" button in Activity Logs blade of Log Analytic), archive to storage account (click on Export to Event Hubs in Azure Monitor) or stream to other services (click on Export to Event Hubs in Azure Monitor).
- During export you create a Log Profile you control how/where/what is exported for how long.
- Administrative
- Every API call to Azure Resource Manager
- E.g. create virtual machine or delete network security group
- Service health
- Any service health incidents occurred in Azure
- Come in 5 varieties: • Action Required • Assisted Recovery • Incident • Maintenance • Information • Security.
- E.g. SQL Azure in East US is experiencing downtime.
- Resource health
- Status of resources: • Available • Unavailable • Degraded • Unknown
- E.g. Virtual Machine health status changed to unavailable
- Alert
- All activations of Azure alerts.
- E.g. CPU % on a VM has been over 80 for the past 5 minutes
- Autoscale
- Events in autoscale engine defined by your settings.
- E.g. Scale up action failed
- Recommendation
- How to better utilize your resources.
- Security
- Events by Azure Security Center
- E.g. Suspicious double extension file executed.
- Policy
- All effect action operations performed by Azure Policy.
- Types include audit and deny
- 💡 Security monitoring should be set-up for Administrative, Security, Policy and ServiceHealth
- On platform level.
- "Inside operations" within the resource.
- Resource-specific with common scheme where resources include their own properties.
- E.g. IIS logs, web server logging, failed request tracking.
- Can be streamed (to Power BI, Azure Functions etc.), added in blob storage, and sent directly to Log Analytics.
- 🤗 Formerly known as diagnostic logs 1
- Diagnostic setting is resource deployed to send logs different destinations.
- Each Azure resource requires its own diagnostic setting
- A diagnostic setting consists of
- Sources (depends on resource type)
- Destinations (destinations to send)
- Logs can be sent to resources across subscriptions 4
- Using Azure Lighthouse, it can be sent across tenants 4
- Resources that can be sent include:
- A storage account (
storageAccountId
1) - Log analytics workspace (
workspaceId
)- Metrics are converted to forms and sent to Azure Monitor Logs
- Helps to query, visualize, set-up alerts and integrate with Azure Sentinel
- Azure marketplace resource (
marketplacePartnerId
)- Also known as partner integrations 3
- E.g. Datadog, Elastic, Logz.io.
- Event Hub (
eventHubName
)
- A storage account (
- All alert creation for metrics, logs and activity log across Azure Monitor, Log Analytics, and Application Insights
- Separation of operational and configuration views:
- Alert Rules: Definition of the condition that triggers an alert
- Fired Alerts: An instance of the alert rule firing
- Flow of alerts
- Set-up alert rule
- Target resource (e.g. storage account)
- Signal:
- Types are Metric, Activity log, Application Insights and Log
- You can have multi-dimensional metrics & monitor multiple metrics with a single rule (currently up to two)
- Signal:
- Criteria
- Logic test: e.g. six-hour period when capacity is over 10 MB
- Target resource (e.g. storage account)
- Action group (= actions to do)
- Grouping of different actions to take when the alert is triggered
- Each action has name & action type e.g. email/sms/push/voice/webhook/automation runbook
- ❗ Applied rate limiting:
- SMS: No more than 1 SMS every 5 minutes.
- Voice: No more than 1 Voice call every 5 minutes.
- Email: No more than 100 emails in an hour.
- Other actions are not rate limited.
- ❗ Applied rate limiting:
- Set alert rule name, description, severity (can be informational, warning, error, critical)
- Set-up alert rule
- Log alerts
- Defined by Log Query (by Log Analytics), Time period, frequency, threshold.
- Number of results alert rules always creates a single alert, while Metric measurement alert rule creates an alert for each object that exceeds the threshold
- Uses telemetry & application configurations to give personalize recommendations and guidance for
- high availability, security, performance, cost effectiveness (monitors unused resources and spent).
- Common resource, free for all users
- You can download, filter, postpone, dismiss recommendations.
- You can customize by excluding subscription/resource groups, configuring utilization rules (e.g. you can as a subscription owner set CPU to lower threshold)
- PaaS (Platform as a Service)
- It's also referred as (newer names)
- Azure Monitor Logs
- Azure Monitor log data is stored in Log Analytics
- The term is changed from "Log Analytics" to "Azure Monitor Logs" (see)
- Log Search
- "Log Search" interface in Azure Monitor (Monitor -> Logs)
- Or Log Analytics Workspace -> Logs
- Azure Monitor Logs
- Two main features
- Log Analytics Workspace where Azure Monitor stores its data
- Log search feature; collect, correlate, search, and act on data with any schema.
- Business value:
- Assessing updates: From logs can guess average patching time
- Change tracking: Abnormal behavior from a specific account by tracking changes throughout the environment
- Free to store logs for 90 days of charge
- You can export to Excel, PowerBI, or use API to send data or get.
- Log analytics uses push-model i.e. data is pushed to it.
- Data can be pushed from integrated sources including
- Connected Azure sources e.g. IIS logs, custom text logs with custom fields, error level etc.
- Office 365, Azure Automation, Back-ups
- ❗️ You cannot change schema on ingestion time for integrated resources, only on query time.
- Data can also be pushed Linux & Windows systems with an agent
- Agents include
- Log analytics agent
- Operations Manager
- Allows using existing investments with SCOM (System Center Operations Manager)
- SCOM agents communicate with SCOM Server over TLS 1.2 which forward events and performance data to Log Analytics.
- You can install agents using a script via Azure Automation DSC (desired state configuration)
- Agents are already installed in cloud Windows VMs
- Agents include
- You can also use Log Analytics REST APIs to send custom data with any schema you want
- Log analytics applies if in same subscription, or in same Azure Active Directory tenant.
- So single Log Analytics workspace can monitor across under same tenant.
- To collect data across subscriptions and tenants:
- In customer subscription
- Activity log (export button) => Event Hub
- In Service provider subscription
- Logic App (When events are available in Event Hub -> Parse JSON (Body) -> Compose (Select Body in inputs) -> Send Data (Azure Log Analytics Data Collector))=> Log Analytics
- In customer subscription
- Azure Monitor Logs support query across multiple Log Analytics workspaces across subscriptions
- E.g.
app("/subscriptions/b459b4f6-912x-46d5-9cb1-b43069212ab4/resourcegroups/Fabrikam/providers/microsoft.insights/components/fabrikamapp").requests | count
- Read more on Microsoft Docs
- E.g.
- Each data source has documentation (description & name) of its properties.
- 📝 Main query tables: Heartbeat, Perf, Usage, Event, Syslog, Alert.
- You can connect to Activity Logs in Activity Logs section and query with
AzureActivity
- A pipe-through language to query log analytics data
- E.g.:
Event | where (EventLevelName == "Error") | where (TimeGenerated > ago(1days)) | summarize ErrorCount = count() by Computer | top 10 by ErrorCount desc
- Can generate charts
| render timechart
that can be pinned to dashboard.