Observability

6 skills with this tag

wshobson
Passed
Service Mesh Observability
A reference guide for implementing observability in service mesh deployments. Provides configuration templates for distributed tracing (Jaeger), metrics collection (Prometheus), dashboards (Grafana), and mesh visualization (Kiali), along with PromQL queries and alerting rules for monitoring service-to-service communication.
Service Mesh, Observability, Istio, +3
10024.0k
wshobson
Passed
Prometheus Configuration
This skill provides comprehensive documentation for setting up Prometheus monitoring infrastructure. It covers installation via Helm and Docker, scrape configuration with static and dynamic service discovery, recording rules for pre-computed metrics, alert rules for availability and resource monitoring, and validation procedures using promtool.
Prometheus, Monitoring, Metrics, +3
10024.0k
wshobson
Passed
SLO Implementation
This skill provides a comprehensive framework for implementing Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets. It includes Prometheus recording rules, multi-window burn rate alerting configurations, and Grafana dashboard examples to help teams measure and maintain service reliability.
SRE, Observability, Prometheus, +3
10024.0k
wshobson
Passed
Grafana Dashboards
This skill provides comprehensive guidance for creating production-ready Grafana dashboards. It includes JSON configuration templates for API monitoring, infrastructure, and database dashboards, along with best practices for panel types, variables, alerts, and dashboard provisioning using Terraform or Ansible.
Grafana, Monitoring, Dashboards, +3
10024.0k
wshobson
Passed
Distributed Tracing
This skill provides comprehensive guidance for implementing distributed tracing in microservices architectures using Jaeger and Tempo. It covers Kubernetes and Docker deployments, application instrumentation with OpenTelemetry for Python, Node.js, and Go, context propagation patterns, sampling strategies, and integration with logging systems.
Distributed Tracing, Jaeger, Tempo, +3
10024.0k
fcakyon
Passed
Gcloud Usage
This skill provides guidance on Google Cloud Platform observability, including structured logging best practices, Cloud Logging query syntax, alert policy design, and cost optimization strategies. It helps developers debug production issues and implement effective monitoring on GCP.
GCP, Cloud Logging, Observability, +3
50281