Final Exam: Chaos Engineer

Overview/Description

Final Exam: Chaos Engineer will test your knowledge and application of the topics presented throughout the Chaos Engineer track of the Skillsoft Aspire Network Admin to Site Reliability Engineer Journey.

Expected Duration (hours)
0.0

Lesson Objectives

describe challenges for maintaining data integrity

describe CRON and how to use it for scheduling jobs

describe CRON jobs and their components

describe deterministic and non-deterministic algorithms and how they relate to distributed systems

describe frontend load balancing and the importance of using it to increase performance

describe how engineers think differently to "novices" when it comes to troubleshooting

describe how load balancing must take virtualization, the cloud, and containers into consideration

describe how loads can be balanced using External HTTPS Load Balancing

describe how loads can be balanced using SSL Proxy Load Balancing

describe how loads can be balanced using TCP Proxy Load Balancing

describe how server overloads can lead to cascading failure

describe load balancing techniques and algorithms

describe operational loads and how they relate to optimal performance

describe steps to ensure efficient queue management

describe the benefits of client-side throttling

describe the characteristics and purpose of blackbox monitoring

describe the characteristics and purpose of whitebox monitoring

describe the CRON syntax and provide syntax examples

describe the importance of incident response training

describe the mean time between failures metric

describe the mean time to respond metric

describe the system models that can be used with distributed systems

describe when to use acceptance testing

differentiate between idempotent and two-phase mutations

differentiate between the various pipeline features

discuss software testing at scale

identify the system models that can be used with distributed systems

list characteristics of machine learning (ML) applications

list CPU considerations as they relate to failures and overutilization

list data integrity requirements

list potential pitfalls to avoid, such as looking for symptoms that are not relevant

list the main roles in incident response (Incident Commander, Communications Lead, Operations Lead)

outline an idealized troubleshooting model (e.g., report, triage, examine, diagnose, test/treat, and cure)

outline best practices and approaches to troubleshooting and how to keep those skills sharp

outline the benefits of using tickets

outline the process and purpose of logging and name the benefits of text logs

provide a general overview of the six steps involved in developing a plan

provide an overview of a typical development lifecycle

provide an overview of backup and recovery methods

provide an overview of business continuity and describe why business continuity planning matters

provide an overview of data integrity

provide an overview of Google Workflow

provide an overview of key principles SREs need to be familiar with for emergency response and recognize key steps to take when a system breaks

provide an overview of pages

provide an overview of resource exhaustion

provide an overview of the checkpointing technique

provide an overview of the maturity matrix

provide an overview of the mean time to failure metric

provide an overview of the production readiness review process

recognize aspects of the SRE engagement model

recognize best practices for handling unmanaged incidents

recognize how to create and maintain pipeline documentation

recognize how to create a test and build environment

recognize how to develop a launch checklist

recognize how to identify cascading failures

recognize key factors in ensuring business continuity

recognize the importance of encouraging proactive testing

recognize the importance of incident response planning

recognize the importance of testing SRE-developed tools

recognize the pitfalls of the queries per second metric

Course Number
it_fesre_03_enus

Expertise Level
Intermediate