## Resilience Engineering: A New Approach Distinct from Reliability

**Keywords:** Resilience Engineering, Reliability, Distinction

**InfoQ Author | John Allspaw Translator | Pingchuan (平川) Planner | Tina**

As software plays an increasingly critical role in modern society, attention to system reliability has intensified. However, pursuing reliability alone, while neglecting a system’s ability to adapt to unpredictable events, leaves the system brittle when something unexpected happens.

Resilience engineering emerged to address the shortcomings of reliability in handling unexpected events. It emphasizes the system’s ability to adapt to unforeseen circumstances rather than merely striving for stable operation under normal conditions.

**Resilience engineering requires consciously creating conditions that allow engineers to share, discuss, and showcase their expertise.**

The core of resilience engineering lies in enhancing and expanding the ways people successfully handle unexpected situations. It is therefore crucial to create opportunities for people to share, in detail, how they handled specific incidents.

**A primary challenge in resilience engineering is understanding what doesn’t go wrong and scaling good practices.**

We tend to notice what happens during an incident and overlook the state of things when no incident occurs. Investing time and effort to understand other teams’ objectives and the constraints they usually work under helps teams assist each other when needed.

**Resilience engineering emphasizes that people’s actions when solving problems are more noteworthy than their mistakes.**

In software-dependent enterprises, approaches to improving outcomes are often narrow, focusing solely on reducing the number of incidents they experience. This approach rests on an implicit assumption: incidents are temporary aberrations unrelated to “normal” work.

**Resilience engineering aims to reverse this approach by understanding what makes incidents so rare and intentionally strengthening the factors that contribute to this rarity.**

**Resilience is not an upgrade to reliability**

Reliability assumes the future will resemble the past, while the real world is constantly changing. Resilience refers to the skills and capabilities people already have available for responding to unexpected events, along with the system’s ability to anticipate and adapt to such events.

**Resilience is hidden in plain sight**

Experts accumulate vast experience in their daily work as they handle constantly changing situations. This experience is “hidden” because they address these challenges so skillfully that the work appears effortless.

**Building resilience requires addressing the following aspects:**

* **Establish and maintain resilience by consciously creating conditions that allow engineers to share, discuss, and showcase their expertise.**
* **Understand what doesn’t go wrong and scale good practices.**
* **Focus on people’s actions when solving problems, not their mistakes.**
* **Understand what makes incidents so rare and intentionally strengthen the factors that contribute to this rarity.**

The concept of resilience engineering will help us build systems that are more adaptable to change, enabling them to remain stable and efficient in the face of various challenges.

Source: https://mp.weixin.qq.com/s/5F10LuPT9yK5fMWGL5lK-g
