## Resilience Engineering: Stop Believing in Reliability!

**Keywords:** Resilience Engineering, Reliability, Difference

## Resilience Engineering: How to Do It Differently from Reliability

**InfoQ | By John Allspaw | Translated by Pingchuan | Planned by Tina**

**Resilience is not an upgrade to reliability, but the capacity to adapt to unknown change.** In software-dependent businesses, approaches to improving outcomes are often narrow, focusing only on reducing the number of incidents the business experiences. Behind this approach lies an assumption: incidents are temporary aberrations unrelated to “normal” work. The field of resilience engineering has long worked to turn this approach around, by understanding what makes incidents as rare and as minor as they are, and deliberately strengthening the factors that make this possible.

**Resilience refers to the activities, preparation, and investments in capability that people make so they can adapt to situations that cannot be anticipated or foreseen.** It emphasizes the potential to adapt, not adaptation itself, and refers to investments made before adaptation is needed. It is like a calm sea with currents moving beneath the surface: the preparation for a storm is already in place.

**Reliability assumes the future will be like the past, while resilience acknowledges that the world is constantly changing.** Reliability is established by testing or observing a known set of things against defined standards; it predicts specific failures that may occur in the future and puts countermeasures in place to mitigate them or reduce their impact. However, the world changes in ways that often surprise us. Resilience refers to the skills and capabilities people already have available when responding to unexpected events, as well as the system’s ability to anticipate and adapt to those events.

**Resilience is hidden in plain sight.** Nearly everything that could go wrong almost never does, yet we tend to overlook this. People doing modern work continually adapt what they are doing, and in most of what they do they manage to avoid problems. When things do “go off the rails,” they are able to adapt to those situations.

**Resilience engineering emphasizes that what people do when they solve problems matters more than what they do when they make mistakes.** Investing time and energy in understanding other teams’ goals, and the constraints they typically work under, helps teams help one another when needed.

**Building and maintaining resilience requires consciously creating conditions that allow engineers to share, discuss, and showcase their expertise.** Engineering resilience means strengthening and extending the ways people successfully handle unexpected situations, so it is critical to create opportunities for people to share, in detail, their concrete experience of handling incidents.

**The main challenge of resilience engineering is understanding what doesn’t go wrong and extending good practices.** By observing how people handle constantly changing situations in everyday work, we can uncover this “hidden” expertise and turn it into a valuable resource for building resilience.

**When building resilient systems, we should focus on the following:**

* **Cultivate adaptability:** Encourage team members to continuously learn and adapt to new situations.
* **Share experiences:** Establish a platform where team members can share their experiences and lessons learned.
* **Focus on prevention:** Invest time and energy to understand potential risks and take steps to prevent problems.
* **Value problem solving:** Encourage team members to actively solve problems and learn from mistakes.

**Resilience building is an ongoing process that requires continuous learning and improvement.** By understanding the nature of resilience and taking appropriate measures, we can build stronger, more adaptable systems to meet future challenges.

Source: https://mp.weixin.qq.com/s/5F10LuPT9yK5fMWGL5lK-g
