Network Working Group                                          A. Farrel
Request for Comments: 3612                            Old Dog Consulting
Category: Informational                                   September 2003
        

Applicability Statement for Restart Mechanisms for the Label Distribution Protocol (LDP)

Status of this Memo

This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

This document provides guidance on when it is advisable to implement some form of Label Distribution Protocol (LDP) restart mechanism and which approach might be more suitable. The issues and extensions described in this document are equally applicable to RFC 3212, "Constraint-Based LSP Setup Using LDP".

1. Introduction

Multiprotocol Label Switching (MPLS) systems are used in core networks where system downtime must be kept to a minimum. Similarly, where MPLS is at the network edges (e.g., in Provider Edge (PE) routers) [RFC2547], system downtime must also be kept to a minimum. Many MPLS Label Switching Routers (LSRs) may, therefore, exploit Fault Tolerant (FT) hardware or software to provide high availability of the core networks.

The details of how FT is achieved for the various components of an FT LSR, including the switching hardware and the TCP stack, are implementation specific. How the software module itself chooses to implement FT for the state created by the LDP is also implementation specific. However, there are several issues in the LDP specification [RFC3036] that make it difficult to implement an FT LSR using the LDP protocols without some extensions to those protocols.

Proposals have been made in [RFC3478] and [RFC3479] to address these issues.

2. Requirements of an LDP FT System

Many MPLS LSRs may exploit FT hardware or software to provide high availability (HA) of core networks. In order to provide HA, an MPLS system needs to be able to survive a variety of faults with minimal disruption to the Data Plane, including the following fault types:

- failure/hot-swap of the switching fabric in an LSR,

- failure/hot-swap of a physical connection between LSRs,

- failure of the TCP or LDP stack in an LSR,

- software upgrade to the TCP or LDP stacks in an LSR.

The first two examples of faults listed above may be confined to the Data Plane. Such faults can be handled by providing redundancy in the Data Plane which is transparent to LDP operating in the Control Plane. However, the failure of the switching fabric or a physical link may have repercussions in the Control Plane since signaling may be disrupted.

The third example may be caused by a variety of events including processor or other hardware failure, and software failure.

Any of the last three examples may impact the Control Plane and will require action in the Control Plane to recover. Such action should be designed to avoid disrupting traffic in the Data Plane. Since many recent router architectures can separate the Control and Data Planes, it is possible that forwarding can continue unaffected by recovery action in the Control Plane.

In other scenarios, the Data and Control Planes may be impacted by a fault, but the needs of HA require the coordinated recovery of the Data and Control Planes to a state that existed before the fault.

The provision of protection paths for MPLS LSPs and the protection of links, IP routes or tunnels through the use of protection LSPs are outside the scope of this document. See [RFC3469] for further information.

3. General Considerations

In order for the Data and Control Plane states to be successfully recovered after a fault, procedures are required to ensure that the state held on a pair of LDP peers (at least one of which was affected directly by the fault) is synchronized. Such procedures must be implemented in the Control Plane software modules on the peers using Control Plane protocols.

The required actions may operate fully after the failure (reactive recovery) or may contain elements that operate before the fault in order to minimize the actions taken after the fault (proactive recovery). It is rare to implement actions that operate solely in advance of the failure and do not require any further processing after the failure (preventive recovery) - this is because of the dynamic nature of signaling protocols and the unpredictability of fault timing.

Reactive recovery actions may include full re-signaling of state, re-synchronization of state between peers, and synchronization based on checkpointing.

Proactive recovery actions may include hand-shaking state transitions and checkpointing.

4. Specific Issues with the LDP Protocol

LDP uses TCP to provide reliable connections between LSRs to exchange protocol messages to distribute labels and to set up LSPs. A pair of LSRs that have such a connection are referred to as LDP peers.

TCP enables LDP to assume reliable transfer of protocol messages. This means that some of the messages do not need to be acknowledged (e.g., Label Release).

LDP is defined such that if the TCP connection fails, the LSR should immediately tear down the LSPs associated with the session between the LDP peers, and release any labels and resources assigned to those LSPs.

It is notoriously difficult to provide a Fault Tolerant implementation of TCP. To do so might involve making copies of all data sent and received. This is an issue familiar to implementers of other TCP applications, such as BGP.

During failover affecting the TCP or LDP stacks, therefore, the TCP connection may be lost. Recovery from this position is made worse by the fact that LDP control messages may have been lost during the connection failure. Since these messages are unconfirmed, it is possible that LSP or label state information will be lost.

At the very least, the solution to this problem must include a change to the basic requirements of LDP so that the failure of an LDP session does not require that associated LDP or forwarding state be torn down.

Any changes made to LDP in support of recovery processing must meet the following requirements:

- offer backward-compatibility with LSRs that do not implement the extensions to LDP,

- preserve existing protocol rules described in [RFC3036] for handling unexpected duplicate messages and for processing unexpected messages referring to unknown LSPs/labels.

Ideally, any solution applicable to LDP should be equally applicable to CR-LDP.

5. Summary of the Features of LDP FT

LDP Fault Tolerance extensions are described in [RFC3479]. This approach involves:

- negotiation between LDP peers of the intent to support extensions to LDP that facilitate recovery from failover without loss of LSPs,

- selection of FT survival on a per LSP/label basis or for all labels on a session,

- sequence numbering of LDP messages to facilitate acknowledgement and checkpointing,

- acknowledgement of LDP messages to ensure that a full handshake is performed on those messages either frequently (such as per message) or less frequently as in checkpointing,

- solicitation of up-to-date acknowledgement (checkpointing) of previous LDP messages to ensure the current state is secured, with an additional option that allows an LDP partner to request that state is flushed in both directions if graceful shutdown is required,

- a timer to control how long LDP and forwarding state should be retained after an LDP session failure before being discarded if LDP communications are not re-established,

- exchange of checkpointing information on LDP session recovery to establish what state has been retained by recovering LDP peers,

- re-issuing lost messages after failover to ensure that LSP/label state is correctly recovered after reconnection of the LDP session.

The FT procedures in [RFC3479] concentrate on the preservation of label state for labels exchanged between a pair of adjacent LSRs when the TCP connection between those LSRs is lost. There is no intention within these procedures to support end-to-end protection for LSPs.
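
The sequence numbering, acknowledgement, and re-issue features listed above can be illustrated with a minimal sketch. It is not the message encoding or state machine of [RFC3479]; the class FtSender, its method names, and the assumption that an acknowledgement is cumulative (it secures every message up to and including the acknowledged sequence number) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FtSender:
    """Hypothetical sender side of an FT-protected LDP session."""
    next_seq: int = 1
    unacked: dict = field(default_factory=dict)  # seq -> message awaiting ack

    def send(self, message: str) -> int:
        """Assign the next sequence number and remember the message."""
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = message
        return seq

    def on_ack(self, acked_seq: int) -> None:
        """Assumed cumulative ack: everything up to acked_seq is secured."""
        for seq in [s for s in self.unacked if s <= acked_seq]:
            del self.unacked[seq]

    def on_reconnect(self, peer_last_secured: int) -> list:
        """After failover, return the messages the peer never secured."""
        self.on_ack(peer_last_secured)
        return [self.unacked[s] for s in sorted(self.unacked)]


sender = FtSender()
for msg in ["map A", "map B", "withdraw A"]:
    sender.send(msg)
sender.on_ack(1)                      # checkpoint covers "map A" only
to_reissue = sender.on_reconnect(2)   # peer reports it secured up to 2
print(to_reissue)
```

Acknowledging every message keeps the unacknowledged set small at the cost of extra signaling; acknowledging only at checkpoints reverses that trade-off.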

6. Summary of the Features of LDP Graceful Restart

LDP graceful restart extensions are defined in [RFC3478]. This approach involves:

- negotiation between LDP peers of the intent to support extensions to LDP that facilitate recovery from failover without loss of LSPs,

- a mechanism whereby an LSR that restarts can relearn LDP state by resynchronization with its peers,

- use of the same mechanism to allow LSRs recovering from an LDP session failure to resynchronize LDP state with their peers provided that at least one of the LSRs has retained state across the failure or has itself resynchronized state with its peers,

- a timer to control how long LDP and forwarding state should be retained after an LDP session failure before being discarded if LDP communications are not re-established,

- a timer to control the length of the period within which resynchronization between adjacent peers should be completed.

The procedures in [RFC3478] are applicable to all LSRs, both those with the ability to preserve forwarding state during LDP restart and those without. LSRs that cannot preserve their MPLS forwarding state across the LDP restart would impact MPLS traffic during restart. However, by implementing a subset of the mechanisms in [RFC3478], they can minimize the impact if their neighbors are capable of preserving their forwarding state across the restart of their LDP sessions or control planes by implementing the mechanism in [RFC3478].
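
One way to picture the retain-then-resynchronize behavior summarized in this section is the sketch below. It is an illustration under stated assumptions, not the procedure of [RFC3478]: the names Binding and PeerBindings, the "stale" flag, and the single timer-expiry callback are hypothetical and stand in for whatever bookkeeping an implementation actually uses.

```python
from dataclasses import dataclass


@dataclass
class Binding:
    fec: str
    label: int
    stale: bool = False


class PeerBindings:
    """Hypothetical store of label bindings learned from one LDP peer."""

    def __init__(self):
        self.bindings = {}

    def learn(self, fec, label):
        # A (re)advertised binding is current, so any stale mark is cleared.
        self.bindings[fec] = Binding(fec, label)

    def on_session_failure(self):
        # Keep forwarding on the old labels, but mark them for later audit.
        for binding in self.bindings.values():
            binding.stale = True

    def on_resync_timer_expiry(self):
        # Resynchronization period over: discard whatever was not refreshed.
        self.bindings = {f: b for f, b in self.bindings.items() if not b.stale}


peer = PeerBindings()
peer.learn("10.0.0.0/8", 100)
peer.learn("192.0.2.0/24", 101)
peer.on_session_failure()
peer.learn("10.0.0.0/8", 100)      # peer re-advertises only one binding
peer.on_resync_timer_expiry()
remaining = sorted(peer.bindings)
print(remaining)
```

In this sketch forwarding is never interrupted for the refreshed binding, while the binding that the peer did not re-advertise is released only after the resynchronization period ends.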

7. Applicability Considerations

This section considers the applicability of fault tolerance schemes within LDP networks and considers issues that might lead to the choice of one method or another. Many of the points raised below should be viewed as implementation issues rather than specific drawbacks of either solution.

7.1. General Applicability

The procedures described in [RFC3478] and [RFC3479] are intended to cover two distinct scenarios. In Session Failure, the LDP peers at the ends of a session remain active, but the session fails and is restarted. Note that session failure does not imply failure of the data channel even when using an in-band control channel. In Node Failure, the session fails because one of the peers has been restarted (or at least, the LDP component of the node has been restarted). These two scenarios have different implications for the ease of retention of LDP state within an individual LSR, and are described in sections below.

These techniques are only applicable in LDP networks where at least one LSR has the capability to retain LDP signaling state and the associated forwarding state across LDP session failure and recovery. In [RFC3478], the LSRs retaining state do not need to be adjacent to the failed LSR or session.

If traffic is not to be impacted, both LSRs at the ends of an LDP session must at least preserve forwarding state. Preserving LDP state is not a requirement to preserve traffic.

[RFC3479] requires that the LSRs at both ends of the session implement the procedures that it describes. Thus, either traffic is preserved and recovery resynchronizes state, or no traffic is preserved and the LSP fails.

Further, to use the procedures of [RFC3479] to recover state on a session, both LSRs must have a mechanism for maintaining some session state and a way of auditing the forwarding state and the resynchronized control state.

[RFC3478] is scoped to support preservation of traffic if both LSRs implement the procedures that it describes. Additionally, it functions if only one LSR on the failed session supports retention of forwarding state and implements the mechanisms in the document. In this case, traffic will be impacted by the session failure, but the forwarding state will be recovered on session recovery. Further, in the event of simultaneous failures, [RFC3478] is capable of relearning and redistributing state across multiple LSRs by combining its mechanisms with the usual LDP message exchanges of [RFC3036].
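
The outcomes described in this section for a single failed session can be condensed into a small decision function. This is one reading of the text above, simplified to two flags per LSR; the names Lsr, implements, keeps_forwarding, and session_outcome are hypothetical, and the function ignores the multi-LSR relearning that [RFC3478] also permits.

```python
from dataclasses import dataclass


@dataclass
class Lsr:
    implements: bool          # supports the chosen restart mechanism
    keeps_forwarding: bool    # retains forwarding state across the failure


def session_outcome(mechanism: str, a: Lsr, b: Lsr) -> str:
    """Summarize section 7.1 for one failed session (illustrative only)."""
    capable = [lsr.implements and lsr.keeps_forwarding for lsr in (a, b)]
    if all(capable):
        return "traffic preserved, state resynchronized"
    if mechanism == "RFC3478" and any(capable):
        return "traffic impacted, forwarding state recovered"
    return "LSP fails"


full = Lsr(implements=True, keeps_forwarding=True)
legacy = Lsr(implements=False, keeps_forwarding=False)
outcomes = {m: session_outcome(m, full, legacy) for m in ("RFC3478", "RFC3479")}
print(outcomes)
```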

7.2. Session Failure

In Session Failure, an LDP session between two peers fails and is restarted. There is no restart of the LSRs at either end of the session and LDP continues to function on those nodes.

In these cases, it is simple for LDP implementations to retain the LDP state associated with the failed session and to associate the state with the new session when it is established. Housekeeping may be applied to determine that the failed session is not returning and to release the old LDP state. Both [RFC3478] and [RFC3479] handle this case.

Applicability of [RFC3478] and [RFC3479] to the Session Failure scenario should be considered with respect to the availability of the data plane.

In some cases the failure of the LDP session may be independent of any failure of the physical (or virtual) link(s) between adjacent peers; for example, it might represent a failure of the TCP/IP stack. In these cases, the data plane is not impacted and both [RFC3478] and [RFC3479] are applicable to preserve or restore LDP state.

LDP signaling may also operate out of band; that is, it may use different links from the data plane. In this case, a failure of the LDP session may be a result of a failure of the control channel, but there is no implied failure of the data plane. For this scenario [RFC3478] and [RFC3479] are both applicable to preserve or restore LDP state.

In the case where the failure of the LDP session also implies the failure of the data plane, it may be an implementation decision whether LDP peers retain forwarding state, and for how long. In such situations, if forwarding state is retained, and if the LDP session is re-established, both [RFC3478] and [RFC3479] are applicable to preserve or restore LDP state.

When the data plane has been disrupted, an objective of a recovery implementation might be to restore data traffic as quickly as possible.

7.3. Controlled Session Failure

In some circumstances, the LSRs may know in advance that an LDP session is going to fail (e.g., perhaps a link is going to be taken out of service).

[RFC3036] includes provision for controlled shutdown of a session. [RFC3478] and [RFC3479] allow resynchronization of LDP state upon re-establishment of the session.

[RFC3479] offers the facility both to checkpoint all LDP state before the shut-down and to quiesce the session so that no new state changes are attempted between the checkpoint and the shut-down. This means that on recovery, resynchronization is simple and fast.

[RFC3478] resynchronizes all state on recovery regardless of the nature of the shut-down.

7.4. Node Failure

Node Failure describes events where a whole node is restarted or where the component responsible for LDP signaling is restarted. Such an event will be perceived by the LSR's peers as session failure, but the restarting node sees the restart as full re-initialization.

The basic requirement is that the forwarding state is retained, otherwise the data plane will necessarily be interrupted. If forwarding state is not retained, it may be relearned from the saved control state in [RFC3479]. [RFC3478] does not utilize or expect a saved control state. If a node restarts without preserved forwarding state it informs its neighbors, which immediately delete all label-FEC bindings previously received from the restarted node.

The ways to retain a forwarding and control state are numerous and implementation specific. It is not the purpose of this document to espouse one mechanism or another, nor even to suggest how this might be done. If state has been preserved across the restart, synchronization with peers can be carried out as though recovering from Session Failure as in the previous section. Both [RFC3478] and [RFC3479] support this case.

How much control state is retained is largely an implementation choice, but [RFC3479] requires that at least a small amount of per-session control state be retained. [RFC3478] does not require or expect control state to be retained.

It is also possible that the restarting LSR has not preserved any state. In this case, [RFC3479] is of no help. [RFC3478], however, allows the restarting LSR to relearn state from each adjacent peer through the processes for resynchronizing after Session Failure. Further, in the event of simultaneous failure of multiple adjacent nodes, the nodes at the edge of the failure zone can recover state from their active neighbors and distribute it to the other recovering LSRs without any failed LSR having to have saved state.

7.5. Controlled Node Failure

In some cases (hardware repair, software upgrade, etc.), node failure may be predictable. In these cases, all sessions with peers may be shut down and existing state retention may be enhanced by special actions.

[RFC3479] checkpointing and quiesce may be applied to all sessions so that state is up-to-date.

As above, [RFC3478] does not require that state is retained by the restarting node, but can utilize it if it is.

7.6. Speed of Recovery

Speed of recovery is impacted by the amount of signaling required.

If forwarding state is preserved on both LSRs on the failed session, then the recovery time is constrained by the time to resynchronize the state between the two LSRs.

[RFC3479] may resynchronize very quickly. In a stable network, this resolves to a handshake of a checkpoint. At the most, resynchronization involves this handshake plus an exchange of messages to handle state changes since the checkpoint was taken. Implementations that support only the periodic checkpointing subset of [RFC3479] are more likely to have additional state to resynchronize.

[RFC3478] must resynchronize state for all label mappings that have been retained. At the same time, resources that have been retained by a restarting upstream LSR but are not actually required, because they have been released by the downstream LSR (perhaps because it was in the process of releasing the state), must be held for the full resynchronization time to ensure that they are not needed.

The impact of recovery time will vary according to the use of the network. Both [RFC3478] and [RFC3479] allow advertisement of new labels while resynchronization is in progress. Issues to consider are re-availability of falsely retained resources and conflict between retained label mappings and newly advertised ones. This may cause incorrect forwarding of data: since labels are advertised from downstream, an LSR upstream of a failure may continue to forward data for one FEC on an old label while the recovering downstream LSR might re-assign that label to another FEC and advertise it. For this reason, restarting LSRs may choose not to advertise new labels until resynchronization with their peers has completed, or may decide to use special techniques to cover the short period of overlap between resynchronization and new LSP setup.
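
As an illustration of the label re-assignment hazard, the sketch below shows one possible "special technique": a downstream allocator that withholds every retained label from re-use until resynchronization has finished. Neither [RFC3478] nor [RFC3479] prescribes this; the class LabelAllocator, its attributes, and the label range are assumptions made for the example.

```python
class LabelAllocator:
    """Hypothetical downstream label allocator on a restarting LSR."""

    def __init__(self, first, last, retained):
        # Labels retained across the restart may still be in use upstream,
        # so they are held back rather than placed on the free list.
        self.free = [l for l in range(first, last + 1) if l not in retained]
        self.held = set(retained)

    def allocate(self):
        # Only labels that were never retained can be handed out safely.
        return self.free.pop(0)

    def resync_complete(self, still_in_use):
        # Retained labels that no peer reclaimed become available again.
        released = sorted(self.held - set(still_in_use))
        self.free.extend(released)
        self.held &= set(still_in_use)


alloc = LabelAllocator(16, 19, retained={16, 17})
first_new = alloc.allocate()              # cannot be 16 or 17
alloc.resync_complete(still_in_use={16})
available = list(alloc.free)
print(first_new, available)
```

The cost of this approach is that falsely retained labels stay unavailable for the whole resynchronization period, which is the re-availability issue noted above.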

7.7. Scalability

Scalability is largely the same issue as speed of recovery and is governed by the number of LSPs managed through the failed session(s).

Note that there are limits to how small the resynchronization time in [RFC3478] may be made given the capabilities of the LSRs, the throughput on the link between them, and the number of labels that must be resynchronized.
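
A rough way to reason about this limit is to take the slower of two costs: transferring the label messages over the link and processing them on the LSR. The function below is a back-of-the-envelope sketch only; the parameter names and every number in the example are illustrative assumptions, not measurements or values taken from [RFC3478].

```python
def min_resync_seconds(labels, bytes_per_mapping, link_bits_per_second,
                       mappings_per_second):
    """Rough lower bound: the slower of link transfer and LSR processing."""
    transfer = labels * bytes_per_mapping * 8 / link_bits_per_second
    processing = labels / mappings_per_second
    return max(transfer, processing)


# Assumed figures: 100,000 labels, 40 bytes per mapping message,
# a 10 Mbit/s control channel, 20,000 mappings processed per second.
bound = min_resync_seconds(labels=100_000, bytes_per_mapping=40,
                           link_bits_per_second=10_000_000,
                           mappings_per_second=20_000)
print(bound)
```

A resynchronization timer configured below such a bound would expire before all retained labels could be refreshed.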

Impact on normal operation should also be considered.

[RFC3479] requires acknowledgement of all messages. These acknowledgements may be deferred, as for the checkpointing described in section 4, or may be frequent. Although acknowledgements can be piggy-backed on other state messages, an option for frequent acknowledgement is to send a message solely for the purpose of acknowledging a state change message. Such an implementation would clearly be unwise in a busy network.

[RFC3478] has no impact on normal operations.

7.8. Rate of Change of LDP State

Some networks do not show a high degree of change over time, such as those using targeted LDP sessions; others change the LDP forwarding state frequently, perhaps reacting to changes in routing information on LDP discovery sessions.

Rate of change of LDP state exchanged over an LDP session depends on the application for which the LDP session is being used. LDP sessions used for exchanging <FEC, label> bindings for establishing hop-by-hop LSPs will typically exchange state reacting to IGP changes. Such exchanges could be frequent. On the other hand, LDP sessions established for exchanging MPLS Layer 2 VPN FECs will typically exhibit a smaller rate of state exchange.

In [RFC3479], two options exist. The first uses a frequent (up to per-message) acknowledgement system which is most likely to be applicable in a more dynamic system where it is desirable to preserve the maximum amount of state over a failure to reduce the level of resynchronization required and to speed the recovery time.

The second option in [RFC3479] uses a less-frequent acknowledgement scheme known as checkpointing. This is particularly suitable to networks where changes are infrequent or bursty.

[RFC3478] resynchronizes all state on recovery regardless of the rate of change of the network before the failure. This consideration is thus not relevant to the choice of [RFC3478].

7.9. Label Distribution Modes

Both [RFC3478] and [RFC3479] are suitable for use with Downstream Unsolicited label distribution.

[RFC3478] describes Downstream-On-Demand as an area for future study and is therefore not applicable for a network in which this label distribution mode is used. It is possible that future examination of this issue will reveal that once a label has been distributed in either distribution mode, it can be redistributed by [RFC3478] upon session recovery.

[RFC3479] is suitable for use in a network that uses Downstream-On-Demand label distribution.
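
The suitability statements in this section can be captured as a small lookup. The table reflects only what this document says at the time of writing; the names SUITABLE and mechanisms_for, and the mode strings, are hypothetical.

```python
SUITABLE = {
    "downstream-unsolicited": {"RFC3478", "RFC3479"},
    # [RFC3478] leaves Downstream-On-Demand as an area for future study.
    "downstream-on-demand": {"RFC3479"},
}


def mechanisms_for(modes_in_use):
    """Return the mechanisms suitable for every mode used in the network."""
    result = {"RFC3478", "RFC3479"}
    for mode in modes_in_use:
        result &= SUITABLE[mode]
    return sorted(result)


unsolicited_only = mechanisms_for(["downstream-unsolicited"])
mixed = mechanisms_for(["downstream-unsolicited", "downstream-on-demand"])
print(unsolicited_only, mixed)
```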

In theory, and according to [RFC3036], even in networks configured to utilize Downstream Unsolicited label distribution, there may be occasions when the use of Downstream-On-Demand distribution is desirable. The use of the Label Request message is not prohibited in a Downstream Unsolicited label distribution LDP network.

Opinion varies as to whether there is a practical requirement for the use of the Label Request message in a Downstream Unsolicited label distribution LDP network. Current deployment experience suggests that there is no requirement.

7.10. Implementation Complexity

Implementation complexity has consequences for the implementer and also for the deployer since complex software is more error prone and harder to manage.

[RFC3479] is a more complex solution than [RFC3478]. In particular, [RFC3478] does not require any modification to the normal signaling and processing of LDP state changing messages.

[RFC3479] implementations may be simplified by implementing only the checkpointing subset of the functionality.

7.11. Implementation Robustness

In addition to the implication for robustness associated with complexity of the solutions, consideration should be given to the effects of state preservation on robustness.

If state has become incorrect for whatever reason, then state preservation may retain incorrect state. In extreme cases, it may be that the incorrect state is the cause of the failure in which case preserving that state would be inappropriate.

When state is preserved, the precise amount that is retained is an implementation issue. The basic requirement is that forwarding state is retained (to preserve the data path) and that that state can be accessed by the LDP software component.

In both solutions, if the forwarding state is incorrect and is retained, it will continue to be incorrect. Both solutions have a mechanism to housekeep and free the unwanted state after resynchronization is complete. [RFC3478] may be better at eradicating incorrect forwarding state, because it replays all message exchanges that caused the state to be populated.

In [RFC3478], no more data than the forwarding state needs to have been saved by the recovering node. All LDP state may be relearned by message exchanges with peers. Whether those exchanges may cause the same incorrect state to arise on the recovering node is an obvious concern.

In [RFC3479], the forwarding state must be supplemented by a small amount of state specific to the protocol extensions. LDP state may be retained directly or reconstructed from the forwarding state. The same issues apply when reconstructing state but are mitigated by the fact that this is likely a different code path. Errors in the retained state specific to the protocol extensions will persist.

在[RFC3479]中,转发状态必须由协议扩展特定的少量状态补充。LDP状态可以直接保留或从转发状态重构。重构状态时也会遇到同样的问题,但由于这可能是一条不同的代码路径,问题得到了缓解。特定于协议扩展的保留状态中的错误将持续存在。

7.12. Interoperability and Backward Compatibility
7.12. 互操作性和向后兼容性

It is important that new additions to LDP interoperate with existing implementations, at least to the extent of providing the existing levels of function.

LDP的新增功能必须与现有实现互操作,至少在提供现有功能级别方面是如此。

Both [RFC3478] and [RFC3479] do this through rules for handling the absence of the FT optional negotiation object during session initialization.

[RFC3478]和[RFC3479]都是通过处理会话初始化期间缺少FT可选协商对象的规则来实现的。

Additionally, [RFC3478] is able to perform limited recovery (i.e., redistribution of state) even when only one of the participating LSRs supports the procedures. This may offer considerable advantages in interoperation with legacy implementations.

此外,[RFC3478]能够执行有限恢复(即状态重新分配),即使只有一个参与的LSR支持该过程。这可能在与遗留实现的互操作方面提供相当大的优势。
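
The fallback rule can be sketched as follows. This is a simplified illustration that assumes the only input is whether each side included the optional FT negotiation object during session initialization. The names SessionMode and negotiate are hypothetical. The sketch does not model the limited one-sided recovery that [RFC3478] additionally allows.

```python
from enum import Enum


class SessionMode(Enum):
    PLAIN_LDP = "plain LDP, no restart procedures"
    RESTART_CAPABLE = "restart procedures negotiated"


def negotiate(local_offers_ft: bool, peer_offers_ft: bool) -> SessionMode:
    """Choose session behavior from the presence of the FT object.

    Each flag records whether that side included the optional FT
    negotiation object in its initialization message.
    """
    if local_offers_ft and peer_offers_ft:
        return SessionMode.RESTART_CAPABLE
    # If either side omitted the object, the session behaves as ordinary
    # LDP, so a legacy peer sees nothing it does not understand.
    return SessionMode.PLAIN_LDP


for local, peer in [(True, True), (True, False), (False, True)]:
    print(local, peer, negotiate(local, peer).name)
```

Treating a missing object as "not supported", rather than as an error, is what lets an extended LSR keep providing the existing level of function to a legacy peer.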

7.13. Interaction With Other Label Distribution Mechanisms
7.13. 与其他标签分发机制的相互作用

Many LDP LSRs also run other label distribution mechanisms. These include management interfaces for configuration of static label mappings, other distinct instances of LDP, and other label distribution protocols. The last example includes traffic engineering label distribution protocols that are used to construct tunnels through which LDP LSPs are established.

许多LDP LSR还运行其他标签分发机制。这些包括用于配置静态标签映射的管理接口、LDP的其他不同实例以及其他标签分发协议。最后一个示例包括流量工程标签分发协议,该协议用于构建建立LDP LSP的隧道。

As with re-use of individual labels by LDP within a restarting LDP system, care must be taken to prevent labels that need to be retained by a restarting LDP session or protocol component from being used by another label distribution mechanism. This might compromise data security, amongst other things.

与LDP在重新启动的LDP系统内重复使用单个标签一样,必须注意防止需要由重新启动的LDP会话或协议组件保留的标签被其他标签分发机制使用。除其他事项外,这可能会危及数据安全。

It is a matter for implementations to avoid this issue through the use of techniques such as a common label management component or segmented label spaces.

实现需要通过使用诸如通用标签管理组件或分段标签空间等技术来避免此问题。
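
One way to picture a common label management component is a single allocator that records which mechanism holds each label. The sketch below assumes that the allocator itself survives the restart of the LDP component, so labels LDP must retain stay reserved until LDP releases them. The class LabelManager and the owner names are hypothetical, and the small label range is chosen only for illustration.

```python
from typing import Dict, Optional, Set


class LabelManager:
    """Single owner-aware allocator shared by all label distribution users."""

    def __init__(self, first: int, last: int) -> None:
        self._free: Set[int] = set(range(first, last + 1))
        self._owner: Dict[int, str] = {}  # label -> mechanism holding it

    def allocate(self, owner: str) -> int:
        if not self._free:
            raise RuntimeError("label space exhausted")
        label = min(self._free)
        self._free.remove(label)
        self._owner[label] = owner
        return label

    def release(self, owner: str, label: int) -> None:
        # Only the mechanism that holds a label may free it.
        if self._owner.get(label) != owner:
            raise ValueError(f"label {label} is not held by {owner}")
        del self._owner[label]
        self._free.add(label)

    def owner_of(self, label: int) -> Optional[str]:
        return self._owner.get(label)


labels = LabelManager(first=16, last=31)
a = labels.allocate("ldp")
b = labels.allocate("ldp")

# The LDP component restarts here. Nothing is released, so both labels
# remain reserved while the forwarding state is retained.
c = labels.allocate("static")
print(c not in (a, b))      # another mechanism cannot be handed a retained label

labels.release("ldp", b)    # housekeeping after resynchronization
d = labels.allocate("static")
print(d == b)               # the label is reused only after LDP has freed it
```

Segmented label spaces achieve the same separation statically, by giving each mechanism a disjoint range, at the cost of less flexible use of the label space.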

7.14. Applicability to CR-LDP
7.14. CR-LDP的适用性

CR-LDP [RFC3212] utilizes Downstream-On-Demand label distribution. [RFC3478] leaves Downstream-On-Demand as an area for future study and is therefore not applicable to CR-LDP. [RFC3479] is suitable for use in a network entirely based on CR-LDP or in one that is mixed between LDP and CR-LDP.

CR-LDP[RFC3212]利用下游按需标签分发。[RFC3478]将下游随需应变描述为未来研究的领域,因此不适用于CR-LDP。[RFC3479]适用于完全基于CR-LDP的网络或LDP与CR-LDP混合的网络。

8. Security Considerations
8. 安全考虑

This document is informational and introduces no new security concerns.

本文档仅供参考,没有引入新的安全问题。

The security considerations pertaining to the original LDP protocol [RFC3036] remain relevant.

与原始LDP协议[RFC3036]相关的安全注意事项仍然相关。

[RFC3478] introduces the possibility of additional denial-of-service attacks. All of these attacks may be countered by use of an authentication scheme between LDP peers, such as the MD5-based scheme outlined in [RFC3036].

[RFC3478]引入了其他拒绝服务攻击的可能性。所有这些攻击都可以通过在LDP对等方之间使用身份验证方案来反击,例如[RFC3036]中概述的基于MD5的方案。

In MPLS, a data mis-delivery security issue can arise if an LSR continues to use labels after expiration of the session that first caused them to be used. Both [RFC3478] and [RFC3479] are open to this issue.

在MPLS中,如果LSR在第一次使用标签的会话过期后继续使用标签,则可能会出现数据错误传递安全问题。[RFC3478]和[RFC3479]都对此问题持开放态度。

9. Intellectual Property Statement
9. 知识产权声明

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.

IETF对可能声称与本文件所述技术的实施或使用有关的任何知识产权或其他权利的有效性或范围,或此类权利下的任何许可可能或可能不可用的程度,不采取任何立场;它也不表示它已作出任何努力来确定任何此类权利。有关IETF在标准跟踪和标准相关文件中权利的程序信息,请参见BCP-11。可从IETF秘书处获得可供发布的权利声明副本和任何许可证保证,或本规范实施者或用户试图获得使用此类专有权利的一般许可证或许可的结果。

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

IETF邀请任何相关方提请其注意任何版权、专利或专利申请,或其他可能涉及实施本标准所需技术的专有权利。请将信息发送给IETF执行董事。

10. References
10. 工具书类
10.1. Normative References
10.1. 规范性引用文件

[RFC2026] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

[RFC2026]Bradner,S.,“互联网标准过程——第3版”,BCP 9,RFC 2026,1996年10月。

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2119]Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,1997年3月。

[RFC3036] Andersson, L., Doolan, P., Feldman, N., Fredette, A. and B. Thomas, "LDP Specification", RFC 3036, January 2001.

[RFC3036]Andersson,L.,Doolan,P.,Feldman,N.,Fredette,A.和B.Thomas,“LDP规范”,RFC 3036,2001年1月。

[RFC3478] Leelanivas, M., Rekhter, Y. and R. Aggarwal, "Graceful Restart Mechanism for LDP", RFC 3478, February 2003.

[RFC3478]Leelanivas,M.,Rekhter,Y.和R.Aggarwal,“LDP的优雅重启机制”,RFC 3478,2003年2月。

[RFC3479] Farrel, A., Editor, "Fault Tolerance for the Label Distribution Protocol (LDP)", RFC 3479, February 2003.

[RFC3479]Farrel,A.,编辑,“标签分发协议(LDP)的容错”,RFC 3479,2003年2月。

10.2. Informative References
10.2. 资料性引用

[RFC2547] Rosen, E. and Y. Rekhter, "BGP/MPLS VPNs", RFC 2547, March 1999.

[RFC2547]Rosen,E.和Y.Rekhter,“BGP/MPLS VPN”,RFC 2547,1999年3月。

[RFC3212] Jamoussi, B., Editor, Andersson, L., Callon, R., Dantu, R., Wu, L., Doolan, P., Worster, T., Feldman, N., Fredette, A., Girish, M., Gray, E., Heinanen, J., Kilty, T. and A. Malis, "Constraint-Based LSP Setup using LDP", RFC 3212, January 2002.

[RFC3212]Jamoussi,B.,编辑,Andersson,L.,Callon,R.,Dantu,R.,Wu,L.,Doolan,P.,Worster,T.,Feldman,N.,Fredette,A.,Girish,M.,Gray,E.,Heinanen,J.,Kilty,T.和A.Malis,“使用LDP的基于约束的LSP设置”,RFC 32122002年1月。

[RFC3469] Sharma, V., Ed., and F. Hellstrand, Ed., "Framework for Multi-Protocol Label Switching (MPLS)-based Recovery", RFC 3469, February 2003.

[RFC3469]Sharma,V.,Ed.,和F.Hellstrand,Ed.,“基于多协议标签交换(MPLS)的恢复框架”,RFC 3469,2003年2月。

11. Acknowledgements
11. 致谢

The author would like to thank the authors of [RFC3478] and [RFC3479] for their work on fault tolerance of LDP. Many thanks to Yakov Rekhter, Rahul Aggarwal, Manoj Leelanivas and Andrew Malis for their considered input to this applicability statement.

作者要感谢[RFC3478]和[RFC3479]的作者在LDP容错方面所做的工作。非常感谢雅科夫·雷克特、拉胡尔·阿加瓦尔、马诺·利拉尼瓦斯和安德鲁·马利斯对本适用性声明的深思熟虑的投入。

12. Author's Address
12. 作者地址

   Adrian Farrel
   Old Dog Consulting

阿德里安·法雷尔老狗咨询公司

   Phone:  +44 (0) 1978 860944
   EMail:  adrian@olddog.co.uk
        
13. Full Copyright Statement
13. 完整版权声明

Copyright (C) The Internet Society (2003). All Rights Reserved.

版权所有(C)互联网协会(2003年)。版权所有。

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

本文件及其译本可复制并提供给他人,对其进行评论或解释或协助其实施的衍生作品可全部或部分编制、复制、出版和分发,不受任何限制,前提是上述版权声明和本段包含在所有此类副本和衍生作品中。但是,不得以任何方式修改本文件本身,例如删除版权通知或对互联网协会或其他互联网组织的引用,除非出于制定互联网标准的需要,在这种情况下,必须遵循互联网标准过程中定义的版权程序,或根据需要将其翻译成英语以外的其他语言。

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assignees.

上述授予的有限许可是永久性的,互联网协会或其继承人或受让人不会撤销。

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

本文件和其中包含的信息是按“原样”提供的,互联网协会和互联网工程任务组否认所有明示或暗示的保证,包括但不限于任何保证,即使用本文中的信息不会侵犯任何权利,或对适销性或特定用途适用性的任何默示保证。

Acknowledgement

确认

Funding for the RFC Editor function is currently provided by the Internet Society.

RFC编辑功能的资金目前由互联网协会提供。