Internet Engineering Task Force (IETF)                            R. Pan
Request for Comments: 8033                                  P. Natarajan
Category: Experimental                                     Cisco Systems
ISSN: 2070-1721                                                 F. Baker
                                                            Unaffiliated
                                                                G. White
                                                               CableLabs
                                                           February 2017
        
Internet Engineering Task Force (IETF)                            R. Pan
Request for Comments: 8033                                  P. Natarajan
Category: Experimental                                     Cisco Systems
ISSN: 2070-1721                                                 F. Baker
                                                            Unaffiliated
                                                                G. White
                                                               CableLabs
                                                           February 2017
        

Proportional Integral Controller Enhanced (PIE): A Lightweight Control Scheme to Address the Bufferbloat Problem

增强型比例积分控制器(PIE):一种解决缓冲区膨胀问题的轻量级控制方案

Abstract

摘要

Bufferbloat is a phenomenon in which excess buffers in the network cause high latency and latency variation. As more and more interactive applications (e.g., voice over IP, real-time video streaming, and financial transactions) run in the Internet, high latency and latency variation degrade application performance. There is a pressing need to design intelligent queue management schemes that can control latency and latency variation, and hence provide desirable quality of service to users.

缓冲区膨胀是一种现象,网络中的缓冲区过多会导致高延迟和延迟变化。随着越来越多的交互式应用程序(如IP语音、实时视频流和金融交易)在Internet上运行,高延迟和延迟变化会降低应用程序性能。迫切需要设计智能队列管理方案来控制延迟和延迟变化,从而为用户提供理想的服务质量。

This document presents a lightweight active queue management design called "PIE" (Proportional Integral controller Enhanced) that can effectively control the average queuing latency to a target value. Simulation results, theoretical analysis, and Linux testbed results have shown that PIE can ensure low latency and achieve high link utilization under various congestion situations. The design does not require per-packet timestamps, so it incurs very little overhead and is simple enough to implement in both hardware and software.

本文档介绍了一种称为“PIE”(比例积分控制器增强型)的轻量级主动队列管理设计,它可以有效地将平均队列延迟控制在目标值。仿真结果、理论分析和Linux测试台结果表明,PIE可以确保低延迟,并在各种拥塞情况下实现高链路利用率。该设计不需要每个数据包的时间戳,因此产生的开销非常小,并且非常简单,可以在硬件和软件中实现。

Status of This Memo

关于下段备忘

This document is not an Internet Standards Track specification; it is published for examination, experimental implementation, and evaluation.

本文件不是互联网标准跟踪规范;它是为检查、实验实施和评估而发布的。

This document defines an Experimental Protocol for the Internet community. This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 7841.

本文档为互联网社区定义了一个实验协议。本文件是互联网工程任务组(IETF)的产品。它代表了IETF社区的共识。它已经接受了公众审查,并已被互联网工程指导小组(IESG)批准出版。并非IESG批准的所有文件都适用于任何级别的互联网标准;见RFC 7841第2节。

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc8033.

有关本文件当前状态、任何勘误表以及如何提供反馈的信息,请访问http://www.rfc-editor.org/info/rfc8033.

Copyright Notice

版权公告

Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.

版权所有(c)2017 IETF信托基金和确定为文件作者的人员。版权所有。

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

本文件受BCP 78和IETF信托有关IETF文件的法律规定的约束(http://trustee.ietf.org/license-info)自本文件出版之日起生效。请仔细阅读这些文件,因为它们描述了您对本文件的权利和限制。从本文件中提取的代码组件必须包括信托法律条款第4.e节中所述的简化BSD许可证文本,并提供简化BSD许可证中所述的无担保。

Table of Contents

目录

   1. Introduction ....................................................3
   2. Terminology .....................................................5
   3. Design Goals ....................................................5
   4. The Basic PIE Scheme ............................................6
      4.1. Random Dropping ............................................7
      4.2. Drop Probability Calculation ...............................7
      4.3. Latency Calculation ........................................9
      4.4. Burst Tolerance ...........................................10
   5. Optional Design Elements of PIE ................................11
      5.1. ECN Support ...............................................11
      5.2. Dequeue Rate Estimation ...................................11
      5.3. Setting PIE Active and Inactive ...........................13
      5.4. Derandomization ...........................................14
      5.5. Cap Drop Adjustment .......................................15
   6. Implementation Cost ............................................15
   7. Scope of Experimentation .......................................17
   8. Incremental Deployment .........................................17
   9. Security Considerations ........................................18
   10. References ....................................................18
      10.1. Normative References .....................................18
      10.2. Informative References ...................................18
   Appendix A. The Basic PIE Pseudocode ..............................21
   Appendix B. Pseudocode for PIE with Optional Enhancement ..........24
   Contributors ......................................................29
   Authors' Addresses ................................................30
        
   1. Introduction ....................................................3
   2. Terminology .....................................................5
   3. Design Goals ....................................................5
   4. The Basic PIE Scheme ............................................6
      4.1. Random Dropping ............................................7
      4.2. Drop Probability Calculation ...............................7
      4.3. Latency Calculation ........................................9
      4.4. Burst Tolerance ...........................................10
   5. Optional Design Elements of PIE ................................11
      5.1. ECN Support ...............................................11
      5.2. Dequeue Rate Estimation ...................................11
      5.3. Setting PIE Active and Inactive ...........................13
      5.4. Derandomization ...........................................14
      5.5. Cap Drop Adjustment .......................................15
   6. Implementation Cost ............................................15
   7. Scope of Experimentation .......................................17
   8. Incremental Deployment .........................................17
   9. Security Considerations ........................................18
   10. References ....................................................18
      10.1. Normative References .....................................18
      10.2. Informative References ...................................18
   Appendix A. The Basic PIE Pseudocode ..............................21
   Appendix B. Pseudocode for PIE with Optional Enhancement ..........24
   Contributors ......................................................29
   Authors' Addresses ................................................30
        
1. Introduction
1. 介绍

The explosion of smart phones, tablets, and video traffic in the Internet brings about a unique set of challenges for congestion control. To avoid packet drops, many service providers or data-center operators require vendors to put in as much buffer as possible. Because of the rapid decrease in memory chip prices, these requests are easily accommodated to keep customers happy. While this solution succeeds in assuring low packet loss and high TCP throughput, it suffers from a major downside. TCP continuously increases its sending rate and causes network buffers to fill up. TCP cuts its rate only when it receives a packet drop or mark that is interpreted as a congestion signal. However, drops and marks usually occur when network buffers are full or almost full. As a result, excess buffers, initially designed to avoid packet drops, would lead to highly elevated queuing latency and latency variation. Designing a queue management scheme is a delicate balancing act: it not only should allow short-term bursts to smoothly pass but also should control the average latency in the presence of long-running greedy flows.

互联网上智能手机、平板电脑和视频流量的激增给拥塞控制带来了一系列独特的挑战。为了避免数据包丢失,许多服务提供商或数据中心运营商要求供应商放入尽可能多的缓冲区。由于内存芯片价格的快速下降,这些要求很容易满足,以使客户满意。虽然此解决方案成功地确保了低数据包丢失和高TCP吞吐量,但它有一个主要缺点。TCP不断增加其发送速率并导致网络缓冲区填满。TCP只有在接收到被解释为拥塞信号的丢包或标记时才降低速率。但是,当网络缓冲区已满或几乎已满时,通常会发生丢弃和标记。因此,最初设计用于避免数据包丢失的过量缓冲区将导致队列延迟和延迟变化的高度增加。设计一个队列管理方案是一个微妙的平衡行为:它不仅应该允许短期突发顺利通过,而且还应该在存在长时间运行的贪婪流时控制平均延迟。

Active Queue Management (AQM) schemes could potentially solve the aforementioned problem. AQM schemes, such as Random Early Detection (RED) [RED] as suggested in [RFC2309] (which is now obsoleted by [RFC7567]), have been around for well over a decade. RED is implemented in a wide variety of network devices, both in hardware and software. Unfortunately, due to the fact that RED needs careful tuning of its parameters for various network conditions, most network operators don't turn RED on. In addition, RED is designed to control the queue length, which would affect latency implicitly. It does not control latency directly. Hence, the Internet today still lacks an effective design that can control buffer latency to improve the quality of experience to latency-sensitive applications. The more recently published RFC 7567 calls for new methods of controlling network latency.

主动队列管理(AQM)方案可能解决上述问题。AQM方案,如[RFC2309]中建议的随机早期检测(RED)[RED](现已被[RFC7567]淘汰),已经存在了十多年。RED在各种各样的网络设备中实现,包括硬件和软件。不幸的是,由于RED需要在各种网络条件下仔细调整其参数,因此大多数网络运营商不会开启RED。此外,RED设计用于控制队列长度,这将隐含地影响延迟。它不直接控制延迟。因此,今天的互联网仍然缺乏一种有效的设计来控制缓冲区延迟,从而提高对延迟敏感的应用程序的体验质量。最近发布的RFC 7567呼吁采用新方法控制网络延迟。

New algorithms are beginning to emerge to control queuing latency directly to address the bufferbloat problem [CoDel]. Along these lines, Proportional Integral controller Enhanced (PIE) also aims to keep the benefits of RED, including easy implementation and scalability to high speeds. Similar to RED, PIE randomly drops an incoming packet at the onset of congestion. Congestion detection, however, is based on the queuing latency instead of the queue length (as with RED). Furthermore, PIE also uses the derivative (rate of change) of the queuing latency to help determine congestion levels and an appropriate response. The design parameters of PIE are chosen via control theory stability analysis. While these parameters can be fixed to work in various traffic conditions, they could be made self-tuning to optimize system performance.

新的算法开始出现,直接控制排队延迟,以解决缓冲区膨胀问题[CoDel]。沿着这些路线,比例积分控制器增强(PIE)还旨在保持RED的优点,包括易于实现和可扩展到高速。与RED类似,PIE在拥塞开始时随机丢弃传入数据包。然而,拥塞检测是基于队列延迟而不是队列长度(如红色所示)。此外,PIE还使用排队延迟的导数(变化率)来帮助确定拥塞级别和适当的响应。通过控制理论稳定性分析,选择了PIE的设计参数。虽然这些参数可以固定以在各种交通条件下工作,但它们可以进行自调整以优化系统性能。

Separately, it is assumed that any latency-based AQM scheme would be applied over a Fair Queuing (FQ) structure or one of its approximate designs, Flow Queuing or Class-Based Queuing (CBQ). FQ is one of the most studied scheduling algorithms since it was first proposed in 1985 [RFC970]. CBQ has been a standard feature in most network devices today [CBQ]. Any AQM scheme that is built on top of FQ or CBQ could benefit from these advantages. Furthermore, these advantages, such as per-flow or per-class fairness, are orthogonal to the AQM design whose primary goal is to control latency for a given queue. For flows that are classified into the same class and put into the same queue, one needs to ensure that their latency is better controlled and that their fairness is not worse than those under the standard DropTail or RED design. More details about the relationship between FQ and AQM can be found in [RFC7806].

另外,假设任何基于延迟的AQM方案将应用于公平队列(FQ)结构或其近似设计之一,即流队列或基于类的队列(CBQ)。FQ是自1985年首次提出以来研究最多的调度算法之一[RFC970]。CBQ已成为当今大多数网络设备的标准功能[CBQ]。任何建立在FQ或CBQ之上的AQM方案都可以从这些优势中获益。此外,这些优势(例如每流或每类公平性)与AQM设计正交,AQM设计的主要目标是控制给定队列的延迟。对于分类为同一类并放入同一队列的流,需要确保它们的延迟得到更好的控制,并且它们的公平性不比标准DropTail或RED设计下的流差。有关FQ和AQM之间关系的更多详细信息,请参见[RFC7806]。

In October 2013, CableLabs' Data-Over-Cable Service Interface Specification 3.1 (DOCSIS 3.1) specification [DOCSIS_3.1] mandated that cable modems implement a specific variant of the PIE design as the active queue management algorithm. In addition to cable-specific

2013年10月,CableLabs的有线数据服务接口规范3.1(DOCSIS 3.1)规范[DOCSIS_3.1]强制要求有线调制解调器实施PIE设计的特定变体,作为主动队列管理算法。除了特定的电缆

improvements, the PIE design in DOCSIS 3.1 [RFC8034] has improved the original design in several areas, including derandomization of coin tosses and enhanced burst protection.

改进,DOCSIS 3.1[RFC8034]中的饼图设计在几个方面改进了原始设计,包括硬币投掷的去随机化和增强的爆裂保护。

This document describes the design of PIE and separates it into basic elements and optional components that may be implemented to enhance the performance of PIE.

本文档描述了PIE的设计,并将其分为基本元素和可选组件,这些元素和组件可用于增强PIE的性能。

2. Terminology
2. 术语

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

本文件中的关键词“必须”、“不得”、“要求”、“应”、“不应”、“应”、“不应”、“建议”、“可”和“可选”应按照RFC 2119[RFC2119]中所述进行解释。

3. Design Goals
3. 设计目标

A queue management framework is designed to improve the performance of interactive and latency-sensitive applications. It should follow the general guidelines set by the AQM working group document "IETF Recommendations Regarding Active Queue Management" [RFC7567]. More specifically, the PIE design has the following basic criteria.

队列管理框架旨在提高交互式和对延迟敏感的应用程序的性能。它应遵循AQM工作组文件“IETF关于主动队列管理的建议”[RFC7567]中规定的一般指南。更具体地说,饼图设计具有以下基本标准。

* First, queuing latency, instead of queue length, is controlled. Queue sizes change with queue draining rates and various flows' round-trip times. Latency bloat is the real issue that needs to be addressed, as it impairs real-time applications. If latency can be controlled, bufferbloat is not an issue. In fact, once latency is under control, it frees up buffers for sporadic bursts.

* 首先,控制队列延迟,而不是队列长度。队列大小随队列排空率和各种流的往返时间而变化。延迟膨胀是需要解决的真正问题,因为它会损害实时应用程序。如果延迟可以控制,则缓冲区膨胀不是问题。事实上,一旦延迟得到控制,它就会释放出缓冲区,用于零星的突发事件。

* Secondly, PIE aims to attain high link utilization. The goal of low latency shall be achieved without suffering link underutilization or losing network efficiency. An early congestion signal could cause TCP to back off and avoid queue buildup. On the other hand, however, TCP's rate reduction could result in link underutilization. There is a delicate balance between achieving high link utilization and low latency.

* 其次,PIE旨在实现高链路利用率。应在不影响链路利用率或网络效率的情况下实现低延迟的目标。早期拥塞信号可能会导致TCP退出并避免队列累积。然而,另一方面,TCP的速率降低可能导致链路利用率不足。在实现高链路利用率和低延迟之间存在微妙的平衡。

* Furthermore, the scheme should be simple to implement and easily scalable in both hardware and software. PIE strives to maintain design simplicity similar to that of RED, which has been implemented in a wide variety of network devices.

* 此外,该方案应易于实现,并且在硬件和软件上易于扩展。PIE力求保持与RED类似的设计简洁性,RED已在各种网络设备中实现。

* Finally, the scheme should ensure system stability for various network topologies and scale well across an arbitrary number of streams. Design parameters shall be set automatically. Users only need to set performance-related parameters such as target queue latency, not design parameters.

* 最后,该方案应确保各种网络拓扑的系统稳定性,并能在任意数量的流中很好地扩展。设计参数应自动设置。用户只需要设置与性能相关的参数,如目标队列延迟,而不需要设置设计参数。

In the following text, the design of PIE and its operation are described in detail.

在下文中,将详细描述PIE的设计及其操作。

4. The Basic PIE Scheme
4. 基本饼图方案

As illustrated in Figure 1, PIE is comprised of three simple basic components: a) random dropping at enqueuing, b) periodic drop probability updates, and c) latency calculation. When a packet arrives, a random decision is made regarding whether to drop the packet. The drop probability is updated periodically based on how far the current latency is away from the target value and whether the queuing latency is currently trending up or down. The queuing latency can be obtained using direct measurements or using estimations calculated from the queue length and the dequeue rate.

如图1所示,PIE由三个简单的基本组件组成:a)排队时的随机丢弃,b)定期丢弃概率更新,以及c)延迟计算。当一个数据包到达时,会做出关于是否丢弃该数据包的随机决定。根据当前延迟距离目标值的距离以及排队延迟当前是向上还是向下的趋势,定期更新丢弃概率。排队等待时间可以通过直接测量或使用根据队列长度和出列率计算的估计值来获得。

The detailed definition of parameters can be found in Appendix A of this document ("The Basic PIE Pseudocode"). Any state variables that PIE maintains are noted using "PIE->". For a full description of the algorithm, one can refer to the full paper [HPSR-PIE].

参数的详细定义见本文件附录A(“基本PIE伪代码”)。PIE维护的任何状态变量都使用“PIE->”进行注释。有关算法的完整描述,请参阅全文[HPSR-PIE]。

         Random Drop
              /               --------------
      -------/  -------------->    | | | | | -------------->
             /|\                   | | | | |
              |               --------------
              |             Queue Buffer   \
              |                     |       \
              |                     |Queue   \
              |                     |Length   \
              |                     |          \
              |                    \|/         \/
              |          -----------------    -------------------
              |          |     Drop      |    |                 |
              -----<-----|  Probability  |<---| Latency         |
                         |  Calculation  |    | Calculation     |
                         -----------------    -------------------
        
         Random Drop
              /               --------------
      -------/  -------------->    | | | | | -------------->
             /|\                   | | | | |
              |               --------------
              |             Queue Buffer   \
              |                     |       \
              |                     |Queue   \
              |                     |Length   \
              |                     |          \
              |                    \|/         \/
              |          -----------------    -------------------
              |          |     Drop      |    |                 |
              -----<-----|  Probability  |<---| Latency         |
                         |  Calculation  |    | Calculation     |
                         -----------------    -------------------
        

Figure 1: The PIE Structure

图1:饼图结构

4.1. Random Dropping
4.1. 随机下降

PIE randomly drops a packet upon its arrival to a queue according to a drop probability, PIE->drop_prob_, that is obtained from the drop-probability-calculation component. The random drop is triggered by a packet's arrival before enqueuing into a queue.

PIE根据从丢弃概率计算组件获得的丢弃概率PIE->drop_prob_,在数据包到达队列时随机丢弃数据包。在排队进入队列之前,数据包的到达会触发随机丢弃。

* Upon a packet enqueue:

* 在数据包排队时:

randomly drop the packet with a probability of PIE->drop_prob_.

随机丢弃数据包,概率为PIE->drop\u prob\u0。

To ensure that PIE is "work conserving", we bypass the random drop if the latency sample, PIE->qdelay_old_, is smaller than half of the target latency value (QDELAY_REF) when the drop probability is not too high (i.e., PIE->drop_prob_ < 0.2), or if the queue has less than a couple of packets.

为了确保PIE“节省工作”,如果延迟样本PIE->qdelay_old_小于目标延迟值(qdelay_REF)的一半,并且丢弃概率不太高(即PIE->drop_prob_<0.2),或者如果队列包含的数据包少于两个,则绕过随机丢弃。

* Upon a packet enqueue, PIE does the following:

* 在数据包排队时,PIE执行以下操作:

//Safeguard PIE to be work conserving if ( (PIE->qdelay_old_ < QDELAY_REF/2 && PIE->drop_prob_ < 0.2) || (queue_.byte_length() <= 2 * MEAN_PKTSIZE) ) return ENQUE; else randomly drop the packet with a probability of PIE->drop_prob_.

//如果((PIE->qdelay\u old\qdelay\u REF/2&&PIE->drop\u prob\u<0.2)| |(queue_u.byte\u length()<=2*MEAN\u PKTSIZE))返回ENQUE,则保护PIE以节省工作量;否则随机丢弃数据包,概率为PIE->drop\u prob\u0。

PIE optionally supports Explicit Congestion Notification (ECN); see Section 5.1.

PIE可选地支持显式拥塞通知(ECN);见第5.1节。

4.2. Drop Probability Calculation
4.2. 跌落概率计算

The PIE algorithm periodically updates the drop probability based on the latency samples -- not only the current latency sample but also whether the latency is trending up or down. This is the classical Proportional Integral (PI) controller method, which is known for eliminating steady-state errors. This type of controller has been studied before for controlling the queue length [PI] [QCN]. PIE adopts the PI controller for controlling latency. The algorithm also auto-adjusts the control parameters based on how heavy the congestion is, which is reflected in the current drop probability. Note that the current drop probability is a direct measure of the current congestion level; there is no need to measure the arrival rate and dequeue rate mismatches.

PIE算法根据延迟样本周期性地更新丢弃概率——不仅是当前延迟样本,还包括延迟趋势是向上还是向下。这是经典的比例积分(PI)控制器方法,以消除稳态误差而闻名。之前已经研究过这种类型的控制器,用于控制队列长度[PI][QCN]。PIE采用PI控制器控制延迟。该算法还根据拥塞的严重程度自动调整控制参数,这反映在当前丢弃概率中。请注意,当前丢弃概率是当前拥塞水平的直接度量;无需测量到达率和出列率的不匹配。

When a congestion period ends, we might be left with a high drop probability with light packet arrivals. Hence, the PIE algorithm includes a mechanism by which the drop probability decays

当拥塞期结束时,光包到达可能会给我们留下很高的丢包概率。因此,PIE算法包含一种下降概率衰减的机制

exponentially (rather than linearly) when the system is not congested. This would help the drop probability converge to 0 more quickly, while the PI controller ensures that it would eventually reach zero. The decay parameter of 2% gives us a time constant around 50 * T_UPDATE.

当系统不拥挤时,以指数方式(而不是线性方式)运行。这将有助于下降概率更快地收敛到0,而PI控制器将确保下降概率最终达到0。2%的衰减参数为我们提供了一个大约50*T_更新的时间常数。

Specifically, the PIE algorithm periodically adjusts the drop probability every T_UPDATE interval:

具体而言,PIE算法每隔T_更新间隔定期调整丢弃概率:

* calculate drop probability PIE->drop_prob_, and autotune it as follows:

* 计算下降概率饼图->下降概率,并按如下方式自动调谐:

         p = alpha * (current_qdelay - QDELAY_REF) +
                beta * (current_qdelay - PIE->qdelay_old_);
        
         p = alpha * (current_qdelay - QDELAY_REF) +
                beta * (current_qdelay - PIE->qdelay_old_);
        
         if (PIE->drop_prob_ < 0.000001) {
             p /= 2048;
         } else if (PIE->drop_prob_ < 0.00001) {
             p /= 512;
         } else if (PIE->drop_prob_ < 0.0001) {
             p /= 128;
         } else if (PIE->drop_prob_ < 0.001) {
             p /= 32;
         } else if (PIE->drop_prob_ < 0.01) {
             p /= 8;
         } else if (PIE->drop_prob_ < 0.1) {
             p /= 2;
         } else {
             p = p;
         }
         PIE->drop_prob_ += p;
        
         if (PIE->drop_prob_ < 0.000001) {
             p /= 2048;
         } else if (PIE->drop_prob_ < 0.00001) {
             p /= 512;
         } else if (PIE->drop_prob_ < 0.0001) {
             p /= 128;
         } else if (PIE->drop_prob_ < 0.001) {
             p /= 32;
         } else if (PIE->drop_prob_ < 0.01) {
             p /= 8;
         } else if (PIE->drop_prob_ < 0.1) {
             p /= 2;
         } else {
             p = p;
         }
         PIE->drop_prob_ += p;
        

* decay the drop probability exponentially:

* 按指数衰减下降概率:

         if (current_qdelay == 0 && PIE->qdelay_old_ == 0) {
             PIE->drop_prob_ = PIE->drop_prob_ * 0.98;
                                                 //1 - 1/64 is
                                                 //sufficient
         }
        
         if (current_qdelay == 0 && PIE->qdelay_old_ == 0) {
             PIE->drop_prob_ = PIE->drop_prob_ * 0.98;
                                                 //1 - 1/64 is
                                                 //sufficient
         }
        

* bound the drop probability:

* 限制跌落概率:

         if (PIE->drop_prob_ < 0)
                  PIE->drop_prob_ = 0.0
         if (PIE->drop_prob_ > 1)
                  PIE->drop_prob_ = 1.0
        
         if (PIE->drop_prob_ < 0)
                  PIE->drop_prob_ = 0.0
         if (PIE->drop_prob_ > 1)
                  PIE->drop_prob_ = 1.0
        

* store the current latency value:

* 存储当前延迟值:

PIE->qdelay_old_ = current_qdelay.

饼图->qdelay\u old\u=当前\u qdelay。

The update interval, T_UPDATE, is defaulted to be 15 milliseconds. It MAY be reduced on high-speed links in order to provide smoother response. The target latency value, QDELAY_REF, SHOULD be set to 15 milliseconds. The variables current_qdelay and PIE->qdelay_old_ represent the current and previous samples of the queuing latency, which are calculated by the "latency calculation" component (see Section 4.3). The variable current_qdelay is actually a temporary variable, while PIE->qdelay_old_ is a state variable that PIE keeps. The drop probability is a value between 0 and 1. However, implementations can certainly use integers.

更新间隔T_update默认为15毫秒。为了提供更平滑的响应,可以在高速链路上减小该值。目标延迟值QDELAY_REF应设置为15毫秒。变量current_qdelay和PIE->qdelay_old_u表示排队等待时间的当前和以前的样本,由“延迟计算”组件计算(参见第4.3节)。变量current_qdelay实际上是一个临时变量,而PIE->qdelay_old_是PIE保留的状态变量。跌落概率是一个介于0和1之间的值。然而,实现当然可以使用整数。

The controller parameters, alpha and beta (expressed in Hz), are designed using feedback loop analysis, where TCP's behaviors are modeled using the results from well-studied prior art [TCP-Models]. Note that the above adjustment of 'p' effectively scales the alpha and beta parameters based on the current congestion level indicated by the drop probability.

控制器参数alpha和beta(以Hz表示)使用反馈回路分析进行设计,其中TCP的行为使用充分研究的现有技术[TCP模型]的结果进行建模。请注意,上述对“p”的调整有效地根据丢弃概率指示的当前拥塞水平缩放alpha和beta参数。

The theoretical analysis of PIE can be found in [HPSR-PIE]. As a rule of thumb, to keep the same feedback loop dynamics, if we cut T_UPDATE in half, we should also cut alpha by half and increase beta by alpha/4. If the target latency is reduced, e.g., for data-center use, the values of alpha and beta should be increased by the same order of magnitude by which the target latency is reduced. For example, if QDELAY_REF is reduced and changed from 15 milliseconds to 150 microseconds -- a reduction of two orders of magnitude -- then alpha and beta values should be increased to alpha * 100 and beta * 100.

PIE的理论分析见[HPSR-PIE]。根据经验,为了保持相同的反馈循环动态,如果我们将T_UPDATE一分为二,我们还应该将alpha减半,将beta增加alpha/4。如果目标延迟降低,例如,对于数据中心的使用,alpha和beta的值应增加与目标延迟降低相同的数量级。例如,如果QDELAY_REF减小并从15毫秒更改为150微秒(减少两个数量级),则alpha和beta值应增加到alpha*100和beta*100。

4.3. Latency Calculation
4.3. 延迟计算

The PIE algorithm uses latency to calculate drop probability in one of two ways:

PIE算法使用延迟通过以下两种方式之一计算丢弃概率:

* It estimates the current queuing latency using Little's law (see Section 5.2 for details):

* 它使用Little定律估计当前排队等待时间(详见第5.2节):

         current_qdelay = queue_.byte_length()/dequeue_rate;
        
         current_qdelay = queue_.byte_length()/dequeue_rate;
        

* It may use other techniques for calculating queuing latency, e.g., time-stamp the packets at enqueue, and use the timestamps to calculate latency during dequeue.

* 它可以使用其他技术来计算排队等待时间,例如,在排队时对分组加上时间戳,并使用时间戳来计算出队列期间的等待时间。

4.4. Burst Tolerance
4.4. 突发容限

PIE does not penalize short-term packet bursts as suggested in [RFC7567]. PIE allows bursts of traffic that create finite-duration events in which current queuing latency exceeds QDELAY_REF without triggering packet drops. This document introduces a parameter called "MAX_BURST"; MAX_BURST defines the burst duration that will be protected. By default, the parameter SHOULD be set to 150 milliseconds. For simplicity, the PIE algorithm MAY effectively round MAX_BURST up to an integer multiple of T_UPDATE.

PIE不惩罚[RFC7567]中建议的短期数据包突发。PIE允许流量突发,产生有限持续时间事件,其中当前排队延迟超过QDELAY_REF,而不会触发数据包丢弃。本文档介绍了一个名为“MAX_BURST”的参数;MAX_突发定义将被保护的突发持续时间。默认情况下,该参数应设置为150毫秒。为简单起见,PIE算法可以有效地将MAX_突发取整为T_更新的整数倍。

To implement the burst tolerance function, two basic components of PIE are involved: "random dropping" and "drop probability calculation". The PIE algorithm does the following:

为了实现突发容忍功能,PIE涉及两个基本组件:“随机丢弃”和“丢弃概率计算”。饼图算法执行以下操作:

* In the "random dropping" block and upon packet arrival, PIE checks the following:

* 在“随机丢弃”块中,当数据包到达时,PIE检查以下内容:

Upon a packet enqueue: if PIE->burst_allowance_ > 0 enqueue packet; else randomly drop a packet with a probability of PIE->drop_prob_.

分组排队时:如果PIE->burst\U余量\u0分组排队;否则随机丢弃一个数据包,概率为PIE->drop\u prob\。

         if (PIE->drop_prob_ == 0 and current_qdelay < QDELAY_REF/2 and
             PIE->qdelay_old_ < QDELAY_REF/2)
             PIE->burst_allowance_ = MAX_BURST;
        
         if (PIE->drop_prob_ == 0 and current_qdelay < QDELAY_REF/2 and
             PIE->qdelay_old_ < QDELAY_REF/2)
             PIE->burst_allowance_ = MAX_BURST;
        

* In the "drop probability calculation" block, PIE additionally calculates:

* 在“坠落概率计算”块中,PIE还计算:

      PIE->burst_allowance_ = max(0,PIE->burst_allowance_ - T_UPDATE);
        
      PIE->burst_allowance_ = max(0,PIE->burst_allowance_ - T_UPDATE);
        

The burst allowance, noted by PIE->burst_allowance_, is initialized to MAX_BURST. As long as PIE->burst_allowance_ is above zero, an incoming packet will be enqueued, bypassing the random drop process. During each update instance, the value of PIE->burst_allowance_ is decremented by the update period, T_UPDATE, and is bottomed at 0. When the congestion goes away -- defined here as PIE->drop_prob_ equals 0 and both the current and previous samples of estimated latency are less than half of QDELAY_REF -- PIE->burst_allowance_ is reset to MAX_BURST.

由饼图->突发\u容差\u0注释的突发容差被初始化为最大突发。只要PIE->burst_Allowment_高于零,传入的数据包将排队,绕过随机丢弃过程。在每个更新实例中,PIE->burst_Allowment_u的值随更新周期T_update递减,并在0处见底。当拥塞消失时——这里定义为PIE->drop_prob_uu等于0,并且当前和以前的估计延迟样本都小于QDELAY_REF的一半——PIE->burst_Allowment_u重置为MAX_burst。

5. Optional Design Elements of PIE
5. PIE的可选设计元素

There are several enhancements that are added to further augment the performance of the basic algorithm. For purposes of clarity, they are included in this section.

添加了一些增强功能,以进一步增强基本算法的性能。为清楚起见,本节中包含了这些内容。

5.1. ECN Support
5.1. ECN支持

PIE MAY support ECN by marking (rather than dropping) ECN-capable packets [ECN]. This document introduces an additional threshold called "mark_ecnth", which acts as a safeguard: if the calculated drop probability exceeds mark_ecnth, PIE reverts to packet-dropping for ECN-capable packets. The variable mark_ecnth SHOULD be set to 0.1 (10%).

PIE可以通过标记(而不是丢弃)支持ECN的数据包[ECN]来支持ECN。本文档引入了一个称为“mark_ecnth”的附加阈值,该阈值起到了保护作用:如果计算的丢弃概率超过mark_ecnth,PIE将恢复到支持ECN的数据包的数据包丢弃。变量mark_ecnth应设置为0.1(10%)。

* To support ECN, the "random drop with a probability of PIE->drop_prob_" function in the "random dropping" block is changed to the following:

* 为了支持ECN,“随机丢弃”块中的“随机丢弃概率为饼状->丢弃概率”函数更改为:

* Upon a packet enqueue:

* 在数据包排队时:

         if rand() < PIE->drop_prob_:
          if PIE->drop_prob_ < mark_ecnth && ecn_capable_packet == TRUE:
             mark packet;
          else
             drop packet;
        
         if rand() < PIE->drop_prob_:
          if PIE->drop_prob_ < mark_ecnth && ecn_capable_packet == TRUE:
             mark packet;
          else
             drop packet;
        
5.2. Dequeue Rate Estimation
5.2. 出列率估计

Using timestamps, a latency sample can only be obtained when a packet reaches the head of a queue. When a quick response time is desired or a direct latency sample is not available, one may obtain latency through measuring the dequeue rate. The draining rate of a queue in the network often varies either because other queues are sharing the same link or because the link capacity fluctuates. Rate fluctuation is particularly common in wireless networks. One may measure directly at the dequeue operation. Short, non-persistent bursts of packets result in empty queues from time to time; this would make the measurement less accurate. PIE only measures latency when there is sufficient data in the buffer, i.e., when the queue length is over a

使用时间戳,只有当数据包到达队列的头部时,才能获得延迟样本。当需要快速响应时间或直接延迟样本不可用时,可以通过测量出列率来获得延迟。由于其他队列共享同一链路,或者由于链路容量波动,网络中队列的排放率通常会发生变化。速率波动在无线网络中尤其常见。可以在出列操作时直接测量。短的、非持久性的数据包突发会不时导致空队列;这会降低测量的准确性。PIE仅在缓冲区中有足够数据时测量延迟,即队列长度超过

certain threshold (DQ_THRESHOLD). PIE measures how long it takes to drain DQ_THRESHOLD packets. More specifically, the rate estimation can be implemented as follows:

某些阈值(DQ_阈值)。PIE测量消耗DQ_阈值数据包所需的时间。更具体地说,速率估计可以如下实现:

      current_qdelay = queue_.byte_length() *
                       PIE->avg_dq_time_/DQ_THRESHOLD;
        
      current_qdelay = queue_.byte_length() *
                       PIE->avg_dq_time_/DQ_THRESHOLD;
        

* Upon a packet dequeue:

* 在数据包出列时:

      if PIE->in_measurement_ == FALSE and queue.byte_length() >=
      DQ_THRESHOLD:
         PIE->in_measurement_ = TRUE;
         PIE->measurement_start_ = now;
         PIE->dq_count_ = 0;
        
      if PIE->in_measurement_ == FALSE and queue.byte_length() >=
      DQ_THRESHOLD:
         PIE->in_measurement_ = TRUE;
         PIE->measurement_start_ = now;
         PIE->dq_count_ = 0;
        
      if PIE->in_measurement_ == TRUE:
         PIE->dq_count_ = PIE->dq_count_ + deque_pkt_size;
         if PIE->dq_count_ >= DQ_THRESHOLD then
            weight = DQ_THRESHOLD/2^16
            PIE->avg_dq_time_ = (now - PIE->measurement_start_) *
                                weight + PIE->avg_dq_time_ *
                                (1 - weight);
            PIE->dq_count_ = 0;
            PIE->measurement_start_ = now
         else
            PIE->in_measurement_ = FALSE;
        
      if PIE->in_measurement_ == TRUE:
         PIE->dq_count_ = PIE->dq_count_ + deque_pkt_size;
         if PIE->dq_count_ >= DQ_THRESHOLD then
            weight = DQ_THRESHOLD/2^16
            PIE->avg_dq_time_ = (now - PIE->measurement_start_) *
                                weight + PIE->avg_dq_time_ *
                                (1 - weight);
            PIE->dq_count_ = 0;
            PIE->measurement_start_ = now
         else
            PIE->in_measurement_ = FALSE;
        

The parameter PIE->dq_count_ represents the number of bytes departed since the last measurement. Once PIE->dq_count_ is over DQ_THRESHOLD, a measurement sample is obtained. It is recommended that the threshold be set to 16 KB, assuming a typical packet size of around 1 KB or 1.5 KB. This threshold would allow sufficient data to obtain an average draining rate but would also be fast enough (< 64 KB) to reflect sudden changes in the draining rate. If DQ_THRESHOLD is smaller than 64 KB, a small weight is used to smooth out the dequeue time and obtain PIE->avg_dq_time_. The dequeue rate is simply DQ_THRESHOLD divided by PIE->avg_dq_time_. This threshold is not crucial for the system's stability. Please note that the update interval for calculating the drop probability is different from the rate measurement cycle. The drop probability calculation is done periodically per Section 4.2, and it is done even when the algorithm is not in a measurement cycle; in this case, the previously latched value of PIE->avg_dq_time_ is used.

参数PIE->dq_count_uu表示自上次测量以来离开的字节数。一旦PIE->dq_计数超过dq_阈值,则获得测量样本。建议将阈值设置为16 KB,假设典型数据包大小约为1 KB或1.5 KB。该阈值允许有足够的数据来获得平均排放速率,但也足够快(<64 KB),以反映排放速率的突然变化。如果DQ_阈值小于64 KB,则使用一个小权重来平滑出列时间,并获得PIE->avg_DQ_时间。出列率只是DQ_阈值除以饼图->平均DQ_时间。这个阈值对于系统的稳定性并不重要。请注意,计算跌落概率的更新间隔与速率测量周期不同。根据第4.2节定期进行跌落概率计算,即使算法不在测量周期内,也应进行跌落概率计算;在这种情况下,使用先前锁定的PIE->avg_dq_time_u值。

            Random Drop
                /                     --------------
        -------/  -------------------->    | | | | | -------------->
               /|\             |           | | | | |
                |              |      --------------
                |              |       Queue Buffer
                |              |             |
                |              |             |Queue
                |              |             |Length
                |              |             |
                |             \|/           \|/
                |          ------------------------------
                |          |     Dequeue Rate           |
                -----<-----|  & Drop Probability        |
                           |        Calculation         |
                           ------------------------------
        
            Random Drop
                /                     --------------
        -------/  -------------------->    | | | | | -------------->
               /|\             |           | | | | |
                |              |      --------------
                |              |       Queue Buffer
                |              |             |
                |              |             |Queue
                |              |             |Length
                |              |             |
                |             \|/           \|/
                |          ------------------------------
                |          |     Dequeue Rate           |
                -----<-----|  & Drop Probability        |
                           |        Calculation         |
                           ------------------------------
        

Figure 2: The Enqueue-Based PIE Structure

图2:基于排队的饼图结构

In some platforms, enqueuing and dequeuing functions belong to different modules that are independent of each other. In such situations, a pure enqueue-based design can be developed. An enqueue-based design is depicted in Figure 2. The dequeue rate is deduced from the number of packets enqueued and the queue length. The design is based on the following key observation: over a certain time interval, the number of dequeued packets = the number of enqueued packets minus the number of remaining packets in the queue. In this design, everything can be triggered by packet arrival, including the background update process. The design complexity here is similar to the original design.

在某些平台中,排队和退队功能属于相互独立的不同模块。在这种情况下,可以开发纯排队设计。基于排队的设计如图2所示。从排队的数据包数和队列长度推断出出列率。该设计基于以下关键观察:在一定的时间间隔内,排队数据包的数量=排队数据包的数量减去队列中剩余数据包的数量。在这个设计中,任何东西都可以由数据包到达触发,包括后台更新过程。此处的设计复杂性与原始设计相似。

5.3. Setting PIE Active and Inactive
5.3. 将饼图设置为活动和非活动

Traffic naturally fluctuates in a network. It would be preferable not to unnecessarily drop packets due to a spurious uptick in queuing latency. PIE has an optional feature of automatically becoming active/inactive. To implement this feature, PIE may choose to only become active (from inactive) when the buffer occupancy is over a certain threshold, which may be set to 1/3 of the tail drop threshold. PIE becomes inactive when congestion ends; i.e., when the drop probability reaches 0, current and previous latency samples are all below half of QDELAY_REF.

网络中的流量自然会波动。最好不要由于排队延迟的虚假上升而不必要地丢弃数据包。饼图具有自动变为活动/非活动的可选功能。为了实现此功能,PIE可以选择仅在缓冲区占用率超过某个阈值(可设置为尾部下降阈值的1/3)时才变为活动(从非活动)。当拥塞结束时,PIE变得不活动;i、 例如,当丢弃概率达到0时,当前和以前的延迟样本都低于QDELAY_REF的一半。

Ideally, PIE should become active/inactive based on latency. However, calculating latency when PIE is inactive would introduce unnecessary packet-processing overhead. Weighing the trade-offs, we decided to compare against the tail drop threshold to keep things simple.

理想情况下,PIE应根据延迟变为活动/非活动。然而,当PIE处于非活动状态时计算延迟将引入不必要的数据包处理开销。权衡权衡权衡后,我们决定与尾部下降阈值进行比较,以保持简单。

When PIE optionally becomes active/inactive, the burst protection logic described in Section 4.4 is modified as follows:

当PIE可选地变为活动/非活动时,第4.4节中描述的突发保护逻辑修改如下:

* "Random dropping" block: PIE adds the following:

* “随机下降”块:饼图添加以下内容:

Upon packet arrival:

包裹到达时:

      if PIE->active_ == FALSE && queue_length >= TAIL_DROP/3:
         PIE->active_ = TRUE;
         PIE->burst_allowance_ = MAX_BURST;
        
      if PIE->active_ == FALSE && queue_length >= TAIL_DROP/3:
         PIE->active_ = TRUE;
         PIE->burst_allowance_ = MAX_BURST;
        

if PIE->burst_allowance_ > 0 enqueue packet; else randomly drop a packet with a probability of PIE->drop_prob_.

如果PIE->burst\u Allowment\u0将数据包排队;否则随机丢弃一个数据包,概率为PIE->drop\u prob\。

      if (PIE->drop_prob_ == 0 and current_qdelay < QDELAY_REF/2 and
          PIE->qdelay_old_ < QDELAY_REF/2)
          PIE->active_ = FALSE;
          PIE->burst_allowance_ = MAX_BURST;
        
      if (PIE->drop_prob_ == 0 and current_qdelay < QDELAY_REF/2 and
          PIE->qdelay_old_ < QDELAY_REF/2)
          PIE->active_ = FALSE;
          PIE->burst_allowance_ = MAX_BURST;
        
   *  "Drop probability calculation" block: PIE does the following:
      if PIE->active_ == TRUE:
         PIE->burst_allowance_ =
            max(0,PIE->burst_allowance_ - T_UPDATE);
        
   *  "Drop probability calculation" block: PIE does the following:
      if PIE->active_ == TRUE:
         PIE->burst_allowance_ =
            max(0,PIE->burst_allowance_ - T_UPDATE);
        
5.4. Derandomization
5.4. 去氮化

Although PIE adopts random dropping to achieve latency control, independent coin tosses could introduce outlier situations where packets are dropped too close to each other or too far from each other. This would cause the real drop percentage to temporarily deviate from the intended value PIE->drop_prob_. In certain scenarios, such as a small number of simultaneous TCP flows, these deviations can cause significant deviations in link utilization and queuing latency. PIE may use a derandomization mechanism to avoid such situations. A parameter called "PIE->accu_prob_" is reset to 0 after a drop. Upon packet arrival, PIE->accu_prob_ is incremented by the amount of drop probability, PIE->drop_prob_. If PIE->accu_prob_ is less than a low threshold, e.g., 0.85, the arriving packet is enqueued; on the other hand, if PIE->accu_prob_ is more than a high threshold, e.g., 8.5, and the queue is congested, the arrival packet is forced to be dropped. A packet is only randomly dropped if PIE->accu_prob_ falls between the two thresholds. Since PIE->accu_prob_ is reset to 0 after a drop, another drop will not happen until 0.85/PIE->drop_prob_ packets later. This avoids packets being dropped too close to each other. In the other extreme case

虽然PIE采用随机丢弃来实现延迟控制,但独立投币可能会引入离群情况,即数据包之间的丢弃距离太近或太远。这将导致实际下降百分比暂时偏离预期值饼->下降概率。在某些情况下,例如少量的同时TCP流,这些偏差可能会导致链路利用率和排队延迟的显著偏差。PIE可以使用去随机化机制来避免这种情况。一个名为“PIE->accu_prob__”的参数在下降后重置为0。数据包到达后,PIE->accu_prob_uu会增加丢弃概率的数量,即PIE->drop_prob_uu。如果PIE->accu_prob_小于低阈值,例如0.85,则到达的数据包排队;另一方面,如果PIE->accu_prob_超过高阈值,例如8.5,并且队列拥挤,则将强制丢弃到达数据包。只有当PIE->accu_prob_落在两个阈值之间时,才会随机丢弃数据包。由于PIE->accu_prob_uu在丢弃后重置为0,因此在0.85/PIE->drop_prob_uu之后才会发生另一次丢弃。这样可以避免数据包之间的距离过近。在另一极端情况下

where 8.5/PIE->drop_prob_ packets have been enqueued without incurring a drop, PIE would force a drop in order to prevent the drops from being spaced too far apart. Further analysis can be found in [RFC8034].

如果8.5/PIE->drop_prob___数据包已排队而未发生丢包,PIE将强制丢包,以防止丢包间隔过远。进一步的分析见[RFC8034]。

5.5. Cap Drop Adjustment
5.5. 盖降调整

In the case of a single TCP flow, during the slow-start phase the queue could quickly increase, which could result in a very rapid increase in drop probability. In order to prevent an excessive ramp-up that could negatively impact the throughput in this scenario, PIE can cap the maximum drop probability increase in each step.

在单个TCP流的情况下,在慢启动阶段,队列可能会快速增加,这可能会导致丢弃概率快速增加。为了防止在这种情况下可能对吞吐量产生负面影响的过度爬升,PIE可以限制每个步骤中的最大丢弃概率增加。

* "Drop probability calculation" block: PIE adds the following:

* “坠落概率计算”块:PIE添加以下内容:

      if (PIE->drop_prob_ >= 0.1 && p > 0.02) {
          p = 0.02;
      }
        
      if (PIE->drop_prob_ >= 0.1 && p > 0.02) {
          p = 0.02;
      }
        
6. Implementation Cost
6. 实施成本

PIE can be applied to existing hardware or software solutions. There are three steps involved in PIE, as discussed in Section 4. Their complexities are examined below.

PIE可以应用于现有的硬件或软件解决方案。PIE涉及三个步骤,如第4节所述。它们的复杂性如下所述。

Upon packet arrival, the algorithm simply drops a packet randomly, based on the drop probability. This step is straightforward and requires no packet header examination and manipulation. If the implementation doesn't rely on packet timestamps for calculating latency, PIE does not require extra memory. Furthermore, the input side of a queue is typically under software control while the output side of a queue is hardware based. Hence, a drop at enqueuing can be readily retrofitted into existing or software implementations.

在数据包到达时,该算法根据丢包概率随机丢包。这一步很简单,不需要对数据包头进行检查和操作。如果实现不依赖数据包时间戳来计算延迟,则PIE不需要额外的内存。此外,队列的输入端通常受软件控制,而队列的输出端则基于硬件。因此,排队时的下降可以很容易地改造成现有的或软件实现。

The drop probability calculation is done in the background, and it occurs every T_UPDATE interval. Given modern high-speed links, this period translates into once every tens, hundreds, or even thousands of packets. Hence, the calculation occurs at a much slower time scale than the packet-processing time -- at least an order of magnitude slower. The calculation of drop probability involves multiplications using alpha and beta. Since PIE's control law is robust to minor changes in alpha and beta values, an implementation MAY choose these values to the closest multiples of 2 or 1/2 (e.g., alpha = 1/8, beta = 1 + 1/4) such that the multiplications can be done using simple adds and shifts. As no complicated functions are required, PIE can be easily implemented in both hardware and

丢弃概率的计算是在后台完成的,它会在每个T_更新间隔发生。考虑到现代高速链路,这段时间转化为每数十、数百甚至数千个数据包一次。因此,计算发生在比数据包处理时间慢得多的时间尺度上——至少慢一个数量级。下落概率的计算涉及使用α和β的乘法。由于PIE的控制律对α和β值的微小变化具有鲁棒性,因此实现可将这些值选择为2或1/2(例如,α=1/8,β=1+1/4)的最接近倍数,以便可以使用简单的加法和移位来完成乘法。由于不需要复杂的功能,PIE可以很容易地在硬件和软件中实现

software. The state requirement is only three variables per queue: burst_allowance_, PIE->drop_prob_, and PIE->qdelay_old_. Hence, the memory overhead is small.

软件每个队列的状态要求只有三个变量:突发\u余量\u、PIE->drop\u prob\u和PIE->qdelay\u old\u。因此,内存开销很小。

If one chooses to implement the departure rate estimation, PIE uses a counter to keep track of the number of bytes departed for the current interval. This counter is incremented per packet departure. Every T_UPDATE, PIE calculates latency using the departure rate, which can be implemented using a single multiply operation. Note that many network devices keep track of an interface's departure rate. In this case, PIE might be able to reuse this information and simply skip the third step of the algorithm; hence, it would incur no extra cost. If a platform already leverages packet timestamps for other purposes, PIE can make use of these packet timestamps for latency calculation instead of estimating the departure rate.

如果选择执行离开率估计,PIE将使用计数器跟踪当前间隔内离开的字节数。此计数器在每次数据包离开时递增。每次T_更新,PIE都会使用出发率计算延迟,出发率可以通过单个乘法操作实现。请注意,许多网络设备跟踪接口的离开率。在这种情况下,PIE可以重用这些信息,只需跳过算法的第三步;因此,不会产生额外费用。如果平台已经将数据包时间戳用于其他目的,PIE可以利用这些数据包时间戳进行延迟计算,而不是估计离开率。

Flow queuing can also be combined with PIE to provide isolation between flows. In this case, it is preferable to have an independent value of drop probability per queue. This allows each flow to receive the most appropriate level of congestion signal and ensures that sparse flows are protected from experiencing packet drops. However, running the entire PIE algorithm independently on each queue in order to calculate the drop probability may be overkill. Furthermore, in the case where departure rate estimation is used to predict queuing latency, it is not possible to calculate an accurate per-queue departure rate upon which to implement the PIE drop probability calculation. Instead, it has been proposed [DOCSIS-AQM] that a single implementation of the PIE drop probability calculation based on the overall latency estimate be used, followed by a per-queue scaling of drop probability based on the ratio of queue depth between the queue in question and the current largest queue. This scaling is reasonably simple and has a couple of nice properties:

流队列还可以与PIE相结合,以提供流之间的隔离。在这种情况下,最好具有每个队列的丢弃概率的独立值。这允许每个流接收最适当级别的拥塞信号,并确保稀疏流受到保护,不会发生丢包。然而,在每个队列上独立地运行整个PIE算法以计算丢弃概率可能有些过分。此外,在使用离开率估计来预测排队等待时间的情况下,不可能计算准确的每队列离开率来实施饼图丢弃概率计算。相反,有人建议[DOCSIS-AQM]使用基于总延迟估计的PIE丢弃概率计算的单一实现,然后根据相关队列和当前最大队列之间的队列深度比率对丢弃概率进行每个队列的缩放。这种缩放相当简单,并具有两个很好的特性:

* If a packet is arriving to an empty queue, it is given immunity from packet drops altogether, regardless of the state of the other queues.

* 若一个数据包到达一个空队列,那个么不管其他队列的状态如何,它都可以完全不受数据包丢失的影响。

* In the situation where only a single queue is in use, the algorithm behaves exactly like the single-queue PIE algorithm.

* 在仅使用单个队列的情况下,该算法的行为与单队列饼图算法完全相同。

In summary, PIE is simple enough to be implemented in both software and hardware.

总之,PIE非常简单,可以在软件和硬件中实现。

7. Scope of Experimentation
7. 试验范围

The design of the PIE algorithm is presented in this document. The PIE algorithm effectively controls the average queuing latency to a target value. The following areas can be used for further study and experimentation:

本文介绍了PIE算法的设计。PIE算法有效地将平均排队延迟控制到目标值。以下领域可用于进一步的研究和实验:

* Autotuning of target latency without losing utilization.

* 在不损失利用率的情况下自动调整目标延迟。

* Autotuning for the average round-trip time of traffic.

* 自动调整交通的平均往返时间。

* The proper threshold to transition smoothly between ECN marking and dropping.

* 在ECN标记和下降之间平滑过渡的适当阈值。

* The enhancements described in Section 5, which can be used in experiments to see if they would be of more value in the real world. If so, they will be incorporated into the basic PIE algorithm.

* 第5节中描述的增强功能,可以在实验中使用,看看它们在现实世界中是否更有价值。如果是这样,它们将被合并到基本的饼图算法中。

* The PIE design, which is separated into the data path and the control path. The control path can be implemented in software. Field tests of other control laws can be performed to experiment with further improvements to PIE's performance.

* 饼图设计,分为数据路径和控制路径。控制路径可以在软件中实现。可以对其他控制律进行现场测试,以进一步改善PIE的性能。

Although all network nodes cannot be changed altogether to adopt latency-based AQM schemes such as PIE, a gradual adoption would eventually lead to end-to-end low-latency service for all applications.

尽管不能完全改变所有网络节点以采用基于延迟的AQM方案(如PIE),但逐渐采用最终将导致所有应用程序都能获得端到端的低延迟服务。

8. Incremental Deployment
8. 增量部署

From testbed experiments and large-scale simulations of PIE so far, PIE has been shown to be effective across a diverse range of network scenarios. There is no indication that PIE would be harmful to deploy.

到目前为止,从PIE的试验台实验和大规模模拟来看,PIE在各种网络场景中都被证明是有效的。没有迹象表明PIE的部署是有害的。

The PIE scheme can be independently deployed and managed without a need for interoperability between different network devices. In addition, any individual buffer queue can be incrementally upgraded to PIE, as it can coexist with existing AQM schemes such as Weighted RED (WRED).

PIE方案可以独立部署和管理,而不需要不同网络设备之间的互操作性。此外,任何单个缓冲区队列都可以增量升级为PIE,因为它可以与现有的AQM方案(如加权RED(WRED))共存。

PIE is intended to be self-configuring. Users should not need to configure any design parameters. Upon installation, the two user-configurable parameters -- QDELAY_REF and MAX_BURST -- will be defaulted to 15 milliseconds and 150 milliseconds for non-data-center network devices and to 15 microseconds and 150 microseconds for data-center switches, respectively.

PIE旨在进行自我配置。用户不需要配置任何设计参数。安装后,非数据中心网络设备的两个用户可配置参数QDELAY_REF和MAX_BURST的默认值分别为15毫秒和150毫秒,数据中心交换机的默认值分别为15微秒和150微秒。

Since the data path of the algorithm needs only a simple coin toss and the control-path calculation happens in a much slower time scale, we don't foresee any scaling issues associated with the algorithm as the link speed scales up.

由于该算法的数据路径只需要简单的掷硬币,并且控制路径的计算在慢得多的时间尺度上进行,因此,随着链路速度的提高,我们预计不会出现与该算法相关的任何缩放问题。

9. Security Considerations
9. 安全考虑

This document describes PIE, an active queue management algorithm based on implementations in different products. The PIE algorithm introduces no specific security exposures.

本文档描述了PIE,一种基于不同产品实现的主动队列管理算法。PIE算法没有引入特定的安全暴露。

10. References
10. 工具书类
10.1. Normative References
10.1. 规范性引用文件

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <http://www.rfc-editor.org/info/rfc2119>.

[RFC2119]Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,DOI 10.17487/RFC2119,1997年3月<http://www.rfc-editor.org/info/rfc2119>.

10.2. Informative References
10.2. 资料性引用

[RFC970] Nagle, J., "On Packet Switches With Infinite Storage", RFC 970, DOI 10.17487/RFC0970, December 1985, <http://www.rfc-editor.org/info/rfc970>.

[RFC970]Nagle,J.,“具有无限存储的分组交换机”,RFC 970,DOI 10.17487/RFC0970,1985年12月<http://www.rfc-editor.org/info/rfc970>.

[RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S., Wroclawski, J., and L. Zhang, "Recommendations on Queue Management and Congestion Avoidance in the Internet", RFC 2309, DOI 10.17487/RFC2309, April 1998, <http://www.rfc-editor.org/info/rfc2309>.

[RFC2309]Braden,B.,Clark,D.,Crowcroft,J.,Davie,B.,Deering,S.,Estrin,D.,Floyd,S.,Jacobson,V.,Minshall,G.,Partridge,C.,Peterson,L.,Ramakrishnan,K.,Shenker,S.,Wroclawski,J.,and L.Zhang,“关于互联网中队列管理和拥塞避免的建议”,RFC 2309,DOI 10.17487/RFC2309,1998年4月, <http://www.rfc-editor.org/info/rfc2309>.

[RFC7567] Baker, F., Ed., and G. Fairhurst, Ed., "IETF Recommendations Regarding Active Queue Management", BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015, <http://www.rfc-editor.org/info/rfc7567>.

[RFC7567]Baker,F.,Ed.,和G.Fairhurst,Ed.,“IETF关于主动队列管理的建议”,BCP 197,RFC 7567,DOI 10.17487/RFC7567,2015年7月<http://www.rfc-editor.org/info/rfc7567>.

[RFC7806] Baker, F. and R. Pan, "On Queuing, Marking, and Dropping", RFC 7806, DOI 10.17487/RFC7806, April 2016, <http://www.rfc-editor.org/info/rfc7806>.

[RFC7806]Baker,F.和R.Pan,“关于排队、标记和丢弃”,RFC 7806,DOI 10.17487/RFC7806,2016年4月<http://www.rfc-editor.org/info/rfc7806>.

[RFC8034] White, G. and R. Pan, "Active Queue Management (AQM) Based on Proportional Integral Controller Enhanced (PIE) for Data-Over-Cable Service Interface Specifications (DOCSIS) Cable Modems", RFC 8034, DOI 10.17487/RFC8034, February 2017, <http://www.rfc-editor.org/info/rfc8034>.

[RFC8034]White,G.和R.Pan,“基于比例积分控制器增强(PIE)的电缆数据服务接口规范(DOCSIS)电缆调制解调器的主动队列管理(AQM)”,RFC 8034,DOI 10.17487/RFC8034,2017年2月<http://www.rfc-editor.org/info/rfc8034>.

[CBQ] Cisco, "Class-Based Weighted Fair Queueing", <http://www.cisco.com/en/US/docs/ios/12_0t/12_0t5/ feature/guide/cbwfq.html>.

[CBQ]Cisco,“基于类的加权公平排队”<http://www.cisco.com/en/US/docs/ios/12_0t/12_0t5/ feature/guide/cbwfq.html>。

[CoDel] Nichols, K. and V. Jacobson, "Controlling Queue Delay", Communications of the ACM, Volume 55, Issue 7, pp. 42-50, DOI 10.1145/2209249.2209264, July 2012.

[CoDel]Nichols,K.和V.Jacobson,“控制队列延迟”,《ACM通讯》,第55卷,第7期,第42-50页,DOI 10.1145/2209249.2209264,2012年7月。

[DOCSIS_3.1] CableLabs, "MAC and Upper Layer Protocols Interface Specification", DOCSIS 3.1, January 2017, <https://apps.cablelabs.com/specification/ CM-SP-MULPIv3.1>.

[DOCSIS_3.1]CableLabs,“MAC和上层协议接口规范”,DOCSIS 3.12017年1月<https://apps.cablelabs.com/specification/ CM-SP-MULPIv3.1>。

[DOCSIS-AQM] White, G., "Active Queue Management in DOCSIS 3.x Cable Modems", May 2014, <http://www.cablelabs.com/wp-content/ uploads/2014/06/DOCSIS-AQM_May2014.pdf>.

[DOCSIS-AQM]White,G.“DOCSIS 3.x电缆调制解调器中的主动队列管理”,2014年5月<http://www.cablelabs.com/wp-content/ 上传/2014/06/DOCSIS-AQM_May2014.pdf>。

[ECN] Briscoe, B., Kaippallimalil, J., and P. Thaler, "Guidelines for Adding Congestion Notification to Protocols that Encapsulate IP", Work in Progress, draft-ietf-tsvwg-ecn-encap-guidelines-07, July 2016.

[ECN]Briscoe,B.,Kaippallimalil,J.,和P.Thaler,“向封装IP的协议添加拥塞通知的指南”,正在进行的工作,草稿-ietf-tsvwg-ECN-encap-Guidelines-07,2016年7月。

[HPSR-PIE] Pan, R., Natarajan, P., Piglione, C., Prabhu, M.S., Subramanian, V., Baker, F., and B. Ver Steeg, "PIE: A lightweight control scheme to address the bufferbloat problem", IEEE HPSR, DOI 10.1109/HPSR.2013.6602305, 2013, <https://www.researchgate.net/publication/ 261134127_PIE_A_lightweight_control_scheme_to_address_ the_bufferbloat_problem?origin=mail>.

[HPSR-PIE]Pan,R.,Natarajan,P.,Piglione,C.,Prabhu,M.S.,Subramanian,V.,Baker,F.,和B.Ver Steeg,“PIE:解决缓冲区膨胀问题的轻量级控制方案”,IEEE HPSR,DOI 10.1109/HPSR.2013.66023052013, <https://www.researchgate.net/publication/ 261134127_PIE_A_lightweight_control_scheme_to_address_缓冲区问题?origin=mail>。

[PI] Hollot, C.V., Misra, V., Towsley, D., and W. Gong, "On designing improved controllers for AQM routers supporting TCP flows", INFOCOM 2001, DOI 10.1109/INFCOM.2001.916670, April 2001.

[PI]Hollot,C.V.,Misra,V.,Towsley,D.,和W.Gong,“关于为支持TCP流的AQM路由器设计改进的控制器”,INFOCOM 2001,DOI 10.1109/INFCOM.2001.9166702001年4月。

[QCN] IEEE, "IEEE Standard for Local and Metropolitan Area Networks--Virtual Bridged Local Area Networks - Amendment: 10: Congestion Notification", IEEE 802.1Qau, <http://www.ieee802.org/1/pages/802.1au.html>.

[QCN]IEEE,“IEEE局域网和城域网标准——虚拟桥接局域网——修改件:10:拥塞通知”,IEEE 802.1Qau<http://www.ieee802.org/1/pages/802.1au.html>.

[RED] Floyd, S. and V. Jacobson, "Random Early Detection (RED) Gateways for Congestion Avoidance", IEEE/ACM Transactions on Networking, Volume 1, Issue 4, DOI 10.1109/90.251892, August 1993.

[RED]Floyd,S.和V.Jacobson,“用于避免拥塞的随机早期检测(RED)网关”,IEEE/ACM网络事务,第1卷,第4期,DOI 10.1109/90.251892,1993年8月。

[TCP-Models] Misra, V., Gong, W., and D. Towsley, "Fluid-based analysis of a network of AQM routers supporting TCP flows with an application to RED", SIGCOMM 2000, Volume 30, Issue 4, pp. 151-160, DOI 10.1145/347057.347421, October 2000.

[TCP模型]Misra,V.,Gong,W.,和D.Towsley,“支持TCP流的AQM路由器网络的基于流体的分析,应用到RED”,SIGCOMM 2000,第30卷,第4期,第151-160页,DOI 10.1145/347057.347421,2000年10月。

Appendix A. The Basic PIE Pseudocode
附录A.基本PIE伪代码
   Configurable parameters:
      -  QDELAY_REF.  AQM Latency Target (default: 15 milliseconds)
      -  MAX_BURST.  AQM Max Burst Allowance (default: 150 milliseconds)
        
   Configurable parameters:
      -  QDELAY_REF.  AQM Latency Target (default: 15 milliseconds)
      -  MAX_BURST.  AQM Max Burst Allowance (default: 150 milliseconds)
        
   Internal parameters:
      -  Weights in the drop probability calculation (1/s):
         alpha (default: 1/8), beta (default: 1 + 1/4)
      -  T_UPDATE: a period to calculate drop probability
         (default: 15 milliseconds)
        
   Internal parameters:
      -  Weights in the drop probability calculation (1/s):
         alpha (default: 1/8), beta (default: 1 + 1/4)
      -  T_UPDATE: a period to calculate drop probability
         (default: 15 milliseconds)
        
   Table that stores status variables (ending with "_"):
      -  burst_allowance_: current burst allowance
      -  drop_prob_: The current packet drop probability.  Reset to 0
      -  qdelay_old_: The previous queue delay.  Reset to 0
        
   Table that stores status variables (ending with "_"):
      -  burst_allowance_: current burst allowance
      -  drop_prob_: The current packet drop probability.  Reset to 0
      -  qdelay_old_: The previous queue delay.  Reset to 0
        
   Public/system functions:
      -  queue_.  Holds the pending packets
      -  drop(packet).  Drops/discards a packet
      -  now().  Returns the current time
      -  random().  Returns a uniform r.v. in the range 0 ~ 1
      -  queue_.byte_length().  Returns current queue_ length in bytes
      -  queue_.enque(packet).  Adds packet to tail of queue_
      -  queue_.deque().  Returns the packet from the head of queue_
      -  packet.size().  Returns size of packet
      -  packet.timestamp_delay().  Returns timestamped packet latency
        
   Public/system functions:
      -  queue_.  Holds the pending packets
      -  drop(packet).  Drops/discards a packet
      -  now().  Returns the current time
      -  random().  Returns a uniform r.v. in the range 0 ~ 1
      -  queue_.byte_length().  Returns current queue_ length in bytes
      -  queue_.enque(packet).  Adds packet to tail of queue_
      -  queue_.deque().  Returns the packet from the head of queue_
      -  packet.size().  Returns size of packet
      -  packet.timestamp_delay().  Returns timestamped packet latency
        
   ============================
        
   ============================
        
   //Called on each packet arrival
     enque(Packet packet) {
          if (PIE->drop_prob_ == 0 && current_qdelay < QDELAY_REF/2
              && PIE->qdelay_old_ < QDELAY_REF/2) {
              PIE->burst_allowance_ = MAX_BURST;
          }
          if (PIE->burst_allowance_ == 0 && drop_early() == DROP) {
                   drop(packet);
          } else {
                   queue_.enque(packet);
          }
     }
        
   //Called on each packet arrival
     enque(Packet packet) {
          if (PIE->drop_prob_ == 0 && current_qdelay < QDELAY_REF/2
              && PIE->qdelay_old_ < QDELAY_REF/2) {
              PIE->burst_allowance_ = MAX_BURST;
          }
          if (PIE->burst_allowance_ == 0 && drop_early() == DROP) {
                   drop(packet);
          } else {
                   queue_.enque(packet);
          }
     }
        
   ============================
        
   ============================
        

drop_early() {

早退{

         //Safeguard PIE to be work conserving
         if ( (PIE->qdelay_old_ < QDELAY_REF/2 && PIE->drop_prob_ < 0.2)
               || (queue_.byte_length() <= 2 * MEAN_PKTSIZE) ) {
              return ENQUE;
         }
        
         //Safeguard PIE to be work conserving
         if ( (PIE->qdelay_old_ < QDELAY_REF/2 && PIE->drop_prob_ < 0.2)
               || (queue_.byte_length() <= 2 * MEAN_PKTSIZE) ) {
              return ENQUE;
         }
        
         double u = random();
         if (u < PIE->drop_prob_) {
              return DROP;
         } else {
        
         double u = random();
         if (u < PIE->drop_prob_) {
              return DROP;
         } else {
        
              return ENQUE;
         }
      }
        
              return ENQUE;
         }
      }
        
   ============================
        
   ============================
        
   //We choose the timestamp option of obtaining latency for clarity
   //Rate estimation method can be found in the extended PIE pseudocode
        
   //We choose the timestamp option of obtaining latency for clarity
   //Rate estimation method can be found in the extended PIE pseudocode
        

deque(Packet packet) {

deque(数据包){

       current_qdelay = packet.timestamp_delay();
     }
        
       current_qdelay = packet.timestamp_delay();
     }
        
   ============================
        
   ============================
        
   //Update periodically, T_UPDATE = 15 milliseconds
        
   //Update periodically, T_UPDATE = 15 milliseconds
        

calculate_drop_prob() {

计算落差{

//Can be implemented using integer multiply

//可以使用整数乘法实现

          p = alpha * (current_qdelay - QDELAY_REF) + \
              beta * (current_qdelay - PIE->qdelay_old_);
        
          p = alpha * (current_qdelay - QDELAY_REF) + \
              beta * (current_qdelay - PIE->qdelay_old_);
        
          if (PIE->drop_prob_ < 0.000001) {
              p /= 2048;
          } else if (PIE->drop_prob_ < 0.00001) {
              p /= 512;
          } else if (PIE->drop_prob_ < 0.0001) {
              p /= 128;
          } else if (PIE->drop_prob_ < 0.001) {
              p /= 32;
          } else if (PIE->drop_prob_ < 0.01) {
              p /= 8;
        
          if (PIE->drop_prob_ < 0.000001) {
              p /= 2048;
          } else if (PIE->drop_prob_ < 0.00001) {
              p /= 512;
          } else if (PIE->drop_prob_ < 0.0001) {
              p /= 128;
          } else if (PIE->drop_prob_ < 0.001) {
              p /= 32;
          } else if (PIE->drop_prob_ < 0.01) {
              p /= 8;
        
          } else if (PIE->drop_prob_ < 0.1) {
              p /= 2;
          } else {
              p = p;
          }
        
          } else if (PIE->drop_prob_ < 0.1) {
              p /= 2;
          } else {
              p = p;
          }
        
          PIE->drop_prob_ += p;
        
          PIE->drop_prob_ += p;
        
          //Exponentially decay drop prob when congestion goes away
          if (current_qdelay == 0 && PIE->qdelay_old_ == 0) {
              PIE->drop_prob_ *= 0.98;           //1 - 1/64 is
                                                 //sufficient
          }
        
          //Exponentially decay drop prob when congestion goes away
          if (current_qdelay == 0 && PIE->qdelay_old_ == 0) {
              PIE->drop_prob_ *= 0.98;           //1 - 1/64 is
                                                 //sufficient
          }
        
          //Bound drop probability
          if (PIE->drop_prob_ < 0)
                   PIE->drop_prob_ = 0.0
          if (PIE->drop_prob_ > 1)
                   PIE->drop_prob_ = 1.0
        
          //Bound drop probability
          if (PIE->drop_prob_ < 0)
                   PIE->drop_prob_ = 0.0
          if (PIE->drop_prob_ > 1)
                   PIE->drop_prob_ = 1.0
        
          PIE->qdelay_old_ = current_qdelay;
        
          PIE->qdelay_old_ = current_qdelay;
        
          PIE->burst_allowance_ =
             max(0,PIE->burst_allowance_ - T_UPDATE);
       }
   }
        
          PIE->burst_allowance_ =
             max(0,PIE->burst_allowance_ - T_UPDATE);
       }
   }
        
Appendix B. Pseudocode for PIE with Optional Enhancement
附录B.具有可选增强功能的PIE伪代码
   Configurable parameters:
      -  QDELAY_REF.  AQM Latency Target (default: 15 milliseconds)
      -  MAX_BURST.  AQM Max Burst Allowance (default: 150 milliseconds)
      -  MAX_ECNTH.  AQM Max ECN Marking Threshold (default: 10%)
        
   Configurable parameters:
      -  QDELAY_REF.  AQM Latency Target (default: 15 milliseconds)
      -  MAX_BURST.  AQM Max Burst Allowance (default: 150 milliseconds)
      -  MAX_ECNTH.  AQM Max ECN Marking Threshold (default: 10%)
        
   Internal parameters:
      -  Weights in the drop probability calculation (1/s):
         alpha (default: 1/8), beta (default: 1 + 1/4)
      -  DQ_THRESHOLD: (in bytes, default: 2^14 (in a power of 2) )
      -  T_UPDATE: a period to calculate drop probability
         (default: 15 milliseconds)
      -  TAIL_DROP: the tail drop threshold (max allowed queue depth)
         for the queue
        
   Internal parameters:
      -  Weights in the drop probability calculation (1/s):
         alpha (default: 1/8), beta (default: 1 + 1/4)
      -  DQ_THRESHOLD: (in bytes, default: 2^14 (in a power of 2) )
      -  T_UPDATE: a period to calculate drop probability
         (default: 15 milliseconds)
      -  TAIL_DROP: the tail drop threshold (max allowed queue depth)
         for the queue
        
   Table that stores status variables (ending with "_"):
      -  active_: INACTIVE/ACTIVE
      -  burst_allowance_: current burst allowance
      -  drop_prob_: The current packet drop probability.  Reset to 0
      -  accu_prob_: Accumulated drop probability.  Reset to 0
      -  qdelay_old_: The previous queue delay estimate.  Reset to 0
      -  last_timestamp_: Timestamp of previous status update
      -  dq_count_, measurement_start_, in_measurement_, avg_dq_time_.
         Variables for measuring average dequeue rate
        
   Table that stores status variables (ending with "_"):
      -  active_: INACTIVE/ACTIVE
      -  burst_allowance_: current burst allowance
      -  drop_prob_: The current packet drop probability.  Reset to 0
      -  accu_prob_: Accumulated drop probability.  Reset to 0
      -  qdelay_old_: The previous queue delay estimate.  Reset to 0
      -  last_timestamp_: Timestamp of previous status update
      -  dq_count_, measurement_start_, in_measurement_, avg_dq_time_.
         Variables for measuring average dequeue rate
        
   Public/system functions:
      -  queue_.  Holds the pending packets
      -  drop(packet).  Drops/discards a packet
      -  mark(packet).  Marks ECN for a packet
      -  now().  Returns the current time
      -  random().  Returns a uniform r.v. in the range 0 ~ 1
      -  queue_.byte_length().  Returns current queue_ length in bytes
      -  queue_.enque(packet).  Adds packet to tail of queue_
      -  queue_.deque().  Returns the packet from the head of queue_
      -  packet.size().  Returns size of packet
      -  packet.ecn().  Returns whether packet is ECN capable or not
        
   Public/system functions:
      -  queue_.  Holds the pending packets
      -  drop(packet).  Drops/discards a packet
      -  mark(packet).  Marks ECN for a packet
      -  now().  Returns the current time
      -  random().  Returns a uniform r.v. in the range 0 ~ 1
      -  queue_.byte_length().  Returns current queue_ length in bytes
      -  queue_.enque(packet).  Adds packet to tail of queue_
      -  queue_.deque().  Returns the packet from the head of queue_
      -  packet.size().  Returns size of packet
      -  packet.ecn().  Returns whether packet is ECN capable or not
        
   ============================
        
   ============================
        
   //Called on each packet arrival
     enque(Packet packet) {
          if (queue_.byte_length() + packet.size() > TAIL_DROP) {
                 drop(packet);
                 PIE->accu_prob_ = 0;
          } else if (PIE->active_ == TRUE && drop_early() == DROP
                     && PIE->burst_allowance_ == 0) {
                 if (PIE->drop_prob_ < MAX_ECNTH && packet.ecn() ==
                     TRUE)
                       mark(packet);
                 else
                       drop(packet);
                       PIE->accu_prob_ = 0;
          } else {
                 queue_.enque(packet);
          }
        
   //Called on each packet arrival
     enque(Packet packet) {
          if (queue_.byte_length() + packet.size() > TAIL_DROP) {
                 drop(packet);
                 PIE->accu_prob_ = 0;
          } else if (PIE->active_ == TRUE && drop_early() == DROP
                     && PIE->burst_allowance_ == 0) {
                 if (PIE->drop_prob_ < MAX_ECNTH && packet.ecn() ==
                     TRUE)
                       mark(packet);
                 else
                       drop(packet);
                       PIE->accu_prob_ = 0;
          } else {
                 queue_.enque(packet);
          }
        
          //If the queue is over a certain threshold, turn on PIE
          if (PIE->active_ == INACTIVE
              && queue_.byte_length() >= TAIL_DROP/3) {
               PIE->active_ = ACTIVE;
               PIE->qdelay_old_ = 0;
               PIE->drop_prob_ = 0;
               PIE->in_measurement_ = TRUE;
               PIE->dq_count_ = 0;
               PIE->avg_dq_time_ = 0;
               PIE->last_timestamp_ = now;
               PIE->burst_allowance_ = MAX_BURST;
               PIE->accu_prob_ = 0;
               PIE->measurement_start_ = now;
          }
        
          //If the queue is over a certain threshold, turn on PIE
          if (PIE->active_ == INACTIVE
              && queue_.byte_length() >= TAIL_DROP/3) {
               PIE->active_ = ACTIVE;
               PIE->qdelay_old_ = 0;
               PIE->drop_prob_ = 0;
               PIE->in_measurement_ = TRUE;
               PIE->dq_count_ = 0;
               PIE->avg_dq_time_ = 0;
               PIE->last_timestamp_ = now;
               PIE->burst_allowance_ = MAX_BURST;
               PIE->accu_prob_ = 0;
               PIE->measurement_start_ = now;
          }
        
          //If the queue has been idle for a while, turn off PIE
          //Reset counters when accessing the queue after some idle
          //period if PIE was active before
          if ( PIE->drop_prob_ == 0 && PIE->qdelay_old_ == 0
               && current_qdelay == 0) {
               PIE->active_ = INACTIVE;
               PIE->in_measurement_ = FALSE;
          }
        
          //If the queue has been idle for a while, turn off PIE
          //Reset counters when accessing the queue after some idle
          //period if PIE was active before
          if ( PIE->drop_prob_ == 0 && PIE->qdelay_old_ == 0
               && current_qdelay == 0) {
               PIE->active_ = INACTIVE;
               PIE->in_measurement_ = FALSE;
          }
        

}

}

   ============================
        
   ============================
        

drop_early() {

早退{

         //PIE is active but the queue is not congested: return ENQUE
         if ( (PIE->qdelay_old_ < QDELAY_REF/2 && PIE->drop_prob_ < 0.2)
               || (queue_.byte_length() <= 2 * MEAN_PKTSIZE) ) {
              return ENQUE;
         }
        
         //PIE is active but the queue is not congested: return ENQUE
         if ( (PIE->qdelay_old_ < QDELAY_REF/2 && PIE->drop_prob_ < 0.2)
               || (queue_.byte_length() <= 2 * MEAN_PKTSIZE) ) {
              return ENQUE;
         }
        
         if (PIE->drop_prob_ == 0) {
                  PIE->accu_prob_ = 0;
         }
        
         if (PIE->drop_prob_ == 0) {
                  PIE->accu_prob_ = 0;
         }
        
         //For practical reasons, drop probability can be further scaled
         //according to packet size, but one needs to set a bound to
         //avoid unnecessary bias
        
         //For practical reasons, drop probability can be further scaled
         //according to packet size, but one needs to set a bound to
         //avoid unnecessary bias
        
         //Random drop
         PIE->accu_prob_ += PIE->drop_prob_;
         if (PIE->accu_prob_ < 0.85)
             return ENQUE;
         if (PIE->accu_prob_ >= 8.5)
             return DROP;
                 double u = random();
         if (u < PIE->drop_prob_) {
                      PIE->accu_prob_ = 0;
                      return DROP;
         } else {
                      return ENQUE;
         }
      }
        
         //Random drop
         PIE->accu_prob_ += PIE->drop_prob_;
         if (PIE->accu_prob_ < 0.85)
             return ENQUE;
         if (PIE->accu_prob_ >= 8.5)
             return DROP;
                 double u = random();
         if (u < PIE->drop_prob_) {
                      PIE->accu_prob_ = 0;
                      return DROP;
         } else {
                      return ENQUE;
         }
      }
        
   ============================
        
   ============================
        
    //Update periodically, T_UPDATE = 15 milliseconds
    calculate_drop_prob() {
        if ( (now - PIE->last_timestamp_) >= T_UPDATE &&
                PIE->active_ == ACTIVE) {
        
    //Update periodically, T_UPDATE = 15 milliseconds
    calculate_drop_prob() {
        if ( (now - PIE->last_timestamp_) >= T_UPDATE &&
                PIE->active_ == ACTIVE) {
        
          //Can be implemented using integer multiply
          //DQ_THRESHOLD is power of 2 value
          current_qdelay = queue_.byte_length() *
          PIE->avg_dq_time_/DQ_THRESHOLD;
        
          //Can be implemented using integer multiply
          //DQ_THRESHOLD is power of 2 value
          current_qdelay = queue_.byte_length() *
          PIE->avg_dq_time_/DQ_THRESHOLD;
        
          p = alpha * (current_qdelay - QDELAY_REF) + \
              beta * (current_qdelay - PIE->qdelay_old_);
        
          p = alpha * (current_qdelay - QDELAY_REF) + \
              beta * (current_qdelay - PIE->qdelay_old_);
        
          if (PIE->drop_prob_ < 0.000001) {
              p /= 2048;
          } else if (PIE->drop_prob_ < 0.00001) {
              p /= 512;
          } else if (PIE->drop_prob_ < 0.0001) {
              p /= 128;
          } else if (PIE->drop_prob_ < 0.001) {
              p /= 32;
          } else if (PIE->drop_prob_ < 0.01) {
              p /= 8;
          } else if (PIE->drop_prob_ < 0.1) {
              p /= 2;
          } else {
              p = p;
          }
        
          if (PIE->drop_prob_ < 0.000001) {
              p /= 2048;
          } else if (PIE->drop_prob_ < 0.00001) {
              p /= 512;
          } else if (PIE->drop_prob_ < 0.0001) {
              p /= 128;
          } else if (PIE->drop_prob_ < 0.001) {
              p /= 32;
          } else if (PIE->drop_prob_ < 0.01) {
              p /= 8;
          } else if (PIE->drop_prob_ < 0.1) {
              p /= 2;
          } else {
              p = p;
          }
        
          if (PIE->drop_prob_ >= 0.1 && p > 0.02) {
              p = 0.02;
          }
          PIE->drop_prob_ += p;
        
          if (PIE->drop_prob_ >= 0.1 && p > 0.02) {
              p = 0.02;
          }
          PIE->drop_prob_ += p;
        
          //Exponentially decay drop prob when congestion goes away
          if (current_qdelay < QDELAY_REF/2 && PIE->qdelay_old_ <
              QDELAY_REF/2) {
                 PIE->drop_prob_ *= 0.98;        //1 - 1/64 is
                                                 //sufficient
          }
        
          //Exponentially decay drop prob when congestion goes away
          if (current_qdelay < QDELAY_REF/2 && PIE->qdelay_old_ <
              QDELAY_REF/2) {
                 PIE->drop_prob_ *= 0.98;        //1 - 1/64 is
                                                 //sufficient
          }
        
          //Bound drop probability
          if (PIE->drop_prob_ < 0)
                   PIE->drop_prob_ = 0
          if (PIE->drop_prob_ > 1)
                   PIE->drop_prob_ = 1
        
          //Bound drop probability
          if (PIE->drop_prob_ < 0)
                   PIE->drop_prob_ = 0
          if (PIE->drop_prob_ > 1)
                   PIE->drop_prob_ = 1
        
          PIE->qdelay_old_ = current_qdelay;
          PIE->last_timestamp_ = now;
          PIE->burst_allowance_ = max(0,PIE->burst_allowance_ -
             T_UPDATE);
       }
   }
        
          PIE->qdelay_old_ = current_qdelay;
          PIE->last_timestamp_ = now;
          PIE->burst_allowance_ = max(0,PIE->burst_allowance_ -
             T_UPDATE);
       }
   }
        
   ============================
        
   ============================
        
   //Called on each packet departure
     deque(Packet packet) {
        
   //Called on each packet departure
     deque(Packet packet) {
        
        //Dequeue rate estimation
        if (PIE->in_measurement_ == TRUE) {
             PIE->dq_count_ = packet.size() + PIE->dq_count_;
             //Start a new measurement cycle if we have enough packets
             if ( PIE->dq_count_ >= DQ_THRESHOLD) {
               dq_time = now - PIE->measurement_start_;
               if (PIE->avg_dq_time_ == 0) {
                   PIE->avg_dq_time_ = dq_time;
               } else {
                   weight = DQ_THRESHOLD/2^16
                   PIE->avg_dq_time_ = dq_time * weight +
                      PIE->avg_dq_time_ * (1 - weight);
               }
               PIE->in_measurement_ = FALSE;
             }
        }
        
        //Dequeue rate estimation
        if (PIE->in_measurement_ == TRUE) {
             PIE->dq_count_ = packet.size() + PIE->dq_count_;
             //Start a new measurement cycle if we have enough packets
             if ( PIE->dq_count_ >= DQ_THRESHOLD) {
               dq_time = now - PIE->measurement_start_;
               if (PIE->avg_dq_time_ == 0) {
                   PIE->avg_dq_time_ = dq_time;
               } else {
                   weight = DQ_THRESHOLD/2^16
                   PIE->avg_dq_time_ = dq_time * weight +
                      PIE->avg_dq_time_ * (1 - weight);
               }
               PIE->in_measurement_ = FALSE;
             }
        }
        
        //Start a measurement if we have enough data in the queue
        if (queue_.byte_length() >= DQ_THRESHOLD &&
            PIE->in_measurement_ == FALSE) {
               PIE->in_measurement_ = TRUE;
               PIE->measurement_start_ = now;
               PIE->dq_count_ = 0;
        }
     }
        
        //Start a measurement if we have enough data in the queue
        if (queue_.byte_length() >= DQ_THRESHOLD &&
            PIE->in_measurement_ == FALSE) {
               PIE->in_measurement_ = TRUE;
               PIE->measurement_start_ = now;
               PIE->dq_count_ = 0;
        }
     }
        

Contributors

贡献者

Bill Ver Steeg Comcast Cable Email: William_VerSteeg@comcast.com

Bill Ver Steeg Comcast有线电视电子邮件:William_VerSteeg@comcast.com

Mythili Prabhu* Akamai Technologies 3355 Scott Blvd. Santa Clara, CA 95054 United States of America Email: mythili@akamai.com

Mythili Prabhu*Akamai Technologies斯科特大道3355号。加利福尼亚州圣克拉拉市95054美利坚合众国电子邮件:mythili@akamai.com

Chiara Piglione* Broadcom Corporation 3151 Zanker Road San Jose, CA 95134 United States of America Email: chiara@broadcom.com

Chiara Piglione*美国加利福尼亚州圣何塞市赞克路3151号博通公司95134电子邮件:chiara@broadcom.com

Vijay Subramanian* PLUMgrid, Inc. 350 Oakmead Parkway Suite 250 Sunnyvale, CA 94085 United States of America Email: vns@plumgrid.com * Formerly at Cisco Systems

Vijay Subramanian*PLUMgrid,Inc.加利福尼亚州桑尼维尔市奥克米德公园大道350号套房,邮编94085美利坚合众国电子邮件:vns@plumgrid.com*以前在思科系统公司

Authors' Addresses

作者地址

Rong Pan Cisco Systems 3625 Cisco Way San Jose, CA 95134 United States of America

Rong Pan Cisco Systems 3625 Cisco Way San Jose,CA 95134美利坚合众国

   Email: ropan@cisco.com
        
   Email: ropan@cisco.com
        

Preethi Natarajan Cisco Systems 725 Alder Drive Milpitas, CA 95035 United States of America

美国加利福尼亚州米尔皮塔斯奥尔德大道725号Preethi Natarajan思科系统公司,邮编95035

   Email: prenatar@cisco.com
        
   Email: prenatar@cisco.com
        

Fred Baker Santa Barbara, CA 93117 United States of America

Fred Baker Santa Barbara,加利福尼亚州,美利坚合众国,93117

   Email: FredBaker.IETF@gmail.com
        
   Email: FredBaker.IETF@gmail.com
        

Greg White CableLabs 858 Coal Creek Circle Louisville, CO 80027 United States of America

Greg White CableLabs 858美国科罗拉多州路易斯维尔市煤溪圈80027

   Email: g.white@cablelabs.com
        
   Email: g.white@cablelabs.com