Network Working Group                                      S. Shanmugham
Request for Comments: 4463                           Cisco Systems, Inc.
Category: Informational                                        P. Monaco
                                                   Nuance Communications
                                                              B. Eberman
                                                        Speechworks Inc.
                                                              April 2006
        

A Media Resource Control Protocol (MRCP) Developed by Cisco, Nuance, and Speechworks

Status of This Memo

This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2006).

IESG Note

This RFC is not a candidate for any level of Internet Standard. The IETF disclaims any knowledge of the fitness of this RFC for any purpose and in particular notes that the decision to publish is not based on IETF review for such things as security, congestion control, or inappropriate interaction with deployed protocols. The RFC Editor has chosen to publish this document at its discretion. Readers of this document should exercise caution in evaluating its value for implementation and deployment. See RFC 3932 for more information.

Note that this document uses a MIME type 'application/mrcp' which has not been registered with the IANA, and is therefore not recognized as a standard IETF MIME type. The historical value of this document as an ancestor to ongoing standardization in this space, however, makes the publication of this document meaningful.

Abstract

This document describes a Media Resource Control Protocol (MRCP) that was developed jointly by Cisco Systems, Inc., Nuance Communications, and Speechworks, Inc. It is published as an RFC as input for further IETF development in this area.

MRCP controls media service resources like speech synthesizers, recognizers, signal generators, signal detectors, fax servers, etc., over a network. This protocol is designed to work with streaming protocols like RTSP (Real Time Streaming Protocol) or SIP (Session Initiation Protocol), which help establish control connections to external media streaming devices, and media delivery mechanisms like RTP (Real-time Transport Protocol).

Table of Contents

   1. Introduction
   2. Architecture
      2.1. Resources and Services
      2.2. Server and Resource Addressing
   3. MRCP Protocol Basics
      3.1. Establishing Control Session and Media Streams
      3.2. MRCP over RTSP
      3.3. Media Streams and RTP Ports
   4. Notational Conventions
   5. MRCP Specification
      5.1. Request
      5.2. Response
      5.3. Event
      5.4. Message Headers
   6. Media Server
      6.1. Media Server Session
   7. Speech Synthesizer Resource
      7.1. Synthesizer State Machine
      7.2. Synthesizer Methods
      7.3. Synthesizer Events
      7.4. Synthesizer Header Fields
      7.5. Synthesizer Message Body
      7.6. SET-PARAMS
      7.7. GET-PARAMS
      7.8. SPEAK
      7.9. STOP
      7.10. BARGE-IN-OCCURRED
      7.11. PAUSE
      7.12. RESUME
      7.13. CONTROL
      7.14. SPEAK-COMPLETE
      7.15. SPEECH-MARKER
   8. Speech Recognizer Resource
      8.1. Recognizer State Machine
      8.2. Recognizer Methods
      8.3. Recognizer Events
      8.4. Recognizer Header Fields
      8.5. Recognizer Message Body
      8.6. SET-PARAMS
      8.7. GET-PARAMS
      8.8. DEFINE-GRAMMAR
      8.9. RECOGNIZE
      8.10. STOP
      8.11. GET-RESULT
      8.12. START-OF-SPEECH
      8.13. RECOGNITION-START-TIMERS
      8.14. RECOGNITION-COMPLETE
      8.15. DTMF Detection
   9. Future Study
   10. Security Considerations
   11. RTSP-Based Examples
   12. Informative References
   Appendix A. ABNF Message Definitions
   Appendix B. Acknowledgements
        
1. Introduction

The Media Resource Control Protocol (MRCP) is designed to provide a mechanism for a client device requiring audio/video stream processing to control processing resources on the network. These media processing resources may be speech recognizers (a.k.a. Automatic-Speech-Recognition (ASR) engines), speech synthesizers (a.k.a. Text-To-Speech (TTS) engines), fax, signal detectors, etc. MRCP allows implementation of distributed Interactive Voice Response platforms, for example VoiceXML [6] interpreters. The MRCP protocol defines the requests, responses, and events needed to control the media processing resources. The MRCP protocol defines the state machine for each resource and the required state transitions for each request and server-generated event.

The MRCP protocol does not address how the control session is established with the server and relies on the Real Time Streaming Protocol (RTSP) [2] to establish and maintain the session. The session control protocol is also responsible for establishing the media connection from the client to the network server. MRCP messages are designed to be carried over RTSP or another protocol as a MIME type, in the same way that the Session Description Protocol (SDP) [5] is carried.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [8].

2. Architecture

The system consists of a client that requires media streams generated or needs media streams processed and a server that has the resources or devices to process or generate the streams. The client establishes a control session with the server for media processing using a protocol such as RTSP. This will also set up and establish the RTP stream between the client and the server or another RTP endpoint. Each resource needed in processing or generating the stream is addressed or referred to by a URL. The client can now use MRCP messages to control the media resources and affect how they process or generate the media stream.

     |--------------------|
     ||------------------||                   |----------------------|
     || Application Layer||                   ||--------------------||
     ||------------------||                   || TTS  | ASR  | Fax  ||
     ||  ASR/TTS API     ||                   ||Plugin|Plugin|Plugin||
     ||------------------||                   ||  on  |  on  |  on  ||
     ||    MRCP Core     ||                   || MRCP | MRCP | MRCP ||
     ||  Protocol Stack  ||                   ||--------------------||
     ||------------------||                   ||   RTSP Stack       ||
     ||   RTSP Stack     ||                   ||                    ||
     ||------------------||                   ||--------------------||
     ||   TCP/IP Stack   ||========IP=========||  TCP/IP Stack      ||
     ||------------------||                   ||--------------------||
     |--------------------|                   |----------------------|
        
          MRCP client                        Real-time Streaming MRCP
                                                   media server

2.1. Resources and Services

The server is set up to offer a certain set of resources and services to the client. These resources are of three types.

Transmission Resources

These are resources that are capable of generating real-time streams, like signal generators that generate tones and sounds of certain frequencies and patterns, and speech synthesizers that generate spoken audio streams, etc.

Reception Resources

These are resources that receive and process streaming data like signal detectors and speech recognizers.

Dual Mode Resources

These are resources that both send and receive data like a fax resource, capable of sending or receiving fax through a two-way RTP stream.

2.2. Server and Resource Addressing

The server as a whole is addressed using a container URL, and the individual resources the server has to offer are reached by individual resource URLs within the container URL.

RTSP Example:

A media server or container URL like,

     rtsp://mediaserver.com/media/
        

may contain one or more resource URLs of the form,

     rtsp://mediaserver.com/media/speechrecognizer/
     rtsp://mediaserver.com/media/speechsynthesizer/
     rtsp://mediaserver.com/media/fax/
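
The relationship between the container URL and its resource URLs can be sketched in a few lines of Python. This is only an illustration: the helper name `resource_url` is hypothetical, and the resource names are simply the ones used in the example above.

```python
def resource_url(container_url: str, resource: str) -> str:
    """Build a resource URL inside a container URL.

    Exactly one '/' is kept between the container and the resource
    name, and the result ends with '/', as in the example URLs.
    """
    return container_url.rstrip("/") + "/" + resource.strip("/") + "/"


if __name__ == "__main__":
    container = "rtsp://mediaserver.com/media/"
    for name in ("speechrecognizer", "speechsynthesizer", "fax"):
        print(resource_url(container, name))
```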
        
3. MRCP Protocol Basics

The message format for MRCP is text based, with mechanisms to carry embedded binary data. This allows data like recognition grammars, recognition results, synthesizer speech markup, etc., to be carried in the MRCP message between the client and the server resource. The protocol does not address session control management, media management, reliable sequencing and delivery, or server or resource addressing. These are left to a protocol like SIP or RTSP. MRCP addresses the issue of controlling and communicating with the resource processing the stream, and defines the requests, responses, and events needed to do that.

3.1. Establishing Control Session and Media Streams

The control session between the client and the server is established using a protocol like RTSP. This protocol will also set up the appropriate RTP streams between the server and the client, allocating ports and setting up transport parameters as needed. Each control session is identified by a unique session-id. The format, usage, and life cycle of the session-id are in accordance with the RTSP protocol. The resources within the session are addressed by the individual resource URLs.

The MRCP protocol is designed to work with and tunnel through another protocol like RTSP, and augment its capabilities. MRCP relies on RTSP headers for sequencing, reliability, and addressing to make sure that messages get delivered reliably and in the correct order and to the right resource. The MRCP messages are carried in the RTSP message body. The media server delivers the MRCP message to the appropriate resource or device by looking at the session-level message headers and URL information. Another protocol, such as SIP [4], could be used for tunneling MRCP messages.

3.2. MRCP over RTSP

RTSP supports both TCP and UDP as transports between the client and the server; the transport in use is indicated by the RTSP URL. All MRCP based media servers MUST support TCP for transport and MAY support UDP.

In RTSP, the ANNOUNCE method/response MUST be used to carry MRCP request/responses between the client and the server. MRCP messages MUST NOT be communicated in the RTSP SETUP or TEARDOWN messages.

Currently all RTSP messages are request/responses and there is no support for asynchronous events in RTSP. This is because RTSP was designed to work over TCP or UDP and, hence, could not assume reliability in the underlying protocol. Hence, when using MRCP over RTSP, an asynchronous event from the MRCP server is packaged in a server-initiated ANNOUNCE method/response communication. A future RTSP extension to send asynchronous events from the server to the client would provide an alternate vehicle to carry such asynchronous MRCP events from the server.

An RTSP session is created when an RTSP SETUP message is sent from the client to a server and is addressed to a server URL or any one of its resource URLs without specifying a session-id. The server will establish a session context and will respond with a session-id to the client. This sequence will also set up the RTP transport parameters between the client and the server, and then the server will be ready to receive or send media streams. If the client wants to attach an additional resource to an existing session, the client should send that session's ID in the subsequent SETUP message.

When a media server implementing MRCP over RTSP receives a PLAY, RECORD, or PAUSE RTSP method addressed to an MRCP resource URL, it should respond with an RTSP 405 "Method Not Allowed" response. For these resources, the only allowed RTSP methods are SETUP, TEARDOWN, DESCRIBE, and ANNOUNCE.
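
A server-side check for this rule might look like the following Python sketch. The function name `check_rtsp_method` is hypothetical, and rejecting every method outside the four allowed ones (not only PLAY, RECORD, and PAUSE) is an assumption based on the sentence above.

```python
from typing import Optional, Tuple

# RTSP methods permitted on an MRCP resource URL, per the text above.
ALLOWED_METHODS = frozenset({"SETUP", "TEARDOWN", "DESCRIBE", "ANNOUNCE"})


def check_rtsp_method(method: str) -> Optional[Tuple[int, str]]:
    """Return None if the method may proceed, else the RTSP error to send.

    Assumption: any method outside ALLOWED_METHODS is answered with 405.
    """
    if method in ALLOWED_METHODS:
        return None
    return (405, "Method Not Allowed")


if __name__ == "__main__":
    for m in ("ANNOUNCE", "PLAY", "RECORD", "PAUSE"):
        print(m, check_rtsp_method(m))
```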

Example 1:

   C->S:  ANNOUNCE rtsp://media.server.com/media/synthesizer RTSP/1.0
          CSeq:4
          Session:12345678
          Content-Type:application/mrcp
          Content-Length:223

          SPEAK 543257 MRCP/1.0
          Voice-gender:neutral
          Voice-category:teenager
          Prosody-volume:medium
          Content-Type:application/synthesis+ssml
          Content-Length:104

          <?xml version="1.0"?>
          <speak>
           <paragraph>
             <sentence>You have 4 new messages.</sentence>
             <sentence>The first is from <say-as
             type="name">Stephanie Williams</say-as>
             and arrived at <break/>
             <say-as type="time">3:45pm</say-as>.</sentence>
             <sentence>The subject is <prosody
             rate="-20%">ski trip</prosody></sentence>
           </paragraph>
          </speak>

   S->C:  RTSP/1.0 200 OK
          CSeq: 4
          Session:12345678
          RTP-Info:url=rtsp://media.server.com/media/synthesizer;
                    seq=9810092;rtptime=3450012
          Content-Type:application/mrcp
          Content-Length:52

          MRCP/1.0 543257 200 IN-PROGRESS

   S->C:  ANNOUNCE rtsp://media.server.com/media/synthesizer RTSP/1.0
          CSeq:6
          Session:12345678
          Content-Type:application/mrcp
          Content-Length:123

          SPEAK-COMPLETE 543257 COMPLETE MRCP/1.0

   C->S:  RTSP/1.0 200 OK
          CSeq:6

For the sake of brevity, most examples from here on show only the MRCP messages and do not show the RTSP messages and headers in which they are tunneled. RTSP messages that do not carry an MRCP message, such as plain responses, are also left out.
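
To make the tunneling in Example 1 concrete, the following Python sketch wraps an already-encoded MRCP message in an RTSP ANNOUNCE request. The function name `wrap_in_announce` and its parameters are hypothetical; only the header names and their layout are taken from the example, and the Content-Length is computed from the body rather than copied from it.

```python
def wrap_in_announce(resource_url: str, cseq: int, session_id: str,
                     mrcp_message: bytes) -> bytes:
    """Tunnel an MRCP message in the body of an RTSP ANNOUNCE request."""
    header_lines = [
        f"ANNOUNCE {resource_url} RTSP/1.0",
        f"CSeq:{cseq}",
        f"Session:{session_id}",
        "Content-Type:application/mrcp",
        # The RTSP Content-Length counts the octets of the whole MRCP message.
        f"Content-Length:{len(mrcp_message)}",
    ]
    # An empty line separates the RTSP headers from the tunneled message.
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + mrcp_message


if __name__ == "__main__":
    event = b"SPEAK-COMPLETE 543257 COMPLETE MRCP/1.0\r\n\r\n"
    request = wrap_in_announce(
        "rtsp://media.server.com/media/synthesizer", 6, "12345678", event)
    print(request.decode("ascii"))
```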

3.3. Media Streams and RTP Ports

A single set of RTP/RTCP ports is negotiated and shared between the MRCP client and server when multiple media processing resources, such as automatic speech recognition (ASR) engines and text to speech (TTS) engines, are used for a single session. The individual resource instances allocated on the server under a common session identifier will feed from/to that single RTP stream.

The client can send multiple media streams towards the server, differentiated by using different synchronized source (SSRC) identifier values. Similarly the server can use multiple Synchronized Source (SSRC) identifier values to differentiate media streams originating from the individual transmission resource URLs if more than one exists. The individual resources may, on the other hand, work together to send just one stream to the client. This is up to the implementation of the media server.
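
As a rough illustration of telling streams apart by SSRC, the Python sketch below groups received RTP packets by the SSRC field, which occupies octets 8 through 11 of the fixed RTP header. The function names are hypothetical, and the sketch ignores everything else in the packet (payload type, CSRC list, header extensions).

```python
import struct
from collections import defaultdict
from typing import Dict, Iterable, List


def rtp_ssrc(packet: bytes) -> int:
    """Extract the 32-bit SSRC from the fixed 12-octet RTP header."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-octet fixed header")
    # Network byte order, unsigned 32-bit integer at offset 8.
    return struct.unpack("!I", packet[8:12])[0]


def group_by_ssrc(packets: Iterable[bytes]) -> Dict[int, List[bytes]]:
    """Split packets arriving on one RTP port into per-SSRC streams."""
    streams: Dict[int, List[bytes]] = defaultdict(list)
    for packet in packets:
        streams[rtp_ssrc(packet)].append(packet)
    return dict(streams)
```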

4. Notational Conventions

Since many of the definitions and syntax are identical to HTTP/1.1, this specification only points to the section where they are defined rather than copying it. For brevity, [HX.Y] refers to Section X.Y of the current HTTP/1.1 specification (RFC 2616 [1]).

All the mechanisms specified in this document are described in both prose and an augmented Backus-Naur form (ABNF) similar to that used in [H2.1]. It is described in detail in RFC 4234 [3].

The ABNF provided along with the descriptive text is informative in nature and may not be complete. The complete message format in ABNF form is provided in Appendix A and is the normative format definition.

5. MRCP Specification

The MRCP PDU is textual using an ISO 10646 character set in the UTF-8 encoding (RFC 3629 [12]) to allow many different languages to be represented. However, to assist in compact representations, MRCP also allows other character sets such as ISO 8859-1 to be used when desired. The MRCP protocol headers and field names use only the US-ASCII subset of UTF-8. Internationalization only applies to certain fields like grammar, results, speech markup, etc., and not to MRCP as a whole.

Lines are terminated by CRLF, but receivers SHOULD be prepared to also interpret CR and LF by themselves as line terminators. Also, some parameters in the PDU may contain binary data or a record spanning multiple lines. Such fields have a length value associated with the parameter, which indicates the number of octets immediately following the parameter.
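
A tolerant receiver can be sketched in Python as follows. The helper name `split_header_lines` is hypothetical. It is meant for the header portion of a PDU only; length-delimited data must be read by octet count and never split on line terminators.

```python
import re
from typing import List

# CRLF is listed first so that it is consumed as a single terminator
# rather than as a CR followed by an LF.
_LINE_END = re.compile(rb"\r\n|\r|\n")


def split_header_lines(data: bytes) -> List[bytes]:
    """Split header text on CRLF, bare CR, or bare LF."""
    return _LINE_END.split(data)


if __name__ == "__main__":
    print(split_header_lines(b"SPEAK 1 MRCP/1.0\r\nVoice-gender:neutral\n"))
```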

The whole MRCP PDU is encoded in the body of the session level message as a MIME entity of type application/mrcp. The individual MRCP messages do not have addressing information regarding which resource the request/response are to/from. Instead, the MRCP message relies on the header of the session level message carrying it to deliver the request to the appropriate resource, or to figure out who the response or event is from.

The MRCP message set consists of requests from the client to the server, responses from the server to the client and asynchronous events from the server to the client. All these messages consist of a start-line, one or more header fields (also known as "headers"), an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, and an optional message body.

          generic-message =   start-line
                              message-header
                              CRLF
                              [ message-body ]

          message-body    =   *OCTET

          start-line      =   request-line / status-line / event-line
        

The message-body contains resource-specific and message-specific data that needs to be carried between the client and server as a MIME entity. The information contained here and the actual MIME-types used to carry the data are specified later when addressing the specific messages.

If a message contains data in the message body, the header fields will contain content-headers indicating the MIME-type and encoding of the data in the message body.
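
The generic message layout described above (start-line, header fields, empty line, optional body) can be illustrated with a small Python parser. This is a sketch under several assumptions: the name `parse_message` is hypothetical, lines are assumed to end in strict CRLF, header names are kept exactly as written, and the body length is taken from a Content-Length header as in Example 1.

```python
from typing import Dict, Tuple


def parse_message(data: bytes) -> Tuple[str, Dict[str, str], bytes]:
    """Split an MRCP message into (start-line, headers, body)."""
    # The first empty line marks the end of the header fields.
    head, _, rest = data.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    start_line = lines[0]

    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()

    # The body is opaque octets; read exactly Content-Length of them.
    length = int(headers.get("Content-Length", "0"))
    return start_line, headers, rest[:length]


if __name__ == "__main__":
    raw = (b"SPEAK 543257 MRCP/1.0\r\n"
           b"Voice-gender:neutral\r\n"
           b"Content-Type:text/plain\r\n"
           b"Content-Length:5\r\n"
           b"\r\n"
           b"Hello")
    print(parse_message(raw))
```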

5.1. Request

An MRCP request consists of a Request line followed by zero or more parameters as part of the message headers and an optional message body containing data specific to the request message.

The Request message from a client to the server includes, within the first line, the method to be applied, a method tag for that request, and the version of protocol in use.

     request-line   =    method-name SP request-id SP
                         mrcp-version CRLF

The request-id field is a unique identifier created by the client and sent to the server. The server resource should use this identifier in its response to this request. If the request does not complete with the response, future asynchronous events associated with this request MUST carry the request-id.

     request-id    =    1*DIGIT
        

The method-name field identifies the specific request that the client is making to the server. Each resource supports a certain list of requests or methods that can be issued to it; these are addressed in later sections.

method-name = synthesizer-method / recognizer-method

The mrcp-version field is the MRCP protocol version that is being used by the client.

     mrcp-version   =    "MRCP" "/" 1*DIGIT "." 1*DIGIT
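
As an illustration of the request-line grammar above, the following minimal Python sketch parses one request-line. The function and type names are hypothetical, and the method-name pattern (upper-case letters and hyphens) is an assumption, since the actual method names are enumerated per resource in later sections.

```python
import re
from typing import NamedTuple, Tuple

# request-line = method-name SP request-id SP mrcp-version CRLF
# Assumption: method names consist of upper-case letters and hyphens.
REQUEST_LINE = re.compile(r"([A-Z-]+) (\d+) MRCP/(\d+)\.(\d+)\r\n")


class RequestLine(NamedTuple):
    method: str
    request_id: int
    version: Tuple[int, int]


def parse_request_line(line: str) -> RequestLine:
    """Parse a single request-line, including its terminating CRLF."""
    m = REQUEST_LINE.fullmatch(line)
    if m is None:
        raise ValueError(f"not an MRCP request-line: {line!r}")
    method, request_id, major, minor = m.groups()
    return RequestLine(method, int(request_id), (int(major), int(minor)))
```

For instance, parse_request_line("SPEAK 543257 MRCP/1.0\r\n") yields the method SPEAK, the request-id 543257, and the version (1, 0).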
        
5.2. Response

After receiving and interpreting the request message, the server resource responds with an MRCP response message. It consists of a status line optionally followed by a message body.

response-line = mrcp-version SP request-id SP status-code SP request-state CRLF

The mrcp-version field used here is similar to the one used in the Request Line and indicates the version of MRCP protocol running on the server.

The request-id used in the response MUST match the one sent in the corresponding request message.

The status-code field is a 3-digit code representing the success or failure or other status of the request.

The request-state field indicates if the job initiated by the Request is PENDING, IN-PROGRESS, or COMPLETE. The COMPLETE status means that the Request was processed to completion and that there will be no more events from that resource to the client with that request-id. The PENDING status means that the job has been placed on a queue and will be processed in first-in-first-out order. The IN-PROGRESS status means that the request is being processed and is not yet complete. A PENDING or IN-PROGRESS status indicates that further Event messages will be delivered with that request-id.

request-state = "COMPLETE" / "IN-PROGRESS" / "PENDING"
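
A minimal Python sketch of how a client might read the response line and decide whether to keep listening for events on that request-id. The helper names are hypothetical; the grammar and the meaning of the three states are taken from the text above.

```python
import re

# response-line = mrcp-version SP request-id SP status-code SP request-state CRLF
RESPONSE_LINE = re.compile(
    r"MRCP/(\d+)\.(\d+) (\d+) (\d{3}) (COMPLETE|IN-PROGRESS|PENDING)\r\n"
)


def parse_response_line(line: str) -> dict:
    """Parse a single response line, including its terminating CRLF."""
    m = RESPONSE_LINE.fullmatch(line)
    if m is None:
        raise ValueError(f"not an MRCP response line: {line!r}")
    major, minor, request_id, status_code, state = m.groups()
    return {
        "version": (int(major), int(minor)),
        "request_id": int(request_id),
        "status_code": int(status_code),
        "request_state": state,
    }


def more_events_expected(request_state: str) -> bool:
    """PENDING and IN-PROGRESS mean further events will carry this request-id."""
    return request_state in ("PENDING", "IN-PROGRESS")
```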

5.2.1. Status Codes

The status codes are classified into Success (2xx) codes and Failure (4xx) codes.

5.2.1.1. Success 2xx

      200       Success
      201       Success with some optional parameters ignored

5.2.1.2. Failure 4xx

      401       Method not allowed
      402       Method not valid in this state
      403       Unsupported Parameter
      404       Illegal Value for Parameter
      405       Not found (e.g., Resource URI not initialized
                or doesn't exist)
      406       Mandatory Parameter Missing
      407       Method or Operation Failed (e.g., Grammar compilation
                failed in the recognizer. Detailed cause codes MAY be
                available through a resource-specific header field.)
      408       Unrecognized or unsupported message entity
      409       Unsupported Parameter Value
      421-499   Resource-specific Failure codes

5.3. Event

The server resource may need to communicate a change in state or the occurrence of a certain event to the client. These messages are used when a request does not complete immediately and the response returns a status of PENDING or IN-PROGRESS. The intermediate results and events of the request are indicated to the client through event messages from the server. Each event carries the request-id of the in-progress request that generated it, together with a status value. The status value is COMPLETE if the request is done and this was the last event; otherwise, it is IN-PROGRESS.

event-line = event-name SP request-id SP request-state SP mrcp-version CRLF

The mrcp-version used here is identical to the one used in the Request/Response Line and indicates the version of MRCP protocol running on the server.

The request-id used in the event should match the one sent in the request that caused this event.

The request-state indicates if the Request/Command causing this event is complete or still in progress, and is the same as the one mentioned in Section 5.2. The final event will contain a COMPLETE status indicating the completion of the request.

The event-name identifies the nature of the event generated by the media resource. The set of valid event names depends on the resource generating the event and is addressed in later sections.

event-name = synthesizer-event / recognizer-event
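
Because the three start-line forms differ in token count and in where the mrcp-version token sits, a receiver can tell them apart before parsing further. The following Python sketch shows one way to do this; the function name is hypothetical, and the sketch checks only the overall shape, not the method or event names.

```python
REQUEST_STATES = ("COMPLETE", "IN-PROGRESS", "PENDING")


def classify_start_line(line: str) -> str:
    """Return "request", "response", or "event" for one start-line.

    request:  method-name SP request-id SP mrcp-version CRLF
    response: mrcp-version SP request-id SP status-code SP request-state CRLF
    event:    event-name SP request-id SP request-state SP mrcp-version CRLF
    """
    if not line.endswith("\r\n"):
        raise ValueError("start-line must end with CRLF")
    tokens = line[:-2].split(" ")
    if len(tokens) == 4 and tokens[0].startswith("MRCP/"):
        return "response"
    if (len(tokens) == 4 and tokens[3].startswith("MRCP/")
            and tokens[2] in REQUEST_STATES):
        return "event"
    if len(tokens) == 3 and tokens[2].startswith("MRCP/"):
        return "request"
    raise ValueError(f"unrecognized start-line: {line!r}")
```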

5.4. Message Headers

MRCP header fields, which include general-header (Section 5.4) and resource-specific-header (Sections 7.4 and 8.4), follow the same generic format as that given in Section 2.1 of RFC 2822 [7]. Each header field consists of a name followed by a colon (":") and the field value. Field names are case-insensitive. The field value MAY be preceded by any amount of linear whitespace (LWS), though a single SP is preferred. Header fields can be extended over multiple lines by preceding each extra line with at least one SP or HT.

          message-header =    1*(generic-header / resource-header)
        

The order in which header fields with differing field names are received is not significant. However, it is "good practice" to send general-header fields first, followed by request-header or response-header fields, and ending with the entity-header fields.

Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field value for that header field is defined as a comma-separated list (i.e., #(values)).

It MUST be possible to combine the multiple header fields into one "field-name:field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma. Therefore, the order in which header fields with the same field-name are received is significant to the interpretation of the combined field value, and thus a proxy MUST NOT change the order of these field values when a message is forwarded.
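
The header rules above (case-insensitive names, optional whitespace before the value, continuation lines, and order-preserving combination of repeated fields) can be sketched in Python as follows. The function name is hypothetical, and replacing each fold with a single SP is an assumption borrowed from HTTP practice rather than something this document spells out.

```python
def parse_headers(block: str) -> dict:
    """Parse a header block (CRLF-separated lines, without the empty
    line that terminates the headers) into a dict keyed by lower-cased
    field name."""
    unfolded = []
    for raw in block.split("\r\n"):
        if raw == "":
            continue
        if raw[0] in " \t":
            # Continuation line: it extends the previous header field.
            if not unfolded:
                raise ValueError("continuation line without a header field")
            unfolded[-1] += " " + raw.strip()
        else:
            unfolded.append(raw)

    headers = {}
    for line in unfolded:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"missing colon in header field: {line!r}")
        key = name.strip().lower()  # field names are case-insensitive
        value = value.strip()       # drop optional whitespace around the value
        if key in headers:
            # Repeated field: append in arrival order, separated by a comma.
            headers[key] = headers[key] + "," + value
        else:
            headers[key] = value
    return headers
```

Keeping the arrival order when joining repeated fields matters because, as stated above, the order is significant to the interpretation of the combined value.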

Generic Headers

     generic-header      =    active-request-id-list
                         /    proxy-sync-id
                         /    content-id
                         /    content-type
                         /    content-length
                         /    content-base
                         /    content-location
                         /    content-encoding
                         /    cache-control
                         /    logging-tag
        

All headers in MRCP will be case insensitive, consistent with HTTP and RTSP protocol header definitions.

5.4.1. Active-Request-Id-List

In a request, this field indicates the list of request-ids to which it should apply. This is useful when there are multiple Requests that are PENDING or IN-PROGRESS and you want this request to apply to one or more of these specifically.

In a response, this field returns the list of request-ids that the operation modified, that were in progress, or that just completed. There could be one or more requests that returned a request-state of PENDING or IN-PROGRESS. When a method affecting one or more PENDING or IN-PROGRESS requests is sent from the client to the server, the response MUST contain the list of request-ids that were affected in this header field.

The active-request-id-list is only used in requests and responses, not in events.

For example, if a STOP request with no active-request-id-list is sent to a synthesizer resource (a wildcard STOP) that has one or more SPEAK requests in the PENDING or IN-PROGRESS state, all SPEAK requests MUST be cancelled, including the one IN-PROGRESS. In addition, the response to the STOP request would contain the request-id of all the SPEAK requests that were terminated in the active-request-id-list. In this case, no SPEAK-COMPLETE or RECOGNITION-COMPLETE events will be sent for these terminated requests.

active-request-id-list = "Active-Request-Id-List" ":" request-id *("," request-id) CRLF
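
A small Python sketch of reading and writing this header field according to the grammar above. The function names are hypothetical, and tolerating whitespace around the commas is an assumption (the grammar itself shows none).

```python
from typing import List


def parse_active_request_id_list(header_line: str) -> List[int]:
    """Extract the request-ids from an Active-Request-Id-List header line."""
    name, sep, value = header_line.rstrip("\r\n").partition(":")
    if not sep or name.strip().lower() != "active-request-id-list":
        raise ValueError("not an Active-Request-Id-List header")
    items = [item.strip() for item in value.split(",")]
    # request-id = 1*DIGIT, so every item must be a non-empty digit string.
    if not all(item.isascii() and item.isdigit() for item in items):
        raise ValueError(f"malformed request-id list: {value!r}")
    return [int(item) for item in items]


def format_active_request_id_list(request_ids: List[int]) -> str:
    """Build the header line for a non-empty list of request-ids."""
    if not request_ids:
        raise ValueError("the grammar requires at least one request-id")
    return "Active-Request-Id-List:" + ",".join(str(i) for i in request_ids) + "\r\n"
```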

5.4.2. Proxy-Sync-Id

When any server resource generates a barge-in-able event, it will generate a unique Tag and send it as a header field in an event to the client. The client then acts as a proxy to the server resource and sends a BARGE-IN-OCCURRED method (Section 7.10) to the synthesizer server resource with the Proxy-Sync-Id it received from the server resource. When the recognizer and synthesizer resources are part of the same session, they may choose to work together to achieve quicker interaction and response. Here, the proxy-sync-id helps the resource receiving the event, proxied by the client, to decide if this event has been processed through a direct interaction of the resources.

     proxy-sync-id    =  "Proxy-Sync-Id" ":" 1*ALPHA CRLF
        
5.4.3. Accept-Charset

See [H14.2]. This specifies the acceptable character set for entities returned in the response or events associated with this request. This is useful in specifying the character set to use in the Natural Language Semantics Markup Language (NLSML) results of a RECOGNITION-COMPLETE event.

5.4.4. Content-Type

See [H14.17]. Note that the content types suitable for MRCP are restricted to speech markup, grammar, recognition results, etc., and are specified later in this document. The multipart content type "multipart/mixed" is supported to communicate several of the above-mentioned contents, in which case the body parts cannot contain any MRCP-specific headers.

5.4.5. Content-Id

This field contains an ID or name for the content, by which it can be referred to. The definition of this field conforms to RFC 2392 [14], RFC 2822 [7], RFC 2046 [13] and is needed in multi-part messages. In MRCP whenever the content needs to be stored, by either the client or the server, it is stored associated with this ID. Such content can be referenced during the session in URI form using the session:URI scheme described in a later section.

5.4.6. Content-Base

The content-base entity-header field may be used to specify the base URI for resolving relative URLs within the entity.

content-base = "Content-Base" ":" absoluteURI CRLF

Note, however, that the base URI of the contents within the entity-body may be redefined within that entity-body. An example of this would be a multi-part MIME entity, which in turn can have multiple entities within it.

5.4.7. Content-Encoding

The content-encoding entity-header field is used as a modifier to the media-type. When present, its value indicates what additional content coding has been applied to the entity-body, and thus what decoding mechanisms must be applied in order to obtain the media-type referenced by the content-type header field. Content-encoding is primarily used to allow a document to be compressed without losing the identity of its underlying media type.

          content-encoding =  "Content-Encoding" ":"
                              *WSP content-coding
                              *(*WSP "," *WSP content-coding *WSP )
                              CRLF
        
          content-coding   =  token
        
          token            =  1*(alphanum / "-" / "." / "!" / "%" / "*"
                              / "_" / "+" / "`" / "'" / "~" )
        

Content coding is defined in [H3.5]. An example of its use is:

Content-Encoding:gzip

If multiple encodings have been applied to an entity, the content codings MUST be listed in the order in which they were applied.
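
Since the codings are listed in the order they were applied, a receiver has to undo them in the reverse order. The Python sketch below illustrates this for two codings; the function name is hypothetical, and the choice of "gzip" and "deflate" (decoded here with the standard gzip and zlib modules) is an assumption about what an implementation supports, not a list taken from this document.

```python
import gzip
import zlib

# Assumed set of supported content codings for this illustration.
DECODERS = {
    "gzip": gzip.decompress,
    "deflate": zlib.decompress,
}


def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Undo the content codings named in a Content-Encoding value."""
    codings = [c.strip().lower() for c in content_encoding.split(",") if c.strip()]
    # Codings are listed in application order, so decode last-applied first.
    for coding in reversed(codings):
        decoder = DECODERS.get(coding)
        if decoder is None:
            raise ValueError(f"unsupported content coding: {coding!r}")
        body = decoder(body)
    return body
```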

5.4.8. Content-Location

The content-location entity-header field MAY be used to supply the resource location for the entity enclosed in the message when that entity is accessible from a location separate from the requested resource's URI.

content-location = "Content-Location" ":" ( absoluteURI / relativeURI ) CRLF

The content-location value is a statement of the location of the resource corresponding to this particular entity at the time of the request. The media server MAY use this header field to optimize certain operations. When providing this header field, the entity being sent should not have been modified from what was retrieved from the content-location URI.

For example, if the client provided a grammar markup inline, and it had previously retrieved it from a certain URI, that URI can be provided as part of the entity, using the content-location header field. This allows a resource like the recognizer to look into its cache to see if this grammar was previously retrieved, compiled, and cached, in which case it might optimize by using the previously compiled grammar object.

If the content-location is a relative URI, the relative URI is interpreted relative to the content-base URI.
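
A minimal Python sketch of this resolution step, using the standard urllib.parse.urljoin function. The helper name and the example URIs are hypothetical.

```python
from typing import Optional
from urllib.parse import urljoin


def resolve_content_location(content_base: Optional[str],
                             content_location: str) -> str:
    """Resolve a Content-Location value against Content-Base.

    An absolute Content-Location is returned unchanged; a relative one
    is interpreted relative to the Content-Base URI when one is given.
    """
    if content_base is None:
        return content_location
    return urljoin(content_base, content_location)
```

With a Content-Base of http://www.example.com/grammars/ and a Content-Location of directory/main.grxml, this resolves to http://www.example.com/grammars/directory/main.grxml.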

5.4.9. Content-Length

This field contains the length of the content of the message body (i.e., after the double CRLF following the last header field). Unlike HTTP, it MUST be included in all messages that carry content beyond the header portion of the message. If it is missing, a default value of zero is assumed. It is interpreted according to [H14.13].
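
The following Python sketch shows how a sender might frame a message so that Content-Length always matches the body, and how a receiver might apply the default of zero. The function names are hypothetical, and the text/plain body in the usage note is illustrative only.

```python
def build_message(start_line: str, headers: dict, body: bytes = b"") -> bytes:
    """Serialize a message, computing Content-Length from the body.

    start_line is given without its CRLF. Any caller-supplied
    Content-Length is ignored and recomputed.
    """
    lines = [start_line]
    for name, value in headers.items():
        if name.lower() == "content-length":
            continue
        lines.append(f"{name}:{value}")
    if body:
        # Required whenever the message carries content after the headers.
        lines.append(f"Content-Length:{len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def body_length(headers: dict) -> int:
    """Return the declared body length; a missing header means zero."""
    for name, value in headers.items():
        if name.lower() == "content-length":
            return int(value)
    return 0
```

For example, build_message("SPEAK 543257 MRCP/1.0", {"Content-Type": "text/plain"}, b"Hello world") produces a message whose final header is Content-Length:11.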

5.4.10. Cache-Control

If the media server plans on implementing caching, it MUST adhere to the cache correctness rules of HTTP 1.1 (RFC2616) when accessing and caching HTTP URIs. In particular, the expires and cache-control headers of the cached URI or document must be honored and will always take precedence over the Cache-Control defaults set by this header field. The cache-control directives are used to define the default caching algorithms on the media server for the session or request. The scope of a directive is based on the method it is sent on. If the directives are sent on a SET-PARAMS method, they SHOULD apply to all requests for documents the media server may make in that session. If the directives are sent on any other message, they MUST only apply to document requests the media server needs to make for that method. An empty cache-control header on the GET-PARAMS method is a request for the media server to return the current cache-control directives setting on the server.

          cache-control  =    "Cache-Control" ":" *WSP cache-directive
                              *( *WSP "," *WSP cache-directive *WSP )
                              CRLF
        
          cache-directive =   "max-age" "=" delta-seconds
                          /   "max-stale" "=" delta-seconds
                          /   "min-fresh" "=" delta-seconds
        
          delta-seconds       = 1*DIGIT
        

Here, delta-seconds is a time value to be specified as an integer number of seconds, represented in decimal, after the time that the message response or data was received by the media server.

These directives allow the media server to override the basic expiration mechanism.

max-age

Indicates that the client is OK with the media server using a response whose age is no greater than the specified time in seconds. Unless a max-stale directive is also included, the client is not willing to accept the media server using a stale response.

min-fresh

Indicates that the client is willing to accept the media server using a response whose freshness lifetime is no less than its current age plus the specified time in seconds. That is, the client wants the media server to use a response that will still be fresh for at least the specified number of seconds.

max-stale

Indicates that the client is willing to accept the media server using a response that has exceeded its expiration time. If max-stale is assigned a value, then the client is willing to accept the media server using a response that has exceeded its expiration time by no more than the specified number of seconds. If no value is assigned to max-stale, then the client is willing to accept the media server using a stale response of any age.

The media server cache MAY be requested to use stale response/data without validation, but only if this does not conflict with any "MUST"-level requirements concerning cache validation (e.g., a "must-revalidate" cache-control directive) in the HTTP 1.1 specification pertaining to the URI.

If both the MRCP cache-control directive and the cached entry on the media server include "max-age" directives, then the lesser of the two values is used for determining the freshness of the cached entry for that request.
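
A Python sketch of parsing the three directives and of the "lesser of the two values" rule for max-age. The function names are hypothetical. Accepting a bare max-stale without a value is an assumption that follows the descriptive text above, even though the cache-directive grammar shows a value for every directive.

```python
from typing import Dict, Optional

KNOWN_DIRECTIVES = ("max-age", "max-stale", "min-fresh")


def parse_cache_control(value: str) -> Dict[str, Optional[int]]:
    """Parse a Cache-Control field value into {directive: seconds or None}."""
    directives: Dict[str, Optional[int]] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, seconds = item.partition("=")
        name = name.strip().lower()
        if name not in KNOWN_DIRECTIVES:
            raise ValueError(f"unknown cache directive: {name!r}")
        # A directive without "=" (bare max-stale) is recorded as None.
        directives[name] = int(seconds) if sep else None
    return directives


def effective_max_age(request_max_age: Optional[int],
                      cached_max_age: Optional[int]) -> Optional[int]:
    """Use the lesser of the two max-age values when both are present."""
    candidates = [v for v in (request_max_age, cached_max_age) if v is not None]
    return min(candidates) if candidates else None
```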

5.4.11. Logging-Tag

This header field MAY be sent as part of a SET-PARAMS/GET-PARAMS method to set the logging tag for logs generated by the media server. Once set, the value persists until a new value is set or the session is ended. The MRCP server should provide a mechanism to subset its output logs so that system administrators can examine or extract only the log file portion during which the logging tag was set to a certain value.

MRCP clients using this feature should take care to ensure that no two clients specify the same logging tag. In the event that two clients specify the same logging tag, the effect on the MRCP server's output logs is undefined.

     logging-tag    =    "Logging-Tag" ":" 1*ALPHA CRLF
        
6. Media Server

The capability of media server resources can be found using the RTSP DESCRIBE mechanism. When a client issues an RTSP DESCRIBE method for a media resource URI, the media server response MUST contain an SDP description in its body describing the capabilities of the media server resource. The SDP description MUST contain at a minimum the media header (m-line) describing the codec and other media related features it supports. It MAY contain another SDP header as well, but support for it is optional.

The usage of SDP messages in the RTSP message body and its application follows the SIP RFC 2543 [4], but is limited to media-related negotiation and description.

6.1. Media Server Session

As discussed in Section 3.2, a client/server should share one RTSP session-id for the different resources it may use under the same session. The client MUST allocate a set of client RTP/RTCP ports for a new session and MUST NOT send a Session-ID in the SETUP message for the first resource. The server then creates a Session-ID and allocates a set of server RTP/RTCP ports and responds to the SETUP message.

If the client wants to open more resources with the same server under the same session, it will send the session-id (that it got in the earlier SETUP response) in the SETUP for the new resource. A SETUP message with an existing session-id tells the server that this new resource will feed from/into the same RTP/RTCP stream of that existing session.

If the client wants to open a resource from a media server that is not where the first resource came from, it will send separate SETUP requests with no session-id header field in them. Each server will allocate its own session-id and return it in the response. Each of them will also come back with their own set of RTP/RTCP ports. This would be the case when the synthesizer engine and the recognition engine are on different servers.

The RTSP SETUP method SHOULD contain an SDP description of the media stream being set up. The RTSP SETUP response MUST contain an SDP description of the media stream that it expects to receive and send on that session.

The SDP description in the SETUP method from the client SHOULD describe the required media parameters like codec, Named Signaling Event (NSE) payload types, etc. This could have multiple media headers (i.e., m-lines) to allow the client to provide the media server with more than one option to choose from.

The SDP description in the SETUP response should reflect the media parameters that the media server will be using for the stream. It should be within the choices that were specified in the SDP of the SETUP method, if one was provided.

Example:

C->S:

       SETUP rtsp://media.server.com/recognizer/ RTSP/1.0
       CSeq:1
       Transport:RTP/AVP;unicast;client_port=46456-46457
       Content-Type:application/sdp
       Content-Length:190
        
       v=0
       o=- 123 456 IN IP4 10.0.0.1
       s=Media Server
       p=+1-888-555-1212
       c=IN IP4 0.0.0.0
       t=0 0
       m=audio 46456 RTP/AVP 0 96
       a=rtpmap:0 pcmu/8000
       a=rtpmap:96 telephone-event/8000
       a=fmtp:96 0-15
        

S->C:

       RTSP/1.0 200 OK
       CSeq:1
       Session:0a030258_00003815_3bc4873a_0001_0000
       Transport:RTP/AVP;unicast;client_port=46456-46457;
                  server_port=46460-46461
       Content-Length:190
       Content-Type:application/sdp
        
       v=0
       o=- 3211724219 3211724219 IN IP4 10.3.2.88
       s=Media Server
       c=IN IP4 0.0.0.0
       t=0 0
       m=audio 46460 RTP/AVP 0 96
       a=rtpmap:0 pcmu/8000
       a=rtpmap:96 telephone-event/8000
       a=fmtp:96 0-15
        

If an SDP description was not provided in the RTSP SETUP method, then the media server may decide on parameters of the stream but MUST specify what it chooses in the SETUP response. An SDP announcement is only returned in a response to a SETUP message that does not specify a Session. That is, the server will not return an SDP announcement for the synthesizer SETUP of a session already established with a recognizer.

C->S:

       SETUP rtsp://media.server.com/recognizer/ RTSP/1.0
       CSeq:1
       Transport:RTP/AVP;unicast;client_port=46498
        

S->C:

       RTSP/1.0 200 OK
       CSeq:1
       Session:0a030258_000039dc_3bc48a13_0001_0000
       Transport:RTP/AVP;unicast; client_port=46498;
                  server_port=46502-46503
       Content-Length:193
       Content-Type:application/sdp
        
       v=0
       o=- 3211724947 3211724947 IN IP4 10.3.2.88
       s=Media Server
       c=IN IP4 0.0.0.0
       t=0 0
       m=audio 46502 RTP/AVP 0 101
       a=rtpmap:0 pcmu/8000
       a=rtpmap:101 telephone-event/8000
       a=fmtp:101 0-15
        
7. Speech Synthesizer Resource

This resource is capable of converting text provided by the client and generating a speech stream in real-time. Depending on the implementation and capability of this resource, the client can control parameters like voice characteristics, speaker speed, etc.

The synthesizer resource is controlled by MRCP requests from the client. Similarly, the resource can respond to these requests or generate asynchronous events to the client to indicate certain conditions during the processing of the stream.

7.1. Synthesizer State Machine

The synthesizer maintains states because it needs to correlate MRCP requests from the client. The state transitions shown below describe the states of the synthesizer and reflect the request at the head of the queue. A SPEAK request in the PENDING state can be deleted or stopped by a STOP request and does not affect the state of the resource.

        Idle                   Speaking                  Paused
        State                  State                     State
        |                       |                          |
        |----------SPEAK------->|                 |--------|
        |<------STOP------------|             CONTROL      |
        |<----SPEAK-COMPLETE----|                 |------->|
        |<----BARGE-IN-OCCURRED-|                          |
        |              |--------|                          |
        |          CONTROL      |-----------PAUSE--------->|
        |              |------->|<----------RESUME---------|
        |                       |               |----------|
        |                       |              PAUSE       |
        |                       |               |--------->|
        |              |--------|----------|               |
        |     BARGE-IN-OCCURRED |      SPEECH-MARKER       |
        |              |------->|<---------|               |
        |----------|            |             |------------|
        |         STOP          |          SPEAK           |
        |          |            |             |----------->|
        |<---------|                                       |
        |<-------------------STOP--------------------------|
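
The transitions drawn above can be written down as a lookup table. The following Python fragment is only an illustrative sketch and is not part of the protocol: the names State, TRANSITIONS, and next_state are hypothetical. It also assumes that the two BARGE-IN-OCCURRED arrows leaving the Speaking state are selected by the Kill-On-Barge-In setting (Section 7.4.2), and that a SPEAK received while speaking is queued as described in Section 7.8; neither is stated in the diagram itself.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    SPEAKING = "speaking"
    PAUSED = "paused"


# (current state, method or event) -> next state, read off the diagram.
TRANSITIONS = {
    (State.IDLE, "SPEAK"): State.SPEAKING,
    (State.IDLE, "STOP"): State.IDLE,
    (State.SPEAKING, "STOP"): State.IDLE,
    (State.SPEAKING, "SPEAK-COMPLETE"): State.IDLE,
    (State.SPEAKING, "CONTROL"): State.SPEAKING,
    (State.SPEAKING, "SPEECH-MARKER"): State.SPEAKING,
    (State.SPEAKING, "PAUSE"): State.PAUSED,
    # Not drawn in the diagram; Section 7.8 says such a SPEAK is queued.
    (State.SPEAKING, "SPEAK"): State.SPEAKING,
    (State.PAUSED, "RESUME"): State.SPEAKING,
    (State.PAUSED, "PAUSE"): State.PAUSED,
    (State.PAUSED, "CONTROL"): State.PAUSED,
    (State.PAUSED, "SPEAK"): State.PAUSED,
    (State.PAUSED, "STOP"): State.IDLE,
}


def next_state(state, trigger, kill_on_barge_in=True):
    """Return the state reached from `state` on `trigger`."""
    # Assumption: the diagram shows BARGE-IN-OCCURRED both as a self-loop
    # and as a move to Idle; this sketch picks one via kill_on_barge_in.
    if state is State.SPEAKING and trigger == "BARGE-IN-OCCURRED":
        return State.IDLE if kill_on_barge_in else State.SPEAKING
    try:
        return TRANSITIONS[(state, trigger)]
    except KeyError:
        raise ValueError(
            f"{trigger} is not drawn for the {state.value} state"
        ) from None
```

A trigger that has no arrow in the diagram raises ValueError here; a real server would instead answer with an appropriate failure status.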
        
7.2. Synthesizer Methods

The synthesizer supports the following methods.

     synthesizer-method  =  "SET-PARAMS"
                         /  "GET-PARAMS"
                         /  "SPEAK"
                         /  "STOP"
                         /  "PAUSE"
                         /  "RESUME"
                         /  "BARGE-IN-OCCURRED"
                         /  "CONTROL"
        
7.3. Synthesizer Events

The synthesizer may generate the following events.

     synthesizer-event   =  "SPEECH-MARKER"
                         /  "SPEAK-COMPLETE"

7.4. Synthesizer Header Fields

A synthesizer message may contain header fields containing request options and information to augment the Request, Response, or Event of the message with which it is associated.

     synthesizer-header  =  jump-target       ; Section 7.4.1
                         /  kill-on-barge-in  ; Section 7.4.2
                         /  speaker-profile   ; Section 7.4.3
                         /  completion-cause  ; Section 7.4.4
                         /  voice-parameter   ; Section 7.4.5
                         /  prosody-parameter ; Section 7.4.6
                         /  vendor-specific   ; Section 7.4.7
                         /  speech-marker     ; Section 7.4.8
                         /  speech-language   ; Section 7.4.9
                         /  fetch-hint        ; Section 7.4.10
                         /  audio-fetch-hint  ; Section 7.4.11
                         /  fetch-timeout     ; Section 7.4.12
                         /  failed-uri        ; Section 7.4.13
                         /  failed-uri-cause  ; Section 7.4.14
                         /  speak-restart     ; Section 7.4.15
                         /  speak-length      ; Section 7.4.16
        

     Parameter            Support      Methods/Events/Response

     jump-target          MANDATORY    SPEAK, CONTROL
     logging-tag          MANDATORY    SET-PARAMS, GET-PARAMS
     kill-on-barge-in     MANDATORY    SPEAK
     speaker-profile      OPTIONAL     SET-PARAMS, GET-PARAMS,
                                       SPEAK, CONTROL
     completion-cause     MANDATORY    SPEAK-COMPLETE
     voice-parameter      MANDATORY    SET-PARAMS, GET-PARAMS,
                                       SPEAK, CONTROL
     prosody-parameter    MANDATORY    SET-PARAMS, GET-PARAMS,
                                       SPEAK, CONTROL
     vendor-specific      MANDATORY    SET-PARAMS, GET-PARAMS
     speech-marker        MANDATORY    SPEECH-MARKER
     speech-language      MANDATORY    SET-PARAMS, GET-PARAMS, SPEAK
     fetch-hint           MANDATORY    SET-PARAMS, GET-PARAMS, SPEAK
     audio-fetch-hint     MANDATORY    SET-PARAMS, GET-PARAMS, SPEAK
     fetch-timeout        MANDATORY    SET-PARAMS, GET-PARAMS, SPEAK
     failed-uri           MANDATORY    Any
     failed-uri-cause     MANDATORY    Any
     speak-restart        MANDATORY    CONTROL
     speak-length         MANDATORY    SPEAK, CONTROL

7.4.1. Jump-Target

This parameter MAY be specified in a CONTROL method and controls the jump size to move forward or rewind backward on an active SPEAK request. A + or - indicates a value relative to what is currently being played. This MAY be specified in a SPEAK request to indicate an offset into the speech markup from which the SPEAK request should start speaking. The different speech length units supported are dependent on the synthesizer implementation. If it does not support a unit or the operation, the resource SHOULD respond with a status code of 404 "Illegal or Unsupported value for parameter".

     jump-target         =    "Jump-Size" ":" speech-length-value CRLF
     speech-length-value =    numeric-speech-length
                         /    text-speech-length
     text-speech-length  =    1*ALPHA SP "Tag"
     numeric-speech-length=   ("+" / "-") 1*DIGIT SP
                              numeric-speech-unit
     numeric-speech-unit =    "Second"
                         /    "Word"
                         /    "Sentence"
                         /    "Paragraph"
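
As an illustration of the grammar above, the following Python sketch parses a speech-length-value into its parts. It is a minimal example under stated assumptions, not a reference implementation: the names SpeechLength and parse_speech_length are hypothetical, and matching is case-sensitive for simplicity even though ABNF literals are normally case-insensitive.

```python
import re
from typing import NamedTuple, Optional

# numeric-speech-length = ("+" / "-") 1*DIGIT SP numeric-speech-unit
NUMERIC = re.compile(r"([+-])([0-9]+) (Second|Word|Sentence|Paragraph)")
# text-speech-length = 1*ALPHA SP "Tag"
TEXT = re.compile(r"([A-Za-z]+) Tag")


class SpeechLength(NamedTuple):
    offset: Optional[int]  # signed count for the numeric form, else None
    unit: str              # "Second", "Word", "Sentence", "Paragraph", "Tag"
    tag: Optional[str]     # marker name for the Tag form, else None


def parse_speech_length(value: str) -> SpeechLength:
    """Parse the value part of a Jump-Size or Speak-Length header."""
    m = NUMERIC.fullmatch(value)
    if m:
        sign, digits, unit = m.groups()
        return SpeechLength(int(sign + digits), unit, None)
    m = TEXT.fullmatch(value)
    if m:
        return SpeechLength(None, "Tag", m.group(1))
    raise ValueError(f"not a speech-length-value: {value!r}")
```

A caller handling Speak-Length would additionally reject a negative offset, since a "-" value is illegal in that field.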
        
7.4.2. Kill-On-Barge-In

This parameter MAY be sent as part of the SPEAK method to enable kill-on-barge-in support. If enabled, the SPEAK method is interrupted by DTMF input detected by a signal detector resource or by the start of speech sensed or recognized by the speech recognizer resource.

     kill-on-barge-in    =    "Kill-On-Barge-In" ":" boolean-value CRLF
     boolean-value       =    "true" / "false"
        

If the recognizer or signal detector resource is on the same server as the synthesizer, the server should be intelligent enough to recognize their interactions by their common RTSP session-id and work with each other to provide kill-on-barge-in support. The client needs to send a BARGE-IN-OCCURRED method to the synthesizer resource when it receives a barge-in-able event from the synthesizer resource or signal detector resource. These resources MAY be local or distributed. If this field is not specified, the value defaults to "true".

7.4.3. Speaker Profile

This parameter MAY be part of the SET-PARAMS/GET-PARAMS or SPEAK request from the client to the server and specifies the profile of the speaker by a URI, which may be a set of voice parameters like gender, accent, etc.

     speaker-profile     =    "Speaker-Profile" ":" uri CRLF

7.4.4. Completion Cause

This header field MUST be specified in a SPEAK-COMPLETE event coming from the synthesizer resource to the client. This indicates the reason behind the SPEAK request completion.

     completion-cause    =    "Completion-Cause" ":" 1*DIGIT SP 1*ALPHA
                             CRLF
        

     Cause-Code  Cause-Name            Description

     000         normal                SPEAK completed normally.
     001         barge-in              SPEAK request was terminated
                                       because of barge-in.
     002         parse-failure         SPEAK request terminated because
                                       of a failure to parse the speech
                                       markup text.
     003         uri-failure           SPEAK request terminated because
                                       access to one of the URIs failed.
     004         error                 SPEAK request terminated
                                       prematurely due to synthesizer
                                       error.
     005         language-unsupported  Language not supported.
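
A client will typically map the Completion-Cause header back onto the table above. The Python sketch below shows one way this might be done; it is illustrative only. The names CAUSE_NAMES and parse_completion_cause are hypothetical. The sketch accepts hyphens in the cause name (the grammar above says 1*ALPHA, but the listed names contain hyphens) and tolerates optional white space after the colon; both are assumptions made for this example.

```python
import re

# Cause codes and names copied from the table above.
CAUSE_NAMES = {
    0: "normal",
    1: "barge-in",
    2: "parse-failure",
    3: "uri-failure",
    4: "error",
    5: "language-unsupported",
}

# Assumption: hyphens are allowed in the name, and optional white space
# may follow the colon.
LINE = re.compile(r"Completion-Cause:[ \t]*([0-9]+) ([A-Za-z-]+)")


def parse_completion_cause(line: str):
    """Return (code, name) for a Completion-Cause header line."""
    m = LINE.fullmatch(line.rstrip("\r\n"))
    if not m:
        raise ValueError(f"malformed Completion-Cause: {line!r}")
    code, name = int(m.group(1)), m.group(2)
    if CAUSE_NAMES.get(code) != name:
        raise ValueError(f"code {code:03d} does not match name {name!r}")
    return code, name
```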

7.4.5. Voice-Parameters

This set of parameters defines the voice of the speaker.

     voice-parameter     =    "Voice-" voice-param-name ":"
                              voice-param-value CRLF

voice-param-name is any one of the attribute names under the voice element specified in W3C's Speech Synthesis Markup Language Specification [9]. The voice-param-value is any one of the value choices of the corresponding voice element attribute specified in the above section.

These header fields MAY be sent in SET-PARAMS/GET-PARAMS requests to define/get default values for the entire session or MAY be sent in the SPEAK request to define default values for that SPEAK request. Furthermore, these attributes can be part of the speech text marked up in Speech Synthesis Markup Language (SSML).

These voice parameter header fields can also be sent in a CONTROL method to affect a SPEAK request in progress and change its behavior on the fly. If the synthesizer resource does not support this operation, it should respond back to the client with a status of unsupported.

7.4.6. Prosody-Parameters

This set of parameters defines the prosody of the speech.

     prosody-parameter   =    "Prosody-" prosody-param-name ":"
                              prosody-param-value CRLF

prosody-param-name is any one of the attribute names under the prosody element specified in W3C's Speech Synthesis Markup Language Specification [9]. The prosody-param-value is any one of the value choices of the corresponding prosody element attribute specified in the above section.

These header fields MAY be sent in SET-PARAMS/GET-PARAMS requests to define/get default values for the entire session or MAY be sent in the SPEAK request to define default values for that SPEAK request. Furthermore, these attributes can be part of the speech text marked up in SSML.

The prosody parameter header fields in the SET-PARAMS or SPEAK request only apply if the speech data is of type text/plain and does not use a speech markup format.

These prosody parameter header fields MAY also be sent in a CONTROL method to affect a SPEAK request in progress and to change its behavior on the fly. If the synthesizer resource does not support this operation, it should respond back to the client with a status of unsupported.

7.4.7. Vendor-Specific Parameters

This set of headers allows for the client to set vendor-specific parameters.

     vendor-specific          =    "Vendor-Specific-Parameters" ":"
                                   vendor-specific-av-pair
                                   *[";" vendor-specific-av-pair] CRLF
     vendor-specific-av-pair  =    vendor-av-pair-name "="
                                   vendor-av-pair-value

This header MAY be sent in the SET-PARAMS/GET-PARAMS method and is used to set vendor-specific parameters on the server side. The vendor-av-pair-name can be any vendor-specific field name and conforms to the XML vendor-specific attribute naming convention. The vendor-av-pair-value is the value to set the attribute to and needs to be quoted.

When asking the server to get the current value of these parameters, this header can be sent in the GET-PARAMS method with the list of vendor-specific attribute names to get separated by a semicolon.
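
The following Python sketch illustrates both uses of this header: splitting a value of quoted attribute-value pairs into a dictionary, and building the semicolon-separated name list for GET-PARAMS. It is a simplified example, not a complete parser. The function names are hypothetical, and the sketch assumes that quoted values contain no embedded double quotes and that white space (including a folded continuation line) may surround each pair.

```python
import re

# One name="value" pair; surrounding white space is tolerated.
# Assumption: values never contain an embedded double quote.
PAIR = re.compile(r'\s*([^=;\s"]+)="([^"]*)"\s*')


def parse_vendor_params(value: str) -> dict:
    """Split a Vendor-Specific-Parameters value into a name -> value dict."""
    params = {}
    pos = 0
    while True:
        m = PAIR.match(value, pos)
        if not m:
            raise ValueError(f"expected name=\"value\" at offset {pos}")
        params[m.group(1)] = m.group(2)
        pos = m.end()
        if pos == len(value):
            return params
        if value[pos] != ";":
            raise ValueError(f"expected ';' at offset {pos}")
        pos += 1


def format_vendor_query(names) -> str:
    """Build the header line a client sends in GET-PARAMS."""
    return "Vendor-Specific-Parameters:" + ";".join(names)
```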

7.4.8. Speech Marker

This header field contains a marker tag that may be embedded in the speech data. Most speech markup formats provide mechanisms to embed marker fields between speech texts. The synthesizer will generate SPEECH-MARKER events when it reaches these marker fields. This field SHOULD be part of the SPEECH-MARKER event and will contain the marker tag values.

     speech-marker =          "Speech-Marker" ":" 1*ALPHA CRLF
        
7.4.9. Speech Language

This header field specifies the default language of the speech data if it is not specified in the speech data. The value of this header field should follow RFC 3066 [16]. This MAY occur in SPEAK, SET-PARAMS, or GET-PARAMS requests.

     speech-language          =    "Speech-Language" ":" 1*ALPHA CRLF
        
7.4.10. Fetch Hint

When the synthesizer needs to fetch documents or other resources like speech markup or audio files, etc., this header field controls URI access properties. This defines when the synthesizer should retrieve content from the server. A value of "prefetch" indicates a file may be downloaded when the request is received, whereas "safe" indicates a file that should only be downloaded when actually needed. The default value is "prefetch". This header field MAY occur in SPEAK, SET-PARAMS, or GET-PARAMS requests.

     fetch-hint               =    "Fetch-Hint" ":" 1*ALPHA CRLF
        
7.4.11. Audio Fetch Hint

When the synthesizer needs to fetch documents or other resources like speech audio files, etc., this header field controls URI access properties. This defines whether or not the synthesizer can attempt to optimize speech by pre-fetching audio. The value is either "safe" to say that audio is only fetched when it is needed, never before; "prefetch" to permit, but not require the platform to pre-fetch the audio; or "stream" to allow it to stream the audio fetches. The default value is "prefetch". This header field MAY occur in SPEAK, SET-PARAMS, or GET-PARAMS requests.

     audio-fetch-hint         =    "Audio-Fetch-Hint" ":" 1*ALPHA CRLF
        
7.4.12. Fetch Timeout

When the synthesizer needs to fetch documents or other resources like speech audio files, etc., this header field controls URI access properties. This defines the synthesizer timeout for resources the media server may need to fetch from the network. This is specified in milliseconds. The default value is platform-dependent. This header field MAY occur in SPEAK, SET-PARAMS, or GET-PARAMS.

     fetch-timeout            =    "Fetch-Timeout" ":" 1*DIGIT CRLF
        
7.4.13. Failed URI

When a synthesizer method needs a synthesizer to fetch or access a URI, and the access fails, the media server SHOULD provide the failed URI in this header field in the method response.

     failed-uri               =    "Failed-URI" ":" Url CRLF

7.4.14. Failed URI Cause

When a synthesizer method needs a synthesizer to fetch or access a URI, and the access fails, the media server SHOULD provide the URI specific or protocol-specific response code through this header field in the method response. This field has been defined as alphanumeric to accommodate all protocols, some of which might have a response string instead of a numeric response code.

     failed-uri-cause         =    "Failed-URI-Cause" ":" 1*ALPHA CRLF
        
7.4.15. Speak Restart

When a CONTROL jump backward request is issued to a currently speaking synthesizer resource and the jump goes beyond the start of the speech, the current SPEAK request restarts from the beginning of its speech data, and the response to the CONTROL request contains this header to indicate a restart. This header MAY occur in the CONTROL response.

     speak-restart            =    "Speak-Restart" ":" boolean-value CRLF

7.4.16. Speak Length

This parameter MAY be specified in a CONTROL method to control the length of speech to speak, relative to the current speaking point in the currently active SPEAK request. A "-" value is illegal in this field. If a field with a Tag unit is specified, then the media must be spoken until the tag is reached or the SPEAK request completes, whichever comes first. This MAY be specified in a SPEAK request to indicate the length to speak in the speech data and is relative to the point in speech where the SPEAK request starts. The different speech length units supported are dependent on the synthesizer implementation. If it does not support a unit or the operation, the resource SHOULD respond with a status code of 404 "Illegal or Unsupported value for parameter".

     speak-length             =    "Speak-Length" ":" speech-length-value CRLF

7.5. Synthesizer Message Body

A synthesizer message may contain additional information associated with the Method, Response, or Event in its message body.

7.5.1. Synthesizer Speech Data

Marked-up text for the synthesizer to speak is specified as a MIME entity in the message body. The message to be spoken by the synthesizer can be specified inline (by embedding the data in the message body) or by reference (by providing the URI to the data). In either case, the data and the format used to mark up the speech need to be supported by the media server.

All media servers MUST support plain text speech data and W3C's Speech Synthesis Markup Language [9] at a minimum and, hence, MUST support the MIME types text/plain and application/synthesis+ssml at a minimum.

If the speech data needs to be specified by URI reference, the MIME type text/uri-list is used to specify one or more URIs that list what needs to be spoken. If a list of speech URIs is specified, the speech data provided by each URI must be spoken in the order in which the URIs are specified.

If the data to be spoken consists of a mix of URI and inline speech data, the multipart/mixed MIME-type is used and embedded with the MIME-blocks for text/uri-list, application/synthesis+ssml or text/plain. The character set and encoding used in the speech data may be specified according to standard MIME-type definitions. The multi-part MIME-block can contain actual audio data in .wav or Sun audio format. This is used when the client has audio clips that it may have recorded, then stored in memory or a local device, and that it currently needs to play as part of the SPEAK request. The audio MIME-parts can be sent by the client as part of the multi-part MIME-block. This audio will be referenced in the speech markup data that will be another part in the multi-part MIME-block according to the multipart/mixed MIME-type specification.

   Example 1:
       Content-Type:text/uri-list
       Content-Length:176

       http://www.cisco.com/ASR-Introduction.sml
       http://www.cisco.com/ASR-Document-Part1.sml
       http://www.cisco.com/ASR-Document-Part2.sml
       http://www.cisco.com/ASR-Conclusion.sml
        
   Example 2:
       Content-Type:application/synthesis+ssml
       Content-Length:104
        
       <?xml version="1.0"?>
        
       <speak>
       <paragraph>
                <sentence>You have 4 new messages.</sentence>
                <sentence>The first is from <say-as
                type="name">Stephanie Williams</say-as>
                and arrived at <break/>
                <say-as type="time">3:45pm</say-as>.</sentence>
        
                <sentence>The subject is <prosody
                rate="-20%">ski trip</prosody></sentence>
       </paragraph>
       </speak>
        
   Example 3:
       Content-Type:multipart/mixed; boundary="--break"
        
       --break
       Content-Type:text/uri-list
       Content-Length:176

       http://www.cisco.com/ASR-Introduction.sml
       http://www.cisco.com/ASR-Document-Part1.sml
       http://www.cisco.com/ASR-Document-Part2.sml
       http://www.cisco.com/ASR-Conclusion.sml
        
       --break
       Content-Type:application/synthesis+ssml
       Content-Length:104

       <?xml version="1.0"?>
       <speak>
       <paragraph>
                <sentence>You have 4 new messages.</sentence>
                <sentence>The first is from <say-as
                type="name">Stephanie Williams</say-as>
                and arrived at <break/>
                <say-as type="time">3:45pm</say-as>.</sentence>
        
                <sentence>The subject is <prosody
                rate="-20%">ski trip</prosody></sentence>
       </paragraph>
       </speak>
        --break
        
7.6. SET-PARAMS

The SET-PARAMS method, from the client to server, tells the synthesizer resource to define default synthesizer context parameters, like voice characteristics and prosody, etc. If the server accepted and set all parameters, it MUST return a Response-Status of 200. If it chose to ignore some optional parameters, it MUST return 201.

If some of the parameters being set are unsupported or have illegal values, the server accepts and sets the remaining parameters and MUST respond with a Response-Status of 403 or 404, and MUST include in the response the header fields that could not be set.

   Example:
     C->S:SET-PARAMS 543256 MRCP/1.0
         Voice-gender:female
         Voice-category:adult
         Voice-variant:3
        
     S->C:MRCP/1.0 543256 200 COMPLETE
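
The exchange above can be sketched in Python as follows. This is an illustrative fragment only: the function names are hypothetical, and the sketch assumes CRLF line endings with an empty line terminating the header block, which this section does not restate. The status interpretation follows the rules given at the start of this section.

```python
def build_request(method: str, request_id: int, headers: dict) -> str:
    """Serialize a request line plus header fields.

    Assumption: lines end in CRLF and an empty line ends the headers.
    """
    lines = [f"{method} {request_id} MRCP/1.0"]
    lines += [f"{name}:{value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_response_line(line: str):
    """Split 'MRCP/1.0 <request-id> <status> <state>' into typed parts."""
    version, request_id, status, state = line.strip().split(" ")
    if version != "MRCP/1.0":
        raise ValueError(f"unexpected version: {version}")
    return int(request_id), int(status), state


def set_params_outcome(status: int) -> str:
    """Describe a SET-PARAMS status code as explained in this section."""
    if status == 200:
        return "all parameters set"
    if status == 201:
        return "some optional parameters ignored"
    if status in (403, 404):
        return "some parameters unsupported or illegal; see returned headers"
    return "other"
```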
        
7.7. GET-PARAMS

The GET-PARAMS method, from the client to server, asks the synthesizer resource for its current synthesizer context parameters, like voice characteristics and prosody, etc. The client SHOULD send the list of parameters it wants to read from the server by listing a set of empty parameter header fields. If a specific list is not specified then the server SHOULD return all the settable parameters including vendor-specific parameters and their current values. The wild card use can be very intensive as the number of settable parameters can be large depending on the vendor. Hence, it is RECOMMENDED that the client does not use the wildcard GET-PARAMS operation very often.

   Example:
     C->S:GET-PARAMS 543256 MRCP/1.0
          Voice-gender:
          Voice-category:
          Voice-variant:
          Vendor-Specific-Parameters:com.mycorp.param1;
                      com.mycorp.param2
        
     S->C:MRCP/1.0 543256 200 COMPLETE
          Voice-gender:female
          Voice-category:adult
          Voice-variant:3
          Vendor-Specific-Parameters:com.mycorp.param1="Company Name";
                         com.mycorp.param2="124324234@mycorp.com"

7.8. SPEAK

The SPEAK method from the client to the server provides the synthesizer resource with the speech text and initiates speech synthesis and streaming. The SPEAK method can carry voice and prosody header fields that define the behavior of the voice being synthesized, as well as the actual marked-up text to be spoken. If specific voice and prosody parameters are specified as part of the speech markup text, they take precedence over the values specified in the header fields and those set using a previous SET-PARAMS request.

When applying voice parameters, there are 3 levels of scope. The highest precedence goes to those specified within the speech markup text, followed by those specified in the header fields of the SPEAK request, which apply to that SPEAK request only, followed by the session default values, which can be set using the SET-PARAMS request and apply to the whole session moving forward.
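
The three levels of scope can be illustrated with a layered lookup. The Python sketch below is one possible way to resolve the effective value of each parameter; the function name and the plain dictionaries used for each level are assumptions made for this example.

```python
from collections import ChainMap


def effective_voice_params(markup: dict, speak_headers: dict,
                           session_defaults: dict) -> dict:
    """Merge the three scopes; earlier arguments take precedence."""
    # ChainMap searches its maps left to right, so markup wins over the
    # SPEAK header fields, which win over the SET-PARAMS defaults.
    return dict(ChainMap(markup, speak_headers, session_defaults))
```

For instance, with session defaults of gender "female" and variant "3", SPEAK header fields of gender "neutral" and category "teenager", and markup that sets gender "male", the result is gender "male", category "teenager", and variant "3".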

If the resource is idle and the SPEAK request is being actively processed, the resource will respond with a success status code and a request-state of IN-PROGRESS.

如果资源空闲且正在积极处理SPEAK请求,则资源将以成功状态代码和正在进行的请求状态进行响应。

If the resource is in the speaking or paused states (i.e., it is in the middle of processing a previous SPEAK request), the status returns success and a request-state of PENDING. This means that this SPEAK request is in queue and will be processed after the currently active SPEAK request is completed.

如果资源处于发言或暂停状态(即,在处理先前的发言请求的中间),则状态返回成功和请求状态。这意味着此发言请求处于队列中,将在当前活动发言请求完成后进行处理。

For the synthesizer resource, this is the only request that can return a request-state of IN-PROGRESS or PENDING. When the text to be synthesized is complete, the resource will issue a SPEAK-COMPLETE event with the request-id of the SPEAK message and a request-state of COMPLETE.

对于合成器资源,这是唯一可以返回“正在进行”或“挂起”请求状态的请求。当要合成的文本完成时,资源将发出SPEAK-complete事件,该事件具有SPEAK消息的请求id和请求状态complete。

   Example:
     C->S:SPEAK 543257 MRCP/1.0
          Voice-gender:neutral
          Voice-category:teenager
          Prosody-volume:medium
          Content-Type:application/synthesis+ssml
          Content-Length:104

          <?xml version="1.0"?>
          <speak>
          <paragraph>
            <sentence>You have 4 new messages.</sentence>
            <sentence>The first is from <say-as
            type="name">Stephanie Williams</say-as>
            and arrived at <break/>
            <say-as type="time">3:45pm</say-as>.</sentence>
            <sentence>The subject is <prosody
            rate="-20%">ski trip</prosody></sentence>
          </paragraph>
          </speak>

     S->C:MRCP/1.0 543257 200 IN-PROGRESS
        
     S->C:SPEAK-COMPLETE 543257 COMPLETE MRCP/1.0
          Completion-Cause:000 normal
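
The three scope levels for voice parameters can be sketched as a layered lookup. The following Python fragment is illustrative only and is not part of the protocol. The function name effective_voice_params and the sample values are assumptions, and the sketch treats markup-level parameters as a flat dictionary, whereas real markup applies them only to the text it encloses.

```python
from collections import ChainMap


def effective_voice_params(markup_params, speak_headers, session_defaults):
    """Resolve voice parameters using the three scope levels for SPEAK.

    Lookup order, highest precedence first: values from the speech markup,
    header fields of this SPEAK request, session defaults from SET-PARAMS.
    """
    return dict(ChainMap(markup_params, speak_headers, session_defaults))


# Hypothetical values, chosen only to show each level winning once.
session_defaults = {"Voice-gender": "female", "Prosody-volume": "soft"}
speak_headers = {"Voice-gender": "neutral", "Prosody-volume": "medium"}
markup_params = {"Prosody-rate": "-20%"}

resolved = effective_voice_params(markup_params, speak_headers, session_defaults)
```

Here the SPEAK header fields override both session defaults, and the rate given in the markup is used because no other level sets it.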
        
7.9. STOP

The STOP method from the client to the server tells the resource to stop speaking if it is speaking something.

The STOP request can be sent with an active-request-id-list header field to stop zero or more specific SPEAK requests that may be in the queue; the server returns a response code of 200 (Success). If no active-request-id-list header field is sent in the STOP request, it will terminate all outstanding SPEAK requests.

If a STOP request successfully terminated one or more PENDING or IN-PROGRESS SPEAK requests, then the response message contains an active-request-id-list header field listing the SPEAK request-ids that were terminated. Otherwise, there will be no active-request-id-list header field in the response. No SPEAK-COMPLETE events will be sent for these terminated requests.

If a SPEAK request that was IN-PROGRESS and speaking was stopped, the next pending SPEAK request, if any, would become IN-PROGRESS and move to the speaking state.

If a SPEAK request that was IN-PROGRESS and in the paused state was stopped, the next pending SPEAK request, if any, would become IN-PROGRESS and move to the paused state.

   Example:
     C->S:SPEAK 543258 MRCP/1.0
          Content-Type:application/synthesis+ssml
          Content-Length:104

          <?xml version="1.0"?>
          <speak>
          <paragraph>
            <sentence>You have 4 new messages.</sentence>
            <sentence>The first is from <say-as
            type="name">Stephanie Williams</say-as>
            and arrived at <break/>
            <say-as type="time">3:45pm</say-as>.</sentence>
            <sentence>The subject is <prosody
            rate="-20%">ski trip</prosody></sentence>
          </paragraph>
          </speak>

     S->C:MRCP/1.0 543258 200 IN-PROGRESS

     C->S:STOP 543259 MRCP/1.0
        
     S->C:MRCP/1.0 543259 200 COMPLETE
          Active-Request-Id-List:543258
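
The queueing rules for SPEAK and the termination rules for STOP can be modelled with a small in-memory queue. This is a minimal sketch under stated assumptions, not a real MRCP stack: the class name SynthesizerQueue and its attributes are hypothetical, and the returned list stands in for the active-request-id-list header of the STOP response (an empty list meaning the header is omitted).

```python
from collections import deque


class SynthesizerQueue:
    """Toy model of SPEAK queueing and STOP handling."""

    def __init__(self):
        self.active = None        # request-id of the IN-PROGRESS SPEAK, if any
        self.paused = False       # True while the resource is in the paused state
        self.pending = deque()    # request-ids of PENDING SPEAK requests

    def speak(self, request_id):
        """Return the request-state reported in the SPEAK response."""
        if self.active is None:
            self.active = request_id
            return "IN-PROGRESS"
        self.pending.append(request_id)
        return "PENDING"

    def stop(self, request_ids=None):
        """Terminate the listed SPEAK requests, or all of them if none are listed."""
        outstanding = [] if self.active is None else [self.active]
        outstanding += list(self.pending)
        if request_ids is None:
            targets = outstanding
        else:
            targets = [r for r in outstanding if r in request_ids]
        self.pending = deque(r for r in self.pending if r not in targets)
        if self.active in targets:
            # The next pending request becomes IN-PROGRESS and keeps the
            # speaking or paused state of the request it replaces.
            self.active = self.pending.popleft() if self.pending else None
            if self.active is None:
                self.paused = False
        return targets


q = SynthesizerQueue()
states = [q.speak(543258), q.speak(543259), q.speak(543260)]
terminated = q.stop([543258])
```

After the STOP, request 543259 is the active request and 543260 is still pending; a later stop() without arguments would terminate both.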
        
7.10. BARGE-IN-OCCURRED

The BARGE-IN-OCCURRED method is a mechanism for the client to communicate a barge-in-able event it detects to the speech resource.

This method is useful in two scenarios:

1. The client has detected some events like DTMF digits or other barge-in-able events and wants to communicate that to the synthesizer.

2. The recognizer resource and the synthesizer resource are in different servers. In this case, the client MUST act as a proxy: it receives the event from the recognition resource and then sends a BARGE-IN-OCCURRED method to the synthesizer. In such cases, the BARGE-IN-OCCURRED method would also have a proxy-sync-id header field received from the resource generating the original event.

If a SPEAK request is active with kill-on-barge-in enabled, and the BARGE-IN-OCCURRED event is received, the synthesizer should stop streaming out audio. It should also terminate any speech requests queued behind the current active one, irrespective of whether they have barge-in enabled or not. If a barge-in-able prompt was playing and it was terminated, the response MUST contain the request-ids of all SPEAK requests that were terminated in its active-request-id-list. There will be no SPEAK-COMPLETE events generated for these requests.

If the synthesizer and the recognizer are on the same server, they could be optimized for a quicker kill-on-barge-in response by having them interact directly based on a common RTSP session-id. In these cases, the client MUST still proxy the recognition event through a BARGE-IN-OCCURRED method, but the synthesizer resource may have already stopped and sent a SPEAK-COMPLETE event with a barge-in completion cause code. If there were no SPEAK requests terminated as a result of the BARGE-IN-OCCURRED method, the response would still be a 200 success, but MUST NOT contain an active-request-id-list header field.

     C->S:SPEAK 543258 MRCP/1.0
          Voice-gender:neutral
          Voice-category:teenager
          Prosody-volume:medium
          Content-Type:application/synthesis+ssml
          Content-Length:104

          <?xml version="1.0"?>
          <speak>
          <paragraph>
            <sentence>You have 4 new messages.</sentence>
            <sentence>The first is from <say-as
            type="name">Stephanie Williams</say-as>
            and arrived at <break/>
            <say-as type="time">3:45pm</say-as>.</sentence>
            <sentence>The subject is <prosody
            rate="-20%">ski trip</prosody></sentence>
          </paragraph>
          </speak>

     S->C:MRCP/1.0 543258 200 IN-PROGRESS

     C->S:BARGE-IN-OCCURRED 543259 MRCP/1.0
          Proxy-Sync-Id:987654321
        
     S->C:MRCP/1.0 543259 200 COMPLETE
          Active-Request-Id-List:543258
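
The effect of BARGE-IN-OCCURRED on the SPEAK queue can be summarized in a few lines. The sketch below is an assumption-laden model rather than normative behavior: the function name handle_barge_in and the tuple layout are hypothetical, and it assumes that nothing is terminated when the active request does not have kill-on-barge-in enabled.

```python
def handle_barge_in(active, pending):
    """Decide what a BARGE-IN-OCCURRED request terminates.

    `active` is None or a (request_id, kill_on_barge_in) tuple for the
    IN-PROGRESS SPEAK; `pending` is a list of such tuples for queued
    requests. Returns (status_code, terminated_ids). An empty list means
    the response carries no active-request-id-list header field.
    """
    if active is None or not active[1]:
        return 200, []
    # The active prompt is barge-in-able: stop it and everything queued
    # behind it, whether or not the queued requests have barge-in enabled.
    return 200, [active[0]] + [request_id for request_id, _ in pending]


status, terminated_ids = handle_barge_in((543258, True), [(543259, False)])
```

In every case the status is 200; only the presence and content of the active-request-id-list differ.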
        
7.11. PAUSE

The PAUSE method from the client to the server tells the resource to pause speech, if it is speaking something. If a PAUSE method is issued on a session when a SPEAK is not active, the server SHOULD respond with a status of 402 ("Method not valid in this state"). If a PAUSE method is issued on a session when a SPEAK is active and paused, the server SHOULD respond with a status of 200 ("Success"). If a SPEAK request was active, the server MUST return an active-request-id-list header with the request-id of the SPEAK request that was paused.

     C->S:SPEAK 543258 MRCP/1.0
          Voice-gender:neutral
          Voice-category:teenager
          Prosody-volume:medium
          Content-Type:application/synthesis+ssml
          Content-Length:104

          <?xml version="1.0"?>
          <speak>
          <paragraph>
            <sentence>You have 4 new messages.</sentence>
            <sentence>The first is from <say-as
            type="name">Stephanie Williams</say-as>
            and arrived at <break/>
            <say-as type="time">3:45pm</say-as>.</sentence>
            <sentence>The subject is <prosody
            rate="-20%">ski trip</prosody></sentence>
          </paragraph>
          </speak>

     S->C:MRCP/1.0 543258 200 IN-PROGRESS

     C->S:PAUSE 543259 MRCP/1.0
        
     S->C:MRCP/1.0 543259 200 COMPLETE
          Active-Request-Id-List:543258
        
7.12. RESUME

The RESUME method from the client to the server tells a paused synthesizer resource to continue speaking. If a RESUME method is issued on a session when a SPEAK is not active, the server SHOULD respond with a status of 402 ("Method not valid in this state"). If a RESUME method is issued on a session when a SPEAK is active and speaking (i.e., not paused), the server SHOULD respond with a status of 200 ("Success"). If a SPEAK request was active, the server MUST return an active-request-id-list header with the request-id of the SPEAK request that was resumed.

   Example:
     C->S:SPEAK 543258 MRCP/1.0
          Voice-gender:neutral
          Voice-category:teenager
          Prosody-volume:medium
          Content-Type:application/synthesis+ssml
          Content-Length:104

          <?xml version="1.0"?>
          <speak>
          <paragraph>
              <sentence>You have 4 new messages.</sentence>
              <sentence>The first is from <say-as
              type="name">Stephanie Williams</say-as>
              and arrived at <break/>
              <say-as type="time">3:45pm</say-as>.</sentence>
              <sentence>The subject is <prosody
              rate="-20%">ski trip</prosody></sentence>
          </paragraph>
          </speak>

     S->C:MRCP/1.0 543258 200 IN-PROGRESS

     C->S:PAUSE 543259 MRCP/1.0

     S->C:MRCP/1.0 543259 200 COMPLETE
          Active-Request-Id-List:543258

     C->S:RESUME 543260 MRCP/1.0
        
     S->C:MRCP/1.0 543260 200 COMPLETE
          Active-Request-Id-List:543258
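
The status codes that Sections 7.11 and 7.12 describe for PAUSE and RESUME follow one pattern, which the sketch below makes explicit. The function name pause_or_resume and its return layout are assumptions made for illustration; only the 402 and 200 codes and the active-request-id-list rule come from the text.

```python
def pause_or_resume(method, active_request_id, paused):
    """Model the PAUSE and RESUME responses.

    Returns (status_code, active_request_id_list, paused_after).
    """
    if method not in ("PAUSE", "RESUME"):
        raise ValueError("unsupported method: " + method)
    if active_request_id is None:
        # No SPEAK is active: 402, "Method not valid in this state".
        return 402, [], paused
    # PAUSE on an already paused SPEAK, or RESUME on one that is already
    # speaking, still succeeds with 200; the state simply does not change.
    return 200, [active_request_id], method == "PAUSE"


paused_result = pause_or_resume("PAUSE", 543258, False)
resumed_result = pause_or_resume("RESUME", 543258, True)
```

Both calls succeed and echo request-id 543258, matching the Active-Request-Id-List header fields in the example exchange.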
        
7.13. CONTROL

The CONTROL method from the client to the server tells a synthesizer that is speaking to modify what it is speaking on the fly. This method is used to make the synthesizer jump forward or backward in what is being spoken, change speaker rate and speaker parameters, etc. It affects the active or IN-PROGRESS SPEAK request. Depending on the implementation and capability of the synthesizer resource, it may allow this operation or one or more of its parameters.

When a CONTROL to jump forward is issued and the operation goes beyond the end of the active SPEAK method's text, the request succeeds. A SPEAK-COMPLETE event follows the response to the CONTROL method. If there are more SPEAK requests in the queue, the synthesizer resource will continue to process the next SPEAK method. When a CONTROL to jump backwards is issued and the operation jumps to the beginning of the speech data of the active SPEAK request, the response to the CONTROL request contains the speak-restart header.

These two behaviors can be used to rewind or fast-forward across multiple speech requests, if the client wants to break up a speech markup text into multiple SPEAK requests.

If a SPEAK request was active when the CONTROL method was received, the server MUST return an active-request-id-list header with the request-id of the SPEAK request that was active.

   Example:
     C->S:SPEAK 543258 MRCP/1.0
          Voice-gender:neutral
          Voice-category:teenager
          Prosody-volume:medium
          Content-Type:application/synthesis+ssml
          Content-Length:104

          <?xml version="1.0"?>
          <speak>
          <paragraph>
            <sentence>You have 4 new messages.</sentence>
            <sentence>The first is from <say-as
            type="name">Stephanie Williams</say-as>
            and arrived at <break/>
            <say-as type="time">3:45pm</say-as>.</sentence>
            <sentence>The subject is <prosody
            rate="-20%">ski trip</prosody></sentence>
          </paragraph>
          </speak>

     S->C:MRCP/1.0 543258 200 IN-PROGRESS

     C->S:CONTROL 543259 MRCP/1.0
          Prosody-rate:fast

     S->C:MRCP/1.0 543259 200 COMPLETE
          Active-Request-Id-List:543258

     C->S:CONTROL 543260 MRCP/1.0
          Jump-Size:-15 Words
        
     S->C:MRCP/1.0 543260 200 COMPLETE
          Active-Request-Id-List:543258
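
The two boundary cases for a CONTROL jump can be illustrated with a word-indexed model of the active SPEAK request. This is only a sketch: the function name apply_jump, the use of a word index, and the returned tuple are assumptions, and a real synthesizer may track position in other units.

```python
def apply_jump(position, total_words, jump_words):
    """Apply a word-based jump to the active SPEAK request.

    `position` is the index of the word being spoken, `total_words` the
    length of the text, and `jump_words` a signed offset (for example -15).
    Returns (new_position, speak_restart, speak_complete).
    """
    target = position + jump_words
    if jump_words > 0 and target >= total_words:
        # Jumping past the end succeeds; a SPEAK-COMPLETE event follows.
        return total_words, False, True
    if jump_words < 0 and target <= 0:
        # Jumping back to the start: the response carries speak-restart.
        return 0, True, False
    return target, False, False


rewound = apply_jump(10, 40, -15)
skipped = apply_jump(30, 40, 15)
```

A jump of -15 words from word 10 lands at the beginning and sets the speak-restart flag, while a jump of +15 from word 30 of 40 runs off the end and completes the request.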
        
7.14. SPEAK-COMPLETE

This is an Event message from the synthesizer resource to the client indicating that the SPEAK request was completed. The request-id header field will match the request-id of the SPEAK request that initiated the speech that just completed. The request-state field should be COMPLETE, indicating that this is the last Event with that request-id and that the request with that request-id is now complete. The completion-cause header field specifies the cause code pertaining to the status and reason of request completion, such as whether the SPEAK completed normally, because of an error, or because of kill-on-barge-in.

   Example:
     C->S:SPEAK 543260 MRCP/1.0
          Voice-gender:neutral
          Voice-category:teenager
          Prosody-volume:medium
          Content-Type:application/synthesis+ssml
          Content-Length:104

          <?xml version="1.0"?>
          <speak>
          <paragraph>
            <sentence>You have 4 new messages.</sentence>
            <sentence>The first is from <say-as
            type="name">Stephanie Williams</say-as>
            and arrived at <break/>
            <say-as type="time">3:45pm</say-as>.</sentence>
            <sentence>The subject is <prosody
            rate="-20%">ski trip</prosody></sentence>
          </paragraph>
          </speak>

     S->C:MRCP/1.0 543260 200 IN-PROGRESS

     S->C:SPEAK-COMPLETE 543260 COMPLETE MRCP/1.0
          Completion-Cause:000 normal
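
The examples in this section use three start-line shapes: a request (method, request-id, version), a response (version, request-id, status code, request-state), and an event (event name, request-id, request-state, version). A client could tell them apart by token count and by where the version token sits. The parser below is a minimal sketch inferred from those examples; the function name parse_start_line and the dictionary keys are assumptions, and it does not validate method or event names.

```python
def parse_start_line(line):
    """Classify an MRCP/1.0 start line as a request, response, or event."""
    parts = line.split()
    if len(parts) == 4 and parts[0].startswith("MRCP/"):
        version, request_id, status, state = parts
        return {"kind": "response", "version": version,
                "request_id": int(request_id), "status": int(status),
                "request_state": state}
    if len(parts) == 4 and parts[3].startswith("MRCP/"):
        name, request_id, state, version = parts
        return {"kind": "event", "name": name,
                "request_id": int(request_id), "request_state": state,
                "version": version}
    if len(parts) == 3 and parts[2].startswith("MRCP/"):
        name, request_id, version = parts
        return {"kind": "request", "name": name,
                "request_id": int(request_id), "version": version}
    raise ValueError("not an MRCP start line: " + repr(line))


event = parse_start_line("SPEAK-COMPLETE 543260 COMPLETE MRCP/1.0")
```

Matching the request_id of the parsed event against the request-id of the original SPEAK request is how the client correlates the completion with the request that started it.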

7.15. SPEECH-MARKER

This is an event generated by the synthesizer resource to the client when it hits a marker tag in the speech markup it is currently processing. The request-id field in the header matches the request-id of the SPEAK request that initiated the speech. The request-state field should be IN-PROGRESS, as the speech is still not complete and there is more to be spoken. The actual speech marker tag hit, describing where the synthesizer is in the speech markup, is returned in the speech-marker header field.

   Example:
     C->S:SPEAK 543261 MRCP/1.0
          Voice-gender:neutral
          Voice-category:teenager
          Prosody-volume:medium
          Content-Type:application/synthesis+ssml
          Content-Length:104

          <?xml version="1.0"?>
          <speak>
          <paragraph>
            <sentence>You have 4 new messages.</sentence>
            <sentence>The first is from <say-as
            type="name">Stephanie Williams</say-as>
            and arrived at <break/>
            <say-as type="time">3:45pm</say-as>.</sentence>
            <mark name="here"/>
            <sentence>The subject is
               <prosody rate="-20%">ski trip</prosody>
            </sentence>
            <mark name="ANSWER"/>
          </paragraph>
          </speak>

     S->C:MRCP/1.0 543261 200 IN-PROGRESS

     S->C:SPEECH-MARKER 543261 IN-PROGRESS MRCP/1.0
          Speech-Marker:here

     S->C:SPEECH-MARKER 543261 IN-PROGRESS MRCP/1.0
          Speech-Marker:ANSWER
        
     S->C:SPEAK-COMPLETE 543261 COMPLETE MRCP/1.0
          Completion-Cause:000 normal
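
A client that wants to know in advance which SPEECH-MARKER events to expect can collect the mark elements from the markup it sends. The sketch below uses Python's standard xml.etree.ElementTree module. The function name expected_marker_events is hypothetical, the markup is a shortened variant of the example above, and the sketch assumes the synthesizer reports every mark in document order.

```python
import xml.etree.ElementTree as ET

SSML = """<?xml version="1.0"?>
<speak>
<paragraph>
  <sentence>You have 4 new messages.</sentence>
  <mark name="here"/>
  <sentence>The subject is <prosody rate="-20%">ski trip</prosody></sentence>
  <mark name="ANSWER"/>
</paragraph>
</speak>"""


def expected_marker_events(ssml, request_id):
    """List the SPEECH-MARKER start lines and headers a client could expect."""
    root = ET.fromstring(ssml)
    events = []
    for mark in root.iter("mark"):  # iter() walks the tree in document order
        events.append((f"SPEECH-MARKER {request_id} IN-PROGRESS MRCP/1.0",
                       f"Speech-Marker:{mark.get('name')}"))
    return events


events = expected_marker_events(SSML, 543261)
```

For this markup the list holds two entries, one for the mark named here and one for ANSWER, in the same order as the events in the example exchange.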
        
8. Speech Recognizer Resource

The speech recognizer resource is capable of receiving an incoming voice stream and providing the client with an interpretation of what was spoken in textual form.

8.1. Recognizer State Machine

The recognizer resource is controlled by MRCP requests from the client. Similarly, the resource can respond to these requests or generate asynchronous events to the client to indicate certain conditions during the processing of the stream. Hence, the recognizer maintains states to correlate MRCP requests from the client. The state transitions are described below.

        Idle                   Recognizing               Recognized
        State                  State                     State
         |                       |                          |
         |---------RECOGNIZE---->|---RECOGNITION-COMPLETE-->|
         |<------STOP------------|<-----RECOGNIZE-----------|
         |                       |                          |
         |                       |              |-----------|
         |              |--------|       GET-RESULT         |
         |       START-OF-SPEECH |              |---------->|
         |------------| |------->|                          |
         |            |          |----------|               |
         |      DEFINE-GRAMMAR   | RECOGNITION-START-TIMERS |
         |<-----------|          |<---------|               |
         |                       |                          |
         |                       |                          |
         |-------|               |                          |
         |      STOP             |                          |
         |<------|               |                          |
         |                                                  |
         |<-------------------STOP--------------------------|
         |<-------------------DEFINE-GRAMMAR----------------|
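
The diagram can be read as a transition table keyed by state and message. The following sketch transcribes the edges as they appear to be drawn; the lowercase state names, the TRANSITIONS table, and the helper functions are assumptions introduced for illustration, and any message not drawn for a state is treated as an error here.

```python
TRANSITIONS = {
    ("idle", "RECOGNIZE"): "recognizing",
    ("idle", "DEFINE-GRAMMAR"): "idle",
    ("idle", "STOP"): "idle",
    ("recognizing", "START-OF-SPEECH"): "recognizing",
    ("recognizing", "RECOGNITION-START-TIMERS"): "recognizing",
    ("recognizing", "RECOGNITION-COMPLETE"): "recognized",
    ("recognizing", "STOP"): "idle",
    ("recognized", "GET-RESULT"): "recognized",
    ("recognized", "RECOGNIZE"): "recognizing",
    ("recognized", "STOP"): "idle",
    ("recognized", "DEFINE-GRAMMAR"): "idle",
}


def next_state(state, message):
    """Return the state after `message`, or raise if no such edge is drawn."""
    try:
        return TRANSITIONS[(state, message)]
    except KeyError:
        raise ValueError(f"{message} is not shown for state {state!r}") from None


def run(messages, state="idle"):
    """Feed a sequence of methods and events through the table."""
    for message in messages:
        state = next_state(state, message)
    return state


final_state = run(["RECOGNIZE", "START-OF-SPEECH",
                   "RECOGNITION-COMPLETE", "GET-RESULT"])
```

A typical recognition thus ends in the recognized state, from which a new RECOGNIZE returns to recognizing and a STOP or DEFINE-GRAMMAR returns to idle.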
        
8.2. Recognizer Methods

   The recognizer supports the following methods.
     recognizer-method   =    SET-PARAMS
                         /    GET-PARAMS
                         /    DEFINE-GRAMMAR
                         /    RECOGNIZE
                         /    GET-RESULT
                         /    RECOGNITION-START-TIMERS
                         /    STOP
        
8.3. Recognizer Events

The recognizer may generate the following events.

     recognizer-event    =    START-OF-SPEECH
                         /    RECOGNITION-COMPLETE

8.4. Recognizer Header Fields

A recognizer message may contain header fields containing request options and information to augment the Method, Response, or Event message it is associated with.

     recognizer-header   =    confidence-threshold     ; Section 8.4.1
                         /    sensitivity-level        ; Section 8.4.2
                         /    speed-vs-accuracy        ; Section 8.4.3
                         /    n-best-list-length       ; Section 8.4.4
                         /    no-input-timeout         ; Section 8.4.5
                         /    recognition-timeout      ; Section 8.4.6
                         /    waveform-url             ; Section 8.4.7
                         /    completion-cause         ; Section 8.4.8
                         /    recognizer-context-block ; Section 8.4.9
                         /    recognizer-start-timers  ; Section 8.4.10
                         /    vendor-specific          ; Section 8.4.11
                         /    speech-complete-timeout  ; Section 8.4.12
                         /    speech-incomplete-timeout; Section 8.4.13
                         /    dtmf-interdigit-timeout  ; Section 8.4.14
                         /    dtmf-term-timeout        ; Section 8.4.15
                         /    dtmf-term-char           ; Section 8.4.16
                         /    fetch-timeout            ; Section 8.4.17
                         /    failed-uri               ; Section 8.4.18
                         /    failed-uri-cause         ; Section 8.4.19
                         /    save-waveform            ; Section 8.4.20
                         /    new-audio-channel        ; Section 8.4.21
                         /    speech-language          ; Section 8.4.22
        
   Parameter                   Support     Methods/Events

   confidence-threshold        MANDATORY   SET-PARAMS, RECOGNIZE, GET-RESULT
   sensitivity-level           Optional    SET-PARAMS, GET-PARAMS, RECOGNIZE
   speed-vs-accuracy           Optional    SET-PARAMS, GET-PARAMS, RECOGNIZE
   n-best-list-length          Optional    SET-PARAMS, GET-PARAMS, RECOGNIZE, GET-RESULT
   no-input-timeout            MANDATORY   SET-PARAMS, GET-PARAMS, RECOGNIZE
   recognition-timeout         MANDATORY   SET-PARAMS, GET-PARAMS, RECOGNIZE
   waveform-url                MANDATORY   RECOGNITION-COMPLETE
   completion-cause            MANDATORY   DEFINE-GRAMMAR, RECOGNIZE, RECOGNITION-COMPLETE
   recognizer-context-block    Optional    SET-PARAMS, GET-PARAMS
   recognizer-start-timers     MANDATORY   RECOGNIZE
   vendor-specific             MANDATORY   SET-PARAMS, GET-PARAMS
   speech-complete-timeout     MANDATORY   SET-PARAMS, GET-PARAMS, RECOGNIZE
   speech-incomplete-timeout   MANDATORY   SET-PARAMS, GET-PARAMS, RECOGNIZE
   dtmf-interdigit-timeout     MANDATORY   SET-PARAMS, GET-PARAMS, RECOGNIZE
   dtmf-term-timeout           MANDATORY   SET-PARAMS, GET-PARAMS, RECOGNIZE
   dtmf-term-char              MANDATORY   SET-PARAMS, GET-PARAMS, RECOGNIZE
   fetch-timeout               MANDATORY   SET-PARAMS, GET-PARAMS, RECOGNIZE, DEFINE-GRAMMAR
   failed-uri                  MANDATORY   DEFINE-GRAMMAR response, RECOGNITION-COMPLETE
   failed-uri-cause            MANDATORY   DEFINE-GRAMMAR response, RECOGNITION-COMPLETE
   save-waveform               MANDATORY   SET-PARAMS, GET-PARAMS, RECOGNIZE
   new-audio-channel           MANDATORY   RECOGNIZE
   speech-language             MANDATORY   SET-PARAMS, GET-PARAMS, RECOGNIZE, DEFINE-GRAMMAR

8.4.1. Confidence Threshold

When a recognition resource recognizes or matches a spoken phrase with some portion of the grammar, it associates a confidence level with that conclusion. The confidence-threshold parameter tells the recognizer resource what confidence level should be considered a successful match. The value is an integer from 0 to 100, on the same scale as the recognizer's confidence in a recognition result. If the recognizer determines that its confidence in all its recognition results is less than the confidence threshold, then it MUST return no-match as the recognition result. This header field MAY occur in RECOGNIZE, SET-PARAMS, or GET-PARAMS. The default value for this field is platform specific.

     confidence-threshold =    "Confidence-Threshold" ":" 1*DIGIT CRLF
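
The rule above can be checked mechanically. The following is a minimal Python sketch, not part of the protocol: the function name is hypothetical, and tolerating spaces or tabs after the colon is an assumption about real traffic rather than something the ABNF states.

```python
import re

# Mirrors the ABNF above: "Confidence-Threshold" ":" 1*DIGIT CRLF.
# Allowing optional spaces/tabs after the colon is an assumption.
_CONFIDENCE_RE = re.compile(r"Confidence-Threshold:[ \t]*([0-9]+)\r\n")


def parse_confidence_threshold(line: str) -> int:
    """Return the threshold carried by one raw header line.

    Raises ValueError if the line is malformed or the value falls
    outside the 0 to 100 range described in the text.
    """
    match = _CONFIDENCE_RE.fullmatch(line)
    if match is None:
        raise ValueError(f"malformed Confidence-Threshold header: {line!r}")
    value = int(match.group(1))
    if not 0 <= value <= 100:
        raise ValueError(f"confidence threshold out of range 0-100: {value}")
    return value
```

The same pattern could be reused for the other 1*DIGIT header fields in this section, with the range check replaced by the platform-specific MAXTIMEOUT where applicable.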
        
8.4.2. Sensitivity Level

To filter out background noise and not mistake it for speech, the recognizer may support a variable level of sound sensitivity. The sensitivity-level parameter allows the client to set this value on the recognizer. This header field MAY occur in RECOGNIZE, SET-PARAMS, or GET-PARAMS. A higher value for this field means higher sensitivity. The default value for this field is platform specific.

     sensitivity-level   =    "Sensitivity-Level" ":" 1*DIGIT CRLF
        
8.4.3. Speed Vs Accuracy

Depending on the implementation and capability of the recognizer resource, it may be tunable towards speed or accuracy. Higher accuracy may mean more processing and higher CPU utilization, and therefore fewer calls per media server, and vice versa. The speed-vs-accuracy header tunes this parameter on the resource. This header field MAY occur in RECOGNIZE, SET-PARAMS, or GET-PARAMS. A higher value for this field means higher speed. The default value for this field is platform specific.

     speed-vs-accuracy   =     "Speed-Vs-Accuracy" ":" 1*DIGIT CRLF
        
8.4.4. N Best List Length

When the recognizer matches an incoming stream with the grammar, it may come up with more than one alternative match because of confidence levels in certain words or conversation paths. If this header field is not specified, by default, the recognition resource will only return the best match above the confidence threshold. By setting this parameter, the client can ask the recognition resource to send it more than one alternative. All alternatives must still be above the confidence threshold. A value greater than one does not guarantee that the recognizer will send the requested number of alternatives. This header field MAY occur in RECOGNIZE, SET-PARAMS, or GET-PARAMS. The minimum value for this field is 1. The default value for this field is 1.

     n-best-list-length  =    "N-Best-List-Length" ":" 1*DIGIT CRLF
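
How the confidence threshold and the N-best list length interact can be illustrated with a short Python sketch. The (text, confidence) tuple shape and the function name are hypothetical. Treating a confidence equal to the threshold as acceptable is an assumption drawn from the Section 8.4.1 wording, which only requires no-match when confidence is less than the threshold.

```python
from typing import List, Tuple

# Hypothetical hypothesis shape: (recognized text, confidence 0-100).
Hypothesis = Tuple[str, int]


def select_n_best(
    hypotheses: List[Hypothesis],
    confidence_threshold: int,
    n_best_list_length: int = 1,
) -> List[Hypothesis]:
    """Return at most n_best_list_length hypotheses that pass the threshold.

    An empty list corresponds to a no-match result. Fewer alternatives
    than requested may be returned, as the text allows.
    """
    if n_best_list_length < 1:
        raise ValueError("N-Best-List-Length has a minimum value of 1")
    accepted = [h for h in hypotheses if h[1] >= confidence_threshold]
    # Highest confidence first; sorted() is stable, so ties keep input order.
    accepted = sorted(accepted, key=lambda h: h[1], reverse=True)
    return accepted[:n_best_list_length]
```

For instance, with hypotheses of confidence 80, 60, and 30, a threshold of 50, and a list length of 2, the sketch returns the first two; raising the threshold to 90 yields an empty list.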
        
8.4.5. No Input Timeout

When recognition is started and there is no speech detected for a certain period of time, the recognizer can send a RECOGNITION-COMPLETE event to the client and terminate the recognition operation. The no-input-timeout header field can set this timeout value. The value is in milliseconds. This header field MAY occur in RECOGNIZE, SET-PARAMS, or GET-PARAMS. The value for this field ranges from 0 to MAXTIMEOUT, where MAXTIMEOUT is platform specific. The default value for this field is platform specific.

     no-input-timeout    =    "No-Input-Timeout" ":" 1*DIGIT CRLF
        
8.4.6. Recognition Timeout

When recognition is started and there is no match for a certain period of time, the recognizer can send a RECOGNITION-COMPLETE event to the client and terminate the recognition operation. The recognition-timeout parameter field sets this timeout value. The value is in milliseconds. The value for this field ranges from 0 to MAXTIMEOUT, where MAXTIMEOUT is platform specific. The default value is 10 seconds. This header field MAY occur in RECOGNIZE, SET-PARAMS or GET-PARAMS.

     recognition-timeout =    "Recognition-Timeout" ":" 1*DIGIT CRLF
        
8.4.7. Waveform URL

If the save-waveform header field is set to true, the recognizer MUST record the incoming audio stream of the recognition into a file and provide a URI for the client to access it. This header MUST be present in the RECOGNITION-COMPLETE event if the save-waveform header field was set to true. The URL value of the header MUST be NULL if there was some error condition preventing the server from recording. Otherwise, the URL generated by the server SHOULD be globally unique across the server and all its recognition sessions. The URL SHOULD be available until the session is torn down.

     waveform-url        =    "Waveform-URL" ":" Url CRLF

8.4.8. Completion Cause

This header field MUST be part of a RECOGNITION-COMPLETE event coming from the recognizer resource to the client. It indicates the reason behind the RECOGNIZE method completion. This header field MUST also be sent in the DEFINE-GRAMMAR and RECOGNIZE responses if they return with a failure status and a COMPLETE state.

   Cause-Code  Cause-Name               Description

   000         success                  RECOGNIZE completed with a match or DEFINE-GRAMMAR succeeded in downloading and compiling the grammar.
   001         no-match                 RECOGNIZE completed, but no match was found.
   002         no-input-timeout         RECOGNIZE completed without a match due to a no-input-timeout.
   003         recognition-timeout      RECOGNIZE completed without a match due to a recognition-timeout.
   004         gram-load-failure        RECOGNIZE failed due to grammar load failure.
   005         gram-comp-failure        RECOGNIZE failed due to grammar compilation failure.
   006         error                    RECOGNIZE request terminated prematurely due to a recognizer error.
   007         speech-too-early         RECOGNIZE request terminated because speech was too early.
   008         too-much-speech-timeout  RECOGNIZE request terminated because speech was too long.
   009         uri-failure              Failure accessing a URI.
   010         language-unsupported     Language not supported.
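
A client typically maps the cause code back to its name before deciding how to react. The Python sketch below is illustrative only: the table is copied from the list above, but the function name and the assumed value layout (a three-digit code optionally followed by the cause name, such as "001 no-match") are assumptions, since this section does not give the ABNF for the header value.

```python
# Cause codes and names as listed in Section 8.4.8.
COMPLETION_CAUSES = {
    0: "success",
    1: "no-match",
    2: "no-input-timeout",
    3: "recognition-timeout",
    4: "gram-load-failure",
    5: "gram-comp-failure",
    6: "error",
    7: "speech-too-early",
    8: "too-much-speech-timeout",
    9: "uri-failure",
    10: "language-unsupported",
}


def parse_completion_cause(value: str):
    """Parse a value such as '001 no-match' into (code, name).

    The 'code [name]' layout is an assumption for illustration.
    Raises ValueError on unknown codes or a name that disagrees
    with the code.
    """
    parts = value.split()
    if not parts or not (parts[0].isascii() and parts[0].isdigit()):
        raise ValueError(f"missing numeric cause code: {value!r}")
    code = int(parts[0])
    name = COMPLETION_CAUSES.get(code)
    if name is None:
        raise ValueError(f"unknown cause code: {code}")
    if len(parts) > 1 and parts[1] != name:
        raise ValueError(f"cause name {parts[1]!r} does not match code {code}")
    return code, name
```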

8.4.9. Recognizer Context Block

This parameter MAY be sent as part of the SET-PARAMS or GET-PARAMS request. If the GET-PARAMS method contains this header field with no value, then it is a request to the recognizer to return the recognizer context block. The response to such a message MAY contain a recognizer context block as a message entity. If the server returns a recognizer context block, the response MUST contain this header field and its value MUST match the content-id of that entity.

If the SET-PARAMS method contains this header field, it MUST contain a message entity containing the recognizer context data, and a content-id matching this header field.

This content-id should match the content-id that came with the context data during the GET-PARAMS operation.

     recognizer-context-block = "Recognizer-Context-Block" ":" 1*ALPHA CRLF

8.4.10. Recognition Start Timers

This parameter MAY be sent as part of the RECOGNIZE request. A value of false tells the recognizer to start recognition, but not to start the no-input timer yet. The recognizer should not start the timers until the client sends a RECOGNITION-START-TIMERS request to the recognizer. This is useful in the scenario when the recognizer and synthesizer engines are not part of the same session. Here, when a kill-on-barge-in prompt is being played, the RECOGNIZE request needs to be active at the same time so that it can detect and implement kill-on-barge-in. However, the recognizer should not start the no-input timers until the prompt is finished. The default value is "true".

     recognizer-start-timers = "Recognizer-Start-Timers" ":" boolean-value CRLF

8.4.11. Vendor Specific Parameters

This set of headers allows the client to set Vendor Specific parameters.

This header can be sent in the SET-PARAMS method and is used to set vendor-specific parameters on the server. The vendor-av-pair-name can be any vendor-specific field name and conforms to the XML vendor-specific attribute naming convention. The vendor-av-pair-value is the value to set the attribute to, and needs to be quoted.

When asking the server for the current value of these parameters, this header can be sent in the GET-PARAMS method with a semicolon-separated list of the vendor-specific attribute names to retrieve. This header field MAY occur in SET-PARAMS or GET-PARAMS.

8.4.12. Speech Complete Timeout

This header field specifies the length of silence required following user speech before the speech recognizer finalizes a result (either accepting it or throwing a nomatch event). The speech-complete-timeout value is used when the recognizer currently has a complete match of an active grammar, and specifies how long it should wait for more input before declaring a match. By contrast, the incomplete timeout is used when the speech is an incomplete match to an active grammar. The value is in milliseconds.

     speech-complete-timeout = "Speech-Complete-Timeout" ":" 1*DIGIT CRLF

A long speech-complete-timeout value delays the result completion and, therefore, makes the computer's response slow. A short speech-complete-timeout may lead to an utterance being broken up inappropriately. Reasonable complete timeout values are typically in the range of 0.3 seconds to 1.0 seconds. The value for this field ranges from 0 to MAXTIMEOUT, where MAXTIMEOUT is platform specific. The default value for this field is platform specific. This header field MAY occur in RECOGNIZE, SET-PARAMS, or GET-PARAMS.

8.4.13. Speech Incomplete Timeout

This header field specifies the required length of silence following user speech, after which a recognizer finalizes a result. The incomplete timeout applies when the speech prior to the silence is an incomplete match of all active grammars. In this case, once the timeout is triggered, the partial result is rejected (with a nomatch event). The value is in milliseconds. The value for this field ranges from 0 to MAXTIMEOUT, where MAXTIMEOUT is platform specific. The default value for this field is platform specific.

     speech-incomplete-timeout = "Speech-Incomplete-Timeout" ":" 1*DIGIT CRLF

The speech-incomplete-timeout also applies when the speech prior to the silence is a complete match of an active grammar, but where it is possible to speak further and still match the grammar. By contrast, the complete timeout is used when the speech is a complete match to an active grammar and no further words can be spoken.

A long speech-incomplete-timeout value delays the result completion and, therefore, makes the computer's response slow. A short speech-incomplete-timeout may lead to an utterance being broken up inappropriately.

The speech-incomplete-timeout is usually longer than the speech-complete-timeout to allow users to pause mid-utterance (for example, to breathe). This header field MAY occur in RECOGNIZE, SET-PARAMS, or GET-PARAMS.
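
The choice between the two silence timeouts described in Sections 8.4.12 and 8.4.13 can be summarized in a few lines of Python. This is a sketch under stated assumptions: the MatchState enumeration and the function name are hypothetical, and a real recognizer would derive the state from its grammar search rather than receive it as an argument.

```python
from enum import Enum


class MatchState(Enum):
    """Hypothetical classification of the speech heard before the silence."""

    INCOMPLETE = "incomplete"           # no active grammar is fully matched
    COMPLETE_EXTENDABLE = "extendable"  # fully matched, more words could still match
    COMPLETE_FINAL = "final"            # fully matched, no further words possible


def silence_timeout_ms(
    state: MatchState,
    speech_complete_timeout_ms: int,
    speech_incomplete_timeout_ms: int,
) -> int:
    """Return how long the recognizer waits in silence before finalizing."""
    if state is MatchState.COMPLETE_FINAL:
        return speech_complete_timeout_ms
    # Incomplete matches, and complete matches that could still be
    # extended, both use the (usually longer) incomplete timeout.
    return speech_incomplete_timeout_ms
```

With a complete timeout of 500 ms and an incomplete timeout of 1500 ms, only a final complete match is finalized after 500 ms of silence; the other two states wait 1500 ms.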

8.4.14. DTMF Interdigit Timeout

This header field specifies the inter-digit timeout value to use when recognizing DTMF input. The value is in milliseconds. The value for this field ranges from 0 to MAXTIMEOUT, where MAXTIMEOUT is platform specific. The default value is 5 seconds. This header field MAY occur in RECOGNIZE, SET-PARAMS, or GET-PARAMS.

     dtmf-interdigit-timeout = "DTMF-Interdigit-Timeout" ":" 1*DIGIT CRLF

8.4.15. DTMF Term Timeout

This header field specifies the terminating timeout to use when recognizing DTMF input. The value is in milliseconds. The value for this field ranges from 0 to MAXTIMEOUT, where MAXTIMEOUT is platform specific. The default value is 10 seconds. This header field MAY occur in RECOGNIZE, SET-PARAMS, or GET-PARAMS.

     dtmf-term-timeout   =    "DTMF-Term-Timeout" ":" 1*DIGIT CRLF
        
8.4.16. DTMF-Term-Char

This header field specifies the terminating DTMF character for DTMF input recognition. The default value is NULL, which is specified as an empty header field. This header field MAY occur in RECOGNIZE, SET-PARAMS, or GET-PARAMS.

     dtmf-term-char      =    "DTMF-Term-Char" ":" CHAR CRLF
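
The following Python sketch shows one plausible way the inter-digit timeout (Section 8.4.14) and the terminating character could shape DTMF collection. It is an illustration, not the specified algorithm: the function name, the (timestamp, digit) event shape, and the returned reason strings are all hypothetical, and the DTMF-Term-Timeout is deliberately not modeled because its exact semantics are not spelled out in this section.

```python
from typing import Iterable, Optional, Tuple


def collect_dtmf(
    events: Iterable[Tuple[int, str]],
    interdigit_timeout_ms: int = 5000,
    term_char: Optional[str] = None,
) -> Tuple[str, str]:
    """Collect digits from (timestamp_ms, digit) events.

    Returns (digits, reason), where reason is one of the hypothetical
    labels 'term-char', 'interdigit-timeout', or 'end-of-input'.
    """
    digits = []
    last_ts = None
    for ts, digit in events:
        # A gap longer than the inter-digit timeout ends the collection.
        if last_ts is not None and ts - last_ts > interdigit_timeout_ms:
            return "".join(digits), "interdigit-timeout"
        # An empty or missing term_char models the NULL default.
        if term_char and digit == term_char:
            return "".join(digits), "term-char"
        digits.append(digit)
        last_ts = ts
    return "".join(digits), "end-of-input"
```

The 5000 ms default mirrors the 5-second default stated for DTMF-Interdigit-Timeout.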

8.4.17. Fetch Timeout

When the recognizer needs to fetch grammar documents, this header field controls URI access properties. This defines the recognizer timeout for completing the fetch of the resources the media server needs from the network. The value is in milliseconds. The value for this field ranges from 0 to MAXTIMEOUT, where MAXTIMEOUT is platform specific. The default value for this field is platform specific. This header field MAY occur in RECOGNIZE, SET-PARAMS, or GET-PARAMS.

8.4.18. Failed URI

When a recognizer method needs a recognizer to fetch or access a URI, and the access fails, the media server SHOULD provide the failed URI in this header field in the method response.

8.4.19. Failed URI Cause

When a recognizer method needs a recognizer to fetch or access a URI, and the access fails, the media server SHOULD provide the URI-specific or protocol-specific response code through this header field in the method response. This field has been defined as alphanumeric to accommodate all protocols, some of which might have a response string instead of a numeric response code.

8.4.20. Save Waveform

This header field allows the client to indicate to the recognizer that it MUST save the audio stream that was recognized. The recognizer MUST then record the recognized audio and make it available to the client in the form of a URI returned in the Waveform-URL header field (Section 8.4.7) in the RECOGNITION-COMPLETE event. If there was an error in recording the stream or the audio clip is otherwise not available, the recognizer MUST return an empty Waveform-URL header field. The default value for this field is "false".

     save-waveform       =    "Save-Waveform" ":" boolean-value CRLF

8.4.21. New Audio Channel

This header field MAY be specified in a RECOGNIZE message and allows the client to tell the media server that, from that point on, it will be sending audio data from a new audio source, channel, or speaker. If the recognition resource had collected any line statistics or information, it MUST discard them and start fresh for this RECOGNIZE. This helps in the case where the client wants to reuse an open recognition session with the media server for multiple telephone calls.

     new-audio-channel   =    "New-Audio-Channel" ":" boolean-value CRLF

8.4.22. Speech Language

This header field specifies the language of recognition grammar data within a session or request, if it is not specified within the data. The value of this header field should follow RFC 3066 [16]. This header field MAY occur in DEFINE-GRAMMAR, RECOGNIZE, SET-PARAMS, or GET-PARAMS requests.

8.5. Recognizer Message Body

A recognizer message may carry additional data associated with the method, response, or event. The client may send the grammar to be recognized in DEFINE-GRAMMAR or RECOGNIZE requests. When the grammar is sent in the DEFINE-GRAMMAR method, the server should be able to download, compile, and optimize the grammar. The RECOGNIZE request MUST contain a list of grammars that need to be active during the recognition. The server resource may send the recognition results in the RECOGNITION-COMPLETE event or the GET-RESULT response. This data will be carried in the message body of the corresponding MRCP message.

8.5.1. Recognizer Grammar Data

Recognizer grammar data from the client to the server can be provided inline or by reference. Either way, it is carried as MIME entities in the message body of the MRCP request message. The grammar, whether inline or referenced, defines what the recognition process matches against, and it is written in one of the standard grammar specification formats, such as W3C's XML or ABNF forms or Sun's Java Speech Grammar Format. All media servers MUST support W3C's XML-based grammar markup format [11] (MIME-type application/grammar+xml) and SHOULD support the ABNF form (MIME-type application/grammar).

When a grammar is specified in-line in the message, the client MUST provide a content-id for that grammar as part of the content headers. The server MUST store the grammar associated with that content-id for the duration of the session. A stored grammar can be overwritten by defining a new grammar with the same content-id. Grammars that have been associated with a content-id can be referenced through a special "session:" URI scheme.

Example: session:help@root-level.store
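
The storage rule for inline grammars can be sketched as a small per-session store in Python. The class and method names are hypothetical, and the sketch assumes that the part of a "session:" URI after the scheme is exactly the content-id, as the example above suggests.

```python
class GrammarStore:
    """Hypothetical per-session store of inline grammars keyed by content-id."""

    def __init__(self) -> None:
        self._grammars = {}

    def define(self, content_id: str, grammar_body: str) -> None:
        # Defining a grammar under an existing content-id overwrites it.
        self._grammars[content_id] = grammar_body

    def resolve(self, uri: str) -> str:
        """Return the grammar referenced by a 'session:' URI.

        Raises ValueError for other schemes and KeyError if no grammar
        was stored under that content-id in this session.
        """
        scheme, sep, content_id = uri.partition(":")
        if sep != ":" or scheme != "session":
            raise ValueError(f"not a session: URI: {uri!r}")
        return self._grammars[content_id]
```

A store like this would be discarded when the session ends, since the server is only required to keep the grammar for the duration of the session.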

If grammar data needs to be specified by external URI reference, the MIME-type text/uri-list is used to list the one or more URIs that specify the grammar data. All media servers MUST support the HTTP URI access mechanism.

If the data to be defined consists of a mix of URI and inline grammar data, the multipart/mixed MIME-type is used and embedded with the MIME-blocks for text/uri-list, application/grammar or application/grammar+xml. The character set and encoding used in the grammar data may be specified according to standard MIME-type definitions.

When more than one grammar URI or inline grammar block is specified in the message body of a RECOGNIZE request, together they form an active list of grammar alternatives to listen for. The ordering of the list implies the precedence of the grammars, with the first grammar in the list having the highest precedence.

   Example 1:
       Content-Type:application/grammar+xml
       Content-Id:request1@form-level.store
       Content-Length:104

       <?xml version="1.0"?>

       <!-- the default grammar language is US English -->
       <grammar xml:lang="en-US" version="1.0">

          <!-- single language attachment to tokens -->
          <rule id="yes">
                  <one-of>
                      <item xml:lang="fr-CA">oui</item>
                      <item xml:lang="en-US">yes</item>
                  </one-of>
          </rule>

          <!-- single language attachment to a rule expansion -->
          <rule id="request">
                  may I speak to
                  <one-of xml:lang="fr-CA">
                      <item>Michel Tremblay</item>
                      <item>Andre Roy</item>
                  </one-of>
          </rule>

          <!-- multiple language attachment to a token -->
          <rule id="people1">
                  <token lexicon="en-US,fr-CA"> Robert </token>
          </rule>

          <!-- the equivalent single-language attachment expansion -->
          <rule id="people2">
                  <one-of>
                      <item xml:lang="en-US">Robert</item>
                      <item xml:lang="fr-CA">Robert</item>
                  </one-of>
          </rule>

       </grammar>
        

   Example 2:
      Content-Type:text/uri-list
      Content-Length:176

      session:help@root-level.store
      http://www.cisco.com/Directory-Name-List.grxml
      http://www.cisco.com/Department-List.grxml
      http://www.cisco.com/TAC-Contact-List.grxml
      session:menu1@menu-level.store
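
A text/uri-list body such as the one in Example 2 can be turned into an ordered grammar list with a few lines of Python. The function name is hypothetical. Skipping lines that begin with "#" follows the comment convention of the text/uri-list media type and is not something this memo states. List position 0 is the grammar with the highest precedence.

```python
from typing import List


def parse_uri_list(body: str) -> List[str]:
    """Return the URIs of a text/uri-list body in precedence order.

    Blank lines and '#' comment lines are skipped; surrounding
    whitespace on each line is stripped.
    """
    uris = []
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        uris.append(line)
    return uris
```

Applied to the body of Example 2, the first element is session:help@root-level.store, which therefore takes precedence over the three HTTP grammars and the session:menu1@menu-level.store grammar.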
        
   Example 3:
      Content-Type:multipart/mixed; boundary="--break"

      --break
      Content-Type:text/uri-list
      Content-Length:176
      http://www.cisco.com/Directory-Name-List.grxml
      http://www.cisco.com/Department-List.grxml
      http://www.cisco.com/TAC-Contact-List.grxml

      --break
      Content-Type:application/grammar+xml
      Content-Id:request1@form-level.store
      Content-Length:104

      <?xml version="1.0"?>

      <!-- the default grammar language is US English -->
      <grammar xml:lang="en-US" version="1.0">

         <!-- single language attachment to tokens -->
         <rule id="yes">
                  <one-of>
                      <item xml:lang="fr-CA">oui</item>
                      <item xml:lang="en-US">yes</item>
                  </one-of>
         </rule>

         <!-- single language attachment to a rule expansion -->
         <rule id="request">
                  may I speak to
                  <one-of xml:lang="fr-CA">
                      <item>Michel Tremblay</item>
                      <item>Andre Roy</item>
                  </one-of>
         </rule>

         <!-- multiple language attachment to a token -->
         <rule id="people1">
                  <token lexicon="en-US,fr-CA"> Robert </token>
         </rule>

         <!-- the equivalent single-language attachment expansion -->
         <rule id="people2">
                  <one-of>
                      <item xml:lang="en-US">Robert</item>
                      <item xml:lang="fr-CA">Robert</item>
                  </one-of>
         </rule>

      </grammar>
      --break

8.5.2. Recognizer Result Data

Recognition result data from the server is carried in the MRCP message body of the RECOGNITION-COMPLETE event or the GET-RESULT response message as MIME entities. All media servers MUST support W3C's Natural Language Semantics Markup Language (NLSML) [10] as the default standard for returning recognition results back to the client, and hence MUST support the MIME-type application/x-nlsml.

   Example 1:
      Content-Type:application/x-nlsml
      Content-Length:104

      <?xml version="1.0"?>
      <result grammar="http://theYesNoGrammar">
          <interpretation>
              <instance>
                  <myApp:yes_no>
                      <response>yes</response>
                  </myApp:yes_no>
              </instance>
              <input>ok</input>
          </interpretation>
      </result>
        
8.5.3. Recognizer Context Block

The recognizer context block is a block of data that the client MAY collect from the first media server and provide to the second media server when it has to change recognition servers within a call. The change may be needed because the client requires different language support or because the media server issued an RTSP RE-DIRECT. The first recognizer may have collected acoustic and other data during its recognition. When the client switches recognition servers, communicating this data may allow the second recognition server to provide better recognition based on the acoustic data collected by the previous recognizer. This block of data is vendor-specific and MUST be carried as MIME-type application/octets in the body of the message.

This block of data is communicated in the SET-PARAMS and GET-PARAMS method/response messages. In the GET-PARAMS method, if an empty recognizer-context-block header field is present, then the recognizer should return its vendor-specific context block in the message body as a MIME-entity with a specific content-id. The content-id value should also be specified in the recognizer-context-block header field in the GET-PARAMS response. A SET-PARAMS request that provides this vendor-specific data should send it in the message body as a MIME-entity with the same content-id that the client received in the GET-PARAMS response. The content-id should also be sent in the recognizer-context-block header field of the SET-PARAMS message.

Each automatic speech recognition (ASR) vendor that chooses to use this mechanism to hand off recognizer context data among its servers should distinguish its vendor-specific block of data from those of other vendors by choosing a unique content-id that its servers recognize.
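
The handoff described above can be sketched from the client side as follows. This is only an illustration under stated assumptions: the helper names, the CRLF line endings, and the single-part body with a Content-Id header are hypothetical choices, and the RTSP tunneling of the messages is left out. The request-line and header layout follows the examples in this document.

```python
def format_request(method, request_id, headers, body=b""):
    """Format an MRCP/1.0 request in the style of this document's examples.

    CRLF line endings are an assumption of this sketch.
    """
    lines = [f"{method} {request_id} MRCP/1.0"]
    lines += [f"{name}:{value}" for name, value in headers]
    if body:
        lines.append(f"Content-Length:{len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def request_context_block(request_id):
    # An empty Recognizer-Context-Block header asks the first server
    # to return its vendor-specific block in the response body.
    return format_request("GET-PARAMS", request_id,
                          [("Recognizer-Context-Block", "")])


def provide_context_block(request_id, content_id, block):
    # Pass the opaque block to the second server unchanged, using the
    # same content-id that came back in the GET-PARAMS response.
    headers = [
        ("Recognizer-Context-Block", content_id),
        ("Content-Type", "application/octets"),
        ("Content-Id", content_id),
    ]
    return format_request("SET-PARAMS", request_id, headers, block)
```

The client never interprets the block; it only copies the bytes and the content-id from one server to the other.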

8.6. SET-PARAMS

The SET-PARAMS method, from the client to the server, tells the recognizer resource to set and modify recognizer context parameters such as recognizer characteristics, result detail level, etc. In the following sections some standard parameters are discussed. If the server resource does not recognize an OPTIONAL parameter, it MUST ignore that field. Many of the parameters in the SET-PARAMS method can also be used in another method, such as the RECOGNIZE method. The difference is one of scope: when a parameter such as sensitivity-level is set using SET-PARAMS, it applies to all future requests, whenever applicable. When sensitivity-level is passed in a RECOGNIZE request, it applies only to that request.

   Example:
     C->S:SET-PARAMS 543256 MRCP/1.0
          Sensitivity-Level:20
          Recognition-Timeout:30
          Confidence-Threshold:85

     S->C:MRCP/1.0 543256 200 COMPLETE
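
The difference in scope between SET-PARAMS and per-request headers can be illustrated with a small sketch. The class and method names are hypothetical, and header values are kept as plain strings; the only point shown is that SET-PARAMS values persist while RECOGNIZE values apply to one request.

```python
from collections import ChainMap


class RecognizerParams:
    """Tracks vendor defaults, SET-PARAMS values, and per-request values."""

    def __init__(self, vendor_defaults):
        self._defaults = dict(vendor_defaults)
        self._session = {}

    def set_params(self, headers):
        # SET-PARAMS: values persist for all future requests.
        self._session.update(headers)

    def effective_for(self, recognize_headers):
        # RECOGNIZE: per-request values take precedence but are not stored.
        return dict(ChainMap(recognize_headers, self._session, self._defaults))
```

A later RECOGNIZE without the header falls back to the value set by SET-PARAMS, and to the vendor default if none was set.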
        
8.7. GET-PARAMS

The GET-PARAMS method, from the client to the server, asks the recognizer resource for its current default parameters, such as sensitivity-level, n-best-list-length, etc. The client can request specific parameters from the server by sending it one or more empty parameter headers with no values. The server should then return the settings for those specific parameters only. When the client does not send a specific list of empty parameter headers, the recognizer should return the settings for all parameters. This wildcard use can be expensive, because the number of settable parameters can be large depending on the vendor. Hence, it is RECOMMENDED that the client not use the wildcard GET-PARAMS operation very often.

   Example:
     C->S:GET-PARAMS 543256 MRCP/1.0
          Sensitivity-Level:
          Recognition-Timeout:
          Confidence-Threshold:

     S->C:MRCP/1.0 543256 200 COMPLETE
          Sensitivity-Level:20
          Recognition-Timeout:30
          Confidence-Threshold:85
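
A server-side selection rule for GET-PARAMS might look like the sketch below. The function name is hypothetical, and two behaviors are assumptions of this sketch rather than requirements stated here: header names are matched case-insensitively, and unknown header names are silently skipped.

```python
def get_params_response(current_settings, requested_headers):
    """Return the settings a GET-PARAMS response would carry."""
    if not requested_headers:
        # Wildcard: no empty headers were sent, so return everything.
        return dict(current_settings)
    by_lower = {name.lower(): (name, value)
                for name, value in current_settings.items()}
    result = {}
    for header in requested_headers:
        match = by_lower.get(header.lower())
        if match is not None:
            result[match[0]] = match[1]
    return result
```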
        
8.8. DEFINE-GRAMMAR

The DEFINE-GRAMMAR method, from the client to the server, provides a grammar and tells the server to define, download if needed, and compile the grammar.

If the server resource is in the recognition state, the server MUST respond to the DEFINE-GRAMMAR request with a failure status.

If the resource is in the idle state and is able to successfully load and compile the grammar, the response MUST carry a success status code and a request-state of COMPLETE.

If the recognizer could not define the grammar for some reason, for example because the download failed, the grammar failed to compile, or the grammar was in an unsupported form, the MRCP response for the DEFINE-GRAMMAR method MUST contain a failure status code of 407 and a completion-cause header field describing the failure reason.
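
The three response rules above can be condensed into a short decision sketch. The names `GrammarError` and `define_grammar_response` are hypothetical. The text only requires "a failure status" in the recognition state, so the 402 used here is an assumption; 407, 200, COMPLETE, and "000 success" are taken from this section.

```python
class GrammarError(Exception):
    """Raised by the loader; `cause` holds the Completion-Cause text."""

    def __init__(self, cause):
        super().__init__(cause)
        self.cause = cause


def define_grammar_response(resource_state, load_grammar):
    """Return (status_code, request_state, completion_cause)."""
    if resource_state == "recognizing":
        # "Failure status" per the text; the exact code 402 is assumed.
        return 402, "COMPLETE", None
    try:
        load_grammar()  # define, download if needed, and compile
    except GrammarError as err:
        return 407, "COMPLETE", err.cause
    return 200, "COMPLETE", "000 success"
```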

   Example:
     C->S:DEFINE-GRAMMAR 543257 MRCP/1.0
          Content-Type:application/grammar+xml
          Content-Id:request1@form-level.store
          Content-Length:104

          <?xml version="1.0"?>

          <!-- the default grammar language is US English -->
          <grammar xml:lang="en-US" version="1.0">

          <!-- single language attachment to tokens -->
          <rule id="yes">
              <one-of>
                  <item xml:lang="fr-CA">oui</item>
                  <item xml:lang="en-US">yes</item>
              </one-of>
          </rule>

          <!-- single language attachment to a rule expansion -->
          <rule id="request">
              may I speak to
              <one-of xml:lang="fr-CA">
                  <item>Michel Tremblay</item>
                  <item>Andre Roy</item>
              </one-of>
          </rule>

          </grammar>

     S->C:MRCP/1.0 543257 200 COMPLETE
          Completion-Cause:000 success
        
     C->S:DEFINE-GRAMMAR 543258 MRCP/1.0
          Content-Type:application/grammar+xml
          Content-Id:helpgrammar@root-level.store
          Content-Length:104

          <?xml version="1.0"?>

          <!-- the default grammar language is US English -->
          <grammar xml:lang="en-US" version="1.0">

          <rule id="request">
              I need help
          </rule>

          </grammar>

     S->C:MRCP/1.0 543258 200 COMPLETE
          Completion-Cause:000 success
        
     C->S:DEFINE-GRAMMAR 543259 MRCP/1.0
          Content-Type:application/grammar+xml
          Content-Id:request2@field-level.store
          Content-Length:104

          <?xml version="1.0"?>

               <grammar xml:lang="en">

               <import uri="session:politeness@form-level.store"
                       name="polite"/>

               <rule id="basicCmd" scope="public">
               <example> please move the window </example>
               <example> open a file </example>

               <ruleref import="polite#startPolite"/>
               <ruleref uri="#command"/>
               <ruleref import="polite#endPolite"/>
               </rule>

               <rule id="command">
               <ruleref uri="#action"/> <ruleref uri="#object"/>
               </rule>

               <rule id="action">
                    <choice>
                    <item weight="10" tag="OPEN">   open </item>
                    <item weight="2"  tag="CLOSE">  close </item>
                    <item weight="1"  tag="DELETE"> delete </item>
                    <item weight="1"  tag="MOVE">   move </item>
                    </choice>
               </rule>

               <rule id="object">
               <count number="optional">
                    <choice>
                         <item> the </item>
                         <item> a </item>
                    </choice>
               </count>
               <choice>
                    <item> window </item>
                    <item> file </item>
                    <item> menu </item>
               </choice>

               </rule>

               </grammar>

     S->C:MRCP/1.0 543259 200 COMPLETE
          Completion-Cause:000 success
        
     C->S:RECOGNIZE 543260 MRCP/1.0
          N-Best-List-Length:2
          Content-Type:text/uri-list
          Content-Length:176

          session:request1@form-level.store
          session:request2@field-level.store
          session:helpgrammar@root-level.store

     S->C:MRCP/1.0 543260 200 IN-PROGRESS

     S->C:START-OF-SPEECH 543260 IN-PROGRESS MRCP/1.0

     S->C:RECOGNITION-COMPLETE 543260 COMPLETE MRCP/1.0
          Completion-Cause:000 success
          Waveform-URL:http://web.media.com/session123/audio.wav
          Content-Type:application/x-nlsml
          Content-Length:276

          <?xml version="1.0"?>
          <result x-model="http://IdentityModel"
            xmlns:xf="http://www.w3.org/2000/xforms"
            grammar="session:request1@form-level.store">
               <interpretation>
                    <xf:instance name="Person">
                      <Person>
                          <Name> Andre Roy </Name>
                      </Person>
                    </xf:instance>
                    <input>   may I speak to Andre Roy </input>
               </interpretation>
          </result>
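
As a rough illustration of the final step in this exchange, a client could build the RECOGNIZE request from the Content-Id values it used in its earlier DEFINE-GRAMMAR requests. The function name and the CRLF line endings are assumptions of this sketch; unlike the placeholder values in the example, the Content-Length here is computed from the actual body.

```python
def recognize_with_stored_grammars(request_id, content_ids, n_best=None):
    """Build a RECOGNIZE request that references previously defined grammars."""
    # Each stored grammar is referenced as "session:" + its Content-Id.
    body = "".join(f"session:{cid}\r\n" for cid in content_ids)
    lines = [f"RECOGNIZE {request_id} MRCP/1.0"]
    if n_best is not None:
        lines.append(f"N-Best-List-Length:{n_best}")
    lines.append("Content-Type:text/uri-list")
    lines.append(f"Content-Length:{len(body.encode('ascii'))}")
    return "\r\n".join(lines) + "\r\n\r\n" + body
```

A "session:" URI only resolves if it matches the Content-Id of an earlier DEFINE-GRAMMAR exactly, so generating both from the same list avoids spelling mismatches.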
        
8.9. RECOGNIZE

The RECOGNIZE method, from the client to the server, tells the recognizer to start recognition and provides it with a grammar to match against. The RECOGNIZE method can carry parameters to control the sensitivity, confidence level, and the level of detail in results provided by the recognizer. These parameters override the current defaults set by a previous SET-PARAMS method.

If the resource is in the recognition state, the server MUST respond to the RECOGNIZE request with a failure status.

If the resource is in the idle state and was able to successfully start the recognition, the server MUST return a success code and a request-state of IN-PROGRESS. This means that the recognizer is active and that the client should expect further events with this request-id.

If the resource could not start a recognition, the server MUST return a failure status code of 407, and the response MUST contain a completion-cause header field describing the cause of failure.

For the recognizer resource, this is the only request that can return a request-state of IN-PROGRESS, meaning that recognition is in progress. When the recognition completes, whether by matching one of the grammar alternatives, by a time-out without a match, or for some other reason, the recognizer resource MUST send the client a RECOGNITION-COMPLETE event with the result of the recognition and a request-state of COMPLETE.

For large grammars that can take a long time to compile and for grammars that are used repeatedly, the client could issue a DEFINE-GRAMMAR request with the grammar ahead of time. In such a case, the client can issue the RECOGNIZE request and reference the grammar through the "session:" special URI. This also applies in general if the client wants to restart recognition with a previous inline grammar.

Note that since the audio and the messages are carried over separate communication paths, there may be a race condition between the start of the flow of audio and the receipt of the RECOGNIZE method. For example, if audio flow is started by the client at the same time as the RECOGNIZE method is sent, either the audio or the RECOGNIZE may arrive at the recognizer first. As another example, the client may choose to continuously send audio to the media server and signal the media server to recognize using the RECOGNIZE method. A number of mechanisms exist to resolve this condition, and the mechanism chosen is left to the implementers of recognizer media servers.

   Example:
     C->S:RECOGNIZE 543257 MRCP/1.0
          Confidence-Threshold:90
          Content-Type:application/grammar+xml
          Content-Id:request1@form-level.store
          Content-Length:104

          <?xml version="1.0"?>

          <!-- the default grammar language is US English -->
          <grammar xml:lang="en-US" version="1.0">

          <!-- single language attachment to tokens -->
          <rule id="yes">
              <one-of>
                  <item xml:lang="fr-CA">oui</item>
                  <item xml:lang="en-US">yes</item>
              </one-of>
          </rule>

          <!-- single language attachment to a rule expansion -->
          <rule id="request">
              may I speak to
              <one-of xml:lang="fr-CA">
                  <item>Michel Tremblay</item>
                  <item>Andre Roy</item>
              </one-of>
          </rule>

          </grammar>
        
     S->C:MRCP/1.0 543257 200 IN-PROGRESS

     S->C:START-OF-SPEECH 543257 IN-PROGRESS MRCP/1.0

     S->C:RECOGNITION-COMPLETE 543257 COMPLETE MRCP/1.0
          Completion-Cause:000 success
          Waveform-URL:http://web.media.com/session123/audio.wav
          Content-Type:application/x-nlsml
          Content-Length:276

          <?xml version="1.0"?>
          <result x-model="http://IdentityModel"
            xmlns:xf="http://www.w3.org/2000/xforms"
            grammar="session:request1@form-level.store">
              <interpretation>
                  <xf:instance name="Person">
                      <Person>
                          <Name> Andre Roy </Name>
                      </Person>
                  </xf:instance>
                  <input>   may I speak to Andre Roy </input>
              </interpretation>
          </result>
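
A client receiving a RECOGNITION-COMPLETE body like the one above might extract the matched grammar and the recognized input as sketched here. The function name is hypothetical, and the sketch assumes the body is well-formed XML with every namespace prefix declared, as it is in this example; it uses only Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET


def summarize_nlsml(body):
    """Return (grammar, [input text per interpretation]) from an NLSML body."""
    root = ET.fromstring(body)
    # Collapse runs of whitespace inside each <input> element.
    inputs = [" ".join(node.text.split())
              for node in root.iter("input") if node.text]
    return root.get("grammar"), inputs
```

The `grammar` attribute tells the client which of the active grammars produced the match, which matters when several "session:" grammars were listed in the RECOGNIZE request.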
        
8.10. STOP

The STOP method, from the client to the server, tells the resource to stop recognition if one is active. If a RECOGNIZE request is active and the STOP request successfully terminated it, then the response header contains an active-request-id-list header field containing the request-id of the RECOGNIZE request that was terminated. In this case, no RECOGNITION-COMPLETE event will be sent for the terminated request. If there was no recognition active, then the response MUST NOT contain an active-request-id-list header field. Either way, the response MUST contain a status of 200 (Success).

   Example:
     C->S:RECOGNIZE 543257 MRCP/1.0
          Confidence-Threshold:90
          Content-Type:application/grammar+xml
          Content-Id:request1@form-level.store
          Content-Length:104

          <?xml version="1.0"?>

          <!-- the default grammar language is US English -->
          <grammar xml:lang="en-US" version="1.0">

          <!-- single language attachment to tokens -->
          <rule id="yes">
              <one-of>
                  <item xml:lang="fr-CA">oui</item>
                  <item xml:lang="en-US">yes</item>
              </one-of>
          </rule>

          <!-- single language attachment to a rule expansion -->
          <rule id="request">
              may I speak to
              <one-of xml:lang="fr-CA">
                  <item>Michel Tremblay</item>
                  <item>Andre Roy</item>
              </one-of>
          </rule>

          </grammar>
        
     S->C:MRCP/1.0 543257 200 IN-PROGRESS

     C->S:STOP 543258 MRCP/1.0

     S->C:MRCP/1.0 543258 200 COMPLETE
          Active-Request-Id-List:543257
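
The STOP rules can be sketched as a small server-side handler. The class and method names are hypothetical, and the sketch tracks only a single active RECOGNIZE request; the response lines follow the layout of the example above.

```python
class Recognizer:
    """Minimal model of the STOP behavior for a recognizer resource."""

    def __init__(self):
        self.active_request_id = None

    def recognize(self, request_id):
        self.active_request_id = request_id
        return f"MRCP/1.0 {request_id} 200 IN-PROGRESS"

    def stop(self, request_id):
        # The status is 200 whether or not a recognition was active.
        lines = [f"MRCP/1.0 {request_id} 200 COMPLETE"]
        if self.active_request_id is not None:
            # Report the terminated request; no RECOGNITION-COMPLETE
            # event is sent for it afterwards.
            lines.append(f"Active-Request-Id-List:{self.active_request_id}")
            self.active_request_id = None
        return lines
```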
        
8.11. GET-RESULT

The GET-RESULT method from the client to the server can be issued when the recognizer is in the recognized state. This request allows the client to retrieve results for a completed recognition. This is useful if the client decides it wants more alternatives or more information. When the media server receives this request, it should re-compute and return the results according to the recognition constraints provided in the GET-RESULT request.

The GET-RESULT request can specify constraints such as a different confidence-threshold or n-best-list-length. This feature is optional, and the automatic speech recognition (ASR) engine may return a status indicating an unsupported feature.

   Example:
     C->S:GET-RESULT 543257 MRCP/1.0
          Confidence-Threshold:90

     S->C:MRCP/1.0 543257 200 COMPLETE
          Content-Type:application/x-nlsml
          Content-Length:276

          <?xml version="1.0"?>
          <result x-model="http://IdentityModel"
            xmlns:xf="http://www.w3.org/2000/xforms"
            grammar="session:request1@form-level.store">
              <interpretation>
                  <xf:instance name="Person">
                      <Person>
                          <Name> Andre Roy </Name>
                      </Person>
                  </xf:instance>
                  <input>   may I speak to Andre Roy </input>
              </interpretation>
          </result>
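
One way a server could re-compute results for GET-RESULT is to keep the scored hypotheses from the completed recognition and filter them again. This is a sketch under assumptions: the function name, the (text, confidence) representation, the 0 to 100 confidence scale, and the default list length of 1 are all illustrative rather than taken from this section.

```python
def get_result(hypotheses, confidence_threshold=0, n_best_list_length=1):
    """Re-filter stored hypotheses; each is a (text, confidence) pair."""
    kept = [h for h in hypotheses if h[1] >= confidence_threshold]
    # Best-scoring hypotheses first, then truncate to the requested length.
    kept.sort(key=lambda h: h[1], reverse=True)
    return kept[:n_best_list_length]
```

Because the recognizer context holds the results until the next RECOGNIZE request, the client can call GET-RESULT several times with different constraints without sending new audio.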
        
8.12. START-OF-SPEECH

This is an event from the recognizer to the client indicating that it has detected speech. This event is useful in implementing kill-on-barge-in scenarios when the synthesizer resource is in a different session than the recognizer resource and, hence, is not aware of an incoming audio source. In these cases, it is up to the client to act as a proxy and issue the BARGE-IN-OCCURRED method to the synthesizer resource. The recognizer resource also sends a unique proxy-sync-id in the header for this event, which the client passes on to the synthesizer in the BARGE-IN-OCCURRED method.

This event should be generated regardless of whether the synthesizer and recognizer are in the same media server.
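
The proxy role of the client can be sketched as a function that turns a received START-OF-SPEECH event into the request relayed to the synthesizer. The function name, the dictionary representation of the event headers, and the CRLF line endings are assumptions of this sketch.

```python
def barge_in_for(start_of_speech_headers, synth_request_id):
    """Build the BARGE-IN-OCCURRED request relayed to the synthesizer."""
    # Copy the recognizer's proxy-sync-id through unchanged.
    sync_id = start_of_speech_headers["Proxy-Sync-Id"]
    return (f"BARGE-IN-OCCURRED {synth_request_id} MRCP/1.0\r\n"
            f"Proxy-Sync-Id:{sync_id}\r\n\r\n")
```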

8.13. RECOGNITION-START-TIMERS

This request is sent from the client to the recognition resource when it knows that a kill-on-barge-in prompt has finished playing. This is useful in the scenario when the recognition and synthesizer engines are not in the same session. Here, when a kill-on-barge-in prompt is being played, you want the RECOGNIZE request to be simultaneously active so that it can detect and implement kill-on-barge-in. But at the same time, you don't want the recognizer to start the no-input timers until the prompt is finished. The parameter recognizer-start-timers header field in the RECOGNIZE request will allow the client to say if the timers should be started or not. The recognizer should not start the timers until the client sends a RECOGNITION-START-TIMERS method to the recognizer.

8.14. RECOGNITION-COMPLETE

This is an event from the recognizer resource to the client indicating that the recognition completed. The recognition result is sent in the MRCP body of the message. The request-state field MUST be COMPLETE, indicating that this is the last event with that request-id and that the request with that request-id is now complete. The recognizer context still holds the results and the audio waveform input of that recognition until the next RECOGNIZE request is issued. A URL to the audio waveform MAY be returned to the client in a waveform-url header field in the RECOGNITION-COMPLETE event. The client can use this URL to retrieve or play back the audio.

   Example:
     C->S:RECOGNIZE 543257 MRCP/1.0
          Confidence-Threshold:90
          Content-Type:application/grammar+xml
          Content-Id:request1@form-level.store
          Content-Length:104
        
          <?xml version="1.0"?>
        
          <!-- the default grammar language is US English -->
          <grammar xml:lang="en-US" version="1.0">
        
          <!-- single language attachment to tokens -->
          <rule id="yes">
                   <one-of>
                            <item xml:lang="fr-CA">oui</item>
                            <item xml:lang="en-US">yes</item>
                   </one-of>
               </rule>
        
          <!-- single language attachment to a rule expansion -->
               <rule id="request">
                   may I speak to
                   <one-of xml:lang="fr-CA">
                            <item>Michel Tremblay</item>
                            <item>Andre Roy</item>
                   </one-of>
               </rule>
        
          </grammar>
        
     S->C:MRCP/1.0 543257 200 IN-PROGRESS
        
     S->C:START-OF-SPEECH 543257 IN-PROGRESS MRCP/1.0
        
     S->C:RECOGNITION-COMPLETE 543257 COMPLETE MRCP/1.0
          Completion-Cause:000 success
          Waveform-URL:http://web.media.com/session123/audio.wav
          Content-Type:application/x-nlsml
          Content-Length:276
        
          <?xml version="1.0"?>
          <result x-model="http://IdentityModel"
            xmlns:xf="http://www.w3.org/2000/xforms"
            grammar="session:request1@form-level.store">
              <interpretation>
                  <xf:instance name="Person">
                      <Person>
                          <Name> Andre Roy </Name>
                      </Person>
                  </xf:instance>
                            <input>   may I speak to Andre Roy </input>
              </interpretation>
          </result>
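
A client might pull the fields it needs out of a RECOGNITION-COMPLETE event roughly as in the following Python sketch. The function name `parse_recognition_complete` and the returned dictionary keys are hypothetical, the Content-Length is computed from the sample body rather than copied from the example, and the NLSML handling covers only the structure shown above.

```python
import xml.etree.ElementTree as ET

CRLF = "\r\n"

# Sample body, condensed from the NLSML result in the example above.
NLSML = (
    '<?xml version="1.0"?>'
    '<result x-model="http://IdentityModel"'
    ' xmlns:xf="http://www.w3.org/2000/xforms"'
    ' grammar="session:request1@form-level.store">'
    "<interpretation>"
    '<xf:instance name="Person">'
    "<Person><Name> Andre Roy </Name></Person>"
    "</xf:instance>"
    "<input>   may I speak to Andre Roy </input>"
    "</interpretation>"
    "</result>"
)

EVENT = CRLF.join([
    "RECOGNITION-COMPLETE 543257 COMPLETE MRCP/1.0",
    "Completion-Cause:000 success",
    "Waveform-URL:http://web.media.com/session123/audio.wav",
    "Content-Type:application/x-nlsml",
    f"Content-Length:{len(NLSML)}",
    "",
    NLSML,
])


def parse_recognition_complete(raw: str) -> dict:
    """Extract request state, headers of interest, and the spoken input."""
    head, _, body = raw.partition(CRLF + CRLF)
    lines = head.split(CRLF)
    _name, request_id, state, _version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; URL values contain colons too.
        key, _, value = line.partition(":")
        headers[key.lower()] = value.strip()
    root = ET.fromstring(body)
    spoken = root.findtext("interpretation/input", default="").strip()
    return {
        "request_id": request_id,
        "state": state,
        "cause": headers.get("completion-cause"),
        "waveform_url": headers.get("waveform-url"),
        "input": spoken,
    }


result = parse_recognition_complete(EVENT)
```

Because the request-state is COMPLETE, a client would treat `result` as the final outcome for request 543257.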
        
8.15. DTMF Detection

Digits received as DTMF tones will be delivered to the automatic speech recognition (ASR) engine in the RTP stream according to RFC 2833 [15]. The automatic speech recognizer (ASR) needs to support RFC 2833 [15] to recognize digits. If it does not support RFC 2833 [15], it will have to process the audio stream and extract the audio tones from it.
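
For illustration, the 4-byte telephone-event payload defined by RFC 2833 can be decoded as in the Python sketch below. The function name `decode_telephone_event` and the sample payload bytes are hypothetical; the field layout (8-bit event, end bit, reserved bit, 6-bit volume, 16-bit duration) and the mapping of events 0-15 to DTMF keys are taken from RFC 2833.

```python
import struct

# Event codes 0-15 map to the DTMF keys 0-9, *, #, A-D (RFC 2833).
DTMF_KEYS = "0123456789*#ABCD"


def decode_telephone_event(payload: bytes) -> dict:
    """Decode one RFC 2833 telephone-event payload."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return {
        "key": DTMF_KEYS[event] if event < len(DTMF_KEYS) else None,
        "end": bool(flags & 0x80),   # E bit: last packet of this event
        "volume": flags & 0x3F,      # power level, 6 bits
        "duration": duration,        # in RTP timestamp units
    }


# Hypothetical payload: key "5", end bit set, volume 10, duration 800.
decoded = decode_telephone_event(bytes([5, 0x8A, 0x03, 0x20]))
```

A recognizer that supports RFC 2833 could collect the `key` values of packets whose `end` flag is set to build the digit string.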

9. Future Study

Various sections of the recognizer could be distributed into Digital Signal Processors (DSPs) on the Voice Browser/Gateway or IP Phones. For instance, the gateway might perform voice activity detection to reduce network bandwidth and CPU requirement of the automatic speech recognition (ASR) server. Such extensions are deferred for further study and will not be addressed in this document.

10. Security Considerations

The MRCP protocol may carry sensitive information such as account numbers, passwords, etc. For this reason, it is important that the client have the option of secure communication with the server for both the control messages and the media, though the client is not required to use it. If all MRCP communication happens in a trusted domain behind a firewall, this may not be necessary. If the client or server is deployed in an insecure network, communication happening across this insecure network needs to be protected. In such cases, the following additional security functionality MUST be supported on the MRCP server. MRCP servers MUST implement Transport Layer Security (TLS) to secure the RTSP communication, i.e., the RTSP stack SHOULD support the rtsps: URI form. MRCP servers MUST support Secure Real-Time Transport Protocol (SRTP) as an option to send and receive media.
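
On the client side, choosing between a plain and a TLS-protected control connection based on the URI scheme could look like the Python sketch below. Everything here is an assumption made for illustration: the helper names, the TLS 1.2 minimum, and the `default_port` parameter (this document does not state a port for the rtsps: form, so the caller must supply one).

```python
import socket
import ssl
from urllib.parse import urlsplit


def needs_tls(url: str) -> bool:
    """True when the control channel uses the rtsps: URI form."""
    return urlsplit(url).scheme.lower() == "rtsps"


def make_tls_context() -> ssl.SSLContext:
    """Client-side context that verifies the server certificate."""
    context = ssl.create_default_context()
    # Assumption: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_control_connection(url: str, default_port: int) -> socket.socket:
    """Open the RTSP control connection, wrapped in TLS for rtsps: URLs."""
    parts = urlsplit(url)
    port = parts.port or default_port
    sock = socket.create_connection((parts.hostname, port))
    if needs_tls(url):
        return make_tls_context().wrap_socket(
            sock, server_hostname=parts.hostname)
    return sock
```

Protection of the media stream with SRTP is a separate concern and is not covered by this sketch.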

11. RTSP-Based Examples

The following is an example of a typical session of speech synthesis and recognition between a client and the server.

Opening the synthesizer. This is the first resource for this session. The server and client agree on a single Session ID 12345678 and set of RTP/RTCP ports on both sides.

     C->S:SETUP rtsp://media.server.com/media/synthesizer RTSP/1.0
          CSeq:2
          Transport:RTP/AVP;unicast;client_port=46456-46457
          Content-Type:application/sdp
          Content-Length:190

          v=0
          o=- 123 456 IN IP4 10.0.0.1
          s=Media Server
          p=+1-888-555-1212
          c=IN IP4 0.0.0.0
          t=0 0
          m=audio 0 RTP/AVP 0 96
          a=rtpmap:0 pcmu/8000
          a=rtpmap:96 telephone-event/8000
          a=fmtp:96 0-15
        
     S->C:RTSP/1.0 200 OK
          CSeq:2
          Transport:RTP/AVP;unicast;client_port=46456-46457;
                    server_port=46460-46461
          Session:12345678
          Content-Length:190
          Content-Type:application/sdp
        
          v=0
          o=- 3211724219 3211724219 IN IP4 10.3.2.88
          s=Media Server
          c=IN IP4 0.0.0.0
          t=0 0
          m=audio 46460 RTP/AVP 0 96
          a=rtpmap:0 pcmu/8000
          a=rtpmap:96 telephone-event/8000
          a=fmtp:96 0-15
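
Once the SETUP response arrives, the client needs the server's RTP/RTCP ports from the Transport header. A small Python sketch of that step follows; the helper names `parse_transport` and `port_pair` are hypothetical, and the parser handles only the parameter forms that appear in this example.

```python
def parse_transport(value: str) -> dict:
    """Split a Transport header value into its ';'-separated parameters."""
    params = {}
    for part in value.split(";"):
        part = part.strip()  # also absorbs whitespace from folded lines
        if not part:
            continue
        key, sep, val = part.partition("=")
        # Parameters without '=' (such as "unicast") are stored as flags.
        params[key] = val if sep else True
    return params


def port_pair(text: str):
    """Turn "46460-46461" into the (RTP, RTCP) port tuple."""
    low, high = text.split("-")
    return int(low), int(high)


transport = parse_transport(
    "RTP/AVP;unicast;client_port=46456-46457;server_port=46460-46461")
rtp_port, rtcp_port = port_pair(transport["server_port"])
```

The Session header value returned with the same response (12345678 here) is what later SETUP, ANNOUNCE, and TEARDOWN requests carry to reuse these ports.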
        
Opening a recognizer resource. Uses the existing session ID and ports.

     C->S:SETUP rtsp://media.server.com/media/recognizer RTSP/1.0
          CSeq:3
          Transport:RTP/AVP;unicast;client_port=46456-46457;
                     mode=record;ttl=127
          Session:12345678
        
     S->C:RTSP/1.0 200 OK
          CSeq:3
          Transport:RTP/AVP;unicast;client_port=46456-46457;
                     server_port=46460-46461;mode=record;ttl=127
          Session:12345678
        
An ANNOUNCE message with the MRCP SPEAK request initiates speech.

     C->S:ANNOUNCE rtsp://media.server.com/media/synthesizer RTSP/1.0
          CSeq:4
          Session:12345678
          Content-Type:application/mrcp
          Content-Length:456
        
          SPEAK 543257 MRCP/1.0
          Kill-On-Barge-In:false
          Voice-gender:neutral
          Voice-category:teenager
          Prosody-volume:medium
          Content-Type:application/synthesis+ssml
          Content-Length:104
        
          <?xml version="1.0"?>
          <speak>
          <paragraph>
                   <sentence>You have 4 new messages.</sentence>
                   <sentence>The first is from <say-as
                   type="name">Stephanie Williams</say-as> <mark
          name="Stephanie"/>
                   and arrived at <break/>
                   <say-as type="time">3:45pm</say-as>.</sentence>
                   <sentence>The subject is <prosody
                   rate="-20%">ski trip</prosody></sentence>
          </paragraph>
          </speak>
        
     S->C:RTSP/1.0 200 OK
          CSeq:4
          Session:12345678
          RTP-Info:url=rtsp://media.server.com/media/synthesizer;
                     seq=9810092;rtptime=3450012
          Content-Type:application/mrcp
          Content-Length:456
        
          MRCP/1.0 543257 200 IN-PROGRESS
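
The tunneling shown in this exchange, an MRCP message carried as the body of an RTSP ANNOUNCE, can be sketched in Python as follows. The helper names `mrcp_request` and `announce` and the shortened SSML body are hypothetical, and the sketch computes each Content-Length from the actual bytes instead of reusing the values printed in the example.

```python
CRLF = "\r\n"


def mrcp_request(method: str, request_id: int, headers: dict,
                 content_type: str = "", body: str = "") -> str:
    """Build an MRCP request; content headers are added only with a body."""
    payload = body.encode("utf-8")
    lines = [f"{method} {request_id} MRCP/1.0"]
    lines += [f"{name}:{value}" for name, value in headers.items()]
    if payload:
        lines.append(f"Content-Type:{content_type}")
        lines.append(f"Content-Length:{len(payload)}")
    return CRLF.join(lines) + CRLF + CRLF + body


def announce(url: str, cseq: int, session: str, mrcp_message: str) -> bytes:
    """Wrap an MRCP message in an RTSP ANNOUNCE request."""
    body = mrcp_message.encode("utf-8")
    head = CRLF.join([
        f"ANNOUNCE {url} RTSP/1.0",
        f"CSeq:{cseq}",
        f"Session:{session}",
        "Content-Type:application/mrcp",
        f"Content-Length:{len(body)}",
    ]) + CRLF + CRLF
    return head.encode("ascii") + body


# Hypothetical, shortened prompt for illustration.
ssml = '<?xml version="1.0"?><speak>You have 4 new messages.</speak>'
speak = mrcp_request("SPEAK", 543257, {"Kill-On-Barge-In": "false"},
                     "application/synthesis+ssml", ssml)
message = announce("rtsp://media.server.com/media/synthesizer",
                   4, "12345678", speak)
```

The same `announce` wrapper would serve for requests to the recognizer resource by changing only the URL.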

The synthesizer hits the special marker in the message to be spoken and faithfully informs the client of the event.

     S->C:ANNOUNCE rtsp://media.server.com/media/synthesizer RTSP/1.0
          CSeq:5
          Session:12345678
          Content-Type:application/mrcp
          Content-Length:123

          SPEECH-MARKER 543257 IN-PROGRESS MRCP/1.0
          Speech-Marker:Stephanie
     C->S:RTSP/1.0 200 OK
          CSeq:5
        
The synthesizer finishes with the SPEAK request.

     S->C:ANNOUNCE rtsp://media.server.com/media/synthesizer RTSP/1.0
          CSeq:6
          Session:12345678
          Content-Type:application/mrcp
          Content-Length:123
        
          SPEAK-COMPLETE 543257 COMPLETE MRCP/1.0

     C->S:RTSP/1.0 200 OK
          CSeq:6
        
The recognizer is issued a request to listen for the customer choices.

     C->S:ANNOUNCE rtsp://media.server.com/media/recognizer RTSP/1.0
          CSeq:7
          Session:12345678
        
          RECOGNIZE 543258 MRCP/1.0
          Content-Type:application/grammar+xml
          Content-Length:104
        
          <?xml version="1.0"?>
        
          <!-- the default grammar language is US English -->
          <grammar xml:lang="en-US" version="1.0">
        
          <!-- single language attachment to a rule expansion -->
               <rule id="request">
                   Can I speak to
                   <one-of xml:lang="fr-CA">
                            <item>Michel Tremblay</item>
                            <item>Andre Roy</item>
                   </one-of>
               </rule>
        
          </grammar>
        
     S->C:RTSP/1.0 200 OK
          CSeq:7
          Content-Type:application/mrcp
          Content-Length:123
        
          MRCP/1.0 543258 200 IN-PROGRESS

The client issues the next MRCP SPEAK method in an ANNOUNCE message, asking the user the question. It is generally RECOMMENDED when playing a prompt to the user with kill-on-barge-in and asking for input, that the client issue the RECOGNIZE request ahead of the SPEAK request for optimum performance and user experience. This way, it is guaranteed that the recognizer is online before the prompt starts playing and the user's speech will not be truncated at the beginning (especially for power users).
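
The recommended ordering can be expressed as a short Python sketch. The `send` callback, the function name, and the fake transport used to exercise it are hypothetical stand-ins for whatever the client uses to exchange MRCP messages; only the ordering (RECOGNIZE accepted first, SPEAK second) comes from the text above.

```python
from itertools import count


def play_prompt_with_barge_in(send, request_ids):
    """Issue RECOGNIZE before SPEAK so the recognizer is already listening.

    `send(resource, method, request_id)` is a caller-supplied function
    that delivers the request and returns the request-state it got back.
    """
    recognize_id = next(request_ids)
    state = send("recognizer", "RECOGNIZE", recognize_id)
    if state != "IN-PROGRESS":
        raise RuntimeError("recognizer is not listening; prompt not started")
    speak_id = next(request_ids)
    send("synthesizer", "SPEAK", speak_id)
    return recognize_id, speak_id


# Fake transport that records the order of requests for illustration.
sent = []


def fake_send(resource, method, request_id):
    sent.append((resource, method, request_id))
    return "IN-PROGRESS"


ids = play_prompt_with_barge_in(fake_send, count(543258))
```

With the request-ids used in this example, the recognizer receives 543258 before the synthesizer receives 543259.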

     C->S:ANNOUNCE rtsp://media.server.com/media/synthesizer RTSP/1.0
          CSeq:8
          Session:12345678
          Content-Type:application/mrcp
          Content-Length:733
        
          SPEAK 543259 MRCP/1.0
          Kill-On-Barge-In:true
          Content-Type:application/synthesis+ssml
          Content-Length:104
        
          <?xml version="1.0"?>
          <speak>
          <paragraph>
                   <sentence>Welcome to ABC corporation.</sentence>
                   <sentence>Who would you like to talk to?</sentence>
          </paragraph>
          </speak>
        
     S->C:RTSP/1.0 200 OK
          CSeq:8
          Content-Type:application/mrcp
          Content-Length:123
        
          MRCP/1.0 543259 200 IN-PROGRESS

Since the last SPEAK request had Kill-On-Barge-In set to "true", the message synthesizer is interrupted when the user starts speaking, and the client is notified.

Now, since the recognition and synthesizer resources are in the same session, they worked with each other to deliver kill-on-barge-in. If the resources were in different sessions, it would have taken a few more messages before the client got the SPEAK-COMPLETE event from the synthesizer resource. Whether the synthesizer and recognizer are in the same session or not, the recognizer MUST generate the START-OF-SPEECH event to the client.

In the different-session case, the client would then have blindly turned around and issued a BARGE-IN-OCCURRED method to the synthesizer resource. The synthesizer, if kill-on-barge-in was enabled on the current SPEAK request, would then have interrupted it and issued a SPEAK-COMPLETE event to the client. In this example, since the synthesizer and recognizer are in the same session, the client did not issue the BARGE-IN-OCCURRED method to the synthesizer and assumed that kill-on-barge-in was implemented between the two resources in the same session and worked.

The completion-cause code differentiates if this is normal completion or a kill-on-barge-in interruption.
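
A client could inspect the header value along the lines of the Python sketch below. The function names are hypothetical, and the sketch assumes that the synthesizer cause codes are 000 for normal completion and 001 for barge-in; that assignment should be checked against the Completion-Cause definition for the synthesizer resource before relying on it.

```python
# Assumed synthesizer cause code for a barge-in interruption.
BARGE_IN_CODE = 1


def parse_completion_cause(value: str):
    """Split "000 normal" into the numeric code and the cause name."""
    code_text, _, name = value.strip().partition(" ")
    return int(code_text), name


def was_barge_in(value: str) -> bool:
    """True if the SPEAK request ended because of barge-in."""
    code, _name = parse_completion_cause(value)
    return code == BARGE_IN_CODE
```

For the header value shown in the exchange that follows, `parse_completion_cause("000 normal")` yields the code 0 and the name "normal".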

     S->C:ANNOUNCE rtsp://media.server.com/media/recognizer RTSP/1.0
          CSeq:9
          Session:12345678
          Content-Type:application/mrcp
          Content-Length:273
        
          START-OF-SPEECH 543258 IN-PROGRESS MRCP/1.0

     C->S:RTSP/1.0 200 OK
          CSeq:9
        
     S->C:ANNOUNCE rtsp://media.server.com/media/synthesizer RTSP/1.0
          CSeq:10
          Session:12345678
          Content-Type:application/mrcp
          Content-Length:273
        
          SPEAK-COMPLETE 543259 COMPLETE MRCP/1.0
          Completion-Cause:000 normal

     C->S:RTSP/1.0 200 OK
          CSeq:10
        
The recognition resource matched the spoken stream to a grammar and generated results. The result of the recognition is returned by the server as part of the RECOGNITION-COMPLETE event.

     S->C:ANNOUNCE rtsp://media.server.com/media/recognizer RTSP/1.0
          CSeq:11
          Session:12345678
          Content-Type:application/mrcp
          Content-Length:733

          RECOGNITION-COMPLETE 543258 COMPLETE MRCP/1.0
          Completion-Cause:000 success
          Waveform-URL:http://web.media.com/session123/audio.wav
          Content-Type:application/x-nlsml
          Content-Length:104
        
          <?xml version="1.0"?>
          <result x-model="http://IdentityModel"
            xmlns:xf="http://www.w3.org/2000/xforms"
            grammar="session:request1@form-level.store">
              <interpretation>
                  <xf:instance name="Person">
                      <Person>
                          <Name> Andre Roy </Name>
                      </Person>
                  </xf:instance>
                            <input>   may I speak to Andre Roy </input>
              </interpretation>
          </result>
        
     C->S:RTSP/1.0 200 OK
          CSeq:11
        
     C->S:TEARDOWN rtsp://media.server.com/media/synthesizer RTSP/1.0
          CSeq:12
          Session:12345678
        
     S->C:RTSP/1.0 200 OK
          CSeq:12
        
We are done with the resources and are tearing them down. When the last of the resources for this session are released, the Session-ID and the RTP/RTCP ports are also released.

     C->S:TEARDOWN rtsp://media.server.com/media/recognizer RTSP/1.0
          CSeq:13
          Session:12345678
        
     S->C:RTSP/1.0 200 OK
          CSeq:13
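
The release rule described above, where the session and its ports go away with the last resource, can be modeled with a small Python sketch. The class name `MediaSession` and its methods are hypothetical bookkeeping for illustration, not part of the protocol.

```python
class MediaSession:
    """Tracks which resources share one session ID and RTP/RTCP port pair."""

    def __init__(self, session_id: str, rtp_port: int, rtcp_port: int):
        self.session_id = session_id
        self.ports = (rtp_port, rtcp_port)
        self.resources = set()
        self.released = False

    def setup(self, resource: str) -> None:
        """Record a resource opened with SETUP under this session."""
        if self.released:
            raise RuntimeError("session already released")
        self.resources.add(resource)

    def teardown(self, resource: str) -> bool:
        """Record a TEARDOWN; return True once the session is released."""
        self.resources.discard(resource)
        if not self.resources:
            # Last resource gone: session ID and ports are released too.
            self.released = True
        return self.released


session = MediaSession("12345678", 46460, 46461)
session.setup("synthesizer")
session.setup("recognizer")
after_first = session.teardown("synthesizer")
after_second = session.teardown("recognizer")
```

In the example above, the first TEARDOWN leaves the session alive for the recognizer and the second one releases it.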
        
12. Informative References

[1] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

[2] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, April 1998.

[3] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 4234, October 2005.

[4] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

[5] Handley, M. and V. Jacobson, "SDP: Session Description Protocol", RFC 2327, April 1998.

[6] World Wide Web Consortium, "Voice Extensible Markup Language (VoiceXML) Version 2.0", W3C Candidate Recommendation, March 2004.

[7] Resnick, P., "Internet Message Format", RFC 2822, April 2001.

[8] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[9] World Wide Web Consortium, "Speech Synthesis Markup Language (SSML) Version 1.0", W3C Candidate Recommendation, September 2004.

[10] World Wide Web Consortium, "Natural Language Semantics Markup Language (NLSML) for the Speech Interface Framework", W3C Working Draft, 30 May 2001.

[11] World Wide Web Consortium, "Speech Recognition Grammar Specification Version 1.0", W3C Candidate Recommendation, March 2004.

[12] Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC 3629, November 2003.

[13] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996.

[14] Levinson, E., "Content-ID and Message-ID Uniform Resource Locators", RFC 2392, August 1998.

[15] Schulzrinne, H. and S. Petrack, "RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals", RFC 2833, May 2000.

[16] Alvestrand, H., "Tags for the Identification of Languages", BCP 47, RFC 3066, January 2001.

Appendix A. ABNF Message Definitions
   ALPHA          =  %x41-5A / %x61-7A   ; A-Z / a-z
        
   CHAR           =  %x01-7F     ; any 7-bit US-ASCII character,
                                 ;    excluding NUL
        
   CR             =  %x0D        ; carriage return
        
   CRLF           =  CR LF       ; Internet standard newline

   DIGIT          =  %x30-39     ; 0-9
        
   DQUOTE         =  %x22        ; " (Double Quote)
        
   HEXDIG         =  DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
        
   HTAB           =  %x09        ; horizontal tab
        
   LF             =  %x0A        ; linefeed
        
   OCTET          =  %x00-FF     ; 8 bits of data
        
   SP             =  %x20        ; space
        
   WSP            =  SP / HTAB   ; white space
        
   LWS            =  [*WSP CRLF] 1*WSP ; linear whitespace
        
   SWS            =  [LWS] ; sep whitespace
        
   UTF8-NONASCII  =  %xC0-DF 1UTF8-CONT
                  /  %xE0-EF 2UTF8-CONT
                  /  %xF0-F7 3UTF8-CONT
                  /  %xF8-FB 4UTF8-CONT
                  /  %xFC-FD 5UTF8-CONT
        
   UTF8-CONT      =  %x80-BF
        
   param          =  *pchar
        
   quoted-string  =  SWS DQUOTE *(qdtext / quoted-pair ) DQUOTE

   qdtext         =  LWS / %x21 / %x23-5B / %x5D-7E
                     / UTF8-NONASCII
        
   quoted-pair    =  "\" (%x00-09 / %x0B-0C / %x0E-7F)

   token          =  1*(alphanum / "-" / "." / "!" / "%" / "*"
                      / "_" / "+" / "`" / "'" / "~" )
        
   reserved       =  ";" / "/" / "?" / ":" / "@" / "&" / "="
                     / "+" / "$" / ","
        
   mark           =  "-" / "_" / "." / "!" / "~" / "*" / "'"
                     / "(" / ")"
        
   unreserved     =  alphanum / mark
        
   pchar          =  unreserved / escaped /
                     ":" / "@" / "&" / "=" / "+" / "$" / ","
        
   alphanum       =  ALPHA / DIGIT
        
   escaped        =  "%" HEXDIG HEXDIG

   absoluteURI    =  scheme ":" ( hier-part / opaque-part )
        
   relativeURI    =  ( net-path / abs-path / rel-path )
                     [ "?" query ]
        
   relativeURI    =  ( net-path / abs-path / rel-path )
                     [ "?" query ]
        
   hier-part      =  ( net-path / abs-path ) [ "?" query ]
        
   hier-part      =  ( net-path / abs-path ) [ "?" query ]
        
   net-path       =  "//" authority [ abs-path ]
        
   net-path       =  "//" authority [ abs-path ]
        

abs-path = "/" path-segments

abs path=“/”路径段

rel-path = rel-segment [ abs-path ]

rel路径=rel段[abs路径]

   rel-segment    =  1*( unreserved / escaped / ";" / "@"
                     / "&" / "=" / "+" / "$" / "," )
        
   rel-segment    =  1*( unreserved / escaped / ";" / "@"
                     / "&" / "=" / "+" / "$" / "," )
        

opaque-part = uric-no-slash *uric

不透明部分=无斜线*无斜线

   uric           =  reserved / unreserved / escaped

   uric-no-slash  =  unreserved / escaped / ";" / "?" / ":"
                     / "@" / "&" / "=" / "+" / "$" / ","

   path-segments  =  segment *( "/" segment )

   segment        =  *pchar *( ";" param )

   scheme         =  ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )

   authority      =  srvr / reg-name

   srvr           =  [ [ userinfo "@" ] hostport ]

   reg-name       =  1*( unreserved / escaped / "$" / ","
                     / ";" / ":" / "@" / "&" / "=" / "+" )

   query          =  *uric

   userinfo       =  ( user ) [ ":" password ] "@"

   user           =  1*( unreserved / escaped
                       / user-unreserved )

   user-unreserved  =  "&" / "=" / "+" / "$" / "," / ";"
                       / "?" / "/"

   password         =  *( unreserved / escaped /
                       "&" / "=" / "+" / "$" / "," )

   hostport         =  host [ ":" port ]

   host             =  hostname / IPv4address / IPv6reference

   hostname         =  *( domainlabel "." ) toplabel [ "." ]

   domainlabel      =  alphanum
                       / alphanum *( alphanum / "-" ) alphanum

   toplabel       =    ALPHA / ALPHA *( alphanum / "-" )
                       alphanum

   IPv4address    =    1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "."
                       1*3DIGIT

   IPv6reference  =    "[" IPv6address "]"

   IPv6address    =    hexpart [ ":" IPv4address ]

   hexpart        =    hexseq / hexseq "::" [ hexseq ] / "::"
                       [ hexseq ]

   hexseq         =    hex4 *( ":" hex4)

   hex4           =    1*4HEXDIG

   port           =    1*DIGIT

   generic-message =   start-line message-header CRLF
                       [ message-body ]

   message-body   =    *OCTET

   start-line     =    request-line / status-line / event-line

   request-line   =    method-name SP request-id SP
                       mrcp-version CRLF

   status-line    =    mrcp-version SP request-id SP
                       status-code SP request-state CRLF

   event-line     =    event-name SP request-id SP
                       request-state SP mrcp-version CRLF
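
The request-line, status-line, and event-line rules differ only in the
order of their space-separated fields, so a receiver can tell them apart
by position. The following Python sketch is illustrative only and is not
part of the protocol definition. The function name parse_start_line and
the returned dictionary keys are assumptions made for this example. The
method, event, and state names are copied from the rules in this
appendix, and the sketch compares them case-sensitively as a
simplification.

```python
import re

# Names copied from the synthesizer-method, recognizer-method,
# synthesizer-event, recognizer-event and request-state rules.
METHODS = {
    "SET-PARAMS", "GET-PARAMS", "SPEAK", "STOP", "PAUSE", "RESUME",
    "BARGE-IN-OCCURRED", "CONTROL", "DEFINE-GRAMMAR", "RECOGNIZE",
    "GET-RESULT", "RECOGNITION-START-TIMERS",
}
EVENTS = {
    "SPEECH-MARKER", "SPEAK-COMPLETE",
    "START-OF-SPEECH", "RECOGNITION-COMPLETE",
}
STATES = {"COMPLETE", "IN-PROGRESS", "PENDING"}

VERSION = re.compile(r"MRCP/[0-9]+\.[0-9]+")  # mrcp-version
NUMBER = re.compile(r"[0-9]+")                # request-id, status-code


def parse_start_line(line):
    """Classify one start-line given without its trailing CRLF.

    Hypothetical helper: returns a dict with a "kind" key or raises
    ValueError when the line matches none of the three forms.
    """
    parts = line.split(" ")
    if len(parts) == 3:
        # request-line: method-name SP request-id SP mrcp-version
        method, request_id, version = parts
        if (method in METHODS and NUMBER.fullmatch(request_id)
                and VERSION.fullmatch(version)):
            return {"kind": "request", "name": method,
                    "request_id": int(request_id), "version": version}
    elif len(parts) == 4 and VERSION.fullmatch(parts[0]):
        # status-line: mrcp-version SP request-id SP status-code
        #              SP request-state
        version, request_id, status, state = parts
        if (NUMBER.fullmatch(request_id) and NUMBER.fullmatch(status)
                and state in STATES):
            return {"kind": "status", "version": version,
                    "request_id": int(request_id),
                    "status_code": int(status), "state": state}
    elif len(parts) == 4:
        # event-line: event-name SP request-id SP request-state
        #             SP mrcp-version
        event, request_id, state, version = parts
        if (event in EVENTS and NUMBER.fullmatch(request_id)
                and state in STATES and VERSION.fullmatch(version)):
            return {"kind": "event", "name": event,
                    "request_id": int(request_id),
                    "state": state, "version": version}
    raise ValueError("not a valid start-line: %r" % line)
```

For instance, parse_start_line("SPEAK 543257 MRCP/1.0") is classified as
a request, while a line that begins with the version token is treated as
a status-line.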

   message-header =    1*(generic-header / resource-header)

   generic-header =    active-request-id-list
                  /    proxy-sync-id
                  /    content-id
                  /    content-type
                  /    content-length
                  /    content-base
                  /    content-location
                  /    content-encoding
                  /    cache-control
                  /    logging-tag
   ; -- content-id is as defined in RFC 2392 and RFC 2046

   mrcp-version   =    "MRCP" "/" 1*DIGIT "." 1*DIGIT

   request-id     =    1*DIGIT

   status-code    =    1*DIGIT

   active-request-id-list = "Active-Request-Id-List" ":"
                       request-id *("," request-id) CRLF

   proxy-sync-id  =    "Proxy-Sync-Id" ":" 1*ALPHA CRLF

   content-length =    "Content-Length" ":" 1*DIGIT CRLF

   content-base   =    "Content-Base" ":" absoluteURI CRLF

   content-type   =    "Content-Type" ":" media-type

   media-type     =    type "/" subtype *( ";" parameter )

   type           =    token

   subtype        =    token

   parameter      =    attribute "=" value

   attribute      =    token

   value          =    token / quoted-string

   content-encoding =  "Content-Encoding" ":"
                       *WSP content-coding
                       *(*WSP "," *WSP content-coding *WSP )
                       CRLF

   content-coding   =  token

   content-location =  "Content-Location" ":"
                       ( absoluteURI / relativeURI ) CRLF

   cache-control  =    "Cache-Control" ":"
                       *WSP cache-directive
                       *( *WSP "," *WSP cache-directive *WSP )
                       CRLF

   cache-directive =   "max-age" "=" delta-seconds
                   /   "max-stale" "=" delta-seconds
                   /   "min-fresh" "=" delta-seconds
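
As an illustration of the cache-control and cache-directive rules, the
following Python sketch splits a header value into its directives. It is
a minimal example under stated assumptions, not a reference
implementation: the function name parse_cache_control is hypothetical,
the input is assumed to be a single header line with the CRLF already
removed, and a repeated directive simply overwrites the earlier value.

```python
import re

# cache-directive: one of three names, "=", then delta-seconds (1*DIGIT).
# ABNF string literals are case-insensitive, hence re.IGNORECASE.
DIRECTIVE = re.compile(r"(max-age|max-stale|min-fresh)=([0-9]+)",
                       re.IGNORECASE)


def parse_cache_control(header_line):
    """Return a dict mapping directive name to seconds (hypothetical)."""
    name, sep, value = header_line.partition(":")
    if not sep or name.lower() != "cache-control":
        raise ValueError("not a Cache-Control header: %r" % header_line)
    directives = {}
    for item in value.split(","):
        # *WSP is allowed around each directive; strip spaces and tabs.
        match = DIRECTIVE.fullmatch(item.strip(" \t"))
        if match is None:
            raise ValueError("bad cache-directive: %r" % item)
        directives[match.group(1).lower()] = int(match.group(2))
    return directives
```

Under these assumptions, the line "Cache-Control: max-age=86400" yields
a dictionary with the single key "max-age" mapped to 86400.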
        
   logging-tag    =    "Logging-Tag" ":" 1*ALPHA CRLF

   resource-header =   recognizer-header / synthesizer-header

   method-name    =    synthesizer-method / recognizer-method

   event-name     =    synthesizer-event / recognizer-event

   request-state  =    "COMPLETE" / "IN-PROGRESS" / "PENDING"

   synthesizer-method = "SET-PARAMS"
                  /    "GET-PARAMS"
                  /    "SPEAK"
                  /    "STOP"
                  /    "PAUSE"
                  /    "RESUME"
                  /    "BARGE-IN-OCCURRED"
                  /    "CONTROL"

   synthesizer-event = "SPEECH-MARKER" / "SPEAK-COMPLETE"

   synthesizer-header =     jump-target
                      /     kill-on-barge-in
                      /     speaker-profile
                      /     completion-cause
                      /     voice-parameter
                      /     prosody-parameter
                      /     vendor-specific
                      /     speech-marker
                      /     speech-language
                      /     fetch-hint
                      /     audio-fetch-hint
                      /     fetch-timeout
                      /     failed-uri
                      /     failed-uri-cause
                      /     speak-restart
                      /     speak-length

   recognizer-method = "SET-PARAMS"
                      /    "GET-PARAMS"
                      /    "DEFINE-GRAMMAR"
                      /    "RECOGNIZE"
                      /    "GET-RESULT"
                      /    "RECOGNITION-START-TIMERS"
                      /    "STOP"

   recognizer-event = "START-OF-SPEECH" / "RECOGNITION-COMPLETE"

   recognizer-header =      confidence-threshold
                     /      sensitivity-level
                     /      speed-vs-accuracy
                     /      n-best-list-length
                     /      no-input-timeout
                     /      recognition-timeout
                     /      waveform-url
                     /      completion-cause
                     /      recognizer-context-block
                     /      recognizer-start-timers
                     /      vendor-specific
                     /      speech-complete-timeout
                     /      speech-incomplete-timeout
                     /      dtmf-interdigit-timeout
                     /      dtmf-term-timeout
                     /      dtmf-term-char
                     /      fetch-timeout
                     /      failed-uri
                     /      failed-uri-cause
                     /      save-waveform
                     /      new-audio-channel
                     /      speech-language

   jump-target    =    "Jump-Size" ":" speech-length-value CRLF

   speech-length-value = numeric-speech-length
                       / text-speech-length

   text-speech-length =     1*ALPHA SP "Tag"

   numeric-speech-length =("+" / "-") 1*DIGIT SP
                       numeric-speech-unit

   numeric-speech-unit =    "Second"
                       /    "Word"
                       /    "Sentence"
                       /    "Paragraph"
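
A speech-length-value is either a signed count followed by a unit or a
marker name followed by the literal "Tag". The following Python sketch
shows one way to distinguish the two forms. It is illustrative only: the
function name parse_speech_length and the tuple layout it returns are
assumptions for this example, and the sketch follows the rules exactly
as printed here, so a numeric value without a leading sign is rejected.

```python
import re

# numeric-speech-length: sign, 1*DIGIT, SP, numeric-speech-unit
NUMERIC = re.compile(r"([+-])([0-9]+) (Second|Word|Sentence|Paragraph)",
                     re.IGNORECASE)
# text-speech-length: 1*ALPHA, SP, "Tag"
TEXT = re.compile(r"([A-Za-z]+) Tag", re.IGNORECASE)


def parse_speech_length(value):
    """Parse a speech-length-value (hypothetical helper).

    Returns ("numeric", signed_count, unit) or ("tag", marker_name).
    """
    match = NUMERIC.fullmatch(value)
    if match:
        sign = -1 if match.group(1) == "-" else 1
        unit = match.group(3).capitalize()  # normalize e.g. "WORD"
        return ("numeric", sign * int(match.group(2)), unit)
    match = TEXT.fullmatch(value)
    if match:
        return ("tag", match.group(1))
    raise ValueError("bad speech-length-value: %r" % value)
```

The numeric pattern is tried first, although the two forms cannot
overlap because a text-speech-length never starts with a sign.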
        
   delta-seconds  =    1*DIGIT

   kill-on-barge-in =  "Kill-On-Barge-In" ":" boolean-value CRLF

   boolean-value  =    "true" / "false"

   speaker-profile =   "Speaker-Profile" ":" absoluteURI CRLF

   completion-cause =  "Completion-Cause" ":" 1*DIGIT SP
                       1*ALPHA CRLF
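
The completion-cause rule pairs a numeric code with a name. The Python
sketch below extracts both parts. It is a minimal example, not a
definitive implementation: the function name parse_completion_cause is
hypothetical, the input is assumed to have its CRLF removed, and the
pattern follows the rule as printed (no whitespace after the colon,
letters only in the name), so a deployed parser may need to be more
lenient.

```python
import re

# "Completion-Cause" ":" 1*DIGIT SP 1*ALPHA, matched case-insensitively
# because ABNF string literals are case-insensitive.
COMPLETION_CAUSE = re.compile(r"Completion-Cause:([0-9]+) ([A-Za-z]+)",
                              re.IGNORECASE)


def parse_completion_cause(header_line):
    """Return (cause_code, cause_name) for one header line."""
    match = COMPLETION_CAUSE.fullmatch(header_line)
    if match is None:
        raise ValueError("bad Completion-Cause: %r" % header_line)
    return int(match.group(1)), match.group(2)
```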
        

   voice-parameter =   "Voice-" voice-param-name ":"
                       voice-param-value CRLF

   voice-param-name =  1*ALPHA

   voice-param-value = 1*alphanum

   prosody-parameter = "Prosody-" prosody-param-name ":"
                       prosody-param-value CRLF

   prosody-param-name =     1*ALPHA

   prosody-param-value = 1*alphanum

   vendor-specific =   "Vendor-Specific-Parameters" ":"
                       vendor-specific-av-pair
                       *[";" vendor-specific-av-pair] CRLF

   vendor-specific-av-pair = vendor-av-pair-name "="
                       vendor-av-pair-value

   vendor-av-pair-name = 1*ALPHA

   vendor-av-pair-value = 1*alphanum
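
The vendor-specific rule carries a semicolon-separated list of
name=value pairs. The following Python sketch splits such a header into
a dictionary. It is illustrative only: the function name
parse_vendor_specific and the sample parameter names in the usage note
are hypothetical, the input is assumed to have its CRLF removed, and the
character classes follow the rules as printed (letters for names,
letters and digits for values).

```python
import re

# vendor-specific-av-pair: 1*ALPHA "=" 1*alphanum
AV_PAIR = re.compile(r"([A-Za-z]+)=([A-Za-z0-9]+)")


def parse_vendor_specific(header_line):
    """Return a dict of vendor attribute names to string values."""
    name, sep, value = header_line.partition(":")
    if not sep or name.lower() != "vendor-specific-parameters":
        raise ValueError("not a vendor-specific header: %r" % header_line)
    pairs = {}
    for item in value.split(";"):
        match = AV_PAIR.fullmatch(item)
        if match is None:
            raise ValueError("bad av-pair: %r" % item)
        # A repeated name overwrites the earlier value in this sketch.
        pairs[match.group(1)] = match.group(2)
    return pairs
```

For example, "Vendor-Specific-Parameters:mode=fast;threads=4" yields the
pairs mode=fast and threads=4, with both values kept as strings.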
        
   speech-marker  =    "Speech-Marker" ":" 1*ALPHA CRLF

   speech-language =   "Speech-Language" ":" 1*ALPHA CRLF

   fetch-hint     =    "Fetch-Hint" ":" 1*ALPHA CRLF

   audio-fetch-hint =  "Audio-Fetch-Hint" ":" 1*ALPHA CRLF

   fetch-timeout  =    "Fetch-Timeout" ":" 1*DIGIT CRLF

   failed-uri     =    "Failed-URI" ":" absoluteURI CRLF

   failed-uri-cause =  "Failed-URI-Cause" ":" 1*ALPHA CRLF

   speak-restart  =    "Speak-Restart" ":" boolean-value CRLF

   speak-length   =    "Speak-Length" ":" speech-length-value CRLF

   confidence-threshold = "Confidence-Threshold" ":"
                       1*DIGIT CRLF

   sensitivity-level = "Sensitivity-Level" ":" 1*DIGIT CRLF

   speed-vs-accuracy = "Speed-Vs-Accuracy" ":" 1*DIGIT CRLF

   n-best-list-length = "N-Best-List-Length" ":" 1*DIGIT CRLF

   no-input-timeout =  "No-Input-Timeout" ":" 1*DIGIT CRLF

   recognition-timeout = "Recognition-Timeout" ":" 1*DIGIT CRLF

   waveform-url   =    "Waveform-URL" ":" absoluteURI CRLF

   recognizer-context-block = "Recognizer-Context-Block" ":"
                       1*ALPHA CRLF

   recognizer-start-timers = "Recognizer-Start-Timers" ":"
                       boolean-value CRLF

   speech-complete-timeout = "Speech-Complete-Timeout" ":"
                       1*DIGIT CRLF

   speech-incomplete-timeout = "Speech-Incomplete-Timeout" ":"
                       1*DIGIT CRLF

   dtmf-interdigit-timeout = "DTMF-Interdigit-Timeout" ":"
                       1*DIGIT CRLF

   dtmf-term-timeout = "DTMF-Term-Timeout" ":" 1*DIGIT CRLF

   dtmf-term-char =    "DTMF-Term-Char" ":" CHAR CRLF

   save-waveform  =    "Save-Waveform" ":" boolean-value CRLF

   new-audio-channel = "New-Audio-Channel" ":" boolean-value CRLF

Appendix B. Acknowledgements

   Andre Gillet (Nuance Communications)
   Andrew Hunt (SpeechWorks)
   Aaron Kneiss (SpeechWorks)
   Kristian Finlator (SpeechWorks)
   Martin Dragomirecky (Cisco Systems, Inc.)
   Pierre Forgues (Nuance Communications)
   Suresh Kaliannan (Cisco Systems, Inc.)
   Corey Stohs (Cisco Systems, Inc.)
   Dan Burnett (Nuance Communications)

Authors' Addresses

   Saravanan Shanmugham
   Cisco Systems, Inc.
   170 W. Tasman Drive
   San Jose, CA 95134

   EMail: sarvi@cisco.com

   Peter Monaco
   Nuasis Corporation
   303 Bryant St.
   Mountain View, CA 94041

   EMail: peter.monaco@nuasis.com

   Brian Eberman
   Speechworks, Inc.
   695 Atlantic Avenue
   Boston, MA 02111

   EMail: brian.eberman@speechworks.com

Full Copyright Statement

Copyright (C) The Internet Society (2006).

This document is subject to the rights, licenses and restrictions contained in BCP 78 and at www.rfc-editor.org/copyright.html, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgement

Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).
