A comparison of H323 and SIP based architectures for VoIP systems

Author: Marion Walker
VINCENZO LANGONE
Switching & Routing Division, Alcatel Italia, via Bosco Primo, 84091 Battipaglia (SA), ITALY

MAURIZIO LONGO, MARIACARMELA SPATOLA (*), PIERANGELO ARCIELLO (**)
DIIIE Dpt., University of Salerno, via Ponte Don Melillo 1, 84084 Fisciano (SA), ITALY
(*) Currently with IIASS, Vietri sul Mare (SA), ITALY; (**) currently with ASCOM, Napoli, ITALY

Abstract: - Communicating via packet data networks, such as IP, has become a preferred strategy for both corporate and public network planners. Data traffic has been predicted to exceed telephone traffic soon, if it has not already. At the same time, more and more companies see the value of carrying voice over their IP networks (IP telephony) to reduce telephone and facsimile costs and to set the stage for advanced multimedia applications. Providing high-quality telephony over IP networks is one of the key steps in the convergence of voice, fax, video, and data communications services. IP telephony has now been proven feasible; the race is on to adopt standards, design terminals and gateways, and begin the rollout of services on a global scale. Although PSTN and IP networks are fundamentally different in terms of routing and performance, the networks can be connected so that they exchange voice and data traffic. We consider two protocols for Internet telephony: H.323, designed by the International Telecommunication Union (ITU), and the Session Initiation Protocol (SIP), designed by the Internet Engineering Task Force (IETF). We present how the two protocols were designed to solve different signalling problems and compare them on the issues that both address.

1 VoIP concepts

IP telephony can be defined as the ability to make telephone calls over IP-based data networks, rather than over the familiar Public Switched Telephone Network (PSTN). [1] VoIP is the term used, within IP telephony, for a set of services that manage the delivery of voice information using the Internet Protocol. [2] Advantages of this technology include lower cost for long-distance phone calls, because they are charged as local calls. Another advantage of IP telephony is that it uses bandwidth more efficiently for real-time voice transmission. Even though basic telephony is the initial application for IP telephony, longer-term benefits are expected from multimedia and multiservice applications. [1] Voice will therefore not be the only service offered in VoIP networks, whose technology is scalable and flexible enough to introduce new services (sound, video, graphics, etc.). [3]

The PSTN is a circuit-switched network. It dedicates a fixed amount of bandwidth to each conversation, and thus quality is guaranteed. Conversely, the IP network is a packet-switched network, hence the Quality of Service (QoS) is not guaranteed: since IP packets carrying voice are treated just like IP packets carrying any other type of data, they are subject to delays, loss and retransmissions, more so when the network is congested. QoS is defined as “the collective effect of service performance which determine the degree of satisfaction of a user of the service”. In IP telephony the QoS perceived by the user depends mainly on two things: the perceived voice quality and the delay in a two-way conversation. [4] The relevant parameters to describe QoS in VoIP are:
- Call set-up time: the delay experienced by the end user between dialling the number and the establishment of the audio connection between the terminals.
- End-to-end delay: the time the voice takes to travel from speaker to listener.
- Echo: the reflection of the speaker’s voice that is heard in the speaker’s own ear.
- Packet loss: the share of voice packets that never reach the listener, since the IP network is unreliable.
- Jitter: the variation in the delay of successive packets. [3], [5]

In the early days of VoIP, voice could be carried over IP networks only in PC-to-PC communications, which relied on end-to-end IP connectivity. The situation became more complicated when people wished to reach PSTN destinations. A new entity, called a VoIP gateway, was envisaged to interconnect portions of both networks. Gateways act as end systems on both the IP network and the PSTN. This means that an IP host wishing to contact a PSTN user first contacts a gateway, which terminates the IP portion of the call and initiates a new call on the PSTN to the final destination (Fig. 1). During the session the gateway has to perform two types of translation:
- translation between the signalling procedures used in each network;
- translation between the media encodings used in each network.
All these operations must be performed automatically and transparently to the user. [6]
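The QoS parameters listed earlier in this section (packet loss, end-to-end delay and jitter) can be estimated from per-packet timestamps. The following is a minimal sketch, not a measurement method taken from the paper: the function name `qos_summary`, the dictionary layout and the simple jitter definition (mean absolute change in delay between consecutive received packets) are all assumptions made for illustration, and sender and receiver clocks are assumed to be synchronised.

```python
from statistics import mean


def qos_summary(sent, received):
    """Estimate loss, mean one-way delay and jitter for one voice stream.

    sent:     {sequence_number: send_time_in_seconds}
    received: {sequence_number: arrival_time_in_seconds}
    (hypothetical data layout, chosen only for this sketch)
    """
    # One-way delay of every packet that actually arrived, in sending order.
    delays = [received[seq] - sent[seq] for seq in sorted(sent) if seq in received]
    # Fraction of sent packets that never arrived.
    loss = 1 - len(delays) / len(sent) if sent else 0.0
    mean_delay = mean(delays) if delays else None
    # Jitter taken here as the mean absolute change in delay between
    # consecutive received packets (an assumed, simplified definition).
    steps = [abs(b - a) for a, b in zip(delays, delays[1:])]
    jitter = mean(steps) if steps else 0.0
    return {"loss": loss, "mean_delay": mean_delay, "jitter": jitter}


if __name__ == "__main__":
    sent = {0: 0.00, 1: 0.02, 2: 0.04, 3: 0.06}
    received = {0: 0.10, 1: 0.13, 3: 0.18}  # packet 2 was lost
    print(qos_summary(sent, received))
```

In this toy stream one packet in four is lost and the delay grows by roughly 10 ms from packet to packet, which is what the jitter figure reflects.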

Fig. 1: IP Telephony

2 H.323 concepts

The H.323 standard deals with the transmission of real-time audio, video, and data communications over packet-based networks. It specifies the components, protocols and procedures providing multimedia communications. [7]

2.1 Components

The H.323 standard specifies four kinds of components: terminals, gateways, gatekeepers and Multipoint Control Units (MCUs) (Fig. 2).

Terminal: a terminal is used for real-time bi-directional multimedia communications. An H.323 terminal can be either a PC or a stand-alone device. It supports audio and can optionally support video or data communications.

Gateway: a gateway connects two dissimilar networks. An H.323 gateway provides connectivity between an H.323 network and a non-H.323 network. For example, a gateway can connect an H.323 terminal and the PSTN and provide communications between them. This connectivity between dissimilar networks is achieved by translating protocols for call set-up and release, converting media formats between the networks, and transferring information between the networks connected by the gateway. A gateway is not required, however, for communications between two terminals within an H.323 network. [7]

Gatekeeper: a gatekeeper acts as the central point for all calls within its zone and provides call control services to registered endpoints. In many ways, an H.323 gatekeeper acts like a virtual switch. The collection of all terminals, gateways and MCUs managed by a single gatekeeper is known as an H.323 Zone. The gatekeeper has both mandatory and optional services. The following services are mandatory:
- Address translation: translation of an alias address to a transport address, using a table that is updated with registration messages.
- Admission control: authorization of LAN access using Admission Request, Confirm and Reject messages. LAN access may be based on call authorization, bandwidth, or some other criteria.
- Bandwidth control: handling of Bandwidth Request, Confirm and Reject messages when a terminal requests additional capacity during a call.
- Zone management: the gatekeeper provides the above functions for the terminals, MCUs and gateways registered within its Zone of control.

2.2 Main protocols specified by H.323

Table 1: H.323 protocols

Function          Protocols
Video             H.261, H.263
Audio             G.711, G.722, G.723, G.728, G.729
Multiplexing      H.225.0 (RAS, Q.931)
Control           H.245
Data              T.120
Comm. interface   TCP/IP

Overall system control is provided by three separate signalling functions (Table 1): the H.245 control channel, the H.225.0 call signalling channel, and the RAS channel. H.245 runs over a reliable channel. It carries control messages for capabilities exchange, opening and closing of channels for audio and video streams, preference requests and flow control. H.225.0 is made up of the RAS (Registration, Admission, Status) and Q.931 signalling protocols. RAS runs over an unreliable channel and carries registration, admission, bandwidth change and status messages between two H.323 entities. The Q.931 signalling used in an H.323 system is a subset of the Q.931 messages defined for ISDN basic call control, and is used for call set-up and termination. [8], [9]
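The registration and admission exchanges carried by RAS between an endpoint and its gatekeeper can be pictured with a toy model. The sketch below is only an illustration under stated assumptions: the class `Gatekeeper`, its method names, the bandwidth bookkeeping in kbit/s and the message strings are hypothetical and do not reproduce the H.225.0 message encoding. It shows registration updating the alias table, and an admission request being confirmed or rejected according to registration and available bandwidth.

```python
class Gatekeeper:
    """Toy gatekeeper: alias table plus a simple zone bandwidth budget."""

    def __init__(self, zone_bandwidth_kbps):
        self.alias_table = {}                 # alias -> (ip, port) transport address
        self.free_kbps = zone_bandwidth_kbps  # bandwidth still available in the zone

    def register(self, alias, transport_address):
        # Registration updates the table used for address translation.
        self.alias_table[alias] = transport_address
        return "RegistrationConfirm"

    def admit(self, callee_alias, requested_kbps):
        # Admission is rejected for unknown aliases or insufficient bandwidth
        # (the criteria are an assumption; the standard allows others).
        if callee_alias not in self.alias_table:
            return ("AdmissionReject", None)
        if requested_kbps > self.free_kbps:
            return ("AdmissionReject", None)
        self.free_kbps -= requested_kbps
        # The confirm carries the translated transport address of the callee.
        return ("AdmissionConfirm", self.alias_table[callee_alias])

    def release(self, kbps):
        # Give bandwidth back when a call ends.
        self.free_kbps += kbps


if __name__ == "__main__":
    gk = Gatekeeper(zone_bandwidth_kbps=128)
    print(gk.register("alice", ("10.0.0.5", 1720)))  # illustrative address
    print(gk.admit("alice", 64))    # confirmed, 64 kbit/s reserved
    print(gk.admit("alice", 100))   # rejected, only 64 kbit/s left
    print(gk.admit("bob", 10))      # rejected, alias not registered
```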

Fig. 2: H.323 architectures and components

Audio signals contain digitised and compressed speech. H.323 terminals must support the G.711 voice standard for speech coding. Support for other ITU voice standards is optional. While video capabilities are optional, any video-enabled H.323 terminal must support the H.261 codec; support for H.263 is optional. Video information is transmitted at a rate no greater than that selected during the capability exchange.

H.323 supports data conferencing through the T.120 specification, which addresses point-to-point and multipoint data conferences. T.120 provides interoperability at the application, network, and transport level.

Within the IP stack, unreliable services are provided by the User Datagram Protocol (UDP). H.323 uses UDP for audio, video, and the RAS channel, and reliable (TCP) end-to-end services for H.245, T.120 and Q.931. [8]
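Because G.711 is mandatory and all other audio codecs are optional, two terminals always share at least one audio codec after the capability exchange. The sketch below illustrates this with a deliberately simplified selection rule: the function `select_audio_codec` and the "first common codec in the caller's preference order" policy are assumptions for illustration, not the H.245 procedure itself.

```python
MANDATORY_AUDIO = "G.711"  # every H.323 terminal must support it


def select_audio_codec(local_preferences, remote_capabilities):
    """Pick the first locally preferred codec that the remote side supports.

    Falls back to G.711, which both sides are required to implement.
    """
    remote = set(remote_capabilities) | {MANDATORY_AUDIO}
    for codec in list(local_preferences) + [MANDATORY_AUDIO]:
        if codec in remote:
            return codec


if __name__ == "__main__":
    # Both sides know G.723, so it wins over the mandatory fallback.
    print(select_audio_codec(["G.729", "G.723"], ["G.723", "G.711"]))
    # No optional codec in common: the mandatory codec is used.
    print(select_audio_codec(["G.729"], ["G.728"]))
```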

2.3 Other protocols

Security: the H.235 standard addresses four general security issues: authentication, integrity, privacy and non-repudiation.

Supplementary services: supplementary services for H.323, namely Call Transfer and Call Diversion, have been defined by the H.450 series. H.450.1 defines the signalling protocol between H.323 endpoints for the control of supplementary services. H.450.2 defines Call Transfer and H.450.3 defines Call Diversion. [8]

2.4 Call scenario

An example of a call flow from the PSTN to an H.323 terminal is the following (Fig. 3):
1. A PSTN subscriber dials an access number that is provided by the Internet telephony service provider.
2. The call is routed by the PSTN to the “access” Internet telephony switch.
3. The gateway plays an announcement requesting that the subscriber enter the destination number to be called. The collected destination digit information is sent in a call set-up request message to the gatekeeper.
4. The gatekeeper determines a destination gatekeeper IP address based on the destination digit information. An IP packet requesting the availability status of the destination H.323 terminal is sent to the destination gatekeeper.
5. The destination gatekeeper responds to the request by providing the destination terminal availability status and IP address information to the originating gateway.
6. The originating gateway sets up a virtual circuit to the destination H.323 terminal. This circuit is identified by a call-reference variable that will be used by both the originating gateway and the H.323 terminal for the duration of the call to identify all IP packets associated with that particular call.
7. If the H.323 terminal indicates that call set-up is successful and the called party has answered, IP signalling messages are sent to the originating gatekeeper, which then signals the originating gateway. The originating gateway signals the originating PSTN switch to indicate that the call is now completed. The exchange of IP packets proceeds until either the calling or the called party terminates the call. [10]
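Step 4 of the scenario, where the gatekeeper maps the collected digits to a destination gatekeeper, can be sketched as a table lookup. The paper does not say how the mapping is done; the longest-prefix rule, the function name `find_destination_gatekeeper` and the addresses below (taken from the documentation range 192.0.2.0/24) are assumptions chosen only to make the step concrete.

```python
def find_destination_gatekeeper(digits, prefix_table):
    """Return the gatekeeper address whose number prefix best matches the digits.

    prefix_table: {number_prefix: gatekeeper_ip} (hypothetical routing table).
    The longest matching prefix wins; None means no gatekeeper serves the number.
    """
    best = None
    for prefix in prefix_table:
        if digits.startswith(prefix) and (best is None or len(prefix) > len(best)):
            best = prefix
    return prefix_table[best] if best is not None else None


if __name__ == "__main__":
    table = {"39": "192.0.2.10", "39089": "192.0.2.20"}
    print(find_destination_gatekeeper("390891234", table))  # more specific prefix
    print(find_destination_gatekeeper("390612345", table))  # country-level prefix
    print(find_destination_gatekeeper("441234567", table))  # no match
```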

Fig. 3: Call scenario (elements shown: PSTN telephone switches, gateways, gatekeepers, LAN/MAN and the H.323 terminal; the numbers 1 to 7 mark the steps of the call flow)

3 SIP signalling

The Session Initiation Protocol (SIP) is an application-layer signalling protocol conceived by the Internet Engineering Task Force (IETF) to establish, modify and terminate sessions or calls with one or more participants. [11]

3.1 Network elements

SIP defines three types of network elements: terminals, proxy servers and redirect servers. The minimum configuration needed for SIP-based IP telephony, however, is two terminals. Servers are mainly used to route and redirect calls. A SIP server can operate in either proxy or redirect mode: a redirect server tells the caller to contact another server directly, whereas a proxy server contacts one or more next-hop servers itself and passes the call request on. The proxy server has to maintain call state, whereas a redirect server can forget the call request once it has been processed. [4]

3.2 Protocols

SIP is designed as part of the overall IETF multimedia data and control architecture, which currently incorporates protocols such as RSVP to reserve network resources, RTP to transport real-time data, RTSP to control delivery of streaming media, SAP to advertise multimedia sessions and SDP to describe multimedia sessions (Fig. 4). However, the functionality and operation of SIP do not depend on any of these protocols. [11]

Fig. 4: Protocols architecture

The functionality of SIP is concentrated on signalling: the protocol itself covers basic signalling, user location and registration, while the other services reside in separate protocols. The SIP architecture is modular, so individual functions can be easily replaced. SIP uses the Session Description Protocol (SDP) to describe the capabilities and the media types supported by the terminals. SDP messages are text-based and list the features that must be implemented in the endpoints. SDP messages are mainly carried within SIP messages, but other means of delivery can be used instead. Sessions can also be announced using the Session Announcement Protocol (SAP), which is primarily used for announcing large public and broadcast sessions thanks to its multicast signalling feature. [4], [12]
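To make the text-based nature of SDP concrete, the sketch below assembles a minimal audio session description. The helper `build_sdp`, its parameters and the sample values are hypothetical; the field letters (v, o, s, c, t, m) and the static RTP payload type numbers 0 and 8 for the two G.711 variants are standard SDP/RTP conventions, but the description is intentionally minimal and omits optional fields.

```python
def build_sdp(user, session_id, host_ip, audio_port, payload_types):
    """Build a minimal SDP description offering one RTP audio stream."""
    formats = " ".join(str(pt) for pt in payload_types)
    lines = [
        "v=0",                                                   # protocol version
        f"o={user} {session_id} {session_id} IN IP4 {host_ip}",  # origin
        "s=VoIP call",                                           # session name
        f"c=IN IP4 {host_ip}",                                   # connection address
        "t=0 0",                                                 # unbounded session time
        f"m=audio {audio_port} RTP/AVP {formats}",               # media line
    ]
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Payload types 0 and 8 are G.711 mu-law and A-law; the address and
    # port are illustrative values only.
    print(build_sdp("alice", 2890844526, "192.0.2.1", 49170, [0, 8]))
```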

3.3 SIP messages

SIP uses a client-server approach: the client transmits requests and the server returns responses. A single call may involve several clients and servers. [4]

A SIP message is either a request from a client to a server or a response from a server to a client. Messages are in text format, similar to HTTP. Both request and response messages use a generic message format, consisting of a start-line, one or more header fields and an optional message-body. [11]

Client requests invoke methods on the server. A request message consists of a request-line (specifying the method and the protocol version), a number of header fields (general headers, request headers and entity headers, specifying call properties and service information) and an optional message-body (which can contain a session description). The following methods are used in SIP:
- INVITE invites a user to a conference;
- ACK is used for reliable message exchanges;
- OPTIONS signals information about capabilities;
- BYE terminates a connection between two users;
- CANCEL cancels a pending request;
- REGISTER conveys location information to a SIP server.

After receiving and interpreting a request message, the recipient responds with a SIP response message. It consists of a status-line (protocol version followed by a numeric status-code and its associated textual phrase), a number of header fields (general headers, response headers, entity headers) and an optional message-body. The status-code is a three-digit integer that indicates the outcome of the attempt to understand and satisfy the request. The codes are organised hierarchically: the first digit represents the class of response and the other two digits provide additional information. SIP allows six values for the first digit:
- 1xx: informational;
- 2xx: success;
- 3xx: redirection;
- 4xx: client error;
- 5xx: server error;
- 6xx: global failure. [4], [11]
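The start-line structure and the status-code classes described above can be sketched as a small parser. This is an illustrative sketch only: the function `parse_start_line` and the dictionary it returns are hypothetical, and only the six methods and six response classes named in this section are recognised.

```python
METHODS = {"INVITE", "ACK", "OPTIONS", "BYE", "CANCEL", "REGISTER"}

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
    6: "global failure",
}


def parse_start_line(line):
    """Classify a SIP start-line as a request-line or a status-line."""
    parts = line.strip().split(" ", 2)
    if len(parts) != 3:
        raise ValueError(f"malformed start-line: {line!r}")
    if parts[0].startswith("SIP/"):
        # Status-line: version, three-digit code, textual phrase.
        version, code_text, reason = parts
        code = int(code_text)
        if code // 100 not in STATUS_CLASSES:
            raise ValueError(f"unknown status class: {code}")
        return {"kind": "response", "version": version, "code": code,
                "class": STATUS_CLASSES[code // 100], "reason": reason}
    # Request-line: method, request URI, version.
    method, uri, version = parts
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    return {"kind": "request", "method": method, "uri": uri, "version": version}


if __name__ == "__main__":
    print(parse_start_line("INVITE sip:bob@example.com SIP/2.0"))
    print(parse_start_line("SIP/2.0 180 Ringing"))
    print(parse_start_line("SIP/2.0 404 Not Found"))
```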

3.4 Operations

Caller and callee are identified by SIP addresses of the form user@host. In many cases, a user’s SIP URL can be guessed from their e-mail address. When making a SIP call, a caller first locates the appropriate server and then sends a SIP request. The most common SIP operation is the invitation. The INVITE request typically contains a session description (for example written in SDP format) that gives the called party the information about features and media formats it needs to join the session. This capability exchange is performed during call set-up, but may also be needed during the call. The client sends one or more SIP requests to the server and receives one or more responses from it. A request together with the responses it triggers makes up a SIP transaction. [4]
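An INVITE that carries a session description, as described above, can be sketched by assembling the request-line, a handful of header fields and the body. The helper `build_invite`, its parameters and the sample addresses are hypothetical, and only a small set of common headers is included; a real user agent would add more.

```python
def build_invite(caller, callee, call_id, sdp_body, via_host):
    """Assemble a minimal SIP INVITE whose body is an SDP description.

    caller and callee are addresses of the form user@host.
    """
    body_length = len(sdp_body.encode("utf-8"))
    head = [
        f"INVITE sip:{callee} SIP/2.0",   # request-line: method, URI, version
        f"Via: SIP/2.0/UDP {via_host}",   # where responses should be sent back
        f"From: <sip:{caller}>",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        "Content-Type: application/sdp",
        f"Content-Length: {body_length}",
    ]
    # A blank line separates the header fields from the message-body.
    return "\r\n".join(head) + "\r\n\r\n" + sdp_body


if __name__ == "__main__":
    sdp = "v=0\r\nc=IN IP4 192.0.2.1\r\nm=audio 49170 RTP/AVP 0\r\n"
    print(build_invite("alice@example.com", "bob@example.org",
                       "1234@pc1.example.com", sdp, "pc1.example.com"))
```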

3.5 Call set-up

A successful call set-up consists of two requests: INVITE followed by ACK. Conversely, a negative answer can be given with a BYE. The invitation may pass through several servers on its way to the called site. A proxy server receives the request and forwards it towards the location of the called party (Fig. 5). A redirect server only informs the caller about the next hop, and the caller sends a new request directly to the suggested receiver (Fig. 6). When the user has been contacted, a response is sent back to the caller. [12]
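The difference between the proxy and redirect modes can be sketched as two small functions sharing the same location table. Everything here is a simplified assumption for illustration: the function names, the `deliver` callback standing in for the network, and the use of status 302 as one example of a 3xx redirection response.

```python
def redirect_server(callee, locations):
    """Answer with the next hop; the caller must send a new request itself."""
    return {"status": 302, "contact": locations[callee]}


def proxy_server(callee, locations, deliver):
    """Forward the request to the next hop on behalf of the caller."""
    return deliver(locations[callee])


def invite_via_redirect(callee, locations, deliver):
    answer = redirect_server(callee, locations)
    # The caller contacts the suggested address directly.
    return deliver(answer["contact"])


def invite_via_proxy(callee, locations, deliver):
    # The caller sends a single request; the proxy does the forwarding.
    return proxy_server(callee, locations, deliver)


if __name__ == "__main__":
    locations = {"bob@example.org": "bob@pc7.example.org"}  # hypothetical table

    def deliver(address):
        # Stand-in for the network: the contacted terminal accepts the call.
        return {"status": 200, "answered_at": address}

    print(invite_via_proxy("bob@example.org", locations, deliver))
    print(invite_via_redirect("bob@example.org", locations, deliver))
```

Both paths end at the same terminal; what differs is who sends the request over the last hop, which is why the redirect server can forget the call as soon as it has answered.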

Fig. 5: Invite with proxy server

Fig. 6: Invite with redirect server

4 Comparison between H.323 and SIP

The two protocols are compared below with respect to their main aspects, as summarised in Table 2.

Network architecture: the basic configuration in both H.323 and SIP consists of at least two terminals connected to a Local Area Network (LAN). However, in practical applications some of the other entities must be added in order to obtain an efficient communications system that is connected to the outside world. Each entity brings additional functionality to the network.

Transport protocol: for control signals H.323 requires reliable transport, whereas SIP does not require a reliable transport protocol (reliability is achieved by retransmitting requests every 0.5 seconds until a response is returned).

Addressing: each physical H.323 entity has one network address, which depends on the network environment. Endpoints may have one or several alias addresses. A gatekeeper is needed to resolve all aliases, which forces small H.323 networks without gatekeepers to use only host names. The addressing format chosen in SIP is an e-mail-like identifier of the form user@host.

Complexity: much of the complexity of H.323 comes from the multiple protocol components it consists of. The components are tightly intertwined and cannot be used or exchanged separately. The message format also adds complexity: H.323 is based on ASN.1 (Abstract Syntax Notation One) and uses a binary representation. SIP is a noticeably simpler protocol, whose messages are in text format, similar to HTTP.

Call set-up delay: as a consequence of the complexity of H.323, the set-up delay can be rather long. Call set-up requires about 6 to 7 round-trip times (RTT), depending on whether a gatekeeper is used, against about 1.5 RTT for SIP (Table 2).

Extensibility: since Internet telephony is still immature and continuously under development, additional signalling capabilities are likely to be needed in the future as new applications appear. It is important that extensions can be added to the existing protocols. H.323 provides nonstandardParam fields placed in various locations in the ASN.1 structures. The coding and value of these fields are vendor-specific and have meaning only for that vendor, so terminals from different vendors have no way to exchange information about the extensions they support. SIP takes the same approach as HTTP: new headers can be added to SIP messages, and because header names are textual the fields are self-describing, so different developers can easily understand and support each other's new features.

Codec support: in H.323 each codec has to be centrally registered and standardised (by the ITU only) before it can be included in H.323 applications. This is a significant limitation for small software developers. In SIP codecs are identified by string names, so they are not limited to a predefined set. SIP codecs can also be registered with IANA (Internet Assigned Numbers Authority) to improve compatibility.

Table 2: Protocols comparison

Aspect                  H.323                                         SIP
Network architecture    The basic configuration is the same, but specific entities must be added (both protocols)
Transport protocol      Reliable protocol required                    Any, allowing for use of connectionless protocols
Addressing              Alias                                         E-mail-like address
Complexity              High: binary representation (ASN.1)           Low: text-based messages like HTTP
Call set-up delay       6 to 7 RTT                                    1.5 RTT
Extensibility           nonstandardParam fields in ASN.1 structures   New headers can easily be added to SIP messages
Codec support           ITU-registered codecs                         Any IANA-registered codecs
Protocol architecture   Monolithic                                    Modular
Scalability             Low                                           High
Supported services      Roughly the same (both protocols)
Loop detection          None                                          Possible thanks to the Via header
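The call set-up delay row of Table 2 can be turned into rough figures for a given network. The sketch below is simple arithmetic under stated assumptions: the function name is hypothetical, the 100 ms round-trip time is an arbitrary example value, and processing time in terminals and servers is ignored.

```python
H323_ROUND_TRIPS = (6, 7)   # range given in Table 2
SIP_ROUND_TRIPS = 1.5       # value given in Table 2


def setup_delay_ms(rtt_ms, round_trips):
    """Signalling delay in milliseconds, counting network round trips only."""
    return rtt_ms * round_trips


if __name__ == "__main__":
    rtt_ms = 100  # assumed example round-trip time
    low, high = (setup_delay_ms(rtt_ms, n) for n in H323_ROUND_TRIPS)
    print(f"H.323: {low} to {high} ms")
    print(f"SIP:   {setup_delay_ms(rtt_ms, SIP_ROUND_TRIPS)} ms")
```

With the assumed 100 ms round trip, H.323 signalling takes 600 to 700 ms before media can flow, against 150 ms for SIP.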

Protocol architecture: H.323 is an umbrella standard consisting of Recommendations H.225, H.245, etc. Together they form a vertically integrated protocol suite. The services of the components are intertwined in such a way that all components are required and are difficult to replace. The SIP protocol includes basic call signalling, user location and registration. The other services reside in separate protocols, so individual components can be replaced.

Scalability: H.323 was originally designed for use on a single LAN. SIP is engineered for WANs, so the maximum network size is not limited. Another scalability issue is that H.323 uses a stateful model, in which gatekeepers must keep call state for the entire duration of a call. In SIP either a stateful or a stateless model can be used: SIP messages contain sufficient state information to allow the response to be forwarded correctly.

Supported services: both protocols offer roughly equivalent services, and new services are still being added.

Loop detection: SIP traces the route of messages using Via headers. These headers help messages find their path back to the sender and are used for loop detection. When a server notices that its own address is already listed in the Via headers of a message, it does not forward the message and sends back an error message instead. In H.323 there is no easy way to perform loop detection. [12]

Security: security mechanisms are supported by H.323 version 2. Their usage is specified in standard H.235. SIP has HTTP-like security mechanisms.

Interoperability: SIP terminals can only contact other SIP terminals. Correspondingly, H.323 terminals are only able to contact other H.323 terminals. The result is two islands unable to communicate with each other. Gateways between SIP and H.323 are required; otherwise calls must pass through the PSTN. [12]
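The Via-based loop detection described above can be sketched in a few lines. The function `forward`, the dictionary message layout and the host names are hypothetical; the error returned uses status 482 (Loop Detected), which is the SIP response defined for this situation.

```python
def forward(message, own_address):
    """Forward a request, or refuse it if this server has already handled it.

    message: {"via": [most recent hop, ..., first hop]} (hypothetical layout).
    """
    if own_address in message["via"]:
        # Our own address is already on the path: the request has looped.
        return {"status": 482, "reason": "Loop Detected"}
    # Otherwise record this hop at the top of the Via list and pass it on.
    return {**message, "via": [own_address] + message["via"]}


if __name__ == "__main__":
    request = {"via": ["pc1.example.com"]}
    hop1 = forward(request, "proxy-a.example.com")
    hop2 = forward(hop1, "proxy-b.example.com")
    print(hop2["via"])
    # The request comes back to proxy-a, which detects the loop.
    print(forward(hop2, "proxy-a.example.com"))
```

The recorded Via list also gives the reverse path for responses, which is how a stateless server can still route an answer back to the sender.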

5 Conclusions

Which signalling protocol will be used for Internet telephony tomorrow? The question is not easy to answer, as the protocols are still being developed and the number of supporting applications is small. H.323 started as a protocol for multimedia communications on a LAN, so it ran into significant problems when extended to wide area networks. These problems were addressed in SIP from the start. Today, all commercially available products are based on H.323, apart from a large number of non-standard systems running their own protocols. H.323 is a complete protocol, and it is also very similar to the other H.32x protocols, which simplifies connections to already available equipment. Support for SIP is almost non-existent today. Furthermore, it is difficult for SIP to gain acceptance when all the largest software developers support H.323. In addition to traditional telephony, the future will bring new types of services based on speech transmission. These applications will require lightweight signalling protocols, like SIP. Simplicity and extensibility are important factors for gaining widespread usage on the Internet.

In future, some applications and devices may support both protocols, but in the long run only one protocol is desirable, since including two protocols in every device is not very practical. If H.323 is regarded as best suited for intranet telephony and SIP as better suited for telephony on the Internet, we are left with two systems, whose interconnection would require a SIP-H.323 gateway. It is also possible to mix the protocols by using different protocols in different phases of a call. [12]

References:
[1] Young David J., IP Telephony for Service Providers… and the power of Convergence, January 1999, Scidyn Corporation, http://www.scidyn.com/pdf/white.pdf
[2] Fingal Fredrik, Gustavsson Patrik, A SIP of IP-telephony, Master’s Thesis, Department of Communication Systems, Lund Institute of Technology, Lund University, 10 February 1999, http://www.cs.columbia.edu/~hgs/sip/drafts/Fing9902_SIP.pdf
[3] Bertrand E., Roth S., De Vleeschauwer D., Van Doorselaer B., Issues for VoIP, 21 May 1999, Alcatel CRC
[4] Huovinen Lasse, Niu Shuanghong, IP Telephony, Department of Computer Science and Engineering, Helsinki University of Technology, http://www.hut.fi/~lhuovine/study/iwork99/voip.html
[5] Telogy Networks’ Voice over Packet White Paper, Telogy-Networks, http://www.telogy.com/our_products/golden_gateway/VOPwhite.html
[6] Bertrand Emmanuel, VoIP call routing and gateway location protocols, 2 May 1999, Alcatel CRC
[7] H.323 Tutorial, 22 May 1999, Trillium Digital Systems, Web ProForums, http://www.webproforum.com/h323/
[8] A Primer on the H.323 Series Standard (Version 2.0), DataBeam, http://www.databeam.com/h323/h323primer.html
[9] Unverrich Rod, Voice Over IP – Understanding H.323, 26 June 1998, HP Network Systems Test Division, http://www.tmo.hp.com/tmo/pia/internetadvisor/PIAApp/tutorials/English/IntAdvisor_VoIP.html
[10] Internet Protocol (IP) / Intelligent Network (IN) Integration-Tutorial, http://www.microlegend.com/itswitch.htm
[11] Handley M., Schulzrinne H., Schooler E., Rosenberg J., SIP: Session Initiation Protocol, RFC 2543, March 1999
[12] Beijar Nicklas, Signaling Protocols for Internet Telephony – Architectures based on H.323 and SIP, 20 October 1998, Helsinki University of Technology, http://keskus.hut.fi/tutkimus/ipana/paperit/sip.pdf
