A Scalable ITSP Architecture Using VoIP and Asterisk

Overview

This guideline outlines the development of an Internet Telephony Service Provider infrastructure designed to initially support tens of thousands of simultaneous users and capable of scaling to hundreds of thousands of simultaneous users.

High Level View

The Tools for the Job

Network

This architecture assumes the availability of multiple Internet Service Provider (ISP) partners to provide edge termination for geographically diverse customers. The system will interface directly with the backbones of the ISPs to enable high call quality. Connection quality to the available providers will be monitored continuously, and the information gathered will be used to determine the most efficient routes to end users.

The internal switched network is assumed to support Quality of Service (QoS), and the core backbone of the clustered servers will leverage a Gigabit Ethernet fabric.

Asterisk

Asterisk is a highly versatile telephony platform. The application's namesake is, appropriately, the character traditionally used as a wildcard (*). The ability of Asterisk to reliably provide features ranging from a TDM gateway to a conference server without licensing costs has fueled rapid adoption in both the enterprise and small business markets.

The large majority of consumer interaction will be managed by Asterisk. For example, voicemail, call transfer, directory listings, and basic IVR will all be functions of the Asterisk servers. The architecture does not call for the Asterisk servers to maintain unique user information; upon receiving a call, the server will query the central database for permissions, user-specific data and a Least Cost Route (LCR). Upon completion of a call, the individual systems will populate the central database with information regarding the transaction (primarily call detail records and voicemail messages).
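The per-call database interaction described above can be sketched roughly as follows. This is only an illustration: an in-memory SQLite database stands in for the central database, and the table names, column names, phone numbers and gateway names are all hypothetical. The route selection shown simply picks the cheapest route whose prefix matches the dialed number; a production LCR policy may weigh other factors.

```python
import sqlite3

def open_demo_db():
    """Create an in-memory database with a hypothetical schema and sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE users  (number TEXT PRIMARY KEY, may_call_out INTEGER);
        CREATE TABLE routes (prefix TEXT, gateway TEXT, cost_per_min REAL);
        CREATE TABLE cdr    (caller TEXT, callee TEXT, gateway TEXT, seconds INTEGER);
    """)
    db.executemany("INSERT INTO users VALUES (?, ?)",
                   [("15550001", 1), ("15550002", 0)])
    db.executemany("INSERT INTO routes VALUES (?, ?, ?)",
                   [("1", "gw-a", 0.020), ("1555", "gw-b", 0.012), ("1555", "gw-c", 0.015)])
    return db

def authorize_and_route(db, caller, callee):
    """Return the cheapest matching gateway, or None if the call is not permitted."""
    row = db.execute("SELECT may_call_out FROM users WHERE number = ?",
                     (caller,)).fetchone()
    if row is None or not row[0]:
        return None  # unknown caller, or caller lacks outbound permission
    route = db.execute(
        "SELECT gateway FROM routes WHERE ? LIKE prefix || '%' "
        "ORDER BY cost_per_min LIMIT 1", (callee,)).fetchone()
    return route[0] if route else None

def record_call(db, caller, callee, gateway, seconds):
    """Write a call detail record once the call has completed."""
    db.execute("INSERT INTO cdr VALUES (?, ?, ?, ?)", (caller, callee, gateway, seconds))
    db.commit()

db = open_demo_db()
gateway = authorize_and_route(db, "15550001", "15551234")
if gateway is not None:
    record_call(db, "15550001", "15551234", gateway, 42)
```

Because the Asterisk node holds no user state of its own in this sketch, any node in the farm could serve the same call with the same result.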

The servers used for the Asterisk system are relatively modern x86-based servers. Each node will be able to handle between 60 and 200 simultaneous SIP streams, depending on the cost of the processing undertaken on each call. To enable a larger number of users to access the system simultaneously, the architecture leverages an array of Asterisk servers load balanced by a SIP Express Router.
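As a rough sizing aid, the per-node range above translates into a node count as shown below. The figure of 10,000 concurrent streams and the optional spare-node parameter are assumptions chosen for illustration, not targets taken from this guideline.

```python
import math

def nodes_needed(concurrent_streams, streams_per_node, spare_nodes=0):
    """Return the number of Asterisk nodes needed to carry the given load."""
    if streams_per_node <= 0:
        raise ValueError("streams_per_node must be positive")
    return math.ceil(concurrent_streams / streams_per_node) + spare_nodes

# Hypothetical load of 10,000 concurrent streams at both ends of the 60 to 200 range.
heavy = nodes_needed(10_000, 60)    # expensive per-call processing -> 167 nodes
light = nodes_needed(10_000, 200)   # cheap per-call processing     -> 50 nodes
print(heavy, light)
```

The spread between the two results shows why the mix of per-call work (transcoding, conferencing, simple bridging) matters more to hardware cost than the raw number of subscribers.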

SIP Express Router (SER)

The SER systems will be responsible for creating outbound calls, validating inbound calls and routing incoming traffic to available Asterisk servers. The system will not manage the actual transfer of the media once the call is established, but will initiate the transaction. This allows the SER application to simultaneously govern tens of thousands of calls.
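The routing decision can be illustrated with a small sketch. This is not SER configuration or SER's actual dispatch logic; it is a Python model of one plausible policy (send each new call to the healthy node with the most free capacity). The node names, the capacity values and the health flag are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class AsteriskNode:
    name: str
    capacity: int         # assumed maximum simultaneous SIP streams for this node
    active: int = 0       # streams currently assigned to this node
    healthy: bool = True  # set to False when monitoring marks the node as down

def pick_node(nodes):
    """Return the healthy node with the most free capacity, or None if all are full."""
    candidates = [n for n in nodes if n.healthy and n.active < n.capacity]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.capacity - n.active)

def route_call(nodes):
    """Assign one new call to a node and return its name, or None if rejected."""
    node = pick_node(nodes)
    if node is None:
        return None
    node.active += 1
    return node.name

farm = [AsteriskNode("ast-01", 2), AsteriskNode("ast-02", 2, active=1),
        AsteriskNode("ast-03", 2, healthy=False)]
assignments = [route_call(farm) for _ in range(4)]
```

Only the signalling decision is modelled here; as noted above, the media stream itself does not pass through the router once the call is established.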

SER is a highly efficient application with a very small footprint. It is central to the architecture and thus must not be a single point of failure. The SER systems can be made redundant with a two-node Linux HA cluster.

Management Interface

A web-based system management utility will be used to perform the majority of the administration tasks. Skyy Consulting offers many custom integrated tools which modify the configurations of all the systems involved in the ITSP in real time. Skyy Consulting also specializes in integrating these tools with your custom billing, CRM and web front-end systems.

Database

The architecture prevents data conflicts from arising by maintaining all unique customer data in a single database which is accessible by all nodes in the system. Without redundancy, this also introduces a single point of failure. A MySQL cluster is a reasonable starting point for an entry-level ITSP project. As the project evolves, database growth can be handled by scaling the cluster and creating failover slave databases.

Though a MySQL cluster is specified in this document, any SQL-compliant database capable of scaling to high-volume use will be adequate. Major database vendors provide solutions which can also be integrated into the Skyy Consulting ITSP architecture.

Due to the possibility of spikes in demand, the system will integrate a JMS queuing system to throttle non-immediate queries and writes. This will not limit the ability of procedures which require an immediate or near-immediate response to query the database directly.
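The throttling idea can be sketched as follows. The architecture specifies a JMS queue; the class below is only a single-process Python stand-in that shows the two paths (queued versus direct), and its name, the `urgent` flag and the per-cycle limit are all hypothetical.

```python
from collections import deque

class DeferredWriter:
    """Buffer non-urgent statements and flush at most max_per_tick of them per cycle."""

    def __init__(self, execute, max_per_tick):
        self._execute = execute      # callable that actually runs a statement
        self._max = max_per_tick     # throttle: statements allowed per flush cycle
        self._pending = deque()

    def submit(self, statement, urgent=False):
        if urgent:
            self._execute(statement)          # direct path bypasses the queue
        else:
            self._pending.append(statement)   # deferred path waits for a flush

    def tick(self):
        """Flush up to max_per_tick queued statements and return how many ran."""
        count = min(self._max, len(self._pending))
        for _ in range(count):
            self._execute(self._pending.popleft())
        return count

    def backlog(self):
        return len(self._pending)

executed = []
writer = DeferredWriter(executed.append, max_per_tick=2)
for cdr in ("cdr-1", "cdr-2", "cdr-3"):
    writer.submit(cdr)                      # call detail records can wait
writer.submit("auth-lookup", urgent=True)   # call setup cannot wait
flushed = writer.tick()
```

After one cycle, the urgent lookup has run first, two of the three queued records have been written, and one remains in the backlog for the next cycle.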

System Management/Monitoring

Administrative management of the system will include mechanisms for growth and failure monitoring.

Call and route quality are monitored via services residing on both the SER and Asterisk servers. The call quality service on the SER machines will constantly query available SIP gateways for latency and update the Least Cost Routing table accordingly. In the event that latency increases above a pre-determined number, the system will prevent future calls from being made to that gateway until the latency drops to an acceptable level.
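A minimal sketch of this gating logic is shown below. The 150 ms threshold, the gateway names and the class itself are assumptions made for illustration; the actual probe mechanism and the structure of the Least Cost Routing table are not specified here.

```python
class GatewayMonitor:
    """Track the latest probe latency per gateway and gate routing on a threshold."""

    def __init__(self, max_latency_ms):
        self.max_latency_ms = max_latency_ms
        self.latency_ms = {}   # gateway name -> most recent measured latency

    def record_probe(self, gateway, latency_ms):
        self.latency_ms[gateway] = latency_ms

    def is_usable(self, gateway):
        # A gateway that has never been probed is treated as unusable.
        latency = self.latency_ms.get(gateway)
        return latency is not None and latency <= self.max_latency_ms

    def usable_routes(self, routes_by_cost):
        """Filter a cost-ordered gateway list, preserving the cost order."""
        return [g for g in routes_by_cost if self.is_usable(g)]

monitor = GatewayMonitor(max_latency_ms=150)
monitor.record_probe("gw-a", 80)
monitor.record_probe("gw-b", 220)
before = monitor.usable_routes(["gw-b", "gw-a", "gw-c"])   # gw-b is too slow
monitor.record_probe("gw-b", 120)                          # latency recovers
after = monitor.usable_routes(["gw-b", "gw-a", "gw-c"])
```

Because only the latest probe is kept, a gateway returns to service as soon as one probe falls back under the threshold; a real deployment might prefer to require several good probes in a row to avoid flapping.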

In addition, for troubleshooting purposes call monitoring will be enabled on the Asterisk servers. A technician will be able to ‘listen in’ or record a call via an administrative interface. For in-depth troubleshooting of audio streams, Ethereal based capture and analysis tools will be made available by Skyy Consulting.

SER, Asterisk, Linux HA, MySQL, and JBoss include SNMP (Simple Network Management Protocol) modules. In the event of a system failure, SNMP traps will be sent to the NOC (Network Operations Center).

An operating system image will be hosted on the internal network so that additional Asterisk servers can be provisioned quickly, allowing for rapid scaling and recovery of the system.

Central Storage

A centrally accessible storage infrastructure is necessary to provide access to voicemail and the various files that need to be universally available to the Asterisk server farm. For example, if a caller leaves a voicemail on Server 12, the recipient should be able to access it from Server 35.

The system can leverage the existing storage infrastructure of the ITSP, or integrate a new SAN, iSCSI or RAID array.

If necessary, the system can sustain an internal Parallel Virtual File System (PVFS) for storage requirements. PVFS is a distributed file system which can be universally mounted to all servers in the Asterisk farm. This will allow the machines to share the load of both heavy IO processes and network usage.

The PVFS system was initially developed for use in the Beowulf Cluster project. By leveraging the Beowulf Network Redundancy (BNR) project, the distributed file system can also integrate redundancy via a parity mechanism similar to that used in RAID (Redundant Array of Inexpensive Disks). The BNR system has one drawback: it does not allow for dynamic failover, so a user is forced to recover lost files manually.

However, given administrative constraints, the size and scope of this system do not warrant the use of a more robust file system. Statistically, we do not anticipate more than a single server failing over the course of a six-month period in a 20 to 30 unit cluster. In conjunction with low-level RAID 1 (disk mirroring) implemented at the physical server level, the statistically probable administrative burden is deemed to be within an acceptable window.
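For readers unfamiliar with RAID-style parity, the general principle can be shown in a few lines. This is a generic XOR parity sketch, not BNR's actual implementation; the block contents are made up, and real systems operate on fixed-size stripes rather than short byte strings.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks, imagined as living on three different servers.
data = [b"msg-0001", b"msg-0002", b"msg-0003"]
parity = xor_blocks(data)          # stored on a fourth server

# If the server holding data[1] is lost, XOR of the survivors and the parity
# block reproduces the missing block.
recovered = xor_blocks([data[0], data[2], parity])
```

This scheme tolerates the loss of any one block per stripe, which matches the single-failure assumption made for this cluster; the rebuild step is what must be triggered by hand when failover is not dynamic.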

Hardware

By leveraging commodity x86 based hardware to perform core functions, Skyy Consulting hopes to maintain minimal hardware acquisition costs. If the proposed PVFS storage option is selected, the system hardware will be almost entirely independent of existing infrastructure, allowing for rapid modular growth.

The internal local area network switches must be managed and capable of handling gigabit speeds amongst all systems in the cluster.

Scalability

By segregating functionality and rigorously maintaining autonomy of all major systems in the architecture, we allow for rapid expansion of capacity. The modular design has multiple benefits:

 

Each of the mission-critical applications chosen has the capacity to scale well beyond the expected volume:

Asterisk to DB Diagram

Homeland Security

Federal regulations mandate service providers be capable of recording phone calls on their system in the event that Homeland Security requests a line tap.

An interface to enable call recording for an individual user or line is provided in the management utility by Skyy Consulting.

911

911 service is implemented by maintaining an internal database of nationwide emergency numbers. This database is compared with the address information provided by the consumer, and the emergency facility nearest to the customer-provided address is dialed.
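One way to pick the nearest facility is sketched below. It assumes the customer address and each facility have already been geocoded to latitude and longitude, which this guideline does not specify; the facility records, coordinates and phone numbers are fictional, and straight-line distance is used as a simple stand-in for whatever jurisdiction rules actually apply.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def nearest_facility(customer, facilities):
    """Return the facility record closest to the customer's (lat, lon) pair."""
    return min(facilities,
               key=lambda f: haversine_km(customer[0], customer[1], f["lat"], f["lon"]))

# Fictional emergency facility table (hypothetical numbers and coordinates).
facilities = [
    {"name": "San Francisco", "lat": 37.77, "lon": -122.42, "number": "4155550100"},
    {"name": "Oakland",       "lat": 37.80, "lon": -122.27, "number": "5105550100"},
    {"name": "San Jose",      "lat": 37.34, "lon": -121.89, "number": "4085550100"},
]
customer_location = (37.87, -122.27)   # geocoded from the customer-provided address
target = nearest_facility(customer_location, facilities)
```

The number stored with the selected record is what the system would dial when the customer calls 911.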

Feature Possibilities

STUN (Simple Traversal of UDP (User Datagram Protocol) Through NATs (Network Address Translators))
Video Conferencing
Conferencing
Instant Message Client Integration
Dynamic Call Queue Generation
Predictive Dialing
CRM / Mail Client Integration