Wednesday, October 5, 2011

Introduction to Asterisk

As with most types of software over the past decade or so, there are two kinds of solutions available: proprietary, closed programs and platforms, and open-source, typically free software. The contact center environment is no different, and we are lucky enough to have a real jewel available: Asterisk.

Asterisk started as a PBX, but over time a large number of complementary modules have been developed to enhance its functionality. Besides being a fully featured PBX, it can easily act as an IVR (with speech recognition supported through third-party engines); it integrates with CTI; it can work in inbound and outbound environments; it supports the most commonly used protocols in both TDM and VoIP environments; it offers automatic redundancy capabilities in case the primary servers fail; and a lot more. Detailed information about its full set of capabilities can be found on its homepage. This set of features is not far from what standard commercial solutions offer.

The software suite is probably the ideal solution for many small call centers. It is extremely cheap to implement compared to various proprietary solutions, the only cost being the payroll of the engineers setting it up and the occasional consultancy/support in case something goes wrong. Asterisk is very flexible and can be configured to run in very diverse environments.

The main drawback of using Asterisk is its inherent complexity. Running it at the same capacity as the more expensive commercial products requires far deeper and more diverse technical skills. That complexity can make a PBX ecosystem built around Asterisk less reliable if something is overlooked. While this is not a problem of the platform itself, in real-life scenarios most experts would not consider Asterisk a proper solution for very large-scale and complicated contact center ecosystems.

It is, however, by far the most cost-effective choice for small deployments.


Friday, August 12, 2011

DTMF vs Speech enabled IVR: an in-depth examination - part 2

Transition from DTMF to Speech and issues that arise

In the previous post we briefly discussed the differences between DTMF-enabled and speech-enabled IVR systems. We will now turn to the transition of a DTMF-based IVR application to a similar application powered by a speech recognition engine, and the considerations that have to be taken into account while doing so:

Speech-powered applications have to be complicated to justify the investment. It is neither cost-effective nor really more efficient to spend large amounts of money on a two-layer application with 3-4 options on the first menu and 2-3 options on each submenu. These can be implemented very nicely with DTMF and the user gets served quickly enough. Thus, if your self-service application is simple and small, it is currently best to use DTMF.

For complicated applications though, using speech recognition is vastly superior in terms of efficiency and quality. And this is the case even when the recognition accuracy achieved by the speech recognition engine is below 50%! The reason is that the user of an automated system actually wants to minimize the interaction time. A typical user will definitely prefer entering the same information twice or even three times and being done in 1 minute total, rather than having to navigate menus and listen to irrelevant information for 2 minutes before being able to quickly and accurately enter the information once.

The following example uses an application flow tree (created at random for this purpose) that is complex enough to showcase the difference in implementation logic between DTMF and speech.

In the tree appearing to the right, the leaves are the final services the application offers. The information retrieval and announcement services are highlighted in yellow, and the services where the customer has to enter information are highlighted in orange.

In a DTMF-powered application, each menu has to be presented hierarchically: users first listen to options 1-5, then select a submenu and are presented with all the options below it, then move to the next submenu, and so on. Typically the user can navigate back to the previous menu or the start menu by using the * and # keys or some number.

A speech-enabled application, on the other hand, allows the user to jump directly to any sub-tree they wish, or to access a service (a leaf of the tree) directly. The user may also jump to any service at any point during their navigation with one action, without having to pass through the hierarchy. Moving across the tree this way requires, of course, that the user knows the available options; otherwise the options have to be presented again in a hierarchical manner. Once the user has tried the application a few times, though, service times can be reduced considerably.

Let’s assume that a caller wants to perform actions 1.3.3 and 5.1.1.2. In the DTMF-style application they would have to go through 3 menus for the first item, then jump to the start and then go through 4 more menus to reach the second item. This procedure will never improve, no matter how experienced the user is (save for the time spent listening to prompts, which can be eliminated via barge-in). That is a minimum of 8 steps. In a speech-enabled application though, an experienced user can jump directly from the initial menu to the first item and then from there jump directly to the second item without even having to go to the start menu. In this case we can achieve the same result with 2 steps. So, for this particular example, supposing an average recognition success rate of 50% (that is, two attempts per step on average, or about 4 attempts in total), the experienced user is still served roughly twice as fast as with the 100% accurate DTMF.
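The arithmetic of this example can be checked with a few lines of Python. This is only a sketch under simplifying assumptions: every menu selection or utterance counts as one step of equal cost, each speech attempt succeeds independently with the same probability, and the function names are made up for illustration.

```python
def dtmf_steps(targets):
    """Steps needed in a strictly hierarchical DTMF tree.

    A target such as "1.3.3" costs one key press per level, and one extra
    "return to start" action is needed between consecutive targets.
    """
    presses = sum(len(target.split(".")) for target in targets)
    returns_to_start = max(len(targets) - 1, 0)
    return presses + returns_to_start


def expected_speech_attempts(targets, recognition_rate):
    """Expected utterances when each target is reached with one direct jump.

    Assumes independent attempts, so a jump that succeeds with probability
    `recognition_rate` needs 1 / recognition_rate attempts on average.
    """
    if not 0 < recognition_rate <= 1:
        raise ValueError("recognition_rate must be in (0, 1]")
    return len(targets) / recognition_rate


targets = ["1.3.3", "5.1.1.2"]
dtmf = dtmf_steps(targets)                        # 3 + 1 + 4 = 8
speech = expected_speech_attempts(targets, 0.5)   # 2 / 0.5 = 4.0
print(dtmf, speech, dtmf / speech)                # 8 4.0 2.0
```

Raising the recognition rate only widens the gap; under these assumptions the two approaches break even at a recognition rate of 25%.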

The example mentioned above showcases quite clearly the advantages speech recognition can bring to advanced IVR users. However, inexperienced users that interact with the system for the first time will typically spend more time learning how to work with it. This is part of the learning process that is inherent in any new technology being rolled out to the general public, and it typically takes some time before the new system becomes more efficient than the old, for the average users.


Tuesday, August 9, 2011

DTMF vs Speech enabled IVR: an in-depth examination

I have recently been heavily involved in a rather large-scale deployment of a new customer care IVR system which is gradually replacing an old DTMF-based system. Based on that experience I would like to elaborate a bit on the differences between DTMF-based and speech-recognition-based systems and highlight some concerns that have come up while deploying the speech-enabled IVR. Since there is a lot going on in such systems to make them work, this text will span more than one post.

The system we worked on is an Avaya Media Processing Server (MPS) IVR (a product line recently acquired from Nortel) powered by the Nuance speech recognition engine. However, the concepts discussed below should apply to a large degree to any platform.

The characteristics of DTMF

Let’s start with DTMF and its characteristics; DTMF is a powerful way of entering and transmitting information through telephony that has been with us for many decades. It has a lot of advantages that made it prevalent in IVR systems up until very recently, with the two most important being the following:


  • DTMF is very simple to implement. It reuses the same technology as classic telephony and it is very simple to integrate into branching logic on the IVR platform. It is also very quick to process, relying on the telephony infrastructure already in place.
  • It is very accurate. If the user pays even slight attention, accuracy can easily approach 100% regardless of external conditions such as noise.


Despite these advantages, DTMF also comes with a number of drawbacks that really limit its potential:

DTMF is limited in capacity. You can enter only as many distinct tones as there are buttons on the phone keypad. While this is sufficient for entering numbers either as data or as options for branching logic, DTMF is incapable of allowing the user to enter more complicated and detailed pieces of information. This severely reduces the capabilities and services that can be deployed.
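To make the capacity limit concrete: each DTMF key is signalled as the sum of one low-group and one high-group frequency, so the standard 4x4 frequency grid allows only 16 distinct signals, of which an ordinary phone keypad exposes 12. The sketch below maps a key to its frequency pair and synthesizes the tone; the function names, the 8 kHz sample rate and the 100 ms duration are illustrative choices, not anything specific to the platform discussed here.

```python
import math

# Standard DTMF frequency grid: rows are the low group, columns the high group.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key):
    """Return the (low, high) frequency pair in Hz for a single DTMF key."""
    if len(key) != 1:
        raise ValueError("key must be a single character")
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col != -1:
            return ROW_HZ[row], COL_HZ[col]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key, duration_s=0.1, sample_rate=8000):
    """Synthesize the tone for `key` as a list of floats in [-1.0, 1.0]."""
    low, high = dtmf_frequencies(key)
    count = round(duration_s * sample_rate)
    return [
        0.5 * (math.sin(2 * math.pi * low * i / sample_rate)
               + math.sin(2 * math.pi * high * i / sample_rate))
        for i in range(count)
    ]


print(dtmf_frequencies("5"))               # (770, 1336)
print(sum(len(row) for row in KEYPAD))     # 16 distinct signals in total
```

Anything richer than a digit string, such as a name or a street address, has to be encoded through these few symbols, which is why DTMF input stays limited to numbers and menu choices.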

Excessive use of branching logic in large menus with many submenus makes the IVR application extremely clunky. Menus that contain more than 3-4 different options are too cumbersome for the user. Navigation is also time-consuming, since for example to reach a service that lies in the third layer of an application, the user has to go through three menus with various options each. This can result in slow servicing times.

Finally, using DTMF requires the user to have their hands free to push the buttons. This is not very helpful when you are on the move or otherwise engaged.

The characteristics of Speech Recognition


On the other side of the table, there is speech recognition. It is a relatively old technology that has only recently reached maturity. Speech recognition systems effectively solve the major problems DTMF suffers from: they allow for very complex input options (not just numbers), they offer vastly superior navigation options and they can be used without involving one’s hands. Furthermore, speech-enabled systems feel more natural to interact with. On the other hand, their accuracy is usually a lot lower than 100%, their performance can be severely affected by noise and they are currently a lot more expensive and complicated to deploy than DTMF.

While it initially seems that the two options offer a more or less equal tradeoff of pros and cons, the truth is that speech recognition systems are far superior, assuming a minimum application complexity. The main reason is that most of the drawbacks they come with (which are the strengths of DTMF) simply don’t matter enough! This is a very bold statement that will be extensively discussed in a follow-up post with examples.



Thursday, July 7, 2011

Customer interactions: shifting from the phone to the internet

Over the last couple of years, with the explosive growth of smartphones and tablet PCs, we have entered a new Internet era. Through the first two decades of its existence, the web has been accessed primarily through classic computer devices (desktops and laptops). While laptops do offer a decent degree of portability, they still remain cumbersome to carry around everywhere. We are now witnessing a rapid change in how the internet is consumed, as capable small devices that fit in our pockets or handbags offer access to the world of the web.

The convenience that mobile access to the internet offers is too great to be held back. Having every kind of information you could find on the web available to us at any time and in any place is definitely going to improve the efficiency of many of our activities. But retrieving information is probably the least of the benefits; the ability to add information as we come across it and the ability to communicate with others using multiple different types of media are probably the two biggest changes. Adding information to the internet has traditionally been a rather long procedure of gathering material, processing it and then uploading it. Now, with mobile devices capable of connecting to the internet in a close to fully featured manner, everything can be added on the fly. And social media offer new ways to efficiently share everything with various circles.

As mobile internet rapidly becomes a habit for more and more people all over the world, certain behaviors and trends are bound to change. One of the things that is already changing is the way customers interact with companies. Traditionally, people have used the telephone to contact companies and vice versa. The telephone became the medium of choice for various reasons: it offers quick and accurate communication, it has a high degree of interactivity (there is minimal delay between each question and answer) and mobile phones are small enough to be carried around everywhere. Smartphones and mobile internet can sustain all the advantages the telephone has and add a whole lot more.

Let’s consider an example: we have a washing machine that malfunctions and we call the vendor to find out how to fix it. Over the phone, the vendor’s first-level support technician could give some generic instructions on basic troubleshooting to try to isolate the problem. This communication, while quick and interactive, is not very accurate, since the information the technician gets from us might not be good enough (we do not possess the knowledge required to mention all the important details). Using a video call instead, or taking some pictures and sending them to the technician, could greatly increase the quality of the information at the technician’s disposal. Furthermore, the technician could, in the same manner, send a video or a picture showing exactly how to perform an action that could solve our problem in a matter of a few minutes. There are countless other examples where unified communications (UC) would greatly enhance the efficiency of various activities.

Thus customers will eventually migrate from using the phone as a primary medium to communicate with everyone (including companies whose products/services they purchase) to using fully featured multimedia communication that technologies such as UC offer. And companies have to be ready for that change when this happens.

Of course this will not happen overnight. It will take a long time until the majority of people start using mobile devices extensively to connect to the internet; smartphones are still too basic in terms of hardware to offer an experience equivalent to that of a desktop or laptop with a high-resolution screen. And even those that are starting to come closer are still too expensive for mainstream users, though this is bound to change within a handful of years. There is also the matter of habits, which don’t change easily, especially for older people who have had those habits for many years. As with every change of that scale, it will start small and gradually expand. There are users who will adopt mobile internet and multimedia as their primary communication method early on and others who might take many years to make the switch.


Wednesday, June 15, 2011

Does having top customer support matter enough?

The answer to the above question is not really easy to give. It depends on a whole lot of factors, such as the specific line of business of a company and its characteristics, the competition, the size of the market and so on. It also depends on the company itself and its culture and goals.

Most people agree that good customer service provides tangible benefits to any company. Some of these benefits that apply to most types of businesses are:

  • Customer retention and loyalty: satisfied customers are keen to buy again from the same company if everything went according to plan. Also, if something goes wrong and it is rectified swiftly and without side effects, customers will more often than not stay loyal. Loyalty is also very important because it is a lot cheaper to retain a current customer than to attract a new one (according to some studies, even five times cheaper).
  • Word of mouth: satisfied customers rarely go out of their way to share their experience of a specific company with others. They will do so if the opportunity arises, but they won’t rush out to advertise good service. On the other hand, dissatisfied customers tend to tell everyone they know about their unpleasant experience, and this can quickly lead to the build-up of a bad reputation. Word of mouth is very strong and thus it can lead to negative marketing.
  • Customer support can be used as a strategic competitive advantage. In lines of business where the end products are more or less the same in terms of both price and features, superior customer service can easily be transformed into a source of competitive advantage, providing some differentiation.

Customer service is undeniably good. But there are situations where the benefits it offers are minuscule compared to the costs involved in providing top-quality customer service. Therefore, many companies (a lot more than we, the customers, would like) tend to perceive their customer support facilities as a cost center and simply try to minimize that cost. Of course this point of view is a bit short-sighted, since good customer support and the reputation build-up that goes with it probably have long-term benefits that are very hard to quantify. On the other hand, many companies do not really care about long-term benefits, especially when there are urgent problems to face now.

There are also several markets where customer support is not as effective as in others. This does not mean that customer support is not necessary in such cases; rather, it implies that the company can stick to the basics and take care of only the very important issues, safely ignoring more “trivial” and complicated situations. Here are a few examples where customer service may safely be considered a lower priority:

  • Internet retail and e-business in general: In e-commerce, customer support is of course very important in helping a company become trustworthy. But this can be done without really good customer service, especially when the company handles a massive number of interactions. Rating systems on the internet typically end up being a 5-point scale (stars) or a 10-point scale with one decimal place (numbered scale). In such systems, where the rating is provided by actual consumers, having a very good logistics mechanism and a working supply chain can make sure that 90% of the customers are served without issues. Does the remaining 10% matter enough to invest a lot of money in order to resolve their problems as quickly as possible? Or, to put it in another perspective, would you buy a product for $20 from an online retailer with a rating of 8/10, or would you prefer the same product for $25 from an online retailer with a rating of 9.5/10? For low-cost items, cost leadership is everything. On the other hand, if the product at hand is a car (an expensive, long-term investment) the situation changes. Customer support there is a lot more critical.
  • Markets with a very low number of competitors: A typical example is the telecommunications market after it stabilizes (i.e. small players are out of the market and a handful of giant companies own the vast bulk of the market share). In many countries around the world telecom providers are notorious for their poor customer support: understaffed call centers, many badly trained customer service representatives and so on. The reasoning is simple: unsatisfied customers don’t have many options. Since completely dumping the phone is not an option for most, their only escape would be to go to one of the few other competitors, which have similar service levels anyway. They are so nicely trapped! And the number of dissatisfied customers that make the switch won’t matter much in the short term. Of course, in the long run, a company with better customer service will probably gain an advantage. But the price it has to pay for that long-term advantage may be too high.

There are, of course, many other situations where the best customer service offers visible short-term benefits, such as local retail stores where customers purchase in person, or any type of market where word of mouth plays a very important role in general.

Unfortunately, the customer does not always matter enough.



Thursday, May 19, 2011

IVR services: to charge or not to charge?

Worldwide usage of various self-service systems is rapidly increasing. The trend can be seen in various types of self-service, ranging from ATMs and vending machines to customer support. These services are offered either for free or for small fees. For example, ATM usage is usually free for customers of the bank at hand, while customers of other banks have to pay commission for every transaction they perform.

This simple rule of thumb that most companies tend to follow is not always sufficient to determine whether and whom to charge for self-service usage. There are other factors that have to be taken into account, such as what the specific service is about, how much it costs to implement and maintain it and how valuable it is to the customer.

I will use a situation I recently became aware of as an example. Company A has a competitor, company B. Company A is deploying a telephone customer care system which includes both a self-service component for simple, streamlined issues and agents to deal with more complicated situations and fill in any gaps. The managers of the company decided not to charge their customers for IVR usage and to start charging for a call as soon as it is connected to a live agent (a fixed, low amount, independent of call duration).

The specific implementation is mostly aimed at existing customers of company A, as the system is designed to provide information to these people and allow them to perform some actions. However, a part of the system provides information about the company’s products. This part could be interesting both for current customers of the company (who might wish to buy an additional product or maybe upgrade their current product) and for potential new customers.

Company A can distinguish whether a caller using the IVR is a current customer or not. They decide that if the caller is not a customer, they should apply an extra charge for using this system, as most companies do for non-customers. There is a contradiction here though: when a company wishes to attract new customers, it typically spends money to do so, via advertising and marketing campaigns. In this scenario, potential new customers who wish to learn about the products are charged for doing so.

This could very well be a very short-sighted decision. Suppose a customer of company B is not very satisfied with their services and considers switching to company A. During their first interaction with company A, where they inquire about the products and their prices, company A charges the customer for this. Does that give a good impression? To me it definitely would not. Is this policy actually helping in a larger effort to attract new customers? Certainly not. Is the minimal amount charged to each potential customer going to provide more revenue in the long run than getting a few more customers? This question is very hard to answer, because the potential benefits of getting a new customer are not always tangible and the decision-making process for consumers is too complicated anyway.

Decisions like this one are not always easy to make. And while in a scenario where each company has a few hundred or a few thousand customers such “trivial” issues might not have a significant effect, the situation is not the same when the consumer base is several million.

Wednesday, April 20, 2011

IVR design and implementation fundamentals

IVR technology has been a part of our everyday lives for several years now. We are all familiar with phrases like “To do this, please press 5” or “Please type your PIN number”. Most of us have also been lost in the endless, labyrinth-style menus offered by these IVRs. It doesn’t have to be like that!

There are ways to ensure that an IVR delivers quality self-service. To do that, it is important to follow some guidelines when designing the IVR application itself. Investing in complementary technologies is also of paramount importance to creating an intuitive voice interface. We will explore these principles and technologies in more detail in this post.

IVR application design principles and complementary technologies

  • Keep menus simple. Complex menus are greatly annoying for the user. When designing the application flow, the number of available options presented at each stage should be kept to the minimum possible (more than 4-5 different options are too many). The options themselves must also clearly describe what can be found in the sub-menus selected. This is very important since, due to the nature of voice user interfaces, navigation mistakes on an IVR are more “costly” compared to visual interfaces such as the web.
  • Provide easy navigation for both the new and the experienced system user. New users, who do not know anything about the specific application, need more information on what the system is about and what functionality each option offers. Include this information in each menu, when needed, but make sure to include the barge-in(*) functionality so that experienced users can avoid listening to these prompts repeatedly. Experienced users should also be given the option to jump directly through multiple levels. This can be easily achieved using speech recognition technologies.
  • Use the latest speech recognition technologies to provide natural speech recognition. It is more expensive than simple keyword recognition and far more expensive than DTMF, but it makes navigating through voice menus substantially easier and a lot more intuitive. Use speech recognition in conjunction with DTMF when possible. It is a lot easier for users to type their 12-digit PIN than to repeat it 3-4 times until the speech recognition engine understands it correctly. Using each technological feature in the right situation is one of the most important parts of voice interface design.
  • Allow users to exit the application and transfer to a live agent from any part of the application. Inform them about this functionality at the beginning of the menus. Speech-enabled IVR systems often cause problems for various reasons such as accent variations or noise, so this exit option is very important.
  • Use professionally recorded prompts consistently throughout the voice application. Text to speech, while having seen large improvements during the past few years, is still not in a position to compete with human-recorded prompts in terms of quality, and thus it remains the cheap or temporary solution.
  • Ensure that each piece of information is only collected once from the users. Have the IVR complement the live agents gracefully. This can be done by investing in CTI technologies, which take the information gathered through the IVR and present it to agents, should the call be transferred.
  • If there are queues to talk to agents, inform the users about it and also tell them the average expected waiting time in the queue. An additional feature that can be implemented if the contact center includes outbound capabilities is auto-callback. The IVR user can state that they want to be called back as soon as an agent is available. The customer types in (or speaks) a phone number that they wish to be called at and then hangs up. The system can then put a token representing the customer in the queue, and when the customer’s turn arrives, an outbound call is initiated and connected directly to an agent, with the CTI information already on the agent’s screen.
  • Reduce the information input to an absolute minimum. This can be done by integrating the IVR and CTI with a database of user profiles (which can be either the company’s CRM or, more often, a dedicated database synchronized with the CRM). Collect the database primary key via the IVR (which is usually the user’s name and/or some PIN) and then query the database to retrieve more information about the specific user.
(*)     “Barge-in”: Refers to the ability given to an IVR user to bypass a spoken prompt and immediately enter their response to it (either via DTMF or through speech). As soon as the user presses a button or starts speaking, the IVR interrupts the prompt and goes to the next stage to read the input. Barge-in is extremely helpful in eliminating annoying repetitive messages, especially for experienced users of the system.
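
The auto-callback idea from the list above can be sketched as a queue in which a caller who hangs up keeps their place. This is a toy model only: the class and method names are invented for illustration, the CTI data is a plain dictionary, and a real contact center platform would provide its own queueing and outbound dialing interfaces.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueEntry:
    caller_id: str
    cti_data: dict                         # information already collected by the IVR
    callback_number: Optional[str] = None  # set once the caller asks to be called back


class AgentQueue:
    """First-in, first-out queue of callers waiting for a live agent."""

    def __init__(self):
        self._entries = deque()

    def join(self, caller_id, cti_data):
        entry = QueueEntry(caller_id, cti_data)
        self._entries.append(entry)
        return entry

    def request_callback(self, entry, number):
        # The caller hangs up, but the token keeps its position in the queue.
        entry.callback_number = number

    def next_for_agent(self):
        """Return (action, entry) for the next free agent, or None if empty."""
        if not self._entries:
            return None
        entry = self._entries.popleft()
        action = "dial_out" if entry.callback_number else "connect_waiting_call"
        return action, entry


queue = AgentQueue()
first = queue.join("caller-1", {"pin_verified": True, "topic": "billing"})
queue.join("caller-2", {"pin_verified": False, "topic": "sales"})
queue.request_callback(first, "555-0100")

action, entry = queue.next_for_agent()
print(action, entry.callback_number, entry.cti_data["topic"])
```

Because the CTI data travels with the queue entry, the agent sees what the IVR already collected whether the call was waiting on the line or is dialed out as a callback.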


Monday, April 11, 2011

SS7 protocol technical overview

     In the previous post we saw what SS7 can do and which of the services we use every day are based on it. We will now get a bit more technical and see in a nutshell how SS7 works (more details can be found in various tutorials on the web). In many of its aspects, the SS7 architecture is similar to the OSI layered network architecture.

Signaling points and their types

     SS7 nodes are called signaling points (SP) and they are usually identified by an integer called a point code (PC). The international SS7 network uses its own unique PC numbering, and each national network and each operator uses its own numbering scheme internally. There are three different types of SPs in an SS7 network, categorized based on their functionality:

     A Service Control Point (SCP) is an interface between SS7 networks and databases. A Service Switching Point (SSP) is a voice switch with integrated SS7 functionality. And a Signal Transfer Point (STP) is responsible for transferring information between other SPs. Many SPs in the network typically play multiple roles (for example an SSP can also support STP functionality).

Signaling links and their types

     Signaling points are connected to each other with signaling links, over which signaling information is exchanged. More than one link (up to a maximum of 16) can be used to connect two signaling points, both for increased capacity and for redundancy. The links bundled together between two adjacent signaling points are called a linkset. There are several types of logical links that can be used in SS7 networks, categorized depending on their functionality and on what types of SPs they connect (the physical link remains the same in all these cases). These are A (access) links, B (bridge) links, C (cross) links, D (diagonal) links, E (extended) links and F (fully associated) links.

Routing

     Routing between nodes is configured statically. Each SP maintains a routing table with all the information needed for routing. The group of routes that can be used to reach a particular destination is called a routeset.
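
A toy model can make the relationship between point codes, links and routesets more tangible. This is a simplified sketch and not an implementation of the actual routing procedures: the class, its methods and the point codes are invented for illustration, and route selection is reduced to "first configured route whose neighbour is still reachable".

```python
MAX_LINKS_BETWEEN_TWO_POINTS = 16


class SignalingPoint:
    def __init__(self, point_code):
        self.point_code = point_code
        self.links = {}      # adjacent point code -> list of link identifiers
        self.routesets = {}  # destination point code -> ordered adjacent point codes

    def add_link(self, adjacent_pc, link_id):
        links = self.links.setdefault(adjacent_pc, [])
        if len(links) >= MAX_LINKS_BETWEEN_TWO_POINTS:
            raise ValueError("at most 16 links may connect two signaling points")
        links.append(link_id)

    def add_route(self, destination_pc, via_adjacent_pc):
        # Routes are configured statically and must use an existing neighbour.
        if via_adjacent_pc not in self.links:
            raise ValueError(f"no links to adjacent point {via_adjacent_pc}")
        self.routesets.setdefault(destination_pc, []).append(via_adjacent_pc)

    def next_hop(self, destination_pc, unavailable=frozenset()):
        """First route in the routeset whose neighbour is not marked unavailable."""
        for adjacent_pc in self.routesets.get(destination_pc, []):
            if adjacent_pc not in unavailable:
                return adjacent_pc
        return None


ssp = SignalingPoint(100)
ssp.add_link(200, "link-to-stp-a")   # hypothetical STP with point code 200
ssp.add_link(300, "link-to-stp-b")   # hypothetical mate STP with point code 300
ssp.add_route(900, via_adjacent_pc=200)
ssp.add_route(900, via_adjacent_pc=300)

print(ssp.next_hop(900))                     # 200
print(ssp.next_hop(900, unavailable={200}))  # 300
```

The routeset for destination 900 holds two routes here, so losing the neighbour with point code 200 still leaves a usable path through 300.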

The SS7 protocol stack

     SS7 has been a work in progress since the early 1970s. During this period, a stack of protocols was developed to provide the functionality required. The architecture of these protocols is layered and is pretty much analogous to the OSI 7-layer model used in data networks. The lower-layer protocols are common to all SS7 implementations. The higher levels of the stack differ depending on the applications that use the SS7 infrastructure. The schematic below shows the most commonly used protocols in the SS7 stack:
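
The layering can also be written down as data. The sketch below assumes the commonly described arrangement: the three Message Transfer Part levels (MTP1 to MTP3) at the bottom, SCCP and TCAP above them, and user parts such as ISUP, TUP, MAP and INAP on top. The dictionary and helper names are illustrative, and real deployments may combine the protocols differently.

```python
# Each protocol maps to the protocol it is usually described as running on.
RUNS_ON = {
    "MTP1": None,    # physical signaling data link
    "MTP2": "MTP1",  # link-level error detection and correction
    "MTP3": "MTP2",  # message routing between signaling points
    "SCCP": "MTP3",  # extended addressing and connection control
    "TUP": "MTP3",   # telephone user part (call control)
    "ISUP": "MTP3",  # ISDN user part (call control)
    "TCAP": "SCCP",  # transaction handling for database-style queries
    "MAP": "TCAP",   # mobile application part
    "INAP": "TCAP",  # intelligent network application part
}


def stack_path(protocol):
    """List the protocols a message passes through, from `protocol` downwards."""
    path = []
    current = protocol
    while current is not None:
        path.append(current)
        current = RUNS_ON[current]
    return path


print(" -> ".join(stack_path("MAP")))   # MAP -> TCAP -> SCCP -> MTP3 -> MTP2 -> MTP1
print(" -> ".join(stack_path("ISUP")))  # ISUP -> MTP3 -> MTP2 -> MTP1
```

The shared bottom of every path (MTP3, MTP2, MTP1) corresponds to the lower layers that are common to all implementations, while the upper part varies with the application.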




Thursday, April 7, 2011

SS7: Advanced telephony signaling provides limitless capabilities

There are two types of signaling protocols in the telephony ecosystem. Channel Associated Signaling (CAS) describes signaling protocols that use one dedicated signaling channel for each voice channel, in a one-to-one correspondence, according to fixed and predetermined rules. Many telecommunication protocols to date use CAS to transfer information about call setup and control. The other type of signaling protocol uses Common Channel Signaling (CCS), where the signaling capacity for multiple channels is integrated into one specific channel, called the signaling channel. The main CCS protocols deployed in carrier networks so far are Signaling Systems 6 and 7 (SS6 and SS7), with the latter having seen widespread use over the past decades.

Over time, SS7 has evolved to address the limitations of CAS protocols. Its advantages include:

- Very fast call setup times.
- Flexibility.
- Capacity to evolve and incorporate new features.
- Better cost effectiveness than CAS.
- Vastly superior call control capabilities.

On the other hand, the major drawback of SS7 compared to the traditional CAS systems is that the signaling channel is a single point of failure in the ecosystem.

SS7 is the enabling protocol for most of the telephony services we all use today and take for granted. Among other things, telephony systems based on SS7 offer:

-          Toll free numbers, which are widely used for marketing campaigns and customer support services.
-          Single directory number  (a company can have a single incoming number and then redirect the incoming calls to the appropriate extension within its private telephony network, using signaling information).
-          Cellular network mobility management and roaming services.
-          Local Number Portability (which allows us to retain our phone number when switching carriers).
-          Custom Local Area Signaling Services (CLASS) which include:
o   Call block from pre-specified numbers.
o   Distinctive ringing for groups of callers.
o   Call completion to busy subscriber.
-          SMS.
-          EMS (enhanced functionality added to SMS service).

Furthermore, SS7 is a key protocol in the effort for telecommunications and internet convergence. It allows hybrid network services to be deployed such as:

-          Internet Call Waiting: displays a message on the user’s computer screen when an incoming call arrives while the same line is being used for internet access. The user can then redirect the call to voice mail, accept the call or reject it. This application is extremely useful for dialup connections.
-          Click to dial applications (i.e. click a number on a webpage to place a call automatically).
-          Unified web and telephony services. This has evolved into an industry in itself, unified communications.
-          WLAN hotspot billing.
-          Location – based games.

We can see from the above examples that SS7 infrastructures serve everyone in developed countries in their day – to – day activities, even though they are transparent to the end customer. SS7 is also a key enabler of various value added services that provide telecommunication carriers with additional revenue sources. In a subsequent post we will explore how SS7 works in more detail and examine its layers.


Friday, March 18, 2011

Evolving from POTS/PSTN to SS7/SIP


There are several ways to categorize today’s contact centers based on the underlying technology and protocols which provide their capabilities. Knowing and understanding them, at least from a general point of view, is of paramount importance for everyone involved in running a contact center, from engineers and supervisors up to the senior management. In this post and several subsequent ones, we will briefly review the key technologies and protocols involved in various types of contact centers, and how these affect the features and functionality they can offer.

We will start by briefly examining the classic circuit – switched telephony that dominated worldwide communications for almost a century and see how it evolved into today’s complex signaling – based protocols, which allow a large amount of functionality and value added services to be added to a simple communication session. Appropriate links are provided for more information on several of these terms (most referring to Wikipedia).

POTS - PSTN

Classic telephony (usually referred to as P.O.T.S. – Plain Old Telephone Service) has been virtually unchanged over the past century. While it has undergone some changes over the years, most notably the introduction of electronic exchanges and touch-tone dialing (DTMF), the basic principles remain the same. Over the years, a large network of switches came to be, interconnecting telephones all over the world. This network is known as the PSTN (Public Switched Telephone Network) and includes a variety of physical interconnection media, ranging from copper wire to fiber optic cables and cellular networks, and a number of switches that act as hubs for these media.

ISDN: The first step to convergence

While the PSTN was happily expanding and increasing its coverage in every corner of the globe, a wind of change started blowing in computing labs. The rapid spread of computers of every type and size soon gave rise to the need for communication between them, to exchange data. Computer communication initially became important for military purposes, but it gradually became apparent that it could be applied to other activities too. As the PSTN was already in place, it was convenient to use it for data transmission as well. However, when the landline was used for data transmission, it was not possible to use it for telephony at the same time. Thus the technology of ISDN (Integrated Services Digital Network) was developed, which allows for simultaneous digital transmission of voice and data.

Signaling: Enhancing capabilities in two worlds that are blending

During the same period that the computer and the telephony industry started affecting each other, another important change in mentality started taking place, related to signaling. Signaling refers to the use of signals during the information exchange concerning the establishment and control of a telecommunication circuit and the management of the network.

Traditional POTS/PSTN telephony uses in – band signaling (the signaling information is transmitted in the same channel as voice, along with it) and offers limited call control capabilities (mostly call establishment and termination). More advanced technologies such as ISDN, which carries its signaling on a separate channel, can offer more capabilities, as they also transfer data along with voice.

Gradually, the need for more advanced communication control capabilities in telephony led to the separation of signaling from the data being transferred. Thus, out – of – band signaling was deployed. Several out – of – band protocols were developed until SS7 became the dominant protocol, widely used during the last 30 years. SS7 is a common channel signaling system, which means that the signaling for several lines is carried in a single, common channel. We will discuss SS7 and its importance to backbone networks and contact centers in more detail in subsequent posts.

Several years after SS7, a similar signaling protocol was developed for the parallel world of data communications. This protocol, SIP (Session Initiation Protocol), similarly offers advanced signaling capabilities to data communications, being able to control transactions while staying independent of the medium used. SIP is the enabling mechanism for the concept of unified communications, which is quickly becoming a part of more and more contact centers. Exploring SIP basics is a very big topic in itself and will be examined in the future.





Tuesday, March 15, 2011

Contact Center Metrics: Revenue per Call


In the previous post we discussed the metric of cost per call (or the equivalent cost per minute, which can be calculated by taking the AHT into account), which is widely used to measure how much running the contact center costs a company. This metric has to be balanced against revenue per call (or the equivalent revenue per minute) in order to improve contact center efficiency.

Revenue per call is calculated by dividing the total amount of revenue the contact center generates by the total number of calls handled by the center. While the latter is easy to measure, the total amount of revenue is very hard to measure because of the indirect revenue that results from contact center operations. For example, good customer support may bring in new customers via word of mouth, which is impossible to measure accurately. There are methods, though, to produce estimates of generated revenue that are good enough to provide insight into what is going on.
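As a rough illustration of the calculation, here is a small Python sketch. The function names and the figures are made up for the example, and the revenue figure is assumed to be an estimate for the reasons given above. The per-minute variant simply divides by the AHT expressed in minutes.

```python
def revenue_per_call(estimated_revenue, calls_handled):
    """Estimated revenue divided by the number of calls handled in the same period."""
    if calls_handled <= 0:
        raise ValueError("calls_handled must be positive")
    return estimated_revenue / calls_handled


def revenue_per_minute(estimated_revenue, calls_handled, aht_minutes):
    """Equivalent per-minute metric, using the average handling time in minutes."""
    if aht_minutes <= 0:
        raise ValueError("aht_minutes must be positive")
    return revenue_per_call(estimated_revenue, calls_handled) / aht_minutes


# Hypothetical month: 50,000 in estimated revenue, 10,000 calls, AHT of 4 minutes.
print(revenue_per_call(50_000, 10_000))       # 5.0
print(revenue_per_minute(50_000, 10_000, 4))  # 1.25
```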

Several contact centers opt to use this metric for agent evaluation. This is rather dangerous, as it may lead agents to focus too much on earning credit for bringing in additional revenue while neglecting other, more important overall goals such as FCR and quality of service. Calculating who brought in how much revenue also adds complexity to the already complicated estimation techniques used and is often based on additional assumptions. Revenue per call should thus be used with caution and preferably seen in the wider context of the whole contact center rather than of individuals or sections.

Typical methods used to increase revenue per call are similar to the ones used to decrease the cost per call:

  • Agent training is of utmost importance, helping eliminate costly errors and resulting in quick handling of cases and quality results.
  • Deployment of a robust contact center solution gives the tools to the contact center personnel to maximize their efficiency.
  • Utilization of effective and efficient business processes to streamline back – end operations.
  • Usage of outbound contact center capabilities to proactively take care of potential problems and sell products more effectively.

Tuesday, March 8, 2011

Contact Center Metrics: Cost per Call

In previous posts we discussed two of the major contact center metrics used worldwide, Average Handling Time (AHT) and First Call Resolution (FCR). Over this post and a subsequent one, we will introduce two more important metrics that are tightly related to each other: cost per call and revenue per call. These two metrics can also be combined with AHT to produce the equivalent metrics of cost per minute and revenue per minute, which can be used as alternatives.

Cost per call is calculated by dividing the total operational costs for a period of time by the total number of calls handled by the contact center during that period. Operational costs can be divided into two categories: the cost of the human resources needed to operate the center and the cost of the equipment and software required to support them. Reducing cost per call is, naturally, an ongoing goal of almost all contact centers. However, the first step before decreasing something is actually measuring it. Most contact center software vendors nowadays include this feature in their standard statistics packages, but it takes a lot of effort and organization to measure all the possible cost components accurately for each individual company.
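The calculation itself is simple once the cost components are known. The sketch below is only an illustration: the function name, the two cost parameters and the figures are assumptions chosen to mirror the two cost categories mentioned above.

```python
def cost_per_call(staff_costs, technology_costs, calls_handled):
    """Total operational costs for a period divided by calls handled in that period."""
    if calls_handled <= 0:
        raise ValueError("calls_handled must be positive")
    return (staff_costs + technology_costs) / calls_handled


# Hypothetical month: 60,000 in payroll, 20,000 in equipment and software,
# 20,000 calls handled.
print(cost_per_call(60_000, 20_000, 20_000))  # 4.0
```

The hard part in practice is not the division but deciding which expenses belong in each of the two inputs.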

There are various methods to decrease the cost per call:

  • Use call monitoring and coaching to improve agent performance. A well – trained agent is capable of providing quality service in a short amount of time, thus greatly improving all efficiency metrics.
  • Use self – service options such as IVR where it is convenient to do so. The cost of a well – designed IVR is substantially lower than labor costs.
  • Improve scheduling and adherence. This can be achieved by using workforce management software.
  • Consider outsourcing the software part of the contact center (SaaS). It might be more efficient to do so, especially when the contact center is related to a new venture and you don’t have the data to plan your own efficient in – house solution.
  • Use part – time customer service representatives for peak periods.

There are other commonly used methods to decrease cost per call that involve lowering the service levels or the quality of service. While these methods were commonly used in the past, today smart companies avoid resorting to them, as the resulting revenue decrease can potentially be far higher than the cost savings. As with AHT, cost per call reduction must be performed very carefully, to avoid negatively affecting other, more crucial conditions.


Tuesday, March 1, 2011

NICE call recording solutions


Starting today, I will present some contact center related products and their features, beginning with a very popular interaction recording (also known as call logging) system developed by NICE. As we have seen in a previous post, the use of call logging software is not only a legal obligation in many countries, but also a very good method of extracting useful information from customer interactions.

This product by NICE Systems fits this approach well. As they mention on their product homepage, Interaction Recording delivers the most comprehensive capabilities for capturing customer interactions, providing organizations operational flexibility and system resiliency while maintaining low total cost of ownership. It enables organizations to record interactions and then to generate valuable business insight through interaction analytics and quality management solutions.



The NICE call logger is a server that includes a large array of disks, suitable for storing conversations in real time with high speed and efficiency, so that they can be accessed later. It can not only record voice but also store emails and capture screen displays, fitting nicely into the multi – channel contact center. There are also several additional capabilities offered in the bundle, such as:

  • Call recording over VoIP, time division multiplexer (TDM) or hybrid environments
  • Flexible recording and call archiving enabling transparent access to recordings from any location
  • Scalable, multi-tier architecture for growing call recording capacity needs
  • Support for server and client virtualization solutions
  • Comprehensive redundancy to ensure business continuity
  • End-to-end media encryption, strong authentication and server hardening for state-of-the-art data security
  • Mobile call recording based on an open architecture that can interface directly with the trader’s handset

The above features not only help a company comply with regulations, but also ensure access to past calls to resolve disputes and detect fraud. Furthermore, combined with the same company’s complementary Speech Analytics and Quality Monitoring products, the logger can become a very good source of information.




Monday, February 21, 2011

Average Handling Time: A controversial metric

What is AHT:

Average handling time (AHT) is a contact center metric referring to the average duration of one transaction. It includes the duration of the interaction itself along with any preliminary tasks, post – interaction times as well as delays during the interaction.
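A minimal sketch of the calculation in Python follows. The field names (`pre`, `talk`, `hold`, `post`) and the sample figures are assumptions made for the example; real statistics packages may break the components down differently.

```python
def average_handling_time(transactions):
    """Mean of (preliminary + talk + hold + post-interaction) time, in seconds."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    total = sum(t["pre"] + t["talk"] + t["hold"] + t["post"] for t in transactions)
    return total / len(transactions)


# Two hypothetical transactions, all durations in seconds.
calls = [
    {"pre": 10, "talk": 200, "hold": 30, "post": 60},  # 300 s in total
    {"pre": 20, "talk": 300, "hold": 0, "post": 80},   # 400 s in total
]
print(average_handling_time(calls))  # 350.0
```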

The importance of AHT and methods to improve it:

AHT is an important metric that is tightly related to staffing needs. Assuming that the number of incoming transactions is independent of the AHT, lowering AHT directly reduces personnel needs. This specific argument has been extremely popular in the past as a way to decrease call center operating costs. A lower AHT can also help reduce the waiting time of incoming calls, which is another popular metric in various contact centers.
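The staffing argument can be made concrete with a back-of-the-envelope calculation of the offered load, that is, the call volume multiplied by the AHT. The sketch below uses hypothetical figures and deliberately ignores queuing effects, breaks and service level targets, so it gives a lower bound on staffing rather than an actual staffing plan.

```python
def offered_load_erlangs(calls_per_hour, aht_seconds):
    """Agent-hours of work arriving per hour (call volume multiplied by AHT)."""
    return calls_per_hour * aht_seconds / 3600


# Hypothetical center receiving 120 calls per hour.
print(offered_load_erlangs(120, 300))  # 10.0 agents' worth of work
print(offered_load_erlangs(120, 240))  # 8.0 after cutting AHT by one minute
```

Note that the saving only materializes if the call volume really stays at 120 per hour after the AHT is reduced.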

Many contact centers are thus aiming for low AHT and some (like the outsourcing company I was working for a few years ago) use this goal as the cornerstone of their operations. There are several methods that are typically used to reduce this particular metric, some of which are really beneficial while others can cause more problems than they solve and increase total costs despite decreasing costs of human resources:

  • Reduce talking time. The only really successful way to do this is by coaching and training customer service representatives to increase their knowledge as well as their reflexes. It is a slow but highly recommended method for decreasing AHT. Other clumsy methods that are widely used in many horrible contact centers are prompting the customer to try something and then call again, or finding ways to get rid of the customer when the call seems to be taking too long (I remember I was given a list of methods to achieve that during my training as a throwaway agent!).
  • Reduce hold time. This can again be achieved by training agents to multitask. Another critically important factor in achieving shorter hold times is a responsive IT infrastructure. Well designed CTI systems help agents quickly find the information they need. A good CRM is also critical in this regard.
  • Reduce after-call time. This is again achieved through both multitasking training (part of the post-call work can be completed while talking to the customer) and good IT infrastructure (eliminating the need for duplicate entries etc).
  • Use automated systems like IVR to perform mundane tasks such as user authentication and gathering basic information about the customer's query, so as to route the call to the best available agent to handle it. This is a very effective and widely used method of reducing AHT.

AHT criticism:

However, AHT has lost a lot of its former importance lately for various reasons. First of all, the number of incoming transactions is not completely independent of the AHT. A badly handled call that leaves several issues unresolved can have a very low handling time but may result in several additional calls in the future. Taking into account the fact that pre-interaction and post-interaction overheads are pretty much the same for most calls (an agent requires more or less a fixed amount of time to fill in data in the CRM, regardless of the result of the call), this can severely increase the effort needed to actually solve a customer's issue, even while AHT goes down. Furthermore, pushing for lower AHT typically results in agents deliberately trying to get rid of a customer if the call seems to be taking too long. In general, AHT conflicts with FCR, which can itself be seen as a severe drawback. The conflict between these two metrics has been a point of controversy in many companies. Many experts today believe that FCR should be the top goal of every contact center and that other metrics that may conflict with it, such as AHT, should be pursued only as long as they do not affect FCR.
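A small worked example shows how a lower AHT can still mean more total effort per customer issue. The figures are purely hypothetical and the function name is made up for the illustration.

```python
def handling_time_per_issue(aht_seconds, calls_per_issue):
    """Total agent time spent on one customer issue, across all calls it takes."""
    return aht_seconds * calls_per_issue


# Rushed handling: short calls, but issues need 1.5 calls on average.
print(handling_time_per_issue(240, 1.5))  # 360.0 seconds per issue

# Thorough handling: longer calls, every issue solved in a single call.
print(handling_time_per_issue(300, 1.0))  # 300.0 seconds per issue
```

In this scenario the center with the "worse" AHT spends a full minute less on each issue.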


Friday, February 18, 2011

Avaya Web.alive


Last week, Avaya announced web.alive, an “immersive collaboration platform that uses personalized avatars and rich, spatial audio and visuals to expand on current modes of conferencing and collaboration.” It is offered both as a cloud-based service as well as a standalone installation for in – house deployment. The cloud service offering is a very convenient method to use this product for inter - business purposes. 

It brings attractive characteristics to the table, including a new 3D audio engine, built-in collaboration tools, integration with other Avaya products and an analytics suite, among other things. All these features greatly enhance the online collaboration experience. The classic notion of teleconferencing is taking a step forward, becoming more virtualized. Meetings feel like playing a massively multiplayer online game, as you can see in the following videos.

Introduction


Basic Navigation


The user gets the feeling of playing a Second Life or World of Warcraft type of game, but in a manner that actually facilitates and enhances business activities. A free demo is available to try this.

I must admit I am extremely impressed by this product and especially its potential. It introduces a radical change to the way business communications work and could be the foundation of a fully virtualized working environment in the future.


Wednesday, February 16, 2011

First Call Resolution


Definition of First Call Resolution:

First call resolution (FCR) is a CRM and contact center concept that refers to serving a customer in a satisfactory manner the first time he contacts a company. It is often used as a metric for assessing contact center performance and many companies consider it to be of paramount importance. Most of the time, FCR conflicts with average handling time (since more actions are required to resolve issues within the limits of one interaction).

Why is FCR important?

FCR is important for several distinct reasons. First and foremost, it greatly increases customer satisfaction. Just try to remember how frustrated you get every time you have to call a company more than once about the same request, problem or inquiry. Doing it right the first time indicates that the company actually cares about its customers and their concerns, offering quality personalized services. It also shows (and requires!) that the company is well organized and knows exactly what is going on with every single customer it has.

The second reason FCR is considered very important is cost savings. High FCR rates directly result in fewer incoming calls. Various surveys have studied the effect of this on contact center operational costs and they report cost savings that can reach up to 30% for high FCR rates. A simplified way to calculate how much can be saved by improving FCR is described in this article.

There are other benefits also in measuring FCR. It is an indication of agent performance that can be used to reward skilled employees. It helps improve overall business processes by analyzing what went wrong in the cases where one interaction was not enough. And, finally, it helps appreciate the reasons why customers call and what conditions caused their problems in the first place.

Measuring FCR.

Due to the importance of FCR, most contact centers today actively try to measure it. The calculation is included in the metrics presented by many contact center statistics packages available today. However, there are significant problems in measuring FCR accurately.

The first and most important problem is the perspective under which to measure it. Does it matter if the customer is happy with the resolution? This is a subtle difference between methods of measuring FCR and sometimes even subjective. A popular method used to calculate FCR taking into account customer satisfaction is actually asking the customer. This can be done either by an agent or even by a simple IVR question. This approach is not enough to yield accurate results though.

Another problem that cannot be solved by asking the customer is the asynchronous nature of some customer requests. For example, a customer may call about a problem he has and request some credit on his account as compensation. The representative verifies the issue and assures the customer that he will get the credit. However, whether the credit will actually be applied might not be obvious until the customer receives a future bill (which might occur several months later). At the time the transaction takes place, neither the customer nor the agent really knows if the issue was solved in a single interaction. Thus, FCR calculation is usually a long procedure of collecting and evaluating data, so as to be able to take into account as many exceptions as possible.
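One simple way to approximate FCR from a contact log is to count the issues that needed exactly one interaction. The sketch below assumes a hypothetical log in which every interaction has already been tagged with a customer and an issue identifier, and it does nothing about the asynchronous cases described above: a call like the credit example would only show up correctly if the log is evaluated after enough time has passed for the repeat call to occur.

```python
from collections import Counter


def fcr_rate(contacts):
    """Share of issues handled in exactly one interaction.

    contacts: iterable of (customer_id, issue_id) pairs, one per interaction.
    """
    interactions_per_issue = Counter(contacts)
    if not interactions_per_issue:
        raise ValueError("the contact log is empty")
    first_time = sum(1 for n in interactions_per_issue.values() if n == 1)
    return first_time / len(interactions_per_issue)


# Hypothetical log: customer c2 had to call twice about the same outage.
log = [
    ("c1", "billing"),
    ("c2", "outage"),
    ("c2", "outage"),
    ("c3", "upgrade"),
    ("c4", "billing"),
]
print(fcr_rate(log))  # 0.75
```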

Improving FCR.

To improve FCR rates, a company must first trace the root cause of failures to deliver fast and efficient customer service. Typical methods used are:
  • Improve agent skills by training.
  • Improve agent access to CRM information, using better integration among company systems. Ensure that these systems are accurate, current and always working.
  • Allow the agents enough time to solve the customer’s issues and avoid rewarding them for achieving a short average handling time.
  • Ensure that back end systems operate correctly, without any bottlenecks or points of failure.

Monday, February 14, 2011

Contact Center Statistics Overview


The interactions that take place through the contact center via its various channels offer a vast amount of information, which companies seek to consolidate in order to gain useful insight about their customers as well as about themselves. For this reason, packages designed to collect statistics are an integral part of contact center software suites.

Data collection and consolidation:

Data can be collected at various points in a contact center. Some common ones are: the PBX, which is a central hub for all calls and can yield information related to call routing; the CTI that might control the PBX; the IVR, which shows, among other things, information about the self-service behavior of the customers; and the call logger, where speech analytics can be applied to extract semantic information from actual discussions. In an outbound contact center, the predictive dialer can be a source of information as well. In the case of a contact center based on SIP, the SIP servers can also provide important information. Workforce management software is another source of information, mostly related to human resources. In short, depending on the specific implementation of each contact center, statistics can be gathered from almost every node in the ecosystem, about practically everything that takes place within the center.

For each of the aforementioned points in a contact center, the vendors that produce them tend to include extensive analytics tools in their software offerings. These tools not only gather data about the interactions performed through the contact center, but also consolidate it, applying various formulas to extract a variety of metrics (a future post will examine some of the most widely-used metrics in more detail) that can be used to oversee performance.

Using statistics:

These consolidated statistics can provide a wealth of information that can help substantially improve the contact center. Metrics related to customer service representatives can be used to improve the human resources of the company in various ways. Statistics that show customer preferences over a variety of offered contact channels (phone, e-mail, web chat) can be combined with performance metrics on each of these channels to indicate ways to improve customer service and satisfaction. Other types of statistics may help improve the business processes of the company, and its efficiency in general, by showing where bottlenecks typically occur. Outbound campaign statistics can improve, among other things, the marketing tactics of a company.

All the benefits mentioned, and a lot more, can be easily gained by including sufficient data gathering and reporting tools in the contact center. However, the most important thing is to actually act on these statistics after viewing them, and to do so wisely. Many companies seem to fail in this area, underestimating the importance of such metrics. On the other hand, some companies go to the other extreme, becoming too attached to specific metrics and trying to improve them at all costs. In the process, they tend to miss the bigger picture and make costly mistakes. Statistics are a great tool but they have to be used with caution.


Thursday, February 10, 2011

Speech Technology: Voice verification


Introduction:

This is the third and final part of a series exploring the most frequently used speech technologies in contact centers. The first two parts discussed Speech Recognition and Speech Synthesis (or Text to Speech). We will now turn to voice verification, an exciting and relatively new technology that can greatly enhance security.

What is voice verification and how it works:

Voice verification is a biometrics technology which focuses on matching a person’s voice with a pre-recorded sample, to verify that the speaker is who they claim to be. Each person’s voice is completely unique, much like a fingerprint.

The speaker initially recites some text, a phrase, or some discrete words, numbers and so on. The uttered speech is digitized and stored. The biometrics engine splits each spoken word into small segments called formants (much as a speech synthesis engine works with phonemes, as described in the text-to-speech article). These formants are then analyzed into tones that can be captured in a digitized format and stored in a database. These are the physical characteristics of the voice. In addition to these, further characteristics are recorded and stored, the so-called behavioral characteristics. An example of a behavioral component is pronunciation. The speaker is typically prompted to utter the text or words several times, to gather more information about his voice and allow for greater variation.

When the speaker utters the same text in the future, the same procedure takes place and the extracted tones are compared to the stored ones.


Voice verification accuracy and other issues:

The accuracy of this verification can be affected by numerous factors. A person’s voice can change over time because of health issues (having a cold significantly alters the voice) or even psychological issues. Background noise is another problem that can distort the uttered speech, and microphones tend to make this problem worse. Voice distortion over the telephone can also affect the accuracy of the verification process.

To ensure the highest possible accuracy, the conditions of sample gathering should match those of future verification attempts as closely as possible. For example, if a verification procedure is going to be used over the phone in the future, the sample gathering should also be performed using a phone. Also, both the sampling and the verification procedures should be performed under low noise conditions.

In any case, the aforementioned limitations sometimes make the verification procedure harder to complete successfully. Therefore, most implementations opt to combine voice verification with the classic PIN approach. In this approach, the speaker is prompted during the sampling procedure to utter a series of digits which comprise his PIN. When the speaker tries to authenticate himself in the future, he speaks his PIN again and two procedures take place in parallel. The voice verification engine tries to match the speech characteristics to the ones stored in the database. A speech recognition engine tries to understand the digits being uttered and produce the PIN in a text format. Both engines must confirm that the speaker is who he claims to be. This approach results in substantially higher recognition accuracy overall, and makes systems that use it more resistant to fraud attempts.
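The combined decision can be sketched as follows. This is not how any specific product works: the similarity score, the 0.8 threshold and the function name are assumptions for the illustration, standing in for the outputs of a real voice verification engine and a real speech recognition engine.

```python
def authenticate(voice_score, recognized_digits, enrolled_pin, threshold=0.8):
    """Accept the speaker only if both checks pass.

    voice_score:       hypothetical similarity (0.0 to 1.0) reported by the
                       voice verification engine against the stored sample.
    recognized_digits: the PIN as text, as produced by the speech recognizer.
    enrolled_pin:      the PIN stored for the claimed identity.
    """
    voice_ok = voice_score >= threshold
    pin_ok = recognized_digits == enrolled_pin
    return voice_ok and pin_ok


print(authenticate(0.93, "4711", "4711"))  # True: voice and PIN both match
print(authenticate(0.93, "1234", "4711"))  # False: right voice, wrong PIN
print(authenticate(0.41, "4711", "4711"))  # False: right PIN, wrong voice
```

An impostor who has overheard the PIN fails the first check, while a recording of the real speaker saying something else fails the second.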





Thursday, February 3, 2011

Speech Technology: Text to Speech

Introduction to Speech Synthesis or Text-to-Speech.

This is the second part of a 3-post series discussing the basic speech technologies that can be used in contact centers. Automatic Speech Recognition (ASR) was presented in the first part, a technology allowing a computer to interpret human speech and convert it to text. The opposite procedure, producing human speech artificially from a piece of text, is called Speech Synthesis or Text-to-Speech (TTS).

How Speech Synthesis works.

The procedure of synthesizing speech from a piece of text generally works in three steps. In the first step, the text is converted to a normalized form that consists only of words (eliminating abbreviations, numbers etc.). In the second step, the speech engine assigns a phonetic transcription to each word (using a phonetic alphabet). In the final step, the speech synthesizer uses the phonetic transcript to produce the actual sound.
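To give a feel for the first step, here is a toy normalizer in Python. It is a sketch under heavy assumptions: the abbreviation table is tiny and made up, numbers are simply read digit by digit, and real engines handle dates, currencies, ordinals and context-dependent abbreviations in far more sophisticated ways.

```python
# Hypothetical lookup tables for the illustration.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street", "etc.": "et cetera"}
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def normalize(text):
    """Turn raw text into plain lowercase words, as input for phonetic transcription."""
    words = []
    for token in text.lower().split():
        if token in ABBREVIATIONS:
            words.append(ABBREVIATIONS[token])
        elif token.isdecimal():
            # Simplification: read numbers one digit at a time.
            words.extend(DIGIT_WORDS[int(d)] for d in token)
        else:
            words.append(token.strip(".,;:!?"))
    return " ".join(w for w in words if w)


print(normalize("Dr. Smith lives at 42 Main St."))
# doctor smith lives at four two main street
```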

Speech synthesis is primarily done by concatenating segments of speech recorded by actual humans. The produced waveform consists mostly of actual human speech and is only adjusted by processing at the points of segment concatenation. This type of synthesis is the most natural – sounding, producing speech that closely resembles that of humans. Larger databases of prerecorded segments greatly increase the quality of the output, but also increase processing power and memory requirements. Another frequently used synthesis technique, with substantially lower computing power requirements, is additive synthesis. This method does not use actual human speech samples, but is based instead on mathematical models. The speech produced this way sounds robotic. However, it is easier to produce and more uniform than concatenated speech, which might produce glitches at the points of concatenation. Some engines use a combination of both methods.

Quality of synthesized speech.

The quality of synthesized speech has greatly improved over the past few years. Large amounts of money have been invested to fine tune the synthesizers for languages such as English, which apply to very broad markets. Engines produced by market leaders such as Nuance, Loquendo and Acapela can often be indistinguishable from actual human speech, especially when used over the phone (these companies, and many others, offer free demos of their engines - check them out at their websites). TTS engines are available for various other languages; however, their quality is usually somewhat lacking compared to English (though still very good).

TTS and ASR usage in IVR.

Synthesized speech is very convenient to use in IVR systems when a large number of prompts has to be changed frequently, providing large cost savings as well as deployment efficiency (synthesized speech can be produced instantly without any human intervention). Its quality is still not the same as that of prerecorded prompts, but it comes a lot closer for some languages. Thus, many companies opt to use TTS in their voice applications, especially for English language applications. TTS can be combined with ASR for a complete cycle, as the schematic shows.