Professional Software Consulting

Bellsouth Mobility/Cingular Wireless/AT&T

2000-2006

Enterprise Systems

I was hired by Matrix Resources as a contractor to work for Bellsouth Mobility. Bellsouth's attitude was that contractors were easier to deal with than employees for a variety of reasons, and so most of us were contractors. I believe that out of 28 people in our department, 25 were contractors. All this changed when Bellsouth Mobility merged with SBC Wireless and changed its name to Cingular. Contractors were strongly encouraged to convert to full-time employees, and this I did. Several acquisitions later, I was an employee of AT&T.

Picture courtesy of Michael Pugh, www.globalgiants.com

The system I worked on was a middleware system responsible for retail activations. Originally developed in the early 1990s, it acted as a staging area for credit and activation activity for new customer signups, and was distinct from the billing system. The system was designed for 99.95% availability, and in particular to ensure that even if the billing or provisioning systems were unavailable, Cingular could still sell phones. Activations were cached in the system until the billing system could accept requests again.
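The caching behavior described above is a store-and-forward pattern. The following is only a minimal sketch of that idea, not the original implementation (which was written in C): the class name, the `send_to_billing` callable, and the in-memory queue are all assumptions made for illustration, and a real system would persist the backlog to disk.

```python
from collections import deque


class ActivationStager:
    """Hypothetical staging area that holds activations while billing is down."""

    def __init__(self, send_to_billing):
        # send_to_billing is an assumed callable: it returns True when the
        # billing system accepted the activation, False when it is unavailable.
        self._send = send_to_billing
        self._pending = deque()  # in-memory only; illustration, not durable

    def submit(self, activation):
        """Accept the sale immediately, then try to forward the backlog."""
        self._pending.append(activation)
        self.flush()

    def flush(self):
        """Forward queued activations in arrival order; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # billing still unavailable; keep everything queued
            self._pending.popleft()
            sent += 1
        return sent

    def backlog(self):
        """Number of activations still waiting for the billing system."""
        return len(self._pending)
```

The point of the design is that `submit` never fails from the store's perspective: the sale is recorded first, and delivery to billing is retried later by calling `flush` again.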

The core of the system was written in C and Unix shell scripts, with over 30 distinct sub-systems that implemented the various business rules and provided scalability and extensibility. It was a fascinating system, created by very talented architects to handle very heavy loads quickly. It used an ISAM database rather than a database server, primarily for speed: at the time, the architects didn't want to introduce the latency of waiting for a database server to respond. In addition to the middleware servers, it also communicated with the credit authorization system, Bellsouth LandLine (the parent company), and Equifax.

The machines on which we ran the code were the largest Solaris boxes available at the time. When I left, each of the production boxes had 24 CPUs and 32 GB of memory. Because the system was not designed as a long-term datastore, the amount of disk allocated in the SAN was relatively small, perhaps 200 GB or so, but it had to be fast. While it was initially fun to work on a machine that cost more than my house, I learned that the real price of these large systems came in keeping them up. While our application was designed for 99.95% availability, it didn't always achieve that. There are many things that can go wrong that have nothing to do with your code.
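To put the 99.95% target in perspective, it helps to convert it into allowed downtime. The short calculation below is a sketch of that arithmetic only; the function name and the assumption of a 365-day year are mine, not part of the original system's specification.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def allowed_downtime_hours(availability_pct, period_hours=HOURS_PER_YEAR):
    """Hours of downtime permitted in a period at the given availability."""
    return period_hours * (1 - availability_pct / 100)


yearly = allowed_downtime_hours(99.95)
print(f"99.95% availability allows about {yearly:.2f} hours of downtime per year")
```

At 99.95%, that works out to roughly 4.4 hours of total downtime per year, which a single hard-to-diagnose outage can easily consume.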

For example, we once had an outage where the system was taking a long time to respond to front-end systems. Nothing in the logs showed any indication of error: the time from when a message reached our code to when the code responded was within milliseconds. The network group verified that the network was not clogged and that we had plenty of bandwidth. The production support team saw nothing out of the ordinary in any of the monitoring programs they used to watch the system. So we invoked our Platinum support contract for Solaris, and the vendor sent over their techs to take a look. They determined that one CPU was designated as the "kernel" CPU, and it was the only one running kernel calls. Because we had so much traffic, the other 23 CPUs were waiting on the kernel CPU to finish. The solution? Remove some of the CPUs. The Director of Operations was incredulous: "This is the only time in my life that I've ever heard of removing CPUs to make things go faster." We theorized that if the kernel could have spread its load over multiple CPUs like other processes, the issue would not have occurred. We eventually shifted load and program responsibilities among other machines, and were able to get the system back up to 24 CPUs. Most outages were not as difficult to diagnose as this one, but it made for some sleepless nights.

Enterprise Management

I was involved in many calls related to troubleshooting systems, both our middleware system and other systems connected to it. By the time I became a full-time employee, I was responsible for system maintenance from a non-operational standpoint. This included keeping in contact with the production support team and fixing whatever issues they came across, as well as making changes requested by the table configuration and user security teams. Later, development was added to my responsibilities.

Picture courtesy of http://2putts4par.com , January 12th 2007 blog posting: 'AT&T Global NOC'

As the development manager, I understood that the system (15 years old at that time) was fast becoming obsolete and needed to be replaced. Several attempts had been made in the past to migrate the system towards the company's chosen direction of Java and Oracle on Solaris, but all had failed, for a variety of reasons. I came up with a migration plan that could be stopped temporarily while higher priority business requests came in, and then continued with no loss of effort.

The fact that we could stop the migration came in handy on several occasions, but the two largest were the Wireless Local Number Portability (WLNP) effort and the AT&T acquisition (when Cingular bought AT&T Wireless). Both brought sweeping changes across the entire organization. Each lasted many months and was the focus of the company's entire development and architecture divisions.

Over the lifetime of the system, approximately 15 distinct front-end systems were used in various sales channels to sell phones to different customers. These channels included direct sales (Cingular employees in Cingular-owned stores), indirect sales (Cingular equipment in non-Cingular-owned stores), telephone sales, internet sales, reseller sales, and several more besides.

Achievements at AT&T

As a development manager, I sometimes find it hard to measure achievement. Does delivering the changes for a business request on time count as an achievement? Or is that just part of the job? So I will try to keep the mundane achievements out of this list and focus only on the things that reflect my thinking and my capabilities.

  • Contract Server. Prior to becoming a manager, I implemented a contract server as an engine, enabling contracts to be served up to any front ends. This facilitated the changing of contracts without changing code, and also enabled a single point of maintenance for all retail contracts. It wasn't a single point of failure, however, because of how the framework into which it fit was architected.

  • Designed Migration Plan. This is the legacy-to-modern system migration plan of which I spoke earlier.

  • Wireless Local Number Portability. This project caused over six months of chaos across the entire company. A government-mandated project, it required substantial changes to almost every system within Cingular, especially those related to the activation process. Because I was responsible for the middleware activation system for the entire southeast region, including Puerto Rico and New York, I had many weeks of 20-hour days and 2 a.m. conference calls, even after the code was released.

  • Cingular/AT&T Wireless Merger. The changes related to this were probably fewer than those required for WLNP, but because work started months before the Department of Justice approved the merger, we could not contact the AT&T Wireless personnel directly to understand their business model or how their system worked. All communication had to go through a "legal white room", staffed by lawyers who ensured that no proprietary information flowed either way. This severely limited what we knew as we modified the system, and we had to stay agile in order to change things radically as new information became available.

  • Front End System Management. In addition to being responsible for the middleware system, I was also responsible for certain front-end systems, including the legacy green-screen activation front end, the activation trouble queue application used by credit and support analysts, and the web-based activation system used by indirect stores; i.e., stores that sold AT&T phones but were not owned by AT&T.
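The Contract Server item in the list above rests on one idea: contract text lives in data that any front end can request, so changing a contract does not require changing code. The sketch below illustrates only that idea. The contract ID, the template wording, and the `render_contract` function are hypothetical, and the original engine was not written in Python.

```python
from string import Template

# Hypothetical contract store: in a real system this would be loaded from
# configuration or a data file so it can change without a code release.
CONTRACTS = {
    "retail-2yr": Template("$customer agrees to a $months-month service term."),
}


def render_contract(contract_id, **fields):
    """Return the contract text for contract_id with the given fields filled in."""
    return CONTRACTS[contract_id].substitute(**fields)


print(render_contract("retail-2yr", customer="Pat", months=24))
```

Because every front end calls the same function, editing the entry in `CONTRACTS` changes the contract everywhere at once, which is the single point of maintenance described above.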


This website designed, implemented, and maintained by Corey Dulecki
© 2009-2012, Corey's Consulting LLC