Professional Software Consulting

@International Services

2006-

Localization and Linguistics

When it comes to knowledge of localization, few can match @International Services. Initially a translation services business, they have a network in 90 countries and specialize in technical translation and localization. Areas of expertise include audio, video, text, images, software, and any combination of the above, such as sub-titling or simultaneous interpretation. Clients include Microsoft, Cisco/CMG, and Intel, to name a few.

My involvement with @International Services began with the implementation of a C-based content management run-time engine that can be used to deliver localized content to applications. Its inclusion of proprietary linguistic algorithms designed by the company's founder enables it to provide content that is far more linguistically sophisticated than that of a standard content management system. For example, an application using the run-time engine would have access to 32 different time and date formats, proper number sets, and telephone numbers spoken and displayed appropriately for the locale. Because the run-time engine is written in C using standard library calls, it can run on any system from embedded to enterprise scale.
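To make the idea of locale-specific date formats concrete, here is a minimal sketch in Python. It is not the engine's code or API (the engine itself is written in C and its interface is not shown here); the table `DATE_PATTERNS` and the function `format_date` are hypothetical names, and the four patterns are just common numeric conventions chosen for illustration.

```python
from datetime import date

# Hypothetical table: one numeric date pattern per locale.
# A real engine would carry many more formats per locale.
DATE_PATTERNS = {
    "en_US": "%m/%d/%Y",  # month first
    "fr_FR": "%d/%m/%Y",  # day first
    "de_DE": "%d.%m.%Y",  # day first, dot-separated
    "ja_JP": "%Y/%m/%d",  # year first
}


def format_date(value: date, locale_id: str) -> str:
    """Format a date using the pattern registered for locale_id."""
    if locale_id not in DATE_PATTERNS:
        raise ValueError(f"no date pattern for locale {locale_id!r}")
    return value.strftime(DATE_PATTERNS[locale_id])


if __name__ == "__main__":
    sample = date(2006, 3, 4)
    for locale_id in DATE_PATTERNS:
        print(locale_id, format_date(sample, locale_id))
```

The same calendar day comes out in a different order for each locale, which is why an application cannot simply hard-code one pattern.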

I had worked on localization issues back at Premiere Technologies, but in working with @International Services, I learned quite a bit about just how complex localization can be. Although I know a smattering of French, I really only communicate effectively in English. This colors how I write my code and how I write my comments. It also means that when I'm designing a user interface, I tend to scale things for the length of English sentences. When applications are localized, things can change radically.

For example, in French, people typically think that there are two ways of speaking the word one: masculine (un) and feminine (une). However, while this is true in written form, there are actually three ways of enunciating it, because the masculine form is pronounced differently depending on whether the following word starts with a vowel. The un in un galet (a pebble) sounds different from the un in un ami (a friend), so a program that speaks the phrase must act differently in each case.
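The decision a program has to make here can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the company's algorithm: the function name `french_one`, the variant labels, and the vowel list are all assumptions, and the sketch treats every word starting with h as consonant-initial, which is wrong for words with a mute h.

```python
# Letters treated as vowel-initial for this sketch (an assumption, not a
# complete rule: words beginning with a mute "h" are not handled).
VOWELS = set("aeiouyàâéèêëîïôùûœ")


def french_one(gender: str, next_word: str) -> tuple[str, str]:
    """Return (written form, spoken variant) of "one" before next_word.

    gender is "m" or "f". The spoken variant is a label a speech layer
    could map to an audio clip or a phoneme string.
    """
    if gender == "f":
        return ("une", "UNE")
    if gender != "m":
        raise ValueError(f"unknown gender {gender!r}")
    if next_word[:1].lower() in VOWELS:
        # Before a vowel the final n is sounded and linked to the next word.
        return ("un", "UN_LIAISON")
    return ("un", "UN_PLAIN")


if __name__ == "__main__":
    for gender, word in [("m", "galet"), ("m", "ami"), ("f", "excuse")]:
        print(word, french_one(gender, word))
```

The written form takes two values while the spoken variant takes three, which is the mismatch between text and speech that the paragraph above describes.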

Localization can be very annoying to programmers. Programming languages are logical; they are based on rules that are often decided by committee, and the syntax must be exact for things to work properly. Their evolution is a very controlled matter and nothing is done unless there is a reason. On the other hand, with few exceptions, natural languages are a quagmire of chaotic rules and patterns filled with exceptions that were not planned and don't make sense. Nobody planned on googling becoming a verb; it wasn't put to a vote or decided in committee. Nobody decided that the most logical phraseology is to park in a driveway yet drive on a parkway.

The arbitrary nature of linguistics isn't generally recognized by programmers who are creating interfaces in their native language. However, it becomes very evident whenever they have to localize their application. Because we learned our native language at a very young age, the rules seem logical to us no matter how illogical they actually are. Oftentimes, we don't even realize that something isn't right until we see or hear it. For example, if I were to ask an English speaker how many ways the integer 2 could be expressed, they would probably say there are two ways: cardinal (1, 2, 3, etc.) and ordinal (first, second, third, etc.). However, the word double also expresses the integer 2 (single, double, triple, etc.). These are called number sets; English has three, French has four when you take into account pronunciation, and Russian has twenty-eight. Because there is no system, no pattern, no set of rules that spans all languages, localizing software can be a gruelling process, even if all of the linguistic rules for the foreign language are known up front - and most of the time, they are not.
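One way to picture number sets in code is as a lookup keyed by language and set name. The Python sketch below covers only the three English sets mentioned above and only the integers 1 to 3. The structure `NUMBER_SETS`, the function names, and the label "multiplicative" for the single/double/triple set are assumptions made for illustration, not the company's terminology or data model.

```python
# Hypothetical data: each language maps set names to integer -> word tables.
NUMBER_SETS = {
    "en": {
        "cardinal": {1: "one", 2: "two", 3: "three"},
        "ordinal": {1: "first", 2: "second", 3: "third"},
        "multiplicative": {1: "single", 2: "double", 3: "triple"},
    },
}


def express(n: int, number_set: str, lang: str = "en") -> str:
    """Return the word for n in the given number set of the given language."""
    try:
        return NUMBER_SETS[lang][number_set][n]
    except KeyError:
        raise ValueError(f"no entry for {n} in {lang}/{number_set}") from None


def all_forms(n: int, lang: str = "en") -> list[str]:
    """Return every known way to express n in lang, in table order."""
    return [words[n] for words in NUMBER_SETS[lang].values() if n in words]


if __name__ == "__main__":
    print(all_forms(2))
```

Because the number of sets differs per language, the table has to be data rather than code: adding a language means adding entries, not new branches.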

Localization isn't just about translation, either. When it comes to advertising, what sells in one culture may have the exact opposite effect in another. For example, the American style of selling tends to be "We're great! Our product is great! Look at us! Buy our stuff!" However, many cultures find this brash and offensive. In China, a better way to sell is to say "There are some who say we are great. Please, we would like you to consider buying our product because there are people who believe it is a quality product." But if you use the Chinese method of advertising in America, you probably won't sell anything at all. Localization is more than just translating words. It requires expertise.

Poor localization can go beyond just turning away customers. Commercials in the United States often include women with bare skin, such as arms, legs, necks, midriffs, or even more. These sorts of commercials are considered extremely offensive in certain places in the Middle East, where women are required to be covered from head to toe. Whether or not you agree with the local rules and customs of the culture in question, you need to be aware of these sorts of things if you want to have any success doing business there.

This is the sort of knowledge and information that @International Services specializes in.

Achievements at @International Services

I have designed quite a few disparate systems that all work together for various purposes relating to the productization of @International Services' expertise. What follows is a list of major accomplishments and responsibilities in relation to this company.

  • Run-Time Content Management Engine. Encapsulating linguistic algorithms and rules for content localization and delivery for embedded environments in a software library. Includes automated regression test suite consisting of 2900 distinct unit tests to ensure quality.

  • Run-Time Engine APIs. Includes C#, Java (JNI), RealBasic, C (homegrown) session layer, VBScript using COM+ architecture, plus auto-generated C "wrapper" code specific to a particular client and configuration. Wrote entirety of API documentation.

  • Engine Certification. Developed build procedures so that the run-time library could be certified to work on Windows XP, SUN Solaris, Intel Solaris, FreeBSD, Slackware Linux, and Mac OS X using either a Unix shared object, a DLL using Microsoft's compiler, or a DLL using Borland's compiler.

  • Source Code Obfuscator. Created process to obfuscate all source code to make it unreadable, thereby allowing delivery for building on embedded client systems in the form of a shared object or directly linked objects. Client in question used Tornado embedded emulator.

  • Run-Time Engine Desktop Utility. For troubleshooting run-time engine configuration files, this utility enabled the manual execution of the engine's API, enabling the testing of core functionality outside of the client application.

  • Phone Notification Prototype. Using Voxeo, design and implementation of a prototype web-initiated phone notification system that called a user, used text-to-speech to read a message to the user, and then logged that the user received the message.

  • ASR Generation. Creation of a suite of programs and utilities used for Automatic Speech Recognition (ASR) grammar generation using the Nuance ASR engine.

  • Prototype Datacenter. Implemented a data center at main office, assisting with the installation of a T1 circuit, cooling provisions, network infrastructure supporting five separate public IP networks, DNS setup, SSL certificate purchase and installation, two top-level domains with as many as thirty sub-domains. Includes purchasing and installing equipment, running cables, and configuring routers.

  • Hardware Installation Methodology. Created and tested standardized install and configuration instructions for hardware so that all hardware has the same capabilities but with functionality turned on commensurate with the box's purpose. For example, all boxes would have MySQL installed and configured, but it would only be running on the box that was designated as the database server. Includes NTP time server configuration, software install, and standardization of configuration information so that, for example, a web server for the production network would be configured identically to the web server for the development network. Includes designation of machine types: web, database, batch, ftp, linguistic server (Macintosh), Microsoft utility server.

  • Disaster Recovery Plan. Design of multi-site tertiary backup system to facilitate disaster recovery to achieve 99.999% availability.

  • Apache Server. Installation and configuration of Apache web server version 2.2, using virtual hosts to support production, development, testing, demo, and production-mirror environments. Includes installation of PHP 5.

  • MySQL Server. Installation and configuration of MySQL database server version 5. Includes schema creation for various applications and environments (production, test, demo, etc.) as well as nightly backups copied to a different location.

  • FTP Server. Configuration of multiple FTP servers using SSH as well as standard FTP protocol to create a common area for all company project managers to access shared files. In the same area, implemented distinct customer logins for certain subdirectories so project managers could access all clients' data, but the clients could not see each other's data. Same technique used to create a "demo file" area where potential clients had read-only access to voice talent files but project managers had full permission.

  • Amazon Web Services S3. Wrote command line scripts to automate listing, uploading, downloading, and deleting of files from the company's AWS S3 repository, enabling the application to use S3 as an unlimited-capacity long-term and short-term datastore. Includes setting up company S3 account.

  • Large File Tracking System. Design and creation of a file storage and tracking mechanism similar to source code control systems, but designed for very large binary files stored either locally, on a remote machine on the LAN, or on Amazon S3. Intended use is for audio and video localization projects.

  • White Papers, Marketing, Sales. Wrote multiple white papers, certain information for patent submission, marketing verbiage, and PowerPoint presentations related to sales calls. Acted as technical sales personnel on both conference calls and face-to-face presentations. Includes representing the company in booths at trade shows.

  • Spell Check Server. Using Microsoft Office Automation and RealBasic, created an HTTP API that enables any other program in the system to use the multi-lingual spell checking capability in Microsoft Word.

  • RealBasic/Flash Conduit. Creation of a data "conduit" to ferry requests between a Flash application in the browser and a backend server written in RealBasic (see next bullet point). PHP code executing in Apache acted as the intermediary between the two; it contained no business logic.

  • RealBasic Server Framework. Created an HTTP server framework in RealBasic so that other developers could easily add business logic to the server without the need to add or alter anything in the communication layer with the client. Because this was used in the RealBasic/Flash conduit, the programs created with this framework could be assured that their only client was an Apache request using cURL.

  • Data Conduit Testing Methodology. Created a web interface to assist other developers in starting, stopping, testing, and checking the status of the back-end RealBasic HTTP server independent of a Flash client.

  • Online Vendor Management System. Creation of an online system used by the company to automate and track their localization projects. Included online vendor invoice submission, purchase order entry and cloning (for jobs with dozens of languages), automatic email notification, historical data retention, vendor evaluation system based on language and service provided, accounting, and reports.

  • Acting CTO, CIO, and Chief Architect. Responsibilities have included researching and testing new technologies, making decisions regarding development resources and certain areas of product development.
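
The Hardware Installation Methodology item in the list above describes machines that all carry the same software but only run what their role calls for. A minimal Python sketch of that idea follows. The service names, the role-to-service mapping, and the choice to keep NTP enabled everywhere are assumptions for illustration, not the actual install instructions.

```python
# Assumption: every machine has the same set of software installed.
INSTALLED_EVERYWHERE = ("apache", "mysql", "ftp", "ntp")

# Assumption: which services each machine type switches on.
ROLE_SERVICES = {
    "web": {"apache"},
    "database": {"mysql"},
    "ftp": {"ftp"},
    "batch": set(),
}

# Assumption: time synchronization runs on every box regardless of role.
ALWAYS_ON = {"ntp"}


def service_plan(role: str) -> dict[str, bool]:
    """Map each installed service to True if it should run for this role."""
    if role not in ROLE_SERVICES:
        raise ValueError(f"unknown machine type {role!r}")
    enabled = ROLE_SERVICES[role] | ALWAYS_ON
    return {service: service in enabled for service in INSTALLED_EVERYWHERE}


if __name__ == "__main__":
    for role in ROLE_SERVICES:
        print(role, service_plan(role))
```

Keeping the installed set identical and varying only the enabled set means any box can be repurposed by changing its role rather than reinstalling software.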


This website designed, implemented, and maintained by Corey Dulecki
© 2009-2012, Corey's Consulting LLC