When it comes to knowledge of localization, few can match @International
Services. Initially a translation services business, they have a network
in 90 countries and specialize in technical translation and localization.
Areas of expertise include audio, video, text, images, software, and
any combination of the above, such as subtitling or simultaneous
interpretation. Clients include Microsoft, Cisco/CMG, and Intel, to name a
few.
My involvement with @International Services began with the implementation
of a C-based content management run-time engine that can be used to deliver
localized content to applications. Its inclusion of proprietary linguistic
algorithms designed by the company's founder enables it to provide content
that is far more linguistically sophisticated than a standard content
management system. For example, an application using the run-time engine would
have access to 32 different time and date formats, proper number sets,
and telephone numbers spoken and displayed appropriately for the locale.
Because the run-time engine is written in C using standard library calls, it
can run on any system from embedded to enterprise scale.
I had worked on localization issues back at Premiere Technologies, but in
working with @International Services, I learned quite a bit about just how
complex localization can be. Although I know a smattering of French, I really
only communicate effectively in English. This colors how I write my code
and how I write my comments. It also means that when I'm designing a
user interface, I tend to scale things for the length of English sentences.
When applications are localized, things can change radically.
For example, in French, people typically think that there are two ways of
saying the word one: masculine (un) and feminine (une). However, while this
is true in written form, there are actually three ways of pronouncing it,
because the masculine form changes depending on whether the following word
starts with a vowel. Un galet (a pebble) will sound different from un ami
(a friend), where the final n is sounded, thereby requiring a program to act
differently.
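The French example above can be sketched in a few lines of code. This is only
an illustration under stated assumptions, not the run-time engine's actual
algorithm: the function name spoken_one, the labels it returns, and the
simplified "starts with a vowel letter" test are all hypothetical.

```python
# Hypothetical sketch: the names, labels, and vowel test are assumptions
# made for illustration, not the engine's proprietary rules.
VOWELS = set("aeiouàâéèêëîïôùû")


def spoken_one(gender: str, next_word: str) -> str:
    """Return a label for how French "one" is spoken before next_word.

    gender is "m" or "f". Simplification: any leading vowel letter triggers
    the liaison. A real system would also need a lexicon for words starting
    with h (mute versus aspirated) and for other exceptions.
    """
    if gender == "f":
        return "une"
    if next_word and next_word[0].lower() in VOWELS:
        return "un+liaison"  # the final n is sounded before a vowel
    return "un"  # nasal vowel only, as before "galet"


print(spoken_one("m", "galet"))
print(spoken_one("m", "ami"))
print(spoken_one("f", "amie"))
```

Even this toy version shows why the written form is not enough: the program
has to look at the next word before it can decide what to say.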
Localization can be very annoying to programmers. Programming languages are
logical; they are based on rules that are often decided by committee, and the
syntax must be exact for things to work properly. Their evolution is a very
controlled matter and nothing is done unless there is a reason. On the other
hand, with few exceptions, human languages are a quagmire of chaotic rules
and patterns filled with exceptions that were not planned and don't make sense.
Nobody planned on googling becoming a verb; it wasn't put to a vote
or decided in committee. Nobody decided that the most logical phraseology
is to park in a driveway yet drive on a parkway.
The arbitrary nature of linguistics isn't generally recognized by programmers
who are creating interfaces in their native language. However, it becomes
very evident whenever they have to localize their application. Because
we learned our native language at a very young age, the rules seem logical to us
no matter how illogical they actually are. Oftentimes, we don't even realize
that something isn't right until we see or hear it. For example, if I were
to ask an English speaker how many ways the integer 2 could be expressed, they
would probably say there are two ways: cardinal (1, 2, 3, etc.) and ordinal
(first, second, third, etc.). However, the word double also
expresses the integer 2 (single, double, triple, etc.).
These are called number sets; English
has three, French has four when you take into account pronunciation, and
Russian has twenty-eight. Because there is no system, no pattern, no
set of rules that span all languages, localizing software can be a gruelling
process, even if all of the linguistic rules for the foreign language are
known up front - and most of the time, they are not.
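As a rough sketch of what a number set looks like to a program, the three
English sets mentioned above can be written as lookup tables. The table
layout, the function name express, and the label "multiplicative" for the
single/double/triple set are assumptions made for this example; they are not
taken from the run-time engine.

```python
# Illustrative data only: three English number sets for the integers 1 to 3.
# The set names and the table layout are assumptions for this sketch.
ENGLISH_NUMBER_SETS = {
    "cardinal": {1: "one", 2: "two", 3: "three"},
    "ordinal": {1: "first", 2: "second", 3: "third"},
    "multiplicative": {1: "single", 2: "double", 3: "triple"},
}


def express(value: int, number_set: str, table=ENGLISH_NUMBER_SETS) -> str:
    """Look up how value is expressed in the requested number set."""
    try:
        return table[number_set][value]
    except KeyError:
        raise ValueError(f"no {number_set!r} form known for {value}") from None


for name in ENGLISH_NUMBER_SETS:
    print(name, express(2, name))
```

A table like this works for one language, but each additional language needs
its own sets and its own rules for choosing between them, which is where the
real effort goes.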
Localization isn't just about translation, either. When it comes to
advertising, what sells in one culture may have the exact opposite effect
in another. For example, the American style of selling tends to be "We're
great! Our product is great! Look at us! Buy our stuff!" However, many
cultures find this brash and offensive. In China, a better way to sell
is to say "There are some who say we are great. Please, we would like you
to consider buying our product because there are people who believe it is a
quality product." But if you use the Chinese method of advertising in America,
you probably won't sell anything at all. Localization is more than just
translating words. It requires expertise.
Poor localization can go beyond just turning away customers.
Commercials in the United States often include women with bare skin, such
as arms, legs, necks, midriffs, or even more. These sorts of commercials
are considered extremely offensive in certain places in the Middle East, where
women are required to be covered from head to toe. Whether or not you agree
with the local rules and customs of the culture in question,
you need to be aware of these sorts of things if you want to
have any success doing business there.
This is the sort of knowledge and information that @International Services
specializes in.
Achievements at @International Services
I have designed quite a few disparate systems that all work together
for various purposes relating to the productization of @International Services'
expertise. What follows is a list of major accomplishments and
responsibilities in relation to this company.
Run-Time Content Management Engine.
Encapsulating linguistic algorithms and rules for content localization and
delivery for embedded environments in a software library. Includes
automated regression test
suite consisting of 2900 distinct unit tests to ensure quality.
Run-Time Engine APIs. Includes C#, Java (JNI), RealBasic, C
(homegrown) session layer, VBScript using COM+ architecture, plus
auto-generated C "wrapper" code specific to a particular client and
configuration. Wrote entirety of API documentation.
Engine Certification. Developed build procedures so that
the run-time library could be certified to work on Windows
XP, SUN Solaris, Intel Solaris, FreeBSD, Slackware Linux, and Mac
OS X using either a Unix shared object, a DLL using Microsoft's compiler, or
a DLL using Borland's compiler.
Source Code Obfuscator. Created process to obfuscate all source
code to make it unreadable, thereby allowing delivery for building on
embedded client systems in the form of a shared object or directly linked
objects. Client in question used Tornado embedded emulator.
Run-Time Engine Desktop Utility. For troubleshooting run-time
engine configuration files, this utility enabled the manual execution
of the engine's API, enabling the testing of core functionality outside
of the client application.
Phone Notification Prototype. Using Voxeo, design and
implementation of a prototype web-initiated phone notification system
that called a user, used text-to-speech to read a message to the user,
and then logged that the user received the message.
ASR Generation. Creation of a suite of programs and utilities used for Automatic
Speech Recognition (ASR) grammar generation using the Nuance ASR engine.
Prototype Datacenter. Implemented a data center at the main
office, assisting with the installation of a T1 circuit, cooling provisions,
network infrastructure supporting five separate public IP networks,
DNS setup, SSL certificate purchase and installation, and two top-level
domains with as many as thirty sub-domains. Includes purchasing and
installing equipment, running cables, and configuring routers.
Hardware Installation Methodology. Created and tested
standardized install and configuration instructions for hardware so that
all hardware has the same capabilities but with functionality turned on
commensurate with the box's purpose. For example, all boxes would have
MySQL installed and configured, but it would only be running on the box
that was designated as the database server. Includes NTP time server
configuration, software install, and standardization of configuration
information so that, for example, a web server for the production
network would be configured identically to the web server for the
development network. Includes designation of machine types: web,
database, batch, ftp, linguistic server (Macintosh), Microsoft utility
server.
Disaster Recovery Plan.
Design of multi-site tertiary backup system to facilitate disaster
recovery to achieve 99.999% availability.
Apache Server. Installation and configuration of Apache web
server version 2.2, using virtual hosts to support production,
development, testing, demo, and production-mirror environments.
Includes installation of PHP 5.
MySQL Server. Installation and configuration of MySQL database
server version 5. Includes schema creation for various applications
and environments (production, test, demo, etc.) as well as nightly backups
copied to a different location.
FTP Server. Configuration of multiple FTP servers using SSH as
well as standard FTP protocol to create a common area for all company
project managers to access shared files. In the same area, implemented
distinct customer logins for certain subdirectories so project managers
could access all of the clients' data, but the clients could not see each
other's data. Same technique used to create a "demo file" area where
potential clients had read-only access to voice talent files but project
managers had full permission.
Amazon Web Services S3. Wrote command line scripts to automate
listing, uploading, downloading, and deleting of files from the company's
AWS S3 repository, enabling the application to use S3 as an
unlimited-capacity datastore for both long-term and short-term use. Includes
setting up company S3 account.
Large File Tracking System.
Design and creation of a file storage and tracking mechanism similar to
source code control systems, but designed for very large binary files
stored either locally, on a remote machine on the LAN, or on Amazon S3.
Intended use is for audio and video localization projects.
White Papers, Marketing, Sales. Wrote multiple white
papers, certain information for patent submission, marketing verbiage, and
PowerPoint presentations related to sales calls. Acted as
technical sales personnel on both conference calls and face-to-face
presentations. Includes representing the company in booths at trade
shows.
Spell Check Server. Using Microsoft Office Automation and
RealBasic, created an HTTP API that enables any other program in the
system to use the multi-lingual spell checking capability in Microsoft
Word.
RealBasic/Flash Conduit. Creation of a data "conduit" to
ferry requests between a Flash application in the browser and a
backend server written in RealBasic (see next bullet point). PHP code
executing in Apache acted as the intermediary between the two; it contained
no business logic.
RealBasic Server Framework. Created an HTTP server framework in
RealBasic so that other developers could easily add business logic
to the server without the need to add or alter anything in the
communication layer with the client. Because this was used in the
RealBasic/Flash conduit, the programs created with this framework could
be assured that their only client was an Apache request using cURL.
Data Conduit Testing Methodology. Created a web interface to
assist other developers in starting, stopping, testing, and checking the
status of the back-end RealBasic HTTP server independent of a Flash client.
Online Vendor Management System. Creation of an online system
used by the company to automate and track their localization projects.
Included online vendor invoice submission, purchase order entry and
cloning (for jobs with dozens of languages), automatic email notification,
historical data retention, vendor evaluation system based on language
and service provided, accounting, and reports.
Acting CTO, CIO, and Chief Architect.
Responsibilities have included researching and testing new technologies,
and making decisions regarding development resources and certain areas of
product development.
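The idea behind the Hardware Installation Methodology item above, where every
box gets the same software but only the services matching its role are
switched on, can be sketched as a small table. The role names and service
names below are hypothetical stand-ins chosen for illustration, and the
function service_plan is not part of the original install instructions.

```python
# Hypothetical sketch: role and service names are illustrative assumptions.
# Every box has the same software installed and configured.
INSTALLED = ("apache", "mysql", "ftp")

# Only the services listed for a role are actually running on that box.
ROLE_SERVICES = {
    "web": {"apache"},
    "database": {"mysql"},
    "ftp": {"ftp"},
}


def service_plan(role: str) -> dict:
    """Map each installed service to True if it should run for this role."""
    enabled = ROLE_SERVICES[role]
    return {service: service in enabled for service in INSTALLED}


print(service_plan("database"))
```

Because the plan depends only on the role, a web server on the production
network and a web server on the development network come out configured
identically, which was the point of the standardization.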