Language facts: Hebrew

Sep 24, 2015

Hebrew is a Semitic language and belongs to the Afroasiatic language family. Biblical Hebrew is closely related to Arabic and Aramaic, which are spoken around the territory where many of the biblical stories are focused – the Middle East. Hebrew is one of the two official languages of the State of Israel, along with Arabic. Modern Hebrew is spoken by some six million people inside Israel and one to two million people outside the country. Liturgical Hebrew is used by quite a few more people, in both Jewish and Christian religious settings.

Resuscitated after centuries

Liturgical Hebrew, which was preserved in ancient religious heritage but vanished from everyday use around the 4th century, was actually revived. Modern Hebrew was invented as an adjunct to the Zionist movement in the 19th century. One of its first and most avid innovators was Eliezer Itzchak Perlman of Belarus, who created much of the modern vocabulary between 1885 and 1922. Mr. Perlman is renowned for raising the first “Hebrew-speaking” child – he forbade anyone to utter a word in any other language around his firstborn son, Ben Zion (who later changed his name to Itamar).

Modern Hebrew is governed by an official committee – the Academy of the Hebrew Language. Decisions by the Academy are enshrined in law and frequently ignored by speakers of the language. It is interesting, though, that Hebrew is the native language of less than 49% of Israelis – other major native languages of Israel's inhabitants are Russian, Arabic, English, French and Yiddish. Although Yiddish resembles Hebrew in using the same alphabet and some similar expressions, the two languages have very different origins and histories: Yiddish is a fusion language that draws on Liturgical Hebrew and Aramaic but is mixed with High German and Slavic languages.


Hebrew is read from right to left using a distinctive 22-letter alphabet.

בּ ב ג גּ ג׳ ד דּ ד׳ ה ו וּ וֹ ו׳ ז ז׳ ח ט י ִי כּ ךּ ך כ ל

/ ם מ ן נ ס ע פּ ףּ פ ף ץ צ ץ׳ צ׳ ק ר שׁ שׂ תּ ת ת׳

Translation Memory Creation: How It Works

Sep 17, 2015

In our previous blogs, we emphasized the advantages of having well-maintained translation resources, which reduce your translation costs, mainly in connection with continuous translation undertakings. We have also touched on the process of "cleaning" and maintaining your translation memories, but what if you don't have any? If that's the case, you should definitely create one if you publish documents with similar content. Here we explain how to go about this step by step.

5 Steps to creating a Translation Memory

The process begins with two similar files – one with source text, the other with corresponding target text in a different language. To create a TM for you, we first need to pool as many of your original files as possible, together with their translated equivalents. The file format doesn't matter too much. The documents can be in Word, Excel, FrameMaker, InDesign, Acrobat PDF, or whatever other format you may have.

Then, we apply these 5 steps in order to create your translation resources.

  1. SEGMENT EXTRACTION. We extract all text segments (basically sentences) from the source and target files to create a kind of bilingual database with original text and the corresponding translated text.
  2. SEGMENT ALIGNMENT. We then confirm that all segments are correctly aligned using our unique, in-house developed iSync solution. It pairs segments based on the placement markings and content. The process is highly automated and enables very fast processing, much faster than humans and with high precision.
  3. HUMAN EDITING. A human review of the result is necessary, though, in which native translators browse the paired segments to ensure they match.
  4. TM CREATION. At the end of the process we delete redundant segments that have no match in either source or target, and we then export the bilingual text segments to the universal .tmx format or any other format you may require.
  5. TM QA AND EXPORT. The last step involves applying our QA tools to the final result to ensure that segments are consistent, that numbers, tags and symbols match, and that there is no text in the wrong language.
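
For readers who like to see the idea in code, the extraction, pairing and export steps can be sketched roughly as below. This is only a minimal illustration under simplifying assumptions: sentences are split with a naive regular expression, segments are paired purely by position (a stand-in, not the iSync solution mentioned above), and the function names and TMX header values are hypothetical placeholders.

```python
import re
import xml.etree.ElementTree as ET


def extract_segments(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def align_by_position(source_segments, target_segments):
    """Pair segments one-to-one by position (assumption: both files have the same order).

    zip() stops at the shorter list, so leftover segments without a partner are dropped.
    """
    return list(zip(source_segments, target_segments))


def build_tmx(pairs, src_lang, tgt_lang):
    """Serialize (source, target) pairs as a small TMX 1.4 document string."""
    tmx = ET.Element("tmx", version="1.4")
    # Header values below are placeholders chosen for this sketch.
    ET.SubElement(tmx, "header", {
        "creationtool": "tm-sketch",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in pairs:
        tu = ET.SubElement(body, "tu")  # one translation unit per segment pair
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {"xml:lang": lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


if __name__ == "__main__":
    source = "Press the button. Wait for the light."
    target = "Drücken Sie die Taste. Warten Sie auf das Licht."
    pairs = align_by_position(extract_segments(source), extract_segments(target))
    print(build_tmx(pairs, "en", "de"))
```

Real documents rarely line up this neatly, which is why the alignment and human editing steps exist; positional pairing is used here only to keep the sketch short.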

TMs belong to you

Once a translation memory has been created and delivered to you, it becomes your legal property, and you can use it in all your internal processes or whenever you outsource translation projects. This is important to remember, and you can always ask your existing or previous suppliers to deliver TMs they have created in projects ordered by you – we then hope you will relocate your projects to idioma :)

There's no time to waste when it comes to ensuring you have TMs and also keeping them up to date.

Contact us TODAY to get your TM created!

TM HEALTH CHECK™: 4 Steps to Updated Translation Memories

Sep 10, 2015

If you have ever been involved in the content translation process in your company, you are surely familiar with Translation Memories (TMs). And if you are not, you really should be, as TMs are your way to less stress, less overtime work, possibly even a promotion or other reward thanks to reducing your company's translation expenses.

Why you should keep your TMs up to date

As we have mentioned in a previous blog, it is very common that TMs get outdated and out of sync with your current published documents. Even if you have gone to the effort of creating a translation memory, it most likely does not include all the additions, changes, etc. that your published documents have undergone.

What you risk with outdated TMs

If you fail to keep your TM up to date, you risk issues in the translation process, such as:

  • Inconsistent resources and terminology that slow down translation, create context errors that compromise translation accuracy and cause confusion
  • Reuse of 100% identical text segments that contain errors, which are used and published again and again
  • Jack is not always a jack, nor a jack... A common TM issue is the existence of different translations of the same source segment, especially for short text segments. Such segments should be kept, but their actual usage checked.
  • If selected segments are updated in local projects, they risk becoming inconsistent with the existing segments in your TMs, causing even more confusion – what should be used when and where?

All these issues lead to inconsistent and inaccurate translation. Translation takes longer and becomes more expensive. These extra costs can be prevented by performing a regular health check on your TMs, and making sure they undergo regular maintenance.
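
Two of the issues above, repeated entries and one source segment with several different translations, are easy to detect automatically. The following is a minimal sketch, assuming the TM has already been loaded as a list of (source, target) pairs; the function names and the sample German terms are illustrative only and do not describe any particular tool.

```python
from collections import defaultdict


def find_inconsistent_sources(pairs):
    """Return {source: sorted targets} for sources that have more than one distinct target."""
    targets = defaultdict(set)
    for source, target in pairs:
        targets[source].add(target)
    return {s: sorted(t) for s, t in targets.items() if len(t) > 1}


def drop_exact_duplicates(pairs):
    """Remove repeated (source, target) pairs, keeping the first occurrence of each."""
    return list(dict.fromkeys(pairs))


if __name__ == "__main__":
    tm = [
        ("Jack", "Klinke"),
        ("Jack", "Wagenheber"),
        ("Jack", "Klinke"),
        ("Open", "Öffnen"),
    ]
    print(find_inconsistent_sources(tm))
    print(drop_exact_duplicates(tm))
```

Note that the first function only reports such segments and does not delete anything, since different translations of the same short source segment can each be correct in their own context.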

What does the TM overhaul involve?

  1. PROJECT LAUNCH – We pool all your TM resources, i.e. those you have and those we may have in our storage. We then determine priorities for maintenance if there are multiple TMs to check – e.g. emphasize a main TM that serves as the source for the greater part of your content, or cover content that is most visible in the company.

  2. TM HEALTH CHECK – To "clean" the TM and match every source segment with its target, we run a “Health check” analysis to detect possible errors, such as:

  • number mistakes and typos, URL mismatches, tag order, usage of special characters and spaces, or formatting based on language style sheets
  • untranslated text, which could signal missing target text
  • text inconsistencies such as inconsistent matches and duplicate TM entries, including word consistency
  • glossary usage / non-usage (use of forbidden terms, opposite terms, etc.)
  3. TM MAINTENANCE – Based on the TM HEALTH CHECK™, our human TM specialists and editors implement a thorough maintenance and editing process, partly relying on IT-based tools, to clean the memory and repair detected issues. The process includes the following:
  • Updating the segments one-to-one (source-to-target) for each TM (cleaning out duplicates, forbidden terms, etc.)
  • Verifying terminology consistency, including correction by human reviewers
  • Replacing your existing TMs with the updated bilingual text files to eliminate outdated and faulty segments.
  4. TM IMPLEMENTATION – This step is chiefly a client-side issue. Your memory is now in good health. Going forward, to benefit from the entire process you need to ensure that the updated TMs are used at all company levels in the document flow and for all published content. And since TM maintenance and the health check should be a regular event, the company needs to decide on a policy on how to continue its involvement in its TM overhauls:

Manage content and update resources in-house, and only have the cleaning process covered by an external professional LSP. This is often preferred by big companies and requires a considerable organizational structure (defining the TM update process, appointing managers responsible for the TM resources and their updates, etc.), or

Focus on your core business and outsource TM maintenance to your LSP to

  • perform regular TM health checks of your resources
  • follow up on these checks with reporting on inconsistent text segments, unused glossary terminology, poor/incorrect source text, conditional glossary terms, or even inconsistent source segments. idioma offers a comprehensive reporting service that serves as a valuable foundation for maintenance and consistent translation quality in ongoing and future translation projects.
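
To give an idea of what the automated part of a health check looks for, here is a minimal sketch of two of the checks named earlier: number mismatches and untranslated or missing target text. It is an illustration under simplifying assumptions rather than a description of the actual TM HEALTH CHECK™ tooling; the function name and issue labels are hypothetical, and numbers are compared as plain digit groups so that locale differences such as "1.5" versus "1,5" are not flagged.

```python
import re

DIGIT_GROUP = re.compile(r"\d+")


def check_pair(source, target):
    """Return a list of issue labels for one (source, target) segment pair."""
    issues = []
    if not target.strip():
        issues.append("missing target")
    elif target.strip() == source.strip():
        # Identical text may be legitimate (e.g. product names), so only flag it for review.
        issues.append("possibly untranslated")
    elif sorted(DIGIT_GROUP.findall(source)) != sorted(DIGIT_GROUP.findall(target)):
        issues.append("number mismatch")
    return issues


if __name__ == "__main__":
    samples = [
        ("Tighten 4 screws.", "4 Schrauben anziehen."),
        ("Use 12 V only.", "Nur 21 V verwenden."),
        ("OK", "OK"),
        ("Done.", "   "),
    ]
    for src, tgt in samples:
        print(repr(src), repr(tgt), check_pair(src, tgt))
```

Flagged segments are candidates for human review, not automatic corrections; a reviewer still decides whether each one is a real error.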

BENEFITS of up-to-date translation memories

The benefits are straightforward and clear:

  1. Reduced translation costs
  2. Reduced translation time
  3. Brand and content consistency on international markets across all languages
  4. Improved SEO and online visibility
  5. Error-free content that ensures less potential damage control in future communication with company clients (high-quality brochures and manuals, easily comprehensible content for end-users ensuring correct application)

There's no time to waste when it comes to maintenance of your company's valuable TM resources.

Contact us TODAY to get your TMs checked and updated!

Language facts: Arabic

Sep 3, 2015

Arabic belongs to the Afroasiatic language family and is a Semitic language of the Arabo-Canaanite subgroup – therefore closely related to Hebrew and Phoenician.

With approximately 290 million speakers (of Modern Standard Arabic), it ranks in sixth place among the world’s major languages. In today's form, Modern Standard Arabic happens to be an official language of 27 states. Only English and French score higher. As the Arabic world is very large, it is not surprising that a large number of Arabic dialects have developed – counting all these, the number of Arabic speakers rises to an estimated 420 million. Arabic is the language of the Holy Quran, poetry and literature as well as an official UN language. As a liturgical language of Islam, it is used by an astonishing 1.6 billion Muslims.

Complicated language of complicated society

Arabic is a so-called sociolinguistic language, which means that from a purely linguistic view it's actually a group of related languages. For cultural (e.g. religious) or socioeconomic reasons it is nevertheless considered one language, even though some branches of Arabic are mutually unintelligible. Arabic can be sub-classified as follows: Classical Arabic, Modern Standard Arabic and Colloquial Arabic. Obviously there's also a large number of dialects.

Classical Arabic (or Quranic Arabic) is used as the language of prayer and recitation throughout the Islamic world. Modern Standard Arabic, a standardized version of the language, is intelligible to dialect speakers but quite distinct from the spoken Arabic dialects (which have no observable boundaries or rules). The official standardized form of Arabic actually co-exists in common usage with the various Arabic dialects, each covering different social situations.

The language of culture

Because of Muslim expansion in the past, Arabic has influenced a lot of the world's languages, including Indian languages such as Urdu (which is in fact a Muslim-influenced version of Hindi – which was itself previously influenced by Arabic), Punjabi or Bengali. Romance languages, mainly Spanish, Catalan and Portuguese, also borrowed many expressions from Arabic in the Middle Ages, when the Muslim world represented the cultural and scientific drive in then decimated Europe.


The Arabic alphabet has twenty-eight (28) letters. Arabic differs from languages written in the Latin script in that it is written right to left, but sequences of digits, such as telephone numbers, read from left to right.
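
For anyone handling Arabic text in software, this mixed direction is reflected in the Unicode character properties: letters and digits carry different bidirectional classes, which rendering engines use to lay out each run. The short sketch below simply looks these classes up with Python's standard `unicodedata` module; the helper name `bidi_classes` is made up for this example.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, Unicode bidirectional class) pairs, skipping whitespace."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text if not ch.isspace()]


if __name__ == "__main__":
    # Arabic letter beh, a European digit, an Arabic-Indic digit, a Latin letter
    for ch, cls in bidi_classes("\u0628 5 \u0665 a"):
        print(f"U+{ord(ch):04X} {cls}")
```

The Arabic letter is reported as `AL` (right-to-left Arabic letter), while the digits are reported as `EN` (European number) and `AN` (Arabic number), the classes that are laid out left to right within right-to-left text.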

أ ب ت ث ج ح خ د ذ ر ز س ش ص ض ط ظ ع غ ف ق ك ل م ن ه و ي