This is Info file gettext.info, produced by Makeinfo version 1.68 from the input file gettext.texi. INFO-DIR-SECTION GNU Gettext Utilities START-INFO-DIR-ENTRY * Gettext: (gettext). GNU gettext utilities. * gettextize: (gettext)gettextize Invocation. Prepare a package for gettext. * msgfmt: (gettext)msgfmt Invocation. Make MO files out of PO files. * msgmerge: (gettext)msgmerge Invocation. Update two PO files into one. * xgettext: (gettext)xgettext Invocation. Extract strings into a PO file. END-INFO-DIR-ENTRY This file provides documentation for GNU `gettext' utilities. It also serves as a reference for the free Translation Project. Copyright (C) 1995, 1996, 1997 Free Software Foundation, Inc. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Foundation.  
File: gettext.info, Node: Top, Next: Introduction, Prev: (dir), Up: (dir) GNU `gettext' utilities *********************** * Menu: * Introduction:: Introduction * Basics:: PO Files and PO Mode Basics * Sources:: Preparing Program Sources * Initial:: Making the Initial PO File * Updating:: Updating Existing PO Files * Binaries:: Producing Binary MO Files * Users:: The User's View * Programmers:: The Programmer's View * Translators:: The Translator's View * Maintainers:: The Maintainer's View * Conclusion:: Concluding Remarks * Country Codes:: ISO 639 country codes -- The Detailed Node Listing -- Introduction * Why:: The Purpose of GNU `gettext' * Concepts:: I18n, L10n, and Such * Aspects:: Aspects in Native Language Support * Files:: Files Conveying Translations * Overview:: Overview of GNU `gettext' PO Files and PO Mode Basics * Installation:: Completing GNU `gettext' Installation * PO Files:: The Format of PO Files * Main PO Commands:: Main Commands * Entry Positioning:: Entry Positioning * Normalizing:: Normalizing Strings in Entries Preparing Program Sources * Triggering:: Triggering `gettext' Operations * Mark Keywords:: How Marks Appears in Sources * Marking:: Marking Translatable Strings * c-format:: Telling something about the following string * Special cases:: Special Cases of Translatable Strings Making the Initial PO File * xgettext Invocation:: Invoking the `xgettext' Program * C Sources Context:: C Sources Context * Compendium:: Using Translation Compendiums Updating Existing PO Files * msgmerge Invocation:: Invoking the `msgmerge' Program * Translated Entries:: * Fuzzy Entries:: Fuzzy translated Entries * Untranslated Entries:: Untranslated Entries * Obsolete Entries:: Obsolete Entries * Modifying Translations:: Modifying Translations * Modifying Comments:: Modifying Comments * Auxiliary:: Consulting Auxiliary PO Files Producing Binary MO Files * msgfmt Invocation:: Invoking the `msgfmt' Program * MO Files:: The Format of GNU MO Files The User's View * 
Matrix:: The Current `ABOUT-NLS' Matrix * Installers:: Magic for Installers * End Users:: Magic for End Users The Programmer's View * catgets:: About `catgets' * gettext:: About `gettext' * Comparison:: Comparing the two interfaces * Using libintl.a:: Using libintl.a in own programs * gettext grok:: Being a `gettext' grok * Temp Programmers:: Temporary Notes for the Programmers Chapter About `catgets' * Interface to catgets:: The interface * Problems with catgets:: Problems with the `catgets' interface?! About `gettext' * Interface to gettext:: The interface * Ambiguities:: Solving ambiguities * Locating Catalogs:: Locating message catalog files * Optimized gettext:: Optimization of the *gettext functions Temporary Notes for the Programmers Chapter * Temp Implementations:: Temporary - Two Possible Implementations * Temp catgets:: Temporary - About `catgets' * Temp WSI:: Temporary - Why a single implementation * Temp Notes:: Temporary - Notes The Translator's View * Trans Intro 0:: Introduction 0 * Trans Intro 1:: Introduction 1 * Discussions:: Discussions * Organization:: Organization * Information Flow:: Information Flow Organization * Central Coordination:: Central Coordination * National Teams:: National Teams * Mailing Lists:: Mailing Lists National Teams * Sub-Cultures:: Sub-Cultures * Organizational Ideas:: Organizational Ideas The Maintainer's View * Flat and Non-Flat:: Flat or Non-Flat Directory Structures * Prerequisites:: Prerequisite Works * gettextize Invocation:: Invoking the `gettextize' Program * Adjusting Files:: Files You Must Create or Alter Files You Must Create or Alter * po/POTFILES.in:: `POTFILES.in' in `po/' * configure.in:: `configure.in' at top level * aclocal:: `aclocal.m4' at top level * acconfig:: `acconfig.h' at top level * Makefile:: `Makefile.in' at top level * src/Makefile:: `Makefile.in' in `src/' Concluding Remarks * History:: History of GNU `gettext' * References:: Related Readings  File: gettext.info, Node: Introduction, Next: 
Basics, Prev: Top, Up: Top

Introduction
************

This manual is still in *DRAFT* state. Some sections are still empty, or almost. We keep merging material from other sources (essentially e-mail folders) while the proper integration of this material is delayed.

In this manual, we use *he* when speaking of the programmer or maintainer, *she* when speaking of the translator, and *they* when speaking of the installers or end users of the translated program. This is only a convenience for clarifying the documentation. It is *absolutely* not meant to imply that some roles are more appropriate to males or females. Besides, as you might guess, GNU `gettext' is meant to be useful for people using computers, whatever their sex, race, religion or nationality!

This chapter explains the goals sought in the creation of GNU `gettext' and the free Translation Project. Then, it explains a few broad concepts around Native Language Support, and positions message translation with regard to other aspects of national and cultural variance, as they apply to programs. It also surveys those files used to convey the translations. It explains how the various tools interact in the initial generation of these files, and later, how the maintenance cycle should usually operate.

Please send suggestions and corrections to:

     Internet address:
         bug-gnu-utils@prep.ai.mit.edu

Please include the manual's edition number and update date in your messages.

* Menu:

* Why::                         The Purpose of GNU `gettext'
* Concepts::                    I18n, L10n, and Such
* Aspects::                     Aspects in Native Language Support
* Files::                       Files Conveying Translations
* Overview::                    Overview of GNU `gettext'

File: gettext.info,  Node: Why,  Next: Concepts,  Prev: Introduction,  Up: Introduction

The Purpose of GNU `gettext'
============================

Usually, programs are written and documented in English, and use English at execution time to interact with users.
This is true not only of GNU software, but also of a great deal of commercial and free software. Using a common language is quite handy for communication between developers, maintainers and users from all countries. On the other hand, most people are less comfortable with English than with their own native language, and would prefer to use their mother tongue for day-to-day work, as far as possible. Many would simply *love* to see their computer screen showing a lot less English, and far more of their own language.

However, to many people, this dream might appear so far-fetched that they may believe it is not even worth spending time thinking about it. They have no confidence at all that the dream might ever become true. Yet some have not lost hope, and have organized themselves. The Translation Project is a formalization of this hope into a workable structure, which has a good chance to get all of us nearer the achievement of a truly multi-lingual set of programs.

GNU `gettext' is an important step for the Translation Project, as it is an asset on which we may build many other steps. This package offers to programmers, translators and even users, a well integrated set of tools and documentation. Specifically, the GNU `gettext' utilities are a set of tools that provides a framework within which other free packages may produce multi-lingual messages. These tools include a set of conventions about how programs should be written to support message catalogs, a directory and file naming organization for the message catalogs themselves, a runtime library supporting the retrieval of translated messages, and a few stand-alone programs to massage in various ways the sets of translatable strings, or already translated strings. A special mode for GNU Emacs also helps ease interested parties into preparing these sets, or bringing them up to date.

GNU `gettext' is designed to minimize the impact of internationalization on program sources, keeping this impact as small and hardly noticeable as possible. Internationalization has better chances of succeeding if it is very lightweight, or at least appears to be so, when looking at program sources.

The Translation Project also uses the GNU `gettext' distribution as a vehicle for documenting its structure and methods. This goes beyond the strict technicalities of documenting GNU `gettext' proper. By so doing, translators will find in a single place, as far as possible, all they need to know for properly doing their translating work. This supplemental documentation might also help programmers, and even curious users, in understanding how GNU `gettext' is related to the remainder of the Translation Project, and consequently, have a glimpse at the *big picture*.

File: gettext.info,  Node: Concepts,  Next: Aspects,  Prev: Why,  Up: Introduction

I18n, L10n, and Such
====================

Two long words appear all the time when we discuss support of native language in programs, and these words have a precise meaning, worth being explained here, once and for all in this document. The words are *internationalization* and *localization*. Many people, tired of writing these long words over and over again, took the habit of writing "i18n" and "l10n" instead, quoting the first and last letter of each word, and replacing the run of intermediate letters by a number merely telling how many such letters there are. But in this manual, for the sake of clarity, we will patiently write the names in full, each time...

By "internationalization", one refers to the operation by which a program, or a set of programs turned into a package, is made aware of and able to support multiple languages.
This is a generalization process, by which the programs are untied from calling only English strings or other English specific habits, and connected to generic ways of doing the same, instead. Program developers may use various techniques to internationalize their programs. Some of these have been standardized. GNU `gettext' offers one of these standards. *Note Programmers::.

By "localization", one means the operation by which, in a set of programs already internationalized, one gives the program all needed information so that it can adapt itself to handle its input and output in a fashion which is correct for some native language and cultural habits. This is a particularisation process, by which generic methods already implemented in an internationalized program are used in specific ways. The programming environment puts several functions at the programmer's disposal which allow this runtime configuration. The formal description of a specific set of cultural habits for some country, together with all associated translations targeted to the same native language, is called the "locale" for this language or country. Users achieve localization of programs by setting proper values to special environment variables, prior to executing those programs, identifying which locale should be used.

In fact, locale message support is only one component of the cultural data that makes up a particular locale. There are a whole host of routines and functions provided to aid programmers in developing internationalized software and which allow them to access the data stored in a particular locale. When someone presently refers to a particular locale, they are obviously referring to the data stored within that particular locale. Similarly, if a programmer is referring to "accessing the locale routines", they are referring to the complete suite of routines that access all of the locale's information.
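
As a minimal sketch of the runtime configuration described above, and assuming only an ISO C library, a program can ask the C library to pick up the locale named by the user's environment variables. `setlocale' is the standard function; the helper name `activate_user_locale' is hypothetical and introduced only for this illustration.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: select the locale named by the user's
   environment variables and return its name.  Falls back to the
   portable "C" locale when the requested one is not installed. */
const char *
activate_user_locale (void)
{
  /* An empty string asks the C library to consult the environment. */
  const char *name = setlocale (LC_ALL, "");

  if (name == NULL)
    name = setlocale (LC_ALL, "C");   /* the "C" locale always exists */

  assert (name != NULL);
  return name;
}
```

A program would typically call such a helper once, at the very start of `main', before producing any output.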

One uses the expression "Native Language Support", or merely NLS, for speaking of the overall activity or feature encompassing both internationalization and localization, allowing for multi-lingual interactions in a program. In a nutshell, one could say that internationalization is the operation by which further localizations are made possible.

Also, very roughly said, when it comes to multi-lingual messages, internationalization is usually taken care of by programmers, and localization is usually taken care of by translators.

File: gettext.info,  Node: Aspects,  Next: Files,  Prev: Concepts,  Up: Introduction

Aspects in Native Language Support
==================================

For a totally multi-lingual distribution, there are many things to translate beyond output messages.

   * As of today, GNU `gettext' offers a complete toolset for translating messages output by C programs. Perl scripts and shell scripts will also need to be translated. Even if there are today some hooks by which this can be done, these hooks are not integrated as well as they should be.

   * Some programs, like `autoconf' or `bison', are able to produce other programs (or scripts). Even if the generating programs themselves are internationalized, the generated programs they produce may need internationalization on their own, and this indirect internationalization could be automated right from the generating program. In fact, quite usually, generating and generated programs could be internationalized independently, as the effort needed is fairly orthogonal.

   * A few programs include textual tables which might need translation themselves, independently of the strings contained in the program itself. For example, RFC 1345 gives an English description for each character which GNU `recode' is able to reconstruct at execution. Since these descriptions are extracted from the RFC by mechanical means, translating them properly would require a prior translation of the RFC itself.

   * Almost all programs accept options, which are often worded out so as to be descriptive for English readers; one might want to consider offering translated versions for program options as well.

   * Many programs read, interpret, compile, or are somewhat driven by input files which are texts containing keywords, identifiers, or replies which are inherently translatable. For example, one may want `gcc' to allow diacriticized characters in identifiers or use translated keywords; `rm -i' might accept something else than `y' or `n' for replies, etc. Even if the program will eventually make most of its output in the foreign languages, one has to decide whether the input syntax, option values, etc., are to be localized or not.

   * The manual accompanying a package, as well as all documentation files in the distribution, could surely be translated, too. Translating a manual, with the intent of later keeping up with updates, is a major undertaking in itself, generally.

As we already stressed, translation is only one aspect of locales. Other internationalization aspects are not currently handled by GNU `gettext', but perhaps may be handled in future versions.

There are many attributes that are needed to define a country's cultural conventions. These attributes include, besides the country's native language, the formatting of the date and time, the representation of numbers, the symbols for currency, etc. These local "rules" are termed the country's locale. The locale represents the knowledge needed to support the country's native attributes.

There are a few major areas which may vary between countries and hence, define what a locale must describe. The following list helps put multi-lingual messages into the proper context of other tasks related to locales, and also presents some other areas which GNU `gettext' might eventually tackle, maybe, one of these days.

*Characters and Codesets*
     The codeset most commonly used throughout the USA and most English speaking parts of the world is the ASCII codeset. However, there are many characters needed by various locales that are not found within this codeset. The 8-bit ISO 8859-1 code set has most of the special characters needed to handle the major European languages. However, in many cases, the ISO 8859-1 font is not adequate. Hence each locale will need to specify which codeset they need to use and will need to have the appropriate character handling routines to cope with the codeset.

*Currency*
     The symbols used vary from country to country as does the position used by the symbol. Software needs to be able to transparently display currency figures in the native mode for each locale.

*Dates*
     The format of date varies between locales. For example, Christmas day in 1994 is written as 12/25/94 in the USA and as 25/12/94 in Australia. Other countries might use ISO 8601 dates, etc.

     Time of the day may be noted as HH:MM, HH.MM, or otherwise. Some locales require time to be specified in 24-hour mode rather than as AM or PM. Further, the nature and yearly extent of the Daylight Saving correction vary widely between countries.

*Numbers*
     Numbers can be represented differently in different locales. For example, the following numbers are all written correctly for their respective locales:

          12,345.67       English
          12.345,67       French
          1,2345.67       Asia

     Some programs could go further and use different unit systems, like English units or Metric units, or even take into account variants about how numbers are spelled in full.

*Messages*
     The most obvious area is the language support within a locale. This is where GNU `gettext' provides the means for developers and users to easily change the language that the software uses to communicate to the user.

In the near future we see no chance that components of locale outside of message handling will be made available for use in other packages.
The reason for this is that most modern systems provide a more or less reasonable support for at least some of the missing components. Another point is that the GNU `libc' and Linux will get a new and complete implementation of the whole locale functionality which could be adopted by systems lacking a reasonable locale support.

File: gettext.info,  Node: Files,  Next: Overview,  Prev: Aspects,  Up: Introduction

Files Conveying Translations
============================

The letters PO in `.po' files mean Portable Object, to distinguish them from `.mo' files, where MO stands for Machine Object. This paradigm, as well as the PO file format, is inspired by the NLS standard developed by Uniforum, and implemented by Sun in their Solaris system.

PO files are meant to be read and edited by humans, and associate each original, translatable string of a given package with its translation in a particular target language. A single PO file is dedicated to a single target language. If a package supports many languages, there is one such PO file per language supported, and each package has its own set of PO files. These PO files are best created by the `xgettext' program, and later updated or refreshed through the `msgmerge' program. Program `xgettext' extracts all marked messages from a set of C files and initializes a PO file with empty translations. Program `msgmerge' takes care of adjusting PO files between releases of the corresponding sources, commenting obsolete entries, initializing new ones, and updating all source line references. Files ending with `.pot' are a kind of base translation file found in distributions, in PO file format, and `.pox' files are often temporary PO files.

MO files are meant to be read by programs, and are binary in nature. A few systems already offer tools for creating and handling MO files as part of the Native Language Support coming with the system, but the format of these MO files is often different from system to system, and non-portable.
They do not necessarily use `.mo' for file extensions, but since system libraries are also used for accessing these files, it works as long as the system is self-consistent about it. If GNU `gettext' is able to interface with the tools already provided with systems, it will consequently let these provided tools take care of generating the MO files. Or else, if such tools are not found or do not seem usable, GNU `gettext' will use its own ways and its own format for MO files. Files ending with `.gmo' are really MO files, when it is known that these files use the GNU format.

File: gettext.info,  Node: Overview,  Prev: Files,  Up: Introduction

Overview of GNU `gettext'
=========================

The following diagram summarizes the relation between the files handled by GNU `gettext' and the tools acting on these files. It is followed by somewhat detailed explanations, which you should read while keeping an eye on the diagram. Having a clear understanding of these interrelations would surely help programmers, translators and maintainers.

     Original C Sources ---> PO mode ---> Marked C Sources ---.
                                                              |
                   .---------<--- GNU gettext Library         |
     .--- make <---+                                          |
     |             `---------<--------------------+-----------'
     |                                            |
     |   .-----<--- PACKAGE.pot <--- xgettext <---'   .---<--- PO Compendium
     |   |                                            |             ^
     |   |                                            `---.         |
     |   `---.                                            +---> PO mode ---.
     |       +----> msgmerge ------> LANG.pox --->--------'                |
     |   .---'                                                             |
     |   |                                                                 |
     |   `-------------<---------------.                                   |
     |                                 +--- LANG.po <--- New LANG.pox <----'
     |   .--- LANG.gmo <--- msgfmt <---'
     |   |
     |   `---> install ---> /.../LANG/PACKAGE.mo ---.
     |                                              +---> "Hello world!"
     `-------> install ---> /.../bin/PROGRAM -------'

The indication `PO mode' appears in two places in this picture, and you may safely read it as merely meaning "hand editing", using any editor of your choice, really. However, for those of you being the lucky users of GNU Emacs, PO mode has been specifically created for providing a cozy environment for editing or modifying PO files.
While editing a PO file, PO mode allows for the easy browsing of auxiliary and compendium PO files, as well as for following references into the set of C program sources from which PO files have been derived. It has a few special features, among which are the interactive marking of program strings as translatable, and the validation of PO files with easy repositioning to PO file lines showing errors.

As a programmer, the first step to bringing GNU `gettext' into your package is identifying, right in the C sources, those strings which are meant to be translatable, and those which are untranslatable. This tedious job can be done a little more comfortably using emacs PO mode, but you can use any means familiar to you for modifying your C sources. Besides this, some other simple, standard changes are needed to properly initialize the translation library. *Note Sources::, for more information about all this.

For newly written software the strings of course can and should be marked while writing it. The `gettext' approach makes this very easy. Simply put the following lines at the beginning of each file or in a central header file:

     #define _(String) (String)
     #define N_(String) (String)
     #define textdomain(Domain)
     #define bindtextdomain(Package, Directory)

Doing this allows you to prepare the sources for internationalization. Later when you feel ready for the step to use the `gettext' library simply remove these definitions, include `libintl.h' and link against `libintl.a'. That is all you have to change.

Once the C sources have been modified, the `xgettext' program is used to find and extract all translatable strings, and create an initial PO file out of all these. This `PACKAGE.pot' file contains all original program strings. It has sets of pointers to exactly where in C sources each string is used. All translations are set to empty. The letter `t' in `.pot' marks this as a Template PO file, not yet oriented towards any particular language.
*Note xgettext Invocation::, for more details about how one calls the `xgettext' program. If you are *really* lazy, you might be interested in working a lot more right away, and preparing the whole distribution setup (*note Maintainers::.). By doing so, you spare yourself typing the `xgettext' command, as `make' should now generate the proper things automatically for you!

The first time through, there is no `LANG.po' yet, so the `msgmerge' step may be skipped and replaced by a mere copy of `PACKAGE.pot' to `LANG.pox', where LANG represents the target language.

Then comes the initial translation of messages. Translation in itself is a whole matter, still exclusively meant for humans, and whose complexity far overwhelms the level of this manual. Nevertheless, a few hints are given in some other chapter of this manual (*note Translators::.). You will also find there indications about how to contact translating teams, or becoming part of them, for sharing your translating concerns with others who target the same native language.

While adding the translated messages into the `LANG.pox' PO file, if you do not have GNU Emacs handy, you are on your own for ensuring that your efforts fully respect the PO file format, and quoting conventions (*note PO Files::.). This is surely not an impossible task, as this is the way many people have handled PO files already for Uniforum or Solaris.

On the other hand, by using PO mode in GNU Emacs, most details of PO file format are taken care of for you, but you have to acquire some familiarity with PO mode itself. Besides main PO mode commands (*note Main PO Commands::.), you should know how to move between entries (*note Entry Positioning::.), and how to handle untranslated entries (*note Untranslated Entries::.).

If some common translations have already been saved into a compendium PO file, translators may use PO mode for initializing untranslated entries from the compendium, and also save selected translations into the compendium, updating it (*note Compendium::.). Compendium files are meant to be exchanged between members of a given translation team.

Programs, or packages of programs, are dynamic in nature: users write bug reports and suggestions for improvements, maintainers react by modifying programs in various ways. The fact that a package has already been internationalized should not make maintainers shy of adding new strings, or modifying strings already translated. They just do their job the best they can. For the Translation Project to work smoothly, it is important that maintainers do not carry translation concerns on their already loaded shoulders, and that translators be kept as free as possible of programmatic concerns. The only concern maintainers should have is carefully marking new strings as translatable, when they should be; they need not otherwise worry about them being translated, as this will come in proper time.

Consequently, when programs and their strings are adjusted in various ways by maintainers, and for matters usually unrelated to translation, `xgettext' would construct `PACKAGE.pot' files which are evolving over time, so the translations carried by `LANG.po' are slowly fading out of date.

It is important for translators (and even maintainers) to understand that package translation is a continuous process in the lifetime of a package, and not something which is done once and for all at the start. After an initial burst of translation activity for a given package, interventions are needed once in a while, because here and there, translated entries become obsolete, and new untranslated entries appear, needing translation.

The `msgmerge' program has the purpose of refreshing an already existing `LANG.po' file, by comparing it with a newer `PACKAGE.pot' template file, extracted by `xgettext' out of recent C sources. The refreshing operation adjusts all references to C source locations for strings, since these strings move as programs are modified. Also, `msgmerge' comments out as obsolete, in `LANG.pox', those already translated entries which are no longer used in the program sources (*note Obsolete Entries::.). It finally discovers new strings and inserts them in the resulting PO file as untranslated entries (*note Untranslated Entries::.). *Note msgmerge Invocation::, for more information about what `msgmerge' really does.

Whatever route or means taken, the goal is to obtain an updated `LANG.pox' file offering translations for all strings. When this is properly achieved, this file `LANG.pox' may take the place of the previous official `LANG.po' file.

The temporal mobility, or fluidity of PO files, is an integral part of the translation game, and should be well understood, and accepted. People resisting it will have a hard time participating in the Translation Project, or will give a hard time to other participants! In particular, maintainers should relax and include all available official PO files in their distributions, even if these have not recently been updated, without banging or otherwise trying to exert pressure on the translator teams to get the job done. The pressure should rather come from the community of users speaking a particular language, and maintainers should consider themselves fairly relieved of any concern about the adequacy of translation files. On the other hand, translators should reasonably try updating the PO files they are responsible for, while the package is undergoing pretest, prior to an official distribution.
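
The refresh step that `msgmerge' performs can be pictured with a small sketch. This is not the real `msgmerge' code: it assumes exact string matching only (the real program also performs a fuzzy search), it ignores comments and source references, and the names `struct entry', `refresh' and `count_obsolete' are hypothetical, introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory entry; real PO entries also carry comments. */
struct entry
{
  const char *msgid;
  const char *msgstr;           /* "" means untranslated */
};

/* Return the translation stored for MSGID in LIST[0..N-1], or NULL. */
static const char *
lookup (const struct entry *list, size_t n, const char *msgid)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (list[i].msgid, msgid) == 0)
      return list[i].msgstr;
  return NULL;
}

/* For each msgid of the new template TMPL, carry over the old
   translation when there is one, else leave the entry untranslated.
   Returns the number of new, untranslated entries. */
size_t
refresh (const struct entry *old, size_t n_old,
         struct entry *tmpl, size_t n_tmpl)
{
  size_t fresh = 0;

  assert (tmpl != NULL || n_tmpl == 0);
  for (size_t i = 0; i < n_tmpl; i++)
    {
      const char *translation = lookup (old, n_old, tmpl[i].msgid);

      if (translation != NULL)
        tmpl[i].msgstr = translation;
      else
        {
          tmpl[i].msgstr = "";
          fresh++;
        }
    }
  return fresh;
}

/* Count the entries of OLD whose msgid no longer occurs in TMPL;
   these are the ones a merge would mark as obsolete. */
size_t
count_obsolete (const struct entry *old, size_t n_old,
                const struct entry *tmpl, size_t n_tmpl)
{
  size_t obsolete = 0;

  for (size_t i = 0; i < n_old; i++)
    if (lookup (tmpl, n_tmpl, old[i].msgid) == NULL)
      obsolete++;
  return obsolete;
}
```

With an old file translating "Hello" and "Bye", and a new template holding "Hello" and "Thanks", this sketch keeps the translation of "Hello", reports one new untranslated entry and one obsolete entry.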

Once the PO file is complete and dependable, the `msgfmt' program is used for turning the PO file into a machine-oriented format, which may yield efficient retrieval of translations by the programs of the package, whenever needed at runtime (*note MO Files::.). *Note msgfmt Invocation::, for more information about all modalities of execution for the `msgfmt' program.

Finally, the modified and marked C sources are compiled and linked with the GNU `gettext' library, usually through the operation of `make', given a suitable `Makefile' exists for the project, and the resulting executable is installed somewhere users will find it. The MO files themselves should also be properly installed. Given the appropriate environment variables are set (*note End Users::.), the program should localize itself automatically, whenever it executes.

The remainder of this manual has the purpose of explaining in depth the various steps outlined above.

File: gettext.info,  Node: Basics,  Next: Sources,  Prev: Introduction,  Up: Top

PO Files and PO Mode Basics
***************************

The GNU `gettext' toolset helps programmers and translators in producing, updating and using translation files, mainly those PO files which are textual, editable files. This chapter stresses the format of PO files, and contains a PO mode starter. PO mode description is spread throughout this manual instead of being concentrated in one place. Here we present only the basics of PO mode.

* Menu:

* Installation::                Completing GNU `gettext' Installation
* PO Files::                    The Format of PO Files
* Main PO Commands::            Main Commands
* Entry Positioning::           Entry Positioning
* Normalizing::                 Normalizing Strings in Entries

File: gettext.info,  Node: Installation,  Next: PO Files,  Prev: Basics,  Up: Basics

Completing GNU `gettext' Installation
=====================================

Once you have received, unpacked, configured and compiled the GNU `gettext' distribution, the `make install' command puts in place the programs `xgettext', `msgfmt', `gettext', and `msgmerge', as well as their available message catalogs. To top off a comfortable installation, you might also want to make the PO mode available to your GNU Emacs users.

During the installation of the PO mode, you might want to modify your file `.emacs', once and for all, so it contains a few lines looking like:

     (setq auto-mode-alist
           (cons '("\\.po[tx]?\\'\\|\\.po\\." . po-mode) auto-mode-alist))
     (autoload 'po-mode "po-mode")

Later, whenever you edit some `.po', `.pot' or `.pox' file, or any file having the string `.po.' within its name, Emacs loads `po-mode.elc' (or `po-mode.el') as needed, and automatically activates PO mode commands for the associated buffer. The string *PO* appears in the mode line for any buffer for which PO mode is active. Many PO files may be active at once in a single Emacs session.

If you are using Emacs version 20 or better, and have already installed the appropriate international fonts on your system, you may also arrange for these fonts to be automatically loaded and used for displaying the translations on your Emacs screen, whenever necessary. For this to happen, you might want to add the lines:

     (autoload 'po-find-file-coding-system "po-mode")
     (modify-coding-system-alist 'file "\\.po[tx]?\\'\\|\\.po\\."
                                 'po-find-file-coding-system)

to your `.emacs' file.
File: gettext.info, Node: PO Files, Next: Main PO Commands, Prev: Installation, Up: Basics

The Format of PO Files
======================

A PO file is made up of many entries, each entry holding the relation between an original untranslated string and its corresponding translation. All entries in a given PO file usually pertain to a single project, and all translations are expressed in a single target language. One PO file "entry" has the following schematic structure:

     WHITE-SPACE
     #  TRANSLATOR-COMMENTS
     #. AUTOMATIC-COMMENTS
     #: REFERENCE...
     #, FLAG...
     msgid UNTRANSLATED-STRING
     msgstr TRANSLATED-STRING

The general structure of a PO file should be well understood by the translator. When using PO mode, very little has to be known about the format details, as PO mode takes care of them for her.

Entries begin with some optional white space. Usually, when generated through GNU `gettext' tools, there is exactly one blank line between entries. Then comments follow, on lines all starting with the character `#'. There are two kinds of comments: those with some white space immediately following the `#', which are created and maintained exclusively by the translator, and those with some non-white character just after the `#', which are created and maintained automatically by GNU `gettext' tools. All comments, of either kind, are optional.

After white space and comments, entries show two strings, giving first the untranslated string as it appears in the original program sources, and then the translation of this string. The original string is introduced by the keyword `msgid', and the translation by `msgstr'. The two strings, untranslated and translated, are quoted in various ways in the PO file, using `"' delimiters and `\' escapes, but the translator does not really have to pay attention to the precise quoting format, as PO mode fully intends to take care of quoting for her.

The `msgid' strings, as well as automatic comments, are produced and managed by other GNU `gettext' tools, and PO mode does not provide means for the translator to alter these. The most she can do is delete them, and only by deleting the whole entry. On the other hand, the `msgstr' string, as well as translator comments, are really meant for the translator, and PO mode gives her the full control she needs.

The comment lines beginning with `#,' are special because they are not completely ignored by the programs as comments generally are. The comma-separated list of FLAGs is used by the `msgfmt' program to give the user better diagnostic messages. Currently there are two forms of flags defined:

`fuzzy'
     This flag can be generated by the `msgmerge' program or it can be inserted by the translator herself. It shows that the `msgstr' string might not be a correct translation (anymore). Only the translator can judge if the translation requires further modification, or is acceptable as is. Once satisfied with the translation, she then removes this `fuzzy' attribute. The `msgmerge' program inserts this flag when it has combined the `msgid' and `msgstr' entries after fuzzy search only. *Note Fuzzy Entries::.

`c-format'
`no-c-format'
     These flags should not be added by a human. Instead only the `xgettext' program adds them. In an automated PO file processing system as proposed here, user changes would be thrown away again as soon as the `xgettext' program generates a new template file.

     In case the `c-format' flag is given for a string, the `msgfmt' program does some more tests to check the validity of the translation. *Note msgfmt Invocation::.

It happens that some lines, usually whitespace or comments, follow the very last entry of a PO file. Such lines are not part of any entry, and PO mode is unable to take action on those lines. By using the PO mode function `M-x po-normalize', the translator may get rid of those spurious lines. *Note Normalizing::.

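The comment kinds and the `#,' flag list described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the way any GNU `gettext' tool is implemented: the function names and the category labels are hypothetical, a bare `#' with nothing after it is assumed to be a translator comment, and any marker other than `.', `:' and `,' is lumped into a generic automatic category.

```python
def classify_comment(line):
    """Classify one PO comment line by the character following '#'."""
    if not line.startswith("#"):
        return "not-a-comment"
    marker = line[1:2]
    # White space after '#' means the translator owns the comment.
    # Treating a bare '#' the same way is an assumption of this sketch.
    if marker == "" or marker.isspace():
        return "translator"
    if marker == ".":
        return "automatic"
    if marker == ":":
        return "reference"
    if marker == ",":
        return "flags"
    # Any other non-white character: maintained by the tools.
    return "other-automatic"


def parse_flags(line):
    """Return the comma-separated flags of a '#,' line as a list."""
    if classify_comment(line) != "flags":
        return []
    return [flag.strip() for flag in line[2:].split(",") if flag.strip()]


if __name__ == "__main__":
    for sample in ["# checked with the author",
                   "#. extracted comment",
                   "#: src/main.c:42",
                   "#, fuzzy, c-format"]:
        print(classify_comment(sample), parse_flags(sample))
```

A tool that wanted to skip unfinished translations could, for instance, test whether `fuzzy' appears in the list returned by `parse_flags'.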
The remainder of this section may be safely skipped by those using PO mode, yet it may be interesting for everybody to have a better idea of the precise format of a PO file. On the other hand, those not having GNU Emacs handy should carefully continue reading on.

Each of UNTRANSLATED-STRING and TRANSLATED-STRING respects the C syntax for a character string, including the surrounding quotes and embedded backslashed escape sequences. When the time comes to write multi-line strings, one should not use escaped newlines. Instead, a closing quote should follow the last character on the line to be continued, and an opening quote should resume the string at the beginning of the following PO file line. For example:

     msgid ""
     "Here is an example of how one might continue a very long string\n"
     "for the common case the string represents multi-line output.\n"

In this example, the empty string is used on the first line, to allow better alignment of the `H' from the word `Here' over the `f' from the word `for'. The `msgid' keyword is followed by three strings, which are meant to be concatenated. Concatenating the empty string does not change the resulting overall string, but it is a way for us to comply with the necessity of `msgid' to be followed by a string on the same line, while keeping the multi-line presentation left-justified, as we find this to be a cleaner disposition. The empty string could have been omitted, but only if the string starting with `Here' was moved up to the first line, right after `msgid'.(1) It was not really necessary either to switch between the two last quoted strings immediately after the newline `\n'; the switch could have occurred after *any* other character. We just did it this way because it is neater.

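The concatenation rule illustrated by the example above can be sketched in Python. This is a minimal illustration, not the parser used by `msgfmt' or PO mode: the function names are hypothetical, and only the escapes `\n', `\t', `\"' and `\\' are decoded, which is a simplifying assumption (C syntax allows more).

```python
import re

# Escapes decoded by this sketch; any other escape is left verbatim.
_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}

# One quoted fragment: a quote, then ordinary characters or
# backslash pairs, then the closing quote.
_FRAGMENT = re.compile(r'"((?:[^"\\]|\\.)*)"')


def unescape(body):
    """Decode the small set of backslash escapes listed above."""
    return re.sub(r"\\(.)",
                  lambda m: _ESCAPES.get(m.group(1), m.group(0)),
                  body)


def join_fragments(po_lines):
    """Concatenate every quoted fragment found on the given lines."""
    parts = []
    for line in po_lines:
        for match in _FRAGMENT.finditer(line):
            parts.append(unescape(match.group(1)))
    return "".join(parts)


entry = [
    'msgid ""',
    r'"Here is an example of how one might continue a very long string\n"',
    r'"for the common case the string represents multi-line output.\n"',
]
text = join_fragments(entry)

if __name__ == "__main__":
    print(text, end="")
```

Because the fragments are simply joined, where the quotes are closed and reopened has no effect on the result: `join_fragments(['msgid "He"', '"llo"'])' and `join_fragments(['msgid "Hello"'])' give the same string.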
One should carefully distinguish between end of lines marked as `\n' *inside* quotes, which are part of the represented string, and end of lines in the PO file itself, outside string quotes, which have no effect on the represented string.

Outside strings, white lines and comments may be used freely. Comments start at the beginning of a line with `#' and extend until the end of the PO file line. Comments written by translators should have the initial `#' immediately followed by some white space. If the `#' is not immediately followed by white space, this comment is most likely generated and managed by specialized GNU tools, and might disappear or be replaced unexpectedly when the PO file is given to `msgmerge'.

   ---------- Footnotes ----------

   (1)  This limitation is not imposed by GNU `gettext', but comes from the `msgfmt' implementation on Solaris.

File: gettext.info, Node: Main PO Commands, Next: Entry Positioning, Prev: PO Files, Up: Basics

Main PO mode Commands
=====================

After setting up Emacs with something similar to the lines in *Note Installation::, PO mode is activated for a window when Emacs finds a PO file in that window. This makes the window read-only and establishes `po-mode-map'. PO mode is a genuine Emacs mode, not derived from text mode in any way. Functions found on `po-mode-hook', if any, will be executed.

When PO mode is active in a window, the letters `PO' appear in the mode line for that window. The mode line also displays how many entries of each kind are held in the PO file. For example, the string `132t+3f+10u+2o' would tell the translator that the PO file contains 132 translated entries (*note Translated Entries::.), 3 fuzzy entries (*note Fuzzy Entries::.), 10 untranslated entries (*note Untranslated Entries::.) and 2 obsolete entries (*note Obsolete Entries::.). Items with a zero count are not shown.
So, in this example, if the fuzzy entries were unfuzzied, the untranslated entries were translated and the obsolete entries were deleted, the mode line would merely display `145t' for the counters.

The main PO commands are those which do not fit into the other categories of subsequent sections. These allow for quitting PO mode or for managing windows in special ways.

`U'
     Undo last modification to the PO file.

`Q'
     Quit processing and save the PO file.

`q'
     Quit processing, possibly after confirmation.

`O'
     Temporarily leave the PO file window.

`?'
`h'
     Show help about PO mode.

`='
     Give some PO file statistics.

`V'
     Batch validate the format of the whole PO file.

The command `U' (`po-undo') interfaces to the GNU Emacs *undo* facility. *Note Undoing Changes: (emacs)Undo. Each time `U' is typed, modifications which the translator did to the PO file are undone a little more. For the purpose of undoing, each PO mode command is atomic. This is especially true for the `RET' command: the whole edit made by a single use of this command is undone at once, even if the edit itself involved several actions. However, while in the editing window, one can undo the editing work in much smaller steps.

The commands `Q' (`po-quit') and `q' (`po-confirm-and-quit') are used when the translator is done with the PO file. The former is a bit less verbose than the latter. If the file has been modified, it is saved to disk first. In both cases, and prior to all this, the commands check if some untranslated message remains in the PO file and, if so, the translator is asked if she really wants to stop working with this PO file. This is the preferred way of getting rid of an Emacs PO file buffer. Merely killing it through the usual command `C-x k' (`kill-buffer') is not the tidiest way to proceed.

The command `O' (`po-other-window') is another, softer way to leave PO mode temporarily. It just moves the cursor to some other Emacs window, and pops one up if necessary.
For example, if the translator just got PO mode to show some source context in some other window, she might discover some apparent bug in the program source that needs correction. This command allows the translator to switch roles, become a programmer, and have the cursor placed right in the window containing the program she wants to modify. By later getting the cursor back in the PO file window, or by asking Emacs to edit this file once again, PO mode is then recovered.

The command `h' (`po-help') displays a summary of all available PO mode commands. The translator should then type any character to resume normal PO mode operations. The command `?' has the same effect as `h'.

The command `=' (`po-statistics') computes the total number of entries in the PO file, the ordinal of the current entry (counted from 1), the number of untranslated entries, the number of obsolete entries, and displays all these numbers.

The command `V' (`po-validate') launches `msgfmt' in verbose mode over the current PO file. This command first offers to save the current PO file on disk. The `msgfmt' tool, from GNU `gettext', has the purpose of creating a MO file out of a PO file, and PO mode uses the features of this program for checking the overall format of a PO file, as well as all individual entries.

The program `msgfmt' runs asynchronously with Emacs, so the translator regains control immediately while her PO file is being studied. Error output is collected in the GNU Emacs `*compilation*' buffer, displayed in another window. The regular GNU Emacs command `C-x`' (`next-error'), as well as other usual compile commands, allow the translator to reposition quickly to the offending parts of the PO file. Once the cursor is on the line in error, the translator may decide on any PO mode action which would help correct the error.
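
As a small aside on the counters shown in the mode line, the way the counter string is put together can be sketched in Python. This is only an illustration of the format described in this node, not the Emacs Lisp code of PO mode: the function name `mode_line_counters' is hypothetical, and returning an empty string when every count is zero is an assumption of this sketch.

```python
def mode_line_counters(translated, fuzzy, untranslated, obsolete):
    """Build a counter string such as '132t+3f+10u+2o'.

    Items whose count is zero are left out, as in the mode line.
    """
    items = [(translated, "t"), (fuzzy, "f"),
             (untranslated, "u"), (obsolete, "o")]
    return "+".join(f"{count}{suffix}" for count, suffix in items if count)


if __name__ == "__main__":
    print(mode_line_counters(132, 3, 10, 2))  # 132t+3f+10u+2o
    # Fuzzy and untranslated entries become translated (132 + 3 + 10),
    # and the obsolete entries are deleted.
    print(mode_line_counters(145, 0, 0, 0))   # 145t
```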