Development/Tutorials/Localization/i18n Krazy - KDE TechBase ...
Abstract
There are small technical details of i18n that are not easy to keep in mind at all times, as well as a number of i18n recommendations to uphold during development. To help you with this, the Krazy code checker also looks for some frequently encountered i18n issues. This article documents these issues as reported by Krazy, for cases when you are not sure what the remedy should be.
Placeholders and Arguments
The i18n API is very strict about congruence between the %number placeholders in the message and the arguments actually supplied to substitute them. Effectively, the placeholders directly index the arguments, albeit one- rather than zero-based.
"wrong argument count, have num1 need num2"
Either some of the arguments have not been provided, or there is a stray placeholder inside the message string. A likely cause is forgetting that in KDE4, arguments are added as parameters to the i18n call itself, rather than appended via arg() methods as in KDE3:
    i18n("Found key: %1", key);       // correct
    i18n("Found key: %1").arg(key);   // ***wrong
"too many arguments, have num max 9"
i18n calls can take at most 9 arguments as parameters to the call. If more than that are needed, the ki18n*() series of calls must be used; see the documentation of the KLocalizedString class. Calls with more than 9 parameters are extremely rare, though.
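The argument-count rule can be pictured with a small standalone sketch. This is not Krazy's actual implementation; the helper names `placeholderNumbers` and `argumentCountMatches` are hypothetical, and the code only assumes that placeholders have the form `%` followed by digits.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Collects the distinct placeholder numbers (%1, %2, ...) used in a message.
std::set<int> placeholderNumbers(const std::string& message) {
    std::set<int> numbers;
    for (std::size_t i = 0; i + 1 < message.size(); ++i) {
        if (message[i] != '%') continue;
        std::size_t j = i + 1;
        int value = 0;
        while (j < message.size() &&
               std::isdigit(static_cast<unsigned char>(message[j]))) {
            value = value * 10 + (message[j] - '0');
            ++j;
        }
        if (j > i + 1) {          // at least one digit followed the '%'
            numbers.insert(value);
            i = j - 1;            // resume scanning after the digits
        }
    }
    return numbers;
}

// True when the number of distinct placeholders equals the number of
// arguments passed to the i18n call (hypothetical check, for illustration).
bool argumentCountMatches(const std::string& message, std::size_t suppliedArgs) {
    return placeholderNumbers(message).size() == suppliedArgs;
}
```

With this sketch, the KDE3-style call from the example above corresponds to `argumentCountMatches("Found key: %1", 0)`, which fails because the `i18n` call itself received no arguments.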
"gaps in placeholder numbering, ..."
Except in the case of plural i18n calls, there must be no gaps in the placeholder sequence, starting from %1. In plural calls, the placeholder of the first number (which determines the plural form) may be omitted, both in singular and plural:
    i18n("Line: %1 Column: %2", lineNo, colNo);   // correct
    i18n("Line: %1 Column: %3", lineNo, colNo);   // ***wrong
    i18np("Found a file in folder %2",
          "Found %1 files in folder %2", nfiles, folder);     // correct
    i18np("Found a file in folder %2",
          "Found some files in folder %2", nfiles, folder);   // also correct
    i18np("Found a file in folder %1",
          "Found some files in folder %1", folder, nfiles);   // ***wrong
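The numbering rule can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not Krazy's own code: the helper names are hypothetical, each message form is checked on its own, and the only plural-specific relaxation modelled is that %1 may be missing. It does not detect the last example above, where the numbering is gapless but the plural-deciding argument is in the wrong position.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Collects the distinct placeholder numbers (%1, %2, ...) used in a message.
std::set<int> placeholderNumbers(const std::string& message) {
    std::set<int> numbers;
    for (std::size_t i = 0; i + 1 < message.size(); ++i) {
        if (message[i] != '%') continue;
        std::size_t j = i + 1;
        int value = 0;
        while (j < message.size() &&
               std::isdigit(static_cast<unsigned char>(message[j]))) {
            value = value * 10 + (message[j] - '0');
            ++j;
        }
        if (j > i + 1) {
            numbers.insert(value);
            i = j - 1;
        }
    }
    return numbers;
}

// True when the placeholders run from %1 up to the highest one used without
// gaps. In a plural call, a missing %1 is tolerated.
bool numberingIsGapless(const std::string& message, bool pluralCall) {
    const std::set<int> used = placeholderNumbers(message);
    if (used.empty()) return true;
    const int highest = *used.rbegin();
    for (int n = 1; n <= highest; ++n) {
        const bool missing = used.count(n) == 0;
        if (missing && !(pluralCall && n == 1)) return false;
    }
    return true;
}
```

For instance, `numberingIsGapless("Line: %1 Column: %3", false)` is false, while `numberingIsGapless("Found a file in folder %2", true)` is true.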
"legacy %n placeholder in plural call"
This is a remnant from KDE3, where in plural i18n calls the argument determining the plural form had a special %n placeholder. In KDE4, all arguments have ordinary %number placeholders, as in the examples above (the plural form is decided by the lowest-numbered argument that is an integer).
Ambiguous Short Messages
English is a rather noninflected language compared to many others; a single English word can frequently be a noun, verb, or adjective while retaining its original form. This frequently causes problems when translating into inflected languages if the original message is short, especially if it is a single word. The solution is to add context to the message via the i18nc() call.
"single adjective as message, probably ambiguous; ..."
Words that can be treated as adjectives are especially prone to ambiguities. Therefore Krazy checks single-word messages against a list of adjectives collected from the KDE codebase, and issues this warning if a matching message does not have a context. For example:
    titleFinal = title.isEmpty() ? i18n("Unknown") : title;                        // ambiguous
    titleFinal = title.isEmpty() ? i18nc("An unknown title", "Unknown") : title;   // clarified
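The kind of check described above can be pictured as follows. This is a minimal sketch, not Krazy's implementation: the function names are hypothetical, and the four-word adjective list is an invented stand-in for the much larger list collected from the KDE codebase.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Hypothetical, tiny stand-in for the adjective list used by the real check.
const std::set<std::string>& knownAdjectives() {
    static const std::set<std::string> adjectives = {
        "unknown", "default", "new", "open"};
    return adjectives;
}

// True when a message without context is a single word from the adjective list.
bool isAmbiguousAdjective(const std::string& message, bool hasContext) {
    if (hasContext || message.empty()) return false;
    // Any whitespace means the message is not a single word.
    if (std::any_of(message.begin(), message.end(),
                    [](unsigned char c) { return std::isspace(c) != 0; })) {
        return false;
    }
    std::string lowered(message);
    std::transform(lowered.begin(), lowered.end(), lowered.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return knownAdjectives().count(lowered) > 0;
}
```

Here `isAmbiguousAdjective("Unknown", false)` reports a problem, while the same message with a context (`hasContext == true`) passes.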
"reported ambiguous message by translators; ..."
There are other troublesome words, or even phrases, which were explicitly reported by translators as ambiguous. This warning means that such a message without context has been detected.
While you are adding contexts, consider providing the appropriate KUIT context marker as well, which will make the translators' job even easier:
    titleFinal = title.isEmpty() ? i18nc("@item:intable An unknown title", "Unknown") : title;   // way to go!
The ambiguity warning can also be issued for .ui, .rc, and .kcfg files. In .ui files, text labels can have the comment attribute (accessible in Qt Designer as the "disambiguation" property of the label, or "comment" prior to Qt 4.5), which can be used for adding contexts in the same way as the first argument of the i18nc() call. In .rc and .kcfg files, contexts are added via the context attribute.
Number Formatting
The number-valued (either integer or real) arguments to i18n messages are formatted automatically for the given language, without the programmer's intervention. Using other methods to format numbers into strings may circumvent proper formatting for the language.
"use of QString::number() on an argument"
QString::number() should never be used to format "amount" numbers, because within KDE code it will do so using English conventions. However, sometimes the number is not an amount; e.g. port number 15000 should not be formatted as "15,000" in English. Use the <numid> KUIT tag in this situation:
    i18n("Number of pages: %1", numPages);                  // good, localized amount format
    i18n("Connected to port %1.", port);                    // bad, amount format not desired
    i18n("Connected to port %1.", QString::number(port));   // bad, not localized
    i18n("Connected to port <numid>%1</numid>.", port);     // good
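The difference between an "amount" and a numeric identifier can be illustrated with a standalone sketch. It is a simplification under stated assumptions: the function names are hypothetical, and real localized formatting involves more than a grouping separator placed every three digits.

```cpp
#include <cstddef>
#include <string>

// Formats a non-negative amount with a grouping separator every three digits.
// The separator is passed in because it differs between languages.
std::string formatAmount(unsigned long value, const std::string& groupSeparator) {
    const std::string digits = std::to_string(value);
    std::string out;
    for (std::size_t i = 0; i < digits.size(); ++i) {
        // Insert a separator whenever a multiple of three digits remains.
        if (i > 0 && (digits.size() - i) % 3 == 0) out += groupSeparator;
        out += digits[i];
    }
    return out;
}

// Formats a numeric identifier (port number, ID, ...) as plain digits.
std::string formatIdentifier(unsigned long value) {
    return std::to_string(value);
}
```

A page count of 15000 would go through something like `formatAmount(15000, ",")`, whereas a port number should stay ungrouped, as `formatIdentifier(15000)` leaves it.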
"use of KLocale::formatNumber() on an argument (...)"
A smarter way to format numbers is using KLocale::formatNumber(), which honors the user's settings. However, the format then cannot be decided by the language of the particular message in question (some applications may not have translations), so it is best avoided in i18n arguments. Use it for "live numbers", e.g. in spreadsheet tables and calculator displays, where the format should match the user's number-typing habits.
Even when the complete message is a single number, it should be i18n'd, with a proper context:
    result = QString::number(atomicNumber);                // bad
    result = i18nc("Atomic number", "%1", atomicNumber);   // good
When the number is to be formatted in a special way (field width, number of decimals, etc.) into the message, still neither QString::number() nor KLocale::formatNumber() should be used, but rather the ki18n*() series of calls with subs() methods (see the KLocalizedString documentation):
    i18n("Percent complete: %1", QString::number(percent, 'f'));      // bad
    ki18n("Percent complete: %1").subs(percent, 0, 'f').toString();   // good
HTML and KUIT Markup
Every i18n message in KDE4 is effectively XML markup. HTML tags come from Qt's rich text and can be used only in rich-text capable widgets; KUIT tags are the new KDE4 semantic markup, which should be preferred to HTML and can be used in any i18n message (plain or rich-text output is decided on the basis of the context marker).
"malformed markup (unmatched tags, etc.)"
Since every message is XML, all tags must be properly closed: an opening tag must not be left without its closing tag. This also holds for breaking HTML tags like <br> and <hr>, which must be closed in place: <br/>, <hr/>.
A verbatim less-than sign means the opening of a tag, which is not always what is meant. This can be avoided by using the predefined XML entity &lt; (the other predefined entities are &gt;, &amp;, &apos;, and &quot;), but for the frequent case of marking generic or user-replaceable text, it is better to use the <placeholder> KUIT tag:
    i18n("headers go into <includes>");                            // ***error in XML markup
    i18n("headers go into &lt;includes&gt;");                      // no markup problem, but...
    i18n("headers go into <placeholder>includes</placeholder>");   // better
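The well-formedness requirement can be pictured as a stack-based tag matcher. The following is a minimal sketch with a hypothetical function name; it is far simpler than a real XML parser (it ignores attribute quoting, comments and entities) and is not what Krazy itself runs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true when every opening tag has a matching closing tag in the right
// order. Tags closed in place, such as <br/>, need no partner.
bool tagsAreBalanced(const std::string& message) {
    std::vector<std::string> open;   // stack of currently open tag names
    std::size_t pos = 0;
    while ((pos = message.find('<', pos)) != std::string::npos) {
        const std::size_t end = message.find('>', pos);
        if (end == std::string::npos) return false;      // '<' without '>'
        std::string body = message.substr(pos + 1, end - pos - 1);
        pos = end + 1;
        if (body.empty()) return false;                  // "<>"
        if (body.back() == '/') continue;                // closed in place
        const bool closing = body.front() == '/';
        if (closing) body.erase(0, 1);
        // The tag name ends at the first whitespace (attributes follow it).
        const std::string name = body.substr(0, body.find_first_of(" \t"));
        if (closing) {
            if (open.empty() || open.back() != name) return false;
            open.pop_back();
        } else {
            open.push_back(name);
        }
    }
    return open.empty();
}
```

Under this sketch, a message containing a bare `<includes>` is rejected because the tag is never closed, while the entity-escaped and `<placeholder>` variants pass.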
Given its frequency, it is not necessary to use &amp; for shortcut markers; there are some heuristics around XML parsing that allow the presence of a naked &. Basically, &amp; is needed only in the rare case of an &no_whitespace_sequence; pattern which is not meant as an XML entity.
"unclosed <br> ... use proper paragraphs"
This just signals a particularly frequent markup problem, that of <br> not closed in place, like <br/>. Also, sometimes <br/> is used to split logical paragraphs (especially when doubled, <br/><br/>), where proper paragraph tags should be used instead.
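Returning to the naked & in shortcut markers: the rule of thumb given in this section (an ampersand followed by a run of non-whitespace characters and a semicolon reads as an entity) can be sketched as below. This is an assumption about how such a heuristic might look, written only for illustration; the function name is hypothetical and the actual KDE parsing code may differ.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// True when the '&' at position pos starts an "&no_whitespace_sequence;"
// pattern, i.e. it would be read as an XML entity and not as a shortcut marker.
bool looksLikeEntity(const std::string& text, std::size_t pos) {
    if (pos >= text.size() || text[pos] != '&') return false;
    for (std::size_t i = pos + 1; i < text.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (c == ';') return i > pos + 1;   // need at least one character between
        if (std::isspace(c)) return false;  // whitespace ends the candidate
    }
    return false;                           // no terminating ';' found
}
```

A shortcut marker such as the one in `"&File"` is not mistaken for an entity, whereas text like `"Save &as; done"` would be, which is the rare case where `&amp;` has to be written out.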
"tag <tag> is neither KUIT nor HTML tag"
Some tag in the message is simply unknown and will not be understood at runtime. If it is not a typo, but the message is really speaking about tags to the user (e.g. in an HTML editing application), use the &lt;foo&gt; pattern.
"HTML tag <tag> is not advised with KUIT markup"
When Krazy encounters a KUIT context marker in a message, it assumes that the message is semantically tagged, and disapproves of some HTML tags which are best replaced with semantic equivalents. For example:
    i18nc("@info:whatsthis", "...this <b>cannot</b> be undone.");                 // Krazy complains
    i18nc("@info:whatsthis", "...this <emphasis>cannot</emphasis> be undone.");   // fine
    i18nc("@info", "Really delete <i>%1</i>?", filename);                         // complains
    i18nc("@info", "Really delete <filename>%1</filename>?", filename);           // fine
"tag <tag1> cannot be subtag of <tag2>"
"tag <tag> has no att attribute"
"tag <tag> cannot have text content"
These are validity checks for KUIT markup, a somewhat relaxed form of formal XML validation. The rules of which KUIT tag can contain which, and so on, are given with the tag descriptions.
KUIT Context Markers
Semantic context markers give translators a great deal of information about where and how the message is used at runtime. Providing them for any future messages is strongly encouraged, and even equipping existing messages, when not in message freeze, is welcomed. Krazy helps with the following checks.
"missing KUIT context marker"
If Krazy detects that the number of messages in the source file equipped with KUIT context markers is above some threshold, it assumes that the developer's intention was to have all messages marked (as is recommended), and issues this warning for any unmarked message.
"invalid semantic role role"
"invalid interface subcue cue to role role"
"invalid visual format fmt"
Krazy checks that all ingredients of the context marker are defined, and that the particular combination of them is valid.
"expected context marker ctxmark1, got ctxmark2"
For messages used in some boilerplate situations, canonical context markers have been established. Krazy warns if the context marker of such a message, although valid by itself, is not the expected one. For example, such messages are encountered when setting up the KAboutData information.
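To make the "ingredients" of a context marker concrete, the sketch below splits a context string into semantic role, interface subcue and visual format. It assumes the marker syntax `@role:cue/format` followed by optional free text, as in `@item:intable An unknown title`; the struct and function names are hypothetical, and no validation against the actual lists of roles, cues and formats is attempted.

```cpp
#include <cstddef>
#include <string>

// Parts of a KUIT context marker (hypothetical helper type).
struct ContextMarker {
    std::string role;      // e.g. "info"
    std::string cue;       // e.g. "whatsthis"; empty if absent
    std::string format;    // e.g. "plain"; empty if absent
    bool present = false;  // false when the context does not start with '@'
};

// Splits a context of the assumed form "@role:cue/format free text".
ContextMarker parseContextMarker(const std::string& context) {
    ContextMarker marker;
    if (context.empty() || context[0] != '@') return marker;
    marker.present = true;

    // The marker proper ends at the first space; free text may follow it.
    const std::size_t space = context.find(' ');
    std::string head = context.substr(
        1, space == std::string::npos ? std::string::npos : space - 1);

    const std::size_t slash = head.find('/');
    if (slash != std::string::npos) {
        marker.format = head.substr(slash + 1);
        head.erase(slash);
    }
    const std::size_t colon = head.find(':');
    if (colon != std::string::npos) {
        marker.cue = head.substr(colon + 1);
        head.erase(colon);
    }
    marker.role = head;
    return marker;
}
```

A checker along the lines described above would then look up `role`, `cue` and `format` in its tables of defined values and verify that the combination is allowed.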
UI files
When Krazy complains about an ambiguous message in a .ui file, you can add a comment="" attribute to the affected string, either in Qt Designer or with a text editor.
Contact
For any questions or suggestions, Krazy i18n checks are presently maintained by Chusslove Illich <[email protected]>.