2012-10-14

more minor (hopefully) improvements


Revision as of 11:26, 14 October 2012

(One intermediate revision by one user not shown)

Line 1:

Line 1:

FFS stands for '''file format support'''. This is the general name for a group of classes for reading and writing translated messages in different file formats in the Translate extension.

FFS stands for '''file format support'''. This is the general name for a group of classes for reading and writing translated messages in different file formats in the Translate extension.



Software developers reinvented the wheel with localization technologies many times, so there are many different formats for storing translatable software messages. The two main groups of such formats are:

+

Software developers reinvented the wheel with localization technologies many times, so there are many different formats for storing translatable software messages. There are two main groups of such formats.

* Key-based formats: each message has a key, usually a more-or-less meaningful string. Translation to every language is a map of keys pointing to values. Most formats fall into this group, including DTD, JSON, and MediaWiki's own format (essentially a PHP array).

* Key-based formats: each message has a key, usually a more-or-less meaningful string. Translation to every language is a map of keys pointing to values. Most formats fall into this group, including DTD, JSON, and MediaWiki's own format (essentially a PHP array).

* Gettext-like: The message in the original language of the program, usually English, is itself used as a key that points to translations into other languages. This requires generation of inherently non-stable pseudo-keys for storing the messages in a different format.

* Gettext-like: The message in the original language of the program, usually English, is itself used as a key that points to translations into other languages. This requires generation of inherently non-stable pseudo-keys for storing the messages in a different format.

Line 12:

Line 12:

; readFromVariable( $data ): Read the messages from a string variable that has the same format as the file and return them as an array of AUTHORS and MESSAGES. This is where the actual parsing of the file's text is supposed to happen.

; readFromVariable( $data ): Read the messages from a string variable that has the same format as the file and return them as an array of AUTHORS and MESSAGES. This is where the actual parsing of the file's text is supposed to happen.

; write( MessageCollection $collection ): Write the messages to the file.

; write( MessageCollection $collection ): Write the messages to the file.



; writeIntoVariable( MessageCollection $collection ): Write the messages to a string variable that has the same format as the file. This is where the careful construction of the resultant messages file is supposed to happen.

+

; writeIntoVariable( MessageCollection $collection ): Write the messages to a string variable that has the same format as the file. This is where the careful construction of the resulting messages file is supposed to happen.

=== MediaWiki translations ===

=== MediaWiki translations ===

Line 18:

Line 18:

=== class SimpleFFS ===

=== class SimpleFFS ===



The class SimpleFFS is the ancestor of all other FFS classes, and it is also a simple example of how an FFS class should be written. It implements a simplistic key-based format:

+

The class SimpleFFS is the ancestor of all the other FFS classes, and it is also a simple example of how an FFS class should be written. It implements a simplistic key-based format:



* each file has two sections, separated by "\0\0\0\0".

+

* each file has two sections, separated by "\0\0\0\0";



* One section has the translators' names separated by "\0".

+

* one section has the translators' names separated by "\0";



* The other has the translations in "key=value" format, also separated by "\0".

+

* the other has the translations in "key=value" format, also separated by "\0".
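
The layout described above can be sketched in a few lines. This is only an illustrative Python sketch, not the extension's PHP code: the function names mirror readFromVariable and writeIntoVariable but are hypothetical, and placing the authors section before the messages section is an assumption, since the description does not state the order.

```python
# Illustrative sketch of the SimpleFFS-style layout; not the real PHP class.
SECTION_SEP = "\0\0\0\0"  # separates the two sections
ITEM_SEP = "\0"           # separates items inside a section


def write_into_variable(authors, messages):
    """Serialize authors and key=value messages into one string.

    Assumption: the authors section comes first.
    """
    authors_part = ITEM_SEP.join(authors)
    messages_part = ITEM_SEP.join(f"{k}={v}" for k, v in messages.items())
    return authors_part + SECTION_SEP + messages_part


def read_from_variable(data):
    """Parse the string back into AUTHORS and MESSAGES."""
    authors_part, _, messages_part = data.partition(SECTION_SEP)
    authors = [a for a in authors_part.split(ITEM_SEP) if a]
    messages = {}
    for item in messages_part.split(ITEM_SEP):
        # Split on the first "=" only; "=" is not escaped in this format.
        key, sep, value = item.partition("=")
        if sep:
            messages[key] = value
    return {"AUTHORS": authors, "MESSAGES": messages}


data = write_into_variable(["Alice", "Bob"], {"greeting": "Hello", "bye": "Goodbye"})
parsed = read_from_variable(data)

# A key containing "=" does not survive the roundtrip.
broken = read_from_variable(write_into_variable([], {"a=b": "c"}))
```

In this sketch the key "a=b" comes back as "a" with the value "b=c", which is the kind of bug that the missing escaping causes.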



Since SimpleFFS is intentionally simplistic, it demonstrates possible bugs and complications. For example, the '=' character is not escaped, so the key and the value may not contain them. Obviously, this is not something that is suitable for real-world programs. SimpleFFS also implements useful utility methods:

+

Since SimpleFFS is intentionally simplistic, it demonstrates possible bugs and complications. For example, the "=" character is not escaped, so neither the key nor the value may contain it. Obviously, this is not suitable for real-world programs. SimpleFFS also implements useful utility methods:

; exists( $code ): Tests whether the file exists.

; exists( $code ): Tests whether the file exists.

; writeReal( $collection ): Implements internals of file format writing, apart from the more generic writeIntoVariable.

; writeReal( $collection ): Implements internals of file format writing, apart from the more generic writeIntoVariable.

Line 34:

Line 34:

General tips when writing new classes:

General tips when writing new classes:

* Avoid running executable file formats. Parse them.

* Avoid running executable file formats. Parse them.



* Remember to mangle and unmangle message keys.

+

* Remember to [[#Mangling the message keys to ensure correct title handling|mangle and unmangle message keys]].

* Do not assume message keys don't include problematic characters. They will.

* Do not assume message keys don't include problematic characters. They will.

* The output is usually expected to be pretty and readable. Some people like to poke in the files manually.

* The output is usually expected to be pretty and readable. Some people like to poke in the files manually.

Line 53:

Line 53:

== Mangling the message keys to ensure correct title handling ==

== Mangling the message keys to ensure correct title handling ==



The Translate extension is MediaWiki-based and every message is stored as a MediaWiki page, so the key must be a valid MediaWiki page title. Mangling takes care of this by escaping the key names a manner similar to the [[:w:Quoted-printable|quoted-printable encoding]], but with some modifications before storing the message as a wiki page. Before the message is written back to the file, the message is unmangled.

+

The Translate extension is MediaWiki-based and every message is stored as a MediaWiki page, so the key must be a valid MediaWiki page title. Before a message is stored as a wiki page, mangling takes care of this by escaping the key in a manner similar to the [[:w:Quoted-printable|quoted-printable encoding]], with some modifications (see also [[Manual:PAGENAMEE encoding]]). Before the message is written back to the file, the key is unmangled.



When an FFS class overrides the functions that call the mangling routines, it must make sure the roundtrip is done correctly - that is, that the key is mangled before writing to MediaWiki and unmangled before writing the translation back to the file.

+

When an FFS class overrides the functions that call the mangling routines, it must make sure the roundtrip is done correctly – that is, that the key is mangled before writing to MediaWiki and unmangled before writing the translation back to the file.

Mangling is done in the StringMatcher class.

Mangling is done in the StringMatcher class.
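
The roundtrip requirement can be illustrated with a small quoted-printable-style escaper. This is a hedged Python sketch, not the StringMatcher algorithm: the set of characters treated as safe and the names mangle and unmangle are assumptions chosen for illustration only.

```python
import re

# Hypothetical set of characters left unescaped; the real rules may differ.
SAFE = re.compile(r"[A-Za-z0-9_.-]")


def mangle(key):
    """Escape every byte outside SAFE as =XX (uppercase hex)."""
    out = []
    for byte in key.encode("utf-8"):
        ch = chr(byte)
        if byte < 128 and SAFE.fullmatch(ch):
            out.append(ch)
        else:
            out.append("=%02X" % byte)
    return "".join(out)


def unmangle(mangled):
    """Reverse mangle(): turn each =XX back into the byte it encodes."""
    raw = re.sub(
        rb"=([0-9A-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        mangled.encode("ascii"),
    )
    return raw.decode("utf-8")


keys = ["simple-key", "a[b]", "x=y", "a|b#c", "ключ"]
roundtrip_ok = all(unmangle(mangle(k)) == k for k in keys)
```

Because "=" itself is escaped, unmangling is unambiguous, so a key that passes through mangle and then unmangle comes back unchanged.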

Line 63:

Line 63:

* Parsing of the format: Essentially testing that the readFromVariable function returns the right keys and values for AUTHORS and MESSAGES.

* Parsing of the format: Essentially testing that the readFromVariable function returns the right keys and values for AUTHORS and MESSAGES.

* Roundtrip: Test that the keys and the messages are written and read correctly.

* Roundtrip: Test that the keys and the messages are written and read correctly.



You can use existing test routines, such as JavaFFSTest as examples.

+

You can use existing test routines, such as JavaFFSTest, as examples.

{{Translatable navigation template|Extension-Translate}}

{{Translatable navigation template|Extension-Translate}}
