XML to BioWare TLK converter
xml2tlk [<options>] [<input file>] <output file>
xml2tlk converts XML files created by the tlk2xml tool back into the BioWare TLK format. For a more in-depth description of TLK files, please see the man page for the tlk2xml tool. Also note that currently, only the non-GFF versions, V3.0 and V4.0, can be created by xml2tlk.
The format of the input XML is simple and straightforward.
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<tlk language="0">
  <string id="1">Continue</string>
  <string id="2" sound="hello">Well hello there!</string>
  <string id="3" sound="bye" soundlength="0.5">Bye!</string>
  <string id="4" soundid="23">Who are you?</string>
</tlk>
The root element is "tlk", and it can have an optional "language" property. That language ID can also be given on the command line, in which case it overrides the one in the input XML. When creating a TLK file, versions V3.0 and V4.0 need a language ID, while versions V0.2 and V0.5 ignore the language ID.
Each child of the root element has to be a "string" element, and each "string" element requires an "id" property. The ID is the string reference (StrRef) for the text line, and the content of the "string" element is the text itself. The whole file has to be UTF-8 encoded.
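As an illustration of the layout described above, the following is a minimal Python sketch that generates such an input file with the standard library. The function name build_tlk_xml and its parameters are hypothetical and not part of xml2tlk. The sketch does not emit the standalone="yes" declaration shown in the example, and it is an assumption that the tool accepts input without it.

```python
import xml.etree.ElementTree as ET


def build_tlk_xml(strings, language=None):
    """Return UTF-8 encoded XML bytes in the layout described above.

    `strings` maps a StrRef (int) to its text. `language` is the optional
    language ID for the root element. Both names are hypothetical.
    """
    root = ET.Element("tlk")
    if language is not None:
        root.set("language", str(language))

    # One <string> element per entry, with the StrRef as its "id".
    for strref, text in sorted(strings.items()):
        entry = ET.SubElement(root, "string", {"id": str(strref)})
        entry.text = text

    # xml_declaration=True emits the <?xml ...?> header; passing an
    # encoding makes tostring() return UTF-8 bytes rather than str.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    data = build_tlk_xml({1: "Continue", 2: "Well hello there!"}, language=0)
    print(data.decode("utf-8"))
```

The returned bytes can be written to a file or piped into xml2tlk on stdin.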
Version V3.0 allows the following extra properties on a "string": "sound" (a resource reference of a voice-over for this line, at most 16 characters), "soundlength" (a floating point number denoting the length of the sound file in seconds), "volumevariance" (unused by the games) and "pitchvariance" (unused by the games).
Version V4.0 allows the extra property "soundid" on a "string", which is a numerical reference to a voice-over line.
Versions V0.2 and V0.5 do not allow any extra properties.
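The per-version rules above can be checked before running the tool. The following Python sketch is one possible pre-flight check, not a description of how xml2tlk itself validates its input; the names ALLOWED_EXTRAS and check_strings are hypothetical.

```python
import xml.etree.ElementTree as ET

# Extra properties permitted per TLK version, as listed above.
# "id" is required on every string regardless of version.
ALLOWED_EXTRAS = {
    "V3.0": {"sound", "soundlength", "volumevariance", "pitchvariance"},
    "V4.0": {"soundid"},
    "V0.2": set(),
    "V0.5": set(),
}


def check_strings(xml_text, version):
    """Return a list of human-readable problems found in the input XML."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "tlk":
        problems.append(f"root element is <{root.tag}>, expected <tlk>")
    for child in root:
        if child.tag != "string":
            problems.append(f"unexpected element <{child.tag}>")
            continue
        if "id" not in child.attrib:
            problems.append("<string> without an id")
        # Anything besides "id" must be allowed for the target version.
        extras = set(child.attrib) - {"id"}
        for name in sorted(extras - ALLOWED_EXTRAS[version]):
            problems.append(
                f"string {child.get('id')}: '{name}' not allowed in {version}"
            )
        # A V3.0 sound resource reference is limited to 16 characters.
        sound = child.get("sound")
        if version == "V3.0" and sound is not None and len(sound) > 16:
            problems.append(
                f"string {child.get('id')}: sound longer than 16 characters"
            )
    return problems
```

An empty list means the file follows the rules stated here for the chosen version.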
Because TLK files contain localized string data, it is important to know the encoding of those strings. Unfortunately, TLK files do not contain information about the encoding. Versions V3.0 and V4.0 contain a language identifier, but its meaning varies between games. V0.2 and V0.5 lack even that. However, due to the Huffman encoding of V0.5 strings, the encoding there is fixed to little-endian UTF-16, and strings in V0.2 files are also usually in little-endian UTF-16 (with the exception of files found in the Nintendo DS game Sonic Chronicles: The Dark Brotherhood). To manually select the encoding, this tool provides a wide range of command line options for various encodings.
Alternatively, the game this TLK is from can be specified and xml2tlk will write the strings in an appropriate encoding for that game and the language ID. Please note that this does not work for the game Sonic Chronicles: The Dark Brotherhood, since its TLK files do not provide a language ID.
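To show why the choice of encoding matters, the following Python sketch encodes the same text with the encodings mentioned in this page and reports where it cannot be represented. The mapping from encoding names to Python codec names is an assumption of this sketch and is not taken from xml2tlk; the function name encode_all is hypothetical.

```python
# Python codec names assumed to correspond to the encodings named in
# this page. This mapping is illustrative, not taken from xml2tlk.
CODECS = {
    "CP-1250": "cp1250",
    "CP-1251": "cp1251",
    "CP-1252": "cp1252",
    "CP-932": "cp932",
    "CP-936": "cp936",
    "CP-949": "cp949",
    "CP-950": "cp950",
    "UTF-8": "utf-8",
    "UTF-16LE": "utf-16-le",
}


def encode_all(text):
    """Encode `text` with every codec; None marks an unrepresentable text."""
    result = {}
    for label, codec in CODECS.items():
        try:
            result[label] = text.encode(codec)
        except UnicodeEncodeError:
            result[label] = None
    return result


if __name__ == "__main__":
    for label, raw in encode_all("Caf\u00e9").items():
        shown = raw.hex(" ") if raw is not None else "not representable"
        print(f"{label:9} {shown}")
```

The same text yields different byte sequences under different encodings, and some code pages cannot represent it at all, so an encoding that does not match what the game expects is likely to produce garbled text.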
- Show a help text and exit.
- Show version information and exit.
- Write a V3.0 TLK file.
- Write a V4.0 TLK file.
- Override the TLK language ID.
- Write strings as Windows CP-1250. Eastern European, Latin alphabet.
- Write strings as Windows CP-1251. Eastern European, Cyrillic alphabet.
- Write strings as Windows CP-1252. Western European, Latin alphabet.
- Write strings as Windows CP-932. Japanese, extended Shift-JIS.
- Write strings as Windows CP-936. Simplified Chinese, extended GB2312 with GBK codepoints.
- Write strings as Windows CP-949. Korean, similar to EUC-KR.
- Write strings as Windows CP-950. Traditional Chinese, similar to Big5.
- Write strings as UTF-8.
- Write strings in an encoding appropriate for Neverwinter Nights.
- Write strings in an encoding appropriate for Neverwinter Nights 2.
- Write strings in an encoding appropriate for Knights of the Old Republic.
- Write strings in an encoding appropriate for Knights of the Old Republic II: The Sith Lords.
- Write strings in an encoding appropriate for Jade Empire.
- Write strings in an encoding appropriate for The Witcher.
- Write strings in an encoding appropriate for Dragon Age: Origins.
- Write strings in an encoding appropriate for Dragon Age II.
- The XML file to convert. If no input file is specified, the XML data is read from stdin. The encoding of the XML stream must always be UTF-8.
- The TLK file will be written there.
Convert file1.xml into a V3.0 CP-1252 TLK file:
xml2tlk --version30 --cp1252 file1.xml file2.tlk
Convert file1.xml into a V4.0 UTF-8 TLK file and override the language ID:
xml2tlk --version40 --utf8 --language 1 file1.xml file2.tlk
Convert file1.xml into a V3.0 TLK file with an encoding appropriate for Neverwinter Nights:
xml2tlk --version30 --nwn file1.xml file2.tlk
Convert the UTF-8 TLK file1.tlk into an XML file on stdout with tlk2xml(1), modify it using sed(1) and write the result back into a TLK:
tlk2xml --utf8 file1.tlk | sed -e 's/gold/candy/g' | xml2tlk --utf8 --version30 file2.tlk
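Note that a plain sed(1) substitution operates on the raw XML text, so it would also change matching property values such as sound="gold". As a hedged alternative, the following Python sketch restricts the replacement to the text of the "string" elements. The script name replace_text.py and the function replace_in_strings are hypothetical.

```python
import sys
import xml.etree.ElementTree as ET


def replace_in_strings(xml_bytes, old, new):
    """Replace `old` with `new` in the text of every <string> element only.

    Properties such as "sound" are left untouched.
    """
    root = ET.fromstring(xml_bytes)
    for entry in root.iter("string"):
        if entry.text:
            entry.text = entry.text.replace(old, new)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: replace_text.py OLD NEW, with the XML on stdin.
    # The binary buffers are used so the UTF-8 data passes through as-is.
    source = sys.stdin.buffer.read()
    sys.stdout.buffer.write(replace_in_strings(source, sys.argv[1], sys.argv[2]))
```

Saved as replace_text.py, it could take the place of sed(1) in the pipeline:

tlk2xml --utf8 file1.tlk | python3 replace_text.py gold candy | xml2tlk --utf8 --version30 file2.tlk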