Xml2tlk

From xoreos Wiki

XML to BioWare TLK converter

Synopsis

xml2tlk [<options>] [<input file>] <output file>

Description

xml2tlk converts XML files created by the tlk2xml tool back into the BioWare TLK format. For a more in-depth description of TLK files, please see the man page for the tlk2xml tool. Also note that currently, only the non-GFF versions, V3.0 and V4.0, can be created by xml2tlk.

The format of the input XML is straightforward:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<tlk language="0">
  <string id="1">Continue</string>
  <string id="2" sound="hello">Well hello there!</string>
  <string id="3" sound="bye" soundlength="0.5">Bye!</string>
  <string id="4" soundid="23">Who are you?</string>
</tlk>

The root element is "tlk", and it can have an optional "language" property. A language ID given on the command line overrides the one in the input XML. When creating a TLK file, versions V3.0 and V4.0 need a language ID, while versions V0.2 and V0.5 ignore the language ID.

Each child of the root element has to be a "string" element, and each "string" element requires an "id" property. The ID is the string reference (StrRef) for the text line, and the content of the "string" element is the text itself. The whole file has to be UTF-8 encoded.
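The layout described above can also be produced programmatically. The following is a minimal sketch, assuming Python 3.8 or later and its standard xml.etree.ElementTree module; the helper name build_tlk_xml is hypothetical and not part of xoreos-tools.

```python
import xml.etree.ElementTree as ET

def build_tlk_xml(strings, language=None):
    """Return UTF-8 encoded XML bytes in the layout described above.

    strings:  dict mapping a StrRef (int) to its text.
    language: optional language ID for the root "tlk" element.
    (Hypothetical helper, not part of xoreos-tools.)
    """
    root = ET.Element("tlk")
    if language is not None:
        root.set("language", str(language))
    # One "string" child per StrRef, each with the required "id" property.
    for strref, text in sorted(strings.items()):
        entry = ET.SubElement(root, "string", id=str(strref))
        entry.text = text
    # Serialize as UTF-8, since the whole file has to be UTF-8 encoded.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)

xml_bytes = build_tlk_xml({1: "Continue", 2: "Well hello there!"}, language=0)
print(xml_bytes.decode("utf-8"))
```

The resulting bytes can be saved to a file or piped to xml2tlk on stdin.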

Version V3.0 allows the following extra properties on a "string": "sound" (a resource reference of a voice-over for this line, <= 16 characters), "soundlength" (a floating point number denoting the length of the sound file in seconds), "volumevariance" (unused by the games) and "pitchvariance" (unused by the games).

Version V4.0 allows the extra property "soundid" on a "string", which is a numerical reference to a voice-over line.

Versions V0.2 and V0.5 do not allow any extra properties.
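The per-version property rules above lend themselves to a quick check before conversion. The sketch below assumes Python's standard xml.etree.ElementTree module; check_strings and ALLOWED_EXTRAS are hypothetical names. Whether xml2tlk itself rejects or silently ignores a property that the target version does not allow is not stated here, so the sketch only reports such cases.

```python
import xml.etree.ElementTree as ET

# Extra "string" properties per target version, as listed above.
ALLOWED_EXTRAS = {
    "V3.0": {"sound", "soundlength", "volumevariance", "pitchvariance"},
    "V4.0": {"soundid"},
}

def check_strings(xml_text, version):
    """Return a list of problems found in the XML (hypothetical helper)."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "tlk":
        problems.append("root element is not 'tlk'")
    for entry in root:
        if entry.tag != "string":
            problems.append(f"unexpected element '{entry.tag}'")
            continue
        strref = entry.get("id")
        if strref is None:
            problems.append("string element without an id")
        # Report every property other than "id" that the version does not list.
        extras = set(entry.attrib) - {"id"}
        for name in sorted(extras - ALLOWED_EXTRAS[version]):
            problems.append(f"string {strref}: '{name}' not allowed in {version}")
        # A "sound" resource reference may be at most 16 characters long.
        sound = entry.get("sound")
        if sound is not None and len(sound) > 16:
            problems.append(f"string {strref}: 'sound' is longer than 16 characters")
    return problems

sample = """<tlk language="0">
  <string id="1">Continue</string>
  <string id="3" sound="bye" soundlength="0.5">Bye!</string>
  <string id="4" soundid="23">Who are you?</string>
</tlk>"""

print(check_strings(sample, "V3.0"))
```

For the sample, only string 4 is reported when targeting V3.0, because "soundid" belongs to V4.0.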

Because TLK files contain localized string data, it is important to know the encoding of those strings. Unfortunately, the TLK files do not contain information about the encoding. Versions V3.0 and V4.0 contain a language identifier, but its meaning varies between games. V0.2 and V0.5 lack a language identifier entirely. However, due to the Huffman nature of V0.5 strings, the encoding there is fixed to little-endian UTF-16, and strings in V0.2 files are also usually in little-endian UTF-16 (with the exception of files found in the Nintendo DS game Sonic Chronicles: The Dark Brotherhood). To manually select the encoding, this tool provides a wide range of command-line options for various encodings.
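Since the same text is stored as different bytes under each encoding, and not every encoding can represent every character, it can help to test a string before choosing an encoding option. The sketch below is an illustration only: the CODECS table maps the xml2tlk encoding options to Python codec names, which is an assumption, and Python's codecs may differ in detail from the conversion xml2tlk performs.

```python
# Assumed mapping from xml2tlk encoding options to Python codec names.
CODECS = {
    "--cp1250": "cp1250",
    "--cp1251": "cp1251",
    "--cp1252": "cp1252",
    "--cp932": "cp932",
    "--cp936": "cp936",
    "--cp949": "cp949",
    "--cp950": "cp950",
    "--utf8": "utf-8",
    "--utf16le": "utf-16-le",
    "--utf16be": "utf-16-be",
}

def encodable_with(text):
    """Return the options whose (assumed) codec can represent the text."""
    usable = []
    for option, codec in CODECS.items():
        try:
            text.encode(codec)
        except UnicodeEncodeError:
            continue  # this codec has no mapping for some character
        usable.append(option)
    return usable

# Same text, different byte lengths: 4 bytes in CP-1252, 8 in UTF-16.
print(len("Bye!".encode("cp1252")), len("Bye!".encode("utf-16-le")))

# Cyrillic text fits CP-1251 but not the Latin-alphabet code pages.
print(encodable_with("Привет"))
```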

Alternatively, the game this TLK is from can be specified and xml2tlk will write the strings in an appropriate encoding for that game and the language ID. Please note that this does not work for the game Sonic Chronicles: The Dark Brotherhood, since its TLK files do not provide a language ID.

Options

-h
--help

Show a help text and exit.

--version

Show version information and exit.

-3
--version30

Write a V3.0 TLK file.

-4
--version40

Write a V4.0 TLK file.

-l <id>
--language <id>

Override the TLK language ID.

--cp1250

Write strings as Windows CP-1250. Eastern European, Latin alphabet.

--cp1251

Write strings as Windows CP-1251. Eastern European, Cyrillic alphabet.

--cp1252

Write strings as Windows CP-1252. Western European, Latin alphabet.

--cp932

Write strings as Windows CP-932. Japanese, extended Shift-JIS.

--cp936

Write strings as Windows CP-936. Simplified Chinese, extended GB2312 with GBK codepoints.

--cp949

Write strings as Windows CP-949. Korean, similar to EUC-KR.

--cp950

Write strings as Windows CP-950. Traditional Chinese, similar to Big5.

--utf8

Write strings as UTF-8.

--utf16le

Write strings as little-endian UTF-16.

--utf16be

Write strings as big-endian UTF-16.

--nwn

Write strings in an encoding appropriate for Neverwinter Nights.

--nwn2

Write strings in an encoding appropriate for Neverwinter Nights 2.

--kotor

Write strings in an encoding appropriate for Knights of the Old Republic.

--kotor2

Write strings in an encoding appropriate for Knights of the Old Republic II: The Sith Lords.

--jade

Write strings in an encoding appropriate for Jade Empire.

--witcher

Write strings in an encoding appropriate for The Witcher.

--dragonage

Write strings in an encoding appropriate for Dragon Age: Origins.

--dragonage2

Write strings in an encoding appropriate for Dragon Age II.

<input file>

The XML file to convert. If no input file is specified, the XML data is read from stdin. The encoding of the XML stream must always be UTF-8.

<output file>

The path the TLK file will be written to.

Examples

Convert file1.xml into a V3.0 CP-1252 TLK file:

xml2tlk --version30 --cp1252 file1.xml file2.tlk

Convert file1.xml into a V4.0 UTF-8 TLK file and override the language ID:

xml2tlk --version40 --utf8 --language 1 file1.xml file2.tlk

Convert file1.xml into a V3.0 TLK file for Neverwinter Nights:

xml2tlk --version30 --nwn file1.xml file2.tlk

Convert the UTF-8 TLK file1.tlk into an XML file on stdout with tlk2xml(1), modify it using sed(1) and write the result back into a TLK:

tlk2xml --utf8 file1.tlk | sed -e 's/gold/candy/g' | xml2tlk --utf8 --version30 file2.tlk
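The command lines above can also be assembled from a script. The following sketch uses only the options documented on this page; the function names xml2tlk_command and run_xml2tlk are hypothetical, and run_xml2tlk assumes that xml2tlk is installed and on the PATH.

```python
import subprocess

def xml2tlk_command(output, input_file=None, version=None,
                    encoding=None, language=None):
    """Build an xml2tlk argument list (hypothetical helper)."""
    cmd = ["xml2tlk"]
    if version is not None:
        # Only V3.0 and V4.0 can be written.
        cmd.append({"3.0": "--version30", "4.0": "--version40"}[version])
    if encoding is not None:
        # Encoding or game option without dashes, e.g. "cp1252", "utf8", "nwn".
        cmd.append("--" + encoding)
    if language is not None:
        cmd += ["--language", str(language)]
    if input_file is not None:
        cmd.append(input_file)  # omitted: the XML is read from stdin
    cmd.append(output)
    return cmd

def run_xml2tlk(xml_bytes, output, **options):
    """Feed UTF-8 XML bytes to xml2tlk on stdin (needs xml2tlk on PATH)."""
    subprocess.run(xml2tlk_command(output, **options),
                   input=xml_bytes, check=True)

print(xml2tlk_command("file2.tlk", "file1.xml", version="3.0", encoding="cp1252"))
# prints ['xml2tlk', '--version30', '--cp1252', 'file1.xml', 'file2.tlk']
```

Passing the arguments as a list avoids shell quoting issues with file names that contain spaces.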

See also

tlk2xml
TLK language IDs and encodings
