BMG file

From Pikmin Technical Knowledge Base

BMG files (Binary Message Generators) are files used to store text for Pikmin 2 and many other Nintendo games on a variety of consoles in the same era.

In Pikmin 2, there is one main BMG file per language for most of the text in the game, with the exception of text created by BLO screen files. These main BMG files are located at files/message/mesRes_(language).szs/pikmin2.bmg alongside the font file, called pikmin2.bmc.

Tools for working with BMGs

  • Cube by Chemi can pack and unpack BMGs as well as several other common file formats used in GameCube games.
  • PikminBMG by Yoshi2 is a Python tool specifically made for working with Pikmin 2 BMG files.

Text Control Codes

Control code tags can be placed within text strings as metadata for the game.

[Image: possible colors that Pikmin 2 supports.]

In Pikmin 2, the following text codes are available:

  • Change text color: FF0000XX - 00: default | 01: red | 02: light green (possibly the default color) | 03: dark green | 04: dark blue | 05: light blue | 06: yellow | 07: dark yellow | 08: yellow-green | 09: orange | 0A: light orange | 0B: reddish white | 0C: light purple | 0D: purple/violet | 0E: dark purple | 0F: black | 1A and up: white
  • Display a button icon: 000000XX - 00: A | 01: B | 02: C Stick | 03: X | 04: Y | 05: Z | 06: L | 07: R | 08: Control Stick (only used in an unused cutscene) | 09: Start Button (completely unused) | 0A: D-pad
  • Text size: FF00010XXX - 64 is the default; lower values make the text smaller and higher values make it bigger.
  • Vertical size: 03000500XX - Used for cave names only. 50 seems to be the default for cave names, but not for normal text.
  • Text speed: 020001XX - Not fully understood. The only change observed so far is 1E, which draws one letter at a time for the Exploration Kit; FF is the default. Other values may have further effects.
  • Pause: 020000 - In normal dialogue, this should be used every 3 lines and at the end of the text. This will prompt the player to use the A button to scroll to the next 3 lines. It is also used to end a Piklopedia text message.
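As a rough illustration, the color, button, and pause codes listed above could be assembled in Python as follows. This is only a sketch: the function and constant names are hypothetical, and the bytes produced are just the code payloads as listed here, without the 0x1A escape framing that wraps them inside a BMG string.

```python
# Payload bytes for a few Pikmin 2 text control codes, taken from the list above.
# All names in this sketch are hypothetical and exist only for illustration.

BUTTON_NAMES = {
    0x00: "A", 0x01: "B", 0x02: "C Stick", 0x03: "X", 0x04: "Y",
    0x05: "Z", 0x06: "L", 0x07: "R", 0x08: "Control Stick",
    0x09: "Start Button", 0x0A: "D-pad",
}

# Pause code: waits for the player to press A before continuing.
PAUSE = bytes.fromhex("020000")


def color_payload(index: int) -> bytes:
    """Build the FF0000XX payload that switches to color index XX."""
    if not 0 <= index <= 0xFF:
        raise ValueError("color index must fit in one byte")
    return bytes([0xFF, 0x00, 0x00, index])


def button_payload(index: int) -> bytes:
    """Build the 000000XX payload that displays button icon XX."""
    if index not in BUTTON_NAMES:
        raise ValueError("unknown button index")
    return bytes([0x00, 0x00, 0x00, index])
```

For example, `color_payload(0x01)` gives the payload for red text and `button_payload(0x0A)` the payload for the D-pad icon.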

File Format

The BMG file format consists of a header followed by at least two (but usually three) data sections:

  1. Section INF1, or the Text Index Table, enumerates the strings in the BMG and provides offsets into the String Pool to find them.
  2. Section DAT1, or the String Pool, is simply a blob of null-terminated strings.
  3. Section MID1, or the Message ID Table, is a mapping from strings to IDs used by some games. Pikmin 2 uses this table but some other games omit it.

All numbers in BMG files are stored in big-endian byte order, except in the Switch release of Pikmin 2, where they are stored in little-endian byte order.

Header

Every BMG file starts with the BMG Header at offset 0.

| Offset | Name | Type | Description |
|--------|------|------|-------------|
| 0x00 | file_magic | char[8] | Always MESGbmg1 in ASCII. |
| 0x08 | data_size | uint32 | Length of the file in bytes, including padding. If this value is 0, this is instead the number of blocks in the file (same as the next field). |
| 0x0C | num_blocks | uint32 | Number of sections/blocks in this file, excluding the header. Always at least 2. |
| 0x10 | encoding | byte | Encoding scheme used for strings in the String Pool. 0=Undefined (CP1252), 1=CP1252, 2=UTF-16, 3=Shift-JIS, 4=UTF-8. |
| 0x11 | unknown | byte[15] | Unknown. Usually just zeroes. |
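The header layout above can be read with Python's `struct` module. The following is a minimal sketch, not a reference implementation: `BmgHeader`, `parse_header`, and the `little_endian` flag are hypothetical names, and it assumes the magic string is stored unchanged in little-endian files.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 0x20  # 0x11 bytes of known fields plus 15 unknown bytes

# Encoding values as described in the header table.
ENCODING_NAMES = {0: "Undefined (CP1252)", 1: "CP1252", 2: "UTF-16",
                  3: "Shift-JIS", 4: "UTF-8"}


class BmgHeader(NamedTuple):
    data_size: int
    num_blocks: int
    encoding: int


def parse_header(data: bytes, little_endian: bool = False) -> BmgHeader:
    """Parse the 0x20-byte BMG header at offset 0.

    little_endian is a hypothetical switch for files that store
    numbers in little-endian order.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("file is too short to contain a BMG header")
    if data[0:8] != b"MESGbmg1":
        raise ValueError("missing MESGbmg1 magic")
    order = "<" if little_endian else ">"
    # data_size (uint32), num_blocks (uint32), encoding (byte) start at 0x08.
    data_size, num_blocks, encoding = struct.unpack_from(order + "IIB", data, 0x08)
    return BmgHeader(data_size, num_blocks, encoding)
```

A caller would typically pass the whole file contents, for example `parse_header(open(path, "rb").read())`, where `path` is whatever BMG file is being inspected.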

Sections

All sections begin with the following two values:

| Offset | Name | Type | Description |
|--------|------|------|-------------|
| 0x00 | kind | char[4] | Section magic. |
| 0x04 | size | uint32 | Section size. Includes padding at the end. |

Sections appear to always be aligned to 32-byte boundaries (this has not been confirmed for every file) and are padded with zeroes at the end where necessary.
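Because each section's size field includes its padding, the sections can be walked one after another. The sketch below assumes that the first section starts directly after the 0x20-byte header; `iter_sections` and `pad_to_32` are hypothetical helper names.

```python
import struct

HEADER_SIZE = 0x20  # assumed start of the first section


def iter_sections(data: bytes, num_blocks: int, little_endian: bool = False):
    """Yield (kind, offset, section_bytes) for each section in the file."""
    fmt = "<4sI" if little_endian else ">4sI"
    offset = HEADER_SIZE
    for _ in range(num_blocks):
        kind, size = struct.unpack_from(fmt, data, offset)
        if size < 8 or offset + size > len(data):
            raise ValueError("bad section size at offset 0x%X" % offset)
        yield kind.decode("ascii"), offset, data[offset:offset + size]
        # size includes trailing padding, so this lands on the next section.
        offset += size


def pad_to_32(section: bytes) -> bytes:
    """Pad a section with zero bytes up to the next 32-byte boundary."""
    return section + bytes(-len(section) % 32)
```

When writing a file, the size field would need to be set to the padded length before the section is emitted.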

Text Index Table

The Text Index Table is the master list of strings in the file and contains offsets into the String Pool to find each of them.

| Offset | Name | Type | Description |
|--------|------|------|-------------|
| 0x00 | magic | char[4] | Section magic. Always INF1. |
| 0x04 | size | uint32 | Length of the section in bytes. |
| 0x08 | num_entries | uint16 | Number of messages. |
| 0x0A | entry_size | uint16 | Length of each message entry in bytes. |
| 0x0C | bmg_file_id | uint16 | ID for this BMG file. Purpose unclear. |
| 0x0E | default_color | byte | Default color index. Purpose in Pikmin 2 is unclear. |
| 0x0F | unknown | byte | Unknown. |
| 0x10 | message_entries | Byte array of size num_entries * entry_size | List of message entries. |

Message Entries

| Offset | Name | Type | Description |
|--------|------|------|-------------|
| 0x00 | string_offset | uint32 | Offset into the String Pool where the referenced string begins. |
| 0x04 | attributes | Byte array of size entry_size - 4 | Text attributes. Purpose not entirely clear in Pikmin 2. |
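Putting the Text Index Table and its message entries together, a parser might look like the sketch below. `parse_inf1` is a hypothetical name, and the attribute bytes are returned untouched because their meaning is not documented here.

```python
import struct
from typing import List, Tuple


def parse_inf1(section: bytes, little_endian: bool = False) -> List[Tuple[int, bytes]]:
    """Return a list of (string_offset, attributes) pairs from an INF1 section."""
    order = "<" if little_endian else ">"
    magic, _size, num_entries, entry_size = struct.unpack_from(
        order + "4sIHH", section, 0)
    if magic != b"INF1":
        raise ValueError("not an INF1 section")
    if entry_size < 4:
        raise ValueError("entry_size too small to hold string_offset")
    entries = []
    for i in range(num_entries):
        start = 0x10 + i * entry_size  # message_entries begin at 0x10
        (string_offset,) = struct.unpack_from(order + "I", section, start)
        attributes = section[start + 4:start + entry_size]
        entries.append((string_offset, attributes))
    return entries
```

Each returned `string_offset` is relative to the start of the string array in the String Pool, not to the start of the file.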

String Pool

The String Pool is the main data section of every BMG file.

| Offset | Name | Type | Description |
|--------|------|------|-------------|
| 0x00 | magic | char[4] | Section magic. Always DAT1. |
| 0x04 | size | uint32 | Length of the section in bytes. |
| 0x08 | strings | Byte array of length size - 8 | Null-terminated strings. Offsets defined in the Text Index Table are offsets into this array. Always begins with a single empty string (one 0x00 byte). |
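One practical detail when reading strings from the pool: 0x1A escape sequences can contain 0x00 bytes in their payload, so a plain search for the next null byte may cut a string short. The sketch below skips over escape sequences using their length byte. It assumes an 8-bit encoding (CP1252, Shift-JIS, or UTF-8) and does not handle UTF-16; `read_raw_string` is a hypothetical name.

```python
def read_raw_string(dat1_section: bytes, string_offset: int) -> bytes:
    """Return the raw bytes of one string, without its null terminator.

    Assumes an 8-bit encoding. Escape sequences (0x1A, length, payload)
    are skipped as a unit so that 0x00 bytes inside them are not
    mistaken for the terminator.
    """
    if dat1_section[0:4] != b"DAT1":
        raise ValueError("not a DAT1 section")
    start = 8 + string_offset  # the string array begins at offset 0x08
    pos = start
    end = len(dat1_section)
    while pos < end and dat1_section[pos] != 0x00:
        if dat1_section[pos] == 0x1A:
            if pos + 1 >= end:
                raise ValueError("truncated escape sequence")
            length = dat1_section[pos + 1]  # counts the 0x1A and length bytes
            if length < 2:
                raise ValueError("invalid escape sequence length")
            pos += length
        else:
            pos += 1
    if pos >= end:
        raise ValueError("string is not null-terminated")
    return dat1_section[start:pos]
```

The returned bytes still contain any escape sequences and would need to be decoded with the encoding named in the header.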

0x1A Escape Sequences

Text in the String Pool can contain escape sequences that define text control codes. An escape sequence always starts with the byte 0x1A (the ASCII SUB character), followed by one byte giving the total size of the escape sequence, then the bytes of the actual escape code. The size is stored as a binary number, not as an ASCII digit, and it counts the 0x1A byte and the size byte itself. In hex, a full escape sequence to change the text color in Pikmin 2 might look like this: 1A06FF000001. Here 06 is the total length, and FF000001 is the color code for red.
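The framing described above can be sketched in Python as a pair of helpers, one that wraps an escape code and one that splits a raw string into text and escape pieces. Both names are hypothetical, and the splitter assumes an 8-bit string encoding.

```python
from typing import List, Tuple


def make_escape(code: bytes) -> bytes:
    """Wrap an escape code as 0x1A, total length, then the code bytes."""
    total = len(code) + 2  # the length counts the 0x1A and length bytes too
    if total > 0xFF:
        raise ValueError("escape code is too long for a one-byte length")
    return bytes([0x1A, total]) + code


def split_escapes(raw: bytes) -> List[Tuple[str, bytes]]:
    """Split raw string bytes into ("text", ...) and ("escape", ...) pieces."""
    pieces = []
    pos = 0
    text_start = 0
    while pos < len(raw):
        if raw[pos] != 0x1A:
            pos += 1
            continue
        if pos + 1 >= len(raw):
            raise ValueError("truncated escape sequence")
        length = raw[pos + 1]
        if length < 2 or pos + length > len(raw):
            raise ValueError("invalid escape sequence length")
        if text_start < pos:
            pieces.append(("text", raw[text_start:pos]))
        # Keep only the code bytes, dropping the 0x1A and length bytes.
        pieces.append(("escape", raw[pos + 2:pos + length]))
        pos += length
        text_start = pos
    if text_start < len(raw):
        pieces.append(("text", raw[text_start:]))
    return pieces
```

With these helpers, `make_escape(bytes.fromhex("FF000001"))` reproduces the 1A06FF000001 example from the text.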

Message ID Table

Messages can be given IDs in addition to just their offsets so they can be referred to more uniformly by game code. The MID1 section stores these IDs in the same order as the Text Index Table. Message IDs can be thought of as 32-bit unsigned integers, but in actuality they are two IDs in one: a 24-bit 'main' ID followed by an 8-bit 'sub' ID.

In Pikmin 2, the purpose of sub-IDs isn't entirely known. Sub-ID 1 is given to one copy of every treasure's name, and sub-IDs 2 and 3 are given to different copies of area and cave names.

| Offset | Name | Type | Description |
|--------|------|------|-------------|
| 0x00 | kind | char[4] | Section magic. Always MID1. |
| 0x04 | size | uint32 | Length of the section in bytes. |
| 0x08 | num_entries | uint16 | Number of message IDs. Generally should match the number of messages in the Text Index Table. |
| 0x0A | format | byte | Purpose unknown. |
| 0x0B | info | byte | Purpose unknown. |
| 0x0C | unknown | byte[4] | Most likely just padding. Usually 0. |
| 0x10 | message_ids | uint32[num_entries] | Message IDs. |
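As a sketch of how the Message ID Table could be read, the following Python splits each 32-bit value into its main and sub IDs. It assumes the main ID occupies the upper 24 bits and the sub ID the lowest 8 bits of the decoded integer; whether that still holds for little-endian files has not been verified here. The function names are hypothetical.

```python
import struct
from typing import List, Tuple


def split_message_id(value: int) -> Tuple[int, int]:
    """Split a 32-bit message ID into (24-bit main ID, 8-bit sub ID)."""
    return value >> 8, value & 0xFF


def parse_mid1(section: bytes, little_endian: bool = False) -> List[Tuple[int, int]]:
    """Return (main_id, sub_id) pairs in Text Index Table order."""
    order = "<" if little_endian else ">"
    magic, _size, num_entries = struct.unpack_from(order + "4sIH", section, 0)
    if magic != b"MID1":
        raise ValueError("not a MID1 section")
    # message_ids is an array of num_entries uint32 values starting at 0x10.
    ids = struct.unpack_from("%s%dI" % (order, num_entries), section, 0x10)
    return [split_message_id(v) for v in ids]
```

Since the IDs are stored in the same order as the Text Index Table, entry i of this list belongs to message entry i.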