SCO

SCO files contain Grand Theft Auto IV's game scripts. The format is new and replaces the older SCM format.

File Format
A SCO file is laid out in four segments. The first is the header, which contains information about the SCO file. The second is the code segment, which contains the opcodes that govern how the script behaves. The third is the global variables container, which holds enough space for the script's global variables. The last is the public variables container; it is unclear what this segment actually does.

Header
There are two types of SCO file: encrypted and unencrypted. Both share the same unencrypted header structure, which you can use to determine which type of SCO file you are dealing with. The header is 24 bytes long.

4b - CHAR[4]/UINT32 - SCO Identifier
4b - UINT32 - Code Size
4b - UINT32 - Global Var Count
4b - UINT32 - Public Var Count
4b - UINT32 - Script Flags
4b - UINT32 - Signature
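As an illustration, the six header fields can be read with Python's struct module. This is only a minimal sketch: the names ScoHeader and parse_sco_header are hypothetical, and little-endian byte order is an assumption (it matches the identifier values quoted in this article).

```python
import struct
from collections import namedtuple

# Six UINT32 fields; little-endian byte order is an assumption.
HEADER_FORMAT = "<6I"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes

# Hypothetical container; field names follow the header layout above.
ScoHeader = namedtuple(
    "ScoHeader",
    "identifier code_size global_var_count public_var_count "
    "script_flags signature",
)

def parse_sco_header(data):
    """Unpack the first 24 bytes of a SCO file into a ScoHeader."""
    if len(data) < HEADER_SIZE:
        raise ValueError("a SCO header needs at least 24 bytes")
    return ScoHeader(*struct.unpack_from(HEADER_FORMAT, data, 0))
```

The function only reads the header, so it can be applied to encrypted and unencrypted files alike.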

The SCO Identifier will be "SCR\r" (or 0x0D524353) in an unencrypted file, and "scr"+0xE (or 0x0E726373) in an encrypted file. To decrypt an encrypted file you must decrypt each segment (except the header) separately using GTA IV's AES cryptography.

The Code Size is the number of bytes the code segment takes up.

The Global Var Count is the number of global variables the SCO file contains. The segment for global variables starts at Header Size + Code Size and continues for 4x the global variable count, because each global variable is stored in 4 bytes.

The Public Var Count is the number of public variables the SCO file contains. The segment for public variables starts at Header Size + Code Size + Global Var Count * 4 and continues for 4x the public variable count, because each public variable is stored in 4 bytes.

The Script Flags are boolean bits which are currently unexplained. The Signature is the same in every script except navgen_main; it could possibly set the script priority.
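The identifier check and the offset arithmetic described above can be sketched as follows. The function names sco_kind and segment_layout are hypothetical, and reading the identifier as a little-endian UINT32 is an assumption that matches the two hexadecimal values given.

```python
import struct

HEADER_SIZE = 24
ID_UNENCRYPTED = 0x0D524353  # "SCR\r" read as a little-endian UINT32
ID_ENCRYPTED = 0x0E726373    # "scr" + 0x0E read as a little-endian UINT32

def sco_kind(data):
    """Classify a SCO file by the first four bytes of its header."""
    (identifier,) = struct.unpack_from("<I", data, 0)
    if identifier == ID_UNENCRYPTED:
        return "unencrypted"
    if identifier == ID_ENCRYPTED:
        return "encrypted"
    return "unknown"

def segment_layout(code_size, global_var_count, public_var_count):
    """Return (start, end) byte offsets for each segment after the header."""
    code_start = HEADER_SIZE
    globals_start = code_start + code_size
    publics_start = globals_start + global_var_count * 4  # 4 bytes each
    publics_end = publics_start + public_var_count * 4    # 4 bytes each
    return {
        "code": (code_start, globals_start),
        "globals": (globals_start, publics_start),
        "publics": (publics_start, publics_end),
    }
```

For example, a file with a Code Size of 100, 3 global variables and 2 public variables would have its code at bytes 24 to 124, its globals at 124 to 136 and its publics at 136 to 144.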

Code Segment
The Code Segment contains the opcodes which govern the script's behaviour.

Opcodes
Opcodes can have varying sizes, but every opcode is identified by its first byte. There are 79 opcodes which can occur, and any opcode above 80 is a Push opcode which pushes its own number minus 96 onto the stack. For example, opcode 100 will push 4 onto the stack. Opcodes 76, 77 and 78 deal with XLive protected buffers and are available on the PC platform only. Undefined opcodes (i.e. opcode 79) cause the script execution to be forcibly aborted.
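The Push rule can be expressed as a small helper. This is a sketch only: push_value is a hypothetical name, and the boundary follows the wording above literally (strictly greater than 80), so opcode 80 is not treated as a Push opcode here.

```python
def push_value(opcode):
    """Return the value a Push opcode puts on the stack, or None otherwise."""
    if not 0 <= opcode <= 255:
        raise ValueError("an opcode is a single byte")
    if opcode > 80:  # assumption: "above 80" taken literally
        return opcode - 96
    return None
```

With this rule, opcode 100 yields 4 and opcode 96 yields 0, while opcodes below 96 that still count as Push opcodes yield negative values.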

Vectors on the stack are pointers to the memory containing the full vector. A vector has this structure:

4b - FLOAT32 - X
4b - FLOAT32 - Y
4b - FLOAT32 - Z
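A vector laid out this way can be decoded from a byte buffer as three consecutive 32-bit floats. In this sketch, Vector and read_vector are hypothetical names, the buffer stands in for the script's memory, and little-endian byte order is an assumption.

```python
import struct
from collections import namedtuple

Vector = namedtuple("Vector", "x y z")

def read_vector(memory, address):
    """Decode three FLOAT32 values (X, Y, Z) starting at the given offset."""
    return Vector(*struct.unpack_from("<3f", memory, address))
```

The whole structure occupies 12 bytes, so consecutive vectors in memory would sit 12 bytes apart.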

Global Variables
This segment contains the script's global variables. Each global variable is 4 bytes long and can hold static information stored in the script file itself.
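The static values stored in this segment of an unencrypted file could be read as raw 4-byte words, as in the sketch below. The name read_globals is hypothetical, and treating every variable as a little-endian unsigned integer is an assumption, since the file does not say whether a given slot holds an integer, a float or something else.

```python
import struct

def read_globals(data, code_size, global_var_count, header_size=24):
    """Return the initial global variable values as raw 32-bit words."""
    start = header_size + code_size        # globals follow the code segment
    end = start + global_var_count * 4     # 4 bytes per variable
    if end > len(data):
        raise ValueError("file is too short for the declared globals")
    return list(struct.unpack_from("<%dI" % global_var_count, data, start))
```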

Public Variables
It is uncertain what this section actually does.

High Level Representation
To turn the assembly of a SCO file into a high level representation, some factors have to be considered. Arrays and structures can be defined, so it is reasonable to assume there is some kind of typecasting, even if it is only between the following types: int, float, string or a predefined structure. It is interesting to note that opcodes exist to typecast integers and concatenate them directly to strings, which suggests the SCO language has a Java-like way of handling strings (using + and += for concatenation). The SCO file does not seem to have native support for the notion of classes, but this is not to say they could not be implemented within SCO itself, similar to how they are done at a low level in C++.

Tools

 * SCO (Dis-)assembler by
 * OpenIV – contains a built-in decompiler
 * SparkIV – contains a built-in decompiler