Starcraft 2 Model Format Pt. 2

Due to popular request, here is the second installment of my Starcraft 2 .m3 model format series.
First of all, I’ve gotten quite a few emails and comments on this topic, including some links to other people also working on the format, so here is a collection of links:
Thanks everyone for the links!
Today I’ll chat a little about the stored meshes. There are three parts to rendering out simple meshes: the vertices, the faces, and the regions. But first, we need to find them.
That’s where the MODL chunk comes in. It should be considered the header of the m3 format; it defines where to find what. However, we’re only interested in a few parts. It’s important to note that I’ve found two different versions of the MODL header, 0x14 and 0x17. Version 0x14 is slightly different, and I’m not going to cover it in this post.
struct MODLHeader {
    uint32 stuff_we_dont_need_atm[17]
    uint32 flags
    uint32 vertex_data_size
    uint32 vertex_tag
    uint32 div_count
    uint32 div_tag
    [… some more data we don’t care about …]
}
The precious parts here are the reference to the vertices, the reference to the div, and the flags. The flags will become handy when parsing the vertices, as they describe what can be found in the vertex format. In detail:
if flags & 0x40000 == 0:
    vertex_size = 32
else:
    vertex_size = 36
There is definitely other information in the flags, but this is all we need right now. The vertices can now be found at the vertex tag, which is just a collection of bytes. The number of vertices is vertex_data_size / vertex_size. The format is as follows:
struct Vertex {
    float32 position[3]
    uint8 boneweights[4]
    uint8 boneindices[4]
    uint8 normal[4]
    uint16 uv[2]
    if vertex_size == 36: {
        uint8 unknown[4]
    }
    uint8 tangent[4]
}
Position is just the object space position of the vertex.
Boneweights range from 0 to 255 and represent the weight factor (divided by the sum of all weights) for each bone matrix.
Boneindices are indices that point to the corresponding bone matrix.
The normal is compressed and can be extracted as c = 2*c/255.0-1 for each component.
The UVs are fixed-point values scaled by 2048, so they need to be divided by 2048.0 before use in OpenGL.
Finally, the tangent is compressed in the same way as the normal.
To get the bitangent and set up a correctly oriented, orthonormal tangent space, we take the cross product of the normal and the tangent and then multiply it by the w component of the normal: bitangent = cross(normal, tangent) * normal.w.
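Putting the vertex layout and the decompression rules above together, here is a small sketch of decoding one vertex with Python’s struct module. The function name and return shape are my own; the offsets follow directly from the field sizes in the struct above.

```python
import struct

def decode_vertex(data, offset, vertex_size):
    # data is the raw byte blob of the vertex chunk; offset points at
    # the start of one vertex of the given size (32 or 36 bytes).
    px, py, pz = struct.unpack_from('<3f', data, offset)
    weights = struct.unpack_from('<4B', data, offset + 12)
    indices = struct.unpack_from('<4B', data, offset + 16)
    nraw = struct.unpack_from('<4B', data, offset + 20)
    u, v = struct.unpack_from('<2H', data, offset + 24)
    # The 36-byte variant carries 4 extra unknown bytes before the tangent.
    tangent_off = offset + (32 if vertex_size == 36 else 28)
    traw = struct.unpack_from('<4B', data, tangent_off)

    # Decompress normal and tangent: c = 2*c/255.0 - 1 per component.
    normal = [2.0 * c / 255.0 - 1.0 for c in nraw]
    tangent = [2.0 * c / 255.0 - 1.0 for c in traw]

    # UVs are fixed-point, scaled by 2048.
    uv = (u / 2048.0, v / 2048.0)

    # Bitangent: cross(normal, tangent) * normal.w.
    n, t = normal, tangent
    cross = [n[1] * t[2] - n[2] * t[1],
             n[2] * t[0] - n[0] * t[2],
             n[0] * t[1] - n[1] * t[0]]
    bitangent = [c * normal[3] for c in cross]
    return (px, py, pz), weights, indices, normal[:3], uv, tangent[:3], bitangent
```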
The DIV is really just a container for two other important chunks, the faces and the regions:
struct DIV {
    uint32 indices_count
    uint32 indices_tag
    uint32 regions_count
    uint32 regions_tag
    [… some other stuff …]
}
The triangles are just stored as triplets in the indices chunk (a ‘U16_’ tag of uint16 values). There are indices_count/3 triangles in the mesh.
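Reading the triangle list out of the ‘U16_’ chunk is then a one-liner; a minimal sketch (function name my own, "data" being the raw chunk bytes):

```python
import struct

def read_triangles(data, indices_count):
    # The 'U16_' chunk is just indices_count little-endian uint16 values;
    # every three consecutive indices form one triangle.
    indices = struct.unpack_from('<%dH' % indices_count, data, 0)
    return [indices[i:i + 3] for i in range(0, indices_count, 3)]
```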
To correctly render the mesh, we also need the region chunk:
struct Region {
    uint32 unknown   // Note: updated, thanks to NiNtoxicated!
    uint16 vertex_offset
    uint16 vertex_count
    uint32 index_offset
    uint32 index_count
    uint8 unknown[12]
}
Now, to render a mesh, we can iterate through all regions, and for each region, we render index_count/3 triangles with the indices from the indices chunk. This looks something like this:
def drawModel(self):
    GL.glBegin(GL.GL_TRIANGLES)
    for region in div.regions:
        for i in range(region.index_count):
            v = vertices[div.indices[region.index_offset + i]]
            GL.glTexCoord2f(v.uv[0] / 2048.0, v.uv[1] / 2048.0)
            GL.glVertex3f(v.position[0], v.position[1], v.position[2])
    GL.glEnd()
And that’s it! Setting up the proper textures and materials is rather complex and is definitely worth another blog entry 😛 That’s it for now. As usual, let me know if you have any questions or comments!
PS: let me know if you know how to properly format source code on blogger 🙂

Starcraft 2 Model Format Pt. 1

Due to a number of requests I’ve received, I’ve decided to write down some of my findings in the .m3 model format. Here we go:
Overall structure
The .m3 format is a mixture of the World of Warcraft .m2 format and the Warcraft 3 .mdx format, in that it has a parsing structure similar to the former but uses tags like the latter. The list of tags with offsets is located at the end of the .m3 file. Specifically, the file header is
struct M3Header {
    fourcc header_tag
    uint32 tagindex_offset
    uint32 tagindex_size
    uint32 unknown1
    uint32 unknown2
}
The header_tag should be ‘MD33’. The tagindex is the aforementioned list of tags; it starts at tagindex_offset and has tagindex_size elements. I suspect that unknown1 and unknown2 point to the tag where the recursive parsing starts, which is the ‘MODL’ tag. The elements of the tagindex have the form:
struct Tag {
    fourcc tag
    uint32 offset
    uint32 repetitions
    uint32 version
}
The first two elements, tag and offset, are pretty obvious I guess. The interesting part starts with repetitions. Each tag describes a fixed-size structure, which may differ a little between files, hence the version number. The number of repetitions tells us how often this structure is repeated in the chunk. As an example, a string chunk (with a ‘CHAR’ tag) describes a string that consists of repetitions characters of size 1 byte (unless it’s UTF-8, but I’ve not encountered any special characters yet). This is great for loading, because we don’t need dynamic parsing. We can just allocate and read the structure as many times as requested, without worrying about dynamically sized content.
If dynamic sizes are required, they are placed in a separate chunk and referenced via the tag. Again, a good example are strings, which you can find all over the file, usually directly after the chunks where they are needed.
All chunks are padded to 16 byte boundaries using 0xaa.
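The 16-byte padding means the next chunk always starts at an aligned offset; a tiny helper (name my own) computes that:

```python
def padded_size(size, alignment=16):
    # Round size up to the next multiple of alignment; the pad bytes
    # themselves are filled with 0xAA in .m3 files.
    return (size + alignment - 1) // alignment * alignment
```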
Once the tag directory has been read, my parsing starts at the ‘MODL’ tag (and I suspect Blizzard’s parsing does, too). The ‘MODL’ chunk is essentially the root of the parsing tree for the .m3 format, just like the header of the .m2 format, but instead of referencing the other chunks by offsets, they are referenced by their index in the tagindex. The references usually also contain the number of repetitions, which makes them quite easy to spot.
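The header and tag structs above translate directly into a small reader sketch. The names are mine, and I keep the fourccs as raw bytes here, since depending on how you dump them they may appear reversed:

```python
import struct
from collections import namedtuple

Tag = namedtuple('Tag', 'fourcc offset repetitions version')

def read_tag_index(data):
    # M3Header: fourcc + 4 uint32s = 20 bytes at the start of the file.
    magic, idx_offset, idx_size, unk1, unk2 = struct.unpack_from('<4s4I', data, 0)
    assert magic == b'MD33'
    tags = []
    for i in range(idx_size):
        # Each tagindex element: fourcc + 3 uint32s = 16 bytes.
        fourcc, offset, reps, version = struct.unpack_from(
            '<4s3I', data, idx_offset + 16 * i)
        tags.append(Tag(fourcc, offset, reps, version))
    return tags
```

From the resulting list you can then look up the ‘MODL’ entry and start the recursive parsing from there.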
That’s it for today and the general parsing and file structure. Next time I’ll get into the details of detecting the vertex format, reading out the vertices and faces and finally rendering a rough version of the model.
As an outlook, here are some more pictures (Templar, Ultralisk, Hydralisk):

Blizzard’s art team is just incredible!

Starcraft 2

By now most of my regular readers probably know that I’m going to pause working at the university by the end of February and start working full-time for Limbic soon after. That means I’ll also start to blog more again.
By now most of you also have heard the exciting news that the Starcraft 2 Beta launched last week. As I have done many times before, I couldn’t resist digging a little into the game data formats to see what I could find. Of key interest to me is the .m3 model format that is being used in SC2. It’s a natural evolution of the Warcraft 3 .mdx and World of Warcraft .m2 formats.
For as long as I can remember, I’ve been dissecting the formats of games that I liked from a technical point of view, in order to learn how they accomplish those great games (some prime examples are Half-Life, Quake 3, Black & White, Warcraft 3, World of Warcraft). Hence, for the sake of the good old times and out of pure curiosity, I couldn’t resist opening up the SC2 files in my hex editor and digging around a little. To my surprise, the .m3 format is quite a bit different from .mdx and .m2. It didn’t take too long, though, to find my way through the file and extract the meaningful bits of data I needed to get a better understanding of the whole thing. So far I can say that I already love the SC2 engine, even more than the WC3 and WoW engines.
Here are two examples of cool models I rendered with my python tool, only vertices+normals for now (Hatchery and Hydralisk):

If someone is interested in my findings, or if I find more time to make the tool more worthwhile, I may upload it somewhere (github, probably), and I’m happy to share my insights with anyone interested.

Appstore Piracy

I have absolutely no clue where they got those numbers from, but I cannot confirm the 75% piracy rate they claim at all. For TowerMadness, it has been at most 10%, even before we released Zero.
Also, if their numbers are right, 0.45 billion paid downloads would generate 1.35 billion warez downloads. Does anyone know an app piracy page that comes even remotely close to that number of downloads? Assuming an average of 5 MiB per app (and that’s probably a rather low estimate), that would be 6.29 PETABYTES of traffic. That’s probably around $670k just for the traffic if hosted on Amazon S3.
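For the curious, the back-of-the-envelope math checks out; here it is spelled out, with the ~$0.10 per GiB S3 transfer price being my assumption for the pricing at the time:

```python
paid = 0.45e9                # paid downloads from the claim
pirated = 3 * paid           # 75% piracy rate => 3 pirated copies per sale
traffic_mib = pirated * 5    # assuming 5 MiB per app
traffic_pib = traffic_mib / 1024**3          # MiB -> PiB
cost = traffic_mib / 1024 * 0.10             # assumed ~$0.10 per GiB egress
```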
And whoever started calling it piracy should be put on a boat, sent to the Horn of Africa, and made to hang out with the real pirates a little.
‘Nuff said.

AppEngine Performance

Two small AppEngine performance tidbits that saved Limbic a lot of money in the long run:

1) Queries on keys:
We’ve got a little query that fetches up to 100 games that haven’t been processed by our statistics bot yet. Because we have 10 statsbot instances running on 2 servers, it gets called quite a lot. We’d been using a very simple query to achieve this, but it used a lot of CPU and API CPU, actually being the second most CPU-intensive function. I applied two small tweaks: first, I’m only fetching keys now (instead of the whole objects); second, I doubled the size of the request from 100 to 200, returning twice as many games per request. The results? A 5-10x reduction in CPU and API CPU for this handler. That’s a few $ a day 😀 Related AppEngine Documentation
2) Object Deletion:
Because of the increasing number of games we receive every second for TowerMadness and the storage costs on the AppEngine (tm-zero alone generates about 40 GB of data a month), we decided to delete games after a month. This also keeps the meta-game fresh, because old strategies “leave” the pool as time progresses. Anyway, it sounds like an easy way to reduce long-term costs (and these really add up), but the implementation turned out not to be that easy. The problem is, the AppEngine is really slow when it comes to deletions. And when I say really slow, I don’t just mean write-slow: writes are pretty slow, but deletes appear to be even slower. Another problem is the AppEngine remote_api latency. Initially, I was only able to delete about 0.5-1.5 games a second (contrast that with a peak game submission rate of about 10 games/second). It was also really expensive (the AppEngine dashboard was glowing when I ran the remote script :-). Pretty crazy for just “deleting old data”.
I’ve tried many, many different optimizations (there’s real money on the line here). What really made the difference was batching. I’m now accumulating a deletion list, which is flushed when it reaches its maximum size (about 100 games, because each game has up to 3 datastore entities, and 300 was the largest deletion batch size that didn’t time out too excessively for me). Secondly, and even more importantly, I cut down significantly on the latency with a small but great trick:
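The batching pattern itself is independent of AppEngine; here is a minimal sketch of the accumulate-and-flush logic, with a hypothetical delete_fn standing in for db.delete:

```python
class BatchDeleter:
    """Accumulate keys and flush them in batches, as described above.

    delete_fn stands in for db.delete; max_batch mirrors the ~300-key
    limit (100 games x up to 3 entities each) mentioned in the post.
    """
    def __init__(self, delete_fn, max_batch=300):
        self.delete_fn = delete_fn
        self.max_batch = max_batch
        self.pending = []

    def delete(self, *keys):
        # Queue keys; only hit the datastore once the batch is full.
        self.pending.extend(keys)
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self):
        # Issue one batched delete call for everything queued so far.
        if self.pending:
            self.delete_fn(self.pending)
            self.pending = []
```

Remember to call flush() once more at the end of the run, so a partially filled final batch is not lost.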
We run through the list of games with a key-index based query. We then fetch each of those games, check the date, and decide whether or not to delete it. If a game is deleted, we also delete its stats object (for the homepage stats like kills, time, etc.) and its game hash entry. Initially I used a query here (select gamehash where hash=), and later the slightly more optimized gamehash.get_by_key_name(). Both forced the remote API to download the object from the AppEngine, though. But it turns out that you can just construct the key of an object if you know its key name, via db.Key.from_path, like:
hash_key = db.Key.from_path('GameHash', hash_key_name)
This gave an insane speedup, because now there is only one query: the one fetching the games. The other two queries were replaced, since we can pass the keys from from_path straight to the db.delete function. This whole optimization cut the cost by an extreme amount and actually boosted the deletion performance way beyond peak upload load, so we can now delete old games at the rate they originally came in.
All in all, I’m extremely happy with the choice of the Google AppEngine for TowerMadness, but I guess you really have to spend some time to make it perform really well. One way or another, TowerMadness Zero showed us that it scales better than anything I’ve seen before.

TowerMadness Zero

We’ve got a pretty crazy experiment running at the moment. At the risk of cannibalizing our pretty solid sales of TowerMadness, we decided to release a full version of TowerMadness, for free, called TowerMadness Zero. The only catch with Zero: there are some ads. And it’s not even much, just one ad per game played.

With this move, TM is pretty much indisputably the number one free tower defense game on the App Store, as there aren’t really many free full-featured alternatives. We’re already skyrocketing in the rankings, at #3 in Strategy in Japan, #8 in Strategy in Germany, and #16 in Strategy in the US after a single day. The full version is of course still on the App Store for $3, and so far sales have only dropped a little. We think there is actually a chance the full version’s sales will go up as more people find out about TM. The current conversion rate from Zero to the full version is about 1:62 (meaning that for every 62 downloads of Zero, we see one download of the full version).
We’re using Greystripe as our ad provider. Support is great; let’s see how the eCPM works out.

I think it’s a fun experiment, and we can’t really lose with this. What’s better than having thousands of people playing your game every day? While we’re at it, I’ve computed some stats for TM. Turns out that all players together have spent about 26 years playing TowerMadness already (and it’s only been out for a few months). The 45k players have earned 3.7 billion points in the almost one million games played, killing about 330 million aliens.
You can get TowerMadness Zero here.
Let’s see how this experiment goes. If it doesn’t go well, I guess TM Zero will disappear from the App Store in a few days. But if it works, I guess it’s going to be exciting times! Anyways, eat this, pirates! Arrrr!

Piracy on the App Store

Iman sent me a link about some dude spending way too much time on anti-piracy measures on the App Store:

This is totally insane. The time he’s spending on preventing piracy (without actually achieving 100% piracy protection) would be so much better spent on just developing good apps.
As an example, with TowerMadness, we actually measured a little bit of pirate activity (because we had more unique online users than actual sales), especially in the early days. But it was only a tiny fraction of sales, and we didn’t care about it at all. And so far we’re on a good course, as sales are better than ever and have been very steady for the last month or so.
Going a step further, I even believe that piracy probably got us a few more sales. I don’t believe the people who pirated the game would have bought it had they not been able to pirate it, so we really didn’t lose any money there. On the other hand, it’s free advertising. TowerMadness is awesome, so the pirates will show it to their friends (yes, even pirates have friends!).
‘Nuff said.