Yann's Blog - Software and hardware

December 28, 2009

About rle_unpack in libevecache

Filed under: EVE-Central — Yann @ 12:07 pm

I have had several questions regarding the “rle_unpack” function in libevecache. In order to not repeat myself in e-mails, I decided to make a quick post describing the logic:

The market rows are compressed with an odd variant of run-length-encoding. In this case, the extra “0″ bytes are suppressed and encoded into one byte.

The row starts with a opcode byte, which is split as follows (high bit to low bit). You can find this broken out in the struct packer_opcap.

x x x t y y y b
7 6 5 4 3 2 1 0

There “t” is a flag to say “unpack 0s” if the bit is set, and “copy from source” if the bit is not set. So if the bit is set, “XXX” bytes of “0″ are copied to the output buffer, and if its not set, “XXX” bytes are copied from the input to the output.

This repeats with B and YYY. After one opcode byte has been processed, the next opcode byte is read, or processing is terminated if the input buffer has been exhausted.

This took a fair bit of time to figure out, with lots of sample data and market logs as the study set.

How much space does it save per order row? About 10-30%, from my samples.

2 Comments »

  1. I have been working on a VB.NET (Don’t laugh dammit) version of libevecache and because it STILL took me hours to figure this one out i think it’s only valid to post my methodology for solving the problem.

    This is a copy+paste of the commented code in my VB project.

    ‘ Observe this methodology for example…

    ‘ BZero & BLen 0100 1000 0001 0111 0001 0100 0011
    ‘ TZero & TLen 1000 1111 0000 1010 1010 1000 1011
    ‘ Operation Bytes 48 8F 10 7A 1A 48 3B
    ‘ Data Bytes CF 1413 04 2440 C0E6 7F39 66B3 CA01 4617 EA58 0A 01 109F 9303 B496 98 9AD3 C901 1579 FF7F 0E
    ‘ Result 00CF 1413 0400 0000 0000 0000 0000 2440 C0E6 7F39 66B3 CA01 4617 EA58 0A00 0000 0100 0000 109F 9303 B496 9800 9AD3 C901 0000 0000 1579 FF7F 0E
    ‘ ZERO’s added 1 9 3 3 1 4
    ‘ Data bytes copied 4 15 1 7 4 5

    ‘ TZero, TLen deals with the first 4 bits of the operation byte.
    ‘ BZero, BLen deals with the last 4 bits of the operation byte.

    ‘ TZero is the 4th bit and if it’s set it means it should ADD X number of &h00 bytes to the result.
    ‘ TLen is the number of Zero’s (+1) to add if the TZero flag is set.
    ‘ If TZero isn’t set then instead of adding null bytes it takes the following (8 – BLen) bytes and adds to the result.

    ‘ This is repeated with BZero and BLen respectively.

    ‘ So, for the first byte in the compressed row (&h48) TZero is set and we add TLen(0) + 1 = 1, null bytes to the result.
    ‘ BZero is not set so we copy 8 – BLen(4) = 4, bytes from the compressed data row.

    ‘ We now read the next operand byte and here TZero is set so we insert TLen(7) + 1 = 8, null bytes to the result.
    ‘ BZero is also set so we insert another BLen(0) + 1 = 1, null bytes to the result.

    ‘ Since the last operation only inserted null bytes (9 of them in total) we simply read the next operand.
    ‘ TZero isn’t set so we copy 8 – TLen(0) = 8, bytes from the compressed row.
    ‘ BZero isn’t set so we copy 8 – BLen(1) = 7, bytes from the compressed row.
    ‘ This means we copied (8 + 7) 15 bytes from the compressed row.

    ‘ That is half the compressed row done, i leave the rest up to you to grasp.

    Hope this helps someone.

    Comment by Cadde — March 16, 2010 @ 7:15 am

  2. Sorry for double comment…

    But of course this stripped the spacetastic formatting i had used so it might not make any sense at all… Try and copy+pasting it into a monospace font editor and adding the spaces back to it.

    Comment by Cadde — March 16, 2010 @ 7:17 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress