Monday, January 21, 2013

Compress/Decompress String in C#

How can we compress and decompress a string in C#?

Some of us might want to reduce the size of a string before storing it in a database. This post explains how to do that in C#.

For compression and decompression in C#, the GZipStream class (in the System.IO.Compression namespace) is a popular choice. It wraps another stream and performs the actual compression or decompression as data is written to it or read from it.

Please see the Microsoft documentation for more details. Note that GZipStream is not limited to string compression: it can also wrap a FileStream to compress files on the file system, much as other compression tools such as WinZip do.
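To illustrate the file case mentioned above, here is a minimal sketch of compressing and decompressing a file with GZipStream. The class name FileCompressionSketch and its method names are our own placeholders for illustration, not part of any library.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper class, shown only as a sketch.
public static class FileCompressionSketch
{
    // Reads sourcePath and writes a gzip-compressed copy to targetPath.
    public static void CompressFile(string sourcePath, string targetPath)
    {
        using (var source = File.OpenRead(sourcePath))
        using (var target = File.Create(targetPath))
        using (var gzip = new GZipStream(target, CompressionMode.Compress))
        {
            // Everything written to gzip is compressed and forwarded to target.
            source.CopyTo(gzip);
        }
    }

    // Reads a gzip file at sourcePath and writes the decompressed data to targetPath.
    public static void DecompressFile(string sourcePath, string targetPath)
    {
        using (var source = File.OpenRead(sourcePath))
        using (var gzip = new GZipStream(source, CompressionMode.Decompress))
        using (var target = File.Create(targetPath))
        {
            // Everything read from gzip is decompressed on the fly.
            gzip.CopyTo(target);
        }
    }
}
```

Calling FileCompressionSketch.CompressFile("notes.txt", "notes.txt.gz") would produce a gzip file next to the original; the file names here are only examples.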

Here is how we compressed a string. Our code is below.

// Requires: using System; using System.IO; using System.IO.Compression; using System.Text;
// Extension methods must be declared inside a static class.
public static string Compress(this string text)
{
    var buffer = Encoding.UTF8.GetBytes(text);
    using (var memoryStream = new MemoryStream())
    {
        // leaveOpen: true keeps memoryStream usable after the GZipStream is disposed.
        // Disposing the GZipStream flushes the remaining compressed bytes.
        using (var stream = new GZipStream(memoryStream, CompressionMode.Compress, true))
        {
            stream.Write(buffer, 0, buffer.Length);
        }
        var compressed = memoryStream.ToArray();
        // Layout: 4-byte length of the original data, followed by the gzip data.
        var gZipBuffer = new byte[compressed.Length + 4];
        Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gZipBuffer, 0, 4);
        Buffer.BlockCopy(compressed, 0, gZipBuffer, 4, compressed.Length);
        return Convert.ToBase64String(gZipBuffer);
    }
}

The Compress method uses a MemoryStream to hold the compressed bytes in memory. Then, with the help of the Buffer and BitConverter classes, we copy those bytes into a new array prefixed with the 4-byte length of the original UTF-8 data, and return that array as a Base64 string.
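To make that layout easier to see, here is a small sketch that inspects a string produced by Compress. PayloadInspector and its methods are hypothetical names we made up for this example. It assumes the payload is read on a machine with the same byte order as the one that wrote it, because BitConverter uses the platform's byte order.

```csharp
using System;

// Hypothetical helper class, shown only as a sketch.
public static class PayloadInspector
{
    // Returns the original UTF-8 byte count stored in the first 4 bytes.
    public static int ReadOriginalLength(string compressedText)
    {
        var bytes = Convert.FromBase64String(compressedText);
        return BitConverter.ToInt32(bytes, 0);
    }

    // Returns true when the bytes after the 4-byte prefix start with
    // the gzip magic number 0x1F 0x8B.
    public static bool HasGzipHeader(string compressedText)
    {
        var bytes = Convert.FromBase64String(compressedText);
        return bytes.Length >= 6 && bytes[4] == 0x1F && bytes[5] == 0x8B;
    }
}
```

A check like this can be handy before decompressing a value read from the database, to confirm it really is one of our compressed strings.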

The reverse operation is to decompress the string. The code below is what we wrote and used.

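// Requires: using System; using System.IO; using System.IO.Compression; using System.Text;
public static string Decompress(this string compressedText)
{
    var gZipBuffer = Convert.FromBase64String(compressedText);
    // The first 4 bytes hold the length of the original UTF-8 data.
    int dataLength = BitConverter.ToInt32(gZipBuffer, 0);
    var buffer = new byte[dataLength];
    // The gzip data starts right after the 4-byte length prefix.
    using (var memoryStream = new MemoryStream(gZipBuffer, 4, gZipBuffer.Length - 4))
    using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Decompress))
    {
        // Read may return fewer bytes than requested, so loop until the buffer is full.
        int totalRead = 0;
        while (totalRead < buffer.Length)
        {
            int read = gZipStream.Read(buffer, totalRead, buffer.Length - totalRead);
            if (read == 0) break; // end of stream reached early
            totalRead += read;
        }
        return Encoding.UTF8.GetString(buffer, 0, totalRead);
    }
}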

The code above simply decompresses the previously compressed text. It mirrors the Compress method, except that the CompressionMode is Decompress, the opposite of the flag used when compressing an uncompressed string.
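To try the round trip end to end, here is a self-contained sketch that repeats compact versions of the two methods using the same payload layout (4-byte length prefix, gzip data, Base64). The class name StringCompressionDemo is our own placeholder.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical container class; extension methods need a static class.
public static class StringCompressionDemo
{
    public static string Compress(this string text)
    {
        var raw = Encoding.UTF8.GetBytes(text);
        using (var memory = new MemoryStream())
        {
            // Reserve the first 4 bytes for the original length.
            memory.Write(BitConverter.GetBytes(raw.Length), 0, 4);
            using (var gzip = new GZipStream(memory, CompressionMode.Compress, true))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return Convert.ToBase64String(memory.ToArray());
        }
    }

    public static string Decompress(this string compressedText)
    {
        var payload = Convert.FromBase64String(compressedText);
        var raw = new byte[BitConverter.ToInt32(payload, 0)];
        using (var memory = new MemoryStream(payload, 4, payload.Length - 4))
        using (var gzip = new GZipStream(memory, CompressionMode.Decompress))
        {
            // Keep reading until the buffer is full or the stream ends.
            int total = 0;
            while (total < raw.Length)
            {
                int read = gzip.Read(raw, total, raw.Length - total);
                if (read == 0) break;
                total += read;
            }
            return Encoding.UTF8.GetString(raw, 0, total);
        }
    }
}
```

With this in place, text.Compress().Decompress() should give back the original text. Keep in mind that very short strings can come out longer than they went in: gzip adds at least 18 bytes of header and trailer, we add 4 bytes for the length, and Base64 grows the result by about one third. The saving shows up on longer or repetitive text.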

We hope you find this useful. Please make sure you understand how the compression and decompression process works before relying on it. You can also extend the same approach to files on the file system.
