How to Unzip Azure Blobs Programmatically Using Memory Streams in Azure WebJobs

Introduction

Recently I had to extract files from a .zip file stored in Azure Blob Storage from within an Azure WebJob process.

The approach

This is what I did from scratch:

  1. Create a WebJob project.

  2. Create a non-continuous Azure WebJob:

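    using Microsoft.Azure.WebJobs;

    // Program.cs file
    class Program
    {
        static void Main()
        {
            var host = new JobHost();
            // Invoke the function once; the job is not continuous.
            host.Call(typeof(Functions).GetMethod("MyMethod"));
        }
    }

    // Functions.cs file
    public class Functions
    {
        [NoAutomaticTrigger]
        public static void MyMethod()
        {
            // code
        }
    }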
  3. Install the Microsoft.WindowsAzure.ConfigurationManager and WindowsAzure.Storage NuGet packages:

    Install-Package Microsoft.WindowsAzure.ConfigurationManager
    Install-Package WindowsAzure.Storage
  4. Add a reference to System.IO.Compression.dll from the framework assemblies.

  5. Add this code:

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.IO.Compression;
    using System.Linq;
    using Microsoft.Azure.WebJobs;
    using Microsoft.WindowsAzure;          // CloudConfigurationManager (newer package versions may expose it under Microsoft.Azure)
    using Microsoft.WindowsAzure.Storage;  // CloudStorageAccount
    using Microsoft.WindowsAzure.Storage.Blob;

    namespace MyWebJobNamespace
    {
        // Functions.cs file
        public class Functions
        {
            [NoAutomaticTrigger]
            public static void MyMethod()
            {
                // Retrieve the storage account from the connection string.
                // The setting is stored in App.config or in the Azure Web App settings UI;
                // CloudConfigurationManager can read from both places.
                //<appSettings>
                //  <add key="StorageConnectionString" value="DefaultEndpointsProtocol=https;AccountName=storageAccountName;AccountKey=storageAccountKey" />
                //</appSettings>
                CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
                    CloudConfigurationManager.GetSetting("StorageConnectionString"));

                // Create the blob client.
                CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

                // Retrieve a reference to a previously created container.
                CloudBlobContainer container = blobClient.GetContainerReference("containerName");

                // Retrieve a reference to the zipped blob by name.
                CloudBlockBlob blockBlob = container.GetBlockBlobReference("MyFile.zip");

                // Download the blob contents into a memory stream.
                using (var msZippedBlob = new MemoryStream())
                {
                    blockBlob.DownloadToStream(msZippedBlob);

                    // Open the in-memory stream as a zip archive (read mode).
                    using (ZipArchive zip = new ZipArchive(msZippedBlob))
                    {
                        // Read the first entry of the archive as text.
                        var entry = zip.Entries.First();
                        using (StreamReader sr = new StreamReader(entry.Open()))
                        {
                            string result = sr.ReadToEnd();
                            System.Console.WriteLine("# Characters: " + result.Length);
                        }
                    }
                }
            }
        }
    }
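The code above reads only the first entry of the archive as text. If you need every file in the .zip, the same in-memory idea can be extended roughly as in the sketch below. It is only an illustration, not part of the original WebJob: the `ZipInMemory` class and its `ExtractAll` helper are hypothetical names, and the `Main` method builds a tiny archive in memory as a stand-in for the downloaded blob so the example runs without a storage account.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class ZipInMemory
{
    // Hypothetical helper: maps each file entry's full name to its uncompressed bytes.
    public static Dictionary<string, byte[]> ExtractAll(Stream zippedStream)
    {
        var files = new Dictionary<string, byte[]>();

        // leaveOpen: true so the caller keeps ownership of the stream.
        using (var zip = new ZipArchive(zippedStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in zip.Entries)
            {
                // Directory entries have an empty Name; skip them.
                if (string.IsNullOrEmpty(entry.Name)) continue;

                using (Stream entryStream = entry.Open())
                using (var buffer = new MemoryStream())
                {
                    // Decompress the entry straight into memory, no temporary file.
                    entryStream.CopyTo(buffer);
                    files[entry.FullName] = buffer.ToArray();
                }
            }
        }

        return files;
    }

    public static void Main()
    {
        // Build a small archive in memory to stand in for the downloaded blob.
        using (var zipped = new MemoryStream())
        {
            using (var writer = new ZipArchive(zipped, ZipArchiveMode.Create, leaveOpen: true))
            {
                using (var w = new StreamWriter(writer.CreateEntry("a.txt").Open()))
                    w.Write("hello");
                using (var w = new StreamWriter(writer.CreateEntry("dir/b.txt").Open()))
                    w.Write("world!");
            }

            foreach (var file in ExtractAll(zipped))
                Console.WriteLine(file.Key + ": " + file.Value.Length + " bytes");
        }
    }
}
```

In the WebJob, the call would be `ZipInMemory.ExtractAll(msZippedBlob)` right after `blockBlob.DownloadToStream(msZippedBlob)`. Keep in mind that this sketch holds every uncompressed file in memory at once; for large archives it may be safer to process each entry's stream inside the loop instead of collecting them all.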

Important notes

  • Beforehand, you need to create an Azure Storage account and get its storageAccountName and storageAccountKey.
  • CloudConfigurationManager looks up settings first in the Azure Web App settings UI and then in the App.config file.
  • You must create a Blob container and put its name in the code (or read it from a new app setting).
  • ZipArchive is the framework's native way to unzip files.
  • No temporary files are needed to unzip. I have tested archives of up to 150 MB uncompressed (10 MB compressed) on WebJobs and they worked fine.

 

Author: José Quinto
Link: https://blog.josequinto.com/2016/09/13/how-to-unzip-azure-blobs-programmatically-using-memory-streams-in-azure-webjobs/
Copyright Notice: All articles in this blog are licensed under CC BY-SA 4.0 unless stated otherwise.