Plan for media storage and accessibility

May include but is not limited to: media accessibility, global distribution with Content Delivery Network (CDN), blob storage

Delivering static content with Windows Azure is fairly easy. We discussed blobs recently, in the first Azure post. As a quick refresher, blobs are (large) binary objects which come in two flavors: block blobs and page blobs. The former is of interest when dealing with media, since it supports streaming.

To create a blob, you need roughly the following code:

    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    // Development storage (the local emulator) is enough for trying this out
    CloudStorageAccount account = CloudStorageAccount.DevelopmentStorageAccount;
    CloudBlobClient client = account.CreateCloudBlobClient();
    // Container names must be lowercase
    CloudBlobContainer container = client.GetContainerReference("mycontainer");
    container.CreateIfNotExist();
    // Public read access is mandatory if you'd like to serve your blob through the CDN
    container.SetPermissions(new BlobContainerPermissions() { PublicAccess = BlobContainerPublicAccessType.Blob });

    CloudBlob myBlob = container.GetBlobReference("myblob");
    myBlob.UploadText("Hello Blob!");

Now a little about the Content Delivery Network. This technology lets you provide content worldwide with great performance (I actually haven’t tested this). The idea is that content is cached on, and served from, the server closest to your visitor.

CDN is optimized for static content; you can get your hands burned (financially) if you want to use dynamic content with it. To use it, you have to enable public access to your blobs and then, of course, enable CDN. Note that CDN isn’t used automatically. If you enable it but keep using a Windows Azure Blob service URL (typically in the form of yourname.blob.core.windows.net), nothing happens: the content will still be served from the blob service. You have to use the CDN URL, which looks something like this: guid.vo.msecnd.net. Of course you can use a custom domain (exactly one per storage account) to access your CDN content.
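Since nothing switches over to the CDN by itself, the URLs you hand out have to be rewritten somewhere. Here is a minimal sketch of such a rewrite, not an official API: the class and method names are my own, and the host name you pass in is whatever endpoint the portal assigned to you. It forces plain HTTP, in line with the HTTP-only note in this post.

```csharp
using System;

public static class CdnUrls
{
    // Rewrites a blob service URL so that it points at a CDN endpoint.
    // cdnHost is an assumption: pass the endpoint host shown in your portal.
    public static Uri ToCdnUri(Uri blobUri, string cdnHost)
    {
        if (blobUri == null) throw new ArgumentNullException("blobUri");
        if (string.IsNullOrEmpty(cdnHost)) throw new ArgumentException("CDN host is required", "cdnHost");

        var builder = new UriBuilder(blobUri);
        builder.Scheme = Uri.UriSchemeHttp; // CDN is served over HTTP only
        builder.Host = cdnHost;             // swap the blob host for the CDN host
        builder.Port = -1;                  // -1 means "use the default port for the scheme"
        return builder.Uri;
    }
}
```

For example, with a made-up endpoint name, CdnUrls.ToCdnUri(new Uri("http://yourname.blob.core.windows.net/mycontainer/myblob"), "az1234.vo.msecnd.net") keeps the container and blob path and only swaps the host.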

Setting up and working with CDN is fairly easy; if you have access to the Windows Azure management portal, you can surely figure it out yourself. More interesting are the facts from the end of Microsoft’s CDN article, because these little side notes have the nasty habit of turning into exam questions (sometimes keeping their wording, too). Blobs smaller than 10 GB tend to perform best with CDN. Note that a custom domain can be used for one storage account at a time, so you cannot use the same domain for two (or more) endpoints, but you can use overlapping domain names (such as http://mydomain.com and http://mysubdomain.mydomain.com). Also note the absence of HTTPS; you can only use HTTP for CDN.

Choose the appropriate data storage model based on technical requirements

May include but is not limited to: SQL Azure, Cloud drive, performance, scalability, accessibility from other applications and platforms, Windows Azure storage services: blobs, tables and queues

SQL Azure

SQL Azure is big enough to fill a book by itself. (In fact, it does: most of this section is based on the book Pro SQL Azure by Scott Klein and Herve Roggero.) So this section is just a quick introduction. SQL Azure is a transactional database based on SQL Server 2008. It supports the T-SQL language and a limited set of functions from SQL Server. It also supports ADO.NET and ODBC data access. You can even use your favorite SSMS to connect to and manage SQL Azure databases, but there’s an online solution, too.

You should be aware that SQL Azure runs in a multitenant environment. This means that you have restrictions on query time, CPU, etc. So if you have a long-running query, massive CPU usage, or something similar that might affect other users’ databases on the server, your database connection can be (and will be) throttled (terminated).

That said, keep in mind that with SQL Azure you pay for storage (GBs of database size)*, not for CPU time. So as long as you stay below the throttling limits, it can make sense to perform some CPU-intensive tasks within SQL Azure instead of in your application: CPU usage in SQL Azure is free, while you pay for it on an hourly basis in your app hosted in the cloud.

Scalability in SQL Azure revolves around sharding. The design guidelines are explained here. Sharding is a kind of horizontal partitioning: you split a table by rows (rather than by columns) and store the pieces in separate databases. I’ll explain the concept in another blog post later.
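To make the idea a bit more tangible, here is a minimal sketch of how an application could decide which database holds a given row. It uses plain modulo on a customer id, which is only one simple strategy and not necessarily what the design guidelines recommend; the class name, the method name and the customer id key are all assumptions of mine.

```csharp
using System;
using System.Collections.Generic;

public static class ShardMap
{
    // Picks the shard (database name) that stores the rows of a given customer.
    // Every row of one customer lands in the same database.
    public static string PickShard(int customerId, IList<string> shards)
    {
        if (shards == null || shards.Count == 0)
            throw new ArgumentException("At least one shard is required", "shards");

        // Treat the id as unsigned so negative ids still map to a valid index
        uint key = unchecked((uint)customerId);
        int index = (int)(key % (uint)shards.Count);
        return shards[index];
    }
}
```

With three hypothetical databases named db0, db1 and db2, customer 7 ends up in db1 and customer 9 in db0. The obvious weakness of this scheme is that adding a fourth database changes where most customers live, so treat it as an illustration only.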

Last but not least, have a look at the (most important) limitations of SQL Azure:

  • No support for backing up/restoring databases (there are workarounds, of course)
  • No USE statement, and you cannot use database names (this rules out cross-database queries)
  • No Windows Authentication
  • Setting server-level collation is disabled
  • No heap tables; clustered indexes are a must
  • Maximum database size is 150 GB
  • No SQL Server Agent
  • Idle connections are terminated (after 30 minutes)

For the full list of limitations, see http://msdn.microsoft.com/en-us/library/ee336245.aspx.

Windows Azure Storage Services

Blobs

Blobs, as their name suggests, are large binary objects stored in the cloud. At the time of this writing, their size maxes out at 200 GB in the case of a block blob and 1 TB when using page blobs. Usually you would store images or video/audio in blobs. They are especially handy for video, because block blobs support streaming.

I introduced two different kinds of blobs, block and page blobs. Let’s elaborate further on them. If you need further info, refer to MSDN.

Block blobs

A block blob is built from blocks, each of which can be at most 4 MB (the largest block supported in one operation). You are free to modify, delete and insert blocks of a block blob, and to commit or discard your changes as needed. The maximum size of a block blob is 200 GB, and it can contain a total of 50,000 blocks.
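The limits above translate into simple arithmetic when you plan an upload. The following is a small sketch with names of my own choosing; it only counts blocks, assuming you always fill each block up to the 4 MB maximum, and does not talk to the storage service at all.

```csharp
using System;

public static class BlockPlanner
{
    public const int MaxBlockSize = 4 * 1024 * 1024; // 4 MB per block
    public const int MaxBlocks = 50000;              // blocks per block blob

    // Returns how many blocks are needed for a blob of the given size in bytes
    public static int BlockCount(long blobSize)
    {
        if (blobSize < 0)
            throw new ArgumentOutOfRangeException("blobSize");

        // Round up: a partially filled last block still counts as a block
        long count = (blobSize + MaxBlockSize - 1) / MaxBlockSize;
        if (count > MaxBlocks)
            throw new ArgumentOutOfRangeException("blobSize", "Too large for a single block blob");

        return (int)count;
    }
}
```

A 10 MB file, for example, needs three blocks: two full ones and a half-filled third.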

Page blobs

Page blobs are optimized for random access. They can be up to 1 TB in size, and they are built from 512-byte pages. You cannot “version” your pages, so updates of one or more pages take effect immediately.
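Because everything in a page blob happens in whole pages, sizes and offsets have to line up with the 512-byte boundary. A tiny helper like the one below can do the rounding; it is just a sketch, and the names are mine.

```csharp
using System;

public static class PageMath
{
    public const int PageSize = 512; // bytes per page

    // Rounds a length in bytes up to the next multiple of the page size
    public static long RoundUpToPage(long length)
    {
        if (length < 0)
            throw new ArgumentOutOfRangeException("length");

        return (length + PageSize - 1) / PageSize * PageSize;
    }

    // True when an offset sits exactly on a page boundary
    public static bool IsPageAligned(long offset)
    {
        return offset >= 0 && offset % PageSize == 0;
    }
}
```

So 1000 bytes of data occupy 1024 bytes (two pages), while exactly 512 bytes stay at 512.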

A special subtype of page blobs is Azure Drive (or Cloud drive). This is a VHD mounted as a local drive letter. It was mostly used before the other APIs were available.

Queue storage

Windows Azure provides a queue-based messaging service that you can use for communication between Azure roles (more on them later). Your messages can be up to 64 KB in size, and generally they are processed in FIFO order, but there is no guarantee that they will be treated in this fashion. You can of course process messages bigger than 64 KB by storing the payload in a blob and putting only a reference to it in the queue.
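The blob trick for big messages can be sketched without touching the real services. In the code below an in-memory queue and dictionary stand in for Azure queues and blobs, so every name here is hypothetical and the size check is simplified (it ignores any encoding overhead the real service adds). Small bodies travel inside the message itself; large ones are parked in the "blob store" and only their name is queued.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public class LargeMessageChannel
{
    public const int MaxMessageBytes = 64 * 1024;

    // In-memory stand-ins for a queue and a blob container
    private readonly Queue<string> queue = new Queue<string>();
    private readonly Dictionary<string, string> blobs = new Dictionary<string, string>();

    public int StoredBlobCount { get { return blobs.Count; } }

    public void Send(string body)
    {
        // One byte is reserved for the marker that tells the receiver what it got
        if (Encoding.UTF8.GetByteCount(body) + 1 <= MaxMessageBytes)
        {
            queue.Enqueue("I" + body); // I = inline body
            return;
        }

        string blobName = Guid.NewGuid().ToString("N");
        blobs[blobName] = body;
        queue.Enqueue("B" + blobName); // B = body lives in a blob
    }

    public string Receive()
    {
        string message = queue.Dequeue();
        string rest = message.Substring(1);
        if (message[0] == 'I')
            return rest;

        // Fetch the real body and clean up the blob once it has been read
        string body = blobs[rest];
        blobs.Remove(rest);
        return body;
    }
}
```

The receiver never needs to know how big the original body was; the one-character marker tells it where to look.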

Tables

Tables allow you to store entities of up to 1 MB each, and a table can grow up to 100 TB. An entity can have 255 “columns” with different data types. Unlike SQL Azure, there is no relational support, so you can’t have foreign keys, joins, etc. A good use of these tables is, for example, a leader board for a game: small in size, not complex, no relationships required.

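Table entities have three reserved properties: a partition key, which selects the partition within the table; a row key, which identifies the entity within that partition; and a timestamp, which is maintained by the service. The partition key and the row key together form the unique key of the entity.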
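For the leader board example, an entity could look roughly like the class below. This is a plain C# class, not tied to any storage library, and the choice of keys (game name as partition key, player name as row key) is just one possible assumption on my part.

```csharp
using System;

public class ScoreEntity
{
    // The three reserved properties
    public string PartitionKey { get; set; } // which partition the entity lives in
    public string RowKey { get; set; }       // unique within the partition
    public DateTime Timestamp { get; set; }  // maintained by the table service

    // Custom properties
    public int Score { get; set; }

    // Builds an entity with one partition per game and one row per player
    public static ScoreEntity Create(string game, string player, int score)
    {
        if (string.IsNullOrEmpty(game)) throw new ArgumentException("Game is required", "game");
        if (string.IsNullOrEmpty(player)) throw new ArgumentException("Player is required", "player");

        return new ScoreEntity
        {
            PartitionKey = game,
            RowKey = player,
            Score = score
        };
    }
}
```

With this layout all scores of one game sit in the same partition, and a player can have only one row per game, so a new high score overwrites the old one.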

Tables come with a fairly limited type set: byte[], bool, DateTime, double, Guid, int, long and string. For more info on tables, refer to MSDN.

There is more info about the various storage options in Windows Azure in this TechNet article.

*The full truth is that you pay for two things: storage and bandwidth. However, bandwidth between a SQL Azure database and an application running inside Windows Azure is free.

Handle special data types

There are some data types introduced to SQL Server for use in special cases, like storing large objects in a database, accessing objects stored in the file system through a database, and working with spatial data. Hopefully, you won’t have to be able to write complex queries with geographical data involved for this exam, but you definitely should be able to select the right data type from a list of possible candidates. In this post, we’ll prepare ourselves for these kinds of questions.

There is a beautiful acronym, BLOB, which stands for Binary Large Object. A BLOB can be an MP3 file, a video, etc. As long as something is large and binary, it is valid to call it a BLOB. There are two places (from the aspect of ADO.NET) to store BLOBs. The first one is a database, the second is the file system. To work with BLOBs stored in a database, you just have to make yourself familiar with the following SQL types:

  • varbinary(max): stored data is in binary format
  • nvarchar(max): stored data is in text format
