Saturday, 10 November 2018

Pluralsight Course: Azure Custom Vision Service

Over the past few months I've been working on authoring my first course for Pluralsight, on the topic of the Custom Vision Service, part of the Cognitive Services suite provided by Microsoft and hosted on Azure.

It was released a couple of days ago, and those with a subscription or a free trial can access it here:

Saturday, 1 September 2018

Streaming mp3s from Azure blob storage gotcha

I've been wrestling with an issue on and off for a few weeks, where I found that mp3 files that previously played correctly in a web browser were having some issues relating to streaming. There's an example here. What I found - hopefully resolved now if you try the link - was that the audio would play, but the play bar that updates the position on the track wouldn't move, nor could the user navigate to parts of the audio to review it.

We're using jPlayer and have the mp3 files hosted in Azure blob storage. All had worked fine for several years, but recently we could see this issue in Chrome, even though the files still played as expected in Edge, so it must have been related to a relatively recent Chrome update.

The resolution turned out to be a setting on the blob storage account, which I was led to via this Stack Overflow answer, indicating that setting the DefaultServiceVersion to an appropriate value might resolve it. And sure enough, when querying the current value I found it was null. Setting it to 2013-08-15 (or perhaps a more recent version) restored the play bar and streaming behaviour.
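
For reference, querying and setting that value only takes a few lines against the storage SDK. A minimal sketch using the classic WindowsAzure.Storage package (the connection string is a placeholder) looks something like this:

    using Microsoft.WindowsAzure.Storage;

    // Connect to the storage account (placeholder connection string)
    var account = CloudStorageAccount.Parse("<storage-connection-string>");
    var blobClient = account.CreateCloudBlobClient();

    // Query the current service properties - in our case DefaultServiceVersion came back null
    var properties = blobClient.GetServiceProperties();

    // Set a version that supports range (partial content) requests, which browsers rely on for seeking
    properties.DefaultServiceVersion = "2013-08-15";
    blobClient.SetServiceProperties(properties);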

Refactoring an interface to facilitate unit testing

A short post generalising something I've just done on an active project that seems to have wider relevance. I had an interface and class like this that weren't easy to put under a unit test.

public interface ISomething
{
    string GetValue();
    string GetComplicatedConvertedValue();
}

public class MySomething : ISomething
{
    public string GetValue()
    {
        // Get value from something that's a simple one-liner call but not easy to mock or stub (in my case, Sitecore settings)
        ...
    }
    
    public string GetComplicatedConvertedValue()
    {
        var rawValue = GetValue();
        
        // Do some processing on the raw value that ideally we'd like under unit test, and return it
        ...
    }
}

One answer, I realised, is that as the main value in testing this class lies in the second method - retrieving the raw value is hard to test, but of little value to test as it's a one-liner into platform functionality - I can simply remove that method from the interface and instead implement it as a testable extension method.

public interface ISomething
{
    string GetValue();
}

public class MySomething : ISomething
{
    public string GetValue()
    {
        // Get value from something that's a simple one-liner call but not easy to mock or stub (in my case, Sitecore settings)
        ...
    }
}

public static class SomethingExtensions
{
    public static string GetComplicatedConvertedValue(this ISomething something)
    {
        var rawValue = something.GetValue();
        
        // Do some processing on the raw value that ideally we'd like under unit test, and return it
        ...
    }
}

Then it's quite easy to write a test for the extension method, using a mocked or stubbed implementation of ISomething. Fairly obvious in hindsight - or maybe foresight, you might be thinking! - but nonetheless it seems an easy trap to fall into (i.e. making an interface too wide, and thus making some code in the implementation of that interface hard to test).
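
For completeness, such a test might look something like the following sketch, using MSTest and a hand-rolled stub (the stub value and the assertion are placeholders, since the actual conversion logic is elided above).

using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class SomethingExtensionsTests
{
    // A trivial stub - no mocking library needed for a single-member interface
    private class StubSomething : ISomething
    {
        public string GetValue() => "raw-value";
    }

    [TestMethod]
    public void GetComplicatedConvertedValue_ProcessesRawValue()
    {
        var something = new StubSomething();

        var result = something.GetComplicatedConvertedValue();

        // Assert whatever the conversion is expected to produce for "raw-value"
        Assert.IsNotNull(result);
    }
}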

Wednesday, 30 May 2018

Tackling Common Concerns with Azure Function Patterns

I've recently had (or perhaps am about to have, if you have found this early) the pleasure of talking at the Intelligent Cloud Conference in Copenhagen, in May 2018, on the topic of Tackling Common Concerns with Azure Function Patterns.

I've collated here a few links to resources discussed in my session.

First, you can find a copy of the slides here (in pptx, so you may find the fonts a bit off without our company font), or, perhaps better, as a PDF here.

Then, links to the GitHub repos containing the full implementations of the patterns discussed in the talk, from which the code samples presented are drawn.

Some tools mentioned in the talk:

Lastly, the Azure functions documentation.

Tuesday, 27 February 2018

Verification of Sitecore Helix Architecture via a Fitness Function

Recently I've been reading - and recommend - the book Building Evolutionary Architectures by Neal Ford, Rebecca Parsons and Patrick Kua, where they discuss the concept of a fitness function. The idea here is that, as well as implementing the business requirements, a solution will also need to address certain non-functional concerns, such as security, performance, reliability and maintainability. Some of these can be measured or validated, and therefore tested - and ideally, tested in an automated way.

Within a Sitecore project I'm working on, we are adhering to the Helix principles, which in part address the organisation of projects within a Visual Studio solution. You end up with many individual projects, each within a layer - foundation, feature and project.

  • The foundation layer is considered the highest level and most stable. Projects within this layer may have dependencies on each other, but not on any at the lower levels.
  • Feature layer projects are intended to be cohesive and swappable, in the sense that they can be reused across solutions with minimal impact on other projects. They can depend on foundation projects, but not any other feature ones, nor any at the lowest level.
    There are a couple of caveats here. In our solution, some features are themselves broken into multiple projects, which we call sub-features; it's OK and necessary for these to have dependencies between them. We also of course allow test projects to reference their associated feature project.
  • The lowest level is the project layer, containing one (or more) projects representing the website(s) themselves. They can depend on feature and foundation projects, but not on each other.

Ensuring developers adhere to this and don't create inappropriate project references is an ideal job for an automated fitness function, which I've put together as a unit test using MSTest. There's a strict naming convention we follow for the projects within the solution, and I'm leaning on that in order to write the test - names are of the form SolutionName.LayerName.ProjectName, e.g. MySolution.Foundation.Analytics.
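
To give a flavour of the approach, here's a simplified, illustrative sketch of how one of the rules - features not referencing other features - might be checked. The solution path, rule coverage and some details are trimmed down and assumed here rather than taken from the actual test; it also assumes classic (non-SDK) csproj files.

    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Xml.Linq;
    using Microsoft.VisualStudio.TestTools.UnitTesting;

    [TestClass]
    public class HelixDependencyTests
    {
        // Illustrative path to the solution's source folder
        private const string SolutionRoot = @"..\..\..\src";

        [TestMethod]
        public void FeatureProjects_DoNotReferenceOtherFeatures()
        {
            foreach (var project in GetProjects().Where(p => GetLayer(p.Name) == "Feature"))
            {
                // A feature may reference its own sub-feature or test projects, but not other features
                var invalid = project.References
                    .Where(r => GetLayer(r) == "Feature" && GetFeature(r) != GetFeature(project.Name))
                    .ToList();

                Assert.IsFalse(invalid.Any(),
                    $"{project.Name} references other feature projects: {string.Join(", ", invalid)}");
            }
        }

        // Relies on the SolutionName.LayerName.ProjectName naming convention
        private static string GetLayer(string projectName) => projectName.Split('.').ElementAtOrDefault(1);

        private static string GetFeature(string projectName) => projectName.Split('.').ElementAtOrDefault(2);

        private static IEnumerable<(string Name, List<string> References)> GetProjects()
        {
            // Classic csproj format with the MSBuild 2003 namespace assumed
            XNamespace ns = "http://schemas.microsoft.com/developer/msbuild/2003";
            foreach (var path in Directory.GetFiles(SolutionRoot, "*.csproj", SearchOption.AllDirectories))
            {
                var references = XDocument.Load(path)
                    .Descendants(ns + "ProjectReference")
                    .Select(r => Path.GetFileNameWithoutExtension((string)r.Attribute("Include")))
                    .ToList();

                yield return (Path.GetFileNameWithoutExtension(path), references);
            }
        }
    }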

The code for this can be found in the following gist:

Sunday, 25 February 2018

Return JSON from Sitecore rendering controller

As well as displaying the appropriate HTML and content, a Sitecore rendering controller can also be used to handle form posts. As such you can have a single controller responsible for rendering the mark-up of a form and handling the POST request that's triggered when the form is submitted.

With this setup, I wanted to add some progressive enhancement and use the same form processing logic to handle an AJAX request. Using JavaScript we could hijack the form submission and make an AJAX request using the Fetch API. In the controller action method we can detect that an AJAX request is being made, and if so return a small JSON response, instead of the standard redirect back to the page to display a confirmation message.

This is quite a common pattern with ASP.Net MVC, making use of the Request.IsAjaxRequest() method available in the controller.

(As a quick aside, in order to use this method in an AJAX request triggered via the Fetch API, it's necessary to add a header that's used for the AJAX detection. It's also necessary to add the credentials option, to ensure cookies are passed, so we can make use of the CSRF protection offered by the ValidateAntiforgeryToken attribute).

    fetch(url, {
        method: 'POST',
        body: data,
        credentials: 'include',
        headers: {
            'X-Requested-With': 'XMLHttpRequest'
        }
    });

When examining the response returned though, I was disappointed to discover that rather than the short JSON payload, what was actually coming back in the response was a big chunk of HTML, then the JSON, then further HTML. The reason of course was that we aren't dealing here with a standard MVC controller, rather a Sitecore one responsible for a rendering. This is executed as part of an existing page lifecycle, and so the JSON is getting output as content, within the context of a page output, as explained clearly here in Martina Welander's blog post.

To resolve this, I didn't really want to have to use a separate controller, defined with a custom route, to handle the AJAX requests - as we then lose the simplicity of having one location to post to from the client side, and one set of processing code. Instead I took the following approach:

  • On form processing, detect the AJAX request in the controller action method responsible for handling the form submission using Request.IsAjaxRequest()
  • Create an anonymous object representing the JSON response to return and serialize to a string
  • Pass that string to a custom ActionResult, which utilises HttpContext.Server.TransferRequest to end processing of the full page and pass the value to a new controller configured with a custom route
  • That controller then returns the JSON response, on its own, with the appropriate content type

The relevant code is shown below, firstly the custom ActionResult:

    using System.Collections.Specialized;
    using System.Web.Mvc;

    public class TransferToJsonResult : ActionResult
    {
        public TransferToJsonResult(string serializedJson)
        {
            SerializedJson = serializedJson ?? string.Empty;
        }

        public string SerializedJson { get; }

        public override void ExecuteResult(ControllerContext context)
        {
            // Create a header to hold the serialized JSON value
            var headers = new NameValueCollection
                {
                    ["SerializedJson"] = SerializedJson
                };

            // And pass in the transfer request so it's available to the controller
            // that picks up generating the response.
            context.HttpContext.Server.TransferRequest("/JsonResponse", false, null, headers);
        }
    }

And then the controller that control is transferred to, in order to return the JSON response:

    using System.Web.Mvc;

    public class JsonResponseController : Controller
    {
        public ContentResult Index()
        {
            // Retrieve JSON to render from header
            var serializedJson = Request.Headers["SerializedJson"] ?? string.Empty;

            Response.ContentType = "application/json";
            return Content(serializedJson);
        }
    }
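
For the transfer target to resolve, the /JsonResponse URL needs mapping to that controller via a custom route. I've not reproduced our exact registration here, but a minimal sketch - in Sitecore this would typically sit in an initialize pipeline processor - might look like this:

    using System.Web.Mvc;
    using System.Web.Routing;

    public class RegisterJsonResponseRoute
    {
        // Hook this up via a patch to the Sitecore initialize pipeline
        public void Process(Sitecore.Pipelines.PipelineArgs args)
        {
            RouteTable.Routes.MapRoute(
                "JsonResponse",
                "JsonResponse",
                new { controller = "JsonResponse", action = "Index" });
        }
    }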

And finally an excerpt from the processing logic that handles the transfer behaviour:

    if (Request.IsAjaxRequest())
    {
        var serializedJson = JsonConvert.SerializeObject(new { success = true });
        return new TransferToJsonResult(serializedJson);
    }

Monday, 22 January 2018

Queue based load levelling using Azure Functions

On a current project I've been looking to employ a cloud design pattern known as queue based load levelling, and to implement it using Azure storage components and functions.

The Queue based load levelling pattern

The pattern is useful in a range of situations where there's a need for timely and reliable integration between different software systems. In short, the pattern utilises a queue service as a buffer for messages from one system to the other, allowing them to be passed to the destination system for processing in a controlled manner, and at a rate that won’t overwhelm available resources.

It can be adopted where one software system must send messages to another but for various reasons we want to avoid a direct connection between the two. One good reason we might want to do this is simply to reduce coupling – the two systems will need to agree on a message format, but ideally we don’t want to tie them to each other’s implementation details any further than that.

We may also have a situation where the messages come in a variable or bursting pattern – perhaps few or none for a period of time and then a lot may arrive in one go. If processing the messages at the destination is a relatively expensive operation, there’s a danger of overwhelming it, leading to timeouts and lost messages. By introducing a queue, we decouple the source system from the destination – the source posts messages to the queue that are accepted at whatever speed they arrive. The destination system can then be fed messages at a controlled and consistent rate; one that allows messages to be reliably processed.

The specific scenario to support is a series of messages that come from a client's internal system in XML format. The details contained within them need to be applied to a Sitecore CMS instance in order to update various content items.

Implementing with Azure functions and storage components

We've implemented this initially using two Azure function projects, along with queue and table storage, as illustrated in the following diagram.

The first function project – the "message receiver" – contains an HTTP triggered function that responds to an incoming HTTP POST request accepting an XML message. It performs some validation on it and, if it passes, adds it to the queue. A record is written to the log table in table storage. The project will also accept a GET request, via a second function, to return the status of a message.
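
A simplified version of our implementation is in the repository linked at the end of this post; purely as an illustrative sketch (the function name, queue and table names, bindings and the validation step here are placeholders rather than our actual code), such a receiver might look something like this:

    using System;
    using System.Net;
    using System.Net.Http;
    using System.Threading.Tasks;
    using System.Xml;
    using System.Xml.Linq;
    using Microsoft.Azure.WebJobs;
    using Microsoft.Azure.WebJobs.Extensions.Http;
    using Microsoft.WindowsAzure.Storage.Queue;
    using Microsoft.WindowsAzure.Storage.Table;

    public static class MessageReceiver
    {
        public class MessageLogEntry : TableEntity
        {
            public string Status { get; set; }
        }

        [FunctionName("MessageReceiver")]
        public static async Task<HttpResponseMessage> Run(
            [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestMessage req,
            [Queue("incoming-messages")] CloudQueue queue,
            [Table("MessageLog")] IAsyncCollector<MessageLogEntry> log)
        {
            var xml = await req.Content.ReadAsStringAsync();

            // Validate the XML before accepting it onto the queue
            if (!IsValidXml(xml))
            {
                return req.CreateResponse(HttpStatusCode.BadRequest, "Invalid message");
            }

            // Buffer the message on the queue and record its status in table storage
            await queue.AddMessageAsync(new CloudQueueMessage(xml));
            await log.AddAsync(new MessageLogEntry
            {
                PartitionKey = "messages",
                RowKey = Guid.NewGuid().ToString(),
                Status = "Queued"
            });

            return req.CreateResponse(HttpStatusCode.Accepted);
        }

        private static bool IsValidXml(string xml)
        {
            try { XDocument.Parse(xml); return true; }
            catch (XmlException) { return false; }
        }
    }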

The second project – the "message processor" – contains a function set up on a queue trigger, firing off as new messages are detected on the queue. It will be responsible for taking the validated message and passing it to the destination system for processing (in our case by posting it to a Sitecore API end-point).
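
Again just as an illustrative sketch (the queue name and end-point URL are placeholders), the processor function might look like:

    using System.Net.Http;
    using System.Text;
    using System.Threading.Tasks;
    using Microsoft.Azure.WebJobs;

    public static class MessageProcessor
    {
        private static readonly HttpClient Client = new HttpClient();

        [FunctionName("MessageProcessor")]
        public static async Task Run([QueueTrigger("incoming-messages")] string message)
        {
            // Forward the validated XML to the destination system - in our case a Sitecore API end-point
            var content = new StringContent(message, Encoding.UTF8, "text/xml");
            var response = await Client.PostAsync("https://example.org/api/content-import", content);
            response.EnsureSuccessStatusCode();
        }
    }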

This was working nicely in initial testing, but we started to find some edge cases where duplicate content was getting created in Sitecore. We narrowed this down to race conditions - a check would be made for a node at a particular path, and if it wasn't there it would be created. But in some cases the processing of another message would get in there first, and we'd end up with two nodes of the same name.

Controlling the speed of queue processing

I thought this should have been covered by settings available on the queue-triggered function to manage how many messages are dequeued at a time and how many instances the function app will scale out to, but it seems I may have got confused with what's available for web jobs. In the end I came across this GitHub issue indicating that, at least as of the time of writing, singleton behaviour (or "process one queue message at a time") isn't supported directly on the functions platform.

So we needed an alternative - some way of having a function first check to see if other messages are already being processed. I read that the Azure Web Jobs platform makes use of blob storage leases for this purpose, so tackling the problem in a similar way seemed sensible.

The solution we used in the end was to have the function triggered by the queue message first try to acquire a lease on a particular blob. If it could get one, the message would be processed and, just before the function terminates, the lease released as part of a try/finally block. If a second message arrives whilst the first is being processed, the function won't be able to acquire the lease and so instead exits just after putting the message back on the queue. Note that it's important here to explicitly put the message back on the queue rather than just throw an exception. Throwing an exception would leave the message on the queue but with its dequeue count incremented, and when this reaches a configured level the platform will deem that the message can't be processed and migrate it to a "poison" queue.
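
To make that a little more concrete, here's a minimal sketch of the lease-based gate using the classic WindowsAzure.Storage SDK - the lock blob, the 60-second lease duration and the 30-second re-queue delay are illustrative choices rather than our actual values. The queue-triggered function would call this, passing its own processing logic as the process delegate.

    using System;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;
    using Microsoft.WindowsAzure.Storage.Queue;

    public static class QueueGate
    {
        // Assumes lockBlob already exists (created once up front) and queue is the triggering queue
        public static void ProcessWithLease(CloudBlockBlob lockBlob, CloudQueue queue, string message, Action<string> process)
        {
            string leaseId;
            try
            {
                // The lease expires automatically after 60 seconds even if never released,
                // so a crashed function can't block processing indefinitely
                leaseId = lockBlob.AcquireLease(TimeSpan.FromSeconds(60), null);
            }
            catch (StorageException)
            {
                // Another message is being processed - put this one back with a short delay and exit
                queue.AddMessage(new CloudQueueMessage(message), initialVisibilityDelay: TimeSpan.FromSeconds(30));
                return;
            }

            try
            {
                process(message);
            }
            finally
            {
                lockBlob.ReleaseLease(AccessCondition.GenerateLeaseCondition(leaseId));
            }
        }
    }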

Setting these types of global flags can be risky: if an unexpected exception occurs there's a danger that the flag doesn't get reset, and if that were the case, no messages would get processed at all. Fortunately though, blob leases can be set when initially acquired to expire after a period of time, so even if one isn't explicitly released there's no risk of the system getting blocked in this way.

We coupled this solution with a second step, which was to add messages to the queue initially with a short, random delay to the time at which the message becomes visible for processing. That way, even if we get a load of messages at once, their presence on the queue and the time at which they are processed (or put back for processing) will be staggered.
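
In code, that second step amounts to something like the following when the receiver adds the message (assuming a CloudQueue instance from the storage SDK; the 0-30 second range is an arbitrary illustrative choice):

    // Add the message with a small random initial visibility delay, so that bursts of messages
    // become visible to the processor at staggered times
    var delay = TimeSpan.FromSeconds(new Random().Next(0, 30));
    queue.AddMessage(new CloudQueueMessage(xml), initialVisibilityDelay: delay);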

I've pulled out a simplified version of our implementation of this pattern in this GitHub repository.