Wednesday, 30 May 2018

Tackling Common Concerns with Azure Function Patterns

I've recently had (or perhaps am about to have, if you've found this early) the pleasure of speaking at the Intelligent Cloud Conference in Copenhagen in May 2018, on the topic of Tackling Common Concerns with Azure Function Patterns.

I've collated here a few links to resources discussed in my session.

First, you can find a copy of the slides here (in pptx format, so the fonts may look a bit off without our company one installed), or, perhaps better, as a PDF here.

Then, links to the GitHub repos containing the full implementations of the patterns discussed in the talk, from which the code samples presented are drawn.

Some tools mentioned in the talk:

Lastly, the Azure functions documentation.

Tuesday, 27 February 2018

Verification of Sitecore Helix Architecture via a Fitness Function

Recently I've been reading - and recommend - the book Building Evolutionary Architectures by Neal Ford, Rebecca Parsons and Patrick Kua, where they discuss the concept of a fitness function. The idea here is that, as well as implementing the business requirements, a solution will also need to address certain non-functional concerns, such as security, performance, reliability and maintainability. Some of these can be measured or validated, and therefore tested, and ideally, tested in an automated way.

Within a Sitecore project I'm working on, we are adhering to the Helix principles, which in part address the organisation of projects within a Visual Studio solution. You end up with many individual projects, each within a layer - foundation, feature and project.

  • The foundation layer is considered the highest level and most stable. Projects within this layer may have dependencies on each other, but not on any at the lower levels.
  • Feature layer projects are intended to be cohesive and swappable, in the sense that they can be reused across solutions with minimal impact on other projects. They can depend on foundation projects, but not any other feature ones, nor any at the lowest level.
    There are a couple of caveats. In our solution, some features are themselves broken into multiple projects, which we call sub-features; it's OK, and necessary, that these have dependencies between them. We also, of course, allow test projects to reference their associated feature project.
  • The lowest level is the project layer, containing one (or more) projects representing the website(s) themselves. They can depend on feature and foundation projects, but not on each other.

Ensuring developers adhere to this and don't create inappropriate project references is an ideal job for an automated fitness function, which I've put together as a unit test using MSTest. We follow a strict naming convention for the projects within the solution - of the form SolutionName.LayerName.ProjectName, e.g. MySolution.Foundation.Analytics - and I'm leaning on that in order to write the test.

The code for this can be found in the following gist:
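
The gist holds the full version, but the shape of the test is roughly as follows. This is a sketch only: the assembly-loading helper is illustrative, and the sub-feature and test project caveats noted above would need extra checks.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Reflection;
    using Microsoft.VisualStudio.TestTools.UnitTesting;

    [TestClass]
    public class HelixDependencyTests
    {
        // Illustrative helper: in practice you'd load each project's assembly
        // from the build output folder.
        private static IEnumerable<Assembly> SolutionAssemblies =>
            AppDomain.CurrentDomain.GetAssemblies()
                .Where(a => a.GetName().Name.StartsWith("MySolution.", StringComparison.Ordinal));

        // Extracts the layer name from SolutionName.LayerName.ProjectName.
        private static string GetLayer(string assemblyName) =>
            assemblyName.Split('.').ElementAtOrDefault(1);

        [TestMethod]
        public void FeatureProjectsShouldNotReferenceOtherFeatures()
        {
            foreach (var assembly in SolutionAssemblies
                .Where(a => GetLayer(a.GetName().Name) == "Feature"))
            {
                var invalidReferences = assembly.GetReferencedAssemblies()
                    .Where(r => GetLayer(r.Name) == "Feature")
                    .Where(r => r.Name != assembly.GetName().Name)
                    .ToList();

                Assert.AreEqual(0, invalidReferences.Count,
                    $"{assembly.GetName().Name} references other feature projects: " +
                    string.Join(", ", invalidReferences.Select(r => r.Name)));
            }
        }
    }

Similar tests cover the foundation and project layer rules, each just changing which layer combinations count as an invalid reference.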

Sunday, 25 February 2018

Return JSON from Sitecore rendering controller

As well as displaying the appropriate HTML and content, a Sitecore rendering controller can also be used to handle form posts. As such, you can have a single controller responsible for rendering the mark-up of a form and for handling the POST request that's triggered when the form is submitted.

With this setup, I wanted to add some progressive enhancement and use the same form processing logic to handle an AJAX request. Using JavaScript we could hijack the form submission and make an AJAX request using the Fetch API. In the controller action method we can detect that an AJAX request is being made, and if so return a small JSON response, instead of the standard redirect back to the page to display a confirmation message.

This is quite a common pattern with ASP.Net MVC, making use of the Request.IsAjaxRequest() method available in the controller.

(As a quick aside, in order to use this method in an AJAX request triggered via the Fetch API, it's necessary to add a header that's used for the AJAX detection. It's also necessary to add the credentials option, to ensure cookies are passed, so we can make use of the CSRF protection offered by the ValidateAntiforgeryToken attribute).

    fetch(url, {
        method: 'POST',
        body: data,
        credentials: 'include',
        headers: {
            'X-Requested-With': 'XMLHttpRequest'
        }
    });

When examining the response returned, though, I was disappointed to discover that rather than the short JSON payload, what was actually coming back was a big chunk of HTML, then the JSON, then further HTML. The reason, of course, was that we aren't dealing here with a standard MVC controller, but rather a Sitecore one responsible for a rendering. This is executed as part of an existing page lifecycle, and so the JSON gets output as content, within the context of the page output, as explained clearly in Martina Welander's blog post.

To resolve this, I didn't really want to have to use a separate controller, defined with a custom route, to handle the AJAX requests - as we'd then lose the simplicity of having one location to post to from the client side, and one set of processing code. Instead I took the following approach:

  • On form processing, detect the AJAX request in the controller action method responsible for handling the form submission using Request.IsAjaxRequest()
  • Create an anonymous object representing the JSON response to return and serialize to a string
  • Pass that string to a custom ActionResult, which utilises HttpContext.Server.TransferRequest to end processing of the full page and pass the value to a new controller configured with a custom route
  • That controller then returns the JSON response, on its own, with the appropriate content type

The relevant code is shown below, firstly the custom ActionResult:

    using System.Collections.Specialized;
    using System.Web.Mvc;

    public class TransferToJsonResult : ActionResult
    {
        public TransferToJsonResult(string serializedJson)
        {
            SerializedJson = serializedJson ?? string.Empty;
        }

        public string SerializedJson { get; }

        public override void ExecuteResult(ControllerContext context)
        {
            // Create a header to hold the serialized JSON value
            var headers = new NameValueCollection
                {
                    ["SerializedJson"] = SerializedJson
                };

            // And pass in the transfer request so it's available to the controller
            // that picks up generating the response.
            context.HttpContext.Server.TransferRequest("/JsonResponse", false, null, headers);
        }
    }

And then the controller that control is transferred to, in order to return the JSON response:

    using System.Web.Mvc;

    public class JsonResponseController : Controller
    {
        public ContentResult Index()
        {
            // Retrieve JSON to render from header
            var serializedJson = Request.Headers["SerializedJson"] ?? string.Empty;

            Response.ContentType = "application/json";
            return Content(serializedJson);
        }
    }

And finally an excerpt from the processing logic that handles the transfer behaviour:

    if (Request.IsAjaxRequest())
    {
        // JsonConvert comes from the Newtonsoft.Json package
        var serializedJson = JsonConvert.SerializeObject(new { success = true });
        return new TransferToJsonResult(serializedJson);
    }

Monday, 22 January 2018

Queue based load levelling using Azure Functions

On a current project I've been looking to employ a cloud design pattern known as queue based load levelling, and to implement it using Azure storage components and functions.

The Queue based load levelling pattern

The pattern is useful in a range of situations where there's a need for timely and reliable integration between different software systems. In short, the pattern utilises a queue service as a buffer for messages from one system to the other, allowing them to be passed to the destination system for processing in a controlled manner, and at a rate that won’t overwhelm available resources.

It can be adopted where one software system must send messages to another but for various reasons we want to avoid a direct connection between the two. One good reason we might want to do this is simply to reduce coupling – the two systems will need to agree on a message format, but ideally we don’t want to tie them to each other’s implementation details any further than that.

We may also have a situation where the messages come in a variable or bursting pattern – perhaps few or none for a period of time and then a lot may arrive in one go. If processing the messages at the destination is a relatively expensive operation, there’s a danger of overwhelming it, leading to timeouts and lost messages. By introducing a queue, we decouple the source system from the destination – the source posts messages to the queue that are accepted at whatever speed they arrive. The destination system can then be fed messages at a controlled and consistent rate; one that allows messages to be reliably processed.

The specific scenario to support is a series of messages that come from a client's internal system in XML format. The details contained within them need to be applied to a Sitecore CMS instance in order to update various content items.

Implementing with Azure functions and storage components

We've implemented this initially using two Azure function apps, together with Azure queue and table storage, as illustrated in the following diagram.

The first function project – the "message receiver" – contains an HTTP triggered function that responds to an incoming HTTP POST request, accepting an XML message. It performs some validation on it and, if that passes, adds it to the queue. A record is written to the log table in table storage. The project will also accept a GET request, via a second function, to return the status of a message.
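
In outline - names, bindings and validation here are illustrative rather than the actual implementation, using the C# attribute bindings of the Functions runtime current at the time of writing - the receiver looks something like this:

    using System;
    using System.Net;
    using System.Net.Http;
    using System.Threading.Tasks;
    using Microsoft.Azure.WebJobs;
    using Microsoft.Azure.WebJobs.Extensions.Http;
    using Microsoft.WindowsAzure.Storage.Table;

    public class MessageLogEntry : TableEntity
    {
        public string Status { get; set; }
    }

    public static class MessageReceiver
    {
        [FunctionName("MessageReceiver")]
        public static async Task<HttpResponseMessage> Run(
            [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestMessage req,
            [Queue("incoming-messages")] IAsyncCollector<string> queue,
            [Table("MessageLog")] IAsyncCollector<MessageLogEntry> log)
        {
            var xml = await req.Content.ReadAsStringAsync();

            if (!MessageIsValid(xml))
            {
                return req.CreateResponse(HttpStatusCode.BadRequest);
            }

            // Buffer the message on the queue and record its status in table storage.
            await queue.AddAsync(xml);
            await log.AddAsync(new MessageLogEntry
            {
                PartitionKey = "messages",
                RowKey = Guid.NewGuid().ToString(),
                Status = "Queued"
            });

            return req.CreateResponse(HttpStatusCode.Accepted);
        }

        // Stand-in for the real schema checks.
        private static bool MessageIsValid(string xml) =>
            !string.IsNullOrWhiteSpace(xml);
    }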

The second project – the "message processor" – contains a function set up on a queue trigger, firing off as new messages are detected on the queue. It will be responsible for taking the validated message and passing it to the destination system for processing (in our case by posting it to a Sitecore API end-point).

This was working nicely in initial testing, but we started to find some edge cases where duplicate content was getting created in Sitecore. We narrowed this down to race conditions - a check would be made for a node at a particular path, and if it wasn't there it would be created. But in some cases the processing of another message would get in there first, and we'd end up with two nodes of the same name.

Controlling the speed of queue processing

I thought this should have been covered by some settings available on the queue triggered function, to manage how many messages are dequeued at a time and how many instances the function app will scale out to. But it seems I may have got confused with what's available for WebJobs. In the end I came across this GitHub issue which indicates that, at least as of the time of writing, running as a singleton (or "process one queue message at a time") isn't supported directly on the functions platform.

So we needed an alternative - some way of having a function first check to see whether another message is already being processed. I'd read that the Azure WebJobs platform makes use of blob storage leases for this purpose, so tackling the problem in a similar way seemed sensible.

The solution we used in the end was to have the function triggered by the queue message first try to acquire a lease on a particular blob. If it could get one, the message would be processed and, just before the function terminates, the lease released as part of a try/finally block. If a second message arrives whilst the first is being processed, the function won't be able to acquire the lease and so instead exits, just after putting the message back on the queue. Note that it's important here to explicitly put the message back on the queue rather than just throw an exception. Throwing would leave the message on the queue but with its dequeue count incremented, and when that reaches a configured level the platform will deem the message can't be processed and migrate it to a "poison" queue.
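
Pulling those pieces together, a simplified sketch of the processor looks like this. The blob and queue names, timings and destination call are illustrative, using the WindowsAzure.Storage SDK of the time, and the lock blob is assumed to already exist:

    using System;
    using System.Threading.Tasks;
    using Microsoft.Azure.WebJobs;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;
    using Microsoft.WindowsAzure.Storage.Queue;

    public static class MessageProcessor
    {
        [FunctionName("MessageProcessor")]
        public static async Task Run(
            [QueueTrigger("incoming-messages")] string message,
            [Queue("incoming-messages")] CloudQueue queue,
            [Blob("locks/processor-lock")] CloudBlockBlob lockBlob)
        {
            string leaseId;
            try
            {
                // Leases expire automatically, so an unexpected failure
                // can't leave the system blocked.
                leaseId = await lockBlob.AcquireLeaseAsync(TimeSpan.FromSeconds(60), null);
            }
            catch (StorageException)
            {
                // Another message is in flight; put this one back with a short
                // visibility delay rather than throwing (which would increment
                // the dequeue count and risk the poison queue).
                await queue.AddMessageAsync(
                    new CloudQueueMessage(message),
                    timeToLive: null,
                    initialVisibilityDelay: TimeSpan.FromSeconds(30),
                    options: null,
                    operationContext: null);
                return;
            }

            try
            {
                await PostToDestination(message);
            }
            finally
            {
                await lockBlob.ReleaseLeaseAsync(AccessCondition.GenerateLeaseCondition(leaseId));
            }
        }

        // Illustrative stand-in for posting to the Sitecore API end-point.
        private static Task PostToDestination(string message) => Task.CompletedTask;
    }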

Setting these types of global flags can be risky as if an unexpected exception occurs there's a danger that the flag doesn't get reset, and if that were the case, no messages would get processed at all. Fortunately though blob leases when initially acquired can be set to expire after a period of time, so even if not explicitly released there's no risk of the system getting blocked in this way.

We coupled this solution with a second step, which was to add messages to the queue with a short, random delay before they become visible for processing. That way, even if we get a load of messages at once, their presence on the queue, and the time at which they are processed (or put back for processing), will be staggered.
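
That staggering can be done by supplying an initial visibility delay when the message is added to the queue - roughly as follows (the delay range here is illustrative):

    using System;
    using Microsoft.WindowsAzure.Storage.Queue;

    public static class DelayedEnqueuer
    {
        private static readonly Random Random = new Random();

        // Add the message so it only becomes visible for processing after a
        // short, random delay, staggering bursts of messages.
        public static void Enqueue(CloudQueue queue, string message)
        {
            queue.AddMessage(
                new CloudQueueMessage(message),
                timeToLive: null,
                initialVisibilityDelay: TimeSpan.FromSeconds(Random.Next(1, 30)));
        }
    }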

I've pulled out a simplified version of our implementation of this pattern in this Github repository.

Saturday, 30 December 2017

Creating a dynamic, multi-source, search enabled TreeList for Sitecore

I spent a partially frustrating but ultimately reasonably successful day recently looking to extend a Sitecore editorial control to add some functionality that would be of benefit to our editors.

Our starting point was the built in TreeList control that we were using to allow editors to select images for a carousel. With this we can set a source to limit the selections to an appropriate location in the media library tree. That did the job for the initial release of the feature but we then had to handle some additional requirements, as follows.

  1. We need to allow editors to select from two locations, not just one. Specifically they were selecting images for display on a page on a house-builder's website, with the page showing a house that's of a particular type and found on a particular site. So images for the site and the house type were appropriate, and these were stored in different folders.
  2. We needed to dynamically set the source of these locations, so they were picking from the appropriate site and house type folders for the page the image carousel component was hosted on.
  3. Although we wanted to make these folders the default, we also wanted to allow editors to search and use tagged images across the media library.

We then needed to consider whether to extend the existing TreeList field type, or to look at another option that might provide the other features we needed, such as the MultiList with Search. We decided on the former approach, given that the TreeList was already giving us some of the functionality we needed and it looked feasible to extend with the additional features. In contrast, it looked tricky to adapt the MultiList with Search to our requirement of making it easy for editors to select from specific folder locations, where we'd expect most of the images they want to use to be found.

Multiple root nodes

The first issue was relatively easy to resolve, given Kam Figy had posted a means of doing this on his blog a few years ago. Using that technique we were able to set a source with multiple paths - e.g. /sitecore/media library/folder a|/sitecore/media library/folder b. When the editing control is rendered within the Sitecore Content Editor, the default TreeList is replaced by a MultiRootTreeList used by internal Sitecore controls, and configured with the provided roots.

Dynamic sources

We had part of the solution for this already which we could adapt for use in this context. Although Sitecore complains that the entry is invalid, it doesn't stop you entering a source for a field using the custom TreeList containing tokens of the form: /sitecore/media library/$sitePath|/sitecore/media library/$houseTypePath. When the control is rendered we'd want to replace $sitePath with the appropriate folder for the specific site's image and do similar for the $houseTypePath token.

In the CreateDataContext method from the code linked above, we get a string representing one of our sources, e.g. /sitecore/media library/$sitePath. We then need to replace $sitePath with the actual path to the folder containing images for the site that corresponds to the current page where the image carousel component is located. In our case a field from one of the ancestors of the page provides this information, but in the general case, the only real issue is determining the current page item that's being edited. Once you can do that, you can access its fields or ancestors, and carry out whatever logic is necessary to determine the appropriate media library path.

Fortunately this is quite straightforward. The base TreeList provides an ItemID property, so we can retrieve the page item as follows:

    var database = Sitecore.Data.Database.GetDatabase("master");
    return database.GetItem(ItemID);
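
With the page item to hand, the token replacement itself is straightforward string substitution, along these lines (the two folder-resolving helpers are illustrative, standing in for our field and ancestor based logic):

    private string ReplaceSourceTokens(string source, Sitecore.Data.Items.Item contextItem)
    {
        // Resolve each token using values held on (or derivable from) the page item.
        return source
            .Replace("$sitePath", GetSiteImagesFolderName(contextItem))
            .Replace("$houseTypePath", GetHouseTypeImagesFolderName(contextItem));
    }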

Providing search functionality

Image template with tags

In our solution we've created a custom template to use for images, one that inherits from the base image templates as described here. With that in place we can create other templates containing fields for tagging and add those as base templates too.

So any images uploaded to the media library will now be of our custom template and the editor will be able to assign tags to them. However, we can't currently make use of that when selecting images for our carousels, as we only have a field that supports selection of images from given folders.

Adding the search interface

Having worked with the ASP.Net MVC framework for several years, it's been a while since I last worked with ASP.Net user controls, but that's what we need to work with here. Although we don't have access to the design surface (the .ascx file), we can still add controls programmatically to the interface.

Initially I started doing this with ASP.Net controls such as TextBox and Button, but then lost a few hours trying to debug issues around post-backs. When the button was clicked, an exception of "Failed to load ViewState" would be thrown. Usually this occurs due to issues with the ASP.Net page lifecycle - creating controls in the wrong page event handler or not recreating them on post-backs. Unfortunately any combination I could try led to the same issue, so this doesn't seem to be the way to go when looking to create or modify a user control intended for use within the Sitecore CMS itself.

Instead I ended up simply applying the necessary mark-up using a Literal control, added to the user control in an overridden OnLoad page event:

    const string SearchFieldId = "MultiSourceTreeListWithSearchSearchQuery";
    const string PerformSearchMessageName = "multisourcetreelistwithsearch:search";
    var markup = $@"..."; // mark-up for the search text box and button (stripped from this excerpt)
    Controls.AddAt(0, new Literal { Text = markup });

That still left the challenge of needing to find a way to "post-back" the search query entered in the text box so we can respond to it server side, by carrying out a search and then making the search results available to the user for selection.

I found the solution to that via looking at the little context menu items that can be added to controls and are found on many of the built-in ones. You do this by adding a folder named Menu under your custom field template, and then creating items of type Menu item below that.

Although I didn't want to use these in the final control - as a menu button on its own was no use; we needed a text box too - by temporarily creating one of these with an appropriate message, I could see what JavaScript was generated, copy it, and add it onto my own button:

    var clickHandler = $"javascript: return scForm.postEvent(this, event, '{PerformSearchMessageName}:' +  document.getElementById('{SearchFieldId}').value);";

In the above, when the button is clicked it'll pass a message of the form multisourcetreelistwithsearch:search:[search query] with the latter part read from the value entered into the text box.

Handling the search message

Server-side we can catch and respond to these messages like this:

    public override void HandleMessage(Message message)
    {
        if (message.Name.StartsWith(PerformSearchMessageName))
        {
            HandleSearchMessage(ExtractQueryFromMessage(message));
        }
    }

    private static string ExtractQueryFromMessage(Message message)
    {
        return message.Name.Split(':').Last();
    }

    private void HandleSearchMessage(string query)
    {
        HttpContext.Current.Items.Add("SearchQuery", query);

        var treeView = GetMultiRootTreeView();
        treeView.RefreshRoot();
    }
    
    private MultiRootTreeview GetMultiRootTreeView()
    {
        return (MultiRootTreeview)WebUtil.FindControlOfType(this, typeof(MultiRootTreeview));
    }

Messages are retrieved via the HandleMessage method, where we check the value of the message to see if it's one we want to process. If it's the one passed from the search button, we extract the search query from the end of the message and store it in the HttpContext.Current.Items dictionary so we can make use of it later in the request cycle. How we do that is described shortly, but it's triggered by the call to RefreshRoot on the MultiRootTreeview that displays the items we can select from when constructing our image carousel.

Rendering the search results

Before we can run the search and display the results, we need to prepare the MultiRootTreeview control to allow for display of the results such that they can then be selected by the editor for the image carousel. The aim is that it'll look and work like this:

By default the control will render the root folders we've configured and allow selection of any items found within those folders. We also create an empty root named "Search results" that will be populated by items that match the editor's search query.

We do this in code as follows, within the OnLoad page event handler of our derived control:

    protected override void OnLoad(EventArgs args)
    {
        ...

        var treeView = GetMultiRootTreeView();
        var dataContext = GetExistingDataContext();
        AddSearchResultsSourceToTreeView(dataContext, treeView);
    }
    
    private MultiRootTreeview GetMultiRootTreeView()
    {
        return (MultiRootTreeview)WebUtil.FindControlOfType(this, typeof(MultiRootTreeview));
    }
    
    private DataContext GetExistingDataContext()
    {
        return (DataContext)WebUtil.FindControlOfType(this, typeof(DataContext));
    }

    private void AddSearchResultsSourceToTreeView(DataContext dataContext, TreeviewEx treeView)
    {
        var searchResultsDataContext = CreateDataContext(
            dataContext,
            "/sitecore/templates/Foundation/Media/Search Results",
            "TreeListSearchResults");
        treeView.DataContext = $"{searchResultsDataContext.ID}|{treeView.DataContext}";
        dataContext.Parent.Controls.Add(searchResultsDataContext);
    }
    
    private DataContext CreateDataContext(DataContext baseDataContext, string dataSource, string dataViewName = "Master")
    {
        var dataContext = new DataContext
            {
                ID = GetUniqueID("D"),
                Filter = baseDataContext.Filter,
                DataViewName = dataViewName,
                Root = dataSource,
                Language = Language.Parse(ItemLanguage)
            };
        if (!string.IsNullOrEmpty(DatabaseName))
        {
            dataContext.Parameters = "databasename=" + DatabaseName;
        }

        return dataContext;
    }

As well as the addition of this new "root" (that maps to a Sitecore item we've created that has no children), the key thing to note is that we're using a custom value for DataViewName, of TreeListSearchResults. This corresponds to a class we've created that derives from MasterDataView:

    public class TreeListSearchResultsDataView : MasterDataView
    {
        protected override void GetChildItems(ItemCollection items, Item item)
        {
            var query = RetrieveSearchQuery();
            var results = GetSearchResults(query);
            AddSearchResultsToView(items, results);
        }

        private static string RetrieveSearchQuery()
        {
            return HttpContext.Current.Items["SearchQuery"] as string;
        }

        private static IList<BaseSearchResultItem> GetSearchResults(string query)
        {
            if (string.IsNullOrWhiteSpace(query))
            {
                return new List<BaseSearchResultItem>();
            }

            var searchRepo = new SitecoreSearchRepository("master");
            return searchRepo.Search<BaseSearchResultItem>(
                    q => (q.TemplateId == Settings.TemplateIDs.UnversionedJpeg || q.TemplateId == Settings.TemplateIDs.VersionedJpeg) &&
                        (q.Name.Contains(query) || q.Content.Contains(query)),
                    o => o.Name)
                .ToList();
        }

        private static void AddSearchResultsToView(ItemCollection items, IList<BaseSearchResultItem> results)
        {
            items.Clear();
            if (!results.IsAndAny())
            {
                return;
            }

            foreach (var result in results)
            {
                items.Add(result.GetItem());
            }
        }
    }

Within this class we override the GetChildItems method and carry out our search, retrieving the query from the HttpContext.Current.Items dictionary where we stashed it earlier. Code for SitecoreSearchRepository isn't shown, but it's just wrapping the standard Sitecore functionality used for retrieving values from the Lucene or Solr search index.

When items are found in the search results, they are added to the ItemCollection passed as a parameter to the GetChildItems method. This populates the child items that appear under "Search Results" in the left hand side of our TreeList, from where the editor can select them as normal.

Tag indexing

The only thing missing now is that the search currently only operates on the name of the item in the media library. It's searching the Content field too, but this doesn't contain any detail about the selected tags - we'll need to add them when the item is added to the search index.

To do this we can create a computed field as follows:

    public class MediaContent : IComputedIndexField
    {
        public string FieldName { get; set; }

        public string ReturnType { get; set; }

        public object ComputeFieldValue(IIndexable indexable)
        {
            Assert.ArgumentNotNull(indexable, nameof(indexable));

            var indexableItem = indexable as SitecoreIndexableItem;
            if (indexableItem == null)
            {
                return null;
            }

            var imageTemplateIds = new[] { Settings.TemplateIDs.UnversionedJpeg, Settings.TemplateIDs.VersionedJpeg };
            if (!imageTemplateIds.Contains(indexableItem.Item.TemplateID))
            {
                return null;
            }

            var fields = new List<string>
                {
                    "Tags1",
                    "Tags2"
                };

            return ConcatenateIndexableContent(indexableItem.Item, fields);
        }

        private static string ConcatenateIndexableContent(Item item, IEnumerable<string> fields)
        {
            var sb = new StringBuilder();
            foreach (var field in fields)
            {
                var value = item[field];
                if (string.IsNullOrEmpty(value))
                {
                    continue;
                }

                sb.Append(GetItemIdListValueNames(item.Database, value));
                sb.Append(" ");
            }

            RemoveTrailingSpace(sb);
            return sb.ToString();
        }

        private static string GetItemIdListValueNames(Database database, string fieldValue)
        {
            var values = new ListString(fieldValue);
            var sb = new StringBuilder();

            foreach (var value in values)
            {
                var item = database.GetItem(value);
                if (item == null)
                {
                    continue;
                }

                sb.Append(item.Name);
                sb.Append(" ");
            }

            RemoveTrailingSpace(sb);
            return sb.ToString();
        }

        private static void RemoveTrailingSpace(StringBuilder sb)
        {
            if (sb.Length > 0)
            {
                sb.Length = sb.Length - 1;
            }
        }
    }

With the code above we locate the field(s) containing our tags, convert the stored pipe (|) delimited list of item Ids into the tag names and return a space delimited list of these names as the field value. By configuring this to be added to the _content field in the search index via a patch file like the following (example for Solr), we ensure the Content field is populated and can be searched on by tag values.

  <sitecore>
    <contentSearch>
      <indexConfigurations>
        <defaultSolrIndexConfiguration>
          <fields hint="raw:AddComputedIndexField">
            <field fieldName="_content" type="MyApp.ComputedFields.MediaContent, MyApp" />
          </fields>
        </defaultSolrIndexConfiguration>
      </indexConfigurations>
    </contentSearch>
  </sitecore>

Saturday, 11 November 2017

FakeDb, Layout tests and Sitecore 9

Following on from the previous post on this blog, about an issue we ran into upgrading a Sitecore project from 8.2 to 9, this one is on the same subject. The issue this time was that some unit tests making use of the FakeDb library began to fail following the upgrade.

Noticing that it was just tests that involved checking the status of layout related fields, and with the help of some de-compilation, it was clear that a change had occurred in the Sitecore.Data.Fields.LayoutField.GetFieldValue method, which now involves the use of pipeline components.

I found it was possible to resolve this by adding the following to the app.config file, within the <sitecore> element, for the test projects containing the failing tests:

    <pipelines>
      <getLayoutSourceFields>
        <processor type="Sitecore.Pipelines.GetLayoutSourceFields.GetFinalLayoutField, Sitecore.Kernel" />
        <processor type="Sitecore.Pipelines.GetLayoutSourceFields.GetLayoutField, Sitecore.Kernel" />
      </getLayoutSourceFields>
    </pipelines>

I've also submitted a PR to have this added to the default configuration for FakeDb which, if accepted and a new release created, may remove the need for this amendment.