Automated Blog File Build

Create a utility application to build an article file used by the simple blog page


In a prior article I documented the method I used to build a rudimentary blog page. It wasn't intended to be a full-featured blog or a substitute for a content management system. It was simple and basic, and it fit my modest needs. The simple blog page uses an XML file filled with articles as its data source. At first I was satisfied with building and updating this XML data source by hand when needed. After using it for several years, I decided to automate the building of this data source and incorporate that automated build capability into the build environment for my web site. If you want the background on the simple blog page, the article is located at Simple Blog Page KB3HHA - Seth Cohen.

Authoring the Articles to Include Metadata

I have been using a WYSIWYG HTML editor called BlueGriffon to edit the articles. Its author is no longer updating or supporting the program, but it still satisfies my needs. To automate building the article data store, I need a way to embed the metadata for my blog articles into each article. The fields I need are the author, date, article title, article synopsis, category, and the article data. The title and article data (body) of the article are easy enough to parse out of the HTML. For the other fields I added meta tags to hold the required data. As an example, the meta tags for this article look like this:

    <meta name="author" content="Seth Cohen">
    <meta name="date" content="2024-06-11T14:39:00.1463148-04:00">
    <meta name="category" content="Web Development">
    <meta name="summary" content="Build a utility application to create an article file used by the simple blog page">

Now I had a way to embed all of the data I needed to automatically generate the article data store from the articles.

Generating the Data Source

To build the article XML file I wrote a command line utility with two required parameters. The first is the path to a folder that contains all of the article files in HTML format with the embedded metadata. The second is the path and file name under which to save the generated article file. It was a simple matter to get all of the HTML files in the specified folder, parse out the required data, and build the article data store.
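The code in this article reads the two parameters from an options object that is not shown. As a minimal sketch, assuming the arguments are passed positionally, the parsing could look something like the following. The Options class and its TryParse method are hypothetical; only the property names FolderPath and ArticleFile come from the code shown here.

```csharp
using System;

// Hypothetical holder for the two command line parameters.
// Only the property names are taken from the article's code.
public sealed class Options
{
    public string FolderPath { get; set; } = "";
    public string ArticleFile { get; set; } = "";

    // Returns false when the arguments are unusable so the caller
    // can print a usage message and exit with a non-zero code.
    public static bool TryParse(string[] args, out Options options)
    {
        options = new Options();
        if (args == null || args.Length != 2)
            return false;
        if (string.IsNullOrWhiteSpace(args[0]) || string.IsNullOrWhiteSpace(args[1]))
            return false;

        options.FolderPath = args[0];   // folder holding the HTML articles
        options.ArticleFile = args[1];  // path of the XML file to generate
        return true;
    }
}
```

A real utility would probably also check that the folder exists before going any further.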

Here is the code to traverse the folder and create a list of articles:

      DirectoryInfo dir = new DirectoryInfo(options.FolderPath);
      Regex reg = new Regex(@".*\.html$");
      List<FileInfo> fileList = dir.GetFiles().Where(fi => reg.IsMatch(fi.Name)).ToList();
      List<Article> articles = new List<Article>();

      // for each file in the folder
      foreach (FileInfo file in fileList)
      {
          // parse the file into an article and add it to the list of articles
          Article newArticle = new Article(file);
          articles.Add(newArticle);
      }

The Article class has a constructor that accepts a FileInfo. That constructor contains the code to parse the necessary data fields out of the HTML. It looks like this:

        public Article(FileInfo file)
        {
            using StreamReader sr = new StreamReader(file.FullName);
            string htmlText = sr.ReadToEnd();

            // the title comes from the first capture group of the title pattern
            MatchCollection titleMatches = titleRegex.Matches(htmlText);
            if (titleMatches.Count > 0)
                Title = titleMatches[0].Groups[1].Value;

            // the article content comes from the first capture group of the body pattern
            MatchCollection bodyMatches = bodyRegex.Matches(htmlText);
            if (bodyMatches.Count > 0)
                Content = bodyMatches[0].Groups[1].Value;

            // each meta tag yields a name/content pair
            foreach (Match match in metaRegex.Matches(htmlText))
            {
                string name = match.Groups[1].Value;
                string content = match.Groups[2].Value;
                if (name.Equals("author"))
                {
                    Author = content;
                    Debug.WriteLine("Author: " + content);
                }
                else if (name.Equals("date"))
                    Date = DateTime.Parse(content);
                else if (name.Equals("category"))
                    Category = content;
                else if (name.Equals("summary"))
                    Summary = content;
            }
        }
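The constructor relies on three regular expressions (titleRegex, bodyRegex, and metaRegex) whose definitions are not shown. The sketch below is one plausible way to define them, not necessarily the patterns actually used. It assumes the meta tags are written with the name attribute before the content attribute and with double-quoted values, as in the example tags earlier. The ArticlePatterns class name is hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical pattern definitions; the article does not show the real ones.
public static class ArticlePatterns
{
    // Group 1 captures the text between <title> and </title>.
    public static readonly Regex TitleRegex = new Regex(
        @"<title[^>]*>(.*?)</title>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Group 1 captures everything between <body ...> and </body>.
    // Singleline lets '.' match newlines, since the body spans many lines.
    public static readonly Regex BodyRegex = new Regex(
        @"<body[^>]*>(.*?)</body>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Group 1 captures the meta name, group 2 the meta content.
    // Assumes name comes before content and both use double quotes.
    public static readonly Regex MetaRegex = new Regex(
        @"<meta\s+name=""([^""]*)""\s+content=""([^""]*)""\s*/?>",
        RegexOptions.IgnoreCase);
}
```

Regular expressions are adequate here because the articles come from a single editor and follow a predictable layout. HTML from arbitrary sources would be better served by a real HTML parser.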

Once we have a list of articles, they need to be sorted in the order we want to display them, and each article needs a unique ID number. I like to display the articles with the newest one at the top of the list (descending order). The numbering counts down from the article count, so the newest article gets the highest ID and the oldest gets ID 1. Here is the code to implement that:

    // order articles by date and number them
    articles = articles.OrderByDescending(o => o.Date).ToList();
    int id = articles.Count;
    foreach (Article article in articles)
        article.Id = id--;

The final step in building the data store is to serialize the article list to the file that was specified on the command line.

    // serialize the list of articles to an article file
    XmlSerializer xmlSerializer = new XmlSerializer(typeof(List<Article>));
    using TextWriter writer = new StreamWriter(options.ArticleFile);
    xmlSerializer.Serialize(writer, articles);
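One detail worth noting is that XmlSerializer only works with types that have a public parameterless constructor, and it only writes public read/write properties and fields. Because the Article class has a constructor that takes a FileInfo, it needs an explicit empty constructor as well. The sketch below shows an assumed shape for the class, using the property names that appear in the code above with guessed types, plus a hypothetical ToXml helper that serializes to a string so the output is easy to inspect.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Assumed shape of the Article class; the property types are guesses
// and the FileInfo constructor is omitted to keep the sketch short.
public class Article
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public string Author { get; set; } = "";
    public DateTime Date { get; set; }
    public string Category { get; set; } = "";
    public string Summary { get; set; } = "";
    public string Content { get; set; } = "";

    // XmlSerializer requires a public parameterless constructor.
    public Article() { }
}

public static class ArticleXml
{
    // Hypothetical helper: serializes the list to a string instead of a file.
    public static string ToXml(List<Article> articles)
    {
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(List<Article>));
        using StringWriter writer = new StringWriter();
        xmlSerializer.Serialize(writer, articles);
        return writer.ToString();
    }
}
```

Serializing a List<Article> this way produces a root element named ArrayOfArticle containing one Article element per entry, which is the structure the blog page would need to read.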

Create a Pre-Build or Post-Build Event in Visual Studio

Now that we have built the utility, we need to convince Visual Studio to run it for us each time the web site is built. To do this, right-click the project node in Solution Explorer and select Properties from the context menu. Select the Build Events property page in the resulting project properties window. This page has areas to enter both pre-build and post-build commands, which Visual Studio runs before or after the build respectively. I chose to use the pre-build event. Enter the appropriate command line in whichever event you prefer. Visual Studio will now execute the utility, displaying its output in the Output window. If the utility returns zero, Visual Studio assumes it completed without error. Returning anything other than zero indicates that an error occurred.
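Since Visual Studio judges success by the exit code, the utility has to turn any failure into a non-zero return value. A minimal sketch of one way to do that follows. The BuildStep class and its Run method are hypothetical names, and the choice of 1 as the failure code is arbitrary; any non-zero value would do.

```csharp
using System;

// Hypothetical wrapper that converts success or failure into an exit code.
public static class BuildStep
{
    // Runs the supplied work; returns 0 on success and 1 if it throws.
    public static int Run(Action work)
    {
        try
        {
            work();
            return 0;
        }
        catch (Exception ex)
        {
            // Text written to standard error appears in the Output window.
            Console.Error.WriteLine("Article build failed: " + ex.Message);
            return 1;
        }
    }
}
```

The program's Main method would then be declared to return int and pass back the result of BuildStep.Run, wrapping the folder traversal, sorting, and serialization steps shown earlier.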

I hope somebody finds this information useful. Thanks for reading this article.

Written by Seth Cohen on 11-Jun-2024