Introduction
In a prior article I documented the method I used to build a rudimentary
blog page. It wasn't intended to be a full featured blog or a substitute
for a content management system. It was simple and basic, and it fit my
modest needs. The simple blog page uses an XML file filled with articles
as its data source. At first I was satisfied with building and updating
this XML data source by hand when needed. After using it for several
years, I decided to automate the building of this data source and
incorporate that automated build capability into the build environment for
my web site. If you want the background on the simple blog page, see my
earlier article, Simple Blog Page KB3HHA - Seth Cohen.
Authoring the Articles to Include Metadata
I have been using a WYSIWYG HTML editor called BlueGriffon to edit the
articles. The author of this program is no longer updating or supporting
the program, but it still satisfies my needs. To automate building the
article data store, I need a way to embed the metadata for my blog
articles into each article. The fields I need are the author, date,
article title, article synopsis, category, and the article body. The title
and body of the article are easy enough to parse out of the HTML. For the
other fields I added meta tags to hold the required data. As
an example, the meta tags for this article look like this:
<meta name="author" content="Seth Cohen">
<meta name="date" content="2024-06-11T14:39:00.1463148-04:00">
<meta name="category" content="Web Development">
<meta name="summary" content="Build a utility application to create an article file used by the simple blog page">
Now I had a way to embed all of the data I needed to automatically
generate the article data store from the articles.
Generating the Data Source
To build the article XML file I wrote a command line utility with two
required parameters. The first is the path to a folder that contains all
of the article files in HTML format with the embedded metadata. The second
parameter is the path and file name of the generated article file. It
was a simple matter to get all of the HTML files in the specified
folder, parse out the required data, and build the article data store.
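The options object referenced in the snippets below holds the two parsed command-line arguments. The article doesn't show how it is built; the property names FolderPath and ArticleFile are taken from the code, but the parsing itself is a sketch of one simple approach (a real utility might use a command-line parsing library instead):

```csharp
using System;

// Minimal holder for the two required command-line arguments.
public class Options
{
    public string FolderPath { get; set; }   // folder containing the HTML articles
    public string ArticleFile { get; set; }  // path of the XML file to generate
}

public static class ArgumentParser
{
    // Turn the raw args array into an Options instance,
    // rejecting invocations that are missing an argument.
    public static Options ParseArgs(string[] args)
    {
        if (args.Length < 2)
        {
            throw new ArgumentException("usage: <utility> <article-folder> <output-file>");
        }
        return new Options { FolderPath = args[0], ArticleFile = args[1] };
    }
}
```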
Here is the code to traverse the folder and create a list of articles:
DirectoryInfo dir = new DirectoryInfo(options.FolderPath);
Regex reg = new Regex(@".*\.html$");
List<FileInfo> fileList = dir.GetFiles().Where(fi => reg.IsMatch(fi.Name)).ToList();
List<Article> articles = new List<Article>();
// for each HTML file in the folder
foreach (FileInfo file in fileList)
{
    Article newArticle = new Article(file);
    // add the article to the list of articles
    articles.Add(newArticle);
}
The Article class has a constructor that accepts a FileInfo. That
constructor contains the code to parse the necessary data fields out of
the HTML. It looks like this:
public Article(FileInfo file)
{
    using StreamReader sr = new StreamReader(file.FullName);
    string htmlText = sr.ReadToEnd();

    // the title comes from the <title> element
    MatchCollection titleMatches = titleRegex.Matches(htmlText);
    if (titleMatches.Count > 0)
    {
        Title = titleMatches[0].Groups[1].Value;
    }

    // the article content comes from the <body> element
    MatchCollection bodyMatches = bodyRegex.Matches(htmlText);
    if (bodyMatches.Count > 0)
    {
        Content = bodyMatches[0].Groups[1].Value;
    }

    // the remaining fields come from the meta tags
    foreach (Match match in metaRegex.Matches(htmlText))
    {
        string name = match.Groups[1].Value;
        string content = match.Groups[2].Value;
        if (name.Equals("author"))
        {
            Author = content;
        }
        else if (name.Equals("date"))
        {
            Date = DateTime.Parse(content);
        }
        else if (name.Equals("category"))
        {
            Category = content;
        }
        else if (name.Equals("summary"))
        {
            Summary = content;
        }
    }
}
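The constructor relies on three Regex fields (titleRegex, bodyRegex, metaRegex) that aren't shown above. The article doesn't include their definitions, so the patterns below are an assumption about one plausible set, matching a title element, a body element, and double-quoted meta tags like the ones shown earlier:

```csharp
using System.Text.RegularExpressions;

public partial class Article
{
    // Static regexes shared by all Article instances.
    // Singleline lets "." span line breaks, since the HTML is multi-line.
    private static readonly Regex titleRegex =
        new Regex(@"<title>(.*?)</title>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);
    private static readonly Regex bodyRegex =
        new Regex(@"<body[^>]*>(.*?)</body>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);
    // Group 1 captures the name attribute, group 2 the content attribute.
    private static readonly Regex metaRegex =
        new Regex(@"<meta\s+name=""(.*?)""\s+content=""(.*?)""",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);
}
```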
Once we have a list of articles, they need to be sorted in the order we
want to display them, and each article needs a unique ID number. I like to
display the articles with the newest one at the top of
the list (descending order). Here is the code to implement that:
// order articles by date and number them
articles = articles.OrderByDescending(o => o.Date).ToList();
int id = articles.Count;
foreach (Article article in articles)
{
    article.Id = id--;
}
The final step in building the data store is to serialize the article
list to the file that was specified on the command line.
// serialize the list of articles to an article file
XmlSerializer xmlSerializer = new XmlSerializer(typeof(List<Article>));
using TextWriter writer = new StreamWriter(options.ArticleFile);
xmlSerializer.Serialize(writer, articles);
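With the default XmlSerializer settings, a List of Article objects serializes under an ArrayOfArticle root element, so the generated data store looks roughly like this. The exact set and order of child elements depends on the public properties of the Article class, and the values here are abbreviated for illustration:

```xml
<?xml version="1.0" encoding="utf-8"?>
<ArrayOfArticle xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Article>
    <Id>2</Id>
    <Title>Automating the Article Data Store</Title>
    <Author>Seth Cohen</Author>
    <Date>2024-06-11T14:39:00.1463148-04:00</Date>
    <Category>Web Development</Category>
    <Summary>Build a utility application...</Summary>
    <Content>...</Content>
  </Article>
  <Article>
    ...
  </Article>
</ArrayOfArticle>
```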
Create a Pre-Build or Post-Build Event in Visual Studio
Now that we have built the utility, we need to convince Visual Studio to
run it for us each time the web site is built. The way to accomplish this
is to right click the project node in the solution explorer and select
Properties from the context menu. Select the Build Events property page in
the resulting project properties window. This page has areas to enter both
pre-build and post-build commands that Visual Studio will run either
before the build or after the build. I chose to use the pre-build event.
Enter the appropriate command line in whichever event you desire. Visual
Studio will now execute the utility, displaying the output in the Output
window. If the utility returns an exit code of zero, Visual Studio assumes
it completed without error. Returning anything other than zero indicates
that an error occurred, and the build will fail.
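As an example, a pre-build event command might look like the following. It uses the standard Visual Studio $(ProjectDir) macro, which expands to the project folder with a trailing backslash; the executable name and folder layout here are assumptions, not the actual paths from my project:

```
"$(ProjectDir)Tools\ArticleBuilder.exe" "$(ProjectDir)Articles" "$(ProjectDir)App_Data\articles.xml"
```

Quoting each argument keeps the command working even when the project path contains spaces.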
I hope somebody finds this information useful. Thanks for reading this
article.